Hi all,

We're using 0.92.2.
We're looking into custom garbage collection methods. Due to some business logic, we'd like to be able to delete rows based on the value of one of the columns. These deletes can be eventual rather than immediate. We have written a MapReduce job that works, but we aren't sure it will be fast enough in the long run.

I have two questions:

1. Would it be possible to implement a coprocessor that does the column value check during a major compaction and only writes rows that pass the check? I'm not sure this is feasible because, as I understand it, the reads occur at the key-value level and not the row level.

2. Since our deletes can be eventual, would it be possible (or faster) to just tombstone the rows in our MapReduce job rather than delete them, and let the major compaction handle the actual deletion? If I'm not mistaken, addDeleteMarker would be the method for this.

Thanks for your time.

Troy Bryant
