Thanks for your reply, Ted.

I looked into the coprocessor example you provided. It will definitely address my specific need. However, two aspects of this approach seem less than ideal to me:

1. Being a coprocessor service, I believe the endpoint needs to be pre-installed on the region servers. This is not possible in typical cases where the user does not have influence over the HBase installation or administrators.
2. In my use case, I already know the row key for which I need the specified column qualifier prefixes to be deleted. Using a scan for just one known row, as in the coprocessor example, appears to be a bit of an overkill...

Overall, the coprocessor approach seems somewhat like using a hammer to push in a pushpin. Specifying a filter from the client side is much easier and more straightforward, IMHO.

On Wed, Dec 24, 2014 at 2:01 PM, Ted Yu <[email protected]> wrote:

> Have you looked at
> hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example/BulkDeleteEndpoint.java
> to see if it fits your need?
>
> Cheers
>
> On Wed, Dec 24, 2014 at 1:34 PM, Devaraja Swami <[email protected]> wrote:
>
> > Are there any plans for including a Filter for Delete?
> > Currently, the only way seems to be via checkAndDelete in HTable/Table.
> > This is helpful but does not cover all use cases.
> >
> > For example, I use column qualifier prefixes as a sort of poor man's
> > 2nd level of indexing (i.e., 3 levels of indexing comprising
> > row key --> column qualifier prefix --> column qualifier suffix).
> > This works well for Get and Scan, since I can use a prefix column
> > qualifier filter for the 2nd indexing level.
> > However, I am not able to specify that an entire set of column
> > qualifiers sharing the same prefix should be deleted, without doing a
> > Get first to identify all the full column qualifier values with the
> > same prefix, and then adding those qualifiers to the Delete. This is
> > obviously highly inefficient.
> >
> > checkAndDelete doesn't help here since it does not support prefix tests.
> > Moreover, I cannot just add a new column family for every unique column
> > qualifier prefix I need in my data model. In general, using just one
> > column family per table seems to be most efficient.
> >
> > I can think of other use cases where one would need to delete a lot of
> > columns that match one of the available HBase filters, but whose exact
> > column qualifier values are not known at deletion time at the client.
> >
> > All these use cases can be taken care of by allowing Delete to support a
> > setFilter method, exactly as in the case of Get and Scan.
> > >
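
For reference, the Get-then-Delete workaround described in the quoted message can be modeled roughly as below. This is only a plain-Java sketch with no HBase dependency: the row is represented as a sorted map of column qualifiers within one family, and the class and method names (`PrefixDeleteSketch`, `qualifiersWithPrefix`, `deleteQualifiers`) are hypothetical. In real client code, step 1 would presumably be a Get with a ColumnPrefixFilter, and step 2 a Delete listing each returned qualifier, which is the extra round trip a filter on Delete would avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of one row within a single column family:
// a sorted map from column qualifier to value. No HBase classes are used.
class PrefixDeleteSketch {

    // Step 1 (stands in for the Get): collect every qualifier that starts
    // with the prefix. Keys are sorted, so all matches form one contiguous run.
    static List<String> qualifiersWithPrefix(NavigableMap<String, String> row, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String qualifier : row.tailMap(prefix, true).keySet()) {
            if (!qualifier.startsWith(prefix)) {
                break; // past the end of the run of matching qualifiers
            }
            matches.add(qualifier);
        }
        return matches;
    }

    // Step 2 (stands in for the Delete): remove exactly the qualifiers found
    // in step 1 and report how many columns were actually removed.
    static int deleteQualifiers(NavigableMap<String, String> row, List<String> qualifiers) {
        int removed = 0;
        for (String qualifier : qualifiers) {
            if (row.remove(qualifier) != null) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Made-up sample data: qualifier = prefix + ":" + suffix.
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("order:1", "a");
        row.put("user:1", "b");
        row.put("user:2", "c");

        List<String> toDelete = qualifiersWithPrefix(row, "user:");
        int removed = deleteQualifiers(row, toDelete);
        System.out.println("removed " + removed + ", remaining " + row.keySet());
    }
}
```

The point of the sketch is the two separate steps: the client has to read the matching qualifiers back before it can name them in the delete, even though the row key and prefix are already known.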
