On Wed, May 9, 2012 at 2:43 PM, Adam Fuchs <[email protected]> wrote:
> I would also add that "small number of entries" in this case is probably
> measured in the millions or tens of millions. If you're talking about
> deleting more entries than that then you might start to look into the
> iterator method.

Just to clarify, a filter is a type of iterator.

>
> Cheers,
> Adam
>
>
> On Wed, May 9, 2012 at 11:01 AM, Billie J Rinaldi
> <[email protected]> wrote:
>>
>> On Wednesday, May 9, 2012 10:31:46 AM, "Sean Pines" <[email protected]>
>> wrote:
>> > I have a use case that involves me removing a record from Accumulo
>> > based on the Row ID and the Column Family.
>> >
>> > In the shell, I noticed the command "deletemany" which allows you to
>> > specify column family/column qualifier. Is there an equivalent of this
>> > in the Java API?
>> >
>> > In the Java API, I noticed the method:
>> > deleteRows(String tableName, org.apache.hadoop.io.Text start,
>> > org.apache.hadoop.io.Text end)
>> > Delete rows between (start, end]
>> >
>> > However that only seems to work for deleting a range of RowIDs
>> >
>> > I would also imagine that deleting rows is costly; is there a better
>> > way to approach something like this?
>> > The workaround I have for now is to just overwrite the row with an
>> > empty string in the value field and ignore any entries that have that.
>> > However this just leaves lingering rows for each "delete" and I'd like
>> > to avoid that if at all possible.
>> >
>> > Thanks!
>>
>> Connector provides a createBatchDeleter method. You can set the range and
>> columns for BatchDeleter just like you would with a Scanner. This is not an
>> efficient operation (despite the current javadocs for BatchDeleter), but it
>> works well if you're deleting a small number of entries. It scans for the
>> affected key/value pairs, pulls them back to the client, then inserts
>> deletion entries for each. The deleteRows method, on the other hand, is
>> efficient because large ranges can just be dropped. If you want to delete a
>> lot of things and deleteRows won't work for you, consider using a majc scope
>> Filter that filters out what you don't want, compact the table, then remove
>> the filter.
>>
>> Billie
>
>
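
For anyone reading this thread later, the trade-off Billie describes can be sketched with a rough in-memory model in Java. This is not Accumulo code and does not call the Accumulo API: the class DeleteStrategies and its methods deleteByScan, deleteRowRange, and compactWithFilter are hypothetical names invented for this illustration, and a TreeMap stands in for the sorted table. It only tries to show why the three approaches cost different amounts of work.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiPredicate;

// Toy stand-in for a sorted table: row -> ("family:qualifier" -> value).
// Hypothetical illustration only; none of these names are Accumulo classes.
public class DeleteStrategies {
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    public void put(String row, String family, String qualifier, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(family + ":" + qualifier, value);
    }

    // Total number of entries across all rows.
    public int size() {
        int n = 0;
        for (TreeMap<String, String> cols : rows.values()) {
            n += cols.size();
        }
        return n;
    }

    public boolean hasRow(String row) {
        return rows.containsKey(row);
    }

    // BatchDeleter-style: scan for the matching entries, collect them on the
    // "client", then issue one delete per entry. Work grows with the number
    // of matches. (The model removes entries directly instead of writing
    // deletion entries.) Returns how many deletes were issued.
    public int deleteByScan(String row, String family) {
        TreeMap<String, String> cols = rows.get(row);
        if (cols == null) {
            return 0;
        }
        List<String> matched = new ArrayList<>();
        for (String col : cols.keySet()) {
            if (col.startsWith(family + ":")) {
                matched.add(col);
            }
        }
        for (String col : matched) {
            cols.remove(col);
        }
        if (cols.isEmpty()) {
            rows.remove(row);
        }
        return matched.size();
    }

    // deleteRows-style: drop every row in (start, end] in one step,
    // without looking at individual entries.
    public void deleteRowRange(String start, String end) {
        rows.subMap(start, false, end, true).clear();
    }

    // Filter-style: one pass over the whole table, as a compaction would do,
    // keeping only the entries the predicate accepts.
    public void compactWithFilter(BiPredicate<String, String> keep) {
        Iterator<Map.Entry<String, TreeMap<String, String>>> it = rows.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, TreeMap<String, String>> e = it.next();
            e.getValue().keySet().removeIf(col -> !keep.test(e.getKey(), col));
            if (e.getValue().isEmpty()) {
                it.remove();
            }
        }
    }
}
```

In this model, deleteByScan("r1", "a") corresponds to Sean's case of removing by row ID and column family, deleteRowRange mirrors the (start, end] semantics quoted from the deleteRows javadoc, and compactWithFilter plays the role of the majc scope Filter followed by a compaction. Real tables behave differently in many details, so treat it as a way to reason about cost rather than as a drop-in implementation.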
