As you said, a row only exists if it has at least one column. If it has
none, there is nothing to delete, right? :)

What we do for fast "almost row key only" scanning is to set a
FirstKeyOnlyFilter on the Scan. See the RowCounter job's code:

// Return only the first KeyValue of each row instead of every column.
Scan scan = new Scan();
scan.setFilter(new FirstKeyOnlyFilter());

You can also call setCaching on the Scan with a high value for really
fast scanning, although the deletes will still be the bottleneck.
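
Since the deletes dominate, one thing that may help is sending them in
batches rather than one at a time. The sketch below only shows the
batching logic and uses plain Java so it runs without a cluster. The
class name BatchedRowDeleter, the deleteInBatches helper, and the sink
callback are hypothetical names for illustration, not part of HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class BatchedRowDeleter {

    /**
     * Drains row keys from an iterator and hands them to a sink in batches
     * of at most batchSize keys. Returns the number of row keys seen.
     * Hypothetical helper: in an HBase client the iterator would wrap the
     * scanner results and the sink would issue the actual deletes.
     */
    public static int deleteInBatches(Iterator<byte[]> rowKeys, int batchSize,
                                      Consumer<List<byte[]>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<byte[]> batch = new ArrayList<>(batchSize);
        int total = 0;
        while (rowKeys.hasNext()) {
            batch.add(rowKeys.next());
            total++;
            if (batch.size() == batchSize) {
                sink.accept(batch);                  // one round trip per full batch
                batch = new ArrayList<>(batchSize);  // fresh list, the sink may keep the old one
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);                      // flush the final partial batch
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in row keys; a real run would take them from the scan.
        List<byte[]> keys = Arrays.asList(
                "a".getBytes(), "b".getBytes(), "c".getBytes(),
                "d".getBytes(), "e".getBytes());
        List<Integer> batchSizes = new ArrayList<>();
        int n = deleteInBatches(keys.iterator(), 2, b -> batchSizes.add(b.size()));
        System.out.println(n + " rows in batches of " + batchSizes);
        // prints "5 rows in batches of [2, 2, 1]"
    }
}
```

To wire this up, the iterator would presumably yield r.getRow() for each
Result from the ResultScanner, and the sink would wrap each key in a
Delete and pass the list to the table's list-based delete call. The exact
signature of that call depends on the HBase version, so check the HTable
javadoc for the release you run.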

J-D

On Tue, Nov 2, 2010 at 3:14 AM, Henning Blohm <[email protected]> wrote:
> Hi,
>
>  I need to delete a range of rows from an HBase table. A time-to-live
> setting as proposed in
>
> http://www.mail-archive.com/[email protected]/msg09492.html
>
> will not do, as there will be no clear point in time when that cleanup
> will be required / advised.
>
> The way it is implemented now essentially looks like this:
>
>     HTable c = _table(<table>);
>     Scan s = new Scan("".getBytes(),endKey.getBytes());
>     s.addColumn(<family>.getBytes());
>     ResultScanner rs = c.getScanner(s);
>     try {
>         Result r;
>         while ((r=rs.next())!=null) {
>             c.delete(new Delete(r.getRow()));
>         }
>     } finally {
>         rs.close();
>     }
>     c.flushCommits();
>
> While that works, it is suboptimal, as it seems to require specifying a
> column or column family to retrieve data for. Worse: it seems that a
> column that is always present is required to really hit all relevant
> rows.
>
> However, all that is required is the keys!
>
> I found https://issues.apache.org/jira/browse/HBASE-1481 and was
> wondering whether
> there has been any progress on that.
>
> What is the best way to accomplish something like key-only scanning?
>
> Thanks,
>  Henning
>
