Re: m.putDelete versus RowDeletingIterator?

Keith Turner Wed, 09 Oct 2013 15:30:20 -0700

On Wed, Oct 9, 2013 at 4:21 PM, Eric Newton <[email protected]> wrote:

> They do different things.
>
> Deleting mutations marks each entry with a delete marker.  Using the
> iterator marks a whole row with a single mutation.
>
> If you have a million entries in your row, the iterator is faster for
> the delete, but requires a seek to the start of the row for every
> read, so reads are slower.
>
> If your row has one entry, they are the same thing.
>
> Somewhere under N keys... the mutation path will be quite fast, and
> still preserve your reading speed.  I'll just pull a number out of
> thin air... let's say a few thousand.
>

The iterator may still be useful even if rows have few columns, because a
row can be deleted without reading it.  With m.putDelete() you may need to
read the row and insert a delete for each column value.  If you know which
columns to delete, then you can avoid the read.

Suppose I have 10M rows to delete, each row having 10 unpredictable columns.
With the iterator I can batch write 10M row deletion mutations.  Without
the iterator I do 10M seeks, 100M reads, and write 100M deletes.
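
The arithmetic above can be sketched as a small cost model. This is only
a rough illustration: the class, method, and field names are hypothetical,
it counts operations rather than measuring time, and it does not call
Accumulo at all. It also assumes the columns are unknown, so the
putDelete path has to read every entry before deleting it.

```java
// Hypothetical cost model; counts operations only, no Accumulo calls.
final class DeleteCost {
    final long seeks;          // seeks needed to locate rows before deleting
    final long reads;          // entries read to discover column names
    final long entriesWritten; // delete entries sent to the batch writer

    DeleteCost(long seeks, long reads, long entriesWritten) {
        this.seeks = seeks;
        this.reads = reads;
        this.entriesWritten = entriesWritten;
    }

    // Row-deleting iterator: one row deletion mutation per row, no reads.
    static DeleteCost withRowDeletingIterator(long rows) {
        return new DeleteCost(0L, 0L, rows);
    }

    // putDelete with unknown columns: seek to each row, read every entry,
    // then write one delete per entry.
    static DeleteCost withPutDelete(long rows, long columnsPerRow) {
        long entries = rows * columnsPerRow;
        return new DeleteCost(rows, entries, entries);
    }

    @Override
    public String toString() {
        return "seeks=" + seeks + " reads=" + reads
                + " entriesWritten=" + entriesWritten;
    }

    public static void main(String[] args) {
        long rows = 10_000_000L; // 10M rows, as in the example above
        long columns = 10L;      // 10 columns per row
        System.out.println("iterator:  " + withRowDeletingIterator(rows));
        System.out.println("putDelete: " + withPutDelete(rows, columns));
    }
}
```

If the columns are known ahead of time, the seeks and reads in the
putDelete case drop to zero and only the 100M delete entries remain.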

>
> -Eric
>
>
>
> On Wed, Oct 9, 2013 at 4:01 PM, David Medinets <[email protected]>
> wrote:
> > Is there any reason to favor one approach over the other?
>
