Qualification: My response misleads. I was responding to the delete of an explicit version and got carried away. Please see Lars' answer for the proper response.

St.Ack

On Mon, Dec 9, 2013 at 1:30 PM, Stack <[email protected]> wrote:

> On Mon, Dec 9, 2013 at 4:47 PM, Niels Basjes <[email protected]> wrote:
>
>>
>> Why has it been designed/implemented like this?
>> What is the logic behind this model?
>>
>
> Hey Niels:
>
> It is probably fair to call this an instance of implementation leaking into
> and polluting our data model. We should fix it.
>
> Currently, deletes always sort before all other types when all other
> coordinates are the same (same row, same column family, same timestamp,
> etc.). IIRC, it was done this way a long time ago because it made delete
> reasoning 'easier'. This forced sort ordering is why you see the behavior
> you note in your shell experiments.
>
> Our Sergey has recently suggested we stop factoring in 'type' when
> sorting KeyValues/Cells; rather, we would distinguish by pivoting on sequence
> id when all else matches. Awkwardly, we'd then have to let the user add a
> sequence id when querying a specific Cell. This would not be easy to do.
> Sequence id is an internal, amorphous notion at the moment -- it exists
> while KeyValues are in flight but is (mostly) dropped after KeyValues
> persist to hfiles -- but it looks like it is fast becoming more tangible
> given some issues that arise around WAL replay at recovery time and in
> corner cases when replicating.
>
> What is your thinking on this, Niels? Does the current implementation
> interfere with your ability to build an app on HBase?
>
> Thanks,
> St.Ack
>
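
For anyone following the thread, here is a rough sketch of the sort ordering described in the quoted message: when row, column family, qualifier and timestamp all match, the delete sorts ahead of the put. This is a toy model, not HBase code. The `Cell` class, `sort_key` and `read_latest` are names made up for illustration, and the delete is assumed to be a version delete that masks only puts at exactly the same timestamp.

```python
from dataclasses import dataclass

PUT = "Put"
DELETE = "Delete"


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a KeyValue/Cell; not the HBase class."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    kind: str          # PUT or DELETE
    value: str = ""


def sort_key(cell):
    # Newest timestamp first; on a full tie the delete sorts before the put.
    type_rank = 0 if cell.kind == DELETE else 1
    return (cell.row, cell.family, cell.qualifier, -cell.timestamp, type_rank)


def read_latest(cells, row, family, qualifier):
    """Return the newest put that is not masked by a same-timestamp delete."""
    deleted_timestamps = set()
    for cell in sorted(cells, key=sort_key):
        if (cell.row, cell.family, cell.qualifier) != (row, family, qualifier):
            continue
        if cell.kind == DELETE:
            # Assumption: a version delete masks only this exact timestamp.
            deleted_timestamps.add(cell.timestamp)
        elif cell.timestamp not in deleted_timestamps:
            return cell.value
    return None


cells = [
    Cell("r1", "cf", "q", 10, PUT, "a"),
    Cell("r1", "cf", "q", 10, DELETE),
    Cell("r1", "cf", "q", 10, PUT, "b"),   # written after the delete
    Cell("r1", "cf", "q", 5, PUT, "old"),
]
print(read_latest(cells, "r1", "cf", "q"))
```

In this toy model the put of "b" stays hidden even though it was written after the delete, because nothing in the sort key records arrival order. A sequence id as the final tiebreaker, as suggested in the quoted message, would be one way to tell the two apart.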
