On Thu, Dec 22, 2011 at 4:49 PM, Aaron Cordova <[email protected]> wrote:
> I think it's fine to consider different versions of 'identical keys', meaning 
> row,colfam,colqual, because in that case the implementation still treats two 
> keys that only differ by timestamp as two unique keys. But I don't think we 
> should allow multiple identical _versions_ of identical keys, to use your 
> terminology. I think we should throw all but one away if the user does happen 
> to try to insert them and if the user wants to aggregate across values, he or 
> she must use different version numbers or timestamps or whatever.
>
> If generating unique timestamps within mutations that want to perform several 
> updates to the same row,colfam,colqual is a problem, why don't we allow the 
> user to 'put()' multiple updates into a mutation, and on the server then 
> assign slightly different timestamps to the identical row,colfam,colqual 
> triples that are found in a mutation. Would that make everyone happy?

This still does not address the issue of separate mutations inserting
the exact same key.  Also, timestamps are only set on the keys in a
mutation if the user does not set them.

So if a table comes to have multiple keys that are exactly the same,
what do you propose?  That we drop them?  Which one will you drop?
One nice thing about Accumulo is that if you wish to have this
behavior, you can very easily write an iterator to do it.  I think you
are proposing that we configure an iterator to do this by default?
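
To make the "which one will you drop?" question concrete, here is a
minimal sketch of the dedup logic such an iterator would apply.  This
is plain Java, not an actual Accumulo iterator (a real one would
implement SortedKeyValueIterator).  The Entry class, the single-string
key encoding, and the keep-the-first-one rule are all assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DropDuplicateKeys {
    // Hypothetical stand-in for one key/value entry. The key string
    // represents the whole key: row, colfam, colqual and timestamp.
    static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Expects input already sorted by key, as a scan would deliver it.
    // Keeps the first entry of each run of identical keys and drops the rest.
    static List<Entry> dropDuplicates(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        String previousKey = null;
        for (Entry e : sorted) {
            if (!e.key.equals(previousKey)) {
                out.add(e);
            }
            previousKey = e.key;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> scan = Arrays.asList(
            new Entry("row1 fam:qual 5", "a"),
            new Entry("row1 fam:qual 5", "b"),   // exact duplicate key
            new Entry("row2 fam:qual 7", "c"));
        for (Entry e : dropDuplicates(scan)) {
            System.out.println(e.key + " -> " + e.value);
        }
    }
}
```

Note that the survivor is simply whichever duplicate the sorted stream
happens to present first.  Nothing in the key itself says which of the
two was written first.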

I think if the user is inserting things with the exact same key and
expecting it to behave like a TreeMap (honor order of arrival), then
it never will.  Even if we drop duplicate keys, we will not achieve
the map behavior you described.
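
A small sketch of the difference, in plain Java.  The treeMapResult
half uses the real java.util.TreeMap.  The keepFirstResult half is a
hypothetical model of "drop all but one" over entries whose keys are
exactly equal, under the assumption that the store records nothing
about arrival order and so may present the duplicates in either order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class MapVersusSortedEntries {
    // Illustrative key, standing for row, colfam, colqual and timestamp.
    static final String KEY = "row1 fam:qual 5";

    // java.util.TreeMap: each put() on an existing key replaces the
    // value, so the result always reflects order of arrival.
    static String treeMapResult(List<String> valuesInArrivalOrder) {
        TreeMap<String, String> map = new TreeMap<>();
        for (String v : valuesInArrivalOrder) {
            map.put(KEY, v);
        }
        return map.get(KEY);
    }

    // Hypothetical dedup rule for exactly equal keys: keep whichever
    // value is presented first. The presentation order is an input
    // here because the keys carry no information about arrival.
    static String keepFirstResult(List<String> valuesAsPresented) {
        return valuesAsPresented.get(0);
    }

    public static void main(String[] args) {
        // "old" was written before "new".
        System.out.println("TreeMap:  " + treeMapResult(Arrays.asList("old", "new")));
        // The same two entries, presented in the two possible orders.
        System.out.println("dedup #1: " + keepFirstResult(Arrays.asList("old", "new")));
        System.out.println("dedup #2: " + keepFirstResult(Arrays.asList("new", "old")));
    }
}
```

The TreeMap always ends up with the last value written.  The dedup
result depends on the presentation order, which the writer does not
control, so dropping duplicates alone does not give map semantics.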
