And just to be clear, since there are several definitions of "key" flying 
around, in the following case:

row1,colfam1,colqual1,4 -> valueA
row1,colfam1,colqual1,5 -> valueB

These can coexist peacefully, although the versioning iterator might suppress 
all but the k most recent versions.
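
As a rough illustration of that version-limiting behavior, here is a minimal 
sketch in plain Python. It does not use Accumulo's iterator API; the function 
name keep_latest_versions and the tuple layout for keys are assumptions made 
up for this example.

```python
from itertools import groupby

def keep_latest_versions(entries, k):
    """Keep at most k newest versions per (row, colfam, colqual).

    entries: iterable of ((row, colfam, colqual, timestamp), value).
    This tuple layout is hypothetical, chosen only for illustration.
    """
    # Sort by column, then by timestamp descending (newest first).
    ordered = sorted(entries, key=lambda e: (e[0][:3], -e[0][3]))
    kept = []
    for _, group in groupby(ordered, key=lambda e: e[0][:3]):
        # Within one column, retain only the first k (newest) versions.
        kept.extend(list(group)[:k])
    return kept

entries = [
    (("row1", "colfam1", "colqual1", 4), "valueA"),
    (("row1", "colfam1", "colqual1", 5), "valueB"),
]
latest_one = keep_latest_versions(entries, 1)
latest_two = keep_latest_versions(entries, 2)
```

With k=2 both entries survive, since they differ by timestamp; with k=1 only 
the timestamp-5 entry is kept.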

In this case:

row1,colfam1,colqual1,4 -> valueA
row1,colfam1,colqual1,4 -> valueB

Accumulo should throw one away arbitrarily. I think what you mentioned, a 
system iterator that performs this logic, would be a good implementation.
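
To make the intended logic concrete, here is a minimal sketch in plain Python, 
not a real Accumulo system iterator. The name drop_duplicate_keys and the 
tuple layout for keys are assumptions for this example only. It keeps 
whichever entry it happens to see first for each fully identical key, so the 
choice of survivor is arbitrary and does not reflect order of arrival.

```python
def drop_duplicate_keys(entries):
    """Keep one entry per fully identical key, timestamp included.

    entries: iterable of ((row, colfam, colqual, timestamp), value).
    This tuple layout is hypothetical, chosen only for illustration.
    """
    seen = set()
    kept = []
    for key, value in entries:
        if key in seen:
            # Same row, colfam, colqual and timestamp: discard the extra.
            continue
        seen.add(key)
        kept.append((key, value))
    return kept

entries = [
    (("row1", "colfam1", "colqual1", 4), "valueA"),
    (("row1", "colfam1", "colqual1", 4), "valueB"),
    (("row1", "colfam1", "colqual1", 5), "valueC"),
]
deduped = drop_duplicate_keys(entries)
```

Entries that differ only by timestamp are left alone; only exact duplicates 
are collapsed to a single entry.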

On Dec 22, 2011, at 5:09 PM, Keith Turner wrote:

> On Thu, Dec 22, 2011 at 4:49 PM, Aaron Cordova <[email protected]> wrote:
>> I think it's fine to consider different versions of 'identical keys', 
>> meaning row,colfam,colqual, because in that case the implementation still 
>> treats two keys that only differ by timestamp as two unique keys. But I 
>> don't think we should allow multiple identical _versions_ of identical keys, 
>> to use your terminology. I think we should throw all but one away if the 
>> user does happen to try to insert them and if the user wants to aggregate 
>> across values, he or she must use different version numbers or timestamps or 
>> whatever.
>> 
>> If generating unique timestamps within mutations that want to perform 
>> several updates to the same row,colfam,colqual is a problem, why don't we 
>> allow the user to 'put()' multiple updates into a mutation, and on the 
>> server then assign slightly different timestamps to the identical 
>> row,colfam,colqual triples that are found in a mutation. Would that make 
>> everyone happy?
> 
> This still does not address the issue of separate mutations inserting
> the exact same key.  Also timestamps are only set on the keys in a
> mutation if the user does not set them.
> 
> So if a table comes to have multiple keys that are exactly the same,
> what do you propose?  That we drop them?  Which one will you drop?
> One nice thing about Accumulo is that if you wish to have this
> behavior, you can very easily write an iterator to do it.  I think you
> are proposing that we configure an iterator to do this by default?
> 
> I think if the user is inserting things with exact same key and
> expecting it to behave like a treemap (honor order of arrival), then
> it never will.  Even if we drop duplicate keys, we will not achieve
> the map behavior you described.
