So should there be a Jira for this? It wouldn’t fully address my concern, though. I wonder whether the “language” should make it more obvious when we are dealing with coordinates (row, family, qualifier, ts) rather than values.
Cosmin

On 3/1/14, 3:30 PM, "Matt Corgan" <[email protected]> wrote:

>Hmm, I don't think KeyValue.hashCode should be including the value. I'm
>surprised it hasn't turned up a bug, but maybe that's because there's
>barely any code relying on it. Looks like KeyValue.equals now farms out
>the work to CellComparator, and maybe KeyValue.hashCode should do the
>same.
>
>Note that CellComparator.hashCode does not include the value.
>
>
>On Fri, Feb 28, 2014 at 10:20 AM, Cosmin Lehene <[email protected]> wrote:
>
>> Thanks Matt, Stack,
>>
>> My question/comment was biased by the perspective of a co-processor
>> implementation, but I guess it may well apply for HBase development.
>> From that perspective you're both in HBase-land and Java-land.
>>
>> A collection of cells needs to be compared to another collection of
>> cells (I'm doing a diff).
>> Java collections will end up comparing individual objects for equality,
>> so it boils down to a Cell object being equal to another Cell object. So
>> from a Java/OO perspective the question is: are two cells with different
>> values equal (i.e. can I swap them?)
>>
>> The HBase answer is indeed yes, they are equal as long as row, family,
>> qualifier, timestamp and type are the same.
>>
>> The Java answer, however, may be different (and hence the expectations
>> of a developer) as, in general, it will be based on the known contract.
>>
>> And the general hashCode contract is
>>
>> * If two objects are equal according to the equals(Object) method, then
>> calling the hashCode method on each of the two objects must produce the
>> same integer result.
>>
>> And the equals javadoc
>>
>> * Note that it is generally necessary to override the {@code hashCode}
>> * method whenever this method is overridden, so as to maintain the
>> * general contract for the {@code hashCode} method, which states
>> * that equal objects must have equal hash codes.
>>
>> But in our case, the object equality will pass but hash codes will be
>> different (https://gist.github.com/clehene/9276434)
>>
>> It's obvious why the behavior is as is in HBase, so rather than
>> nitpicking, I wonder whether this could be made obvious as it may help
>> avoid some unexpected behaviors :)
>>
>> Thanks,
>> Cosmin
>>
>> On 2/27/14, 10:22 AM, "Stack" <[email protected]> wrote:
>>
>> >On Wed, Feb 26, 2014 at 8:31 PM, Matt Corgan <[email protected]> wrote:
>> >....
>> >
>> >> But maybe one of the committers could add a sentence to emphasize that
>> >> value is excluded.
>> >>
>> >
>> >We should underline that data is not considered when comparing Cells
>> >(KeyValues). Apart from the fact that it could make for some interesting
>> >performance issues, the system isn't plumbed for dealing with coordinates
>> >that differ in their value only. Rather, the mvcc/sequenceid is used for
>> >splitting Cells whose coordinates are otherwise the same.
>> >
>> >What was your expectation mighty Cosmin? What do you think HBase should
>> >do with values that differ in value only?
>> >
>> >Thanks,
>> >St.Ack
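
The mismatch discussed in this thread (equality decided on coordinates, hash code that also covers the value) can be sketched with a small stand-in class. This is only an illustration, not HBase code: SimpleCell, its fields, and the hashIncludesValue switch are hypothetical names made up for the example, and the cell type is left out of the coordinates for brevity.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class CellHashDemo {

    /** Hypothetical stand-in for a cell; NOT the real HBase KeyValue/Cell. */
    static final class SimpleCell {
        final String row;
        final String family;
        final String qualifier;
        final long timestamp;
        final byte[] value;
        // Assumption for the demo: toggles whether hashCode also covers the value.
        final boolean hashIncludesValue;

        SimpleCell(String row, String family, String qualifier, long timestamp,
                   byte[] value, boolean hashIncludesValue) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
            this.hashIncludesValue = hashIncludesValue;
        }

        /** Equality looks at the coordinates only, never at the value. */
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SimpleCell)) return false;
            SimpleCell c = (SimpleCell) o;
            return timestamp == c.timestamp
                    && row.equals(c.row)
                    && family.equals(c.family)
                    && qualifier.equals(c.qualifier);
        }

        /** Hashes the coordinates; optionally mixes in the value to show the problem. */
        @Override
        public int hashCode() {
            int h = Objects.hash(row, family, qualifier, timestamp);
            return hashIncludesValue ? 31 * h + Arrays.hashCode(value) : h;
        }
    }

    /** Adds one cell to a HashSet, then looks up an equal cell holding a different value. */
    static boolean foundInSet(boolean hashIncludesValue) {
        SimpleCell a = new SimpleCell("r1", "f", "q", 1L, new byte[] {1}, hashIncludesValue);
        SimpleCell b = new SimpleCell("r1", "f", "q", 1L, new byte[] {2}, hashIncludesValue);
        Set<SimpleCell> cells = new HashSet<>();
        cells.add(a);
        // a.equals(b) is true in both modes; only the hash-based lookup changes.
        return a.equals(b) && cells.contains(b);
    }

    public static void main(String[] args) {
        System.out.println("hash covers value:        " + foundInSet(true));
        System.out.println("hash on coordinates only: " + foundInSet(false));
    }
}
```

In this sketch the first lookup misses even though equals returns true, because the two hash codes differ; once the hash is computed from the same fields as equals, the lookup succeeds. That is the kind of surprise a coprocessor author could hit when diffing collections of cells with hash-based Java collections.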
