RE: Question about AbstractType class
Thanks Sylvain. Your answer already helped me out a lot!

I was using a ByteBuffer.get function that changes the ByteBuffer's position, and I got all kinds of strange effects and exceptions that I didn't get in 0.6.x. I changed that code and all the problems are gone.

Many thanks!!

Ignace

-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Wednesday, April 20, 2011 4:04 PM
To: user@cassandra.apache.org
Subject: Re: Question about AbstractType class

On Wed, Apr 20, 2011 at 3:06 PM, Desimpel, Ignace wrote:
> As said above, the remaining bytes won't (always) be the actual bytes.

Sorry, I answered a bit quickly. I meant to say that the actual bytes won't (always) be the full backing array. That is, we never guarantee that BB.arrayOffset() == 0, nor BB.position() == 0, nor BB.limit() == backingArray.length. But the remaining() bytes will be the actual bytes, my bad.

--
Sylvain
Re: Question about AbstractType class
On Wed, Apr 20, 2011 at 3:06 PM, Desimpel, Ignace wrote:
> As said above, the remaining bytes won't (always) be the actual bytes.

Sorry, I answered a bit quickly. I meant to say that the actual bytes won't (always) be the full backing array. That is, we never guarantee that BB.arrayOffset() == 0, nor BB.position() == 0, nor BB.limit() == backingArray.length. But the remaining() bytes will be the actual bytes, my bad.

--
Sylvain
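To illustrate the point about arrayOffset(), position() and limit(), here is a minimal sketch of copying exactly the remaining() bytes of a buffer without moving its position. The class name `RemainingBytes` and the method name `copy` are hypothetical, chosen for this example only; they are not part of Cassandra. When the buffer has an accessible backing array, the first readable byte is at index arrayOffset() + position(); when it does not (for example a direct or read-only buffer), the sketch falls back to reading through a duplicate.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not a Cassandra class.
final class RemainingBytes {

    private RemainingBytes() {}

    /**
     * Returns a copy of the bytes between position() and limit().
     * The position and limit of the buffer passed in are left unchanged.
     */
    static byte[] copy(ByteBuffer bb) {
        byte[] out = new byte[bb.remaining()];
        if (bb.hasArray()) {
            // The backing array may be larger than the readable region:
            // the first readable byte sits at arrayOffset() + position().
            System.arraycopy(bb.array(), bb.arrayOffset() + bb.position(), out, 0, out.length);
        } else {
            // No accessible backing array (e.g. direct or read-only buffer).
            // A duplicate shares the content but has its own position,
            // so the relative bulk get() only moves the duplicate.
            bb.duplicate().get(out);
        }
        return out;
    }
}
```

Calling `RemainingBytes.copy(o2)` from inside a compare function would give the column name bytes while leaving `o2` untouched, at the cost of one allocation per call.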
RE: Question about AbstractType class
-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com]
Sent: Wednesday, April 20, 2011 2:07 PM
To: user@cassandra.apache.org
Subject: Re: Question about AbstractType class

On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace wrote:
> Cassandra version 0.7.4
>
> Hi,
>
> I created my own java class as an extension of the AbstractType class.
> But I'm not sure about the following items related to the compare function :
>
> # The remaining bytes of the buffer sometimes is zero during thrift
> get_slice execution, however I never store any zero length column name
> nor query for it. If normal, what would be the correct handling of
> the zero remaining bytes?

It is normal, the empty ByteBuffer is used in slice queries to indicate the beginning of the row (start=""). More generally, compare and validate should work for anything you store but also anything you provide for the 'start' and 'end' argument of slices.

> Would it be something like :
>
> public int compare(ByteBuffer o1, ByteBuffer o2){
>     int ar1Rem = o1.remaining();
>     int ar2Rem = o2.remaining();
>     if ( ar1Rem == 0 || ar2Rem == 0 ) {
>         if ( ar1Rem != 0 ) {
>             return 1;
>         } else if ( ar2Rem != 0 ) {
>             return -1;
>         } else {
>             return 0;
>         }
>     }
>     //Add the real compare here
>     ...}

That looks reasonable (though not optimal in the number of comparisons :))

->OK

> # Since in version 0.6.3 the same function was passing an array of
> bytes, I assumed that I could now call the ByteBuffer.array() function
> in order to get the array of bytes backing up the ByteBuffer.

It's not that simple. First, even if you use ByteBuffer.array(), you'll have to be careful that the ByteBuffer has a position, a limit and an arrayOffset and you should take that into account when accessing the backing array. But there is also no guarantee that the ByteBuffer will have a backing array so you need to handle this case too (I refer you to the ByteBuffer documentation).

->OK

> Also the length of the
> byte array in 0.6.3 seemed always to correspond to the bytes of column
> name stored. But now in version 0.7.4 that ByteBuffer is not always
> backed by such an array.
>
> I can still get around this by making the needed buffer myself like :
>
> int ar2Rem = o2.remaining();
> byte[] ar2 = new byte[ar2Rem];
> o2.get(ar2, 0, ar2Rem);
>
> Question is : Are the remaining bytes the actual bytes for this column
> name (eg: 20 bytes) or would that ByteBuffer ever be some wrapper around
> some larger stream of data and the remaining bytes number could be 10 M bytes.
> Thus I would not be able to detect the end of the column to compare
> and I would possibly be allocating a large unneeded byte array?

As said above, the remaining bytes won't (always) be the actual bytes.

->Then how do I know the end is near? E.g.: if the stored value is a char string, it would be nice to know the end. Unless I also store the length before the char string.

->Assuming that both ByteBuffers have the same data and the same position and limit, thus the same remaining, one can imagine a loop comparing each byte until the remaining is used up. At that point I cannot get any more data, and thus I should return 0?

> #Using the ByteBuffer's 'get' function also updates the position of
> the ByteBuffer. Is the compare function expected to do that or should
> it reset the position back to what it was or ...?

Neither. You should *not* use any function that changes the ByteBuffer position. That is, changing it and resetting it afterward is *not* ok.

->OK

Instead you should only use the absolute get() methods, that do not change the position at all. Or, you start your compare function by calling BB.duplicate() on both buffers and then you're free to change the position of the duplicates.

->OK

--
Sylvain

Thanks Sylvain!
Re: Question about AbstractType class
On Wed, Apr 20, 2011 at 1:35 PM, Desimpel, Ignace wrote:
> Cassandra version 0.7.4
>
> Hi,
>
> I created my own java class as an extension of the AbstractType class. But
> I’m not sure about the following items related to the compare function :
>
> # The remaining bytes of the buffer sometimes is zero during thrift
> get_slice execution, however I never store any zero length column name nor
> query for it. If normal, what would be the correct handling of the zero
> remaining bytes?

It is normal, the empty ByteBuffer is used in slice queries to indicate the beginning of the row (start=""). More generally, compare and validate should work for anything you store but also anything you provide for the 'start' and 'end' argument of slices.

> Would it be something like :
>
> public int compare(ByteBuffer o1, ByteBuffer o2){
>     int ar1Rem = o1.remaining();
>     int ar2Rem = o2.remaining();
>     if ( ar1Rem == 0 || ar2Rem == 0 ) {
>         if ( ar1Rem != 0 ) {
>             return 1;
>         } else if ( ar2Rem != 0 ) {
>             return -1;
>         } else {
>             return 0;
>         }
>     }
>     //Add the real compare here
>     …….}

That looks reasonable (though not optimal in the number of comparisons :))

> # Since in version 0.6.3 the same function was passing an array of bytes, I
> assumed that I could now call the ByteBuffer.array() function in order to
> get the array of bytes backing up the ByteBuffer.

It's not that simple. First, even if you use ByteBuffer.array(), you'll have to be careful that the ByteBuffer has a position, a limit and an arrayOffset and you should take that into account when accessing the backing array. But there is also no guarantee that the ByteBuffer will have a backing array so you need to handle this case too (I refer you to the ByteBuffer documentation).

> Also the length of the
> byte array in 0.6.3 seemed always to correspond to the bytes of column name
> stored. But now in version 0.7.4 that ByteBuffer is not always backed by
> such an array.
>
> I can still get around this by making the needed buffer myself like :
>
> int ar2Rem = o2.remaining();
> byte[] ar2 = new byte[ar2Rem];
> o2.get(ar2, 0, ar2Rem);
>
> Question is : Are the remaining bytes the actual bytes for this column name
> (eg: 20 bytes) or would that ByteBuffer ever be some wrapper around some
> larger stream of data and the remaining bytes number could be 10 M bytes.
> Thus I would not be able to detect the end of the column to compare and I
> would possibly be allocating a large unneeded byte array?

As said above, the remaining bytes won't (always) be the actual bytes.

> #Using the ByteBuffer’s ‘get’ function also updates the position of the
> ByteBuffer. Is the compare function expected to do that or should it reset
> the position back to what it was or …?

Neither. You should *not* use any function that changes the ByteBuffer position. That is, changing it and resetting it afterward is *not* ok.

Instead you should only use the absolute get() methods, that do not change the position at all. Or, you start your compare function by calling BB.duplicate() on both buffers and then you're free to change the position of the duplicates.

--
Sylvain
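The absolute-get approach described above could look roughly like the following sketch. It is only an illustration under stated assumptions: the class name `PositionSafeCompare` is hypothetical, and the unsigned byte-by-byte ordering is a placeholder, since the thread does not say what ordering the custom type actually needs. An empty buffer sorts before any non-empty one, which matches the zero-length handling proposed in the question.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; the ordering used here is an assumption.
final class PositionSafeCompare {

    private PositionSafeCompare() {}

    /**
     * Compares the remaining bytes of two buffers lexicographically, treating
     * each byte as unsigned. Only the absolute get(int) is used, so neither
     * buffer's position nor limit is modified.
     */
    static int compare(ByteBuffer o1, ByteBuffer o2) {
        int i1 = o1.position();
        int i2 = o2.position();
        int end1 = o1.limit();
        int end2 = o2.limit();
        while (i1 < end1 && i2 < end2) {
            // Absolute indices are relative to the start of the buffer,
            // not to its current position.
            int b1 = o1.get(i1++) & 0xFF;
            int b2 = o2.get(i2++) & 0xFF;
            if (b1 != b2) {
                return b1 < b2 ? -1 : 1;
            }
        }
        // One buffer is a prefix of the other (or both are exhausted):
        // the shorter one sorts first, so an empty buffer sorts before everything.
        return Integer.compare(o1.remaining(), o2.remaining());
    }
}
```

The alternative mentioned in the reply, calling duplicate() on both buffers first, would allow relative get() calls on the duplicates while leaving the original buffers untouched.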
Question about AbstractType class
Cassandra version 0.7.4

Hi,

I created my own java class as an extension of the AbstractType class. But I'm not sure about the following items related to the compare function :

# The remaining bytes of the buffer sometimes is zero during thrift get_slice execution, however I never store any zero length column name nor query for it. If normal, what would be the correct handling of the zero remaining bytes? Would it be something like :

public int compare(ByteBuffer o1, ByteBuffer o2){
    int ar1Rem = o1.remaining();
    int ar2Rem = o2.remaining();
    if ( ar1Rem == 0 || ar2Rem == 0 ) {
        if ( ar1Rem != 0 ) {
            return 1;
        } else if ( ar2Rem != 0 ) {
            return -1;
        } else {
            return 0;
        }
    }
    //Add the real compare here
    ...}

# Since in version 0.6.3 the same function was passing an array of bytes, I assumed that I could now call the ByteBuffer.array() function in order to get the array of bytes backing up the ByteBuffer. Also the length of the byte array in 0.6.3 seemed always to correspond to the bytes of column name stored. But now in version 0.7.4 that ByteBuffer is not always backed by such an array.

I can still get around this by making the needed buffer myself like :

int ar2Rem = o2.remaining();
byte[] ar2 = new byte[ar2Rem];
o2.get(ar2, 0, ar2Rem);

Question is : Are the remaining bytes the actual bytes for this column name (eg: 20 bytes) or would that ByteBuffer ever be some wrapper around some larger stream of data and the remaining bytes number could be 10 M bytes. Thus I would not be able to detect the end of the column to compare and I would possibly be allocating a large unneeded byte array?

# Using the ByteBuffer's 'get' function also updates the position of the ByteBuffer. Is the compare function expected to do that or should it reset the position back to what it was or ...?

Or maybe there is some good documentation I should read?

Ignace