Re: Problems storing RoaringBitmaps

2017-03-17 Thread Denis Magda
Hi Luke,

Is there any chance you can create a similar test using Java so that we can run 
it on our side?

In the meantime, the warning below just says that your object can’t be 
serialized into Ignite’s binary form:

>>> 10:16:45.730 [djura.local ~ nREPL-worker-3 ~ 
>>> o.a.i.internal.binary.BinaryContext] Class 
>>> "org.roaringbitmap.RoaringBitmap" cannot be serialized using 
>>> BinaryMarshaller because it either implements Externalizable interface or 
>>> have writeObject/readObject methods. OptimizedMarshaller will be used 
>>> instead and class instances will be deserialized on the server. Please 
>>> ensure that all nodes have this class in classpath. To enable binary 
>>> serialization either implement Binarylizable interface or set explicit 
>>> serializer using BinaryTypeConfiguration.setSerializer() method. 

Usually this happens when the object’s class implements Externalizable or 
defines writeObject/readObject methods. Refer to the “restrictions” callout on 
this page:
https://apacheignite.readme.io/docs/binary-marshaller#section-basic-concepts 
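
The condition named in the warning can be approximated with plain reflection. The sketch below is only an illustration of that condition, not Ignite's actual check; the class name CustomSerializationCheck and the two sample types are hypothetical.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class CustomSerializationCheck {
    /** True if the class is Externalizable or it (or a superclass) declares writeObject/readObject. */
    static boolean usesCustomSerialization(Class<?> cls) {
        if (Externalizable.class.isAssignableFrom(cls)) {
            return true;
        }
        for (Class<?> c = cls; c != null && c != Object.class; c = c.getSuperclass()) {
            if (declares(c, "writeObject", ObjectOutputStream.class)
                    || declares(c, "readObject", ObjectInputStream.class)) {
                return true;
            }
        }
        return false;
    }

    private static boolean declares(Class<?> c, String name, Class<?> param) {
        try {
            c.getDeclaredMethod(name, param);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Hypothetical sample type with a private writeObject hook. */
    static final class WithHook implements Serializable {
        private static final long serialVersionUID = 1L;

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
        }
    }

    /** Hypothetical sample type with no serialization hooks. */
    static final class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
    }

    public static void main(String[] args) {
        System.out.println(usesCustomSerialization(WithHook.class)); // prints "true"
        System.out.println(usesCustomSerialization(Plain.class));    // prints "false"
    }
}
```

Running a check like this against a third-party class is a quick way to predict whether the warning will appear before putting instances into a cache.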


In general, this warning is not severe. It just means that the class must be 
on the classpath of every cluster node if the object might be deserialized on 
the server side (during SQL query execution, for instance).

> Well, this introduces other issues, as Clojure's immutable data structures 
> rely on different semantics for hashCode. So now I get duplicate keys when I 
> use BinaryMarshaller.


Why do you use this object as a key? Just in case, take a look here:
https://apacheignite.readme.io/docs/binary-marshaller#handling-hash-code-generation-and-equals-execution
 


—
Denis

> On Mar 15, 2017, at 5:19 PM, Luke Burton  wrote:
> 
> 
> Well, this introduces other issues, as Clojure's immutable data structures 
> rely on different semantics for hashCode. So now I get duplicate keys when I 
> use BinaryMarshaller.
> 
> Switching back to OptimizedMarshaller, this duplicate key problem goes away 
> (I'll deal with having to do this another time). But now there is a new 
> problem: wrong byte order, presumably due to RoaringBitmap's endian order: 
> https://github.com/RoaringBitmap/RoaringBitmap/issues/47 
> 
> 
> You can see it clearly here in the last 32 bits:
> 
> 3A 30 00 00 02 00 00 00  01 00 00 00 98 00 00 00  .0..
> 18 00 00 00 1A 00 00 00  9F 86 7F 96  
> 3A 30 00 00 02 00 00 00  01 00 00 00 98 00 00 00  .0..
> 18 00 00 00 1A 00 00 00  86 9F 96 7F  
> 
> Frustrating! Is there a way to customize the serializer for 
> OptimizedMarshaller, or otherwise stop it from twiddling these bits?
> 
> 
>> On Mar 15, 2017, at 11:24 AM, Luke Burton wrote:
>> 
>> 
>> I did just manage to use BinarySerializer and registered it as a custom 
>> serializer, it seems to work pending some more tests. Again, curious to hear 
>> if this approach is appropriate for the use case. Here's the Clojure version:
>> 
>> (defn roaring-serializer []
>>   (reify BinarySerializer
>>     (writeBinary [this o binaryWriter]
>>       ; o contains our object.
>>       (let [byte-stream (ByteArrayOutputStream.)]
>>         (with-open [out-stream (DataOutputStream. byte-stream)]
>>           (.serialize ^RoaringBitmap o out-stream))
>>         (.writeByteArray binaryWriter "val" (.toByteArray byte-stream))))
>> 
>>     (readBinary [this o binaryReader]
>>       ; o contains an empty object
>>       (let [raw-bytes (.readByteArray binaryReader "val")
>>             byte-stream (ByteArrayInputStream. raw-bytes)]
>>         (with-open [in-stream (DataInputStream. byte-stream)]
>>           (.deserialize ^RoaringBitmap o in-stream))))))
>> 
>> 
>> 
>> 
>>> On Mar 15, 2017, at 10:34 AM, Luke Burton wrote:
>>> 
>>> 
>>> Hi there,
>>> 
>>> I'm storing RoaringBitmaps in Ignite and have encountered an odd 
>>> serialization issue. Please forgive the samples below being in Clojure, 
>>> I've written a small wrapper around most Ignite APIs that I can use. I 
>>> think you can catch the gist of what I'm doing, I hope :)
>>> 
>>> (let [inst (i/instance "attribute_bitmaps")
>>>       bmp (doto (RoaringBitmap.)
>>>             (.add 999))
>>>       srz (fn [it]
>>>             (let [b (ByteArrayOutputStream.)]
>>>               (with-open [o (DataOutputStream. b)]
>>>                 (.serialize it o))
>>>               (.toByteArray b)))]
>>> 
>>>   (doto inst
>>>     (i/clear)
>>>     (i/put "test" bmp))
>>> 
>>>   (byte-streams/print-bytes (srz bmp))
>>>   (byte-streams/print-bytes (srz (i/get inst "test"))))
>>> 
>>> Here I'm just creating a bitmap and storing it in a cache. I'm then just 
>>> printing the bytes

Re: Problems storing RoaringBitmaps

2017-03-17 Thread Luke Burton

Well, this introduces other issues, as Clojure's immutable data structures rely 
on different semantics for hashCode. So now I get duplicate keys when I use 
BinaryMarshaller.

Switching back to OptimizedMarshaller, this duplicate key problem goes away 
(I'll deal with having to do this another time). But now there is a new 
problem: wrong byte order, presumably due to RoaringBitmap's endian order: 
https://github.com/RoaringBitmap/RoaringBitmap/issues/47 


You can see it clearly here in the last 32 bits:

3A 30 00 00 02 00 00 00  01 00 00 00 98 00 00 00  .0..
18 00 00 00 1A 00 00 00  9F 86 7F 96  
3A 30 00 00 02 00 00 00  01 00 00 00 98 00 00 00  .0..
18 00 00 00 1A 00 00 00  86 9F 96 7F  

Frustrating! Is there a way to customize the serializer for 
OptimizedMarshaller, or otherwise stop it from twiddling these bits?
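
One reading of the two dumps is that each 16-bit value in the tail had its two bytes exchanged, which is what happens when shorts written in one byte order are read back in the other. The sketch below only reproduces that transformation on the four trailing bytes from the dump; it does not claim to show what OptimizedMarshaller does internally, and the class name ShortByteOrderDemo is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ShortByteOrderDemo {
    /** Reads the input as little-endian 16-bit values and rewrites them big-endian. */
    static byte[] swapShortOrder(byte[] in) {
        ByteBuffer src = ByteBuffer.wrap(in).order(ByteOrder.LITTLE_ENDIAN);
        ByteBuffer dst = ByteBuffer.allocate(in.length).order(ByteOrder.BIG_ENDIAN);
        while (src.remaining() >= 2) {
            dst.putShort(src.getShort());
        }
        if (src.hasRemaining()) {
            dst.put(src.get()); // an odd trailing byte is copied unchanged
        }
        return dst.array();
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Last four bytes of the first dump (before the cache round trip).
        byte[] before = {(byte) 0x9F, (byte) 0x86, 0x7F, (byte) 0x96};
        System.out.println(hex(swapShortOrder(before))); // prints "86 9F 96 7F"
    }
}
```

The output matches the last four bytes of the second dump, which is consistent with a per-short byte-order flip rather than random corruption.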


> On Mar 15, 2017, at 11:24 AM, Luke Burton  wrote:
> 
> 
> I did just manage to use BinarySerializer and registered it as a custom 
> serializer, it seems to work pending some more tests. Again, curious to hear 
> if this approach is appropriate for the use case. Here's the Clojure version:
> 
> (defn roaring-serializer []
>   (reify BinarySerializer
>     (writeBinary [this o binaryWriter]
>       ; o contains our object.
>       (let [byte-stream (ByteArrayOutputStream.)]
>         (with-open [out-stream (DataOutputStream. byte-stream)]
>           (.serialize ^RoaringBitmap o out-stream))
>         (.writeByteArray binaryWriter "val" (.toByteArray byte-stream))))
> 
>     (readBinary [this o binaryReader]
>       ; o contains an empty object
>       (let [raw-bytes (.readByteArray binaryReader "val")
>             byte-stream (ByteArrayInputStream. raw-bytes)]
>         (with-open [in-stream (DataInputStream. byte-stream)]
>           (.deserialize ^RoaringBitmap o in-stream))))))
> 
> 
> 
> 
>> On Mar 15, 2017, at 10:34 AM, Luke Burton  wrote:
>> 
>> 
>> Hi there,
>> 
>> I'm storing RoaringBitmaps in Ignite and have encountered an odd 
>> serialization issue. Please forgive the samples below being in Clojure, I've 
>> written a small wrapper around most Ignite APIs that I can use. I think you 
>> can catch the gist of what I'm doing, I hope :)
>> 
>> (let [inst (i/instance "attribute_bitmaps")
>>       bmp (doto (RoaringBitmap.)
>>             (.add 999))
>>       srz (fn [it]
>>             (let [b (ByteArrayOutputStream.)]
>>               (with-open [o (DataOutputStream. b)]
>>                 (.serialize it o))
>>               (.toByteArray b)))]
>> 
>>   (doto inst
>>     (i/clear)
>>     (i/put "test" bmp))
>> 
>>   (byte-streams/print-bytes (srz bmp))
>>   (byte-streams/print-bytes (srz (i/get inst "test"))))
>> 
>> Here I'm just creating a bitmap and storing it in a cache. I'm then just 
>> printing the bytes of the thing I stored, as well as what it looks like 
>> coming back out of Ignite.
>> 
>> I get the following warning in the logs:
>> 
>> 10:16:45.730 [djura.local ~ nREPL-worker-3 ~ 
>> o.a.i.internal.binary.BinaryContext] Class "org.roaringbitmap.RoaringBitmap" 
>> cannot be serialized using BinaryMarshaller because it either implements 
>> Externalizable interface or have writeObject/readObject methods. 
>> OptimizedMarshaller will be used instead and class instances will be 
>> deserialized on the server. Please ensure that all nodes have this class in 
>> classpath. To enable binary serialization either implement Binarylizable 
>> interface or set explicit serializer using 
>> BinaryTypeConfiguration.setSerializer() method. 
>> 
>> And really strangely, I get the same number of bytes back but some of them 
>> at the end have been zero'd out (first one is correct, second one went 
>> through Ignite):
>> 
>> 3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
>> 7F 96  ..
>> 3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
>> 00 00  ..
>> 
>> The resulting object is still a valid RoaringBitmap, except all the values 
>> in the bitmap are wrong! 
>> 
>> I'm guessing from this and the logs, that OptimizedMarshaller is being used 
>> instead of BinaryMarshaller, and OptimizedMarshaller is not copying the 
>> internal fields of the class correctly.
>> 
>> Would the recommended approach here be to create a custom class that extends 
>> RoaringBitmap and implements Binarylizable? I'm not sure if Binarylizable is 
>> a suitable approach for this situation where I don't control the source for 
>> the class in question. I have no knowledge of the internal fields of this 
>> class and really just want to ensure it survives the roundtrip through 
>> Ignite by using its own internal serialization mechanism.
>> 
>> Luke.
>> 
> 



Re: Problems storing RoaringBitmaps

2017-03-15 Thread Luke Burton

I did just manage to implement BinarySerializer and register it as a custom 
serializer; it seems to work, pending some more tests. Again, curious to hear if 
this approach is appropriate for the use case. Here's the Clojure version:

(defn roaring-serializer []
  (reify BinarySerializer
    (writeBinary [this o binaryWriter]
      ; o contains our object.
      (let [byte-stream (ByteArrayOutputStream.)]
        (with-open [out-stream (DataOutputStream. byte-stream)]
          (.serialize ^RoaringBitmap o out-stream))
        (.writeByteArray binaryWriter "val" (.toByteArray byte-stream))))

    (readBinary [this o binaryReader]
      ; o contains an empty object
      (let [raw-bytes (.readByteArray binaryReader "val")
            byte-stream (ByteArrayInputStream. raw-bytes)]
        (with-open [in-stream (DataInputStream. byte-stream)]
          (.deserialize ^RoaringBitmap o in-stream))))))
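
The core of the serializer above is a round trip through a byte array that relies on the object's own serialize/deserialize methods. A minimal Java sketch of just that round trip follows. IntBag is a hypothetical stand-in for a self-serializing type, used so the example runs without Ignite or RoaringBitmap on the classpath; the class and helper names are assumptions, not part of either library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class ByteArrayRoundTrip {
    /** Hypothetical stand-in for a type that knows how to serialize itself. */
    static final class IntBag {
        int[] values = new int[0];

        void serialize(DataOutput out) throws IOException {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        }

        void deserialize(DataInput in) throws IOException {
            int n = in.readInt();
            values = new int[n];
            for (int i = 0; i < n; i++) {
                values[i] = in.readInt();
            }
        }
    }

    /** Write side: let the object produce its own bytes. */
    static byte[] toBytes(IntBag bag) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            bag.serialize(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Read side: fill an empty instance from the stored bytes. */
    static IntBag fromBytes(byte[] raw) {
        IntBag bag = new IntBag();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            bag.deserialize(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bag;
    }

    public static void main(String[] args) {
        IntBag original = new IntBag();
        original.values = new int[] {999, 100_000};
        IntBag copy = fromBytes(toBytes(original));
        System.out.println(Arrays.equals(original.values, copy.values)); // prints "true"
    }
}
```

In the serializer, toBytes corresponds to what writeBinary stores under the "val" field, and fromBytes to what readBinary does with the bytes it reads back.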




> On Mar 15, 2017, at 10:34 AM, Luke Burton  wrote:
> 
> 
> Hi there,
> 
> I'm storing RoaringBitmaps in Ignite and have encountered an odd 
> serialization issue. Please forgive the samples below being in Clojure, I've 
> written a small wrapper around most Ignite APIs that I can use. I think you 
> can catch the gist of what I'm doing, I hope :)
> 
> (let [inst (i/instance "attribute_bitmaps")
>       bmp (doto (RoaringBitmap.)
>             (.add 999))
>       srz (fn [it]
>             (let [b (ByteArrayOutputStream.)]
>               (with-open [o (DataOutputStream. b)]
>                 (.serialize it o))
>               (.toByteArray b)))]
> 
>   (doto inst
>     (i/clear)
>     (i/put "test" bmp))
> 
>   (byte-streams/print-bytes (srz bmp))
>   (byte-streams/print-bytes (srz (i/get inst "test"))))
> 
> Here I'm just creating a bitmap and storing it in a cache. I'm then just 
> printing the bytes of the thing I stored, as well as what it looks like 
> coming back out of Ignite.
> 
> I get the following warning in the logs:
> 
> 10:16:45.730 [djura.local ~ nREPL-worker-3 ~ 
> o.a.i.internal.binary.BinaryContext] Class "org.roaringbitmap.RoaringBitmap" 
> cannot be serialized using BinaryMarshaller because it either implements 
> Externalizable interface or have writeObject/readObject methods. 
> OptimizedMarshaller will be used instead and class instances will be 
> deserialized on the server. Please ensure that all nodes have this class in 
> classpath. To enable binary serialization either implement Binarylizable 
> interface or set explicit serializer using 
> BinaryTypeConfiguration.setSerializer() method. 
> 
> And really strangely, I get the same number of bytes back but some of them at 
> the end have been zero'd out (first one is correct, second one went through 
> Ignite):
> 
> 3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
> 7F 96  ..
> 3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
> 00 00  ..
> 
> The resulting object is still a valid RoaringBitmap, except all the values in 
> the bitmap are wrong! 
> 
> I'm guessing from this and the logs, that OptimizedMarshaller is being used 
> instead of BinaryMarshaller, and OptimizedMarshaller is not copying the 
> internal fields of the class correctly.
> 
> Would the recommended approach here be to create a custom class that extends 
> RoaringBitmap and implements Binarylizable? I'm not sure if Binarylizable is 
> a suitable approach for this situation where I don't control the source for 
> the class in question. I have no knowledge of the internal fields of this 
> class and really just want to ensure it survives the roundtrip through Ignite 
> by using its own internal serialization mechanism.
> 
> Luke.
> 



Problems storing RoaringBitmaps

2017-03-15 Thread Luke Burton

Hi there,

I'm storing RoaringBitmaps in Ignite and have encountered an odd serialization 
issue. Please forgive the samples below being in Clojure, I've written a small 
wrapper around most Ignite APIs that I can use. I think you can catch the gist 
of what I'm doing, I hope :)

(let [inst (i/instance "attribute_bitmaps")
      bmp (doto (RoaringBitmap.)
            (.add 999))
      srz (fn [it]
            (let [b (ByteArrayOutputStream.)]
              (with-open [o (DataOutputStream. b)]
                (.serialize it o))
              (.toByteArray b)))]

  (doto inst
    (i/clear)
    (i/put "test" bmp))

  (byte-streams/print-bytes (srz bmp))
  (byte-streams/print-bytes (srz (i/get inst "test"))))

Here I'm just creating a bitmap and storing it in a cache. I'm then just 
printing the bytes of the thing I stored, as well as what it looks like coming 
back out of Ignite.

I get the following warning in the logs:

10:16:45.730 [djura.local ~ nREPL-worker-3 ~ 
o.a.i.internal.binary.BinaryContext] Class "org.roaringbitmap.RoaringBitmap" 
cannot be serialized using BinaryMarshaller because it either implements 
Externalizable interface or have writeObject/readObject methods. 
OptimizedMarshaller will be used instead and class instances will be 
deserialized on the server. Please ensure that all nodes have this class in 
classpath. To enable binary serialization either implement Binarylizable 
interface or set explicit serializer using 
BinaryTypeConfiguration.setSerializer() method. 

And really strangely, I get the same number of bytes back, but some of them at 
the end have been zeroed out (the first dump is correct, the second went 
through Ignite):

3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
7F 96  ..
3A 30 00 00 01 00 00 00  98 00 00 00 10 00 00 00  .0..
00 00  ..

The resulting object is still a valid RoaringBitmap, except all the values in 
the bitmap are wrong! 
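
A byte-by-byte comparison makes it easy to pin down where the two dumps above diverge. This is a minimal sketch: the class and helper names are hypothetical, and the byte values are copied from the dumps.

```java
final class FirstMismatch {
    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // One array is a prefix of the other: they differ where the shorter one ends.
        return a.length == b.length ? -1 : common;
    }

    /** Convenience helper so hex literals above 0x7F need no casts. */
    static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] stored = bytes(0x3A, 0x30, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
                              0x98, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
                              0x7F, 0x96);
        byte[] fetched = bytes(0x3A, 0x30, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
                               0x98, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
                               0x00, 0x00);
        System.out.println(firstMismatch(stored, fetched)); // prints "16"
    }
}
```

The first 16 bytes survive the round trip intact, and only the final two bytes differ.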

I'm guessing from this and the logs, that OptimizedMarshaller is being used 
instead of BinaryMarshaller, and OptimizedMarshaller is not copying the 
internal fields of the class correctly.

Would the recommended approach here be to create a custom class that extends 
RoaringBitmap and implements Binarylizable? I'm not sure if Binarylizable is a 
suitable approach for this situation where I don't control the source for the 
class in question. I have no knowledge of the internal fields of this class and 
really just want to ensure it survives the roundtrip through Ignite by using 
its own internal serialization mechanism.

Luke.