Re: [infinispan-dev] data interoperability and remote querying

Manik Surtani Wed, 10 Apr 2013 09:45:43 -0700

Yes.  We haven't quite designed how remote querying will work, but we have a 
few ideas.  First, let me explain how in-VM indexing works.  An object's 
fields are appropriately annotated so that when it is stored in Infinispan with 
a put(), Hibernate Search can extract the fields and values, flatten them into 
a Lucene-friendly "document", and associate it with the entry's key for searching 
later.
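
To make that flow concrete, here is a rough sketch of the idea in plain Java.  
It is not Hibernate Search or Infinispan code: the @IndexedField annotation, 
the toDocument() helper and the toy put() are all made-up names, and a real 
Lucene document is replaced by a simple map.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from Infinispan or Hibernate Search.
class InVmIndexingSketch {

    // Stand-in for a real indexing annotation placed on a field.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface IndexedField {}

    static class Person {
        @IndexedField String name;
        @IndexedField int age;
        String passwordHash; // not annotated, so it never reaches the index

        Person(String name, int age, String passwordHash) {
            this.name = name;
            this.age = age;
            this.passwordHash = passwordHash;
        }
    }

    // Flatten the annotated fields into a flat field-name -> value "document".
    static Map<String, String> toDocument(Object value) {
        Map<String, String> doc = new TreeMap<>();
        for (Field f : value.getClass().getDeclaredFields()) {
            if (!f.isAnnotationPresent(IndexedField.class)) {
                continue;
            }
            f.setAccessible(true);
            try {
                doc.put(f.getName(), String.valueOf(f.get(value)));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return doc;
    }

    // Toy cache: put() stores the entry and indexes its document under the same key.
    static final Map<String, Object> store = new HashMap<>();
    static final Map<String, Map<String, String>> index = new HashMap<>();

    static void put(String key, Object value) {
        store.put(key, value);
        index.put(key, toDocument(value));
    }
}
```

Calling put("p1", new Person(...)) leaves a document for "p1" in the index 
that holds only the annotated fields, which is what a later search would run 
against.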

Now, one approach to doing this when storing objects remotely is to pick a 
serialisation format that can be parsed on the server side for easy indexing.  
An example of this could be JSON (an appropriate transformation will need to 
exist on the server side to strip out irrelevant fields before indexing).  This 
would be completely platform-independent, and would also support the interop 
you described below.  The drawback?  Slow JSON serialisation and 
deserialisation, and a very verbose data stream.
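
As a small illustration of the verbosity point, the sketch below encodes the 
same two fields once as JSON and once with java.io.DataOutputStream.  The 
class and method names are made up, the JSON is built by hand and assumes the 
name needs no escaping, and DataOutputStream merely stands in for whatever 
efficient marshaller would really be used.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch comparing a self-describing encoding with a compact binary one.
class JsonVersusBinarySketch {

    // JSON carries the field names in every value, so a server can parse it
    // without knowing the Java class. Assumes 'name' contains nothing that needs escaping.
    static byte[] asJson(String name, int age) {
        String json = "{\"name\":\"" + name + "\",\"age\":" + age + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Binary form: only the values are written, so the reader must already know the layout.
    static byte[] asBinary(String name, int age) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(name); // 2-byte length prefix followed by the encoded string
            out.writeInt(age);  // fixed 4 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Comparing asJson(...).length with asBinary(...).length for the same input 
shows the overhead of repeating field names in every value.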

Another approach may be to perform the field extraction on the client side, so 
that the data sent to the server would be key=XXX (binary), value=YYY (binary), 
indexing_metadata=ZZZ (JSON).  This way the server does not need to be able to 
parse the value for indexing, since the field data it needs is already provided 
in a platform-independent manner (JSON).  The benefit here is that keys and 
values can still be binary, and can use an efficient marshaller.  The drawback 
is that field extraction needs to happen on the client.  Not hard for the Java 
client (bits of Hibernate Search could be reused), but for non-Java clients 
this may increase the complexity of those clients quite a bit (though it is 
much easier for dynamic-language clients such as Python or Ruby).  This 
approach does *not* solve your 
problem below, because for interop you will still need a platform-independent 
serialisation mechanism like Avro or ProtoBufs for the object <--> blob <--> 
object conversion.
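
Roughly, the client side of this second approach could look like the sketch 
below.  This is not the Hot Rod wire format or any existing client API: 
RemotePut, prepare() and the tiny JSON writer are invented for illustration, 
and the key and value bytes are assumed to come from whatever marshaller the 
client already uses.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch of a client-side "extract, then send" step.
class ClientSideExtractionSketch {

    // What the client would send: opaque key and value, plus portable index metadata.
    static final class RemotePut {
        final byte[] key;
        final byte[] value;
        final String indexingMetadataJson;

        RemotePut(byte[] key, byte[] value, String indexingMetadataJson) {
            this.key = key;
            this.value = value;
            this.indexingMetadataJson = indexingMetadataJson;
        }
    }

    // Minimal JSON object writer for string-valued fields. It escapes only
    // backslashes and double quotes; control characters are not handled.
    static String toJson(Map<String, String> fields) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            json.add(quote(e.getKey()) + ":" + quote(e.getValue()));
        }
        return json.toString();
    }

    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // The client extracts the indexable fields itself, so the server never
    // has to parse the marshalled value in order to index it.
    static RemotePut prepare(byte[] key, byte[] marshalledValue,
                             Map<String, String> indexedFields) {
        return new RemotePut(key, marshalledValue, toJson(indexedFields));
    }
}
```

The server would only ever read indexingMetadataJson for indexing; the key 
and value stay opaque blobs, which is also why this alone does not give you 
cross-language reads of the value.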

Personally, I prefer the second approach since it separates concerns (portable 
indexes vs. portable values) plus would lead to (IMO) a better-performing 
implementation.  I'd love to hear others' thoughts though.

Cheers
Manik

On 10 Apr 2013, at 17:11, Mircea Markus <[email protected]> wrote:

> That is, write the Person object in Java and read a Person object in C#, 
> assume a hotrod client for simplicity.
> Now at some point we'll have to run a query over the same hotrod, something 
> like "give me all the Persons named Mircea".
> At this stage, the server side needs to be aware of the Person object in 
> order to be able to run the query and select the relevant Persons. It needs a 
> schema. Instead of suggesting Avro as a data interoperability protocol, we 
> might want to define and use this schema instead: we'd need it anyway for 
> remote querying and we won't have two ways of doing the same thing.
> Thoughts? 
> 
> Cheers,
> -- 
> Mircea Markus
> Infinispan lead (www.infinispan.org)
> 
> _______________________________________________
> infinispan-dev mailing list
> [email protected]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
[email protected]
twitter.com/maniksurtani

Platform Architect, JBoss Data Grid
http://red.ht/data-grid
