Re: Routing with Google Protocol Buffers

Joshua Kramer Sat, 21 Feb 2009 17:52:48 -0800

I agree that it's preferable to have one exchange handle multiple message types, as it reduces code maintenance. Here are a few relevant questions that I thought of.

Does XQilla require the entire XML document, or can its document projection feature tell you how much of the PB message you have to XML-ify before it can get a valid match?

How much value would a speed increase provide on structured data? Does this discussion have practical value, or is it just a science experiment?

If I were to implement a PB exchange, I would do so in the following manner, after having read the documentation on message formats (http://code.google.com/apis/protocolbuffers/docs/encoding.html). I don't think I would implement a new query language; that doesn't make sense for the reasons you outlined. I don't think it would be difficult to implement an XPath mechanism, if it contained a subset of XPath.

A. On exchange subscription, the client gives the exchange an XPath query. The exchange then opens the protocol spec file and determines the key number of the field we want. If the requested element is at the top level, the rule is simple: field n equals value y. If the requested element is one or more levels deep, we build a rule chain: first, get field n; then, get field o; then, get field p. (Field p is in the object that is field o, which in turn is in the object that is field n.)
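
A rough sketch of step A, to make the rule chain concrete. Everything named here is hypothetical: SCHEMA stands in for whatever we would get from parsing the protocol spec file, compile_rule is an invented helper, and the only XPath supported is a plain path followed by "= literal".

```python
# Hypothetical stand-in for a parsed protocol spec file:
# message name -> {field name: (field number, nested message name or None)}
SCHEMA = {
    "Order": {"id": (1, None), "customer": (2, "Customer")},
    "Customer": {"name": (1, None), "address": (2, "Address")},
    "Address": {"city": (1, None), "zip": (2, None)},
}


def compile_rule(root, xpath):
    """Compile a path like '/customer/address/city = "Oslo"' into a rule chain.

    Returns (chain, literal), where chain is the list of field numbers to
    follow from the root message and literal is the text right of the '='.
    """
    path, _, literal = xpath.partition("=")
    steps = [s for s in path.strip().split("/") if s]
    chain = []
    message = root
    for step in steps:
        if message is None:
            # The previous step was a scalar field, so we cannot descend.
            raise ValueError(f"{step!r} is below a scalar field")
        number, nested = SCHEMA[message][step]
        chain.append(number)
        message = nested
    return chain, literal.strip().strip('"')


chain, literal = compile_rule("Order", '/customer/address/city = "Oslo"')
```

A top-level element compiles to a one-link chain (field n equals value y); a nested element compiles to one link per level (field n, then field o, then field p).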


B. When we get a message:

      1. If the MSB = 0 AND we are not working on something, skip to
         the next byte.

      2. If the MSB = 1:

            i. Right-shift 1 step. Does the field number equal the one
               we're referencing? If yes:

                  a. Get the data and move forward the number of bytes
                     specified by the type and length.

                  b. Test the data to see if it matches the right of
                     the = in the XPath. If yes, route the message as
                     specified by the subscription and return.

                  c. If the XPath rule is a chain, and the data matches
                     this link, then repeat from step 2 on this
                     particular object using the next link in the rule
                     chain.

                  d. Goto 1.

            ii. If no:

                  a. Skip the number of bytes specified in the type and
                     length.

                  b. Goto 1.
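
The matching loop in step B could be sketched roughly as follows. One caveat: going by the encoding document linked above, each field starts with a varint key equal to (field_number << 3) | wire_type, so this sketch decodes that key instead of testing the MSB and shifting by one. The names read_varint and matches are invented for illustration, the rule chain is a plain list of field numbers, and group wire types are not handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # MSB clear: this was the last byte of the varint
            return result, pos
        shift += 7


def matches(buf, chain, expected):
    """Return True if the field reached by following chain equals expected.

    expected is an int for varint fields and bytes for everything else.
    Non-matching fields are skipped without being decoded further.
    """
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            payload, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            payload, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, nested message
            length, pos = read_varint(buf, pos)
            payload, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            payload, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if number != chain[0]:
            continue  # not our field; its bytes are already skipped
        if len(chain) == 1:
            if payload == expected:
                return True
        elif wire_type == 2 and matches(payload, chain[1:], expected):
            return True  # the rest of the chain matched inside the nested object
    return False


# Hand-encoded test message: field 1 = 7, field 2 = nested message whose
# field 1 = "Ann" and field 2 = nested message whose field 1 = "Oslo".
msg = b"\x08\x07\x12\x0d\x0a\x03Ann\x12\x06\x0a\x04Oslo"
routed = matches(msg, [2, 2, 1], b"Oslo")
```

The point of the sketch is that a non-matching field costs one key decode plus a skip, and nested objects are only entered when they are on the rule chain.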

It seems that this would use fewer CPU cycles than XML-ifying the entire message, and that doesn't include running the resulting data through XQilla.

Having said that - it may be necessary to use the full set of XPath functionality, and in that case we'd have to XML-ify the message.


Thoughts?

Cheers,
-Josh

Jonathan Robie wrote:

Joshua Kramer wrote:
Jonathan Robie wrote:
There is a reflection API for protocol buffers that would allow you to easily create an XML representation:
Good thoughts, Jonathan. I hadn't considered doing it that way before. Here's a question, though... how many CPU cycles would your method take, compared to modifying XQilla (or creating our own query mechanism) to directly route the messages as they exist in the wire format in which they enter the broker? One of the primary benefits of using PB with Qpid is the speed with which structured data may be processed.
I rather suspect that the difference in processing time would be much smaller than the overhead of reading the message, but this is something best found by trying it and measuring it, then optimizing. If we can get good enough performance, I see a real advantage to using one exchange type for XML, Protocol Buffers, and JSON, and using the same language to specify criteria for all three.
If we create our own query mechanism, we wind up creating our own query language. I've done this a few times in different settings, and it takes work to get it right. And it would be a language used by a very small community. If we use a standard structured query language, XQuery seems to be the main contender.
XQilla can query many kinds of input - a Xerces DOM tree, an istream (which requires serialized XML), a SAX stream, among others. It probably optimizes best for an istream, because it does "document projection", which means that it does not parse the entire document if the query clearly requires only part of the document. This is of most benefit when the message content is large.
Jonathan

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]


