I agree that it's preferable to have one exchange handle multiple message types, as it reduces code maintenance. Here are a few relevant questions that I thought of.

Does XQilla require the entire XML document, or can its document projection feature tell you how much of the PB message you have to XML-ify before it can get a valid match?

How much value would a speed increase provide on structured data? Does this discussion have practical value, or is it just a science experiment?

If I were to implement a PB exchange, I would do so in the following manner, having read the documentation on the message encoding (http://code.google.com/apis/protocolbuffers/docs/encoding.html). I don't think I would implement a new query language; that doesn't make sense for the reasons you outlined. I don't think it would be difficult to implement an XPath mechanism, as long as it supported only a subset of XPath.

A. On exchange subscription, the client gives the exchange an XPath query. The exchange then opens the protocol spec file (the .proto definition) and determines the field number of the field we want. If the requested element is at the top level, the rule is simple: field n equals value y. If the requested element is one or more levels deep, we build a rule chain: first, get field n; then, get field o; then, get field p. (Field p is in the object that is field o, which in turn is in the object that is field n.)
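To make step A concrete, here is a rough Python sketch of compiling a small XPath subset into a rule chain of field numbers. Everything named here is an assumption for illustration: SCHEMA is a hand-written stand-in for whatever we would get from parsing the .proto file, compile_rule is a hypothetical helper, and the only XPath form accepted is a child path compared to a quoted literal.

```python
import re

# Hypothetical stand-in for a parsed .proto file: each message type maps
# field names to (field number, nested message type or None for scalars).
SCHEMA = {
    "Order": {"id": (1, None), "customer": (2, "Customer")},
    "Customer": {"name": (1, None), "address": (3, "Address")},
    "Address": {"city": (2, None)},
}

def compile_rule(xpath, root_type, schema=SCHEMA):
    """Compile e.g. /customer/address/city = "Paris" into (field numbers, literal)."""
    match = re.fullmatch(r"""\s*/([\w/]+)\s*=\s*(['"])(.*)\2\s*""", xpath)
    if match is None:
        raise ValueError("unsupported XPath subset: %r" % xpath)
    steps, expected = match.group(1).split("/"), match.group(3)
    chain, current = [], root_type
    for step in steps:
        # Reject unknown names and attempts to step into a scalar field.
        if current is None or step not in schema[current]:
            raise ValueError("unknown field %r" % step)
        number, current = schema[current][step]
        chain.append(number)
    return chain, expected
```

The chain is built once per subscription, so the per-message work never has to touch field names at all.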

B. When we get a message:
   1. If the MSB = 0 AND we are not working on something, skip to the next byte.
   2. If the MSB = 1:
      i. Right-shift 1 step. Does the field number equal the one we're referencing? If yes:
         a. Get the data and move forward the number of bytes specified by the type and length.
         b. Test the data to see if it matches the right of the = in the XPath. If yes, route the message as specified by the subscription and return.
         c. If the XPath rule is a chain, and the data matches this link, then repeat from step 2 on this particular object using the next link in the rule chain.
         d. Go to 1.
      ii. If no:
         a. Skip the number of bytes specified in the type and length.
         b. Go to 1.
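
A rough Python sketch of the scan in step B, to show how little decoding it needs. It follows the encoding document as I read it: each field starts with a varint key equal to (field_number << 3) | wire_type, and the MSB of each varint byte is the continuation bit, so the exact bit tests differ a little from the outline above. The names read_varint, find_field and matches are hypothetical, and the deprecated group wire types are simply rejected.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # MSB clear marks the last byte of the varint
            return result, pos
        shift += 7

def find_field(buf, chain):
    """Yield the raw values reached by following a chain of field numbers."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:                    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:                    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        if number != chain[0]:
            continue                            # not our field: already skipped
        if len(chain) == 1:
            yield value                         # end of the rule chain
        elif wire_type == 2:
            yield from find_field(value, chain[1:])  # descend into the sub-message

def matches(buf, chain, expected):
    """True if any value reached via the chain equals the expected literal."""
    return any(value == expected for value in find_field(buf, chain))

# Hand-encoded message: field 1 = 7; field 2 = {field 1 = "Bob",
# field 3 = {field 2 = "Paris"}}.
msg = b"\x08\x07\x12\x0e\x0a\x03Bob\x1a\x07\x12\x05Paris"
routed = matches(msg, [2, 3, 2], b"Paris")
```

Fields that are not on the rule chain are skipped using only their key and length, which is where I would expect the savings over building an XML tree to come from.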

It seems that this would use fewer CPU cycles than XML-ifying the entire message, and that comparison doesn't even count the cost of running the resulting XML through XQilla.

Having said that, it may be necessary to support the full set of XPath functionality, and in that case we'd have to XML-ify the message.

Thoughts?

Cheers,
-Josh

Jonathan Robie wrote:
Joshua Kramer wrote:
Jonathan Robie wrote:
There is a reflection API for protocol buffers that would allow you to easily create an XML representation:
Good thoughts, Jonathan. I hadn't considered doing it that way before. Here's a question, though... how many CPU cycles would your method take, compared to modifying XQilla (or creating our own query mechanism) to directly route the messages as they exist in the wire format in which they enter the broker? One of the primary benefits of using PB with Qpid is the speed with which structured data may be processed.
I rather suspect that the difference in processing time would be much smaller than the overhead of reading the message, but this is something best found by trying it and measuring it, then optimizing. If we can get good enough performance, I see a real advantage to using one exchange type for XML, Protocol Buffers, and JSON, and using the same language to specify criteria for all three.

If we create our own query mechanism, we wind up creating our own query language. I've done this a few times in different settings, and it takes work to get it right. And it would be a language used by a very small community. If we use a standard structured query language, XQuery seems to be the main contender.

XQilla can query many kinds of input: a Xerces DOM tree, an istream (which requires serialized XML), and a SAX stream, among others. It probably optimizes best for an istream, because it does "document projection", which means that it does not parse the entire document if the query clearly requires only part of the document. This is of most benefit when the message content is large.

Jonathan

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]


