Re: Use of AMQShortString in client side code

Rafael Schloming Tue, 18 Sep 2007 18:52:39 -0700

John O'Hara wrote:

I agree with Rob that the lower levels of the stack should be implemented in
AMQPShortString *where it occurs in the protocol* for the following reasons:


1) It provides the opportunity to validate the semantics; just because we're
not checking length today doesn't mean we shouldn't

AMQShortString really isn't the appropriate place to validate domain
level semantics. Different uses of shortstr have different domain level
constraints. Also, any validation we put in AMQShortString is forced to
run for every single shortstr field that passes through a broker. This
isn't particularly useful because when decoding fields off the wire,
such validation is unnecessary as it is already performed by the codec
in a more efficient manner that is specialized to the wire format.

2) We may introduce AMQPShortString Tokenisation in the protocol in the
future (has been discussed often, I think it's quite likely).  Doing this we
can collapse a shortstring to 2 bytes and reduce garbage.

I presume you're referring to some scheme for caching commonly used
strings? If so this is a decoding optimization that would equally well
apply when decoding directly to Strings, or any other type for that
matter. In fact such an optimization would likely nullify any
performance advantage rendered by AMQShortString since decoding/encoding
of anything would only be necessary when there is a cache miss.
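To illustrate the cache-miss point, here is a minimal sketch of a decode-side cache that hands back plain Strings. It is an illustration only, not Qpid code: the class name CachingShortStrDecoder and its misses counter are hypothetical, it assumes the shortstr wire layout of one length octet followed by that many octets, and it assumes a single-byte charset (ISO-8859-1) for simplicity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Qpid: caches decoded short strings
// so that bytes are only converted to a String on a cache miss.
final class CachingShortStrDecoder {
    // ByteBuffer.equals/hashCode compare the remaining bytes, so a
    // buffer view can serve directly as the lookup key.
    private final Map<ByteBuffer, String> cache = new HashMap<>();
    int misses; // exposed only so the behaviour can be observed

    String decode(ByteBuffer buf) {
        int len = buf.get() & 0xFF;          // one unsigned length octet
        ByteBuffer key = buf.slice();        // view of the payload, no copy
        key.limit(len);
        buf.position(buf.position() + len);  // consume the payload

        String hit = cache.get(key);
        if (hit != null) {
            return hit;                      // no decoding on a hit
        }
        misses++;
        byte[] bytes = new byte[len];
        key.duplicate().get(bytes);          // copy only on a miss
        String decoded = new String(bytes, StandardCharsets.ISO_8859_1);
        cache.put(ByteBuffer.wrap(bytes), decoded); // stable private key
        return decoded;
    }
}
```

Decoding the same field twice with one decoder returns the same String instance and counts a single miss. An unbounded HashMap is used only to keep the sketch short; a real cache would need an eviction policy.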

3) I'm unsure of the memory ownership semantics but I believe the JMS spec
explicitly requires a copy of the message to be taken to prevent grim race
conditions on message reuse.  Some products have the option to turn this
off, but that's not the spec.  It's like not DMA'ing from userspace without
extreme care.

I'm unsure how this relates to the use of AMQShortString. Any such
copying would happen well past the point where raw types are decoded off
the wire.

Also, Rob has said it has been proven to be faster in the past.
In the absence of a measured, demonstrable issue why change this arguably
more correct implementation?

As it stands today AMQShortString is really just an optimization for the
broker, and one that comes at a pretty high cost to the client. So if
there is a better way to solve the performance issue for the broker
without encumbering the client, it's certainly worth investigating.

That's why I asked about the original problem being solved. For example
I'd guess that in the critical path the broker really never needs to
decode much more than the exchange name and routing key in order to
deliver a message, so it might be possible to limit the use of
AMQShortString to just those fields (or decode to specific Exchange and
RoutingKey classes) and get the necessary performance benefit in the
broker, with much less impact on the client.


--Rafael

Cheers
John


On 19/09/2007, Rafael Schloming <[EMAIL PROTECTED]> wrote:

Robert Godfrey wrote:

On 13/09/2007, Rajith Attapattu <[EMAIL PROTECTED]> wrote:

I am wondering why we are using AMQShortString indiscriminately all over
the client side code?
There is no performance benefit of using AMQShortString (based on the way
it is used) on the client side and it is purely used for encoding.



Rajith,

as we have discussed before - there *is* a significant performance benefit
which we have tested and proved previously.

Can you point me to the previous discussion? I'd like to learn more
about the original issue.

Many short strings are re-used frequently within the client library, and
by using our own type we can exploit this.

Unless we're excessively copying them I don't see how this matters. For
both an AMQShortString and a String we should just be passing around
pointers when they are reused.

Further, the domain for many parameters in AMQP is *not* a unicode string,
but is tightly defined as up to 255 bytes of data with a particular
encoding.  Java Strings are not the appropriate type to use for this.
Encoding and decoding Java Strings is expensive, and also prone to error
(i.e. you need to make sure that you *always* use the correct explicit
encoding).

Despite the name AMQShortString, I don't think the AMQShortString class
actually represents the AMQP type short-string, for example there is no
length limit for an AMQShortString. It's really just a generic
implementation of CharSequence that is optimized specifically for rapid
decoding from a ByteBuffer. From a domain restriction perspective, using
an ordinary String is just as correct.

It makes sense to use it on the Broker side as you deal at the byte level
and I can understand the performance benefit of not having to convert back
and forth into a String.


The low level API should be using correct AMQ domains.  High level APIs
(such as JMS) will obviously want to present these parameters as java
Strings.


On the client side we just merely wrap/unwrap a String using
AMQShortString.
Why can't we do that at the encoding/decoding level for the client side?


In some cases this may be true, but in others certainly not.  When
converting into JMS Destinations on receipt of a message, for instance,
one never needs to convert to a String... it is *much* faster to simply
use the correct type of AMQShortString.

Unfortunately using AMQShortString imposes additional overhead whenever
we need to en/decode to/from an ordinary String. It basically requires
an additional copy when compared with directly encoding/decoding to/from
a String. As the common case on the client side is dealing with
Strings, I'm not at all convinced that ubiquitous use of AMQShortString
is a net win for the client.

I believe what would be optimal is to use the CharSequence interface
everywhere. This way String values passed to us by an application could
be directly passed all the way down the stack and encoded directly onto
the wire without an additional copy, and incoming data could be
efficiently decoded into a private impl of CharSequence that could be
converted to a String on demand.
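
A minimal sketch of what that could look like, offered as an illustration rather than as the actual Qpid implementation. The class name LazyWireString and its encode/decode helpers are hypothetical; the sketch assumes the shortstr wire layout of one length octet followed by that many octets, and assumes a single-byte charset (ISO-8859-1) so that each byte maps to exactly one char.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical class, not part of Qpid: a CharSequence view over bytes
// read off the wire, converted to a String only when asked.
final class LazyWireString implements CharSequence {
    private final byte[] data;
    private final int offset;
    private final int length;
    private String cached; // built on the first toString() call

    private LazyWireString(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    // Writes any CharSequence (including a plain String) straight to the
    // buffer, without wrapping it in an intermediate type first.
    static void encode(ByteBuffer buf, CharSequence s) {
        int len = s.length();
        if (len > 255) {
            throw new IllegalArgumentException("shortstr is limited to 255 octets");
        }
        buf.put((byte) len);
        for (int i = 0; i < len; i++) {
            buf.put((byte) s.charAt(i)); // assumes each char fits in one byte
        }
    }

    // Reads one length-prefixed string; the bytes are copied out once.
    static LazyWireString decode(ByteBuffer buf) {
        int len = buf.get() & 0xFF;
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new LazyWireString(bytes, 0, len);
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (char) (data[offset + index] & 0xFF);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException(start + ".." + end);
        }
        return new LazyWireString(data, offset + start, end - start); // shares bytes
    }

    @Override
    public String toString() {
        if (cached == null) {
            cached = new String(data, offset, length, StandardCharsets.ISO_8859_1);
        }
        return cached;
    }
}
```

With this shape, LazyWireString.encode(buf, "amq.direct") takes an application String as-is, and the value returned by decode only pays for String construction if something calls toString() on it.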

--Rafael
