On Wed, Sep 21, 2011 at 10:53 AM, Gordon Sim <[email protected]> wrote:
> On 09/21/2011 02:57 PM, Rajith Attapattu wrote:
>>
>> On Wed, Sep 21, 2011 at 2:52 AM, Jiri Krutil<[email protected]>  wrote:
>>>
>>> Rajith
>>>
>>> I think this makes perfect sense from the JMS point of view and it also
>>> works fine if all peers are Java clients.
>>
>>> What I find quite surprising and unfortunate is that if a C++ client
>>> sends a message to a Java client, the string message properties set by
>>> the C++ client are not visible by default for the Java client. (They
>>> only become visible if the C++ client sets the property encoding to
>>> "utf8".)
>>>
>>> IMHO this should work by default, especially if both client libraries are
>>> from the same vendor.
>>> (Not sure if it would be better to make the C++ client use the utf8
>>> encoding by default, or to change the Java client to map byte[]
>>> properties to Strings.)
>>
>> The JMS client can't really treat all byte[] properties as Strings, as
>> there could be properties that do need to be treated as "byte[]".
>
> That seems a somewhat illogical reason, since in the case where the
> properties need to be of type byte[] they aren't accessible through JMS
> anyway (as you have indicated).
>
Let me explain this a bit further.

There is a get/setObjectProperty method available in the Message interface.
As per the spec the type of the property value should be one of
Boolean, Number or String.

However, we currently use this as an extension point to allow byte[].
I believe there is a JIRA asking to allow UUID. It also makes sense to
allow lists and maps, since the AMQP spec defines encodings for them
and they can therefore be retrieved by any AMQP-compliant client.

JMS defines that any property value should be retrievable as a String.
That is why we drop from the getPropertyNames() enumeration any
property whose value is not a String, Boolean or Number.

But that doesn't mean properties with values like byte[] (and soon
UUID, lists and maps) are inaccessible.
If an application wants to retrieve a byte[] it can use the
getObjectProperty method to do so.

That is why I mentioned that we cannot treat all byte[] as Strings, as
there can be properties which should be treated as byte[].
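
To make the behaviour described above concrete, here is a rough sketch
written as a small C++17 model. It is not the Qpid JMS client code:
PropertyBag, PropertyValue and Bytes are made-up names used only for
illustration, and the method names merely mirror the JMS ones. It shows
binary values being left out of the name enumeration while remaining
retrievable through the object accessor.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical types, for illustration only.
using Bytes = std::vector<std::uint8_t>;
using PropertyValue =
    std::variant<bool, std::int64_t, double, std::string, Bytes>;

class PropertyBag {
public:
    // Accepts any supported value, including the binary extension type.
    void setObjectProperty(const std::string& name, PropertyValue value) {
        props_[name] = std::move(value);
    }

    // Returns the stored value whatever its type, or nullopt if absent.
    std::optional<PropertyValue> getObjectProperty(const std::string& name) const {
        auto it = props_.find(name);
        if (it == props_.end()) return std::nullopt;
        return it->second;
    }

    // Lists only properties whose value is a string, boolean or number;
    // binary values are skipped because they have no String form.
    std::vector<std::string> getPropertyNames() const {
        std::vector<std::string> names;
        for (const auto& [name, value] : props_) {
            if (!std::holds_alternative<Bytes>(value)) names.push_back(name);
        }
        return names;
    }

private:
    std::map<std::string, PropertyValue> props_;
};
```

In this model a property set to a Bytes value never shows up in
getPropertyNames(), but getObjectProperty() still hands it back
unchanged to a caller that knows its name.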

> I do however agree with the essence of the point, that the java client
> should not try to convert from byte[] to String by assuming an encoding.
>
>> IMO the C++ client should use a default encoding if a property is set
>> as a String.
>
> I assume you mean it should use utf8 as the default encoding where none is
> set?
>
> Unlike a java.lang.String, the c++ std::string does not imply textual data
> or any character set, it is simply a sequence of bytes. The possibilities
> therefore include:
>
> (i) assume (from the context) that the string is utf8
> (ii) check whether the string contains any chars that are not legal for
> utf8, and treat it as utf8 if not
>
> The issue with the first option is that if the string is not valid utf8,
> then encoding of the message will fail. The issue with the second option is
> that it imposes checking every time such a property is set. Both of these
> can of course be avoided by setting the encoding explicitly.
>
> With the first option, if the assumption is incorrect the error is clear and
> the fix obvious. With the second option there is a potential hit on
> performance that will as often as not result in people trying out explicit
> encoding anyway just to eliminate it as a concern.
>
> So while the second is more logically correct, I suspect the first is a more
> practical means of allowing the majority of users to ignore explicit
> encodings.
>
> An alternative line to take is to make setting the encoding more obvious
> (e.g. by ensuring our examples do so, which they currently do not) and
> perhaps simpler. Or we could allow assignment of std::wstring, arguably
> closer in role to java.lang.String, and encourage people to use that for
> textual data.

That seems like a reasonable approach to me.
I believe the same happens in Python unless you specifically mark a
string literal with the u'' prefix to say it's unicode.
So maybe applications need to either provide an encoding explicitly or
use std::wstring.
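
For reference, the check described in option (ii) above (does the
std::string contain only legal utf8?) could look roughly like the
sketch below. isValidUtf8 is a hypothetical helper, not part of the
Qpid C++ API. It walks every byte of the value, which is the per-set
cost Gordon mentions.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
// Rejects stray continuation bytes, truncated sequences, overlong
// encodings, surrogates and code points above U+10FFFF.
bool isValidUtf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) {  // single-byte (ASCII) character
            ++i;
            continue;
        }
        std::size_t len = 0;   // total bytes in this sequence
        unsigned int cp = 0;   // decoded code point
        unsigned int min = 0;  // smallest code point allowed for `len`
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false;  // continuation byte or invalid lead byte

        if (n - i < len) return false;  // sequence is truncated
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3Fu);
        }
        if (cp < min) return false;                      // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        if (cp > 0x10FFFF) return false;                 // out of range
        i += len;
    }
    return true;
}
```

A client library taking this route would run such a check whenever a
string property is set and only tag the value as utf8 when it passes;
otherwise the value would stay plain binary.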

> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:[email protected]
>
>
