Re: [ltk-d] Java LTK and non XML characters

Gordon Waidhofer Fri, 29 Feb 2008 09:59:07 -0800

From libxml:
void xmlNodeSetContent (xmlNodePtr cur, const xmlChar * content)

The 'content' argument points to a nul terminated string.
My point being that the trailing nul of content is not
printed with an escape.

Clearly, C/C++ represent strings using nul termination.
Other languages use counted arrays. Some languages use
lists of char. The upshot is that the representation is
whatever somebody says it is.

We can say LLRP UTF8 strings are encoded over-the-wire
with terminating nul. It has to be byte counted for
proper encode/decode, but that transfer byte count does
not have to be considered the strlen(). The same way
different programming languages arbitrarily define
how "string" is represented, so can we.

To achieve equivalence, we don't have to say that the
XML text representation has to print every byte of a
binary transfer. We define the binary and XML representations
of LLRP UTF8 strings. Libxml APIs convert between XML text
and native nul terminated strings. That doesn't require
the XML text to have &#00 splattered throughout the text.
Further, Libxml has to malloc() N+1 bytes to hold the string
the same way (I suggest) LLRP binary must transfer N+1
bytes to convey a UTF8 string. In neither case would
N+1 be deemed the length; N is the length.

Can we make a statement about the over-the-wire definition
of a string? In particular, clarify that the transfer
byte count is not necessarily the strlen()? And, of course,
the rules for interior and trailing nuls? I'd even be
OK saying that a transfer of 100 bytes with a nul at 50
results in a strlen() of 50.
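
The rule proposed above can be sketched in C. This is only an
illustration under the assumptions stated in this message: the field
arrives as a byte buffer plus a transfer byte count, and the first nul
(if any) ends the string. The name utf8v_logical_length is hypothetical
and is not taken from LTKC or LTKCPP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any LTK library): given the bytes of a
 * transferred string field and the transfer byte count, return the logical
 * string length, i.e. the number of bytes before the first nul. If the
 * field holds no nul at all, the whole transfer count is the length.
 */
size_t utf8v_logical_length(const unsigned char *bytes, size_t transfer_count)
{
    const unsigned char *nul;

    /* A NULL buffer is only acceptable for an empty transfer. */
    assert(bytes != NULL || transfer_count == 0);

    if (transfer_count == 0)
        return 0;

    /* memchr never looks past transfer_count, unlike strlen(). */
    nul = memchr(bytes, 0, transfer_count);
    return nul ? (size_t)(nul - bytes) : transfer_count;
}
```

With a 100-byte transfer whose byte at offset 50 is nul, this returns 50.
With no nul anywhere in the transfer, it returns the full count.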

Regards,
    -gww

-----Original Message-----
From:   [EMAIL PROTECTED] on behalf of Paul Dietrich
Sent:   Fri 2/29/2008 9:32 AM
To:     LLRP Toolkit Development List
Cc:     
Subject:        Re: [ltk-d] Java LTK and non XML characters

Yes, the spec didn't help.  It would be nice to define some rules in
LLRP about the legal characters in strings. Water under the bridge ...

However, if we really want to use XML to convey "equivalent" copies of
the LLRP data, I think we've no choice but to encode these things in the
XML.  

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Gordon Waidhofer
Sent: Friday, February 29, 2008 9:22 AM
To: LLRP Toolkit Development List; LLRP Toolkit Development List
Subject: Re: [ltk-d] Java LTK and non XML characters

It turns on the definition of string.
The spec is no help.

LTKC and LTKCPP treat strings as nul terminated.
When printing, if the last character is nul, it is omitted.
Each interior nul is escaped.

The byte count in the binary encoding is the
over-the-wire transfer size. It is not necessarily
the strlen().

The idea here is to transfer a string that is
ready to use by printf("%s") and the like. It was
thought prudent to transfer the terminating nul
rather than rely on client implementations to
provide one at the library level or applications
to use printf("%.*s"). Every now and then folks do
something cheesy and printf() without counts.
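
The printing rule described in this message (drop a trailing nul, escape
each interior nul) might look roughly like the C sketch below. It is not
LTKC or LTKCPP code. The function name print_utf8v and the escape
spelling in NUL_ESCAPE are assumptions made for illustration, since the
thread does not fix the exact escape text. Escaping of other XML
metacharacters such as & and < is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed escape text for a nul byte; the thread does not settle this. */
#define NUL_ESCAPE "&#0;"

/*
 * Hypothetical sketch of the printing rule:
 *   - a nul in the last transferred byte is dropped,
 *   - every other (interior) nul is written as NUL_ESCAPE,
 *   - all other bytes are copied unchanged.
 * Writes a nul-terminated result into out and returns its length,
 * or (size_t)-1 if out_size is too small.
 */
size_t print_utf8v(const unsigned char *bytes, size_t count,
                   char *out, size_t out_size)
{
    const size_t esc_len = strlen(NUL_ESCAPE);
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    assert(bytes != NULL || count == 0);

    /* Drop a single trailing nul: it is the terminator, not content. */
    if (count > 0 && bytes[count - 1] == 0)
        count--;

    for (i = 0; i < count; i++) {
        if (bytes[i] == 0) {
            /* Interior nul: emit the escape, keeping room for '\0'. */
            if (used + esc_len >= out_size)
                return (size_t)-1;
            memcpy(out + used, NUL_ESCAPE, esc_len);
            used += esc_len;
        } else {
            if (used + 1 >= out_size)
                return (size_t)-1;
            out[used++] = (char)bytes[i];
        }
    }
    out[used] = '\0';
    return used;
}
```

Note that under this rule the printed text alone does not say whether a
trailing nul was present in the transfer, which is the round-trip
question raised elsewhere in the thread.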

Regards,
  -gww

-----Original Message-----
From:   [EMAIL PROTECTED] on behalf of
Paul Dietrich
Sent:   Fri 2/29/2008 9:14 AM
To:     LLRP Toolkit Development List
Cc:     
Subject:        Re: [ltk-d] Java LTK and non XML characters

According to our unit test method, we should be able to convert from
binary to XML and back again and get exactly the same binary packet.
Given this, we'd have to escape these characters (as ugly as it may
look) rather than delete them.

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
John R. Hogerhuis
Sent: Thursday, February 28, 2008 10:38 PM
To: LLRP Toolkit Development List
Subject: Re: [ltk-d] Java LTK and non XML characters

On Thu, Feb 28, 2008 at 9:33 PM,  <[EMAIL PROTECTED]> wrote:
>
>  The issue Casey described resulted because we decided to throw an
>  exception whenever an illegal XML character (as defined in XML 1.0
>  http://www.w3.org/TR/xml/#charsets) appeared in the UTF-8 string.
>  This is now changed. Illegal XML characters are removed from the
>  UTF8 string.
>

Yeah, I'm considering what to do about the general issue of non-XML
characters in utf8v's for LTK-Perl and LTK-XML. Probably what should
happen is that characters that are not legal in XML should get quoted
in the XML style, in this case I believe  

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
llrp-toolkit-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/llrp-toolkit-devel
