Re: [xml] Strange 0x0a char popping in generated XML

2014-10-03 Thread Daniel Veillard
On Tue, Aug 26, 2014 at 12:30:01PM +, Jean-Philippe Jacoupy wrote:
 Hello, 
 
 I'm using libxml2 and I have a strange behaviour. 
 
 I'm creating a full document in memory (using xmlTextWriter with a 
 xmlBuffer). 
 
 I have called xmlTextWriterSetIndent with 0 as parameter. 
 
 Whenever I get the buffer content (once I have called
 xmlTextWriterEndDocument) I get strange 0x0a inserted: 
  - 1 after the xml header
  - 1 after the end of the xml document
 

  It's not strange: that's a newline character, which is present
as non-significant white space and will be ignored by XML parsers
and hence by the whole tool chain consuming the output.

Daniel

 I'm under Windows compiling with VS2008 against LibXML2 version 2.7.2
 
 PS: 
 - Searching the libxml2 code, at the end of the
 xmlTextWriterStartDocument function I found this:
 
 count = xmlOutputBufferWriteString(writer->out, "?>\n"); (L. 617)
 
 Shouldn't the '\n' be guarded by an if (writer->indent)?
 
 - The other one is in xmlTextWriterEndDocument, where I found:
 
 if (!writer->indent) { (L. 701)
 
 instead of
 
 if (writer->indent) {
 
 as done in the rest of the file.
 
 ___
 xml mailing list, project page  http://xmlsoft.org/
 xml@gnome.org
 https://mail.gnome.org/mailman/listinfo/xml

-- 
Daniel Veillard  | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [xml] Strange 0x0a char popping in generated XML

2014-10-03 Thread Daniel Veillard
On Fri, Oct 03, 2014 at 11:15:30AM +0200, Jean-Philippe Jacoupy wrote:
 Thanks for your response, Daniel,
 
 But I still think the presence of those \n is bogus.
 Even if an XML parser ignores the \n when reading (which is OK),
 when you have to cipher the document, the extra '\n' characters change
 the result.
 
 Both of them prevent generating a linearized XML, and that's the point of
 this report.
 
 I mean this isn't a linearized XML:
 
 <?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<Document><data1 /><data2><data21 /></data2></Document>\n
 
 Whereas this is a linearized XML:
 
 <?xml version="1.0" encoding="UTF-8" standalone="yes"?><Document><data1 /><data2><data21 /></data2></Document>

  If you want to sign the output, there is a canonical format and you
should use that. Libxml2 supports it!

Calling that "linearized" is severely bogus. Do you mean you have
recipients who won't parse anything with a line feed in it?
Where did that definition come from? What happens if one of your data
fields has content with a \n inside? *That* is the broken part.

That's not a libxml2 issue. XML is here as a spec to define
interoperability; there are a number of places where that interop is
guaranteed even if you reformat the document, and there is equivalence
at the XML Infoset level. The two added \n are in those places.

Daniel
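
The disagreement above is about bytes versus parsed content: the two
serializations quoted in this message are equivalent to an XML parser, but
anything computed over the raw bytes sees the two extra 0x0a characters. A
minimal sketch of that point follows. It uses 32-bit FNV-1a purely as a
stand-in digest (the thread does not say which algorithm the protocol uses),
and the names fnv1a32, WITH_LF, LINEARIZED and digests_differ are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a over raw bytes. Stand-in only: the real protocol's
 * digest or cipher is not named in the thread. */
static uint32_t fnv1a32(const char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)data[i];     /* mix in one byte */
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* The writer's output as described in the thread: one line feed after
 * the XML declaration and one after the root element. */
static const char WITH_LF[] =
    "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
    "<Document><data1 /><data2><data21 /></data2></Document>\n";

/* The same document without those two line feeds. */
static const char LINEARIZED[] =
    "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
    "<Document><data1 /><data2><data21 /></data2></Document>";

/* Returns 1 when the byte-level digests of the two forms differ. */
static int digests_differ(void)
{
    return fnv1a32(WITH_LF, strlen(WITH_LF)) !=
           fnv1a32(LINEARIZED, strlen(LINEARIZED));
}
```

A digest taken over a canonical form of the document, as suggested above,
would not depend on this kind of whitespace outside the root element.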

 
 


Re: [xml] Strange 0x0a char popping in generated XML

2014-10-03 Thread Daniel Veillard
On Fri, Oct 03, 2014 at 02:38:21PM +0200, Jean-Philippe Jacoupy wrote:
 The protocol I have to work with, which uses XML, refuses those.
 
 They don't use the signing method defined for XML because they specified
 their own method for certification.

  Then it's not an XML-compliant protocol; that's pretty bad design.
Would you name that protocol so that some public shame
can be cast on those who pushed for it?

 The point is that when I call xmlTextWriterSetIndent(writer, 0)
 before EVEN starting the document, I expect to get no indentation, as
 stated here:
 http://xmlsoft.org/html/libxml-xmlwriter.html#xmlTextWriterSetIndent
 
 The first '\n' is indentation.
 
 I modified the lib, but I'm still reporting something that seems wrong to me.
 You'll find the patch attached and below.

  Sorry, I won't change that behaviour. It would break people expecting
those line feeds, for example in regression tests.
  Indent is about adding such space within significant content, and it's
actually not the default. What you are doing is changing a part where
the white space is not supposed to be significant.

  See for example the differences in xmllint with --pretty
for the values 0 (no change), 1 (change adding significant content)
and 2 (change adding non-significant content).

 --- libxml2-2.7.2/xmlwriter.c  Tue Mar 11 23:54:05 2008
 +++ libxml2-2.7.2/xmlwriter.c  Fri Aug 29 16:38:31 2014
 @@ -614,7 +614,12 @@
          sum += count;
      }
  
 -    count = xmlOutputBufferWriteString(writer->out, "?>\n");
 +    if (writer->indent) {
 +        count = xmlOutputBufferWriteString(writer->out, "?>\n");
 +    }
 +    else {
 +        count = xmlOutputBufferWriteString(writer->out, "?>");
 +    }
      if (count < 0)
          return -1;
      sum += count;
 
 As for the other one I remove it inside my code.

  The writer might be able to save without the XMLDecl, which you could
then add yourself without that line feed.

Daniel
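
A minimal sketch of the caller-side workaround discussed in this message,
assuming the document body has already been serialized without an XML
declaration (how to obtain that from the writer is not shown, and the reply
above only says it "might" be possible). The helper name build_linearized,
the XML_DECL text and the buffer handling are assumptions made for
illustration; none of this is libxml2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Declaration written by the caller, with no line feed after it. */
#define XML_DECL \
    "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"

/*
 * Copies XML_DECL followed by body into out, dropping any trailing
 * 0x0a bytes from body. Returns the number of bytes written (not
 * counting the terminating NUL), or -1 if out is too small.
 */
static long build_linearized(const char *body, char *out, size_t cap)
{
    assert(body != NULL && out != NULL);

    size_t decl_len = sizeof(XML_DECL) - 1;
    size_t body_len = strlen(body);

    /* Strip line feeds at the very end of the serialized body. */
    while (body_len > 0 && body[body_len - 1] == '\n')
        body_len--;

    if (decl_len + body_len + 1 > cap)
        return -1;

    memcpy(out, XML_DECL, decl_len);
    memcpy(out + decl_len, body, body_len);
    out[decl_len + body_len] = '\0';
    return (long)(decl_len + body_len);
}
```

Only line feeds at the very end of the body are removed; a \n inside
element content is left alone, since that one is significant.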
 

[xml] Strange 0x0a char popping in generated XML

2014-08-26 Thread Jean-Philippe Jacoupy
Hello, 

I'm using libxml2 and I have a strange behaviour. 

I'm creating a full document in memory (using xmlTextWriter with a xmlBuffer). 

I have called xmlTextWriterSetIndent with 0 as parameter. 

Whenever I get the buffer content (once I have called
xmlTextWriterEndDocument) I get strange 0x0a inserted: 
 - 1 after the xml header
 - 1 after the end of the xml document


I'm under Windows compiling with VS2008 against LibXML2 version 2.7.2

PS: 
- Searching the libxml2 code, at the end of the
xmlTextWriterStartDocument function I found this:

count = xmlOutputBufferWriteString(writer->out, "?>\n"); (L. 617)

Shouldn't the '\n' be guarded by an if (writer->indent)?

- The other one is in xmlTextWriterEndDocument, where I found:

if (!writer->indent) { (L. 701)

instead of

if (writer->indent) {

as done in the rest of the file.
