Re: [xml] libxml2 very slow on big data dump

2008-12-17 Thread Alexandre Macard
Alexandre Macard wrote:
> I forgot to mention that all of the time is spent in the function xmlNodeDump.
>
> Below is a script that generates similar XML. I tested with its output
> and had to wait 22 seconds for my program to exit.
>
> usage: script.sh > journal.xml
>
>
> #!/bin/bash
>
> #Header
> echo -n ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/";
> xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance";
> xmlns:xsd="http://www.w3.org/1999/XMLSchema";
> SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/";>
> xmlns:m="urn:arkeia">'
>
> echo -n ' xsi:type="xsd:integer">1 xsi:type="xsd:integer">0 xsi:type="xsd:string">MjAwOC8xMi8xNiAxNjo0NzoxMyBJMDAxMTAwMDAgMDFUUF9MSVNUX0FMTDogWW91IGhhdmUgc3VjY2Vzc2Z1bGx5IGxvYWRlZCB0aGUgbGlzdCBvZiB0YXBlcyE= xsi:type="xsd:list">'
>
> i=0
> while [ $i -lt 15000 ] ; do
> echo -n ' xsi:type="xsd:string">MTIzMDkxMDAyNQ== xsi:type="xsd:string">MDAwMDE= xsi:type="xsd:string">cm9vdA== xsi:type="xsd:string">MDAx xsi:type="xsd:string">NDczODVhMWY= xsi:type="xsd:string">NDkzNjlmZjA= xsi:type="xsd:string">NDc1NThlZjM= xsi:type="xsd:string">L2JhY2t1cHMvZmlsZQ== xsi:type="xsd:string">dGFwZV9maWxl'
> i=`expr $i + 1`
> done
>
> echo -n ''
>
> #Footer
> echo ' '
>
Hi,

I tried to add http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml


Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Alexandre Macard
I forgot to mention that all of the time is spent in the function xmlNodeDump.

Below is a script that generates similar XML. I tested with its output
and had to wait 22 seconds for my program to exit.

usage: script.sh > journal.xml


#!/bin/bash

#Header
echo -n 'http://schemas.xmlsoap.org/soap/envelope/";
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance";
xmlns:xsd="http://www.w3.org/1999/XMLSchema";
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/";>
   '

echo -n '10MjAwOC8xMi8xNiAxNjo0NzoxMyBJMDAxMTAwMDAgMDFUUF9MSVNUX0FMTDogWW91IGhhdmUgc3VjY2Vzc2Z1bGx5IGxvYWRlZCB0aGUgbGlzdCBvZiB0YXBlcyE='

i=0
while [ $i -lt 15000 ] ; do
echo -n 'MTIzMDkxMDAyNQ==MDAwMDE=cm9vdA==MDAxNDczODVhMWY=NDkzNjlmZjA=NDc1NThlZjM=L2JhY2t1cHMvZmlsZQ==dGFwZV9maWxl'
i=`expr $i + 1`
done

echo -n ''

#Footer
echo ' '
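
A comparable test file can also be produced from C. The sketch below is a stand-in under stated assumptions, not the original script: the element names `journal`, `entry` and `field` are hypothetical placeholders, and only the base64 values are taken from the loop above.

```c
#include <stdio.h>

/*
 * Writes a well-formed test document to `out`: one <journal> root holding
 * `records` <entry> elements, each with nine base64-encoded fields.
 * The element names are hypothetical placeholders, not the original schema.
 * Returns 0 on success, -1 on a write error.
 */
int write_test_xml(FILE *out, unsigned records)
{
    static const char *const fields[] = {
        "MTIzMDkxMDAyNQ==", "MDAwMDE=", "cm9vdA==", "MDAx",
        "NDczODVhMWY=", "NDkzNjlmZjA=", "NDc1NThlZjM=",
        "L2JhY2t1cHMvZmlsZQ==", "dGFwZV9maWxl"
    };
    const size_t nfields = sizeof fields / sizeof fields[0];

    if (fputs("<?xml version=\"1.0\"?>\n<journal>", out) == EOF)
        return -1;

    for (unsigned i = 0; i < records; i++) {
        if (fputs("<entry>", out) == EOF)
            return -1;
        for (size_t f = 0; f < nfields; f++) {
            if (fprintf(out, "<field>%s</field>", fields[f]) < 0)
                return -1;
        }
        if (fputs("</entry>", out) == EOF)
            return -1;
    }

    if (fputs("</journal>\n", out) == EOF)
        return -1;
    return 0;
}
```

Calling `write_test_xml(stdout, 15000)` from a `main` and redirecting the output to journal.xml gives a file with the same number of records as the shell loop.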

___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
http://mail.gnome.org/mailman/listinfo/xml


Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Alexandre Macard
This small test program takes 15 seconds to exit.
The journal.xml file is 7.1 MB.

#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main(void) {
    xmlDocPtr doc;
    xmlNodePtr cur;
    xmlBufferPtr buf;

    doc = xmlParseFile("./journal.xml");
    if (doc == NULL) {
        fprintf(stderr, "Document not parsed successfully.\n");
        return 1;
    }

    cur = xmlDocGetRootElement(doc);
    if (cur == NULL) {
        fprintf(stderr, "empty document\n");
        xmlFreeDoc(doc);
        return 1;
    }

    buf = xmlBufferCreate();

    /* Serialise the root node into buf (level 1, format 1). */
    xmlNodeDump(buf, doc, cur, 1, 1);

    /* An xmlBuffer is released with xmlBufferFree, not xmlFree. */
    xmlBufferFree(buf);
    xmlFreeDoc(doc);

    return 0;
}

I will try to post a script later that generates a similar XML file.

Thanks.


Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Stefan Behnel
Alexandre Macard wrote:
> Stefan Behnel wrote:
>> Alexandre Macard wrote:
>>> I try dump a node from a big xml (near 7mo), and the libxml2 is very
>>> slow to respond.
>>>
>>> I tried to trace the problem and it seems to take all it's time into
>>> the
>>> function: xmlOutputBufferWriteEscape.
>>> I do not need to escape data because I use a base64 encoding.
>>>
>>
>> You didn't write which version of libxml2 you are using, but there was a
>> bug in an older version that could lead to horrible performance when
>> serialising character entities.
>>
>> Try upgrading your library.
>
> Sorry I forgot to precise this information. I am using the last version
> 2.7.2.

So maybe it's a similar bug, but for a different encoding (I think it was
related to the ASCII encoding at the time).

Could you provide the code snippet that you use for serialisation? I.e.
what parameters you pass into what function?

Stefan



Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Alexandre Macard
Stefan Behnel wrote:
> Hi,
>
> Alexandre Macard wrote:
>   
>> I try dump a node from a big xml (near 7mo), and the libxml2 is very
>> slow to respond.
>>
>> I tried to trace the problem and it seems to take all it's time into the
>> function: xmlOutputBufferWriteEscape.
>> I do not need to escape data because I use a base64 encoding.
>> 
>
> You didn't write which version of libxml2 you are using, but there was a
> bug in an older version that could lead to horrible performance when
> serialising character entities.
>
> Try upgrading your library.
>
> Stefan
>
>
>   
Hi,

Sorry, I forgot to give this information. I am using the latest version,
2.7.2.

Thanks.


Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Stefan Behnel
Hi,

Alexandre Macard wrote:
> I try dump a node from a big xml (near 7mo), and the libxml2 is very
> slow to respond.
>
> I tried to trace the problem and it seems to take all it's time into the
> function: xmlOutputBufferWriteEscape.
> I do not need to escape data because I use a base64 encoding.

You didn't write which version of libxml2 you are using, but there was a
bug in an older version that could lead to horrible performance when
serialising character entities.

Try upgrading your library.

Stefan



Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Alexandre Macard
Martin Trappel wrote:
> Alexandre Macard wrote:
>> HI,
>>
>> I try dump a node from a big xml (near 7mo), and the libxml2 is very
>> slow to respond.
>>
>
> 7 MB? That's not large at all. (Well, of course depending on what you
> do, but it shouldn't be too "big" from a general libxml2 perspective I
> guess.
>
> What does slow mean? 100ms, 1 sec, 10 sec? Slowness is really quite
> application dependent :)
>
> br,
> Martin
>
Hi,

Slow means that libxml2 blocks for more than 20 seconds!

Regards.


Re: [xml] libxml2 very slow on big data dump

2008-12-16 Thread Martin Trappel

Alexandre Macard wrote:

> HI,
>
> I try dump a node from a big xml (near 7mo), and the libxml2 is very
> slow to respond.

7 MB? That's not large at all. (Well, of course depending on what you
do, but it shouldn't be too "big" from a general libxml2 perspective, I
guess.)


What does slow mean? 100ms, 1 sec, 10 sec? Slowness is really quite 
application dependent :)


br,
Martin
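
To answer the "what does slow mean" question with a number, the suspect call can be wrapped in a small timer. This is a minimal sketch, assuming the standard C `clock()` function. The names `cpu_seconds`, `timed_fn` and `busy_loop` are hypothetical, and `busy_loop` only stands in for the real serialisation call.

```c
#include <time.h>

/* Hypothetical callback type: wraps whatever call is being measured. */
typedef void (*timed_fn)(void *arg);

/* Returns the CPU time, in seconds, consumed by one call to fn(arg). */
double cpu_seconds(timed_fn fn, void *arg)
{
    clock_t start = clock();
    fn(arg);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Stand-in workload for demonstration; replace it with a wrapper
 * around the real call, for example the node dump. */
void busy_loop(void *arg)
{
    volatile unsigned long sink = 0;
    unsigned long n = *(unsigned long *)arg;

    for (unsigned long i = 0; i < n; i++)
        sink += i;
}
```

Passing a small wrapper around the serialisation call instead of `busy_loop` and printing the result with `printf("%.3f s\n", ...)` gives a figure that can be quoted on the list.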


[xml] libxml2 very slow on big data dump

2008-12-16 Thread Alexandre Macard
Hi,

I am trying to dump a node from a large XML document (about 7 MB), and
libxml2 is very slow to respond.

I traced the problem, and it seems to spend all of its time in the
function xmlOutputBufferWriteEscape.
I do not need the data to be escaped, because it is base64-encoded.

How can I avoid going through this function?

Thanks a lot.

Regards.
Alexandre Macard.
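
For context on why an escaping routine shows up even for base64 data: a generic serialiser cannot know that a text node is already safe, so it has to look at every character before writing it. The sketch below illustrates that scan in plain C. It is not libxml2's code, and `escape_text` is a hypothetical helper written only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: NOT libxml2's implementation. Copies `in` to `out`,
 * replacing the three characters that must be escaped in XML text content.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if `out` is too small.
 */
size_t escape_text(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (; *in != '\0'; in++) {
        char single[2] = { *in, '\0' };
        const char *rep;

        switch (*in) {
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        case '&': rep = "&amp;"; break;
        default:  rep = single;  break;
        }

        size_t len = strlen(rep);
        if (used + len + 1 > out_size)   /* keep room for the NUL */
            return (size_t)-1;
        memcpy(out + used, rep, len);
        used += len;
    }

    out[used] = '\0';
    return used;
}
```

The base64 alphabet contains none of `<`, `>` or `&`, so a base64 string comes out unchanged, but every byte is still inspected. That inspection should be linear in the input size, so on its own it would not obviously account for a delay of many seconds on 7 MB.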
