Re: [xml] Error on parsing HTML with libxml

2018-08-21 Thread Eric S Eberhard
-Original Message- From: xml [mailto:xml-boun...@gnome.org] On Behalf Of André Rothe Sent: Monday, August 20, 2018 12:48 AM To: xml@gnome.org; Liam R. E. Quin Subject: Re: [xml] Error on pa

Re: [xml] Error on parsing HTML with libxml

2018-08-20 Thread André Rothe
I have looked into the libxml code and found the function htmlParseScript() in HTMLparser.c: https://gitlab.gnome.org/GNOME/libxml2/blob/master/HTMLparser.c It describes the problem with the "<" character within scripts, but it offers the possibility of using recover mode to ignore the

Re: [xml] Error on parsing HTML with libxml

2018-08-20 Thread André Rothe
I can't change the source of the HTML page, because the page is generated by another system to which I have no access. I only receive the pages from there, and our Apache module runs a post-processing step just before the pages are sent to the user's browser. And there I need a parser to

Re: [xml] Error on parsing HTML with libxml

2018-08-17 Thread Liam R E Quin
On Fri, 2018-08-17 at 14:42 +0200, André Rothe wrote: > > https://3v4l.org/O0iEf Try changing ...writeln('</td>'); to ...writeln('<' + '/td>'); and see if that helps; or use a CDATA section to escape the markup from the HTML parser. Although it may depend on what the
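
A minimal sketch of why splitting the string may help, assuming the parser follows the HTML 4 rule that script data ends at the first "</" followed by a letter. It is written in Python purely for illustration; the helper names embeds_close_tag and escape_close_tags are hypothetical and are not part of libxml2.

```python
import re

# Assumption: the parser objects to "</" followed by a letter inside
# script data, as HTML 4 treats that sequence as an end tag.
ETAGO = re.compile(r"</(?=[A-Za-z])")


def embeds_close_tag(script_body: str) -> bool:
    # True if the script text contains "</" directly followed by a letter.
    return ETAGO.search(script_body) is not None


def escape_close_tags(script_body: str) -> str:
    # Insert a backslash between "<" and "/". Inside a JavaScript string
    # literal a backslash-slash pair is read as a plain slash, so the
    # script should behave the same when the sequence sits in a string.
    return ETAGO.sub(r"<\\/", script_body)


original = "document.writeln('</td>');"
split = "document.writeln('<' + '/td>');"

print(embeds_close_tag(original))   # the unsplit form contains "</t"
print(embeds_close_tag(split))      # the split form does not
print(escape_close_tags(original))
```

The rewrite in escape_close_tags is only safe when the sequence occurs inside string literals, so it would need checking against the real generated pages before being used in a post-processing step.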

Re: [xml] Error on parsing HTML with libxml

2018-08-17 Thread Eric S Eberhard
I could be way off base -- don't you have to encode the portions in the js? Otherwise I can see it being confused. The js looks like data and it can't have < or > in it. https://stackoverflow.com/questions/1398571/html-inside-xml-should-i-use-cdata-or-encode-the-html Eric
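
A minimal sketch of the encoding idea, assuming the script is embedded in XML (or XHTML served as XML), where entities inside the element are decoded by the parser. Browsers reading plain HTML do not decode entities inside script elements, so this may not suit the page discussed in this thread. The helper name wrap_script_for_xml is hypothetical.

```python
import html
import xml.etree.ElementTree as ET


def wrap_script_for_xml(js_source: str) -> str:
    # html.escape replaces &, < and > with entities; quote=False leaves
    # single and double quotes untouched.
    return "<script>" + html.escape(js_source, quote=False) + "</script>"


js = "if (a < b && c > d) { run(); }"
markup = wrap_script_for_xml(js)

# An XML parser decodes the entities again, so the script text survives.
recovered = ET.fromstring(markup).text

print(markup)
print(recovered == js)
```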