[PHP] Simple XML - problem with errors

2010-07-08 Thread Gary .
Why am I still getting an exception when I do this:

libxml_use_internal_errors(true);
$this->xml = new SimpleXMLElement($this->htmlString);

or this

$this->xml = new SimpleXMLElement($this->htmlString,
    LIBXML_NOERROR | LIBXML_NOWARNING);

?

The exception says Exception: String could not be parsed as XML. Not
a hint of why not, of course.

I thought the point of those things was to just stuff the content in,
and let user code handle errors? I mean, I *know* the provided HTML is
broken. I also know there's not a chance in hell of it ever being
fixed (completely out of my control).

And yes, I'd rather use DOM, but I can't.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Richard Quadling
On 8 July 2010 08:07, Gary . php-gene...@garydjones.name wrote:
> Why am I still getting an exception when I do this:
>
> libxml_use_internal_errors(true);
> $this->xml = new SimpleXMLElement($this->htmlString);
>
> or this
> $this->xml = new SimpleXMLElement($this->htmlString,
> LIBXML_NOERROR|LIBXML_NOWARNING);
>
> ?
>
> The exception says Exception: String could not be parsed as XML. Not
> a hint of why not, of course.
>
> I thought the point of those things was to just stuff the content in,
> and let user code handle errors? I mean, I *know* the provided HTML is
> broken. I also know there's not a chance in hell of it ever being
> fixed (completely out of my control).
>
> And yes, I'd rather use DOM, but I can't.



The XML needs to be well formed [1]. So, if it is junk, you can't
read it using SimpleXML as the XML is not well formed.

Try putting it through Tidy first - that is, tidy the file first.


Regards,

Richard.

[1] http://www.devx.com/projectcool/Article/19944/0/page/3
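
The Tidy suggestion above could look roughly like the sketch below. It
assumes the Tidy extension is loaded and a reasonably recent PHP (7.1 or
later for the type declarations). The helper name htmlToSimpleXml is
hypothetical and not from this thread; the two Tidy options are standard
ones, but whether they are enough for a given broken page is untested here.

```php
<?php
// Hypothetical helper (not from the thread): repair broken HTML with Tidy
// so that SimpleXML has a chance of parsing it. Returns null when the Tidy
// extension is unavailable or the repaired output still fails to parse.
function htmlToSimpleXml(string $html): ?SimpleXMLElement
{
    if (!extension_loaded('tidy')) {
        return null; // assumption: the caller treats null as "could not parse"
    }

    $config = [
        'output-xhtml'     => true, // ask Tidy for well-formed XHTML
        'numeric-entities' => true, // e.g. &nbsp; becomes &#160;
    ];
    $repaired = tidy_repair_string($html, $config, 'utf8');
    if (!is_string($repaired)) {
        return null;
    }

    // Buffer libxml errors so a failed parse does not emit warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($repaired);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}
```

The guard on extension_loaded('tidy') matters on builds where Tidy has
been compiled out, since tidy_repair_string() would otherwise be an
undefined function.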




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Gary .
Richard Quadling writes:
> On 8 July 2010 08:07, Gary wrote:
>> Why am I still getting an exception when I do this:
>>
>> libxml_use_internal_errors(true);
>> $this->xml = new SimpleXMLElement($this->htmlString);
>>
>> or this
>> $this->xml = new SimpleXMLElement($this->htmlString,
>> LIBXML_NOERROR|LIBXML_NOWARNING);
>>
>> ?
>>
>> The exception says Exception: String could not be parsed as XML.
...
> The XML needs to be well formed [1].

I thought so, thanks. What does libxml_use_internal_errors do then, if
it doesn't allow me to handle those problems in my own code?

> So, if it is junk, you can't
> read it using SimpleXML as the XML is not well formed.

I'm trying to just use xml_parse and so on now.

This problem really should be *so* easy. In fact I've already solved it once X(

> Try putting it through Tidy first - that is, tidy the file first.

Ha ha!

Sorry.

It's almost certainly not available. I don't want to talk about it
*cries*




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Marc Guay
> libxml_use_internal_errors(true);
> $this->xml = new SimpleXMLElement($this->htmlString);

Hi Gary,

I have code that looks like this:

libxml_use_internal_errors(true);
$xml = simplexml_load_string($val);
$errors = libxml_get_errors();

if ($errors) {
    // do this (e.g. report the errors; $xml is false if parsing failed)
} else {
    // do that (use $xml)
}

which works fine.  Not sure if that's helpful to you, but it seems
like it might.

Marc
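
To get the "hint of why not" that the original question was missing, the
buffered libxml errors can be turned into readable messages. The following
is only a sketch: loadXmlWithErrors is a hypothetical name, not something
from this thread, and it assumes PHP 7.1 or later for the array
destructuring and type declarations.

```php
<?php
// Hypothetical helper (not from the thread): parse a string and return the
// element (or false) together with human-readable error messages.
function loadXmlWithErrors(string $source): array
{
    // Buffer parse errors instead of emitting warnings; remember old setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors(); // start from an empty error buffer

    $xml = simplexml_load_string($source); // false if not well formed

    $messages = [];
    foreach (libxml_get_errors() as $error) { // each item is a LibXMLError
        $messages[] = sprintf(
            'line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    return [$xml, $messages];
}

// Usage: print whatever libxml complained about.
[$xml, $messages] = loadXmlWithErrors('<p>unclosed <b>tag</p>');
if ($xml === false) {
    echo implode("\n", $messages), "\n";
}
```

Clearing the buffer before and after the call keeps errors from unrelated
parses out of the result.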




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Gary .
Marc Guay writes:
>> libxml_use_internal_errors(true);
>> $this->xml = new SimpleXMLElement($this->htmlString);
>
> I have code that looks like this:
>
> libxml_use_internal_errors(true);
> $xml = simplexml_load_string($val);

Yeah. I tried simplexml_load_string and found that worked (in that it
didn't cause an exception - there are errors which caused the conversion
not to work). I wonder what the difference is between doing new
SimpleXMLElement and calling simplexml_load_string which results in the
libxml_use_internal_errors call being ineffective. Odd.
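
On the difference: as documented, simplexml_load_string() reports a parse
failure by returning false, while the SimpleXMLElement constructor throws
an Exception. libxml_use_internal_errors(true) only changes where the
libxml messages go (a buffer instead of PHP warnings), so it does not stop
the constructor from throwing. A sketch of handling that, where
tryConstructXml is a hypothetical name and PHP 7.1 or later is assumed:

```php
<?php
// Hypothetical wrapper (not from the thread): construct a SimpleXMLElement,
// returning null on failure and filling $errors with the libxml messages.
function tryConstructXml(string $source, array &$errors = []): ?SimpleXMLElement
{
    $errors = [];
    $previous = libxml_use_internal_errors(true); // buffer libxml messages
    libxml_clear_errors();

    try {
        return new SimpleXMLElement($source);
    } catch (Exception $e) {
        // The exception message is the generic "String could not be parsed
        // as XML"; the specific reasons are in the libxml error buffer.
        foreach (libxml_get_errors() as $error) {
            $errors[] = trim($error->message) . ' (line ' . $error->line . ')';
        }
        return null;
    } finally {
        libxml_clear_errors();
        libxml_use_internal_errors($previous); // restore the caller's setting
    }
}

// Usage: $why receives the reasons when parsing fails.
$why = [];
$xml = tryConstructXml('<p>unclosed <b>tag</p>', $why);
if ($xml === null) {
    echo implode("\n", $why), "\n";
}
```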




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Marc Guay
> I wonder what the difference is between doing new
> SimpleXMLElement and calling simplexml_load_string which results in the
> libxml_use_internal_errors call being ineffective. Odd.

The documentation page "Dealing with XML errors" only mentions
simplexml_load_string(), and this comment
http://ca3.php.net/manual/en/simplexml.examples-basic.php#93263 shows
that you're not the first person to run into this.

Marc




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Marc Guay
> And yes, I'd rather use DOM, but I can't.

Could you use this: http://simplehtmldom.sourceforge.net/?




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Gary .
On 7/8/10, Marc Guay wrote:
>> And yes, I'd rather use DOM, but I can't.
>
> Could you use this: http://simplehtmldom.sourceforge.net/?

Interesting.

Although I can't use DOM or Tidy (because they're normally built in,
but TPTB decided to recompile PHP and exclude them, and I am not
allowed to recompile it with them in), that's external so might be a
possibility.

Thanks.




Re: [PHP] Simple XML - problem with errors

2010-07-08 Thread Richard Quadling
On 8 July 2010 18:55, Gary . php-gene...@garydjones.name wrote:
> On 7/8/10, Marc Guay wrote:
>>> And yes, I'd rather use DOM, but I can't.
>>
>> Could you use this: http://simplehtmldom.sourceforge.net/?
>
> Interesting.
>
> Although I can't use DOM or Tidy (because they're normally built in,
> but TPTB decided to recompile PHP and exclude them, and I am not
> allowed to recompile it with them in), that's external so might be a
> possibility.
>
> Thanks.

If it were Windows, the Tidy extension would be loadable via php.ini.

You could ask TPTB why they've removed the only tool that can read
this sh*t with any success?

Make the case for it. If they still say no, then tell them that the
sh*t is NOT XML and therefore the XML tools won't read it.
