On Wed, Nov 17, 2010 at 10:27 AM, Peter Lind <peter.e.l...@gmail.com> wrote:

> Quick note, in case anyone has similar problems: make sure that the
> data you feed into DOMDocument is UTF8 encoded
>

I can attest to this as well. I just fixed a bug in our sitemap-building
code that was producing some items with empty titles for Google News. It
turned out they had smart quotes from Word in them because the title field
wasn't being passed through the filter. Once I filtered and converted to
UTF-8, all was well again.
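
For anyone hitting the same thing, here is a rough sketch of that kind of
conversion step. It is not our actual filter: to_utf8() is a hypothetical
helper name, and it assumes that any input which is not already valid UTF-8
is Windows-1252 (the encoding Word smart quotes usually arrive in). It also
needs the mbstring extension.

```php
<?php
// Hypothetical helper: return $text as UTF-8. Assumes that input which
// is not already valid UTF-8 is Windows-1252 encoded.
function to_utf8($text)
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave it untouched
    }
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Example title with smart quotes as raw Windows-1252 bytes (0x93, 0x94).
$title = "\x93Breaking\x94 news";

$doc  = new DOMDocument('1.0', 'UTF-8');
$item = $doc->createElement('title');
// Convert before the string ever reaches the DOM.
$item->appendChild($doc->createTextNode(to_utf8($title)));
$doc->appendChild($item);

echo $doc->saveXML();
```

If your input can arrive in other encodings, a blanket Windows-1252
assumption like this will mangle it, so treat the source encoding as
something to confirm for your own data.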

The strange thing is that we just upgraded to PHP 5.3, and I can't believe
no one had accidentally pasted in a smart quote before the upgrade. We're
running 5.3.3 in fact, and I wouldn't be surprised if something changed in
DOMElement.

David
