On Jan 19, 2010, at 4:04 PM, Khalid Baheyeldin wrote:

On Tue, Jan 19, 2010 at 2:15 PM, Alex Barth wrote:

After getting a report that

http://news.google.com/news?pz=1&hl=ar&q=سوريا&cf=all&output=rss

Could be a UTF-8 issue? The q= has "Syria" (in Arabic) in it. Is that stripped out
somewhere in some layer in Drupal?

bangpound pointed that out on the issue queue, too. Indeed, URL-encoding the Arabic string fixes the behavior I described. My guess that Google News might require special request parameters was simply not on the right track.
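
For anyone following along, here is a minimal sketch of what URL-encoding the query string looks like. It is written in Python for brevity, not in the PHP that the Drupal code actually uses, and the variable names (params, encoded_url) are only illustrative. It assumes the feed expects the query value as percent-encoded UTF-8 bytes, which matches the fix reported above.

```python
from urllib.parse import urlencode

# Query parameters taken from the feed URL quoted earlier in this thread.
params = [
    ("pz", "1"),
    ("hl", "ar"),
    ("q", "سوريا"),  # "Syria" in Arabic
    ("cf", "all"),
    ("output", "rss"),
]

# urlencode() percent-encodes each value as UTF-8 bytes by default,
# so the resulting URL contains only ASCII characters.
encoded_url = "http://news.google.com/news?" + urlencode(params)
print(encoded_url)
```

The Arabic word becomes q=%D8%B3%D9%88%D8%B1%D9%8A%D8%A7 in the request URL, while the rest of the query string is unchanged.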

What I am not clear about now is whether wget and PHP streams do better URL sanitization before making the request, or whether non-ASCII characters are allowed in an HTTP URL and curl simply doesn't support them.


--
Khalid M. Baheyeldin
2bits.com, Inc.
http://2bits.com
Drupal optimization, development, customization and consulting.
Simplicity is prerequisite for reliability. --  Edsger W.Dijkstra
Simplicity is the ultimate sophistication. --   Leonardo da Vinci

Alex Barth
http://www.developmentseed.org/blog
tel (202) 250-3633



