Took a little while to find the time, but finally managed to fix this issue
:)

There were 2 unrelated issues. The first was that json_encode translated
UTF-8 into its escaped (\uXXXX) variant, which is actually valid and would
display properly ... pretty, however, it's not, so thanks to Eiji's help I've
built a small convert-to-utf8 step into the makeRequest handler so that the
output is a bit more readable again.
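
To illustrate the difference between the two encodings, here is a minimal
sketch in Python rather than the actual PHP change in the makeRequest
handler. The `entry` dictionary is sample data taken from the output further
down, and `ensure_ascii` is the Python standard library's switch for this
behaviour. Note that Python's json module does not escape forward slashes the
way PHP's json_encode does, so only the unicode part is shown.

```python
import json

# Sample data modelled on one entry of the feed output in this thread.
entry = {"Title": "速読って…", "Link": "http://www.kots.jp/blog/archives/1883"}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
# This is valid JSON, but hard to read.
escaped = json.dumps(entry)

# ensure_ascii=False keeps the characters as they are; the resulting string
# can then be sent to the client encoded as UTF-8.
readable = json.dumps(entry, ensure_ascii=False)

# Both forms decode to exactly the same data.
assert json.loads(escaped) == json.loads(readable) == entry
```

Either string is correct for a JSON parser; the second one is simply easier
on the eyes when debugging.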

The second issue was that the feed parsing code assumed that the Link
element in the Entry would be a text string, while in these cases it was
a list of DOM nodes instead, so I've added a bit of code that detects that
and uses the correct href element.
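
The idea can be sketched as follows, again in Python and not as the actual
Shindig code. The function name `entry_link`, the element names and the
example.org URLs are all hypothetical, and reading the URL from an `href`
attribute with a preference for rel="alternate" is an assumption borrowed
from how Atom-style links usually look.

```python
import xml.etree.ElementTree as ET


def entry_link(entry):
    """Return a single URL string for an entry, or None if it has no link."""
    links = entry.findall("link")
    # Case 1: the link is plain text inside the element.
    for link in links:
        if link.text and link.text.strip():
            return link.text.strip()
    # Case 2: one or more link nodes that carry the URL in an href attribute.
    # Prefer rel="alternate"; a missing rel is treated as "alternate".
    for link in links:
        if link.get("rel", "alternate") == "alternate" and link.get("href"):
            return link.get("href")
    # Fallback: any link node that has an href at all.
    for link in links:
        if link.get("href"):
            return link.get("href")
    return None


# Hypothetical entries: one with a text link, one with a list of link nodes.
text_entry = ET.fromstring("<entry><link>http://example.org/a</link></entry>")
node_entry = ET.fromstring(
    '<entry><link rel="replies" href="http://example.org/a#comments"/>'
    '<link rel="alternate" href="http://example.org/a"/></entry>'
)

assert entry_link(text_entry) == "http://example.org/a"
assert entry_link(node_entry) == "http://example.org/a"
```

Either way the caller gets a plain string back, so the JSON output carries a
URL instead of a list of empty objects.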

As a result the output is now:

throw 1; < don't be evil'
>{"http:\/\/www.kots.jp\/blog\/feed":{"rc":200,"body":"{\"Entry\":[{\"Title\":\"速読って…\",\"Link\":\"http:\\\/\\\/www.kots.jp\\\/blog\\\/archives\\\/1883\",\"Date\":1253440553},{\"Title\":\"連休兼夏休み\",\"Link\":\"http:\\\/\\\/www.kots.jp\\\/blog\\\/archives\\\/1880\",\"Date\":1253347393},{\"Title\":\"セカイカメラ
すごいですね…\",\"Link\":\"http:\\\/\\\/www.kots.jp\\\/blog\\\/archives\\\/1876\",\"Date\":1253280507}],\"Title\":\"kots
blog\",\"URL\":\"http:\\\/\\\/www.kots.jp\\\/blog\\\/feed\",\"Description\":\"創れるということは「特別」ではないのかも知れないが、その人が『創った』ものは特別な存在にしたいと思うので。\",\"Link\":\"http:\\\/\\\/www.kots.jp\\\/blog\",\"Author\":\"admin\"}"}}

As it should be.

Thanks again for the feedback!

   -- Chris

On Sat, Sep 5, 2009 at 3:57 AM, Chris Chabot <[email protected]> wrote:

> Hey Robert,
>
> Sorry for the slow reply, I've taken an initial look at this and your guess
> is correct that the unicode translation is what's causing it to fail.
> Unicode is historically quite difficult to get right, and especially in PHP
> (it won't be until version 6 that they will really improve that).
>
> Thanks for the report & including a way to reproduce the issue, I'll take a
> look at fixing this correctly in the near future (within the next week or
> 2).
>
>    -- Chris
>
> 2009/8/27 Robert Gravina <[email protected]>
>
>> I'm having problems again with some RSS feeds and the PHP version of
>> Shindig. These RSS feeds are returned correctly with the Java version,
>> but the PHP version returns different JSON.
>>
>> Here are two examples of RSS feeds which don't work correctly:
>> http://triglav.jp/blog/feed/
>> http://www.kots.jp/blog/feed
>>
>> The link object, rather than being a string of the URL, is an array of
>> empty objects.
>>
>> The response from the first feed (with only one entry for brevity):
>> throw 1; < don't be evil'
>> >{"http:\/\/triglav.jp
>> \/blog\/feed\/":{"body":"{\"Entry\":[{\"Title\":\"\\u4e2d\\u53e4\\u81ea\\u52d5\\u8eca\\u4e8b\\u696d\\u3092\\u771f\\u5263\\u306b\\u958b\\u59cb\\u3057\\u307e\\u3057\\u305f\\u3002\",\"Link\":[{},{}],\"Date\":1249130296}],\"Title\":\"TRIBLOG\",\"URL\":\"http:\\\/\\\/
>> triglav.jp
>> \\\/blog\\\/feed\\\/\",\"Description\":\"\\u682a\\u5f0f\\u4f1a\\u793e\\u30c8\\u30ea\\u30b0\\u30e9\\u30d5\\u306e\\u65e5\\u5e38\\u3084\\u7c21\\u5358\\u306a\\u304a\\u77e5\\u3089\\u305b\\u306a\\u3069\",\"Link\":\"http:\\\/\\\/
>> triglav.jp\\\/blog\",\"Author\":\"monobe\"}","rc":200}}
>>
>> As you can see, aside from the response being full of escaped unicode
>> chars, Link == [{},{}].
>>
>> Here's what the Java version returns:
>>
>> throw 1; < don't be evil'
>> >{"http://triglav.jp/blog/feed/":{"rc":200,"body":"{\"Link\":\"
>> http://triglav.jp/blog
>> \",\"Description\":\"株式会社トリグラフの日常や簡単なお知らせなど\",\"URL\":\"
>> http://triglav.jp/blog/feed/
>> \",\"Author\":\"monobe\",\"Title\":\"TRIBLOG\",\"Entry\":[{\"Link\":\"
>> http://triglav.jp/blog/2009/08/01/%e4%b8%ad%e5%8f%a4%e8%87%aa%e5%8b%95%e8%bb%8a%e4%ba%8b%e6%a5%ad%e3%82%92%e9%96%8b%e5%a7%8b%e3%81%97%e3%81%be%e3%81%97%e3%81%9f/
>> \",\"Date\":1249130296000,\"Title\":\"中古自動車事業を真剣に開始しました。\"}]}"}}
>>
>> And the Link is fine.
>>
>> These urls have unicode characters in them (which might be causing
>> it). The PHP version does seem to handle other RSS feeds containing
>> unicode characters, and the second RSS example above does not contain
>> unicode characters in links. Or it might not have anything to do with
>> unicode at all.
>>
>> I'd just like to know if this is a known issue, anyone has come across
>> it before etc.
>>
>> Robert
>>
>
>
