Google forces a translation to Japanese
Hi all,

One of my clients is having a weird problem, and I'm pretty much at my wit's end as to what to do about it.

The site is called Tzofit (at tzofit.co.il), and is an index and publisher for Zimmers. When you search Google for צימרים ("zimmers") the site appears on the second page, and when you search Google for צופית ("Tzofit") it is the first result. In both cases, you cannot miss it - Google displays the site's title and summary in Japanese!

Now here's where it gets really strange. While the main site is proclaimed to be in Japanese, all the deep links are in Hebrew. If you ask to see the Google cache, the site appears in Hebrew. If you search for its address directly (tzofit.co.il), the site appears with the correct title and summary.

The only explanation I have is that this is a Google index bug. The problem is that even if that is the case, I cannot see what I can do about it. I tried to ask about it on the Google forums (http://www.google.com/support/forum/p/Web+Search/thread?tid=08c423ea40d5c1abhl=en), but, as expected, got no replies. On the other hand, I did not manage to find anything wrong with the actual page.

Translating the Japanese text back to English with Google Translate shows that the text does translate, but not into coherent sentences. Then again, looking at the raw encoding, this does not appear to be Hebrew interpreted with the wrong encoding (or am I missing something?)

If anyone has any clue, it would be much appreciated.

Thanks,
Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com
___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
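One way to test the "wrong encoding" theory from the message above is to encode a Hebrew string in the charset the page might be written in, then decode those bytes as a Japanese charset and compare the result with what Google shows. The following is only a sketch: the helper names and the list of candidate charset pairs are assumptions for illustration, not something taken from the site.

```python
# Hypothetical helpers for checking whether Hebrew bytes, misread under a
# Japanese charset, could produce the text seen in the search results.

def misread(text, written_as, read_as):
    """Encode text in one charset, then decode the same bytes in another."""
    raw = text.encode(written_as)
    # Bytes that are invalid in the target charset become U+FFFD.
    return raw.decode(read_as, errors="replace")


# Assumed candidates: (charset the page is written in, charset it is read as).
CANDIDATE_PAIRS = [
    ("cp1255", "shift_jis"),
    ("cp1255", "euc_jp"),
    ("iso8859_8", "shift_jis"),
    ("utf-8", "shift_jis"),
]


def mojibake_table(text):
    """Return the misread form of text for every candidate charset pair."""
    return {pair: misread(text, *pair) for pair in CANDIDATE_PAIRS}
```

Calling `mojibake_table()` on the page's real title and comparing each value with the Japanese title Google displays would show whether any of these pairs reproduces it. If none does, a plain charset mix-up becomes less likely, though the list of pairs is not exhaustive.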
Re: Google forces a translation to Japanese
2009/9/14 Shachar Shemesh shac...@shemesh.biz
> One of my clients is having a weird problem, and I'm pretty much at my wit's end as to what to do about it.
> [...]
> If anyone has any clue, it would be much appreciated.

I would try the following:
- Remove the extra newlines from the beginning of the document. An XML document should begin with the XML declaration. Maybe leading newlines are valid, I never checked, but documents usually don't begin that way, so why do it... :)
- In an HTML document, you define the language inside the opening html tag, with lang=he. The meta tag that does this is redundant, and I would assume Google likes the html definition better.
- The newlines in the file appear to be DOS-style. Maybe you want to try running the file through dos2unix.
- It could be this windows-1255 thing. Maybe try putting iso-8859-8-i there, or even better, switch to utf-8 altogether. Everybody loves utf-8 :)

These are my ideas... HTH,

-- Shimi
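The checklist above can be sketched as a small script that inspects the raw bytes of a saved copy of the page. This is a rough illustration under assumptions: `inspect_page` is a hypothetical name, the regular expressions only look at the first 4096 bytes, and they are far looser than a real HTML parser.

```python
import re


def inspect_page(raw: bytes) -> dict:
    """Report the properties the checklist mentions for one page's raw bytes."""
    # The declarations of interest are ASCII, so a lossy ASCII decode of the
    # start of the file is enough to search for them.
    head = raw[:4096].decode("ascii", errors="replace")
    charset = re.search(r'charset\s*=\s*["\']?([\w-]+)', head, re.IGNORECASE)
    lang = re.search(
        r'<html\b[^>]*\blang\s*=\s*["\']?([\w-]+)', head, re.IGNORECASE
    )
    return {
        # True when the document starts with a newline or other whitespace.
        "leading_whitespace": raw[:1].isspace(),
        # True when DOS-style (CRLF) line endings are present anywhere.
        "crlf_line_endings": b"\r\n" in raw,
        # Charset from a meta tag, e.g. windows-1255, if one is declared.
        "declared_charset": charset.group(1).lower() if charset else None,
        # Value of the lang attribute on the opening html tag, if any.
        "html_lang": lang.group(1).lower() if lang else None,
    }
```

Reading the page with `open(path, "rb").read()` and passing the bytes to `inspect_page()` gives a quick before-and-after comparison when trying the suggestions one at a time.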
Re: Google forces a translation to Japanese
Hi Shachar,

This is a bit far-fetched, but if you control the web server, could you verify that Google's bots get no special treatment in the responses sent to them?

Also, I noticed that the web server doesn't specify which language it responds with. It is worth having it do so via the Content-Language header.

2009/9/14 Shachar Shemesh shac...@shemesh.biz:
> One of my clients is having a weird problem, and I'm pretty much at my wit's end as to what to do about it.
> [...]
> If anyone has any clue, it would be much appreciated.
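A minimal sketch of the check suggested above: fetch the same URL once with a browser-like User-Agent and once with a crawler-like one, then compare the bodies and the Content-Language header. The User-Agent strings and function names here are assumptions for illustration. A server that cloaks by source IP address rather than by User-Agent would not be caught this way, and dynamic pages may differ between any two fetches.

```python
import urllib.request

# Assumed User-Agent strings; the real crawler may identify itself differently.
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent):
    """Create a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, timeout=10):
    """Fetch url and keep only the parts relevant to this comparison."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return {
            "content_language": response.headers.get("Content-Language"),
            "content_type": response.headers.get("Content-Type"),
            "body": response.read(),
        }


def compare(as_browser, as_crawler):
    """Summarise the differences between two results returned by fetch()."""
    return {
        "same_body": as_browser["body"] == as_crawler["body"],
        "content_language": (
            as_browser["content_language"],
            as_crawler["content_language"],
        ),
        "content_type": (
            as_browser["content_type"],
            as_crawler["content_type"],
        ),
    }
```

Typical use would be `compare(fetch(url, BROWSER_UA), fetch(url, CRAWLER_UA))`. A `None` in the `content_language` pair means the server sent no Content-Language header for that request.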
Re: Google forces a translation to Japanese
2009/9/14 Shachar Shemesh shac...@shemesh.biz
> Hi all, One of my clients is having a weird problem, and I'm pretty much at my wit's end as to what to do about it.

In addition to the other advice you got, maybe have a sniff around the Google Webmaster Tools site (http://www.google.com/webmasters/) to try to find a way through to Google, or to understand more about what Google thinks about this web site.

--Amos
Re: Google forces a translation to Japanese
Yuval Hager wrote:
>> Trying to translate the Japanese text, using Google Translate, back to English seems to show that the text translates, but is not coherent sentences. Then again, looking at the raw encoding, this does not appear to be Hebrew interpreted with the wrong encoding (or am I missing something?)
>> If anyone has any clue, it would be much appreciated.
> The Japanese text is not complete nonsense, like you would expect from an encoding problem. Could it be that the site was hacked in some way that presents Google bots different content from what others see?

The client contacted no fewer than three (3) SEO specialists, none of whom came up with anything more than "It's malware of some sort". They even recommended we hire a scanning service run by one of the list's participants (which we would have, had Noam answered his messenger - in fact, Noam, please have one of your sales people contact me).

What not one of them managed to do is explain how malware can cause the Google cache to show the wrong result for some searches and the correct one for others, nor how it can make Google show the wrong summary but the correct page in the cache.

My personal opinion is that Google had a bug that crossed the index with some other site. Admittedly, that theory does not completely match all the available evidence. For instance, if you search Google for the Japanese description in quotes, zero results are found (then again, you also don't get Tzofit's site, which is also weird).

Like I said, ideas welcome.

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting Ltd.
http://www.lingnu.com