Hi,

You will probably have to extend the code yourself to (a) detect the
HTML page's encoding and (b) convert it to UTF-8 (which should
be very straightforward in Perl).
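
A rough sketch of what (a) and (b) could look like, using the core Encode
module. This is not the actual translate.cgi code: decode_page and
to_utf8_bytes are made-up helper names, the cp1256 fallback is only an
assumption that suits Arabic pages, and the <meta> regex is a simple
heuristic rather than a real HTML parser. The idea is to decode the raw
bytes first and only then call decode_entities, so that characters coming
from entities and characters coming from the page bytes end up in the same
Perl character string.

```perl
use strict;
use warnings;
use Encode qw(decode encode find_encoding);
use HTML::Entities qw(decode_entities);

# Hypothetical helper: turn the raw bytes of a fetched page into a Perl
# character string with HTML entities expanded.
#   $bytes          - page content exactly as fetched (undecoded octets)
#   $header_charset - charset from the HTTP Content-Type header, if any
#   $fallback       - encoding to assume when nothing is declared
sub decode_page {
    my ($bytes, $header_charset, $fallback) = @_;
    $fallback ||= 'cp1256';    # assumption: undeclared pages are Arabic

    # (a) Detect: prefer the HTTP header, then a charset in a <meta> tag.
    my $charset = $header_charset;
    if (!$charset && $bytes =~ /<meta[^>]+charset\s*=\s*["']?([\w.:-]+)/i) {
        $charset = $1;
    }

    # Use the fallback if nothing was declared or Encode does not know it.
    $charset = $fallback unless $charset && find_encoding($charset);

    # (b) Convert: octets -> characters, then expand entities.
    my $text = decode($charset, $bytes);
    return decode_entities($text);
}

# Hypothetical helper: serialize the character string as UTF-8 octets,
# e.g. before handing the text to Moses.
sub to_utf8_bytes {
    my ($text) = @_;
    return encode('UTF-8', $text);
}
```

In translate.cgi this would stand in for the bare decode_entities($html)
call, along the lines of
$html = to_utf8_bytes(decode_page($html, $charset_from_header));
where $charset_from_header is whatever the fetching code can extract from
the response headers (possibly nothing).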

-phi

On Sat, Aug 2, 2008 at 5:48 PM, musa ghurab <[EMAIL PROTECTED]> wrote:
> Hi all
>
> I'm facing an encoding problem with the web-based version of Moses.
> In the web-root file translate.cgi, line 234:
>
> $html=decode_entities($html);
>
> decode_entities (page encoding: windows-1256) -> wrong encoding (not UTF-8)
>
> This is supposed to convert the fetched text from an ISO encoding to UTF-8.
> But when I fetch a page in an encoding other than UTF-8, such as Arabic
> (windows-1256, i.e. cp1256), or any page that does not declare its charset
> in the head tag of the HTML, the result ends up in the wrong encoding and
> Moses cannot understand it.
> I think this is a bug in Perl, or another function must be used for this.
>
> Please suggest any way to solve this problem.
>
>
>
> musa ghurab
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
