Re: encoding hebrew text

2009-12-16 Thread Shlomi Fish
On Wednesday 16 Dec 2009 01:09:29 Uri Even-Chen wrote:
 OK, I found a solution.  I opened the file with both notepad and
 notepad++, then I changed the encoding to windows-1255 in notepad++,
 then I copied all the contents to notepad and saved in utf-8.  It
 works.  I'm attaching the result.
 
 Thanks!
 Uri Even-Chen
 Mobile Phone: +972-50-9007559
 E-mail: u...@speedy.net
 Blog: http://www.speedy.net/uri/blog/
 

Just a note: one can use iconv, or Perl's Encode module
(http://perldoc.perl.org/Encode.html), to convert an entire file from one
encoding to another:


iconv -f windows-1255 -t utf-8  1.txt


Seems to work here.
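
For anyone without iconv at hand, the same kind of conversion can be sketched in Python. This is only an illustration and not part of the original suggestion: the helper name convert_to_utf8 and the output file name are made up here, and "cp1255" is Python's codec name for windows-1255.

```python
def convert_to_utf8(src, dst, source_encoding="cp1255"):
    """Re-encode the text file `src` as UTF-8 (no BOM) and write it to `dst`.

    Hypothetical helper for illustration; "cp1255" is Python's codec name
    for the windows-1255 code page.
    """
    # newline="" disables newline translation, so CRLF line endings survive.
    with open(src, "r", encoding=source_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding="utf-8", newline="") as outfile:
        outfile.write(text)


# Example call (both file names are placeholders):
# convert_to_utf8("1.txt", "1.utf8.txt")
```

Writing to a separate output file, rather than overwriting the input, keeps the original around in case the guessed source encoding turns out to be wrong.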

Regards,

Shlomi Fish

-- 
-
Shlomi Fish   http://www.shlomifish.org/
Best Introductory Programming Language - http://shlom.in/intro-lang

Bzr is slower than Subversion in combination with Sourceforge. 
( By: http://dazjorz.com/ )

___
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il


Re: encoding hebrew text

2009-12-16 Thread Ori Idan
On Wed, Dec 16, 2009 at 11:22 AM, Shlomi Fish shlo...@iglu.org.il wrote:

 Just a note, one can use iconv or Perl's
 http://perldoc.perl.org/Encode.html
 module or something to convert an entire file from one encoding to the
 other:

 
 iconv -f windows-1255 -t utf-8  1.txt
 

 Seems to work here.

 Regards,

Shlomi Fish
 I am using iconv on linux.
 I am not sure if iconv exists for windows.

 --
 Ori Idan



Re: encoding hebrew text

2009-12-16 Thread Shlomi Fish
On Wednesday 16 Dec 2009 11:50:49 Ori Idan wrote:

 I am using iconv on linux.
 I am not sure if iconv exists for windows.

It does:

http://gnuwin32.sourceforge.net/packages/libiconv.htm

Regards,

Shlomi Fish

-- 
-
Shlomi Fish   http://www.shlomifish.org/
The Case for File Swapping - http://shlom.in/file-swap

Bzr is slower than Subversion in combination with Sourceforge. 
( By: http://dazjorz.com/ )



Re: encoding hebrew text

2009-12-16 Thread Tom Goren
iconv is what I was going to recommend in the first place, once the
encodings were figured out.



encoding hebrew text

2009-12-15 Thread Uri Even-Chen
Hi people,

I have a problem with the encoding of Hebrew text files on Windows.  I used
notepad to edit these files; now I'm using notepad++ (by the way, I
highly recommend notepad++ on Windows).  The problem is that Hebrew text
appears as gibberish (ëøèéñ etc.).  I tried different encodings;
eventually, using windows-1255 as the character set, I could read
the Hebrew in notepad++, but I can't convert it to utf-8.  Also, in one
of my files I can't read the Hebrew at all, even with the windows-1255
encoding.  I need help fixing the Hebrew encoding and converting the
files to utf-8.  Am I right that utf-8 is the best solution for
Hebrew?
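
Gibberish of this kind usually means the bytes are intact but are being displayed with the wrong code page. A minimal Python sketch of the idea follows; it assumes the text was written as windows-1255 and displayed as windows-1252 (the Western European code page), which is a guess about the setup, and the helper name undo_mojibake is hypothetical.

```python
def undo_mojibake(garbled, shown_as="cp1252", really="cp1255"):
    """Recover text whose bytes were displayed with the wrong code page.

    Assumption: the bytes were written as `really` (windows-1255) but
    displayed as `shown_as` (windows-1252).
    """
    # Step 1: turn the wrongly displayed characters back into the raw bytes.
    raw = garbled.encode(shown_as)
    # Step 2: decode those same bytes with the code page they were written in.
    return raw.decode(really)


sample = "ëøèéñ"               # the gibberish quoted in the message
fixed = undo_mojibake(sample)  # the same five bytes, read as windows-1255
```

Under these assumptions the five sample characters come back as the Hebrew word כרטיס, which suggests that nothing was lost and the file only needs to be read with the right encoding.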

Thanks,
Uri Even-Chen
Mobile Phone: +972-50-9007559
E-mail: u...@speedy.net
Blog: http://www.speedy.net/uri/blog/



Re: encoding hebrew text

2009-12-15 Thread Tom Goren
Could you perhaps attach an example of such a file?

It would make it easier to recommend the appropriate conversion for you to
make (in my opinion everything should eventually be utf8).

I know that notepad++ should be sufficient.

tom.



Re: encoding hebrew text

2009-12-15 Thread Tom Goren
Also, I just remembered: notepad itself can save files natively as utf8;
however, it writes utf8 with a BOM, which is bad for you if your other
tools do not expect a BOM.



Re: encoding hebrew text

2009-12-15 Thread Uri Even-Chen
OK, I'm attaching an example file with a few lines in Hebrew (view
it as windows-1255 to read the Hebrew).  I checked again, and I can
convert the file to windows-1255 encoding; the problem was that it was
already in utf-8, which is why I couldn't convert it.  But I still
need to encode it in utf-8 with the Hebrew readable, so that I will be
able to read the file with both notepad and notepad++.
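
Whether a file is "already in utf-8" can be checked mechanically, because Hebrew text stored as windows-1255 is almost never valid UTF-8. The sketch below tries a strict UTF-8 decode first and only then falls back to the legacy code page. The function name decode_hebrew_bytes is invented for this example, and the fallback encoding is an assumption about where the files came from.

```python
def decode_hebrew_bytes(data, legacy_encoding="cp1255"):
    """Return (text, encoding_used) for the raw bytes of a text file.

    Hypothetical helper: tries strict UTF-8 first, then falls back to
    `legacy_encoding` (assumed here to be windows-1255).
    """
    try:
        # "utf-8-sig" accepts UTF-8 with or without a leading BOM.
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the legacy Hebrew code page.
        return data.decode(legacy_encoding), legacy_encoding
```

Note that a file containing only ASCII characters decodes successfully as UTF-8, so it is reported as utf-8; this is harmless, since ASCII bytes are identical in both encodings.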

Uri.




   זיכרון נייד חדשני Sandisk Cruzer Micro U3 2GB.
   דגם: SDCZ6-2048-E10WT 
   מספר מכירה: 11648541  
   מחיר המוצר כולל דמי משלוח: 139 ₪


Re: encoding hebrew text

2009-12-15 Thread Uri Even-Chen
What's the difference between encoding with a BOM and without one?  In
notepad++ I have both options, and I don't know which one is better.  I
know that with a BOM, notepad recognizes the file as utf-8, which is good
when opening the files with notepad.
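
At the byte level, the only difference is a three-byte prefix (EF BB BF) at the very start of the file; the text itself is encoded identically. A small Python sketch, using the standard "utf-8" and "utf-8-sig" codecs and an arbitrary sample word:

```python
import codecs

text = "שלום"  # arbitrary sample word, not taken from the attached file

without_bom = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with the BOM

# The BOM is exactly the three bytes EF BB BF.
bom_is_only_difference = with_bom == codecs.BOM_UTF8 + without_bom

# A reader that is unaware of the BOM sees it as an extra U+FEFF character.
naive_read = with_bom.decode("utf-8")

# "utf-8-sig" drops the BOM if present and also accepts files without one.
tolerant_read = with_bom.decode("utf-8-sig")
```

Programs that do not expect a BOM may treat the extra U+FEFF character as part of the first line, which is the usual reason for preferring UTF-8 without a BOM outside of notepad.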

Uri Even-Chen
Mobile Phone: +972-50-9007559
E-mail: u...@speedy.net
Blog: http://www.speedy.net/uri/blog/






Re: encoding hebrew text

2009-12-15 Thread Matitiahu Allouche
OK, your file is encoded in windows-1255.  What is the problem?  Open it 
in Notepad and save it as UTF-8.  You will get a BOM at the beginning.  If 
this is bad for you, edit the file with any editor except Notepad and 
remove the first 3 bytes.
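
Removing those first 3 bytes can also be scripted. The following is a rough Python sketch, not a tested tool: the helper name strip_utf8_bom is hypothetical, and it rewrites the file in place, so it is best tried on a copy first.

```python
import codecs


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, if there is one.

    Hypothetical helper; returns True if a BOM was found and removed.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as f:
        # codecs.BOM_UTF8 is the 3-byte sequence EF BB BF.
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

The file is opened in binary mode on purpose, so that nothing except the three BOM bytes is changed.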


Shalom (Regards),  Mati
   Bidi Architect
   Globalization Center Of Competency - Bidirectional Scripts
   IBM Israel
   Phone: +972 2 502    Fax: +972 2 5870333    Mobile: +972 52 2554160




From: Uri Even-Chen u...@speedy.net
To: Tom Goren motne...@gmail.com
Cc: linux-il linux-il@cs.huji.ac.il
Date: 16/12/2009 00:47
Subject: Re: encoding hebrew text
Sent by: linux-il-boun...@cs.huji.ac.il



OK, I'm attaching an example file with a few lines in hebrew (encode
it in windows-1255 to read the hebrew).  I checked now and I can
convert the file to windows-1255 encoding, the problem was that it was
already in utf-8 and that's why I couldn't convert it.  but I still
need to encode it in utf-8 with hebrew enabled, so I will be able to
read the file with notepad and notepad++.

Uri.


[attachment 1.txt deleted by Matitiahu Allouche/Israel/IBM] 


Re: encoding hebrew text

2009-12-15 Thread Uri Even-Chen
If I open the file with notepad, the Hebrew text is not readable.  If
I save it as utf-8, the Hebrew text is not readable even in notepad++.
I need to convert the file to utf-8 in another way; I can't see any
option to encode it with windows-1255 in notepad.

Attached is the file in utf-8.
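
A file saved this way is typically valid UTF-8, but of the wrong characters: the gibberish itself has been encoded. Assuming notepad read the windows-1255 bytes as windows-1252 before saving (a guess about the system code page), the damage can usually be reversed. The helper name repair_double_encoding in this Python sketch is hypothetical.

```python
def repair_double_encoding(data, misread_as="cp1252", real_encoding="cp1255"):
    """Undo a save-as-UTF-8 of text that was read with the wrong code page.

    Assumptions: `data` is the raw content of the broken UTF-8 file, and the
    original bytes were `real_encoding` misread as `misread_as`.
    """
    # The broken file decodes cleanly; "utf-8-sig" also drops notepad's BOM.
    garbled = data.decode("utf-8-sig")
    # Map each gibberish character back to the single byte it came from.
    raw = garbled.encode(misread_as)
    # Read the recovered bytes with the code page they were written in.
    return raw.decode(real_encoding)
```

The encode step raises UnicodeEncodeError if the file contains characters that do not exist in the assumed code page, which is a useful sign that the guess about how the file was misread is wrong.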

Uri Even-Chen
Mobile Phone: +972-50-9007559
E-mail: u...@speedy.net
Blog: http://www.speedy.net/uri/blog/




   æéëøåï ðééã çãùðé Sandisk Cruzer Micro U3 2GB.
   ãâí: SDCZ6-2048-E10WT 
   îñôø îëéøä: 11648541  
   îçéø äîåöø ëåìì ãîé îùìåç: 139 ¤


Re: encoding hebrew text

2009-12-15 Thread Uri Even-Chen
OK, I found a solution.  I opened the file with both notepad and
notepad++, then I changed the encoding to windows-1255 in notepad++,
then I copied all the contents to notepad and saved in utf-8.  It
works.  I'm attaching the result.

Thanks!
Uri Even-Chen
Mobile Phone: +972-50-9007559
E-mail: u...@speedy.net
Blog: http://www.speedy.net/uri/blog/




   זיכרון נייד חדשני Sandisk Cruzer Micro U3 2GB.
   דגם: SDCZ6-2048-E10WT 
   מספר מכירה: 11648541  
   מחיר המוצר כולל דמי משלוח: 139 ₪