Re: Ripping Hebrew CDs
On Mon, 23 Apr 2007 16:24:17 Yedidyah Bar-David wrote:
> On Mon, Apr 23, 2007 at 02:41:51PM +0300, Ehud Karni wrote:
> > I prefer the file name to be in ISO-8859-8 (8 bits) and not UTF-8.
> > Then I can see the Hebrew in Emacs and xterm, but not in Gnome or KDE
>
> Any reason not to use utf-8 with xterm/emacs? I admit I do not use
> emacs so I do not know how comfortable it is, but in xterm/vim it's
> fine, tab completion and everything.

The real problem is that you can't have both ISO-8859-8 (8-bit Hebrew) and UTF-8 Hebrew displayed correctly together. My company has a LOT of files encoded in ISO-8859-8, and that won't change in the near (and, I believe, also in the far[1]) future. I use many simple UNIX tools (cat, more, sed, etc.) to manipulate and view file names and content, on both X and tty terminals. So, ISO-8859-8 it is.

Ehud.

[1] Unless the company somehow becomes international and has to have many languages simultaneously, and even then, I think they will use fixed-width wide characters (16 bits) and not UTF-8.

-- 
Ehud Karni, Mivtach - Simon Insurance agencies
Tel: +972-3-7966-561  Fax: +972-3-7966-667
(USA) voice mail and FAX: 1-815-5509341
http://www.mvs.co.il  GnuPG: 98EA398D http://www.keyserver.net/
ASCII Ribbon Campaign Against HTML Mail
Better Safe Than Sorry

To unsubscribe, send mail to [EMAIL PROTECTED] with the word unsubscribe in the message body, e.g., run the command echo unsubscribe | mail [EMAIL PROTECTED]
Re: Ripping Hebrew CDs
On Tue, 24 Apr 2007 11:56:38 +0300, יובל האגר wrote:
> On Monday, 23 April 2007, 15:46, Ehud Karni wrote:
> > You have to re-encode the file name to Hebrew UTF-8 like this:
> > NEWNM=`echo "$NM" | iconv -futf8 -tlatin1 | iconv -fhebrew -tutf8`
>
> Thanks! I've been looking for some time how to do this.. I didn't
> think there could even be a latin1/hebrew in a UTF-8 encoding..
> Anyway, that solves my problem too with file names, but how should I
> handle ID3 tags?

You can use the id3lib package ( http://sourceforge.net/projects/id3lib/ ) to extract the names that should be Hebrew, try to convert them using something like the command above and, if successful, replace the original names. You can then write a script that does this for all your songs.

I'm sure you can get this package for any Linux distribution. But also read this: https://bugs.launchpad.net/debian/+source/id3lib3.8.3/+bug/54136

There are other ways (using simpler tools) to extract the tag and replace it, but you'll have to do it with your own scripts.

Ehud.

-- 
Ehud Karni, Senior System Support
Mivtach - Simon Insurance agencies
Tel: 03-7966-561  Fax: 03-7966-667
http://www.mvs.co.il  mailto:[EMAIL PROTECTED]
Better Safe Than Sorry
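
The "try to convert and, if successful, replace" step can be sketched in shell as follows. This is only an illustration under stated assumptions: `fix_hebrew` is a hypothetical helper name (it is not part of id3lib or iconv), and the sketch assumes an iconv that knows the names ISO-8859-1, ISO-8859-8 and UTF-8, as the glibc one does. It works on any string, so it could be fed a file name or a tag value that some other tool has already extracted.

```shell
#!/bin/sh
# fix_hebrew STRING
# Hypothetical helper: undo the "Hebrew bytes read as Latin-1, stored as UTF-8"
# damage. Prints the repaired string when both iconv steps succeed; otherwise
# prints the input unchanged, so strings that are already fine are left alone.
fix_hebrew() {
    if step1=$(printf '%s' "$1" | iconv -f UTF-8 -t ISO-8859-1 2>/dev/null) &&
       step2=$(printf '%s' "$step1" | iconv -f ISO-8859-8 -t UTF-8 2>/dev/null)
    then
        printf '%s\n' "$step2"
    else
        printf '%s\n' "$1"
    fi
}

# Possible use on file names (left commented out; try it on a copy first):
# for f in *.ogg; do
#     new=$(fix_hebrew "$f")
#     [ "$f" = "$new" ] || mv -- "$f" "$new"
# done
```

Real Hebrew UTF-8 fails the first step (Hebrew letters have no Latin-1 equivalent) and is therefore left untouched. A name that genuinely contains accented Latin-1 letters could still be converted by mistake, so checking the results by eye before renaming anything seems prudent.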
Re: Ripping Hebrew CDs
On Wednesday, 25 April 2007, 15:09, Ehud Karni wrote:
> On Tue, 24 Apr 2007 11:56:38 +0300, יובל האגר wrote:
> > On Monday, 23 April 2007, 15:46, Ehud Karni wrote:
> > > You have to re-encode the file name to Hebrew UTF-8 like this:
> > > NEWNM=`echo "$NM" | iconv -futf8 -tlatin1 | iconv -fhebrew -tutf8`
> >
> > Thanks! I've been looking for some time how to do this.. I didn't
> > think there could even be a latin1/hebrew in a UTF-8 encoding..
> > Anyway, that solves my problem too with file names, but how should
> > I handle ID3 tags?
>
> You can use the id3lib package ( http://sourceforge.net/projects/id3lib/ )
> to extract the should be Hebrew names, try to convert it using
> something like the command above, and, if successful, replace the
> original names. You can then write a script that will do it to all
> your songs.

Exactly! It works flawlessly.

> I'm sure you can get this package for any Linux distribution. But
> also read this:
> https://bugs.launchpad.net/debian/+source/id3lib3.8.3/+bug/54136

Luckily, my distribution (Gentoo) includes this patch by default in the latest id3lib version.

Anyway, using easytag v2.0 proved to be painless even in Hebrew, and it does great work retagging where my ripping program doesn't get it right.. Easytag is so kind, it even checks whether id3lib is broken with regard to UTF-8 tags and notifies you about it.

Thanks!

--yuval
Re: Ripping Hebrew CDs
On Monday, 23 April 2007, 15:46, Ehud Karni wrote:
> On Mon, 23 Apr 2007 15:25:48 Hadar wrote:
> [snip]
>
> Good that you sent the list with the file name. It is encoded in
> UTF-8 but in latin1 not Hebrew (the song name is: Chadashot
> Meha-Yareach - News from the Moon, Right ?).
>
> You have to re-encode the file name to Hebrew UTF-8 like this:
> NEWNM=`echo "$NM" | iconv -futf8 -tlatin1 | iconv -fhebrew -tutf8`

Thanks! I've been looking for some time how to do this.. I didn't think there could even be a latin1/hebrew in a UTF-8 encoding..

Anyway, that solves my problem too with file names, but how should I handle ID3 tags? I am using k3b for the ripping, and Amarok does not seem to agree with it on the character encoding of the ID3 tags.. I've tried easytag, but it didn't write the tags correctly (and did not show them on the screen..)

Any insight would be appreciated..

--yuval
Ripping Hebrew CDs
Hello,

I'm encoding audio CDs into FLAC files using Sound Juicer. The albums are automatically recognized and the song names are downloaded. When ripping Hebrew albums, the song names are sometimes malformed (a.k.a. gibberish - certainly not Hebrew characters). I've installed the Hebrew packages for Debian and also tried K3B, without success...

Is it a local problem, or a problem with the remote database? If the problem is local, how can I fix it?

Thanks,
Hadar
Re: Ripping Hebrew CDs
On Mon, Apr 23, 2007 at 11:38:35AM +0300, Hadar wrote:
> I'm encoding audio CDs into FLAC files using Sound Juicer. The albums
> are automatically recognized and the songs names are downloaded. When
> ripping Hebrew albums, the songs names are sometimes malformed
> (a.k.a. gibberish - certainly not Hebrew characters).

They are Hebrew characters, but not in the character set you expect them to be in.

I get this when I rip CDs using Audiograbber on Windows. The CD is recognized in the freedb and often I get usable Hebrew names. When I copy the files to a Linux server via Samba the names become garbled. Sometimes they still show properly on a Windows computer, sometimes they are garbage there too.

I know there are ways to fix the Samba side of it, but I don't really care. Someone who had many Hebrew albums and wanted the names in readable Hebrew would.

> Is it a local problem, or a problem with the remote database? If the
> problem is local, how can I fix it?

There are two places where the name of the album and the name of the song are found. One is the obvious one, the file name, which may or may not be important to you. To me, having the name in a simple form that I can easily understand and manipulate is more important than having it in a proper representation of the language. Therefore I use the English name of the artist and, if I use anything at all for the song name, a simple translation. I remove punctuation and replace spaces and special characters with underbars. I convert all file names to lower case. This is done with a Perl program which has gotten quite sophisticated over the years.

The second location, and the more important one to me for Hebrew, is the ID tags in the files themselves. Technically they are ID version 3, or ID3 tags for short. Most players will display the ID3 tags instead of the file name, often by default, so that would be a matter of making sure the player has the correct fonts available and uses them.

Of course I am talking about playing them on a computer. Playing them on an MP3 player and getting proper Hebrew may be impossible. It depends upon the player.

Geoff.

-- 
Geoffrey S. Mendelson, Jerusalem, Israel [EMAIL PROTECTED] N3OWJ/4X1GM
IL Voice: (07)-7424-1667 Fax ONLY: 972-2-648-1443 U.S. Voice: 1-215-821-1838
Visit my 'blog at http://geoffstechno.livejournal.com/
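
The file-name clean-up described in this message (drop punctuation, turn spaces and special characters into underbars, lower-case everything) might look roughly like this in shell. It is a sketch only, not the Perl program mentioned above: `normalize_name` is a hypothetical name, and the exact set of characters treated as punctuation is an assumption.

```shell
#!/bin/sh
# normalize_name NAME
# Hypothetical helper: lower-case ASCII letters, delete a few punctuation
# marks, replace every other character outside [a-z0-9._-] with "_", and
# collapse runs of "_" into one. LC_ALL=C makes tr and sed work byte by byte.
normalize_name() {
    printf '%s\n' "$1" |
        LC_ALL=C tr '[:upper:]' '[:lower:]' |
        LC_ALL=C sed -e "s/[',?()]//g" -e 's/[^a-z0-9._-]/_/g' -e 's/__*/_/g'
}

normalize_name "News From the Moon (Live).ogg"   # prints news_from_the_moon_live.ogg
```

Because the substitution is bytewise, any non-ASCII character in the input ends up as underbars, which matches the approach of using only English or transliterated names in file names.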
Re: Ripping Hebrew CDs
On 4/23/07, Geoffrey S. Mendelson [EMAIL PROTECTED] wrote:
> On Mon, Apr 23, 2007 at 01:06:12PM +0300, Hadar wrote:
> > I don't care if the names are in Hebrew or English, as long as I
> > get them automatically. Is there any English database for Hebrew
> > albums? Ripping a large number of albums and typing the song names
> > manually can take ages.
>
> No, but the Freedb often has multiple records for the same CD.
> Audiograbber lets you choose between them; does your ripper? It may
> be an option.

Sound Juicer doesn't...

Is anyone using an application that plays well with Hebrew text?
Re: Ripping Hebrew CDs
On Mon, 23 Apr 2007 12:24:54 Geoffrey S. Mendelson wrote:
> On Mon, Apr 23, 2007 at 11:38:35AM Hadar wrote:
> > I'm encoding audio CDs into FLAC files using Sound Juicer. The
> > albums are automatically recognized and the songs names are
> > downloaded. When ripping Hebrew albums, the songs names are
> > sometimes malformed (a.k.a. gibberish - certainly not Hebrew
> > characters).
>
> They are Hebrew characters, but not in a character set you expect
> them to be.

I prefer the file name to be in ISO-8859-8 (8 bits) and not UTF-8. Then I can see the Hebrew in Emacs and xterm, but not in Gnome or KDE applications.

I assume you get your Hebrew names in either ISO-8859-8 or PC DOS (CP856). I have 2 small shell scripts (below) that convert ALL Hebrew names in a directory tree to ISO-8859-8 / UTF-8.

Ehud.

#! /bin/sh -e
# Translate all file names in this directory tree to iso-8859-8
LC_ALL=C ; export LC_ALL                # treat file names as raw bytes
D7=$(printf '\327')                     # the byte 0xD7, written in octal
chk_nm ()
{
    printf '\n\n now working on %s\n' "$(pwd)"
    for DFL in *
    do
        case "$DFL" in
            *"$D7"* )                   # 0xD7 is a sign for Hebrew UTF-8
                NDFL=$(echo "$DFL" | iconv -futf8 -thebrew) ;;
            * )                         # NOT UTF-8, DOS Hebrew is 0x80-0x9A
                # map 0x80-0x9A onto the ISO-8859-8 letters 0xE0-0xFA
                NDFL=$(echo "$DFL" | tr '\200-\232' '\340-\372') ;;
        esac
        if [ "$DFL" != "$NDFL" ] ; then # name has changed ?
            mv "$DFL" "$NDFL"           # rename
            echo "$DFL == $NDFL"        # show to user
            DFL="$NDFL"                 # for recursive checking
        fi
        if [ ! -L "$DFL" -a -d "$DFL" ] ; then
            ( cd "$DFL" ; chk_nm )
        fi
    done
}
chk_nm                                  # start scanning
## trns-utf-2-heb.sh ##

#! /bin/sh -e
# Translate all file names in this directory tree to utf-8
LC_ALL=C ; export LC_ALL                # treat file names as raw bytes
D7=$(printf '\327')                     # the byte 0xD7, written in octal
chk_nm ()
{
    printf '\n\n now working on %s\n' "$(pwd)"
    for DFL in *
    do
        case "$DFL" in
            *"$D7"* ) ;;                # 0xD7 is a sign for Hebrew UTF-8
            * )                         # NOT UTF-8, DOS Hebrew is 0x80-0x9A
                NDFL=$(echo "$DFL" | tr '\200-\232' '\340-\372' | iconv -fhebrew -tutf8)
                if [ "$DFL" != "$NDFL" ] ; then # had any Hebrew ?
                    mv "$DFL" "$NDFL"   # rename
                    echo "$DFL == $NDFL" # show to user
                    DFL="$NDFL"         # for recursive checking
                fi ;;
        esac
        if [ ! -L "$DFL" -a -d "$DFL" ] ; then
            ( cd "$DFL" ; chk_nm )
        fi
    done
}
chk_nm                                  # start scanning
## trns-heb-2-utf.sh ##
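
The DOS Hebrew step in the scripts is a fixed shift of 27 byte values: 0x80-0x9A (alef to tav in the DOS code page) move up to 0xE0-0xFA, where ISO-8859-8 keeps the same letters. A small sketch that can be used to check the mapping in isolation follows. `dos_to_iso` is a hypothetical name, and the byte ranges are written as octal escapes so the script itself stays plain ASCII.

```shell
#!/bin/sh
# dos_to_iso: filter that shifts DOS Hebrew bytes (0x80-0x9A, octal 200-232)
# to their ISO-8859-8 positions (0xE0-0xFA, octal 340-372). Other bytes pass
# through unchanged. LC_ALL=C keeps tr working on single bytes.
dos_to_iso() {
    LC_ALL=C tr '\200-\232' '\340-\372'
}

# Four DOS Hebrew bytes (shin, lamed, vav, final mem) converted to UTF-8
# for display on a UTF-8 terminal.
printf '\231\214\205\215' | dos_to_iso | iconv -f ISO-8859-8 -t UTF-8
printf '\n'
```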
Re: Ripping Hebrew CDs
On Mon, 23 Apr 2007 15:25:48 Hadar wrote:
> Thanks for the scripts! If I understand you correctly,
> trns-heb-2-utf.sh is what I need. When I tried it on a directory, it
> simply wiped out the Hebrew characters. Here's some debugging info:
>
> + for DFL in '*'
> + case $DFL in
> ++ echo '01 - çãùåú îäéøç.ogg'
> ++ tr '[\200-\232]' '[א-ת]'
> ++ iconv -fhebrew -tutf8
> iconv: illegal input sequence at position 5
>
> Looks like the input text is not iso-8859-8 (HEBREW) encoded. I
> played with iconv a little but couldn't find the appropriate encoding.
> Any suggestion?

First, Hadar - are you a boy or a girl?

Good that you sent the list with the file name. It is encoded in UTF-8, but as Latin-1 characters rather than Hebrew: the Hebrew (ISO-8859-8) bytes were read as Latin-1 and then stored as UTF-8. (The song name is: Chadashot Meha-Yareach - News from the Moon, right?)

You have to re-encode the file name to Hebrew UTF-8 like this:

NEWNM=`echo "$NM" | iconv -futf8 -tlatin1 | iconv -fhebrew -tutf8`

Ehud.
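
To see why the two-step conversion works, it can help to break a known-good string on purpose and then repair it. The sketch below does that with the word "שלום". It assumes an iconv that accepts the names ISO-8859-1, ISO-8859-8 and UTF-8 (these are the canonical forms of the `latin1`, `hebrew` and `utf8` aliases used above), and the variable names are made up for the example.

```shell
#!/bin/sh
# Sketch: reproduce the double encoding, then undo it.

# "שלום" as UTF-8 bytes, written as octal escapes so this script stays ASCII.
good=$(printf '\327\251\327\234\327\225\327\235')

# Break it: encode as ISO-8859-8, misread those bytes as Latin-1, and store
# the resulting accented letters as UTF-8.
broken=$(printf '%s' "$good" | iconv -f UTF-8 -t ISO-8859-8 | iconv -f ISO-8859-1 -t UTF-8)

# Repair it: go back to the Latin-1 bytes, then read them as the Hebrew
# (ISO-8859-8) bytes they really are.
fixed=$(printf '%s' "$broken" | iconv -f UTF-8 -t ISO-8859-1 | iconv -f ISO-8859-8 -t UTF-8)

printf 'broken: %s\nfixed:  %s\n' "$broken" "$fixed"
```

On a UTF-8 terminal the "broken" line shows accented Latin letters of the same kind as in the debugging output, and the "fixed" line shows the Hebrew word again.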
Re: Ripping Hebrew CDs
On Mon, Apr 23, 2007 at 02:41:51PM +0300, Ehud Karni wrote:
> On Mon, 23 Apr 2007 12:24:54 Geoffrey S. Mendelson wrote:
> > On Mon, Apr 23, 2007 at 11:38:35AM Hadar wrote:
> > > I'm encoding audio CDs into FLAC files using Sound Juicer. The
> > > albums are automatically recognized and the songs names are
> > > downloaded. When ripping Hebrew albums, the songs names are
> > > sometimes malformed (a.k.a. gibberish - certainly not Hebrew
> > > characters).
> >
> > They are Hebrew characters, but not in a character set you expect
> > them to be.
>
> I prefer the file name to be in ISO-8859-8 (8 bits) and not UTF-8.
> Then I can see the Hebrew in Emacs and xterm, but not in Gnome or KDE

Any reason not to use utf-8 with xterm/emacs? I admit I do not use emacs so I do not know how comfortable it is, but in xterm/vim it's fine, tab completion and everything.

-- 
Didi
Re: Ripping Hebrew CDs
On 4/23/07, Ehud Karni [EMAIL PROTECTED] wrote:
> On Mon, 23 Apr 2007 12:24:54 Geoffrey S. Mendelson wrote:
> > On Mon, Apr 23, 2007 at 11:38:35AM Hadar wrote:
> > > I'm encoding audio CDs into FLAC files using Sound Juicer. The
> > > albums are automatically recognized and the songs names are
> > > downloaded. When ripping Hebrew albums, the songs names are
> > > sometimes malformed (a.k.a. gibberish - certainly not Hebrew
> > > characters).
> >
> > They are Hebrew characters, but not in a character set you expect
> > them to be.
>
> I prefer the file name to be in ISO-8859-8 (8 bits) and not UTF-8.
> Then I can see the Hebrew in Emacs and xterm, but not in Gnome or KDE
> applications.
>
> I assume you get your Hebrew name in either ISO-8859-8 or in PC DOS
> (CP856). I have 2 small shell scripts (below) that converts ALL Hebrew
> names in a directory tree to ISO-8859-8 / UTF-8.

Thanks for the scripts! If I understand you correctly, trns-heb-2-utf.sh is what I need. When I tried it on a directory, it simply wiped out the Hebrew characters. Here's some debugging info:

+ for DFL in '*'
+ case $DFL in
++ echo '01 - çãùåú îäéøç.ogg'
++ tr '[\200-\232]' '[א-ת]'
++ iconv -fhebrew -tutf8
iconv: illegal input sequence at position 5

Looks like the input text is not iso-8859-8 (HEBREW) encoded. I played with iconv a little but couldn't find the appropriate encoding. Any suggestion?

> Ehud.
>
> [scripts trns-utf-2-heb.sh and trns-heb-2-utf.sh snipped]