Bug#307540: Bug#306773: Please support UTF8/Unicode
On Thu, 2005-05-19 at 14:26 +0100, Martin Michlmayr wrote:
> The upstream author of eyeD3 seems very active, so I'm copying him to this message. I'm sure he's in a better position than me to comment, and maybe he can help convert Quod Libet from id3lib to eyeD3.

The next version of Quod Libet supports eyeD3 modulo a few patches; unfortunately I haven't gotten any response from Travis yet. Without them we may have to include a patched copy of the library in the next version of Quod Libet. :/

But the good news is, either way we'll be able to drop python-id3lib from Debian soon.
--
Joe Wreschnig [EMAIL PROTECTED]
Bug#307540: Bug#306773: Please support UTF8/Unicode
The upstream author of eyeD3 seems very active, so I'm copying him to this message. I'm sure he's in a better position than me to comment, and maybe he can help convert Quod Libet from id3lib to eyeD3.

* Joe Wreschnig [EMAIL PROTECTED] [2005-05-03 18:04]:
> retitle 307540 Use eyeD3 rather than id3lib
> tags 307540 + help
> thanks
>
> On Tue, 2005-05-03 at 23:13 +0100, Martin Michlmayr wrote:
> > * [EMAIL PROTECTED] [2005-05-03 21:37]:
> > > retitle -1 Please drop python-id3lib and use python-eyed3 instead
> >
> > Joe, I took a brief look at quodlibet and you can get rid of most of the nasty code in formats/mp3.py by switching to eyeD3. eyeD3 also supports Unicode and is maintained unlike python-id3lib.
>
> I'm not sure what nasty code you refer to; the nastiest bit is the huge mapping tables for ID3 frame IDs into Quod Libet tag names and TCON IDs into genres, which we'd still need.
>
> As indicated at http://www.sacredchao.net/quodlibet/ticket/3 I've looked at eyeD3 and found it completely lacking in documentation. I poked around it for a few hours, but found no easy way to manipulate frames at the level QL wants, and no way at all to support multivalued frames. I'd also like to keep enough control over COMM frames to do the QuodLibet:: namespace trick we use to support unrestricted tags in MP3s.
>
> I've reopened the ticket since the abandoning of python-id3lib makes it an issue again, but don't have time to work on it myself. Anyone more familiar with eyeD3 is encouraged to contribute, either by fixing QL or writing documentation for eyeD3.
>
> An aside about ID3 tags and encodings:
>
> Unicode support is not going to magically appear when people switch to eyeD3; anything using libid3-3.8.3 for reading/writing is going to continue to be broken (#213239 among others). This includes Beep, Grip, and EasyTag. I don't know if libid3tag (which GStreamer uses) properly supports it either.
>
> The millions of broken files in the wild are also going to remain broken, and need special-casing in tag readers/editors that don't want to suck. Given that fact, Quod Libet ignores the encoding flag for ID3 tags anyway, and just uses UTF-8. It already supports Unicode tags, about as well as anything can given the mess of ID3.
>
> Retitling appropriately.
> --
> Joe Wreschnig [EMAIL PROTECTED]

--
Martin Michlmayr
http://www.cyrius.com/
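Editorial sketch: the quoted message mentions a "QuodLibet:: namespace trick" that uses COMM frames to hold unrestricted tags. The thread does not show how Quod Libet implements it, so the following is only an illustration under stated assumptions: that the tag name is carried in the COMM description field behind a `QuodLibet::` prefix, and that the text is UTF-8. The COMM body layout itself (encoding byte, three-byte language code, null-terminated description, then the text) follows the ID3v2.4 frames document. The helper names `comm_body` and `parse_comm_body` are hypothetical.

```python
from typing import Optional, Tuple

PREFIX = "QuodLibet::"  # assumed namespace prefix in the COMM description
UTF8 = 0x03             # ID3v2.4 encoding byte for UTF-8


def comm_body(tag: str, value: str, lang: bytes = b"eng") -> bytes:
    """Build the body of a COMM frame (no frame header) for one free-form tag."""
    if len(lang) != 3:
        raise ValueError("language must be a 3-byte ISO-639-2 code")
    description = (PREFIX + tag).encode("utf-8")
    # encoding byte, language, description, null terminator, actual text
    return bytes([UTF8]) + lang + description + b"\x00" + value.encode("utf-8")


def parse_comm_body(body: bytes) -> Optional[Tuple[str, str]]:
    """Return (tag, value) for a namespaced COMM body, or None for a plain comment."""
    if len(body) < 4 or body[0] != UTF8:
        raise ValueError("this sketch only handles UTF-8 COMM bodies")
    description, _, text = body[4:].partition(b"\x00")
    name = description.decode("utf-8")
    if not name.startswith(PREFIX):
        return None
    return name[len(PREFIX):], text.decode("utf-8")


body = comm_body("mood", "calm")
print(parse_comm_body(body))
```

An ordinary comment whose description lacks the prefix comes back as None, so a reader built this way would leave other programs' comments alone.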
Bug#306773: Please support UTF8/Unicode
clone 306773 -1 -2
reassign -1 quodlibet
reassign -2 rbscrobbler
retitle -1 Please drop python-id3lib and use python-eyed3 instead
retitle -2 Please drop python-id3lib and use python-eyed3 instead
thanks

* Jonas Smedegaard [EMAIL PROTECTED] [2005-05-02 13:07]:
> retitle 306773 Please drop python-id3lib and use python-eyed3 instead
> clone 306773 -1 -2 -3
> reassign -1 pytone
> reassign -2 quodlibet
> reassign -3 rbscrobbler
> thanks

It seems you forgot to CC [EMAIL PROTECTED] Also, pytone uses -eyed3 already.
--
Martin Michlmayr
http://www.cyrius.com/
Bug#307540: Bug#306773: Please support UTF8/Unicode
* [EMAIL PROTECTED] [2005-05-03 21:37]:
> retitle -1 Please drop python-id3lib and use python-eyed3 instead

Joe, I took a brief look at quodlibet and you can get rid of most of the nasty code in formats/mp3.py by switching to eyeD3. eyeD3 also supports Unicode and is maintained unlike python-id3lib.
--
Martin Michlmayr
http://www.cyrius.com/
Bug#307540: Bug#306773: Please support UTF8/Unicode
retitle 307540 Use eyeD3 rather than id3lib
tags 307540 + help
thanks

On Tue, 2005-05-03 at 23:13 +0100, Martin Michlmayr wrote:
> * [EMAIL PROTECTED] [2005-05-03 21:37]:
> > retitle -1 Please drop python-id3lib and use python-eyed3 instead
>
> Joe, I took a brief look at quodlibet and you can get rid of most of the nasty code in formats/mp3.py by switching to eyeD3. eyeD3 also supports Unicode and is maintained unlike python-id3lib.

I'm not sure what nasty code you refer to; the nastiest bit is the huge mapping tables for ID3 frame IDs into Quod Libet tag names and TCON IDs into genres, which we'd still need.

As indicated at http://www.sacredchao.net/quodlibet/ticket/3 I've looked at eyeD3 and found it completely lacking in documentation. I poked around it for a few hours, but found no easy way to manipulate frames at the level QL wants, and no way at all to support multivalued frames. I'd also like to keep enough control over COMM frames to do the QuodLibet:: namespace trick we use to support unrestricted tags in MP3s.

I've reopened the ticket since the abandoning of python-id3lib makes it an issue again, but don't have time to work on it myself. Anyone more familiar with eyeD3 is encouraged to contribute, either by fixing QL or writing documentation for eyeD3.

An aside about ID3 tags and encodings:

Unicode support is not going to magically appear when people switch to eyeD3; anything using libid3-3.8.3 for reading/writing is going to continue to be broken (#213239 among others). This includes Beep, Grip, and EasyTag. I don't know if libid3tag (which GStreamer uses) properly supports it either.

The millions of broken files in the wild are also going to remain broken, and need special-casing in tag readers/editors that don't want to suck. Given that fact, Quod Libet ignores the encoding flag for ID3 tags anyway, and just uses UTF-8. It already supports Unicode tags, about as well as anything can given the mess of ID3.

Retitling appropriately.
--
Joe Wreschnig [EMAIL PROTECTED]
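Editorial sketch: the message above says Quod Libet ignores the ID3 encoding flag and just uses UTF-8, and that broken files in the wild need special-casing. The thread does not show the actual code, so the following is only one plausible way to read text that way. The Latin-1 fallback for bytes that are not valid UTF-8 is an assumption added here, not something the message states, and `decode_lenient` is a hypothetical name.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode tag text as UTF-8 regardless of the frame's encoding byte.

    Assumption: if the bytes are not valid UTF-8, treat them as Latin-1,
    which can decode any byte sequence and so never raises.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


# UTF-8 bytes written by a tagger that left the flag at "Latin-1"
print(decode_lenient("Björk".encode("utf-8")))
# Genuine Latin-1 bytes, which are not valid UTF-8
print(decode_lenient("Björk".encode("latin-1")))
```

The trade-off is that Latin-1 text which happens to form valid UTF-8 would be misread, although such sequences are uncommon in ordinary Western European text.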
Bug#306773: Please support UTF8/Unicode
retitle 306773 Please drop python-id3lib and use python-eyed3 instead
clone 306773 -1 -2 -3
reassign -1 pytone
reassign -2 quodlibet
reassign -3 rbscrobbler
thanks

On 02-05-2005 00:28, Chris Vanden Berghe wrote:
> Last week I looked at the python-id3lib code a bit (for the purpose of adding UTF8 support to Jack) and came to the conclusion that IMHO it would be better to drop python-id3lib in favor of python-eyed3.
>
> The upstream python-id3lib package seems to be unmaintained and the underlying libid3 library itself also seems to have a plethora of open bugs, several of which are related to UTF8. The python-eyed3 library on the other hand seems to be well maintained and UTF8-aware. Porting an application from python-id3lib to python-eyed3 is quite easy.
>
> So, in this respect I prefer to spend my time porting Jack (and if requested also other rdepends on python-id3lib) to python-eyed3 than fixing python-id3lib itself. Hope you understand...

I do understand - and agree that it sounds most sane to abandon python-id3lib.

It seems only pytone, quodlibet and rbscrobbler rdepend on python-id3lib. I would appreciate your help patching those packages to switch to using python-eyed3 instead, so that python-id3lib can be dropped completely.

I have now cloned this bug report to the relevant packages.

 - Jonas

--
* Jonas Smedegaard - idealist and Internet architect
* Tel.: +45 40843136 Website: http://dr.jones.dk/
 - The end is near: http://www.shibumi.org/eoti.htm
Bug#306773: Please support UTF8/Unicode
Hi Jonas,

Last week I looked at the python-id3lib code a bit (for the purpose of adding UTF8 support to Jack) and came to the conclusion that IMHO it would be better to drop python-id3lib in favor of python-eyed3.

The upstream python-id3lib package seems to be unmaintained and the underlying libid3 library itself also seems to have a plethora of open bugs, several of which are related to UTF8. The python-eyed3 library on the other hand seems to be well maintained and UTF8-aware. Porting an application from python-id3lib to python-eyed3 is quite easy.

So, in this respect I prefer to spend my time porting Jack (and if requested also other rdepends on python-id3lib) to python-eyed3 than fixing python-id3lib itself. Hope you understand...

Best regards,
Chris.

---
> Hi Chris,
>
> If you get around to hacking UTF-8 support into the python library please post the patch here, so I can include it with the official Debian package.
>
> You say you'd preserve current behaviour of the library by default - but does that make sense at all? I don't know the details of ID3v2, but if it is defined as UTF-8 then current behaviour is broken IMHO (even if it happens to work in the western world).
>
> As you can probably read between the lines of the above, I do packaging, but am not capable of hacking the code itself, so can't be of assistance in fixing this.
>
> Regards,
>
>  - Jonas
Bug#306773: Please support UTF8/Unicode
Package: python-id3lib
Severity: wishlist

Please support UTF8 ID3v2 tags.

- Forwarded message from Chris Vanden Berghe [EMAIL PROTECTED] -

From: Chris Vanden Berghe [EMAIL PROTECTED]
Subject: Re: Bug#266052: jack: doesn't support UTF8 freedb entries
Date: Tue, 26 Apr 2005 13:37:54 +0200
To: Martin Michlmayr [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]

Thanks!

I looked myself a bit already... changing the byte exactly before every ID3 string from 00 to 03 results in a correct UTF8 ID3v2 tag. This is good news, since this means that everything except for the setting of the encoding itself works fine...

I could probably fix this quite easily by adding an option to the python2.3-id3 library for setting the encoding to UTF8 and then calling this function from jack... this should not cause any problems for other applications using this library as the default behavior doesn't change.

If you know of another (maintained) library that would of course be a much better option in the long run...

I'll look into it next week. If you had time to look for an alternative id3 library or better solution by then, please let me know.

Cheers,
Chris.

--- Martin Michlmayr wrote:
> * Chris Vanden Berghe [EMAIL PROTECTED] [2005-04-26 01:39]:
> > Argh, forgot to look at the ID3(v2) tag... the filenames are correct, but the UTF8 characters in the ID3 tag are interpreted as two separate characters. I guess that the encoding of the ID3 content should also be set to UTF8. (as in http://www.id3.org/id3v2.4.0-structure.txt)
>
> Oh my. Read in python2.3-id3's documentation that Unicode was on the TODO list, but I hoped that just passing a UTF-8 string would work.
>
> It seems both of the Python ID3 modules are unmaintained; I'll try to look around to see whether there's another module, but note that it's not a top priority on my list since I don't use MP3 myself.

- End forwarded message -

--
Martin Michlmayr
http://www.cyrius.com/
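Editorial sketch: the forwarded message reports that changing the byte just before each ID3 string from 00 to 03 produces a correct UTF8 tag. In ID3v2.4 that byte is the text-encoding marker at the start of a text frame body (00 is ISO-8859-1, 03 is UTF-8). The following builds such a frame by hand to show where the byte sits and why a mislabelled string shows up as two characters. It is a standalone illustration, not the python-id3lib or jack code, and the names `synchsafe` and `text_frame` are hypothetical.

```python
LATIN1, UTF8 = 0x00, 0x03  # text-encoding bytes defined by ID3v2.4
CODECS = {LATIN1: "latin-1", UTF8: "utf-8"}


def synchsafe(n: int) -> bytes:
    """Encode n as a 4-byte synchsafe integer (7 significant bits per byte)."""
    if not 0 <= n < 1 << 28:
        raise ValueError("size does not fit in 28 bits")
    return bytes((n >> shift) & 0x7F for shift in (21, 14, 7, 0))


def text_frame(frame_id: str, text: str, encoding: int = UTF8) -> bytes:
    """Build an ID3v2.4 text frame: 10-byte header, encoding byte, then text."""
    body = bytes([encoding]) + text.encode(CODECS[encoding])
    header = frame_id.encode("ascii") + synchsafe(len(body)) + b"\x00\x00"
    return header + body


frame = text_frame("TIT2", "Für Elise")
print(frame[10])  # the encoding byte that precedes the string

# The symptom from the thread: UTF-8 bytes read under encoding byte 00
# (Latin-1) turn one accented letter into two characters.
print("ü".encode("utf-8").decode("latin-1"))
```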
Bug#306773: Please support UTF8/Unicode
On 28-04-2005 14:10, Martin Michlmayr wrote:
> Package: python-id3lib
> Severity: wishlist
>
> Please support UTF8 ID3v2 tags.
>
> - Forwarded message from Chris Vanden Berghe [EMAIL PROTECTED] -
>
> From: Chris Vanden Berghe [EMAIL PROTECTED]
> Subject: Re: Bug#266052: jack: doesn't support UTF8 freedb entries
> Date: Tue, 26 Apr 2005 13:37:54 +0200
> To: Martin Michlmayr [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
>
> Thanks!
>
> I looked myself a bit already... changing the byte exactly before every ID3 string from 00 to 03 results in a correct UTF8 ID3v2 tag. This is good news, since this means that everything except for the setting of the encoding itself works fine...
>
> I could probably fix this quite easily by adding an option to the python2.3-id3 library for setting the encoding to UTF8 and then calling this function from jack... this should not cause any problems for other applications using this library as the default behavior doesn't change.

Hi Chris,

If you get around to hacking UTF-8 support into the python library please post the patch here, so I can include it with the official Debian package.

You say you'd preserve current behaviour of the library by default - but does that make sense at all? I don't know the details of ID3v2, but if it is defined as UTF-8 then current behaviour is broken IMHO (even if it happens to work in the western world).

As you can probably read between the lines of the above, I do packaging, but am not capable of hacking the code itself, so can't be of assistance in fixing this.

Regards,

 - Jonas

--
* Jonas Smedegaard - idealist and Internet architect
* Tel.: +45 40843136 Website: http://dr.jones.dk/
 - The end is near: http://www.shibumi.org/eoti.htm
Bug#306773: Please support UTF8/Unicode
* Jonas Smedegaard [EMAIL PROTECTED] [2005-04-28 16:42]:
> You say you'd preserve current behaviour of the library by default - but does that make sense at all? I don't know the details of ID3v2, but if it is defined as UTF-8 then current behaviour is broken IMHO (even if it happens to work in the western world).

ID3v2 supports multiple charsets; the current behaviour is okay, but only supports Latin-1. I think adding UTF-8 support should be easy (since the library used by the Python bindings supports it) and I'll have a look later on. Do you know if upstream is still around so he can review the patch once it's available?
--
Martin Michlmayr
http://www.cyrius.com/
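Editorial sketch: to make "ID3v2 supports multiple charsets" concrete, ID3v2.4 defines four text-encoding bytes, and a reader that honours them picks the codec from the first byte of a text frame body. The mapping below follows the ID3v2.4 structure document; the function name `decode_text_body` and the choice to strip trailing nulls are illustrative assumptions, not code from any of the libraries discussed.

```python
# ID3v2.4 text-encoding byte -> Python codec name
ID3V2_CODECS = {
    0x00: "latin-1",    # ISO-8859-1
    0x01: "utf-16",     # UTF-16 with byte order mark
    0x02: "utf-16-be",  # UTF-16 big-endian, no byte order mark
    0x03: "utf-8",      # UTF-8
}


def decode_text_body(body: bytes) -> str:
    """Decode a text frame body (encoding byte followed by the string)."""
    if not body:
        raise ValueError("empty frame body")
    try:
        codec = ID3V2_CODECS[body[0]]
    except KeyError:
        raise ValueError(f"unknown encoding byte {body[0]:#04x}") from None
    # Strings may carry a null terminator; drop it for display.
    return body[1:].decode(codec).rstrip("\x00")


print(decode_text_body(b"\x00caf\xe9"))      # Latin-1
print(decode_text_body(b"\x03caf\xc3\xa9"))  # UTF-8
```

A library that always writes byte 00 is limited to Latin-1, which matches the behaviour described in the message above.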