Tino Keitel wrote:
> lltag always seems to decode non-ASCII characters as Latin1 characters, even
> when the user has set a UTF8 locale:
>   

I was actually expecting some UTF-8 related problems, since the people
requesting ID3v2 wanted to use its UTF-8 support. But I wanted to get
early feedback about ID3v2 before dealing with all the related issues, so
I didn't really bother trying to make the UTF-8 support very good.

>     Track 06: Mu� ja
>     Track 07: H�nde hoch, Papa                                         
>     Track 08: Fin de mill�naire                                        
>   

I didn't think about CDDB. It looks like the CDDB site I use returns
non-UTF-8 data. I need to keep that in mind.

> Here are the correct filenames:
>
> 06 - Muß ja.mp3
> 07 - Hände hoch, Papa.mp3
> 08 - Fin de millénaire.mp3
>
> The ID3v2 tag will also get a Latin1 encoding instead of UTF8. Here is the
> id3v2 -l output:
>
> TALB (Album/Movie/Show title): Economy Class
> TCON (Content type): Rock (17)
> TIT2 (Title/songname/content description): Mu� ja
>   

I am not sure you can trust id3v2 here. lltag currently does not pass
any encoding to MP3::Tag, which means the encoding is set to non-UTF-8 by
default. So I don't think there is an inconsistency between the encoding
that is declared and the actual encoding of the tag string: both are
non-UTF-8. id3v2 might then simply be displaying lltag's non-UTF-8 tags
without translating them into your UTF-8 locale.
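
To illustrate that last point, here is a minimal sketch (in Python, not
lltag's Perl) of what happens when Latin-1 bytes are shown unconverted by
something that expects UTF-8. It assumes the tag bytes are Latin-1, which
the thread only describes as "non-UTF-8"; the variable names are made up
for the example.

```python
# "Muß ja" as Latin-1 bytes, the way a non-UTF-8 tag would store it.
latin1_bytes = "Muß ja".encode("latin-1")  # b'Mu\xdf ja'

# A UTF-8 consumer sees 0xDF as the start of a two-byte sequence, but the
# next byte is a plain space, so the sequence is invalid and the byte is
# rendered as the replacement character U+FFFD.
shown = latin1_bytes.decode("utf-8", errors="replace")

# Converting first gives bytes that a UTF-8 locale displays correctly.
fixed = latin1_bytes.decode("latin-1").encode("utf-8")  # b'Mu\xc3\x9f ja'
```

Here `shown` is "Mu" followed by U+FFFD and " ja", which is the same
pattern as the TIT2 line quoted above.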


Anyway, it looks like I should:
* take into account that CDDB returns non-UTF-8 data
* convert its result to UTF-8 if the locale is UTF-8
* use the converted values for both displaying and tagging
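
A rough sketch of those steps, again in Python rather than Perl and not
taken from lltag itself. It assumes the CDDB data is Latin-1 (the thread
only establishes that it is not UTF-8), and both function names are
hypothetical. The locale charset is passed in as a string; in a real
program it could come from locale.getpreferredencoding(False).

```python
import codecs


def locale_is_utf8(charset: str) -> bool:
    """Return True if the charset name is some spelling of UTF-8."""
    try:
        # codecs.lookup normalizes aliases such as "UTF8" or "utf_8".
        return codecs.lookup(charset).name == "utf-8"
    except LookupError:
        return False


def cddb_to_locale(raw: bytes, charset: str,
                   source_encoding: str = "latin-1") -> bytes:
    """Re-encode one CDDB field for the user's locale.

    source_encoding is an assumption: the server's real encoding is unknown.
    """
    if not locale_is_utf8(charset):
        # Non-UTF-8 locale: keep the bytes as the server sent them.
        return raw
    return raw.decode(source_encoding).encode("utf-8")


# The same converted value would then be used for display and for tagging.
title = cddb_to_locale(b"H\xe4nde hoch, Papa", "UTF-8")
```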

Then several questions need to be raised:
* should I assume that the filename, the tags and the current locale all
use the same encoding (I mean all UTF-8 or all non-UTF-8)?
* if your current locale is UTF-8 while the file contains non-UTF-8 tags,
do I convert them into UTF-8?
* do we need options to convert filenames from/to UTF-8 when
reading/renaming, and tags from/to UTF-8 when reading/tagging? It would
mean a lot of new command line options...
* what about other file types? Ogg Vorbis seems to use UTF-8 by default,
so I should fix lltag for it. It might be the same for FLAC. But I don't
know what to do with ID3v1.
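
One possible answer to the mixed-encoding questions, short of adding
options, is a heuristic: try strict UTF-8 first and fall back to Latin-1.
The sketch below is in Python and only illustrates the idea. The function
name is hypothetical and Latin-1 as the fallback is an assumption.

```python
def decode_guess(raw: bytes) -> str:
    """Decode bytes of unknown encoding: strict UTF-8, else Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


from_utf8 = decode_guess("Fin de millénaire".encode("utf-8"))
from_latin1 = decode_guess(b"Fin de mill\xe9naire")
```

Both calls give the same string. The guess can still be wrong, since some
Latin-1 byte sequences are also valid UTF-8, so explicit options may be
needed anyway.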

So yes, I am going to work on all this, but I need to think a lot first :)

Brice

