Santiago Vila <[email protected]> wrote:

> So: Is this a bug, or is the file supposed to be always in UTF-8?
> (Is this documented?)

Hi,

I think it is a bug because:

1) Standard 'grep' works just fine on the described examples.
2) The behaviour also breaks integration with other utilities, such as
'find' and 'ls', even in a UTF-8 environment. For instance, I have the
following file names in a directory (the attached .tar archive contains
those files):

   % ls

   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac
   (08) [Debussy] Images, Book2 1 Cloches ? travers les feuilles.flac
   (11) [Mozart] Fantasia in C minor, KV475.flac

where the second file name contains a weird character (here shown as '?').
Then, the command

   % ls | tre-agrep flac

returns just the first file name (the problematic one is the second file):

   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac

while

   % ls | grep flac

correctly returns all of them:

   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac
   (08) [Debussy] Images, Book2 1 Cloches � travers les feuilles.flac
   (11) [Mozart] Fantasia in C minor, KV475.flac

The command

   % ls | tre-agrep feuilles

returns nothing; neither does this one:

   % ls | tre-agrep KV475
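
For anyone who wants to reproduce this without the attached archive, here is
a rough sketch. It assumes that the weird character is a single Latin-1 byte
0xE0 ('à', as in Debussy's "Cloches à travers les feuilles"), which is only a
guess on my part, and that the file system accepts such a byte in a file name.
The temporary directory and the variable names are purely illustrative.

```shell
# Hypothetical reproduction. ASSUMPTION: the problematic character is the
# Latin-1 byte 0xE0 (octal 340), which is not a valid UTF-8 sequence here.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create three empty files with the names from the listing above.
: > "(01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac"
: > "$(printf '(08) [Debussy] Images, Book2 1 Cloches \340 travers les feuilles.flac')"
: > "(11) [Mozart] Fantasia in C minor, KV475.flac"

# Byte-oriented matching (C locale) counts every name that contains "flac".
count=$(ls | LC_ALL=C grep -c flac)
echo "grep matched $count names"

# Compare with tre-agrep, but only if it is installed.
if command -v tre-agrep >/dev/null 2>&1; then
    ls | tre-agrep flac
fi
```

If the guess about the byte is right, the last command should show the same
truncated output as above when run in a UTF-8 locale.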

-- 
Douglas A. Augusto

Attachment: tre-agrep_bug-dir_example.tar
Description: Unix tar archive
