Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

2020-04-02 Thread L A Walsh




On 2020/04/02 06:43, Andrey Repin wrote:

> That's not what actually happens.
>
> ...\Documents> ls -1 *.pdf
> 21927-ticket.pdf
> 'Stars! Universe Map.pdf'

---
Thank you for your update.
--
Problem reports:  https://cygwin.com/problems.html
FAQ:  https://cygwin.com/faq/
Documentation:  https://cygwin.com/docs.html
Unsubscribe info: https://cygwin.com/ml/#unsubscribe-simple


Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

2020-04-02 Thread Andrey Repin
Greetings, L A Walsh!

> On 2020/03/24 00:18, Jay Libove via Cygwin wrote:
>> Problem:
>> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs' 
>> built-in argv[] globbing will produce unexpected:
>> "{programName}: cannot access '{glob pattern}': No such file or directory"
>> e.g.
>> "ls: cannot access '*.pdf': No such file or directory"
>> .. despite the fact that e.g. *.pdf definitely exists.
>>   
> 
> This isn't a bug or a problem, it is working normally as expected.
> Cygwin programs don't have built-in argv[] globbing or processing.

> The problem you are seeing is because you are calling cygwin programs
> from a windows shell.

> On windows, every program has to be built with glob processing.

> On Unix, glob processing happens in the shell, so all Unix-type
> (Linux+Cygwin) programs have no glob processing because they know that
> globbing is built into the shell (like bash, csh, or dash, etc.).

> If you run 'ls' *.pdf in bash, bash expands the *.pdf into arguments
> that don't contain a glob (if the glob matches a file).  So 'ls' sees
> only fixed filenames and no globs.

> When you run 'ls' from the Windows shell, Windows cmd.exe doesn't expand
> glob chars into anything.  So 'ls' sees a literal file name of '*.pdf'.

> On Linux you can name a file '*.pdf' (using an asterisk as a valid
> character).
> Unless you have a file named, literally, '*.pdf', ls won't see it.

That's not what actually happens.

...\Documents> ls -1 *.pdf
21927-ticket.pdf
'Stars! Universe Map.pdf'


-- 
With best regards,
Andrey Repin
Thursday, April 2, 2020 15:51:26

Sorry for my terrible english...



Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

2020-04-02 Thread L A Walsh

On 2020/03/24 00:18, Jay Libove via Cygwin wrote:

Problem:
Under certain circumstances (see Steps to Reproduce, below) Cygwin programs' 
built-in argv[] globbing will produce unexpected:
"{programName}: cannot access '{glob pattern}': No such file or directory"
e.g.
"ls: cannot access '*.pdf': No such file or directory"
.. despite the fact that e.g. *.pdf definitely exists.
  


   This isn't a bug or a problem, it is working normally as expected.
Cygwin programs don't have built-in argv[] globbing or processing.

   The problem you are seeing is because you are calling cygwin programs
from a windows shell.

   On windows, every program has to be built with glob processing.

   On Unix, glob processing happens in the shell, so all Unix-type
(Linux+Cygwin) programs have no glob processing because they know that
globbing is built into the shell (like bash, csh, or dash, etc.).

If you run 'ls' *.pdf in bash, bash expands the *.pdf into arguments
that don't contain a glob (if the glob matches a file).  So 'ls' sees
only fixed filenames and no globs.

When you run 'ls' from the Windows shell, Windows cmd.exe doesn't expand
glob chars into anything.  So 'ls' sees a literal file name of '*.pdf'.

On Linux you can name a file '*.pdf' (using an asterisk as a valid
character).  Unless you have a file named, literally, '*.pdf', ls won't
see it.
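
To make the distinction concrete, here is a minimal Python sketch of what a Unix-style shell does before the program even starts. The helper name expand_like_shell is made up for illustration, and it only approximates bash's default behaviour (a pattern with no match is passed through unchanged); it is not bash's or Cygwin's actual code.

```python
import glob
import os
import tempfile


def expand_like_shell(args, cwd):
    """Expand glob patterns in args against cwd, roughly as bash does by default.

    Hypothetical helper for illustration: an argument that matches nothing
    is kept as a literal string, which is what the program then sees in argv.
    Only simple patterns without directory separators are handled.
    """
    expanded = []
    for arg in args:
        # Escape the directory part so only the argument itself is a pattern.
        pattern = os.path.join(glob.escape(cwd), arg)
        matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
        expanded.extend(matches if matches else [arg])
    return expanded


with tempfile.TemporaryDirectory() as d:
    for name in ("21927-ticket.pdf", "notes.txt"):
        open(os.path.join(d, name), "w").close()
    # The pattern is replaced by real names; the option '-1' passes through.
    print(expand_like_shell(["-1", "*.pdf"], d))
    # Nothing matches, so the program would receive the literal '*.docx'.
    print(expand_like_shell(["*.docx"], d))
```

In this model, cmd.exe corresponds to skipping expand_like_shell entirely, so whatever expansion happens has to be done by the program or its runtime.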

Cygwin does simulate this. Example:

 cd /tmp

/tmp> touch \*.pdf
/tmp> ls *.pdf
*.pdf
/tmp> cmd
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\tmp>ls *.pdf
ls *.pdf
'*.pdf'

^^ Note that Windows now finds *.pdf because there is a file named '*.pdf'
(quotes added by 'ls').

Does this explain your issue, or am I not understanding it?

Thanks (I'm not a cygwin author; just answering the question)
Linda


Steps to Reproduce:
* Have some files in the local directory with accented characters in the names,
e.g.:
C:> mkdir c:\temp\test
C:> cd c:\temp\test
C:> touch héllo.pdf
C:> touch gòodbye.pdf
C:> touch normal.pdf
* DON'T have the LANG= environment variable set to anything
* NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe, execute a
Cygwin command which needs to do file name globbing because the Windows CMD.exe
shell does not do so for it, e.g.
C:> ls *.pdf
C:> cat *.pdf
These will produce "ls: cannot access '*.pdf': No such file or directory"
Although, curiously,
C:> ls *or*
does correctly produce:
normal.pdf

Also, display output of the áccènted characters is incomplete:
C:> ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
C:> bash
jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
$ ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf


Analysis:
I've verified that it's not about case sensitivity. That is, it's not a matter 
of ls *.pdf vs. ls *.PDF.
If these test commands are run either under bash.exe or within a Cygwin 
Terminal window, the problem does not occur.
I've verified that the Windows system locale (per Windows' Region setting) 
actually doesn't matter. (I've reproduced this both on systems in Region Spain 
with language English-International and English-Ireland, and in a VM with a bog 
standard vanilla US English Windows).

Credits to Paul for suggesting deleting files one by one until the problem goes 
away, and to Andrey for pointing out `locale` and the LANG= setting.

Set LANG=en_US.UTF-8, e.g.
C:> set LANG=en_US.UTF-8
.. and the problem goes away.
C:> ls *.pdf
gòodbye.pdf
héllo.pdf
normal.pdf
C:> ls
gòodbye.pdf
héllo.pdf
normal.pdf

Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he doesn't 
see the problem. When I tried that exact setting, I still had the problem.
So it's maybe not just that LANG must be set to *something*, but that somehow 
LANG must be set to something that matches something in Windows? (Sorry, I know 
that's nearly uselessly vague).


In summary, it appears that the argv[] globbing code which gets compiled
into Cygwin programs functions a bit differently from the shell globbing
code within bash.exe, and this produces unexpected globbing failures.


Thanks to all the Cygwin maintainers for this amazing software, for so many 
years!
-Jay


  





Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

2020-03-24 Thread Mark Geisert



Maybe it can simply be fixed by changing the order of setting up locale stuff 
and applying the expansion in cygwin?

(I would look into the code if I had a clue where to find the respective 
things.)


I would guess dcrt0.cc, the Cygwin DLL runtime initialization.

..mark


Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash

2020-03-24 Thread Thomas Wolff

On 24.03.2020 at 08:18, Jay Libove via Cygwin wrote:

Hi Cygwin team,
Here is a consolidated bug report based on the discussion in recent days which I'd started under 
the subject " shell expansion produces e.g. "ls: cannot access '*.pdf': No such file or 
directory" in Windows CMD shell, but works okay in bash " (thread starter 
https://cygwin.com/pipermail/cygwin/2020-March/244161.html )
Many thanks to Paul, Andrey, and others for helping me nail down where and how 
it seems to be happening.
My apologies in advance that my coding days are long behind me, so I'm not in a 
position to include a proposed code fix.

cygcheck output attached (lightly modified to redact a couple of personal 
items).

Problem:
Under certain circumstances (see Steps to Reproduce, below) Cygwin programs' 
built-in argv[] globbing will produce unexpected:
"{programName}: cannot access '{glob pattern}': No such file or directory"
e.g.
"ls: cannot access '*.pdf': No such file or directory"
.. despite the fact that e.g. *.pdf definitely exists.

Steps to Reproduce:
* Have some files in the local directory with accented characters in the names,
e.g.:
C:> mkdir c:\temp\test
C:> cd c:\temp\test
C:> touch héllo.pdf
C:> touch gòodbye.pdf
C:> touch normal.pdf
* DON'T have the LANG= environment variable set to anything
* NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe, execute a
Cygwin command which needs to do file name globbing because the Windows CMD.exe
shell does not do so for it, e.g.
C:> ls *.pdf
C:> cat *.pdf
These will produce "ls: cannot access '*.pdf': No such file or directory"
Although, curiously,
C:> ls *or*
does correctly produce:
normal.pdf

Also, display output of the áccènted characters is incomplete:
C:> ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
C:> bash
jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
$ ls
'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'   normal.pdf
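
The odd-looking quoting above is ls escaping bytes it does not treat as printable in the current locale. As a quick check (a sketch only; the helper name decode_octal_escapes is invented here), the octal escapes turn out to be the UTF-8 bytes of the accented letters:

```python
import re


def decode_octal_escapes(text, encoding="utf-8"):
    """Turn backslash-octal escapes such as \\303\\262 back into characters.

    Hypothetical helper: it handles only three-digit octal escapes, which
    is all that appears in the listing above.
    """
    # Map each escape to the character with that byte value, then recover
    # the raw bytes via latin-1, which maps code points 0-255 one-to-one.
    as_bytes = re.sub(
        r"\\([0-7]{3})",
        lambda m: chr(int(m.group(1), 8)),
        text,
    ).encode("latin-1")
    return as_bytes.decode(encoding)


# Octal 303 262 is hex C3 B2, the UTF-8 encoding of U+00F2 (o with grave).
print(decode_octal_escapes(r"g\303\262odbye.pdf") == "g\u00f2odbye.pdf")
# Octal 303 251 is hex C3 A9, the UTF-8 encoding of U+00E9 (e with acute).
print(decode_octal_escapes(r"h\303\251llo.pdf") == "h\u00e9llo.pdf")
```

So the names on disk are intact; only the display in a locale without a UTF-8 charset is affected.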


Analysis:
I've verified that it's not about case sensitivity. That is, it's not a matter 
of ls *.pdf vs. ls *.PDF.
If these test commands are run either under bash.exe or within a Cygwin 
Terminal window, the problem does not occur.
I've verified that the Windows system locale (per Windows' Region setting) 
actually doesn't matter. (I've reproduced this both on systems in Region Spain 
with language English-International and English-Ireland, and in a VM with a bog 
standard vanilla US English Windows).

Credits to Paul for suggesting deleting files one by one until the problem goes 
away, and to Andrey for pointing out `locale` and the LANG= setting.

Set LANG=en_US.UTF-8, e.g.
C:> set LANG=en_US.UTF-8
.. and the problem goes away.
C:> ls *.pdf
gòodbye.pdf
héllo.pdf
normal.pdf
C:> ls
gòodbye.pdf
héllo.pdf
normal.pdf
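
When a Cygwin program is launched from another Windows program rather than typed at the CMD prompt, the same workaround can be applied per process. A minimal Python sketch, assuming a native Windows Python as the caller and that `ls` on PATH is the Cygwin binary (both are assumptions, and the names env_with_lang and run_with_lang are illustrative only):

```python
import os
import subprocess


def env_with_lang(base_env=None, lang="en_US.UTF-8"):
    """Return a copy of the environment with LANG set if it was missing.

    An existing LANG value is respected and left unchanged.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.setdefault("LANG", lang)
    return env


def run_with_lang(argv, **kwargs):
    """Run argv with LANG defaulted; no shell is involved, so any glob
    pattern in argv reaches the child process unexpanded."""
    return subprocess.run(argv, env=env_with_lang(), **kwargs)
```

A call such as run_with_lang(["ls", "*.pdf"]) would then correspond to the `set LANG=en_US.UTF-8` session above, without changing LANG for anything else.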

Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he doesn't 
see the problem. When I tried that exact setting, I still had the problem.
So it's maybe not just that LANG must be set to *something*, but that somehow 
LANG must be set to something that matches something in Windows? (Sorry, I know 
that's nearly uselessly vague).
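
One possible reading of the ru_RU.CP866 result, offered as a guess rather than a diagnosis: the charset named in LANG may need to be able to represent the file names in the directory. The sketch below only checks representability with Python's own codecs; it does not reproduce Cygwin's conversion code, and can_represent is a name invented here.

```python
def can_represent(name, charset):
    """Return True if every character of name can be encoded in charset."""
    try:
        name.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# The three test files from the report: héllo.pdf, gòodbye.pdf, normal.pdf.
names = ["h\u00e9llo.pdf", "g\u00f2odbye.pdf", "normal.pdf"]
for charset in ("ascii", "cp866", "utf-8"):
    # One True/False per file name, in the order listed above.
    print(charset, [can_represent(n, charset) for n in names])
```

Under this reading, CP866 is no better than plain ASCII for these particular names, since it has no é or ò, while it would be fine for a directory holding only Cyrillic and ASCII names.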


In summary, it appears that the argv[] globbing code which gets compiled
into Cygwin programs functions a bit differently from the shell globbing
code within bash.exe, and this produces unexpected globbing failures.

(As commented in the other thread already:)
Maybe it can simply be fixed by changing the order of setting up locale 
stuff and applying the expansion in cygwin?
(I would look into the code if I had a clue where to find the respective 
things.)

Thomas



Thanks to all the Cygwin maintainers for this amazing software, for so many 
years!
-Jay
