Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
On 2020/04/02 06:43, Andrey Repin wrote:
> That's not what actually happens.
>
> ...\Documents> ls -1 *.pdf
> 21927-ticket.pdf
> 'Stars! Universe Map.pdf'

---
Thank you for your update.

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
Greetings, L A Walsh!

> On 2020/03/24 00:18, Jay Libove via Cygwin wrote:
>> Problem:
>> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs'
>> built-in argv[] globbing will produce unexpected:
>> "{programName}: cannot access '{glob pattern}': No such file or directory"
>> e.g.
>> "ls: cannot access '*.pdf': No such file or directory"
>> .. despite the fact that e.g. *.pdf definitely exists.
>>
>
> This isn't a bug or a problem, it is working normally as expected.
> Cygwin programs don't have built-in argv[] globbing or processing.
> The problem you are seeing is because you are calling cygwin programs
> from a windows shell.
> On windows, every program has to be built with glob processing.
> On unix, glob processing happens in the shell, so all unix (linux+cygwin)
> type programs have no glob processing because they know that globbing is
> built into the shell (like bash or csh, or dash, etc).
> If you run 'ls' *.pdf in bash, bash expands the *.pdf into arguments
> that don't contain a glob (if the glob matches a file). So 'ls' sees
> only fixed filenames and no globs.
> When you run 'ls' from the Windows shell, Windows cmd.exe doesn't expand
> glob chars into anything, so 'ls' sees a literal file name of '*.pdf'.
> On linux you can name a file '*.pdf' (using an asterisk as a valid
> character).
> Unless you have a file named, literally '*.pdf', ls won't see it.

That's not what actually happens.

...\Documents> ls -1 *.pdf
21927-ticket.pdf
'Stars! Universe Map.pdf'

--
With best regards,
Andrey Repin
Thursday, April 2, 2020 15:51:26

Sorry for my terrible english...
Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
On 2020/03/24 00:18, Jay Libove via Cygwin wrote:
> Problem:
> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs'
> built-in argv[] globbing will produce unexpected:
> "{programName}: cannot access '{glob pattern}': No such file or directory"
> e.g.
> "ls: cannot access '*.pdf': No such file or directory"
> .. despite the fact that e.g. *.pdf definitely exists.

This isn't a bug or a problem, it is working normally as expected.
Cygwin programs don't have built-in argv[] globbing or processing.
The problem you are seeing is because you are calling cygwin programs
from a windows shell.

On windows, every program has to be built with glob processing.
On unix, glob processing happens in the shell, so all unix (linux+cygwin)
type programs have no glob processing because they know that globbing is
built into the shell (like bash or csh, or dash, etc).

If you run 'ls' *.pdf in bash, bash expands the *.pdf into arguments
that don't contain a glob (if the glob matches a file). So 'ls' sees
only fixed filenames and no globs.

When you run 'ls' from the Windows shell, Windows cmd.exe doesn't expand
glob chars into anything, so 'ls' sees a literal file name of '*.pdf'.
On linux you can name a file '*.pdf' (using an asterisk as a valid
character).

Unless you have a file named, literally '*.pdf', ls won't see it.
Cygwin does simulate this: example:

cd /tmp
/tmp> touch \*.pdf
/tmp> ls *.pdf
*.pdf
/tmp> cmd
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\tmp>ls *.pdf
ls *.pdf
'*.pdf'

^^ note that now windows finds *.pdf because there is a file
named '*.pdf' (quotes added by 'ls').

Does this explain your issue, or am I not understanding it?

Thanks
(I'm not a cygwin author; just answering the question)
Linda

> Steps to Reproduce:
> * Have some files in the local directory with accented characters in the
>   names, e.g.:
>     C:> mkdir c:\temp\test
>     C:> cd c:\temp\test
>     C:> touch héllo.pdf
>     C:> touch gòodbye.pdf
>     C:> touch normal.pdf
> * DON'T have the LANG= environment variable set to anything
> * NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe,
>   execute a Cygwin command which needs to do file name globbing because
>   the Windows CMD.exe shell does not do so for it, e.g.
>     C:> ls *.pdf
>     C:> cat *.pdf
>
> These will produce "ls: cannot access '*.pdf': No such file or directory"
>
> Although, curiously,
>     C:> ls *or*
> does correctly produce:
>     normal.pdf
>
> Also, display output of the áccènted characters is incomplete:
>     C:> ls
>     'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'  normal.pdf
>     C:> bash
>     jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
>     $ ls
>     'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'  normal.pdf
>
> Analysis:
> I've verified that it's not about case sensitivity. That is, it's not a
> matter of ls *.pdf vs. ls *.PDF.
> If these test commands are run either under bash.exe or within a Cygwin
> Terminal window, the problem does not occur.
> I've verified that the Windows system locale (per Windows' Region setting)
> actually doesn't matter. (I've reproduced this both on systems in Region
> Spain with language English-International and English-Ireland, and in a VM
> with a bog standard vanilla US English Windows).
> Credits to Paul for suggesting deleting files one by one until the problem
> goes away, and to Andrey for pointing out `locale` and the LANG= setting.
>
> Set LANG=en_US.UTF-8, e.g.
>     C:> set LANG=en_US.UTF-8
> .. and the problem goes away.
>     C:> ls *.pdf
>     gòodbye.pdf  héllo.pdf  normal.pdf
>     C:> ls
>     gòodbye.pdf  héllo.pdf  normal.pdf
>
> Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he
> doesn't see the problem. When I tried that exact setting, I still had the
> problem. So it's maybe not just that LANG must be set to *something*, but
> that somehow LANG must be set to something that matches something in
> Windows? (Sorry, I know that's nearly uselessly vague).
>
> In summary, it appears that the argv[] globbing code which gets compiled
> into Cygwin programs functions a bit differently than the shell globbing
> code within bash.exe. And this produces unexpected globbing failures.
>
> Thanks to all the Cygwin maintainers for this amazing software, for so
> many years!
> -Jay
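As a rough illustration of the expansion step discussed above (a sketch in Python, not bash's or Cygwin's actual code): whoever performs globbing replaces each wildcard argument with the sorted list of matching names, and, under the default POSIX-shell rule, leaves a pattern with no match unchanged. That second case is how a program can end up being handed the literal text '*.pdf'. The helper name `expand_args` and the sample file names are hypothetical.

```python
import glob
import os
import tempfile


def expand_args(args, directory="."):
    """Expand wildcard arguments roughly the way a POSIX shell does by default.

    A pattern that matches nothing is passed through unchanged, so the
    program receives the literal text (e.g. '*.pdf').
    """
    expanded = []
    for arg in args:
        matches = []
        if any(ch in arg for ch in "*?["):
            # glob.escape() guards against wildcard characters in the directory name.
            pattern = os.path.join(glob.escape(directory), arg)
            matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
        expanded.extend(matches if matches else [arg])
    return expanded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("normal.pdf", "ticket.pdf", "notes.txt"):
            open(os.path.join(tmp, name), "w").close()
        print(expand_args(["-1", "*.pdf"], tmp))  # ['-1', 'normal.pdf', 'ticket.pdf']
        print(expand_args(["*.doc"], tmp))        # ['*.doc'] (no match, left literal)
```

The sketch says nothing about why a match might fail when file names contain accented characters; it only shows what the called program sees once expansion has either succeeded or not.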
Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
> Maybe it can simply be fixed by changing the order of setting up locale
> stuff and applying the expansion in cygwin?
> (I would look into the code if I had a clue where to find the respective
> things.)

I would guess dcrt0.cc, the Cygwin DLL runtime initialization.

..mark
Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
On 24.03.2020 at 08:18, Jay Libove via Cygwin wrote:
> Hi Cygwin team,
> Here is a consolidated bug report based on the discussion in recent days
> which I'd started under the subject " shell expansion produces e.g. "ls:
> cannot access '*.pdf': No such file or directory" in Windows CMD shell, but
> works okay in bash " (thread starter
> https://cygwin.com/pipermail/cygwin/2020-March/244161.html )
>
> Many thanks to Paul, Andrey, and others for helping me nail down where and
> how it seems to be happening.
> My apologies in advance that my coding days are long behind me, so I'm not
> in a position to include a proposed code fix.
>
> cygcheck output attached (lightly modified to redact a couple of personal
> items).
>
> Problem:
> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs'
> built-in argv[] globbing will produce unexpected:
> "{programName}: cannot access '{glob pattern}': No such file or directory"
> e.g.
> "ls: cannot access '*.pdf': No such file or directory"
> .. despite the fact that e.g. *.pdf definitely exists.
>
> Steps to Reproduce:
> * Have some files in the local directory with accented characters in the
>   names, e.g.:
>     C:> mkdir c:\temp\test
>     C:> cd c:\temp\test
>     C:> touch héllo.pdf
>     C:> touch gòodbye.pdf
>     C:> touch normal.pdf
> * DON'T have the LANG= environment variable set to anything
> * NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe,
>   execute a Cygwin command which needs to do file name globbing because
>   the Windows CMD.exe shell does not do so for it, e.g.
>     C:> ls *.pdf
>     C:> cat *.pdf
>
> These will produce "ls: cannot access '*.pdf': No such file or directory"
>
> Although, curiously,
>     C:> ls *or*
> does correctly produce:
>     normal.pdf
>
> Also, display output of the áccènted characters is incomplete:
>     C:> ls
>     'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'  normal.pdf
>     C:> bash
>     jay_l@DESKTOP-I9MRIE3 /cygdrive/c/Temp
>     $ ls
>     'g'$'\303\262''odbye.pdf'  'h'$'\303\251''llo.pdf'  normal.pdf
>
> Analysis:
> I've verified that it's not about case sensitivity. That is, it's not a
> matter of ls *.pdf vs. ls *.PDF.
> If these test commands are run either under bash.exe or within a Cygwin
> Terminal window, the problem does not occur.
> I've verified that the Windows system locale (per Windows' Region setting)
> actually doesn't matter. (I've reproduced this both on systems in Region
> Spain with language English-International and English-Ireland, and in a VM
> with a bog standard vanilla US English Windows).
> Credits to Paul for suggesting deleting files one by one until the problem
> goes away, and to Andrey for pointing out `locale` and the LANG= setting.
>
> Set LANG=en_US.UTF-8, e.g.
>     C:> set LANG=en_US.UTF-8
> .. and the problem goes away.
>     C:> ls *.pdf
>     gòodbye.pdf  héllo.pdf  normal.pdf
>     C:> ls
>     gòodbye.pdf  héllo.pdf  normal.pdf
>
> Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he
> doesn't see the problem. When I tried that exact setting, I still had the
> problem. So it's maybe not just that LANG must be set to *something*, but
> that somehow LANG must be set to something that matches something in
> Windows? (Sorry, I know that's nearly uselessly vague).
>
> In summary, it appears that the argv[] globbing code which gets compiled
> into Cygwin programs functions a bit differently than the shell globbing
> code within bash.exe. And this produces unexpected globbing failures.

(As commented in the other thread already:)
Maybe it can simply be fixed by changing the order of setting up locale
stuff and applying the expansion in cygwin?
(I would look into the code if I had a clue where to find the respective
things.)

Thomas

> Thanks to all the Cygwin maintainers for this amazing software, for so
> many years!
> -Jay
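The escaped names in the quoted `ls` output can be decoded by hand. `$'\303\251'` is the shell-style quoting that GNU ls uses for bytes it cannot print in the current locale, and the octal byte pairs 303 251 and 303 262 are the UTF-8 encodings of é and ò. A minimal Python sketch (the helper name `decode_name` is hypothetical) showing that these bytes are valid UTF-8 but not valid ASCII, which is at least consistent with the names displaying correctly once LANG=en_US.UTF-8 is set:

```python
def decode_name(raw, encoding):
    """Return the decoded file name, or None if the bytes are invalid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Byte sequences copied from the quoted `ls` output (octal escapes).
names = [b"g\303\262odbye.pdf", b"h\303\251llo.pdf", b"normal.pdf"]

if __name__ == "__main__":
    for raw in names:
        # First column: the name read as UTF-8; second column: the name read as ASCII.
        print(decode_name(raw, "utf-8"), decode_name(raw, "ascii"))
```

Only normal.pdf decodes under both encodings; the two accented names decode as UTF-8 and yield None as ASCII. This explains the display difference only. It does not by itself show why `*.pdf` fails to expand.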