Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Brian Inglis
On 2019-09-09 12:26, Duncan Roe wrote:
> On Mon, Sep 09, 2019 at 11:57:21AM -0500, Eric Blake wrote:
>> On 9/9/19 11:47 AM, Stephen Provine via cygwin wrote:
>>> Argh, my mistake about top posting again. My email client does not help me
>>> with this by default and I have to manually construct quoting of previous
>>> responses and delete what shouldn't be there (and missed it again). If there's
>>> any way for someone to delete my previous message from the archive, please do.
>>>
>>> Why doesn't Cygwin utilize Github or something else more modern to manage 
>>> issues?
>>
>> Just because it is "modern" does not necessarily make it better.
>>
>> https://www.gnu.org/software/repo-criteria-evaluation.en.html
>>
>> ranks github as worse than gitlab, in part because there is no way to
>> use the full power of github without surrendering to the use of non-free
>> software.

And it's now owned by a proprietary vendor without any roots in open source.

> No problems with that. I think though, that the OP's point was to use *git*
> (wherever the repo may be).

The core code repo is in git on sourceware.org alias cygwin.com:
https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git

The patch and commit messages, and ChangeLog and NEWS files often contain
references to the OPs reporting problems.

There are no issue or bug handling components for git.

There are bugzillas that have been write-only since about 2012:
https://cygwin.com/bugzilla/describecomponents.cgi?product=cygwin
If you report there, remember to email this list with a link to the bug.

There is also savannah.nongnu.org, which might be usable, as most
non-GPL/AGPL/LGPL contributions are BSD licensed to allow modification for
proprietary uses.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.

--
Problem reports:   http://cygwin.com/problems.html
FAQ:   http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple



Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Andrey Repin
Greetings, Duncan Roe!

> On Mon, Sep 09, 2019 at 11:57:21AM -0500, Eric Blake wrote:
>> On 9/9/19 11:47 AM, Stephen Provine via cygwin wrote:
>> > Argh, my mistake about top posting again. My email client does not help me
>> > with this by default and I have to manually construct quoting of previous
>> > responses and delete what shouldn't be there (and missed it again). If there's
>> > any way for someone to delete my previous message from the archive, please do.
>> >
>> > Why doesn't Cygwin utilize Github or something else more modern to manage 
>> > issues?
>>
>> Just because it is "modern" does not necessarily make it better.
>>
>> https://www.gnu.org/software/repo-criteria-evaluation.en.html
>>
>> ranks github as worse than gitlab, in part because there is no way to
>> use the full power of github without surrendering to the use of non-free
>> software.

> No problems with that. I think though, that the OP's point was to use *git*
> (wherever the repo may be).

Cygwin uses Git. Just not GitHub.
The question was to manage ISSUES, though.
And the answer is, this list was working just fine for the purpose so far.
There's no pressing need to change that, last I heard.


-- 
With best regards,
Andrey Repin
Monday, September 9, 2019 21:55:23

Sorry for my terrible english...





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Duncan Roe
Hi Eric,

On Mon, Sep 09, 2019 at 11:57:21AM -0500, Eric Blake wrote:
> On 9/9/19 11:47 AM, Stephen Provine via cygwin wrote:
> > Argh, my mistake about top posting again. My email client does not help me
> > with this by default and I have to manually construct quoting of previous
> > responses and delete what shouldn't be there (and missed it again). If there's
> > any way for someone to delete my previous message from the archive, please do.
> >
> > Why doesn't Cygwin utilize Github or something else more modern to manage 
> > issues?
>
> Just because it is "modern" does not necessarily make it better.
>
> https://www.gnu.org/software/repo-criteria-evaluation.en.html
>
> ranks github as worse than gitlab, in part because there is no way to
> use the full power of github without surrendering to the use of non-free
> software.

No problems with that. I think though, that the OP's point was to use *git*
(wherever the repo may be).

Cheers ... Duncan.




Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Eric Blake
On 9/9/19 11:47 AM, Stephen Provine via cygwin wrote:
> Argh, my mistake about top posting again. My email client does not help me
> with this by default and I have to manually construct quoting of previous
> responses and delete what shouldn't be there (and missed it again). If there's
> any way for someone to delete my previous message from the archive, please do.
> 
> Why doesn't Cygwin utilize Github or something else more modern to manage 
> issues?

Just because it is "modern" does not necessarily make it better.

https://www.gnu.org/software/repo-criteria-evaluation.en.html

ranks github as worse than gitlab, in part because there is no way to
use the full power of github without surrendering to the use of non-free
software.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Stephen Provine via cygwin
Argh, my mistake about top posting again. My email client does not help me
with this by default and I have to manually construct quoting of previous
responses and delete what shouldn't be there (and missed it again). If there's
any way for someone to delete my previous message from the archive, please do.

Why doesn't Cygwin utilize Github or something else more modern to manage 
issues?

Thanks,
Stephen




RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-09 Thread Stephen Provine via cygwin
On 2019-09-06 13:35, Andrey Repin wrote:
> CMD escape character is ^, not \

You are correct about the cmd.exe interpretation, so my test cases were
buggy, but Go invokes other executables using CreateProcess directly and
is not subject to the additional set of command line processing rules that
are used by cmd.exe.

If you see the last exchange with Eric, I think it is clear that there is a case
missing in the Cygwin processing rules that becomes a problem when a
calling process directly reverses the rules, specifically when an argument
value does not itself need to be quoted but it has a double quote in the
value. This is rule 4 in what I found to be the most definitive reference:

http://daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULESCHANGE

And see the fourth example in section 5.4.

However, the *safest* way to construct a command line is to avoid this
case and make sure to always double quote an argument that contains
double quotes. The official algorithm from a Microsoft source was
previously posted by Eric:

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

Interestingly, nothing in this article specifically says that it *shouldn't*
be OK to do what the Go algorithm does; it just happens to be simpler if you
don't worry about that case.

FWIW, .NET Core uses this algorithm:

https://github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/System/PasteArguments.cs

Which I think is probably pretty good validation that it's the right one to use.
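
For reference, the quoting algorithm from the Microsoft blog post above can be
sketched in Go roughly as follows. This is only a minimal sketch under my own
assumptions, not the code used by Go, .NET Core, or Cygwin: quoteArg and
buildCommandLine are hypothetical names, and the set of characters that forces
quoting is the one I recall from the blog post. The idea is to wrap any
argument containing whitespace or a double quote in double quotes, write 2n+1
backslashes before an embedded quote, and double any backslashes that end up
directly before the closing quote.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteArg is a hypothetical helper that quotes one argument so that a parser
// following the CommandLineToArgvW conventions should recover it unchanged.
func quoteArg(arg string) string {
	// Arguments without whitespace or quotes can be passed through as-is.
	if arg != "" && !strings.ContainsAny(arg, " \t\n\v\"") {
		return arg
	}
	var b strings.Builder
	b.WriteByte('"')
	backslashes := 0 // pending run of backslashes not yet written
	for i := 0; i < len(arg); i++ {
		c := arg[i]
		switch c {
		case '\\':
			backslashes++
		case '"':
			// Double the pending backslashes and add one more to escape the quote.
			b.WriteString(strings.Repeat(`\`, backslashes*2+1))
			b.WriteByte('"')
			backslashes = 0
		default:
			// Backslashes not followed by a quote are literal.
			b.WriteString(strings.Repeat(`\`, backslashes))
			b.WriteByte(c)
			backslashes = 0
		}
	}
	// Trailing backslashes would otherwise escape the closing quote.
	b.WriteString(strings.Repeat(`\`, backslashes*2))
	b.WriteByte('"')
	return b.String()
}

// buildCommandLine joins the quoted arguments with single spaces.
func buildCommandLine(args []string) string {
	quoted := make([]string, len(args))
	for i, a := range args {
		quoted[i] = quoteArg(a)
	}
	return strings.Join(quoted, " ")
}

func main() {
	// prints: bash.exe test.sh foo "bar\"baz" bat
	fmt.Println(buildCommandLine([]string{"bash.exe", "test.sh", "foo", `bar"baz`, "bat"}))
}
```

With this approach an argument containing a double quote is always wrapped in
double quotes, which avoids the corner case discussed in this thread.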

So, the outcome of all of this is that Go should probably update their logic
as it's based on the wrong official source. I plan to follow up there. If there
is any interest in the future to correct the parsing behavior in Cygwin, the
information needed to do that is in this thread. Personally, I think that if
Cygwin fixes the problem it's easier to recompile all those binaries than try
to locate all potential source calling processes to make sure they follow
the right algorithm (Go isn't right, what about Node, Python, etc...) But
I'm not going to push on this point as I can work around it for my case.

Thanks,
Stephen

-----Original Message-----
From: Andrey Repin
Sent: Friday, September 6, 2019 1:35 PM
To: Stephen Provine ; cygwin@cygwin.com
Subject: Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

Greetings, Stephen Provine!

> On 2019-09-04 23:29, Brian Inglis wrote:
>> As standard on Unix systems, just add another level of quoting for 
>> each level of interpretation, as bash will process that command line, 
>> then bash will process the script command line.

> My mistake - I'm very aware of the quoting rules, yet in my test 
> script for this scenario I forgot to quote the arguments. However, if 
> POSIX rules are being implemented, there is still something I didn't expect. 
> Here's my bash script:

> #!/bin/bash
> echo "$1"
> echo "$2" 
> echo "$3"

> And I invoke it like this from a Windows command prompt:

> C:\> bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar\baz bat'
> bar\baz bat
> + echo ''

> Not expected. Called from within Cygwin, the behavior is correct:

Again, fully expected.

> $ bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar"baz'
> bar"baz
> + echo bat
> bat

> Can you explain this difference?

CMD escape character is ^, not \

> The reason I ask is that if this worked, the way Go constructs the 
> command line string would be just fine.

No.


--
With best regards,
Andrey Repin
Friday, September 6, 2019 23:33:46

Sorry for my terrible english...





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-07 Thread Brian Inglis
On 2019-09-05 16:01, Stephen Provine via cygwin wrote:
> On 9/5/19 2:05 PM, Eric Blake wrote:
>> On 9/5/19 1:31 PM, Stephen Provine via cygwin wrote:
>>> Not expected.
> 
>> Why not? That obeyed cmd's odd rules: The moment you have a " in the
>> command line, that argument continues until end of line or the next "
>> (regardless of how many \ precede the ").
> 
> Now I'm really confused. Brian seemed to indicate that the POSIX rules were
> followed, but you're indicating that the Windows command line parsing rules
> are followed. So I assume the reality is that it is actually some mix of the two.
> Is the effective parsing logic implemented by Cygwin documented anywhere?

It depends on what you are running through, because you have layers. In that
test case you ran from cmd, so cmd parsing has to be taken into account first.
The resulting command line is then passed to bash, where Cygwin constructs a
POSIX argument list from the cmd output and passes that to bash and then to
script.sh.

Try your testing using my script.sh shown earlier, and call bash with -vx
options for debugging output.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.




Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-07 Thread Andrey Repin
Greetings, Stephen Provine!

> On 2019-09-04 23:29, Brian Inglis wrote:
>> As standard on Unix systems, just add another level of quoting for each level of
>> interpretation, as bash will process that command line, then bash will process
>> the script command line.

> My mistake - I'm very aware of the quoting rules, yet in my test script for this
> scenario I forgot to quote the arguments. However, if POSIX rules are being
> implemented, there is still something I didn't expect. Here's my bash script:

> #!/bin/bash
> echo "$1"
> echo "$2" 
> echo "$3"

> And I invoke it like this from a Windows command prompt:

> C:\> bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar\baz bat'
> bar\baz bat
> + echo ''

> Not expected. Called from within Cygwin, the behavior is correct:

Again, fully expected.

> $ bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar"baz'
> bar"baz
> + echo bat
> bat

> Can you explain this difference?

CMD escape character is ^, not \

> The reason I ask is that if this worked,
> the way Go constructs the command line string would be just fine.

No.


-- 
With best regards,
Andrey Repin
Friday, September 6, 2019 23:33:46

Sorry for my terrible english...





RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-06 Thread Stephen Provine via cygwin
On 9/5/19 9:26 PM, Eric Blake wrote:
> Rather, go is not passing the command line to CreateProcess in the way
> that is unambiguously parseable in the manner expected by
> CommandLineToArgvW.

The specific example I gave is unambiguous and is parsed correctly by
CommandLineToArgvW, so if the goal is for Cygwin to effectively
simulate this function, I can confirm that it is missing this case.

It's reasonable that Go's algorithm should be changed to have a better
chance of working with Windows programs that manually implement
command line parsing and may not match expectations for all cases.
I'll follow up with them and for the time being, work around the issue
with my own implementation as I've since figured out how to do that.

FWIW, here's the most definitive reference I've found for how Windows
binaries compiled with the Microsoft C/C++ compilers do command line
parsing, in case there is any desire to address this issue at some point:

http://daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULES

Thanks for entertaining my persistence on this topic!

Stephen





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Eric Blake
On 9/5/19 6:45 PM, Stephen Provine via cygwin wrote:

> 
> To prove it is not going through cmd.exe, I debugged the Go program
> to the point that it calls the Win32 CreateProcess function, and the
> first two arguments are:
> 
> lpApplicationName: "C:\\cygwin64\\bin\\bash.exe"
> lpCommandLine: "C:\\cygwin64\\bin\\bash.exe test.sh foo bar\\\"baz bat"

And according to
https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
that is NOT the correct command line to be handing to CreateProcess, at
least not if you want things preserved.

If I read that page correctly, the unambiguously correct command line
should be:

"C:\\cygwin64\\bin\\bash.exe test.sh foo \"bar\\\"baz\" bat"

> 
> So unless I'm missing something, bash.exe is not interpreting the command line
> following the rules pointed to by the documentation for CommandLineToArgvW.

Rather, go is not passing the command line to CreateProcess in the way
that is unambiguously parseable in the manner expected by
CommandLineToArgvW.  And because Go is relying on a corner case of
ambiguous parsing instead of well-balanced quoting, it's no surprise if
cygwin doesn't parse that corner case in the manner expected.  A patch
to teach cygwin to parse the corner case identically would be welcome,
but fixing recipient processes does not scale as well as fixing the
culprit source process.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





Re: RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Steven Penny

On Thu, 5 Sep 2019 23:45:44, "Stephen Provine via cygwin" wrote:

package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	cmd := exec.Command("C:\\cygwin64\\bin\\bash.exe", "test.sh", "foo",
		"bar\"baz", "bat")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}


Why are you doing this? I hate to be that guy, but examples are important.
Arguably the most important lesson I have learned with computer programming is:
use the right tool for the job.

So when I need to do something, I start with a shell script. Then once a shell
script doesn't cut it anymore, I move to AWK, then Python, then Go. Substitute
your language of choice.

What I don't do is call a shell script from Go or anything else. I might call
"git.exe" or "ffmpeg.exe", but even then you could argue against it as those
binaries have libraries too.

I agree that Cygwin should be parsing to and from cmd.exe correctly. But unless
you have a valid use case, it's kind of like "Cygwin theory". I have found that
historically those types of issues are less likely to be resolved in a timely
manner, if at all.





RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Stephen Provine via cygwin
On 9/5/19 5:46 PM, Eric Blake wrote:
> If you start a cygwin process from Windows, then cygwin1.dll is given
> only a single string, which it must parse into argv according to windows
> conventions (if it does not produce the same argv[] as a windows process
> using CommandLineToArgvW, then that's a bug in cygwin1.dll).  But on top
> of that, if you are using cmd.exe to generate your command line, then
> you must use proper escaping, otherwise, cmd.exe can produce a command
> line that has unexpected quoting in the string handed to
> CommandLineToArgvW, and the Windows parsing when there are unbalanced
> quotes can be screwy

Great explanation, it's very helpful.

I've been using cmd.exe to generate the command line for my tests, but the
original problem was when my compiled Go binary directly executes another
Windows process using the Win32 APIs like CreateProcess directly. Here's a
simple Go program that reproduces the issue:

package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	cmd := exec.Command("C:\\cygwin64\\bin\\bash.exe", "test.sh", "foo",
		"bar\"baz", "bat")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}

The output of this process is:

foo
bar\baz bat


To prove it is not going through cmd.exe, I debugged the Go program
to the point that it calls the Win32 CreateProcess function, and the
first two arguments are:

lpApplicationName: "C:\\cygwin64\\bin\\bash.exe"
lpCommandLine: "C:\\cygwin64\\bin\\bash.exe test.sh foo bar\\\"baz bat"

So unless I'm missing something, bash.exe is not interpreting the command line
following the rules pointed to by the documentation for CommandLineToArgvW.

Thanks,
Stephen




Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Eric Blake

On 9/5/19 5:01 PM, Stephen Provine via cygwin wrote:
> On 9/5/19 2:05 PM, Eric Blake wrote:
>> On 9/5/19 1:31 PM, Stephen Provine via cygwin wrote:
>>> Not expected.
> 
>> Why not? That obeyed cmd's odd rules: The moment you have a " in the
>> command line, that argument continues until end of line or the next "
>> (regardless of how many \ precede the ").
> 
> Now I'm really confused. Brian seemed to indicate that the POSIX rules were
> followed, but you're indicating that the Windows command line parsing rules
> are followed. So I assume the reality is that it is actually some mix of the two.
> Is the effective parsing logic implemented by Cygwin documented anywhere?

If you start a Cygwin process from another cygwin process, then only
POSIX rules are in effect.  The bash shell parses its command line
according to POSIX rules, creates an argv[] to pass to exec(), then
cygwin1.dll manages to get that argv[], unscathed, to the new child
process (bypassing Windows' mechanisms), which uses the argv[] as-is.

If you start a Windows process from a cygwin process, then cygwin1.dll
must quote the arguments into a single concatenated string that will be
reparsed in the manner that the Windows runtime expects, because the
Windows process only gets a single string, not an argv[].  But cygwin
should be providing the correct escaping so that windows then parses it
back into the same intended argv[] (if not, that's a bug in cygwin1.dll).

If you start a cygwin process from Windows, then cygwin1.dll is given
only a single string, which it must parse into argv according to windows
conventions (if it does not produce the same argv[] as a windows process
using CommandLineToArgvW, then that's a bug in cygwin1.dll).  But on top
of that, if you are using cmd.exe to generate your command line, then
you must use proper escaping, otherwise, cmd.exe can produce a command
line that has unexpected quoting in the string handed to
CommandLineToArgvW, and the Windows parsing when there are unbalanced
quotes can be screwy (if it encounters a " inside an argument that was
not quoted with ", then that groups remaining text into the same
argument until a balanced " or end of string is encountered).  So it is
not always obvious at first glance if what you type in cmd.exe provides
the argv[] that you intended, because of the two layers of
interpretation (one from cmd to Windows, and one from Windows convention
into argv[]).
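
The Windows-convention step described above can be sketched in Go roughly as
follows. This is a simplified illustration, not what cygwin1.dll or
CommandLineToArgvW actually implements: splitCommandLine is a hypothetical
name, the sketch handles only the backslash and double-quote rules (2n
backslashes before a quote give n backslashes and toggle quoting, 2n+1 give n
backslashes and a literal quote), and it leaves out the special handling of
the program name and of doubled quotes inside a quoted region.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCommandLine is a hypothetical helper that splits a single command line
// string into arguments using a simplified form of the Windows conventions.
func splitCommandLine(cmdline string) []string {
	var args []string
	var cur strings.Builder
	inQuotes := false // inside a "..." region
	haveArg := false  // current argument has started (it may still be empty)
	backslashes := 0  // pending run of backslashes not yet emitted

	// emitBackslashes writes a pending backslash run literally.
	emitBackslashes := func() {
		cur.WriteString(strings.Repeat(`\`, backslashes))
		backslashes = 0
	}

	for i := 0; i < len(cmdline); i++ {
		c := cmdline[i]
		switch {
		case c == '\\':
			backslashes++
			haveArg = true
		case c == '"':
			// Even run: half the backslashes, and the quote toggles quoting.
			// Odd run: half the backslashes (rounded down) and a literal quote.
			cur.WriteString(strings.Repeat(`\`, backslashes/2))
			if backslashes%2 == 1 {
				cur.WriteByte('"')
			} else {
				inQuotes = !inQuotes
			}
			backslashes = 0
			haveArg = true
		case (c == ' ' || c == '\t') && !inQuotes:
			// Unquoted whitespace ends the current argument.
			emitBackslashes()
			if haveArg {
				args = append(args, cur.String())
				cur.Reset()
				haveArg = false
			}
		default:
			emitBackslashes()
			cur.WriteByte(c)
			haveArg = true
		}
	}
	emitBackslashes()
	if haveArg {
		args = append(args, cur.String())
	}
	return args
}

func main() {
	// Escaped quote: prints ["test.sh" "foo" "bar\"baz" "bat"]
	fmt.Printf("%q\n", splitCommandLine(`test.sh foo bar\"baz bat`))
	// Unbalanced quote: prints ["test.sh" "foo" "barbaz bat"]
	fmt.Printf("%q\n", splitCommandLine(`test.sh foo bar"baz bat`))
}
```

The second call shows the unbalanced-quote behaviour: everything after the
stray quote is grouped into one argument until the end of the string.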

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Stephen Provine via cygwin
On 9/5/19 2:05 PM, Eric Blake wrote:
> On 9/5/19 1:31 PM, Stephen Provine via cygwin wrote:
> > Not expected.

> Why not? That obeyed cmd's odd rules: The moment you have a " in the
> command line, that argument continues until end of line or the next "
> (regardless of how many \ precede the ").

Now I'm really confused. Brian seemed to indicate that the POSIX rules were
followed, but you're indicating that the Windows command line parsing rules
are followed. So I assume the reality is that it is actually some mix of the two.
Is the effective parsing logic implemented by Cygwin documented anywhere?

Thanks,
Stephen





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Eric Blake
On 9/5/19 1:31 PM, Stephen Provine via cygwin wrote:
> My mistake - I'm very aware of the quoting rules, yet in my test script for this
> scenario I forgot to quote the arguments. However, if POSIX rules are being
> implemented, there is still something I didn't expect. Here's my bash script:
> 
> #!/bin/bash
> echo "$1"
> echo "$2" 
> echo "$3"
> 
> And I invoke it like this from a Windows command prompt:
> 
> C:\> bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar\baz bat'
> bar\baz bat
> + echo ''
> 
> Not expected.

Why not? That obeyed cmd's odd rules: The moment you have a " in the
command line, that argument continues until end of line or the next "
(regardless of how many \ precede the ").

https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/

Perhaps you meant to try:

c:\> bash -x script.sh foo ^"bar\^"baz^" bat
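
The carets in that command line come from the extra cmd.exe layer: once an
argument has been quoted for the Windows argv conventions, each cmd
metacharacter is prefixed with ^ so that cmd passes it through untouched. A
minimal sketch in Go, assuming the metacharacter list ( ) % ! ^ " < > & | as I
recall it from the blog post above (escapeForCmd is a hypothetical name, and
cmd has further quirks, for example around %, that this does not cover):

```go
package main

import (
	"fmt"
	"strings"
)

// cmdMetaChars is the assumed set of characters that cmd.exe treats specially.
const cmdMetaChars = `()%!^"<>&|`

// escapeForCmd is a hypothetical helper that prefixes every cmd metacharacter
// with a caret. The input is expected to be already quoted for the program.
func escapeForCmd(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if strings.IndexByte(cmdMetaChars, c) >= 0 {
			b.WriteByte('^')
		}
		b.WriteByte(c)
	}
	return b.String()
}

func main() {
	// prints: ^"bar\^"baz^"
	fmt.Println(escapeForCmd(`"bar\"baz"`))
}
```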

> Called from within Cygwin, the behavior is correct:
> 
> $ bash -x script.sh foo bar\"baz bat
> + echo foo
> foo
> + echo 'bar"baz'
> bar"baz
> + echo bat
> bat

Moral of the story: POSIX rules are saner than cmd rules.

> 
> Can you explain this difference? The reason I ask is that if this worked,
> the way Go constructs the command line string would be just fine.

If Go is not constructing the command line string in a manner that
matches that blog post, the bug would be in Go.  Presumably, Cygwin is
correctly quoting things any time it calls into a non-Cygwin process
(but if not, give us a test case for us to patch cygwin, or even better
submit the patch).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org





RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-05 Thread Stephen Provine via cygwin
On 2019-09-04 23:29, Brian Inglis wrote:
> As standard on Unix systems, just add another level of quoting for each level of
> interpretation, as bash will process that command line, then bash will process
> the script command line.

My mistake - I'm very aware of the quoting rules, yet in my test script for this
scenario I forgot to quote the arguments. However, if POSIX rules are being
implemented, there is still something I didn't expect. Here's my bash script:

#!/bin/bash
echo "$1"
echo "$2" 
echo "$3"

And I invoke it like this from a Windows command prompt:

C:\> bash -x script.sh foo bar\"baz bat
+ echo foo
foo
+ echo 'bar\baz bat'
bar\baz bat
+ echo ''

Not expected. Called from within Cygwin, the behavior is correct:

$ bash -x script.sh foo bar\"baz bat
+ echo foo
foo
+ echo 'bar"baz'
bar"baz
+ echo bat
bat

Can you explain this difference? The reason I ask is that if this worked,
the way Go constructs the command line string would be just fine.

Thanks,
Stephen





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-04 Thread Brian Inglis
On 2019-09-04 17:46, Stephen Provine wrote:
> On 2019-09-04 10:20, Brian Inglis wrote:
>> and ask if you really expect anyone else to use or reproduce this insanity,
>> rather than a sane POSIX parser?
> 
> I know it's insanity, but it's insanity that almost all Windows programs inherit
> and implement consistently enough because they use standard libraries or functions
> to do the parsing. The Go command line parser used to use CommandLineToArgvW
> and only switched away from it due to performance (it's in shell32.dll and that
> takes a long time to load). I don't know how accurate their manual reproduction
> is, but they seemed to study the sources I sent pretty carefully.
> 
> Anyway, my specific problem is that I have Go code with an array of arguments
> that I want to pass verbatim (no glob expansion) to a bash script. I've figured
> out how to override Go's default code for building the command line string, but
> it's not clear how to correctly construct the command line string. If the POSIX
> rules are being followed, I'd expect the following to work:
> 
> bash.exe script.sh arg1 "*" arg3
> 
> But it always expands the "*" to all the files in the current directory. I've
> also tried \* and '*', but same problem. So how do I build a command line string
> that takes each argument literally with no processing?

As standard on Unix systems, just add another level of quoting for each level of
interpretation, as bash will process that command line, then bash will process
the script command line.

How are you running the command line? I get the same results under cmd or
mintty/bash:

$ bash -nvx script.sh arg1 "*" arg3
#!/bin/bash
# script.sh - echo args

argc=$#
argv=("$0" "$@")
echo argc $argc argv[0] "${argv[0]}"

for ((a = 1; a <= $argc; ++a))
do
echo argv[$a] "${argv[$a]}"
done

C:\ > bash script.sh arg1 "*" arg3
argc 3 argv[0] script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

C:\ > bash -c 'script.sh arg1 "*" arg3'
argc 3 argv[0] /mnt/c/Users/bwi/bin/script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

$ bash script.sh arg1 "*" arg3
argc 3 argv[0] script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

$ bash -c 'script.sh arg1 "*" arg3'
argc 3 argv[0] /home/bwi/bin/script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

$ cmd /c bash script.sh arg1 "\*" arg3
argc 3 argv[0] script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

$ cmd /c bash -c 'script.sh arg1 "*" arg3'
argc 3 argv[0] /mnt/c/Users/bwi/bin/script.sh
argv[1] arg1
argv[2] *
argv[3] arg3

but with un-double-quoted (and backslash escaped) * I get a list of the current
directory files from all of these commands.

Invoking bash with options -vx or set -vx in script.sh will let you see what is
happening on stderr. Many errors cause non-interactive shell scripts to exit, so
check for child process error return codes (a status above 128 usually means the
child was killed by a signal: 128 + signal number). If you are not careful within
script.sh, many unquoted uses of $2 may expand the *. Double quotes allow command
and parameter substitution, and history expansion, but suppress pathname
expansion. You should refer to each parameter within script.sh as "$1" "$2" "$3",
or you might need to quote some or each argument character and enclose the * in
double quotes e.g. \""\*"\" to pass thru the Go command line interface.
Can you not tell the interface to pass the string through verbatim for execution?

You may check any of the POSIX shell, dash/ash/sh shell, ksh Korn shell, or bash
shell man pages or docs for more details on variations between shells and
extensions to POSIX operation.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.




RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-04 Thread Stephen Provine via cygwin
On 2019-09-04 10:20, Brian Inglis wrote:
> and ask if you really expect anyone else to use or reproduce this insanity,
> rather than a sane POSIX parser?

I know it's insanity, but it's insanity that almost all Windows programs inherit
and implement consistently enough because they use standard libraries or
functions to do the parsing. The Go command line parser used to use
CommandLineToArgvW and only switched away from it due to performance (it's in
shell32.dll and that takes a long time to load). I don't know how accurate their
manual reproduction is, but they seemed to study the sources I sent pretty
carefully.

Anyway, my specific problem is that I have Go code with an array of arguments
that I want to pass verbatim (no glob expansion) to a bash script. I've figured
out how to override Go's default code for building the command line string, but
it's not clear how to correctly construct the command line string. If the POSIX
rules are being followed, I'd expect the following to work:

bash.exe script.sh arg1 "*" arg3

But it always expands the "*" to all the files in the current directory. I've
also tried \* and '*', but same problem. So how do I build a command line string
that takes each argument literally with no processing?

Thanks,
Stephen





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-04 Thread Brian Inglis
On 2019-09-03 10:38, Stephen Provine wrote:
> On 2019-08-30 21:58, Brian Inglis wrote:
>> Not being in the same Cygwin process group and lacking the appropriate
>> interface info indicates that the invoker was not Cygwin.
> 
> Should I interpret this to mean the "winshell" parameter is not an accurate
> statement of what I thought it was for and because there is no way to reliably
> determine if the calling process was from Cygwin or not, behavior like I
> suggest is actually impossible?

Reread the rules in the article you quoted, carefully, then read:

http://www.windowsinspired.com/how-a-windows-programs-splits-its-command-line-into-individual-arguments/
[also see linked articles about cmd and batch file command line parsing]

and ask if you really expect anyone else to use or reproduce this insanity,
rather than a sane POSIX parser?
Once again MS "persists in reinventing the square wheel", badly [from Henry
Spencer's Commandments].

What does the Go command line parser actually accept, does it really invert the
parse_cmdline or CommandLineToArgvW rules, and which?

That winshell parameter is set in dcrt0.cc when calling build_argv, based on
whether the parent process was Cygwin. If it was, an argv array preset by the
Cygwin parent is available. If not, globs are allowed to be expanded, and the
command line args, quotes, and wildcards have to be handled by the program
according to POSIX shell command line quoting, field splitting, and pathname
expansion rules, respecting $IFS:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

A similar flag in spawn.cc, based on the exe or interpreter exe being under a
Cygwin exec mount in realpath.iscygexec(), decides whether the argv array can be
passed a la Unix to a Cygwin child, or a Windows command line needs to be built
with Windows argument double quoting and escaping where required.
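
As a rough illustration of "Windows argument double quoting and escaping", here
is a sketch in bash of one common way to quote a single argument so that a
parser following the Microsoft rules recovers it unchanged. The function name
win_quote is hypothetical, this is not the code in spawn.cc or in Go, and it
does nothing about glob characters such as *:

```shell
#!/bin/bash
# Hypothetical helper: quote one argument for a Windows command line.
win_quote() {
  local arg=$1 out='"' bs='' ch i
  # Non-empty arguments without whitespace or double quotes pass unchanged.
  if [[ -n $arg && $arg != *[[:space:]\"]* ]]; then
    printf '%s\n' "$arg"
    return
  fi
  for (( i = 0; i < ${#arg}; i++ )); do
    ch=${arg:i:1}
    if [[ $ch == '\' ]]; then
      bs+='\'                    # hold backslashes until we see what follows
    elif [[ $ch == '"' ]]; then
      out+="$bs$bs"'\"'          # double the pending run, then escape the quote
      bs=''
    else
      out+="$bs$ch"              # backslashes before other characters stay as-is
      bs=''
    fi
  done
  printf '%s\n' "$out$bs$bs\""   # double a trailing run before the closing quote
}

# Print the quoted form of each sample argument, one per line.
for a in bash.exe script.sh 'two words' 'foo"bar' '*'; do
  win_quote "$a"
done
```

The choice to wrap an argument in double quotes whenever it contains a double
quote is an assumption of this sketch; it yields the "foo\"bar" form rather than
the bare foo\"bar form.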

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.




RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-09-03 Thread Stephen Provine via cygwin
On 2019-08-30 21:58, Brian Inglis wrote:
> Not being in the same Cygwin process group and lacking the appropriate
> interface info indicates that the invoker was not Cygwin.

Should I interpret this to mean the "winshell" parameter is not an accurate
statement of what I thought it was for and because there is no way to reliably
determine if the calling process was from Cygwin or not, behavior like I suggest
is actually impossible?




Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-08-31 Thread Andrey Repin
Greetings, Stephen Provine!

> The standard rules for Microsoft command line processing are documented here:

> https://docs.microsoft.com/en-us/previous-versions/17w5ykft(v=vs.85)

> The Cygwin code for command line processing is in dcrt0.cc, function 
> build_argv.

> The behaviors do not match. For instance, given a test.sh script like this:

> #!/bin/bash
> echo $1

> And the following invocation of bash.exe from a Windows command prompt:

> bash.exe test.sh foo\"bar

> The result is:

> foo\bar

> When the expected result is:

> foo"bar

I would actually expect a parsing error, but I guess CMD gives you some slack.
Then, the expected result is either 'foo\"bar' or 'foo\bar', since in CMD the
escape character is a caret (^).

> As a workaround, you can achieve the expected result using:

> bash.exe test.sh "foo\"bar"

> Which is great until you use a language like Go to shell exec the command
> line, and don't have control over how the command line string is generated
> from an original set of arguments. See:

> https://github.com/golang/go/blob/master/src/syscall/exec_windows.go#L86

> Go just reverses the Microsoft standard rules in the most efficient manner
> possible, but those command lines don't parse correctly in Cygwin processes.

> Go implements a pretty definitive command line parsing algorithm as a
> replacement for the CommandLineToArgv function in shell32.dll:

>
> https://github.com/golang/go/commit/39c8d2b7faed06b0e91a1ad7906231f53aab45d1

> The behavior here is based on a detailed analysis of what command line 
> parsing "should" be in Windows:

> http://daviddeley.com/autohotkey/parameters/parameters.htm#WINARGV

> It would be very nice if Cygwin followed the same procedure at startup.

> Thanks,
> Stephen





-- 
With best regards,
Andrey Repin
Saturday, August 31, 2019 11:27:38

Sorry for my terrible english...





Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-08-30 Thread Brian Inglis
On 2019-08-30 14:59, Stephen Provine wrote:
>> Cygwin command line parsing has to match Unix shell command line processing,
>> like argument splitting, joining within single or double quotes or after a
>> backslash escaped white space characters, globbing, and other actions
>> normally performed by a shell, when any Cygwin program is invoked from any
>> Windows program e.g. cmd, without those Windows limitations which exclude any
>> use of a backslash escape character except preceding another or a double
>> quote.

> I guess my assumption was that the "winshell" parameter would be used to
> determine when a Cygwin process is called from a non-Cygwin process and that
> it would be more appropriate to use standard Windows command line processing
> (as limiting as it may be) in that case. Once in the Cygwin environment, calls
> from one process to another should obviously process command lines according
> to Unix shell rules.

Not being in the same Cygwin process group and lacking the appropriate interface
info indicates that the invoker was not Cygwin.
Cygwin command line file name globs can match names containing any UTF-8
character except nulls and forward or backward slashes (the latter excluded for
Windows compatibility), including characters Windows does not support, such as
leading and trailing spaces and dots, and can result in thousands of file name
arguments on the command line e.g.

$ echo /var/log/* | wc -lwmcL
  1   66858 2903078 2903078 2903077

shows I need to clean up my /var/log directory as it contains 64K+ files with
names totalling 2234498 chars/bytes, plus 668579 for paths and spaces, plus a
newline terminator.
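
The figures above can be cross-checked with shell arithmetic. In this sketch the
split of the 668579 into a "/var/log/" prefix per argument plus one separating
space between arguments is an inference from the glob used, not something wc
reports:

```shell
#!/bin/bash
# Figures from the wc output above; variable names are illustrative only.
args=66858               # words: one per matched file
name_chars=2234498       # characters in the bare file names
prefix='/var/log/'       # directory prefix prepended to every name

prefix_chars=$(( args * ${#prefix} ))   # 9 characters per argument
separators=$(( args - 1 ))              # one space between adjacent arguments
overhead=$(( prefix_chars + separators ))
total=$(( name_chars + overhead + 1 ))  # + 1 for the trailing newline

echo "overhead=$overhead total=$total"
```

This reproduces the 668579 for paths and spaces and the 2903078 character and
byte counts shown by wc.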

Some file names with non-Windows supported characters have them converted to the
UTF-16LE BMP PUA by adding 0xf000, or for characters not supported by non-UTF-8
interface encodings, ^X (CAN, 0x18) followed by a BMP UTF-8 sequence, allowing
conversion to UTF-16LE, at the cost of weird characters in the displayed names.
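
A small sketch of the "adding 0xf000" step, assuming bash. The helper name
pua_codepoint is hypothetical, and the colon and asterisk are picked only as
examples of characters Windows does not allow in file names; this message does
not list which characters Cygwin actually maps:

```shell
#!/bin/bash
# Hypothetical helper: print the private use area code point that results from
# adding 0xf000 to the code of a single ASCII character.
pua_codepoint() {
  local code
  code=$(printf '%d' "'$1")        # numeric value of the character
  printf 'U+%04X\n' $(( 0xf000 + code ))
}

pua_codepoint ':'    # colon, 0x3a
pua_codepoint '*'    # asterisk, 0x2a
```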

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.




RE: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-08-30 Thread Stephen Provine via cygwin
> Cygwin command line parsing has to match Unix shell command line processing,
> like argument splitting, joining within single or double quotes or after a
> backslash escaped white space characters, globbing, and other actions normally
> performed by a shell, when any Cygwin program is invoked from any Windows
> program e.g. cmd, without those Windows limitations which exclude any use of a
> backslash escape character except preceding another or a double quote.

I guess my assumption was that the "winshell" parameter would be used to
determine when a Cygwin process is called from a non-Cygwin process and that it
would be more appropriate to use standard Windows command line processing (as
limiting as it may be) in that case. Once in the Cygwin environment, calls from
one process to another should obviously process command lines according to Unix
shell rules.




Re: Command line processing in dcrt0.cc does not match Microsoft parsing rules

2019-08-30 Thread Brian Inglis
On 2019-08-30 13:16, Stephen Provine via cygwin wrote:
> The standard rules for Microsoft command line processing are documented here:
> https://docs.microsoft.com/en-us/previous-versions/17w5ykft(v=vs.85)
> The Cygwin code for command line processing is in dcrt0.cc, function 
> build_argv.
> The behaviors do not match. For instance, given a test.sh script like this:
> #!/bin/bash
> echo $1
> And the following invocation of bash.exe from a Windows command prompt:
> bash.exe test.sh foo\"bar
> The result is:
> foo\bar
> When the expected result is:
> foo"bar
> As a workaround, you can achieve the expected result using:
> bash.exe test.sh "foo\"bar"
> Which is great until you use a language like Go to shell exec the command 
> line, and don't have control over how the command line string is generated 
> from an original set of arguments. See:
> https://github.com/golang/go/blob/master/src/syscall/exec_windows.go#L86
> Go just reverses the Microsoft standard rules in the most efficient manner 
> possible, but those command lines don't parse correctly in Cygwin processes.
> Go implements a pretty definitive command line parsing algorithm as a 
> replacement for the CommandLineToArgv function in shell32.dll:
> 
> https://github.com/golang/go/commit/39c8d2b7faed06b0e91a1ad7906231f53aab45d1
> The behavior here is based on a detailed analysis of what command line 
> parsing "should" be in Windows:
> http://daviddeley.com/autohotkey/parameters/parameters.htm#WINARGV
> It would be very nice if Cygwin followed the same procedure at startup.

Cygwin command line parsing has to match Unix shell command line processing,
like argument splitting, joining within single or double quotes or after a
backslash escaped white space characters, globbing, and other actions normally
performed by a shell, when any Cygwin program is invoked from any Windows
program e.g. cmd, without those Windows limitations which exclude any use of a
backslash escape character except preceding another or a double quote.
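
For comparison, this is how bash itself treats those constructs when it parses
a command line. It is only a sketch of the shell's behaviour, not of
build_argv, and the helper name show_args is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print each argument in brackets, one per line, so the
# word boundaries the shell produced are visible.
show_args() { printf '[%s]\n' "$@"; }

show_args foo\"bar       # backslash escapes the quote
show_args "foo\"bar"     # same escape inside double quotes
show_args 'foo\"bar'     # single quotes keep the backslash literally
show_args a\ b "c d" e   # escaped space and quotes each join one word
```

The first two calls pass the single argument foo"bar, the third passes foo\"bar
with the backslash intact, and the last passes three arguments.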

Mixing Cygwin and Windows programs is a user choice requiring them to deal with
any interface issues: just use mintty with bash. ;^> It's actually the same
situation as invoking, from the shell, any other Cygwin program which also does
some argument interpretation, possibly requiring nested quoting and escaping.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
