On Monday, 11 March 2013 at 23:21:18 UTC, Steven Schveighoffer
wrote:
OK, I was able to reproduce on a windows 7 box, and I found the
problem. Really dumb. It was the difference in the
spawnProcess calls between the two programs that gave me the
hint.
Awesome! :)
So the environment pointer to CreateProcess is a list of
null-separated "var=value" strings. The last string is
followed by an additional null, so the function can identify
the end of the list.
HOWEVER, if you have NO variables, there is no double null,
because there isn't the null from the last variable. However,
on XP, a single null character is fine, whereas on windows 7,
it REQUIRES an extra null character, even if you didn't have
any variables. Adding the extra null character in toEnvz fixed
the problem.
I never would have thought of trying that, especially considering
that the CreateProcess() documentation is rather vague on this
point. It says:
"A Unicode environment block is terminated by four zero bytes:
two for the last string, two more to terminate the block."
(This confused me slightly, until I remembered that one zero
wchar is two zero bytes.) It does not explicitly mention the
case where there are *no* environment variables, but from the
above statement, I had inferred that one zero character would
suffice, considering there is no "last string" to terminate. But
it seems I was wrong. :)
Now, that actually brings up another bug I think. pipeProcess
is sending a null to spawnProcess for the environment.
However, because AA's are structs, and null is the same as an
empty AA, it gets translated into "kill the entire environment"
for the child process. I think this is wrong. But at the same
time, what if you DID want to kill the entire environment?
Should we even support that? Either way I don't think
pipeProcess should kill the environment, so that needs to be
changed.
Passing null to CreateProcessW copies the parent's environment,
I think that should be the ultimate default when the
environment is not specified, even for pipeProcess. But what
should we do if the spawnProcess overload that takes an
environment receives a null environment? My instinct is to
detect an empty AA, and interpret that as "copy parent
environment," even Lars seems to have interpreted it that way
(maybe it works that way on Linux as it's written now?) in how
he wrote pipeProcess.
I think we can forgo the pull request, the solution is simply
to add another '\0', that will handle the case where the
environment is empty and be a noop for when the environment has
stuff in it. But maybe we want to make toEnvz return null if
it gets an empty AA to avoid killing the environment? I have
no idea what the right answer is.
Nice catch! Fortunately, the only bug here is that I've
specified a null parameter in the spawn call in
pipeProcessImpl(). If you look at the various spawnProcess()
overloads, you'll see that they are designed as follows:
If you omit the AA parameter altogether, the child inherits the
parent's environment. A null pointer is passed as 'envz' to
spawnProcessImpl(), which in turn passes this straight to
CreateProcess() on Windows and replaces it by 'environ' on POSIX.
If you do specify the AA parameter, it is passed through toEnvz()
on its way to spawnProcessImpl(). If the AA is empty/null,
toEnvz() will create an empty (but non-null) environment block,
and the child's environment will be empty.
I think the bug stems from the fact that, at some intermediate
stage of the development, pipeProcessImpl() used to call
spawnProcessImpl() directly. I later changed this to
spawnProcess(), but forgot to remove the null parameter. I'll
simply remove it, and everything will work as intended.
Initially, I wrote spawnProcess() the way you suggest, i.e. that
a passing a null AA is equivalent to omitting it. But then we
are left with no simple way to explicitly clear the child's
environment. We could make a distinction between a "null" and an
"empty" AA, but making a non-null empty AA is a hassle. I think
it is better the way it is now.
I don't think we need to support clearing the child's environment
in pipeProcess(). I suspect it is a rare need, and the user will
just have to deal with spawnProcess() directly in that case.
(It's like how there are all kinds of different exec*() functions
in C, but only one, simple, popen() function.)
Lars