Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-26 Thread Corinna Vinschen
On Apr 24 17:09, Michael Haubenwallner wrote:
> On 4/12/19 7:40 PM, Corinna Vinschen wrote:
> > Hi Michael,
> 
> > Nick Clifton, one of the binutils maintainers, made the following
> > suggestion in PM:
> > 
> > Allow the ld flag --enable-auto-image-base to take a filename as
> > argument.
> > 
> > The idea: The file is used by ld to generate the start address
> > for the next built DLL.  Mechanism:
> > 
> > 1.1. If ld links a DLL and if the file given to --enable-auto-image-base
> >  doesn't exist, ld will give the DLL the start address of the
> >  auto image base range.
> > 
> > 1.2: Next time, if ld links a DLL and if the file given to
> >  --enable-auto-image-base exists, it will use the address in that
> >  file as the start address for the just built DLL.
> > 
> > 2. It will store that address, plus the size of the DLL, rounded up to
> >    64K, in that file.
> 
> The rounding up is fine to get some alignment for the base address itself,
> but it feels irrelevant if it was for "finding the next base" only.

Well, DLLs always start at a 64K boundary, so it makes sense to round
immediately.

> > 3. If the auto image base range is at an end, ld will wrap back to
> >    the start address of the auto image base range.
> > 
> > TBD: A way to enable this feature without having to change all
> >  packages' build systems.
> 
> As the --enable-auto-image-base flag does not name any method for finding
> the image base beyond "automatic", IMHO using some predefined control file
> under the hood should be fine.

The current preliminary solution is to check if a file
~/.ld-pe-auto-image-base exists.  If it doesn't, ld uses the usual
hashing to compute the base address.  If the file exists and is empty,
the base address range start address is used (i.e. 0x4:),
otherwise the address is taken from the file and the next free address
after that is written back to the file.

The problem is that auto-image-basing occurs *so* early in ld that the
size of the built DLL isn't known when the file is written back.  Doing
this right needs a bigger change to ld: the current infrastructure
around image basing doesn't allow deferring the write of the file
content.

So ATM, ld just adds ~38 Megs to the current DLL address, which is
1 Meg more than the largest Cygwin DLL on my system.
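
A minimal sketch of the preliminary behaviour described above, written in
Python purely for illustration (ld itself is C).  The range constants and
the hash function are assumptions made for this example and are not taken
from ld; only the three cases (no file, empty file, file with an address)
and the fixed 38 Meg step come from the description.

```python
import os
import zlib

# Assumed values for illustration only; the real ones live inside ld.
RANGE_START = 0x400000000       # hypothetical start of the auto image base range
RANGE_SIZE = 0x200000000        # hypothetical size of that range
STRIDE = 38 * 1024 * 1024       # fixed "~38 Megs" step, a multiple of 64K


def hashed_base(output_name):
    """Stand-in for ld's name-hash scheme (NOT ld's real hash function)."""
    slot = zlib.crc32(output_name.encode()) % (RANGE_SIZE // 0x10000)
    return RANGE_START + slot * 0x10000


def next_image_base(control_file, output_name):
    """Pick a base address following the three cases described above."""
    if not os.path.exists(control_file):
        return hashed_base(output_name)             # no file: usual hashing
    with open(control_file) as f:
        text = f.read().strip()
    base = int(text, 16) if text else RANGE_START   # empty file: range start
    with open(control_file, "w") as f:
        f.write(f"{base + STRIDE:#x}\n")            # write back next free address
    return base
```

Because the DLL size is not known at this point, the sketch, like the
preliminary solution, advances by a fixed stride instead of the real size.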

There's already a bit more in terms of settings, but that's still
in the works.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-24 Thread Michael Haubenwallner
On 4/12/19 7:40 PM, Corinna Vinschen wrote:
> Hi Michael,

> Nick Clifton, one of the binutils maintainers, made the following
> suggestion in PM:
> 
> Allow the ld flag --enable-auto-image-base to take a filename as
> argument.
> 
> The idea: The file is used by ld to generate the start address
> for the next built DLL.  Mechanism:
> 
> 1.1. If ld links a DLL and if the file given to --enable-auto-image-base
>  doesn't exist, ld will give the DLL the start address of the
>  auto image base range.
> 
> 1.2: Next time, if ld links a DLL and if the file given to
>  --enable-auto-image-base exists, it will use the address in that
>  file as the start address for the just built DLL.
> 
> 2. It will store that address, plus the size of the DLL, rounded up to
>    64K, in that file.

The rounding up is fine to get some alignment for the base address itself,
but it feels irrelevant if it was for "finding the next base" only.

> 3. If the auto image base range is at an end, ld will wrap back to
>    the start address of the auto image base range.
> 
> TBD: A way to enable this feature without having to change all
>  packages' build systems.

As the --enable-auto-image-base flag does not name any method for finding
the image base beyond "automatic", IMHO using some predefined control file
under the hood should be fine.

Beyond holding the last image base and the range, such an auto image base
control file could control the actual behaviour as well, as in either
"use the -o argument" or "use next base within range".

Hence some versioning of that file's content might be reasonable.  And
for parallel builds, "finding the next base" needs some synchronization.
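
One way the synchronization could look, sketched in Python for
illustration only (ld is C, and nothing like this exists in ld as far as
this thread says).  The function name, the hex file format, and the use of
flock() are all assumptions of this example.

```python
import fcntl
import os


def claim_next_base(control_file, size, range_start):
    """Atomically read the next base from control_file and advance it by size.

    Hypothetical helper: an exclusive flock() is held across the whole
    read-modify-write, so parallel linkers cannot hand out the same address.
    """
    fd = os.open(control_file, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)        # blocks until we own the file
        text = f.read().strip()
        base = int(text, 16) if text else range_start
        f.seek(0)
        f.truncate()
        f.write(f"{base + size:#x}\n")
        f.flush()
        # The lock is released when the file is closed on leaving the block.
    return base
```

Holding the lock for the read and the write together is the point: taking
it only around the write would still let two linkers read the same base.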

What about hardcoding (the default of) such a filename into ld itself?  E.g.
"./configure --with-auto-image-base-control-file=/var/lib/ld/auto-image-base"

Also, I could think of binutils' configure options like this instead:
 --with-auto-image-base=control-file=/path/to/file
 --with-auto-image-base=dash-o-argument # current behaviour, no control file support

And to set the defaults of that control file:
 --with-auto-image-base-control-file-default-mode=[dash-o-argument|control-file]
 --with-auto-image-base-control-file-default-range=0x123456789abcdef:0xfedcba9876543210

> That way you could build hundreds of DLLs in a project and use them
> immediately without having to rebase.
> 
> This is just in a discussion state, nothing has happened yet, but
> what do you think in general?

This does make a lot of sense to me in general, although package managers
still need to use the Cygwin rebase database, as in the long run that range
will be exhausted.  But for Gentoo Prefix in particular, this would help during
bootstrap, before the Cygwin rebase database is set up inside the Prefix.

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-13 Thread Corinna Vinschen
On Apr 13 09:46, Achim Gratz wrote:
> Corinna Vinschen writes:
> > Nick Clifton, one of the binutils maintainers, made the following
> > suggestion in PM:
> >
> > Allow the ld flag --enable-auto-image-base to take a filename as
> > argument.
> >
> > The idea: The file is used by ld to generate the start address
> > for the next built DLL.  Mechanism:
> >
> > 1.1. If ld links a DLL and if the file given to --enable-auto-image-base
> >  doesn't exist, ld will give the DLL the start address of the
> >  auto image base range.
> >
> > 1.2: Next time, if ld links a DLL and if the file given to
> >  --enable-auto-image-base exists, it will use the address in that
> >  file as the start address for the just built DLL.
> >
> > 2. It will store that address, plus the size of the DLL, rounded up to
> >    64K, in that file.
> >
> > 3. If the auto image base range is at an end, ld will wrap back to
> >    the start address of the auto image base range.
> 
> Sounds OK if the goal is just to avoid collisions, but it would really
> be nicer if there was some way to plug this together with the rebase
> database from the start.

No, that's contrary to the idea.  The solution should be self-sufficient
within binutils.  We don't want to add any reliance to external tools.

The linker uses a DLL address space which does not collide with rebased
DLLs in 64 bit, so this only occurs during development, and none of the
built DLLs can collide with system DLLs.  I do not much care for 32 bit,
it's a lost cause anyway.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-13 Thread Achim Gratz
Corinna Vinschen writes:
> Nick Clifton, one of the binutils maintainers, made the following
> suggestion in PM:
>
> Allow the ld flag --enable-auto-image-base to take a filename as
> argument.
>
> The idea: The file is used by ld to generate the start address
> for the next built DLL.  Mechanism:
>
> 1.1. If ld links a DLL and if the file given to --enable-auto-image-base
>  doesn't exist, ld will give the DLL the start address of the
>  auto image base range.
>
> 1.2: Next time, if ld links a DLL and if the file given to
>  --enable-auto-image-base exists, it will use the address in that
>  file as the start address for the just built DLL.
>
> 2. It will store that address, plus the size of the DLL, rounded up to
>    64K, in that file.
>
> 3. If the auto image base range is at an end, ld will wrap back to
>    the start address of the auto image base range.

Sounds OK if the goal is just to avoid collisions, but it would really
be nicer if there was some way to plug this together with the rebase
database from the start.

> TBD: A way to enable this feature without having to change all
>  packages' build systems.

:-)

> That way you could build hundreds of DLLs in a project and use them
> immediately without having to rebase.
>
> This is just in a discussion state, nothing has happened yet, but
> what do you think in general?

Looking at what triggered the discussion, one would probably want to have
the option of giving the linker the name of an existing DLL as the
argument and have it re-use that base address (and a warning if the size
gets larger than the original DLL plus some guardband).
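
The size check suggested above could look roughly like this.  It is a
Python sketch for illustration only; the function name and the default
guardband value are made up for this example.

```python
def reuse_base(old_base, old_size, new_size, guardband=0x10000):
    """Reuse the base of an existing DLL; warn if the new image outgrew it.

    Returns (base, warning), where warning is None when the new DLL still
    fits into the old size plus the guardband.
    """
    warning = None
    if new_size > old_size + guardband:
        warning = (
            f"new DLL is {new_size - old_size:#x} bytes larger than the "
            f"original; base {old_base:#x} may now collide with a neighbour"
        )
    return old_base, warning
```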


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-12 Thread Corinna Vinschen
Hi Michael,

On Apr  3 14:22, Corinna Vinschen wrote:
> On Apr  3 11:18, Michael Haubenwallner wrote:
> > On 4/1/19 5:56 PM, Corinna Vinschen wrote:
> > > On Apr  1 16:56, Corinna Vinschen wrote:
> > >> On Apr  1 16:28, Michael Haubenwallner wrote:
> > >>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> > >>>> can you please collect the base addresses of all DLLs generated during
> > >>>> the build, plus their size and make a sorted list?  It would be
> > >>>> interesting to know if the hash algorithm in ld is actually as bad
> > >>>> as I conjecture.
> > >>>
> > >>> Please find attached the output of rebase -i for the dlls after bootstrap
> > >>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> > > 
> > > Oh, wait.  That's not what I was looking for.  The addresses are ok, but
> > > the paths *must* be the ones at the time the DLLs have been created,
> > > because that's what ld uses when creating the image base addresses.  The
> > > addresses combined with the installation paths don't make sense anymore.
> > 
> > So I have intercepted the ld.exe to show 'rebase -i' on any just created dll,
> > tell about the exact -o argument to ld, and the current directory.
> > 
> > This is with binutils-2.31.1
> > 
> > Anything else needed?
> 
> No, that should be sufficient, thanks for collecting this!

Nick Clifton, one of the binutils maintainers, made the following
suggestion in PM:

Allow the ld flag --enable-auto-image-base to take a filename as
argument.

The idea: The file is used by ld to generate the start address
for the next built DLL.  Mechanism:

1.1. If ld links a DLL and if the file given to --enable-auto-image-base
 doesn't exist, ld will give the DLL the start address of the
 auto image base range.

1.2: Next time, if ld links a DLL and if the file given to
 --enable-auto-image-base exists, it will use the address in that
 file as the start address for the just built DLL.

2. It will store that address, plus the size of the DLL, rounded up to
   64K, in that file.

3. If the auto image base range is at an end, ld will wrap back to
   the start address of the auto image base range.

TBD: A way to enable this feature without having to change all
 packages' build systems.

That way you could build hundreds of DLLs in a project and use them
immediately without having to rebase.

This is just in a discussion state, nothing has happened yet, but
what do you think in general?
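
To make the arithmetic of steps 1 to 3 concrete, here is a small sketch in
Python (illustration only, ld is C).  The function names are invented, and
reading step 3 as "wrap when the DLL no longer fits before the end of the
range" is one possible interpretation, not something settled above.

```python
ALIGN = 0x10000  # DLL base addresses are aligned to 64K


def round_up_64k(n):
    """Round n up to the next multiple of 64K."""
    return (n + ALIGN - 1) & ~(ALIGN - 1)


def allocate(stored, dll_size, range_start, range_end):
    """Return (base for this DLL, address to store for the next DLL).

    stored is the address read from the file, or None if the file does
    not exist yet (step 1.1).
    """
    size = round_up_64k(dll_size)
    base = range_start if stored is None else stored   # steps 1.1 and 1.2
    if base + size > range_end:                        # step 3: wrap around
        base = range_start
    return base, base + size                           # step 2
```

Feeding the second return value back in as stored on the next link gives
each DLL in a build its own slot until the range wraps.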


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-09 Thread Michael Haubenwallner
On 4/8/19 7:09 PM, Achim Gratz wrote:
> Michael Haubenwallner writes:
>> Well... once installed, a dll may get in use quickly, because I can not require
>> to shut down all Cygwin processes.  So I need to rebase and register the dll in
>> some staging directory before it is installed into its final directory, hence
>> I'm about to add some new '--destdir' option.
> 
> I don't quite understand yet what you're trying to do and why, but
> "--destdir" doesn't have the right ring to it for my ears.  If I'm not
> mistaken you want to strip the staging prefix from the database entry,
> which incidentally would be where a
> 
> make DESTDIR=/staging install
> 
> would have placed the files?

Exactly, the _rebase_ needs to be done while the files are in /staging,
but the database records need to not have the /staging part of course.
However, updating the _database_ can be done either while the files are
in /staging still, or when they are at their final location later on.
I just need to avoid performing a rebase on files in their final location.

For the moment, I'm doing the database update together with performing the
rebase in /staging, so I need to tell rebase.exe about "/staging" to strip
from the database record.  This boils down to:
$ find /staging -type f -name '*.dll' > files.list
$ rebase --database --filelist=files.list --destdir=/staging
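
The path handling this implies can be sketched as follows.  This is Python
for illustration only and is not the actual patch; '--destdir' is the
option proposed here, not an existing rebase option, and the function name
is invented.

```python
import posixpath


def strip_destdir(path, destdir):
    """Map a file path under destdir to the path recorded in the database."""
    destdir = posixpath.normpath(destdir)
    path = posixpath.normpath(path)
    if destdir == "/":
        return path
    if not path.startswith(destdir + "/"):
        raise ValueError(f"{path} is not below {destdir}")
    return path[len(destdir):]
```

For example, strip_destdir("/staging/usr/bin/cygfoo.dll", "/staging")
gives "/usr/bin/cygfoo.dll", which is where the DLL ends up after
installation.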

But I'm facing some fork problems now, where I need to investigate whether
they're related to my rebase step, before I can submit the patches.

If curious, see https://github.com/haubi/cygwin-rebase/commits/gentoo

Thanks!
/haubi/
PS: I've tried to submit the first two patches yesterday, but somehow the
mails didn't make it to the list.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-08 Thread Achim Gratz
Michael Haubenwallner writes:
> Well... once installed, a dll may get in use quickly, because I can not require
> to shut down all Cygwin processes.  So I need to rebase and register the dll in
> some staging directory before it is installed into its final directory, hence
> I'm about to add some new '--destdir' option.

I don't quite understand yet what you're trying to do and why, but
"--destdir" doesn't have the right ring to it for my ears.  If I'm not
mistaken you want to strip the staging prefix from the database entry,
which incidentally would be where a

make DESTDIR=/staging install

would have placed the files?

> When I install rebase right within Gentoo Prefix, the rebase db is stored there
> as well, to not cope with host Cygwin's rebase db.  Other than cygwin1.dll, no
> dll should be used by Gentoo Prefix binaries anyway (except during bootstrap).

Since cygwin1.dll is always at a fixed address anyway, then you don't
need to do anything extra, I think.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-08 Thread Brian Inglis
On 2019-04-08 06:38, Michael Haubenwallner wrote:
> On 4/3/19 2:26 PM, Corinna Vinschen wrote:
>> On Apr  3 12:38, Michael Haubenwallner wrote:
>>> Furthermore, with so called "Stacked Prefix", it is possible to have a second
>>> level of Gentoo Prefix, so what I'm after is some option to tell the rebase
>>> utility which database to record dll base addresses into, and which multiple(!)
>>> databases take into account while performing a rebase.
>> rebase is OSS.
> Yeah, and I have found the git repo so far.  But I'm wondering if distfiles like
> "cygwin-rebase-4.4.4.tar.bz2" are already available somewhere more persistent
> than via the current Cygwin distro source package "rebase-4.4.4-1-src.tar.xz"?

http://www.crouchingtigerhiddenfruitbat.org/Cygwin/timemachine.html

as long as that is around: dependent on health, and ability to fund, of one guy,
Peter Castro, like some of the software we use.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-08 Thread Michael Haubenwallner
On 4/3/19 2:28 PM, Achim Gratz wrote:
> Michael Haubenwallner writes:
>> Before I really can tell what I need regarding the rebase, I need to learn what
>> exactly is recorded into the rebase database, and probably how the recorded data
>> does influence the rebase procedure right now.
> 
> Just where the DLL resides in the filesystem, what address it has been
> rebased to and what size it occupies.  If you rebase a new DLL with the
> database, it will give you the first gap in the address space that this
> new DLL fits into for doing the rebase and record that into the
> database.  With the --oblivious option, it keeps the database file
> untouched, so the information about the newly rebased DLL gets lost
> when the program exits.  That's why you need to do all oblivious
> rebasing in a single invocation.

Ok, this does fit what I guessed, thanks!

> 
>> My thoughts so far for what I probably need:
>>
>> * First, rebase new dlls before being installed into the target file system
>> directory with respect to currently installed dlls (the --oblivious
>> option),
> 
> You always rebase after the install so that the path information is
> correct.  Pre-rebasing is useless.
> 
>> * Second, register new dlls just installed into the target file system
>> directory into the rebase database without performing a rebase, and
> 
> No, rebasing the installed DLL already does that.

Well... once installed, a dll may get in use quickly, because I can not require
to shut down all Cygwin processes.  So I need to rebase and register the dll in
some staging directory before it is installed into its final directory, hence
I'm about to add some new '--destdir' option.

>> * Third, unregister dlls being removed from the rebase database.
> 
> Rebase already removes any entries that are no longer accessible from
> the database.

Ah, nice.  So I don't need to care about rebased but then not installed ones.

>> Also, it may make sense to allow for reusing the base address of an installed
>> dll by its update replacement - while the old version dll still is in use and
>> the new version dll is in some temporary staging directory.
> 
> Rebase already re-uses the base-address if the path for the new DLL is
> the same and it still fits into the gap.

Ok.

> In general, however, that
> won't work when the size of any DLL changes.  You can ask for more
> guardband around each entry, but that doesn't actually solve the problem
> as it's only useful for the initial (full) rebase.
> 
>> As there may be multiple instances of Gentoo Prefix within one single operating
>> system instance, it does not make sense to record the dll's base addresses into
>> the rebase database of the underlying Cygwin instance in /etc, but still the
>> base addresses already recorded there should be respected when rebasing dlls
>> for within a particular Gentoo Prefix instance.
> 
> If you can limit the address space that's used by the Cygwin base
> system, I'd just give your Gentoo prefix installation its own address
> space and rebase it independently from the base system.  That probably
> requires some fooling around with the (currently hardcoded) rebase
> database files, but should otherwise just work.

When I install rebase right within Gentoo Prefix, the rebase db is stored there
as well, to not cope with host Cygwin's rebase db.  Other than cygwin1.dll, no
dll should be used by Gentoo Prefix binaries anyway (except during bootstrap).

>> Furthermore, with so called "Stacked Prefix", it is possible to have a second
>> level of Gentoo Prefix, so what I'm after is some option to tell the rebase
>> utility which database to record dll base addresses into, and which multiple(!)
>> databases take into account while performing a rebase.
> 
> I don't think you'll want to do that.

Indeed - at least not for the moment.

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-08 Thread Michael Haubenwallner
On 4/3/19 2:26 PM, Corinna Vinschen wrote:
> On Apr  3 12:38, Michael Haubenwallner wrote:
>> Furthermore, with so called "Stacked Prefix", it is possible to have a second
>> level of Gentoo Prefix, so what I'm after is some option to tell the rebase
>> utility which database to record dll base addresses into, and which multiple(!)
>> databases take into account while performing a rebase.
> 
> rebase is OSS.

Yeah, and I have found the git repo so far.  But I'm wondering if distfiles like
"cygwin-rebase-4.4.4.tar.bz2" are already available somewhere more persistent
than via the current Cygwin distro source package "rebase-4.4.4-1-src.tar.xz"?

> There's nothing keeping you from providing patches
> to make your scenario work ;)

I'm already testing some patch adding a '--destdir' option...

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-05 Thread Achim Gratz
E. Madison Bray writes:
> However, I can see how this could be inconvenient for some Python
> builds where you might have something within the setup.py script
> (which, when building Python extension modules, is still usually used)
> like (in pseudo-code):
>
> run_build_ext_command()
> import just_built_module
> # Use just_built_module to generate some files
> run_install_command()
>
> all within the same process.  One could work around this by modifying
> the setup.py to call `rebase` as a subprocess and that should work,
> but it would suck to have to make such extra considerations just for
> Cygwin, much less get some upstream project to accept that.

Well, Perl has hooks for platform specific code in ExtUtils and
Module::Install, so that takes care of 99% of the module builds out
there and they seem to have no trouble accepting it into their code as
long as you can demonstrate that it works and why it's there.  I won't
touch Python if I can avoid it, so I have no idea what they do; but
again, it would seem a glaring omission to not have _something_ that
caters to the runtime platform at least.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-05 Thread E. Madison Bray
On Thu, Mar 28, 2019 at 6:50 PM Achim Gratz wrote:
>
> Michael Haubenwallner writes:
> > It will not help for conflicts between dlls within a single package while this
> > package is built.  I'm thinking of python modules built within the python package
> > itself, where the just built modules are used within the very build process.  Not
> > sure if packages using local modules during build also do use fork then, though.
>
> It does help, that's the whole point.  But you will have to rebase all
> the in-processing DLL together, as the database will only have
> information on the installed DLL.  So if you build in stages, you'll
> need to do something like incremental autorebase does and collect all
> DLL into some file that you can then feed to
>
> rebase -sOT dlls_to_rebase
>
> That is slightly less convenient than using the database in persistent
> mode, but it is much less of a headache when you want to throw things
> away and start over since you don't need to worry about cruft in the
> database file.

That is essentially what I do for incremental builds; I keep
re-running rebase between stages with roughly those same flags and
this works.

However, I can see how this could be inconvenient for some Python
builds where you might have something within the setup.py script
(which, when building Python extension modules, is still usually used)
like (in pseudo-code):

run_build_ext_command()
import just_built_module
# Use just_built_module to generate some files
run_install_command()

all within the same process.  One could work around this by modifying
the setup.py to call `rebase` as a subprocess and that should work,
but it would suck to have to make such extra considerations just for
Cygwin, much less get some upstream project to accept that.
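
For what it's worth, such a workaround might look like the sketch below.
The helper names are invented for this example; the rebase flags are the
ones quoted above (-s, -O, -T), and the whole thing assumes extension
modules end in ".dll", as they do on Cygwin.

```python
import os
import subprocess
import tempfile


def find_dlls(build_dir):
    """Collect all just-built DLLs below build_dir."""
    found = []
    for root, _dirs, files in os.walk(build_dir):
        found += [os.path.join(root, n) for n in files
                  if n.lower().endswith(".dll")]
    return sorted(found)


def rebase_command(filelist):
    """Command line for an oblivious, database-aware rebase of a file list."""
    return ["rebase", "-s", "-O", "-T", filelist]


def rebase_built_dlls(build_dir, run=subprocess.check_call):
    """Rebase every DLL below build_dir in one single rebase invocation."""
    dlls = find_dlls(build_dir)
    if not dlls:
        return []
    with tempfile.NamedTemporaryFile("w", suffix=".list", delete=False) as f:
        f.write("\n".join(dlls) + "\n")
    try:
        run(rebase_command(f.name))
    finally:
        os.unlink(f.name)
    return dlls
```

In the pseudo-code above, the call would have to go between
run_build_ext_command() and the import, since a DLL that is already loaded
cannot be rebased.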

I don't know if what I described is at all similar to Michael's case,
and I've never run into a problem with this myself (even building
Numpy or SciPy).  But I could see it happening somehow...


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-03 Thread Achim Gratz
Michael Haubenwallner writes:
> Before I really can tell what I need regarding the rebase, I need to learn what
> exactly is recorded into the rebase database, and probably how the recorded data
> does influence the rebase procedure right now.

Just where the DLL resides in the filesystem, what address it has been
rebased to and what size it occupies.  If you rebase a new DLL with the
database, it will give you the first gap in the address space that this
new DLL fits into for doing the rebase and record that into the
database.  With the --oblivious option, it keeps the database file
untouched, so the information about the newly rebased DLL gets lost
when the program exits.  That's why you need to do all oblivious
rebasing in a single invocation.
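
As a rough model of that gap search, here is a simplified first-fit sketch
in Python.  It is not rebase's actual code: the search direction, the
alignment and the data layout of the real tool may differ, and the
function name is invented.

```python
def first_fit(entries, size, lo, hi, align=0x10000):
    """Return the lowest aligned base in [lo, hi) where size bytes fit.

    entries is a list of (base, size) pairs already recorded in the
    database.  Returns None if no gap is large enough.
    """
    def up(n):
        return (n + align - 1) // align * align

    candidate = up(lo)
    for base, length in sorted(entries):
        if candidate + size <= base:
            return candidate                     # fits in the gap before this entry
        candidate = max(candidate, up(base + length))
    return candidate if candidate + size <= hi else None
```

In this model the --oblivious case corresponds to a caller that never
appends the returned slot to entries, which is why a second, separate
invocation would hand out the same gap again.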

> My thoughts so far for what I probably need:
>
> * First, rebase new dlls before being installed into the target file system
> directory with respect to currently installed dlls (the --oblivious
> option),

You always rebase after the install so that the path information is
correct.  Pre-rebasing is useless.

> * Second, register new dlls just installed into the target file system
> directory into the rebase database without performing a rebase, and

No, rebasing the installed DLL already does that.

> * Third, unregister dlls being removed from the rebase database.

Rebase already removes any entries that are no longer accessible from
the database.

> Also, it may make sense to allow for reusing the base address of an installed
> dll by its update replacement - while the old version dll still is in use and
> the new version dll is in some temporary staging directory.

Rebase already re-uses the base-address if the path for the new DLL is
the same and it still fits into the gap.  In general, however, that
won't work when the size of any DLL changes.  You can ask for more
guardband around each entry, but that doesn't actually solve the problem
as it's only useful for the initial (full) rebase.

> As there may be multiple instances of Gentoo Prefix within one single operating
> system instance, it does not make sense to record the dll's base addresses into
> the rebase database of the underlying Cygwin instance in /etc, but still the
> base addresses already recorded there should be respected when rebasing dlls
> for within a particular Gentoo Prefix instance.

If you can limit the address space that's used by the Cygwin base
system, I'd just give your Gentoo prefix installation its own address
space and rebase it independently from the base system.  That probably
requires some fooling around with the (currently hardcoded) rebase
database files, but should otherwise just work.

> Furthermore, with so called "Stacked Prefix", it is possible to have a second
> level of Gentoo Prefix, so what I'm after is some option to tell the rebase
> utility which database to record dll base addresses into, and which multiple(!)
> databases take into account while performing a rebase.

I don't think you'll want to do that.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-03 Thread Corinna Vinschen
On Apr  3 12:38, Michael Haubenwallner wrote:
> Furthermore, with so called "Stacked Prefix", it is possible to have a second
> level of Gentoo Prefix, so what I'm after is some option to tell the rebase
> utility which database to record dll base addresses into, and which multiple(!)
> databases take into account while performing a rebase.

rebase is OSS.  There's nothing keeping you from providing patches
to make your scenario work ;)


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-03 Thread Corinna Vinschen
On Apr  3 11:18, Michael Haubenwallner wrote:
> On 4/1/19 5:56 PM, Corinna Vinschen wrote:
> > On Apr  1 16:56, Corinna Vinschen wrote:
> >> On Apr  1 16:28, Michael Haubenwallner wrote:
> >>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> >>>> can you please collect the base addresses of all DLLs generated during
> >>>> the build, plus their size and make a sorted list?  It would be
> >>>> interesting to know if the hash algorithm in ld is actually as bad
> >>>> as I conjecture.
> >>>
> >>> Please find attached the output of rebase -i for the dlls after bootstrap
> >>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> > 
> > Oh, wait.  That's not what I was looking for.  The addresses are ok, but
> > the paths *must* be the ones at the time the DLLs have been created,
> > because that's what ld uses when creating the image base addresses.  The
> > addresses combined with the installation paths don't make sense anymore.
> 
> So I have intercepted the ld.exe to show 'rebase -i' on any just created dll,
> tell about the exact -o argument to ld, and the current directory.
> 
> This is with binutils-2.31.1
> 
> Anything else needed?

No, that should be sufficient, thanks for collecting this!
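
A list of base addresses and sizes like the one requested earlier in this
thread can be checked for collisions with a few lines of Python.  This is
only a sketch; the (name, base, size) tuple layout is an assumption of the
example and not the output format of 'rebase -i'.

```python
def find_overlaps(dlls):
    """Return pairs of DLL names whose address ranges overlap.

    dlls is an iterable of (name, base, size) tuples.
    """
    ordered = sorted(dlls, key=lambda d: d[1])
    overlaps = []
    for i, (name, base, size) in enumerate(ordered):
        end = base + size
        for other_name, other_base, _ in ordered[i + 1:]:
            if other_base >= end:
                break                  # sorted by base: no later DLL can overlap
            overlaps.append((name, other_name))
    return overlaps
```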


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-03 Thread Michael Haubenwallner
Hi Brian, hi Achim,

Thanks a lot for your input!

On 3/30/19 5:09 PM, Brian Inglis wrote:
> On 2019-03-30 02:22, Achim Gratz wrote:
>> Brian Inglis writes:
>>> On 2019-03-29 14:23, Achim Gratz wrote:
>>>> Brian Inglis writes:
>> If you are packaging your own exes and dlls with your own local Cygwin distro,
>> you should point to your local utility directory with a path in a file under
>> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or perhaps
>> you might also need to add your own production exes and dlls into
>> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
>> /usr/share/doc/Cygwin/_autorebase.README.
>>>
>>> I was wondering as my first para above stated, whether rebase_user{,_exe} would
>>> be the proper place to add 3rd party Cygwin dlls and exes, that are distributed
>>> with Cygwin (internally)?
>>
>> Well, if you are distributing something (even just locally), then
>> preferably you make proper Cygwin packages and you will never have to
>> deal with rebase yourself.
>>
>> The options you allude to above are meant for cases where that just
>> isn't possible and so you install things without using setup and often
>> also outside the Cygwin install (permanently, not temporarily until it
>> gets packaged).  You still need to run setup after each change so
>> autorebase can pick up on it.
> 
> Thanks Achim,
> 
> I think that those are possibly the answers the OP Michael was looking for,
> depending on how they are using Gentoo Prefix: it did not seem like they were
> installing their dlls and exes using Cygwin setup, but they could still run
> autorebase under dash.

Beyond being portable across many operating systems (*nix, MacOS, Cygwin, ...),
one of the main goals for Gentoo Prefix is to provide its packaging mechanism
without the need for any privilege elevation on the underlying operating system,
nor coping with the various underlying operating system's packaging mechanisms.

On a side note:
To get it working as intended on Cygwin, I had to extend Cygwin fork() to allow
for updating dlls and executables while the process is running, as the Gentoo
Prefix package manager is a Cygwin program by itself - unlike Cygwin setup.exe,
which is a non-Cygwin executable and requires Cygwin processes to be terminated.

Before I really can tell what I need regarding the rebase, I need to learn what
exactly is recorded into the rebase database, and probably how the recorded data
does influence the rebase procedure right now.

My thoughts so far for what I probably need:

* First, rebase new dlls before being installed into the target file system
directory with respect to currently installed dlls (the --oblivious option),
* Second, register new dlls just installed into the target file system
directory into the rebase database without performing a rebase, and
* Third, unregister dlls being removed from the rebase database.

Also, it may make sense to allow for reusing the base address of an installed
dll by its update replacement - while the old version dll still is in use and
the new version dll is in some temporary staging directory.

As there may be multiple instances of Gentoo Prefix within one single operating
system instance, it does not make sense to record the dll's base addresses into
the rebase database of the underlying Cygwin instance in /etc, but still the
base addresses already recorded there should be respected when rebasing dlls
for within a particular Gentoo Prefix instance.

Furthermore, with so called "Stacked Prefix", it is possible to have a second
level of Gentoo Prefix, so what I'm after is some option to tell the rebase
utility which database to record dll base addresses into, and which multiple(!)
databases to take into account while performing a rebase.
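
To make the layering idea more concrete, here is a minimal sketch in Python.
It is not how the rebase utility works internally; it only illustrates picking
a base address for a new dll while treating several databases (host, prefix,
stacked prefix) as read-only input.  The function names, the (base, size)
tuples and the address range in the usage lines are hypothetical; the 64K
alignment is the usual DLL base alignment.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (base, size) of one already-placed dll

ALIGN = 0x10000  # DLL base addresses are aligned to 64K


def round_up(value: int, align: int = ALIGN) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def find_free_base(size: int, layers: Iterable[List[Range]],
                   start: int, end: int) -> int:
    """Return the lowest aligned base in [start, end) where size bytes fit.

    layers holds the entries of every database to respect (host first,
    then each prefix level); none of them is modified here.
    """
    occupied = sorted(r for layer in layers for r in layer)
    candidate = round_up(start)
    for base, used in occupied:
        if candidate + size <= base:
            break  # the gap in front of this entry is large enough
        candidate = max(candidate, round_up(base + used))
    if candidate + size > end:
        raise ValueError("no free slot left in the given range")
    return candidate


# Hypothetical data: one entry from the host database, one from a prefix.
host_db = [(0x100000, 0x20000)]
prefix_db = [(0x120000, 0x10000)]
print(hex(find_free_base(0x30000, [host_db, prefix_db], 0x100000, 0x200000)))
```

Recording the result would then go into the innermost database only, which is
the part the current rebase has no option for.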

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-03 Thread Michael Haubenwallner
On 4/1/19 5:56 PM, Corinna Vinschen wrote:
> On Apr  1 16:56, Corinna Vinschen wrote:
>> On Apr  1 16:28, Michael Haubenwallner wrote:
>>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
>>>> can you please collect the base addresses of all DLLs generated during
>>>> the build, plus their size and make a sorted list?  It would be
>>>> interesting to know if the hash algorithm in ld is actually as bad
>>>> as I conjecture.
>>>
>>> Please find attached the output of rebase -i for the dlls after bootstrap
>>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> 
> Oh, wait.  That's not what I was looking for.  The addresses are ok, but
> the paths *must* be the ones at the time the DLLs have been created,
> because that's what ld uses when creating the image base addresses.  The
> addresses combined with the installation paths don't make sense anymore.

So I have intercepted the ld.exe to show 'rebase -i' on any just created dll,
tell about the exact -o argument to ld, and the current directory.

This is with binutils-2.31.1

Anything else needed?

/haubi/


dll-info.txt.xz
Description: application/xz


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Brian Inglis
On 2019-04-01 10:31, Michael Haubenwallner wrote:
> 
> On 4/1/19 5:56 PM, Corinna Vinschen wrote:
>> On Apr  1 16:56, Corinna Vinschen wrote:
>>> On Apr  1 16:28, Michael Haubenwallner wrote:
>>>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
>>>>> can you please collect the base addresses of all DLLs generated during
>>>>> the build, plus their size and make a sorted list?  It would be
>>>>> interesting to know if the hash algorithm in ld is actually as bad
>>>>> as I conjecture.
>>>>
>>>> Please find attached the output of rebase -i for the dlls after bootstrap
>>>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
>>
>> Oh, wait.  That's not what I was looking for.  The addresses are ok, but
>> the paths *must* be the ones at the time the DLLs have been created,
>> because that's what ld uses when creating the image base addresses.
> 
> Maybe I can provide that one as well.
> 
>> The
>> addresses combined with the installation paths don't make sense anymore.
>>
>> Apart from that, since you seem to be installing the DLLs anyway, can't
>> you combine every crucial point during installation with a rebase?
> 
> This is what I'm after now, but I may need to introduce something like
> additional readonly databases plus some --unregister option to rebase.

Check my questions and Achim's answers in the other subthread for existing ways
to deal with your issues that are only semi-documented.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Michael Haubenwallner


On 4/1/19 5:56 PM, Corinna Vinschen wrote:
> On Apr  1 16:56, Corinna Vinschen wrote:
>> On Apr  1 16:28, Michael Haubenwallner wrote:
>>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
>>>> can you please collect the base addresses of all DLLs generated during
>>>> the build, plus their size and make a sorted list?  It would be
>>>> interesting to know if the hash algorithm in ld is actually as bad
>>>> as I conjecture.
>>>
>>> Please find attached the output of rebase -i for the dlls after bootstrap
>>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> 
> Oh, wait.  That's not what I was looking for.  The addresses are ok, but
> the paths *must* be the ones at the time the DLLs have been created,
> because that's what ld uses when creating the image base addresses.

Maybe I can provide that one as well.

> The
> addresses combined with the installation paths don't make sense anymore.
> 
> Apart from that, since you seem to be installing the DLLs anyway, can't
> you combine every crucial point during installation with a rebase?

This is what I'm after now, but I may need to introduce something like
additional readonly databases plus some --unregister option to rebase.

/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Corinna Vinschen
On Apr  1 16:56, Corinna Vinschen wrote:
> On Apr  1 16:28, Michael Haubenwallner wrote:
> > Hi Corinna,
> > 
> > On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> > > can you please collect the base addresses of all DLLs generated during
> > > the build, plus their size and make a sorted list?  It would be
> > > interesting to know if the hash algorithm in ld is actually as bad
> > > as I conjecture.
> > 
> > Please find attached the output of rebase -i for the dlls after bootstrap
> > on Cygwin 3.0.4, each built with ld from binutils-2.31.1.

Oh, wait.  That's not what I was looking for.  The addresses are ok, but
the paths *must* be the ones at the time the DLLs have been created,
because that's what ld uses when creating the image base addresses.  The
addresses combined with the installation paths don't make sense anymore.

Apart from that, since you seem to be installing the DLLs anyway, can't
you combine every crucial point during installation with a rebase?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer



Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Corinna Vinschen
On Apr  1 16:28, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> > can you please collect the base addresses of all DLLs generated during
> > the build, plus their size and make a sorted list?  It would be
> > interesting to know if the hash algorithm in ld is actually as bad
> > as I conjecture.
> 
> Please find attached the output of rebase -i for the dlls after bootstrap
> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> 
> > If we can improve on the distribution within the 8 Gigs area by changing
> > ld's address generation(*), we may improve situations like these without
> > too much hassle.  As always, not a foolproof way out, but heck, 8 Gigs
> > is a lot of space for a couple 100 DLLs.
> 
> Feels like I need some Cygwin rebase step in Gentoo Prefix anyway, as there
> are ~250 dlls right after bootstrap - without any application yet.

For comparison, I have 1835 system DLLs installed, and they only take
a bit less than 30% of the 8 Gigs.

I'm surprised to see 7 collisions, one of them even using the exact
same address.  So the hash algorithm might be improvable.

In hindsight, we also might have been better off with a bit more space
for DLLs than 8 + 8 Gigs, I guess, given the size of the 64 bit address
space.  We can still get to that by updating Cygwin, rebase and
binutils.  For instance, assuming 32 Gigs + 32 Gigs, rebased DLLs would
start at 0x2:, non-rebased DLLs would start at 0xa: and
the heap would start at 0x12:.  Still lots of room in the VM.
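
The arithmetic behind those three start addresses can be checked with a short
Python sketch.  It assumes that the "0xN:" notation used in this thread means
N in the upper 32 bits of a 64 bit address, and that the two DLL areas sit
back to back with the heap directly after them; the function names are only
illustrative.

```python
GIB = 1 << 30


def hi(n: int) -> int:
    """Assumed reading of the '0xN:' notation: N in the upper 32 bits."""
    return n << 32


def layout(rebased_start: int, area_size: int) -> dict:
    """Place two equally sized DLL areas back to back, heap right after."""
    unrebased_start = rebased_start + area_size
    heap_start = unrebased_start + area_size
    return {
        "rebased": rebased_start,
        "unrebased": unrebased_start,
        "heap": heap_start,
    }


def fmt(addr: int) -> str:
    """Format an address in the upper:lower style."""
    return f"0x{addr >> 32:x}:{addr & 0xffffffff:08x}"


for name, addr in layout(hi(0x2), 32 * GIB).items():
    print(f"{name:10s} {fmt(addr)}")
```

With 8 Gigs per area instead of 32, the same calculation puts the non-rebased
area at 0x4: and its end at 0x6:, which matches the range mentioned elsewhere
in this thread.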

However, that would probably not fix the exact collision between
usr/bin/cygncurses++6.dll and
usr/lib/python3.6/lib-dynload/_sha512.cpython-36m-x86_64-cygwin.dll


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer



Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Michael Haubenwallner
Hi Corinna,

On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> On Mar 28 12:48, Michael Haubenwallner wrote:
>> On 3/28/19 10:58 AM, Corinna Vinschen wrote:
>>> On Mar 28 10:17, Michael Haubenwallner wrote:
>>>> As it is not some other dll being loaded at the colliding address: any
>>>> idea how to find out _what_ is allocated there (in the forked child),
>>>> to find out whether we can reserve these areas even more early?
>>>
>>> I'm not sure what addresses you're talking about ATM.  The addresses in
>>> the 0x4: - 0x6: range?
>>
>> No, I'm thinking about the lower address that collides after relocation,
>> if there is some cygwin allocated object we may allocate later...
>>
>>> These are the interesting ones.
>>> The relocation to some random low address should only occur if there's
>>> a collision in this range.
>>
>> This should be easier to find out (by inspecting the loaded dlls).
> 
> can you please collect the base addresses of all DLLs generated during
> the build, plus their size and make a sorted list?  It would be
> interesting to know if the hash algorithm in ld is actually as bad
> as I conjecture.

Please find attached the output of rebase -i for the dlls after bootstrap
on Cygwin 3.0.4, each built with ld from binutils-2.31.1.

> 
> If we can improve on the distribution within the 8 Gigs area by changing
> ld's address generation(*), we may improve situations like these without
> too much hassle.  As always, not a foolproof way out, but heck, 8 Gigs
> is a lot of space for a couple 100 DLLs.

Feels like I need some Cygwin rebase step in Gentoo Prefix anyway, as there
are ~250 dlls right after bootstrap - without any application yet.

Thanks!
/haubi/


rebase-info.txt.xz
Description: application/xz


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-30 Thread Brian Inglis
On 2019-03-30 02:22, Achim Gratz wrote:
> Brian Inglis writes:
>> On 2019-03-29 14:23, Achim Gratz wrote:
>>> Brian Inglis writes:
>>>>> If you are packaging your own exes and dlls with your own local Cygwin
>>>>> distro,
>>>>> you should point to your local utility directory with a path in a file
>>>>> under
>>>>> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or
>>>>> perhaps
>>>>> you might also need to add your own production exes and dlls into
>>>>> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
>>>>> /usr/share/doc/Cygwin/_autorebase.README.
>>
>> I was wondering as my first para above stated, whether rebase_user{,_exe} 
>> would
>> be the proper place to add 3rd party Cygwin dlls and exes, that are 
>> distributed
>> with Cygwin (internally)?
> 
> Well, if you are distributing something (even just locally), then
> preferably you make proper Cygwin packages and you will never have to
> deal with rebase yourself.
> 
> The options you allude to above are meant for cases where that just
> isn't possible and so you install things without using setup and often
> also outside the Cygwin install (permanently, not temporarily until it
> gets packaged).  You still need to run setup after each change so
> autorebase can pick up on it.

Thanks Achim,

I think that those are possibly the answers the OP Michael was looking for,
depending on how they are using Gentoo Prefix: it did not seem like they were
installing their dlls and exes using Cygwin setup, but they could still run
autorebase under dash.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-30 Thread Achim Gratz
Brian Inglis writes:
> On 2019-03-29 14:23, Achim Gratz wrote:
>> Brian Inglis writes:
>>>> If you are packaging your own exes and dlls with your own local Cygwin
>>>> distro,
>>>> you should point to your local utility directory with a path in a file
>>>> under
>>>> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or
>>>> perhaps
>>>> you might also need to add your own production exes and dlls into
>>>> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
>>>> /usr/share/doc/Cygwin/_autorebase.README.
>
> I was wondering as my first para above stated, whether rebase_user{,_exe} 
> would
> be the proper place to add 3rd party Cygwin dlls and exes, that are 
> distributed
> with Cygwin (internally)?

Well, if you are distributing something (even just locally), then
preferably you make proper Cygwin packages and you will never have to
deal with rebase yourself.

The options you allude to above are meant for cases where that just
isn't possible and so you install things without using setup and often
also outside the Cygwin install (permanently, not temporarily until it
gets packaged).  You still need to run setup after each change so
autorebase can pick up on it.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-29 Thread Brian Inglis
On 2019-03-29 14:23, Achim Gratz wrote:
> Brian Inglis writes:
>>> If you are packaging your own exes and dlls with your own local Cygwin 
>>> distro,
>>> you should point to your local utility directory with a path in a file under
>>> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or 
>>> perhaps
>>> you might also need to add your own production exes and dlls into
>>> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
>>> /usr/share/doc/Cygwin/_autorebase.README.

>> Achim, thanks for the clarifications; could you please comment on the 
>> suggested
>> approach for handling local production dlls and exes, or explain the best
>> approach for migrating from test to prod and handling rebase on target 
>> systems?

> I'm not quite sure what you want to know.  As I said before oblivious
> rebase was invented for running tests that use freshly built DLL (I
> usually package them before running the tests, so the package will have
> the un-rebased DLL from before the test was run).  For this it suffices
> to simply feed in all new DLL names to rebase.  If you were to build in
> stages and/or combine different builds then you'd somehow have to
> remember the DLL from each stage or build, or just collect all the DLL
> names again each time you change something.  The important thing is that
> each oblivious rebase needs to get the list of _all_ DLL that need to
> get rebased, since the database only knows about the host system
> (i.e. you can't rebase incrementally with --oblivious).

I was wondering as my first para above stated, whether rebase_user{,_exe} would
be the proper place to add 3rd party Cygwin dlls and exes, that are distributed
with Cygwin (internally)?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-29 Thread Achim Gratz
Brian Inglis writes:
> Achim, thanks for the clarifications; could you please comment on the 
> suggested
> approach for handling local production dlls and exes, or explain the best
> approach for migrating from test to prod and handling rebase on target 
> systems?

I'm not quite sure what you want to know.  As I said before oblivious
rebase was invented for running tests that use freshly built DLL (I
usually package them before running the tests, so the package will have
the un-rebased DLL from before the test was run).  For this it suffices
to simply feed in all new DLL names to rebase.  If you were to build in
stages and/or combine different builds then you'd somehow have to
remember the DLL from each stage or build, or just collect all the DLL
names again each time you change something.  The important thing is that
each oblivious rebase needs to get the list of _all_ DLL that need to
get rebased, since the database only knows about the host system
(i.e. you can't rebase incrementally with --oblivious).


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-29 Thread Brian Inglis
On 2019-03-29 01:15, Achim Gratz wrote:
> Brian Inglis writes:
>> File list my-dlls.txt is your local test rebase db listing all your
>> test dlls.
> 
> I think Michael got confused by your usage of "db" here.  This is in
> fact just a listing of all the DLL to operate on, not the rebase
> database (which won't be changed at all by an oblivious rebase, only
> read in order to not collide the new rebase with the already existing
> ones).
> 
>> If you are packaging your own exes and dlls with your own local Cygwin 
>> distro,
>> you should point to your local utility directory with a path in a file under
>> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or 
>> perhaps
>> you might also need to add your own production exes and dlls into
>> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
>> /usr/share/doc/Cygwin/_autorebase.README.
> 
> What Michael is using is a fairly complex build system that would indeed
> benefit from a layered rebase database, i.e. the one for the base system
> providing the substrate for the build system and then at least one other
> one that collects the information from inside the build system (maybe
> even a third layer for tests).  How to deal with the complexities of
> when you want to push information down to a previous layer would likely
> be a main point of contention, so you'd probably best skip it in the
> beginning.
> 
> SHTDI, PTC, etc.pp.
> 
> With the current rebase, you'll have to use "--oblivious" (which, again,
> doesn't remember any data for the newly rebased objects) and those
> non-existing upper layers will have to be provided by side-channel
> information that the build system has to collect and maintain itself,
> then feed to the rebase command.

Achim, thanks for the clarifications; could you please comment on the suggested
approach for handling local production dlls and exes, or explain the best
approach for migrating from test to prod and handling rebase on target systems?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-29 Thread Achim Gratz
Brian Inglis writes:
> File list my-dlls.txt is your local test rebase db listing all your
> test dlls.

I think Michael got confused by your usage of "db" here.  This is in
fact just a listing of all the DLL to operate on, not the rebase
database (which won't be changed at all by an oblivious rebase, only
read in order to not collide the new rebase with the already existing
ones).

> If you are packaging your own exes and dlls with your own local Cygwin distro,
> you should point to your local utility directory with a path in a file under
> /var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or perhaps
> you might also need to add your own production exes and dlls into
> /var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
> /usr/share/doc/Cygwin/_autorebase.README.

What Michael is using is a fairly complex build system that would indeed
benefit from a layered rebase database, i.e. the one for the base system
providing the substrate for the build system and then at least one other
one that collects the information from inside the build system (maybe
even a third layer for tests).  How to deal with the complexities of
when you want to push information down to a previous layer would likely
be a main point of contention, so you'd probably best skip it in the
beginning.

SHTDI, PTC, etc.pp.

With the current rebase, you'll have to use "--oblivious" (which, again,
doesn't remember any data for the newly rebased objects) and those
non-existing upper layers will have to be provided by side-channel
information that the build system has to collect and maintain itself,
then feed to the rebase command.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Brian Inglis
On 2019-03-28 10:48, Michael Haubenwallner wrote:
> On 3/28/19 4:19 PM, Brian Inglis wrote:
>> On 2019-03-28 08:59, Michael Haubenwallner wrote:
>>> On 3/27/19 8:59 PM, Achim Gratz wrote:
>>>> Michael Haubenwallner writes:
>>>>> As far as I understand, rebasing is about touching already installed
>>>>> dlls as well, which would require to restart all Cygwin processes.
>>>>> As the problem is about some dll built during a larger build job,
>>>>> this is not something that feels useful to me.
>>>>
>>>> That's exactly why I introduced the "--oblivious" option several years
>>>> ago.  It'll let you rebase a set of DLL while benefitting from the
>>>> rebase database, but not recording them there, so if you later install
>>>> them properly there will be no collision.  I needed this for testing
>>>> newly compiled Perl XS modules, but you seem to have a similar use case.
>>>
>>> What I can see so far is that right now there is only one single rebase
>>> database, in /etc/rebase.db..
>>>
>>> However, my 'installed' dlls are not put into /bin, but into the so called
>>> Gentoo "Prefix", e.g. /home/haubi/test-20190327/gentoo-prefix/usr/bin for
>>> example.  Remember that there can be multiple independent instances of 
>>> Gentoo
>>> Prefix, so recording them all into the host /etc/rebase.db is not an option.
>>>
>>> Hence there should be a rebase database per Gentoo Prefix instance, like
>>> /home/haubi/test-20190327/gentoo-prefix/etc/rebase.db., to record
>>> my 'installed' dlls, while still loading the /etc/rebase.db. to avoid
>>> conflicts with cygwin provided dlls.
>>>
>>> And how would one explicitly remove specific entries from the rebase 
>>> database
>>> when dlls get uninstalled (by either package remove or package upgrade)?
>>
>> Using rebase -O, --oblivious with -T, --filelist local-test-rebase-db gives 
>> you
>> your own local test rebase db - just add all your test dlls into it (sort -u 
>> to
>> eliminate dups).
> Sounds interesting... but something I must be doing wrong here:
> $ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
> local-test-rebase-db: skipped because nonexistent.
> $ touch local-test-rebase-db
> $ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
> local-test-rebase-db: skipped because not rebaseable
> $ cp /etc/rebase.db.x86_64 local-test-rebase-db
> $ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
> local-test-rebase-db: skipped because not rebaseable
> It doesn't want to create or update the local-test-rebase-db file...

File list my-dlls.txt is your local test rebase db listing all your test dlls.
If you are packaging your own exes and dlls with your own local Cygwin distro,
you should point to your local utility directory with a path in a file under
/var/lib/rebase/user.d/$USER for each Cygwin userid on each system, or perhaps
you might also need to add your own production exes and dlls into
/var/cache/rebase/rebase_user and /var/cache/rebase/rebase_user_exe: see
/usr/share/doc/Cygwin/_autorebase.README.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Corinna Vinschen
Michael,

On Mar 28 12:48, Michael Haubenwallner wrote:
> On 3/28/19 10:58 AM, Corinna Vinschen wrote:
> > On Mar 28 10:17, Michael Haubenwallner wrote:
> >> As it is not some other dll being loaded at the colliding address: any
> >> idea how to find out _what_ is allocated there (in the forked child),
> >> to find out whether we can reserve these areas even more early?
> > 
> > I'm not sure what addresses you're talking about ATM.  The addresses in
> > the 0x4: - 0x6: range?
> 
> No, I'm thinking about the lower address that collides after relocation,
> if there is some cygwin allocated object we may allocate later...
> 
> > These are the interesting ones.
> > The relocation to some random low address should only occur if there's
> > a collision in this range.
> 
> This should be easier to find out (by inspecting the loaded dlls).

can you please collect the base addresses of all DLLs generated during
the build, plus their size and make a sorted list?  It would be
interesting to know if the hash algorithm in ld is actually as bad
as I conjecture.

If we can improve on the distribution within the 8 Gigs area by changing
ld's address generation(*), we may improve situations like these without
too much hassle.  As always, not a foolproof way out, but heck, 8 Gigs
is a lot of space for a couple 100 DLLs.


Corinna

(*) Maybe even a RNG is better than a hash here...

-- 
Corinna Vinschen
Cygwin Maintainer



Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Achim Gratz
Michael Haubenwallner writes:
> It will not help for conflicts between dlls within a single package while this
> package is built.  I'm thinking of python modules built within the python 
> package
> itself, where the just built modules are used within the very build process.  
> Not
> sure if packages using local modules during build also do use fork then, 
> though.

It does help, that's the whole point.  But you will have to rebase all
the in-processing DLL together, as the database will only have
information on the installed DLL.  So if you build in stages, you'll
need to do something like incremental autorebase does and collect all
DLL into some file that you can then feed to

rebase -sOT dlls_to_rebase

That is slightly less convenient than using the database in persistent
mode, but it is much less of a headache when you want to throw things
away and start over since you don't need to worry about cruft in the
database file.
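
A minimal Python sketch of the "collect all DLL into some file" step might
look as follows.  It assumes the build stages leave their DLLs under a known
set of directories; the function name and the directory layout are made up for
illustration.  The list is rebuilt from scratch on every call, because each
oblivious rebase needs the names of all in-process DLLs, not only the newest.

```python
from pathlib import Path
from typing import Iterable


def write_rebase_filelist(build_dirs: Iterable[str], filelist: str) -> int:
    """Write every *.dll found under build_dirs to filelist, one per line.

    The entries are sorted and de-duplicated, so passing the same stage
    directory twice is harmless.  Returns the number of DLLs written.
    """
    dlls = sorted({str(p) for d in build_dirs for p in Path(d).rglob("*.dll")})
    Path(filelist).write_text("".join(f"{name}\n" for name in dlls))
    return len(dlls)
```

The resulting file is what gets passed to the rebase -sOT command shown above.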


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner
On 3/28/19 4:19 PM, Brian Inglis wrote:
> On 2019-03-28 08:59, Michael Haubenwallner wrote:
>> On 3/27/19 8:59 PM, Achim Gratz wrote:
>>> Michael Haubenwallner writes:
>>>> As far as I understand, rebasing is about touching already installed
>>>> dlls as well, which would require to restart all Cygwin processes.
>>>> As the problem is about some dll built during a larger build job,
>>>> this is not something that feels useful to me.
>>>
>>> That's exactly why I introduced the "--oblivious" option several years
>>> ago.  It'll let you rebase a set of DLL while benefitting from the
>>> rebase database, but not recording them there, so if you later install
>>> them properly there will be no collision.  I needed this for testing
>>> newly compiled Perl XS modules, but you seem to have a similar use case.
>>
>> What I can see so far is that right now there is only one single rebase
>> database, in /etc/rebase.db..
>>
>> However, my 'installed' dlls are not put into /bin, but into the so called
>> Gentoo "Prefix", e.g. /home/haubi/test-20190327/gentoo-prefix/usr/bin for
>> example.  Remember that there can be multiple independent instances of Gentoo
>> Prefix, so recording them all into the host /etc/rebase.db is not an option.
>>
>> Hence there should be a rebase database per Gentoo Prefix instance, like
>> /home/haubi/test-20190327/gentoo-prefix/etc/rebase.db., to record
>> my 'installed' dlls, while still loading the /etc/rebase.db. to avoid
>> conflicts with cygwin provided dlls.
>>
>> And how would one explicitly remove specific entries from the rebase database
>> when dlls get uninstalled (by either package remove or package upgrade)?
> 
> Using rebase -O, --oblivious with -T, --filelist local-test-rebase-db gives 
> you
> your own local test rebase db - just add all your test dlls into it (sort -u 
> to
> eliminate dups).

Sounds interesting... but something I must be doing wrong here:


$ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
local-test-rebase-db: skipped because nonexistent.

$ touch local-test-rebase-db

$ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
local-test-rebase-db: skipped because not rebaseable

$ cp /etc/rebase.db.x86_64 local-test-rebase-db

$ rebase --oblivious --filelist=my-dlls.txt local-test-rebase-db
local-test-rebase-db: skipped because not rebaseable


It doesn't want to create or update the local-test-rebase-db file...

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Corinna Vinschen
On Mar 28 16:02, Michael Haubenwallner wrote:
> On 3/28/19 10:15 AM, Corinna Vinschen wrote:
> > On Mar 28 09:34, Michael Haubenwallner wrote:
> >> Hi Corinna,
> >>
> >> On 3/27/19 10:16 AM, Corinna Vinschen wrote:
> >>> On Mar 27 09:26, Michael Haubenwallner wrote:
> >>>> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
> >>> Wait, let me understand what's going on.  IIUC you're building DLLs
> >>> which are then used during the build job itself, right?
> >>
> >> Exactly.
> >> FWIW, the CI builds also set up a Cygwin instance from scratch,
> >> as I'm also after testing Cygwin (v3) itself to some degree:
> >> https://dev.azure.com/gentoo-prefix/ci-builds/_build
> >>
> >> However, I've not found a commandline option for setup.exe to install
> >> "test" versions...
> >>
> >>> As you know, 64 bit has a defined memory layout.  Binutils ld is
> >>> supposed to base the DLLs to a pseudo-random address in the area between
> >>> 0x4: and 0x6:.  This area is occupied by un-rebased DLLs
> >>> only.  8 Gigs is a *lot* of space for DLLs.
> >>>
> >>> That also means that the DLLs should not at all collide with windows
> >>> objects (typically reserved in the lesser 2 Gigs area), unless they
> >>> collide with themselves.  At least that's the idea.
> >>>
> >>> Can you check what addresses the freshly built DLLs are based on by LD?
> >>> Is there a chance that the algorithm used in LD is too dumb?
> >>
> >> I've also added system_printf to dll_list::reserve_space() when a dynloaded
> >> dll was relocated, and each new address was below 0x0:0100. The attached
> >> output also contains the preferred address, above 0x4: each.
> > 
> > Do they actually collide with each other?  Did you check the addresses?
> 
> Yes, there is a real collision between installed dlls:
> $ rebase -i /home/haubi/test-20190327/gentoo-prefix/usr/bin/cygcrypto-1.1.dll /home/haubi/test-20190327/gentoo-prefix/usr/lib/python2.7/lib-dynload/_locale.dll
> /home/haubi/test-20190327/gentoo-prefix/usr/bin/cygcrypto-1.1.dll base 0x00041c65 size 0x0027d000 *
> /home/haubi/test-20190327/gentoo-prefix/usr/lib/python2.7/lib-dynload/_locale.dll base 0x00041c6a size 0x0002c000 *

Oh well, it would be nice if ld's hash algorithm would spread out DLLs
better in the 8 Gigs space.

> Is the cygwin1.dll from master branch available via setup.exe cmdline somehow?

No, only from the snapshot page.  I'll release 3.0.5 soon, but 3.1 will
be dev-only for a while.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Brian Inglis
On 2019-03-28 08:59, Michael Haubenwallner wrote:
> Hi Achim,
> 
> On 3/27/19 8:59 PM, Achim Gratz wrote:
>> Michael Haubenwallner writes:
>>> As far as I understand, rebasing is about touching already installed
>>> dlls as well, which would require to restart all Cygwin processes.
>>> As the problem is about some dll built during a larger build job,
>>> this is not something that feels useful to me.
>>
>> That's exactly why I introduced the "--oblivious" option several years
>> ago.  It'll let you rebase a set of DLL while benefitting from the
>> rebase database, but not recording them there, so if you later install
>> them properly there will be no collision.  I needed this for testing
>> newly compiled Perl XS modules, but you seem to have a similar use case.
> 
> What I can see so far is that right now there is only one single rebase
> database, in /etc/rebase.db..
> 
> However, my 'installed' dlls are not put into /bin, but into the so called
> Gentoo "Prefix", e.g. /home/haubi/test-20190327/gentoo-prefix/usr/bin for
> example.  Remember that there can be multiple independent instances of Gentoo
> Prefix, so recording them all into the host /etc/rebase.db is not an option.
> 
> Hence there should be a rebase database per Gentoo Prefix instance, like
> /home/haubi/test-20190327/gentoo-prefix/etc/rebase.db., to record
> my 'installed' dlls, while still loading the /etc/rebase.db. to avoid
> conflicts with cygwin provided dlls.
> 
> And how would one explicitly remove specific entries from the rebase database
> when dlls get uninstalled (by either package remove or package upgrade)?

Using rebase -O, --oblivious with -T, --filelist local-test-rebase-db gives you
your own local test rebase db - just add all your test dlls into it (sort -u to
eliminate dups).

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner
On 3/28/19 10:15 AM, Corinna Vinschen wrote:
> On Mar 28 09:34, Michael Haubenwallner wrote:
>> Hi Corinna,
>>
>> On 3/27/19 10:16 AM, Corinna Vinschen wrote:
>>> On Mar 27 09:26, Michael Haubenwallner wrote:
>>>> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
>>> Wait, let me understand what's going on.  IIUC you're building DLLs
>>> which are then used during the build job itself, right?
>>
>> Exactly.
>> FWIW, the CI builds also set up a Cygwin instance from scratch,
>> as I'm also after testing Cygwin (v3) itself to some degree:
>> https://dev.azure.com/gentoo-prefix/ci-builds/_build
>>
>> However, I've not found a commandline option for setup.exe to install
>> "test" versions...
>>
>>> As you know, 64 bit has a defined memory layout.  Binutils ld is
>>> supposed to base the DLLs to a pseudo-random address in the area between
>>> 0x4: and 0x6:.  This area is occupied by un-rebased DLLs
>>> only.  8 Gigs is a *lot* of space for DLLs.
>>>
>>> That also means that the DLLs should not at all collide with windows
>>> objects (typically reserved in the lesser 2 Gigs area), unless they
>>> collide with themselves.  At least that's the idea.
>>>
>>> Can you check what addresses the freshly built DLLs are based on by LD?
>>> Is there a chance that the algorithm used in LD is too dumb?
>>
>> I've also added system_printf to dll_list::reserve_space() when a dynloaded
>> dll was relocated, and each new address was below 0x0:0100. The attached
>> output also contains the preferred address, above 0x4: each.
> 
> Do they actually collide with each other?  Did you check the addresses?

Yes, there is a real collision between installed dlls:
$ rebase -i /home/haubi/test-20190327/gentoo-prefix/usr/bin/cygcrypto-1.1.dll /home/haubi/test-20190327/gentoo-prefix/usr/lib/python2.7/lib-dynload/_locale.dll
/home/haubi/test-20190327/gentoo-prefix/usr/bin/cygcrypto-1.1.dll base 0x00041c65 size 0x0027d000 *
/home/haubi/test-20190327/gentoo-prefix/usr/lib/python2.7/lib-dynload/_locale.dll base 0x00041c6a size 0x0002c000 *
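
For reference, a collision like the one shown above can be checked by hand
from the base and size columns.  This is only a minimal sketch, assuming each
image occupies the half-open range [base, base + size).  The Image struct and
overlaps() are illustrative names and not part of rebase; the sizes
0x0027d000 and 0x0002c000 come from the output above, while any base
addresses fed to it are up to the caller.

```cpp
#include <cassert>
#include <cstdint>

// One loaded image: preferred base address and mapped size.
struct Image {
  std::uint64_t base;
  std::uint64_t size;
};

// Two half-open ranges [base, base + size) overlap exactly when each
// one starts before the other one ends.
bool overlaps(const Image &a, const Image &b) {
  assert(a.size != 0 && b.size != 0);
  return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

With a first image of size 0x0027d000, a second image whose base lies less
than 0x0027d000 above the first base overlaps it, no matter how small the
second image is.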

> 
> There must be collisions in your case.  Can you please check if
> Achim's solution works for you?

The flexibility of rebase regarding multiple rebase databases does not seem to be
there yet, but in theory this can help to avoid conflicts between dlls finally
*installed*.

It will not help for conflicts between dlls within a single package while this
package is built.  I'm thinking of python modules built within the python package
itself, where the just built modules are used within the very build process.  Not
sure if packages using local modules during build also do use fork then, though.

> 
> In the meantime I pushed your patch to the master branch (but not
> yet to the 3.0 branch).

Is the cygwin1.dll from master branch available via setup.exe cmdline somehow?

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner
Hi Achim,

On 3/27/19 8:59 PM, Achim Gratz wrote:
> Michael Haubenwallner writes:
>> As far as I understand, rebasing is about touching already installed
>> dlls as well, which would require to restart all Cygwin processes.
>> As the problem is about some dll built during a larger build job,
>> this is not something that feels useful to me.
> 
> That's exactly why I introduced the "--oblivious" option several years
> ago.  It'll let you rebase a set of DLL while benefitting from the
> rebase database, but not recording them there, so if you later install
> them properly there will be no collision.  I needed this for testing
> newly compiled Perl XS modules, but you seem to have a similar use case.

What I can see so far is that right now there is only one single rebase
database, in /etc/rebase.db..

However, my 'installed' dlls are not put into /bin, but into the so called
Gentoo "Prefix", e.g. /home/haubi/test-20190327/gentoo-prefix/usr/bin for
example.  Remember that there can be multiple independent instances of Gentoo
Prefix, so recording them all into the host /etc/rebase.db is not an option.

Hence there should be a rebase database per Gentoo Prefix instance, like
/home/haubi/test-20190327/gentoo-prefix/etc/rebase.db., to record
my 'installed' dlls, while still loading the /etc/rebase.db. to avoid
conflicts with cygwin provided dlls.
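
To illustrate the idea of consulting two databases at once, here is a rough
sketch.  It is not how rebase works internally: Entry, round_up() and
first_free_base() are hypothetical names, the 64 KiB alignment is an
assumption, and the search simply scans upward for the first gap that avoids
every entry from both the host list and the per-prefix list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One recorded dll: base address and size (hypothetical db entry).
struct Entry {
  std::uint64_t base;
  std::uint64_t size;
};

constexpr std::uint64_t kAlign = 0x10000;  // 64 KiB alignment, assumed

std::uint64_t round_up(std::uint64_t v) {
  return (v + kAlign - 1) / kAlign * kAlign;
}

// Return the lowest aligned base in [lo, hi) where `size` bytes fit
// without touching any host or prefix entry, or 0 if there is none.
std::uint64_t first_free_base(std::vector<Entry> host,
                              const std::vector<Entry> &prefix,
                              std::uint64_t lo, std::uint64_t hi,
                              std::uint64_t size) {
  assert(lo < hi && size != 0);
  // Merge both databases and walk them in ascending base order.
  host.insert(host.end(), prefix.begin(), prefix.end());
  std::sort(host.begin(), host.end(),
            [](const Entry &a, const Entry &b) { return a.base < b.base; });
  std::uint64_t candidate = round_up(lo);
  for (const Entry &e : host) {
    if (candidate + size <= e.base)
      break;  // the gap in front of this entry is large enough
    candidate = std::max(candidate, round_up(e.base + e.size));
  }
  return candidate + size <= hi ? candidate : 0;
}
```

Removing a dll from the per-prefix list then frees its range for the next
search, without touching the host list.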

And how would one explicitly remove specific entries from the rebase database
when dlls get uninstalled (by either package remove or package upgrade)?

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Corinna Vinschen
On Mar 28 12:48, Michael Haubenwallner wrote:
> On 3/28/19 10:58 AM, Corinna Vinschen wrote:
> > On Mar 28 10:17, Michael Haubenwallner wrote:
> >> As it is not some other dll being loaded at the colliding address: any
> >> idea how to find out _what_ is allocated there (in the forked child),
> >> to find out whether we can reserve these areas even more early?
> > 
> > I'm not sure what addresses you're talking about ATM.  The addresses in
> > the 0x4: - 0x6: range?
> 
> No, I'm thinking about the lower address that collides after relocation,
> if there is some cygwin allocated object we may allocate later...
> 
> > These are the interesting ones.
> > The relocation to some random low address should only occur if there's
> > a collision in this range.
> 
> This should be easier to find out (by inspecting the loaded dlls).
> 
> > 
> > I'm not quite sure how to find out what happens, unless you stop the
> > process in reserve_space and inspect the memory layout with sysinternal's
> > vmmap tool:
> > 
> > https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap
> 
> Maybe I will try that one - thanks for the pointer!
> 
> Are you about to apply the patch?

https://sourceware.org/ml/cygwin-patches/2019-q1/msg00061.html


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner
On 3/28/19 10:58 AM, Corinna Vinschen wrote:
> On Mar 28 10:17, Michael Haubenwallner wrote:
>> As it is not some other dll being loaded at the colliding address: any
>> idea how to find out _what_ is allocated there (in the forked child),
>> to find out whether we can reserve these areas even more early?
> 
> I'm not sure what addresses you're talking about ATM.  The addresses in
> the 0x4: - 0x6: range?

No, I'm thinking about the lower address that collides after relocation,
if there is some cygwin allocated object we may allocate later...

> These are the interesting ones.
> The relocation to some random low address should only occur if there's
> a collision in this range.

This should be easier to find out (by inspecting the loaded dlls).

> 
> I'm not quite sure how to find out what happens, unless you stop the
> process in reserve_space and inspect the memory layout with sysinternal's
> vmmap tool:
> 
> https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap

Maybe I will try that one - thanks for the pointer!

Are you about to apply the patch?

Thanks!
/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Corinna Vinschen
On Mar 28 10:17, Michael Haubenwallner wrote:
> As it is not some other dll being loaded at the colliding address: any
> idea how to find out _what_ is allocated there (in the forked child),
> to find out whether we can reserve these areas even more early?

I'm not sure what addresses you're talking about ATM.  The addresses in
the 0x4: - 0x6: range?  These are the interesting ones.
The relocation to some random low address should only occur if there's
a collision in this range.

I'm not quite sure how to find out what happens, unless you stop the
process in reserve_space and inspect the memory layout with sysinternal's
vmmap tool:

https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner


On 3/28/19 9:34 AM, Michael Haubenwallner wrote:
> On 3/27/19 10:16 AM, Corinna Vinschen wrote:
>> On Mar 27 09:26, Michael Haubenwallner wrote:
>>> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
>>>> On Mar 26 19:25, Corinna Vinschen wrote:
>>>>> On Mar 26 18:10, Michael Haubenwallner wrote:
>>>>>> Hi Corinna,

>>>>>>
>>>> Btw., is that 32 or 64 bit?  Both?
>>>
>>> I'm on 64bit only, can't say for 32bit.  And while in theory possible,
>>> I'm not after supporting 32bit Cygwin in Gentoo Prefix at all...
>>
>> If so, then I'm really curious how many DLLs are affected and why this
>> occurs on 64 bit.
>>
>> As you know, 64 bit has a defined memory layout.  Binutils ld is
>> supposed to base the DLLs to a pseudo-random address in the area between
>> 0x4: and 0x6:.  This area is occupied by un-rebased DLLs
>> only.  8 Gigs is a *lot* of space for DLLs.
>>
>> That also means that the DLLs should not at all collide with windows
>> objects (typically reserved in the lesser 2 Gigs area), unless they
>> collide with themselves.  At least that's the idea.
>>
>> Can you check what addresses the freshly built DLLs are based on by LD?
>> Is there a chance that the algorithm used in LD is too dumb?
> 
> I've also added system_printf to dll_list::reserve_space() when a dynloaded
> dll was relocated, and each new address was below 0x0:0100. The attached
> output also contains the preferred address, above 0x4: each.
> 
>>
>> Or, hmm.  Is there a chance that newer Windows loads dynamically loaded
>> DLLs wherever it likes, ignoring the base address, ASLR-like, even
>> if the DLL is marked as non-ASLR-aware?  But then again, we should have
>> a lot more complaints on the list...
> 
> I've done this test on Windows Server 2012R2, but the problem exists on
> 2016 and 2019 as well (I'm not testing with other Windows versions).
> 
>>>>>>  I'm
>>>>>> coming up with attached patch.
>>>>>>
>>>>>> What do you think about it?
>>>>>
>>>>> I'm not opposed to this patch but I don't quite follow the description.
>>>>> threadinterface->Init only creates three event objects.  From what I can
>>>>> tell, Events are stored in Paged and Nonpaged Pools, so they don't
>>>>> affect the processes VM.  What am I missing?
>>>
>>> Honestly, I'm not completely sure whether this patch really does help:
>>> Beyond the Events, there also is CreateNamedPipe and CreateFile used
>>> in fhandler_pipe::create via sigproc_init, and these causing the address
>>> conflicts with some dll actually is nothing more than a wild guess:
>>> While their returned handles are below the conflicting dll address,
>>> who can tell what these API calls do allocate internally?
>>
>> The handles are not addresses.  If the sigproc_init stuff collides,
>> I only see two chances for that, the process-local read/write buffers
>> of the signal pipe, and the stack of the read_sig thread.
>>
>> If this patch helps your situation, we can pull it in and test it,
>> but I think your situation asks for more debugging along the lines
>> of the DLL rebasing above.
> 
> With this patch collisions seem gone, yet the relocations do happen.

Ehm... collisions still do happen, but less often at least,
so this patch does help in my situation.

As it is not some other dll being loaded at the colliding address: any
idea how to find out _what_ is allocated there (in the forked child),
to find out whether we can reserve these areas even more early?

What if we adapt the initial dlopen call to disallow relocation into
such low address space?

Beyond that, I'm going to learn about rebase --oblivious, thanks Achim!

/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-28 Thread Michael Haubenwallner
Hi Corinna,

On 3/27/19 10:16 AM, Corinna Vinschen wrote:
> On Mar 27 09:26, Michael Haubenwallner wrote:
>> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
>>> On Mar 26 19:25, Corinna Vinschen wrote:
>>>> On Mar 26 18:10, Michael Haubenwallner wrote:
>>>>> Hi Corinna,
>>>>>
>>>>> as I do still encounter fork errors (address space needed by  is
>>>>> already occupied) with dynamically loaded dlls (but unrelated to
>>>>> replaced dlls), one of them repeating even upon multiple retries,
>>>>
>>>> Why didn't rebase fix that?
>>
>> As far as I understand, rebasing is about touching already installed
>> dlls as well, which would require to restart all Cygwin processes.
>> As the problem is about some dll built during a larger build job,
>> this is not something that feels useful to me.
> 
> Wait, let me understand what's going on.  IIUC you're building DLLs
> which are then used during the build job itself, right?

Exactly.
FWIW, the CI builds also set up a Cygwin instance from scratch,
as I'm also after testing Cygwin (v3) itself to some degree:
https://dev.azure.com/gentoo-prefix/ci-builds/_build

However, I've not found a commandline option for setup.exe to install
"test" versions...

> 
>>> Btw., is that 32 or 64 bit?  Both?
>>
>> I'm on 64bit only, can't say for 32bit.  And while in theory possible,
>> I'm not after supporting 32bit Cygwin in Gentoo Prefix at all...
> 
> If so, then I'm really curious how many DLLs are affected and why this
> occurs on 64 bit.
> 
> As you know, 64 bit has a defined memory layout.  Binutils ld is
> supposed to base the DLLs to a pseudo-random address in the area between
> 0x4: and 0x6:.  This area is occupied by un-rebased DLLs
> only.  8 Gigs is a *lot* of space for DLLs.
> 
> That also means that the DLLs should not at all collide with windows
> objects (typically reserved in the lesser 2 Gigs area), unless they
> collide with themselves.  At least that's the idea.
> 
> Can you check what addresses the freshly built DLLs are based on by LD?
> Is there a chance that the algorithm used in LD is too dumb?

I've also added system_printf to dll_list::reserve_space() when a dynloaded
dll was relocated, and each new address was below 0x0:0100. The attached
output also contains the preferred address, above 0x4: each.

> 
> Or, hmm.  Is there a chance that newer Windows loads dynamically loaded
> DLLs wherever it likes, ignoring the base address, ASLR-like, even
> if the DLL is marked as non-ASLR-aware?  But then again, we should have
> a lot more complaints on the list...

I've done this test on Windows Server 2012R2, but the problem exists on
2016 and 2019 as well (I'm not testing with other Windows versions).

>>>>>  I'm
>>>>> coming up with attached patch.
>>>>>
>>>>> What do you think about it?
>>>>
>>>> I'm not opposed to this patch but I don't quite follow the description.
>>>> threadinterface->Init only creates three event objects.  From what I can
>>>> tell, Events are stored in Paged and Nonpaged Pools, so they don't
>>>> affect the processes VM.  What am I missing?
>>
>> Honestly, I'm not completely sure whether this patch really does help:
>> Beyond the Events, there also is CreateNamedPipe and CreateFile used
>> in fhandler_pipe::create via sigproc_init, and these causing the address
>> conflicts with some dll actually is nothing more than a wild guess:
>> While their returned handles are below the conflicting dll address,
>> who can tell what these API calls do allocate internally?
> 
> The handles are not addresses.  If the sigproc_init stuff collides,
> I only see two chances for that, the process-local read/write buffers
> of the signal pipe, and the stack of the read_sig thread.
> 
> If this patch helps your situation, we can pull it in and test it,
> but I think your situation asks for more debugging along the lines
> of the DLL rebasing above.

With this patch collisions seem gone, yet the relocations do happen.

Thanks!
/haubi/
 29 [main] python2.7 51113 dll_list::reserve_space: libpython2.7.dll 
preferring 0x53BB5 was loaded to 0xA8 
(\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\var\tmp\portage\dev-lang\python-2.7.16\work\x86_64-pc-cygwin\libpython2.7.dll)
  2 [main] python2.7 52526 dll_list::reserve_space: cygcrypto-1.1.dll 
preferring 0x41C65 was loaded to 0x65 
(\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
  2 [main] python2.7 54352 dll_list::reserve_space: cygcrypto-1.1.dll 
preferring 0x41C65 was loaded to 0x6E 
(\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
  2 [main] python2.7 55760 dll_list::reserve_space: cygcrypto-1.1.dll 
preferring 0x41C65 was loaded to 0x7C 
(\??\C:\cygwin64\home\haubi\test-20190327\gentoo-prefix\usr\bin\cygcrypto-1.1.dll)
  3 [main] python2.7 55763 dll_list::reserve_space: cygcrypto-1.1.dll 
preferring 0x41C65 was loaded to 0x7C 

Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-27 Thread Achim Gratz
Michael Haubenwallner writes:
> As far as I understand, rebasing is about touching already installed
> dlls as well, which would require to restart all Cygwin processes.
> As the problem is about some dll built during a larger build job,
> this is not something that feels useful to me.

That's exactly why I introduced the "--oblivious" option several years
ago.  It'll let you rebase a set of DLL while benefitting from the
rebase database, but not recording them there, so if you later install
them properly there will be no collision.  I needed this for testing
newly compiled Perl XS modules, but you seem to have a similar use case.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-27 Thread Corinna Vinschen
Hi Michael,

On Mar 27 09:26, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 3/26/19 7:28 PM, Corinna Vinschen wrote:
> > On Mar 26 19:25, Corinna Vinschen wrote:
> >> Hi Michael,
> >>
> >>
> >> Redirected to cygwin-patches...
> >>
> >>
> >> On Mar 26 18:10, Michael Haubenwallner wrote:
> >>> Hi Corinna,
> >>>
> >>> as I do still encounter fork errors (address space needed by  is
> >>> already occupied) with dynamically loaded dlls (but unrelated to
> >>> replaced dlls), one of them repeating even upon multiple retries,
> >>
> >> Why didn't rebase fix that?
> 
> As far as I understand, rebasing is about touching already installed
> dlls as well, which would require to restart all Cygwin processes.
> As the problem is about some dll built during a larger build job,
> this is not something that feels useful to me.

Wait, let me understand what's going on.  IIUC you're building DLLs
which are then used during the build job itself, right?

> > Btw., is that 32 or 64 bit?  Both?
> 
> I'm on 64bit only, can't say for 32bit.  And while in theory possible,
> > I'm not after supporting 32bit Cygwin in Gentoo Prefix at all...

If so, then I'm really curious how many DLLs are affected and why this
occurs on 64 bit.

As you know, 64 bit has a defined memory layout.  Binutils ld is
supposed to base the DLLs to a pseudo-random address in the area between
0x4: and 0x6:.  This area is occupied by un-rebased DLLs
only.  8 Gigs is a *lot* of space for DLLs.
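
To make the slot arithmetic concrete, here is a small sketch of hashing a DLL
name into such a window.  It is not ld's actual algorithm: FNV-1a is only a
stand-in hash, pick_base() is a hypothetical name, the 64 KiB granularity is
an assumption, and the window start is left to the caller.  Only the 8 GiB
window size is taken from the paragraph above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint64_t kGranularity = 0x10000;    // 64 KiB, assumed
constexpr std::uint64_t kWindowSize = 8ULL << 30;  // "8 Gigs"
constexpr std::uint64_t kSlots = kWindowSize / kGranularity;

// FNV-1a over the name; a stand-in, ld's real hash may differ.
std::uint64_t fnv1a(const std::string &s) {
  std::uint64_t h = 14695981039346656037ULL;
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;
  }
  return h;
}

// Map a DLL name to an aligned base inside the window.
std::uint64_t pick_base(const std::string &name, std::uint64_t window_start) {
  assert(window_start % kGranularity == 0);
  return window_start + (fnv1a(name) % kSlots) * kGranularity;
}
```

The window holds 131072 slots of 64 KiB.  A scheme like this only spreads the
start addresses, though: it knows nothing about image sizes, so a DLL that is
a few megabytes large covers dozens of slots and can overlap a neighbour
whose name happens to hash close by.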

That also means that the DLLs should not at all collide with windows
objects (typically reserved in the lesser 2 Gigs area), unless they
collide with themselves.  At least that's the idea.

Can you check what addresses the freshly built DLLs are based on by LD?
Is there a chance that the algorithm used in LD is too dumb?

Or, hmm.  Is there a chance that newer Windows loads dynamically loaded
DLLs wherever it likes, ignoring the base address, ASLR-like, even
if the DLL is marked as non-ASLR-aware?  But then again, we should have
a lot more complaints on the list...

> >>>  I'm
> >>> coming up with attached patch.
> >>>
> >>> What do you think about it?
> >>
> >> I'm not opposed to this patch but I don't quite follow the description.
> >> threadinterface->Init only creates three event objects.  From what I can
> >> tell, Events are stored in Paged and Nonpaged Pools, so they don't
> >> affect the processes VM.  What am I missing?
> 
> Honestly, I'm not completely sure whether this patch really does help:
> Beyond the Events, there also is CreateNamedPipe and CreateFile used
> in fhandler_pipe::create via sigproc_init, and these causing the address
> conflicts with some dll actually is nothing more than a wild guess:
> While their returned handles are below the conflicting dll address,
> who can tell what these API calls do allocate internally?

The handles are not addresses.  If the sigproc_init stuff collides,
I only see two chances for that, the process-local read/write buffers
of the signal pipe, and the stack of the read_sig thread.

If this patch helps your situation, we can pull it in and test it,
but I think your situation asks for more debugging along the lines
of the DLL rebasing above.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-27 Thread Michael Haubenwallner
Hi Corinna,

On 3/26/19 7:28 PM, Corinna Vinschen wrote:
> On Mar 26 19:25, Corinna Vinschen wrote:
>> Hi Michael,
>>
>>
>> Redirected to cygwin-patches...
>>
>>
>> On Mar 26 18:10, Michael Haubenwallner wrote:
>>> Hi Corinna,
>>>
>>> as I do still encounter fork errors (address space needed by  is
>>> already occupied) with dynamically loaded dlls (but unrelated to
>>> replaced dlls), one of them repeating even upon multiple retries,
>>
>> Why didn't rebase fix that?

As far as I understand, rebasing is about touching already installed
dlls as well, which would require to restart all Cygwin processes.
As the problem is about some dll built during a larger build job,
this is not something that feels useful to me.

> 
> Btw., is that 32 or 64 bit?  Both?

I'm on 64bit only, can't say for 32bit.  And while in theory possible,
I'm not after supporting 32bit Cygwin in Gentoo Prefix at all...

>>>  I'm
>>> coming up with attached patch.
>>>
>>> What do you think about it?
>>
>> I'm not opposed to this patch but I don't quite follow the description.
>> threadinterface->Init only creates three event objects.  From what I can
>> tell, Events are stored in Paged and Nonpaged Pools, so they don't
>> affect the processes VM.  What am I missing?

Honestly, I'm not completely sure whether this patch really does help:
Beyond the Events, there also is CreateNamedPipe and CreateFile used
in fhandler_pipe::create via sigproc_init, and these causing the address
conflicts with some dll actually is nothing more than a wild guess:
While their returned handles are below the conflicting dll address,
who can tell what these API calls do allocate internally?
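
The ordering argument behind the patch can be shown with a toy model.  This
is only a sketch under strong assumptions: ToyAddressSpace does not mimic the
real Windows allocator, child_keeps_dll_range() is a hypothetical name, and
the "early allocation" is placed on purpose at the worst possible spot.  It
says nothing about what actually occupies the range in the forked child,
which is still a guess at this point.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model only: region start -> region size.
class ToyAddressSpace {
public:
  // Claim [start, start + size) if nothing overlaps it.
  bool reserve_at(std::uint64_t start, std::uint64_t size) {
    for (const auto &r : regions_)
      if (start < r.first + r.second && r.first < start + size)
        return false;
    regions_[start] = size;
    return true;
  }

  // Place a region at the lowest free address at or above hint.
  std::uint64_t allocate_from(std::uint64_t hint, std::uint64_t size) {
    std::uint64_t addr = hint;
    for (const auto &r : regions_) {  // ascending start order
      if (addr + size <= r.first)
        break;                        // gap before r is big enough
      if (addr < r.first + r.second)
        addr = r.first + r.second;    // skip past r
    }
    regions_[addr] = size;
    return addr;
  }

private:
  std::map<std::uint64_t, std::uint64_t> regions_;
};

// Returns true if the child gets the dll range the parent used.
bool child_keeps_dll_range(bool reserve_first, std::uint64_t dll_base,
                           std::uint64_t dll_size) {
  assert(dll_size != 0);
  ToyAddressSpace as;
  bool ok = true;
  if (reserve_first)
    ok = as.reserve_at(dll_base, dll_size);
  // Hypothetical early allocation (say a thread stack) that happens to
  // land at the lowest free address from dll_base upward.
  as.allocate_from(dll_base, 0x10000);
  if (!reserve_first)
    ok = as.reserve_at(dll_base, dll_size);
  return ok;
}
```

Reserving first pushes the later allocation behind the dll range; reserving
last finds the range already taken, which is the "already occupied" case.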

Thanks!
/haubi/

>>
>>> From dfc28bcbb7ed55fe33ddb8d15e761b4d5b4815f8 Mon Sep 17 00:00:00 2001
>>> From: Michael Haubenwallner 
>>> Date: Tue, 26 Mar 2019 17:38:36 +0100
>>> Subject: [PATCH] Cygwin: fork: reserve dynloaded dll areas earlier
>>>
>>> In dll_crt0_0, both threadinterface->Init and sigproc_init allocate
>>> windows object handles using unpredictable memory regions, which may
>>> collide with dynamically loaded dlls when they were relocated.
>>> ---
>>>  winsup/cygwin/dcrt0.cc | 6 ++++++
>>>  winsup/cygwin/fork.cc  | 6 ------
>>>  2 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/winsup/cygwin/dcrt0.cc b/winsup/cygwin/dcrt0.cc
>>> index 11edcdf0d..fb726a739 100644
>>> --- a/winsup/cygwin/dcrt0.cc
>>> +++ b/winsup/cygwin/dcrt0.cc
>>> @@ -632,6 +632,12 @@ child_info_fork::handle_fork ()
>>>  
>>>if (fixup_mmaps_after_fork (parent))
>>>  api_fatal ("recreate_mmaps_after_fork_failed");
>>> +
>>> +  /* We need to occupy the address space for dynamically loaded dlls
>>> + before we allocate any dynamic object, or we may end up with
>>> + error "address space needed by  is already occupied"
>>> + for no good reason (seen with some relocated dll). */
>>> +  dlls.reserve_space ();
>>>  }
>>>  
>>>  bool
>>> diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
>>> index 74ee9acf4..7e1c08990 100644
>>> --- a/winsup/cygwin/fork.cc
>>> +++ b/winsup/cygwin/fork.cc
>>> @@ -136,12 +136,6 @@ frok::child (volatile char * volatile here)
>>>  {
>>>HANDLE& hParent = ch.parent;
>>>  
>>> -  /* NOTE: Logically this belongs in dll_list::load_after_fork, but by
>>> - doing it here, before the first sync_with_parent, we can exploit
>>> - the existing retry mechanism in hopes of getting a more favorable
>>> - address space layout next time. */
>>> -  dlls.reserve_space ();
>>> -
>>>sync_with_parent ("after longjmp", true);
>>>debug_printf ("child is running.  pid %d, ppid %d, stack here %p",
>>> myself->pid, myself->ppid, __builtin_frame_address (0));
>>> -- 
>>> 2.17.0
>>>
>>>
>>
>>>
>>> --
>>> Problem reports:   http://cygwin.com/problems.html
>>> FAQ:   http://cygwin.com/faq/
>>> Documentation: http://cygwin.com/docs.html
>>> Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
>>
>>
>> -- 
>> Corinna Vinschen
>> Cygwin Maintainer
> 
> 
> 


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-26 Thread Corinna Vinschen
On Mar 26 19:25, Corinna Vinschen wrote:
> Hi Michael,
> 
> 
> Redirected to cygwin-patches...
> 
> 
> On Mar 26 18:10, Michael Haubenwallner wrote:
> > Hi Corinna,
> > 
> > as I do still encounter fork errors (address space needed by  is
> > already occupied) with dynamically loaded dlls (but unrelated to
> > replaced dlls), one of them repeating even upon multiple retries,
> 
> Why didn't rebase fix that?

Btw., is that 32 or 64 bit?  Both?


Corinna


> >  I'm
> > coming up with attached patch.
> > 
> > What do you think about it?
> 
> I'm not opposed to this patch but I don't quite follow the description.
> threadinterface->Init only creates three event objects.  From what I can
> tell, Events are stored in Paged and Nonpaged Pools, so they don't
> affect the processes VM.  What am I missing?
> 
> 
> Thanks,
> Corinna
> 
> 
> > 
> > Thanks!
> > /haubi/
> 
> > From dfc28bcbb7ed55fe33ddb8d15e761b4d5b4815f8 Mon Sep 17 00:00:00 2001
> > From: Michael Haubenwallner 
> > Date: Tue, 26 Mar 2019 17:38:36 +0100
> > Subject: [PATCH] Cygwin: fork: reserve dynloaded dll areas earlier
> > 
> > In dll_crt0_0, both threadinterface->Init and sigproc_init allocate
> > windows object handles using unpredictable memory regions, which may
> > collide with dynamically loaded dlls when they were relocated.
> > ---
> >  winsup/cygwin/dcrt0.cc | 6 ++++++
> >  winsup/cygwin/fork.cc  | 6 ------
> >  2 files changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/winsup/cygwin/dcrt0.cc b/winsup/cygwin/dcrt0.cc
> > index 11edcdf0d..fb726a739 100644
> > --- a/winsup/cygwin/dcrt0.cc
> > +++ b/winsup/cygwin/dcrt0.cc
> > @@ -632,6 +632,12 @@ child_info_fork::handle_fork ()
> >  
> >if (fixup_mmaps_after_fork (parent))
> >  api_fatal ("recreate_mmaps_after_fork_failed");
> > +
> > +  /* We need to occupy the address space for dynamically loaded dlls
> > + before we allocate any dynamic object, or we may end up with
> > + error "address space needed by  is already occupied"
> > + for no good reason (seen with some relocated dll). */
> > +  dlls.reserve_space ();
> >  }
> >  
> >  bool
> > diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
> > index 74ee9acf4..7e1c08990 100644
> > --- a/winsup/cygwin/fork.cc
> > +++ b/winsup/cygwin/fork.cc
> > @@ -136,12 +136,6 @@ frok::child (volatile char * volatile here)
> >  {
> >HANDLE& hParent = ch.parent;
> >  
> > -  /* NOTE: Logically this belongs in dll_list::load_after_fork, but by
> > - doing it here, before the first sync_with_parent, we can exploit
> > - the existing retry mechanism in hopes of getting a more favorable
> > - address space layout next time. */
> > -  dlls.reserve_space ();
> > -
> >sync_with_parent ("after longjmp", true);
> >debug_printf ("child is running.  pid %d, ppid %d, stack here %p",
> > myself->pid, myself->ppid, __builtin_frame_address (0));
> > -- 
> > 2.17.0
> > 
> > 
> 
> > 
> > --
> > Problem reports:   http://cygwin.com/problems.html
> > FAQ:   http://cygwin.com/faq/
> > Documentation: http://cygwin.com/docs.html
> > Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
> 
> 
> -- 
> Corinna Vinschen
> Cygwin Maintainer



-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-03-26 Thread Corinna Vinschen
Hi Michael,


Redirected to cygwin-patches...


On Mar 26 18:10, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> as I do still encounter fork errors (address space needed by  is
> already occupied) with dynamically loaded dlls (but unrelated to
> replaced dlls), one of them repeating even upon multiple retries,

Why didn't rebase fix that?

> I've
> come up with the attached patch.
> 
> What do you think about it?

I'm not opposed to this patch but I don't quite follow the description.
threadinterface->Init only creates three event objects.  From what I can
tell, Events are stored in Paged and Nonpaged Pools, so they don't
affect the process's VM.  What am I missing?


Thanks,
Corinna


> 
> Thanks!
> /haubi/

> >From dfc28bcbb7ed55fe33ddb8d15e761b4d5b4815f8 Mon Sep 17 00:00:00 2001
> From: Michael Haubenwallner 
> Date: Tue, 26 Mar 2019 17:38:36 +0100
> Subject: [PATCH] Cygwin: fork: reserve dynloaded dll areas earlier
> 
> In dll_crt0_0, both threadinterface->Init and sigproc_init allocate
> windows object handles using unpredictable memory regions, which may
> collide with dynamically loaded dlls when they were relocated.
> ---
>  winsup/cygwin/dcrt0.cc | 6 ++++++
>  winsup/cygwin/fork.cc  | 6 ------
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/winsup/cygwin/dcrt0.cc b/winsup/cygwin/dcrt0.cc
> index 11edcdf0d..fb726a739 100644
> --- a/winsup/cygwin/dcrt0.cc
> +++ b/winsup/cygwin/dcrt0.cc
> @@ -632,6 +632,12 @@ child_info_fork::handle_fork ()
>  
>    if (fixup_mmaps_after_fork (parent))
>      api_fatal ("recreate_mmaps_after_fork_failed");
> +
> +  /* We need to occupy the address space for dynamically loaded dlls
> +     before we allocate any dynamic object, or we may end up with
> +     error "address space needed by  is already occupied"
> +     for no good reason (seen with some relocated dll). */
> +  dlls.reserve_space ();
>  }
>  
>  bool
> diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
> index 74ee9acf4..7e1c08990 100644
> --- a/winsup/cygwin/fork.cc
> +++ b/winsup/cygwin/fork.cc
> @@ -136,12 +136,6 @@ frok::child (volatile char * volatile here)
>  {
>    HANDLE& hParent = ch.parent;
>  
> -  /* NOTE: Logically this belongs in dll_list::load_after_fork, but by
> -     doing it here, before the first sync_with_parent, we can exploit
> -     the existing retry mechanism in hopes of getting a more favorable
> -     address space layout next time. */
> -  dlls.reserve_space ();
> -
>    sync_with_parent ("after longjmp", true);
>    debug_printf ("child is running.  pid %d, ppid %d, stack here %p",
>                  myself->pid, myself->ppid, __builtin_frame_address (0));
> -- 
> 2.17.0
> 
> 
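
The ordering concern in the patch description can be illustrated with a toy
model.  This is a sketch only, not Cygwin code: AddressSpace, Dll and
run_child are hypothetical names, the "early allocation" stands in for
whatever dll_crt0_0 might map before the DLL ranges are reserved, and the
"try the next 64K slot" placement policy is an assumption made purely for
the simulation.  Whether event objects actually consume process VM is the
open question in this thread and is not answered here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of a DLL range inherited from the parent.
struct Dll { std::string name; std::uint64_t base; std::uint64_t size; };

// Toy address space: region start -> region end (exclusive).
class AddressSpace {
public:
  // Claim [base, base + size); fails if it overlaps a claimed region.
  bool claim (std::uint64_t base, std::uint64_t size)
  {
    assert (size > 0);
    const std::uint64_t end = base + size;
    for (const auto &r : regions_)
      if (base < r.second && r.first < end)
        return false;
    regions_[base] = end;
    return true;
  }
private:
  std::map<std::uint64_t, std::uint64_t> regions_;
};

// Simulate child startup and return the DLLs whose range was taken.
// reserve_first == true models reserving the DLL ranges before any
// other allocation; false models reserving them afterwards.
std::vector<std::string>
run_child (const std::vector<Dll> &dlls, std::uint64_t early_base,
           std::uint64_t early_size, bool reserve_first)
{
  AddressSpace as;
  std::vector<std::string> occupied;

  auto reserve_dlls = [&] {
    for (const Dll &d : dlls)
      if (!as.claim (d.base, d.size))
        occupied.push_back (d.name);
  };
  auto early_alloc = [&] {
    // Assumed policy: take the preferred spot, else the next 64K slot.
    std::uint64_t base = early_base;
    while (!as.claim (base, early_size))
      base += 0x10000;
  };

  if (reserve_first) { reserve_dlls (); early_alloc (); }
  else               { early_alloc (); reserve_dlls (); }
  return occupied;
}
```

With a single DLL at 0x3f0000000 of size 0x20000 and an early allocation
that prefers 0x3f0010000, run_child reports the DLL as occupied when
reserve_first is false and reports nothing when it is true, because the
early allocation is then pushed to the next free slot instead.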



-- 
Corinna Vinschen
Cygwin Maintainer

