Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Michael Haubenwallner


On 4/1/19 5:56 PM, Corinna Vinschen wrote:
> On Apr  1 16:56, Corinna Vinschen wrote:
>> On Apr  1 16:28, Michael Haubenwallner wrote:
>>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
>>>> can you please collect the base addresses of all DLLs generated during
>>>> the build, plus their size and make a sorted list?  It would be
>>>> interesting to know if the hash algorithm in ld is actually as bad
>>>> as I conjecture.
>>>
>>> Please find attached the output of rebase -i for the dlls after bootstrap
>>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> 
> Oh, wait.  That's not what I was looking for.  The addresses are ok, but
> the paths *must* be the ones at the time the DLLs have been created,
> because that's what ld uses when creating the image base addresses.

Maybe I can provide that one as well.

> The
> addresses combined with the installation paths don't make sense anymore.
> 
> Apart from that, since you seem to be installing the DLLs anyway, can't
> you combine every crucial point during installation with a rebase?

This is what I'm after now, but I may need to introduce something like
additional readonly databases plus some --unregister option to rebase.

/haubi/


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Corinna Vinschen
On Apr  1 16:56, Corinna Vinschen wrote:
> On Apr  1 16:28, Michael Haubenwallner wrote:
> > Hi Corinna,
> > 
> > On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> > > can you please collect the base addresses of all DLLs generated during
> > > the build, plus their size and make a sorted list?  It would be
> > > interesting to know if the hash algorithm in ld is actually as bad
> > > as I conjecture.
> > 
> > Please find attached the output of rebase -i for the dlls after bootstrap
> > on Cygwin 3.0.4, each built with ld from binutils-2.31.1.

Oh, wait.  That's not what I was looking for.  The addresses are ok, but
the paths *must* be the ones at the time the DLLs have been created,
because that's what ld uses when creating the image base addresses.  The
addresses combined with the installation paths don't make sense anymore.

Apart from that, since you seem to be installing the DLLs anyway, can't
you combine every crucial point during installation with a rebase?


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer



Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Michael Haubenwallner
Hi Corinna,

On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> On Mar 28 12:48, Michael Haubenwallner wrote:
>> On 3/28/19 10:58 AM, Corinna Vinschen wrote:
>>> On Mar 28 10:17, Michael Haubenwallner wrote:
>>>> As it is not some other dll being loaded at the colliding address: any
>>>> idea how to find out _what_ is allocated there (in the forked child),
>>>> to find out whether we can reserve these areas even earlier?
>>>
>>> I'm not sure what addresses you're talking about ATM.  The addresses in
>>> the 0x4: - 0x6: range?
>>
>> No, I'm thinking about the lower address that collides after relocation,
>> if there is some cygwin allocated object we may allocate later...
>>
>>> These are the interesting ones.
>>> The relocation to some random low address should only occur if there's
>>> a collision in this range.
>>
>> This should be easier to find out (by inspecting the loaded dlls).
> 
> can you please collect the base addresses of all DLLs generated during
> the build, plus their size and make a sorted list?  It would be
> interesting to know if the hash algorithm in ld is actually as bad
> as I conjecture.

Please find attached the output of rebase -i for the dlls after bootstrap
on Cygwin 3.0.4, each built with ld from binutils-2.31.1.

> 
> If we can improve on the distribution within the 8 Gigs area by changing
> ld's address generation(*), we may improve situations like these without
> too much hassle.  As always, not a foolproof way out, but heck, 8 Gigs
> is a lot of space for a couple 100 DLLs.

Feels like I need some Cygwin rebase step in Gentoo Prefix anyway, as there
are ~250 dlls right after bootstrap - without any application yet.

Thanks!
/haubi/


rebase-info.txt.xz
Description: application/xz


Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Corinna Vinschen
On Apr  1 16:28, Michael Haubenwallner wrote:
> Hi Corinna,
> 
> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
> > can you please collect the base addresses of all DLLs generated during
> > the build, plus their size and make a sorted list?  It would be
> > interesting to know if the hash algorithm in ld is actually as bad
> > as I conjecture.
> 
> Please find attached the output of rebase -i for the dlls after bootstrap
> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
> 
> > If we can improve on the distribution within the 8 Gigs area by changing
> > ld's address generation(*), we may improve situations like these without
> > too much hassle.  As always, not a foolproof way out, but heck, 8 Gigs
> > is a lot of space for a couple 100 DLLs.
> 
> Feels like I need some Cygwin rebase step in Gentoo Prefix anyway, as there
> are ~250 dlls right after bootstrap - without any application yet.

For comparison, I have 1835 system DLLs installed, and they only take
a bit less than 30% of the 8 Gigs.

I'm surprised to see 7 collisions, one of them even using the exact
same address.  So the hash algorithm might be improvable.
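
As an illustration of how such collisions can be counted from a sorted
list of base addresses and sizes, here is a minimal Python sketch.  It
is not the tool used in this thread: the function name `find_overlaps`
and the (path, base, size) tuple layout are assumptions, and extracting
those tuples from the rebase -i output is left out.

```python
from typing import List, Tuple

# Hypothetical record layout: (path, base address, size in bytes).
Image = Tuple[str, int, int]


def find_overlaps(images: List[Image]) -> List[Tuple[str, str]]:
    """Return pairs of images whose [base, base + size) ranges intersect."""
    # Sort by base address; Python's sort is stable, so ties keep input order.
    ordered = sorted(images, key=lambda img: img[1])
    overlaps = []
    for i, (path, base, size) in enumerate(ordered):
        end = base + size
        for other_path, other_base, _ in ordered[i + 1:]:
            if other_base >= end:
                break  # sorted by base, so no later image can overlap
            overlaps.append((path, other_path))
    return overlaps
```

Two images with the exact same base address show up as one overlapping
pair, just like any partial overlap.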

In hindsight, we also might have been better off with a bit more space
for DLLs than 8 + 8 Gigs, I guess, given the size of the 64 bit address
space.  We can still get to that by updating Cygwin, rebase and
binutils.  For instance, assuming 32 Gigs + 32 Gigs, rebased DLLs would
start at 0x2:, non-rebased DLLs would start at 0xa: and
the heap would start at 0x12:.  Still lots of room in the VM.
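
The arithmetic behind that layout can be checked with a short Python
sketch.  It rests on assumptions not stated in the message: that the
"0x2:"-style values are the high 32 bits of an address whose low 32
bits are zero, and that the rebased area begins at 8 GiB.  The helper
names `layout` and `colon_hex` are made up for illustration.

```python
GIB = 1 << 30  # one gibibyte in bytes


def layout(rebased_start: int, region_size: int):
    """Return the start of the rebased area, the non-rebased area and the heap.

    Assumes the two DLL areas are adjacent, equally sized, and followed
    directly by the heap.
    """
    nonrebased_start = rebased_start + region_size
    heap_start = nonrebased_start + region_size
    return rebased_start, nonrebased_start, heap_start


def colon_hex(addr: int) -> str:
    """Format a 64-bit address as high:low 32-bit halves."""
    return f"0x{addr >> 32:x}:{addr & 0xffffffff:08x}"


# Assumed start of the rebased area: 8 GiB.
for start in layout(8 * GIB, 32 * GIB):
    print(colon_hex(start))
```

Under those assumptions, a region size of 32 GiB gives starts whose
high halves are 0x2, 0xa and 0x12, and a region size of 8 GiB gives
0x2, 0x4 and 0x6.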

However, that would probably not fix the exact collision between
usr/bin/cygncurses++6.dll and
usr/lib/python3.6/lib-dynload/_sha512.cpython-36m-x86_64-cygwin.dll


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer




Re: [PATCH RFC] fork: reduce chances for "address space is already occupied" errors

2019-04-01 Thread Brian Inglis
On 2019-04-01 10:31, Michael Haubenwallner wrote:
> 
> On 4/1/19 5:56 PM, Corinna Vinschen wrote:
>> On Apr  1 16:56, Corinna Vinschen wrote:
>>> On Apr  1 16:28, Michael Haubenwallner wrote:
>>>> On 3/28/19 9:30 PM, Corinna Vinschen wrote:
>>>>> can you please collect the base addresses of all DLLs generated during
>>>>> the build, plus their size and make a sorted list?  It would be
>>>>> interesting to know if the hash algorithm in ld is actually as bad
>>>>> as I conjecture.
>>>>
>>>> Please find attached the output of rebase -i for the dlls after bootstrap
>>>> on Cygwin 3.0.4, each built with ld from binutils-2.31.1.
>>
>> Oh, wait.  That's not what I was looking for.  The addresses are ok, but
>> the paths *must* be the ones at the time the DLLs have been created,
>> because that's what ld uses when creating the image base addresses.
> 
> Maybe I can provide that one as well.
> 
>> The
>> addresses combined with the installation paths don't make sense anymore.
>>
>> Apart from that, since you seem to be installing the DLLs anyway, can't
>> you combine every crucial point during installation with a rebase?
> 
> This is what I'm after now, but I may need to introduce something like
> additional readonly databases plus some --unregister option to rebase.

Check my questions and Achim's answers in the other subthread for existing ways
to deal with your issues that are only semi-documented.

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.