On Tue, Jan 31, 2012 at 09:10:25PM +0100, Michael Niedermayer wrote:

> On Tue, Jan 31, 2012 at 01:10:28PM +0100, Otto Moerbeek wrote:
> > On Mon, Jan 30, 2012 at 03:18:04PM +0100, Michael Niedermayer wrote:
> > 
> > > Hi
> > > 
> > > Our automated tests for FFmpeg (http://fate.ffmpeg.org/) yesterday
> > > found a segfault in OpenBSD's /bin/sh.
> > > 
> > > Analysis revealed that it was triggered by too many variables.
> > > That is, something like
> > > 
> > > #!/bin/sh
> > > 
> > > i=0
> > > while true ; do
> > >     eval X${i}=yes
> > >     i=$(($i+1))
> > >     test $i -gt 17000 && break
> > > done
> > > 
> > > will segfault.
> > > 
> > > The following patch fixes it: (which we are currently using on our
> > > openbsd fate client)
> > > 
> > > Index: table.h
> > > ===================================================================
> > > RCS file: /cvs/src/bin/ksh/table.h,v
> > > retrieving revision 1.7
> > > diff -u -r1.7 table.h
> > > --- table.h     11 Dec 2005 20:31:21 -0000      1.7
> > > +++ table.h     30 Jan 2012 14:13:30 -0000
> > > @@ -8,7 +8,7 @@
> > > 
> > >  struct table {
> > >         Area   *areap;          /* area to allocate entries */
> > > -       short   size, nfree;    /* hash size (always 2^^n), free entries */
> > > +       int     size, nfree;    /* hash size (always 2^^n), free entries */
> > >         struct  tbl **tbls;     /* hashed table items */
> > >  };
> > > 
> > > 
> > 
> > Thanks for the report.
> > 
> > I could reproduce the bug, but I'm not 100% happy with your fix since
> > it just pushes the failure to a bigger number of vars.
> > 
> > It's better to check for the case like below,
> 
> yes, absolutely, a check is needed.
> Doing the check at SHRT_MAX/2, though, would likely just make FFmpeg's
> configure fail with "too many vars".

But what does the test actually try to find out? Is this just a test
creating many vars for no real reason, or is it something real and not
contrived?

> Some further investigation shows that code like:
> i=0
> while true ; do
>     eval X="\$X${i}"
> 
>     i=$(($i+1))
>     test $i -gt 17000 && break
> done
> 
> also fails; I'm not sure why ksh generates entries in the table for
> this.

Will look into this.

> 
> another (minor) issue I've noticed is that the hashes used are
> strongly biased toward the last chars due to
> h = 2*h + *n++;
> in hash(),
> which shifts the first chars to the MSBs and then out, while only the
> LSBs are used as the index.
> 
> as many of our variables look like
> bar_decoder_deps
> foobar_decoder_deps
> bar_decoder_checked
> foobar_decoder_checked
> ...
> the 15 LSBs of hash() tend to fall into only a few buckets, making
> the hash table somewhat slow.
> 
> I suspect using Bernstein's hash,
> h = 33*h + *n++;
> would work a lot better for almost all cases,
> but I've not investigated or tested this much. There are many hash
> functions that have better random distributions but are a lot slower;
> I just picked Bernstein's as it's closest to what is used currently
> and should be pretty much always better IMHO.

Yeah, well, that is not something for now (we're close to lock before
the 5.1 release).

        -Otto
