On Tue, Jan 31, 2012 at 01:10:28PM +0100, Otto Moerbeek wrote:
> On Mon, Jan 30, 2012 at 03:18:04PM +0100, Michael Niedermayer wrote:
>
> > Hi
> >
> > Our automated tests for FFmpeg (http://fate.ffmpeg.org/) have yesterday
> > found a segfault in openbsds /bin/sh.
> >
> > Analysis revealed that it was triggered by too many variables
> > That is something like
> >
> > #!/bin/sh
> >
> > i=0
> > while true ; do
> >     eval X${i}=yes
> >     i=$(($i+1))
> >     test $i -gt 17000 && break
> > done
> >
> > will segfault.
> >
> > The following patch fixes it: (which we are currently using on our
> > openbsd fate client)
> >
> > Index: table.h
> > ===================================================================
> > RCS file: /cvs/src/bin/ksh/table.h,v
> > retrieving revision 1.7
> > diff -u -r1.7 table.h
> > --- table.h     11 Dec 2005 20:31:21 -0000      1.7
> > +++ table.h     30 Jan 2012 14:13:30 -0000
> > @@ -8,7 +8,7 @@
> >
> >  struct table {
> >         Area   *areap;          /* area to allocate entries */
> > -       short   size, nfree;    /* hash size (always 2^^n), free entries
*/
> > +       int     size, nfree;    /* hash size (always 2^^n), free entries
*/
> >         struct  tbl **tbls;     /* hashed table items */
> >  };
> >
> >
>
> Thanks for the report.
>
> I could reproduce the bug, but I'm not 100% happy with your fix since
> it just pushes the failure to a bigger number of vars.
>
> It's better to check for the case like below,

yes, absolutely a check is needed.
doing the check at SHORT_MAX/2 though would likely just lead ffmpegs
configure to fail with "too many vars"
Some further investigation shows that code like:
i=0
while true ; do
    eval X="\$X${i}"

    i=$(($i+1))
    test $i -gt 17000 && break
done

also fails, iam not sure why ksh generates entries in the table for
this

another (minor) issue ive noticed is that the hashes used are
strongly biased toward the last chars due to
h = 2*h + *n++;
in hash()
which shifts the first chars to the MSBs and then out while only the
LSB are used as index

as many of our variables look like
bar_decoder_deps
foobar_decoder_deps
bar_decoder_checked
foobar_decoder_checked
...
the 15 LSB of the hash() tend to fall only in few buckets making
the hash table somewhat slow

i suspect using (bernsteins hash)
h = 33*h + *n++;
would work alot better for almost all cases
but ive not investigated or tested this much and there are many hash
functions that have better random distributions but are alot slower
i just picked bernsteins as its closest to what is used currently
and should be pretty much always better IMHO.


[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I have often repented speaking, but never of holding my tongue.
-- Xenocrates

[demime 1.01d removed an attachment of type application/pgp-signature which had 
a name of signature.asc]

Reply via email to