Jim,

Our current patches were to fix some compile time warnings about signed vs.
unsigned comparisons, which I think have actually backfired and are causing
the 100% cpu usage.

My current guess is that our forked version works as the initial code
intended, and causes the hang, and the upstream version behaves differently
from the original author's intent, but happens to work anyway because the
path where len < 0 can never happen, and the pollset removes the fd.

I'd be happy to upstream anything when I'm confident that I've fixed the
issue that we're seeing.

Sorry if my first post was a bit confusing.

Jeff

On Wed, Sep 23, 2015 at 3:48 PM, Jim Jagielski <j...@jagunet.com> wrote:

> Haven't looked at the actual bug but yet, in general, we appreciate
> it when people send in patches that fix bugs or errors in our code
> instead of simply forking off those changes.
>
> Apache tries to work a little bit differently than the current buzz
> about forking and working in isolation; we instead encourage people
> to collaborate w/ the actual project itself, submitting bug reports
> and patches, and working as a cohesive group, rather than scattered
> islands of code :)
>
> Thx!
>
> > On Sep 23, 2015, at 3:23 PM, Jeffrey Crowell <jcrow...@google.com>
> wrote:
> >
> > Hi,
> >
> > I work on mod_pagespeed, where we use a forked version of apr_memcache.c
> (
> https://github.com/pagespeed/mod_pagespeed/blob/master/third_party/aprutil/apr_memcache2.c)
> to interact with memcached.
> >
> > Recently we've become aware of a bug where we end up hanging in
> apr_memcache2_multigetp, pinning cpu at 100%.
> >
> > We have some patches which were created in an attempt to fix some signed
> bugs in the original code that I think may be causing our issue.
> >
> > Namely here:
> https://github.com/pagespeed/mod_pagespeed/blob/master/third_party/aprutil/apr_memcache2.c#L1453
> vs here
> https://gist.github.com/crowell/59bfa1bb9f0cda30c48a#file-multiget-c-L192
> > The original code uses an atoi(), which has undefined behavior when
> called on non integer strings, leaving len to be 0.
> >
> > Second, the check here
> https://github.com/pagespeed/mod_pagespeed/blob/master/third_party/aprutil/apr_memcache2.c#L1457
> vs
> https://gist.github.com/crowell/59bfa1bb9f0cda30c48a#file-multiget-c-L199
> >
> > In the original apr code, this check will never fail. len is an
> apr_size_t, and will never be < 0.
> >
> > The issue here now is that the check in the "server sent back a key that
> i didn't ask for"
> https://github.com/pagespeed/mod_pagespeed/blob/master/third_party/aprutil/apr_memcache2.c#L1507
> (same in both forked and upstream), apr_pollset_remove, is never called,
> and the number of queries sent is never changed.
> >
> > Does it seem possible that this is causing the server to spin here, and
> would apr be interested in accepting a patch to fix the signedness issues?
> >
> > Thanks,
> > Jeff
>
>

Reply via email to