the manual tester of realpath is happy, and i've seen no flakes from the
bots with tar (but haven't yet heard back from the manual tester of
`tar --sort=name`). so far, so good!

On Tue, Feb 7, 2023, 17:10 enh <e...@google.com> wrote:

> On Tue, Feb 7, 2023 at 1:34 PM Rob Landley <r...@landley.net> wrote:
> >
> > On 2/7/23 10:42, enh wrote:
> > > On Mon, Feb 6, 2023 at 7:35 PM Rob Landley <r...@landley.net> wrote:
> > >> On 2/6/23 12:07, enh wrote:
> > >> >> > -*-
> > >> >> >
> > >> >> > as for the new tar, i updated the prebuilts yesterday and we've seen
> > >> >> > enough OOM kills since that i've had to revert it.
> > >> >>
> > >> >> Grrr. I've got a couple of suspects for what I screwed up, but... Lemme see if I
> > >> >> can work the ASAN leak detector into my workflow at all here. (There's a lot of
> > >> >> commands that intentionally leak resources on the way out because the OS will
> > >> >> free them, but there's a category of commands that should NOT do that because
> > >> >> they process theoretically unbounded input and cannot be allowed to leak-per.)
> > >>
> > >> I haven't wired up the ASAN leak detector yet.
> > >
> > > i don't think there's a leak.
> >
> > There was on the create side, but it was just metadata not data, and I think I
> > fixed it. (I'm still unhappy with glibc's heap behavior, but I think that's just
> > glibc being glibc...)
> >
> > >> > read(4,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> > >> > 512) = 512
> > >> > kill(3992265, SIGKILL)                  = 0
> > >> > wait4(3992265, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3992265
> > >> > exit_group(0)                           = ?
> > >> > +++ exited with 0 +++
> > >>
> > >> ... OOM killer maybe?
> > >
> > > no, sorry --- this was the point i was trying to make. _you're_
> > > calling kill(). (if it was the OOM killer, you'd just see the signal
> > > appear out of nowhere.)
> > >
> > > i think this is probably the earlier race workaround coming back to
> > > bite us? https://github.com/landley/toybox/blob/master/toys/posix/tar.c#L1120
> >
> > *blink* *blink*
> >
> > Oh right, that first patch did two things. Not the dirtree fiddling, the
> > wait-for-child logic. Which is gonna report error if we killed the child, and
> > now we're listening for that.
> >
> > So instead I need a read-into-toybuf loop to discard trailing input? Hmmm, does
> > tar do the same concatenate logic cpio does... no it does not, it just ignores
> > the second tarball when you cat two .gz tars together on its input. I can do that.
> >
> > Try now?
>
> thanks --- that seems to work for me locally.
>
> i'll start the process, and let you know if the build servers notice
> anything... should know by the end of the week!
>
> > >> Seems to have worked with the test binary I made before that last checkin?
> > >> Unless the -C or explicit -j were doing something weird, or I need more to the
> > >> reproduction sequence. (Built with NDK or musl...?)
> > >
> > > it's definitely flaky for me, but i was able to reproduce most of the
> > > time. given my suspicion that it's related to your earlier attempt to
> > > deal with a race condition, the active ingredient between my 8/10
> > > failure rate and your inability to repro might just be "i have a
> > > really fast machine"?
> >
> > Yeah, "can it write it all into the pipe buffer" was a race in the first place,
> > now you were seeing the other side of that. :P
> >
> > Hopefully fixed now?
> >
> > >> Rob
> >
> > Still Rob
>
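
for anyone reading the archive later: the quoted thread contrasts two ways of finishing up once the parent has read all it wants from a decompressor child. it can kill() the child and then see a "killed by signal" wait status, or it can read and discard the trailing input so the child exits on its own. below is a minimal sketch of that difference. it is not the code in toybox's tar.c; drain_fd, reap_child and run_demo are hypothetical names made up for illustration, and the 128+signal return value is just the usual shell convention borrowed for this sketch.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read and discard everything left on fd until EOF, so the writer at the
 * other end of the pipe can finish normally instead of blocking forever.
 * Returns the number of bytes discarded, or -1 on a read error. */
static long long drain_fd(int fd)
{
  char buf[4096];
  long long total = 0;
  ssize_t len;

  for (;;) {
    len = read(fd, buf, sizeof(buf));
    if (len > 0) total += len;
    else if (!len) return total;
    else if (errno != EINTR) return -1;
  }
}

/* Wait for pid. Returns its exit code on a normal exit, 128+signal if a
 * signal killed it, or -1 if waitpid() itself failed. */
static int reap_child(pid_t pid)
{
  int status;

  while (waitpid(pid, &status, 0) < 0) if (errno != EINTR) return -1;
  if (WIFEXITED(status)) return WEXITSTATUS(status);
  if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);

  return -1;
}

/* Fork a child that writes `blocks` 512-byte blocks of zeros into a pipe.
 * The parent reads one block, then either kills the child (use_kill != 0)
 * or drains the rest of the pipe. Returns what reap_child() reports.
 * Assumption: blocks*512 is larger than the pipe buffer, so in the kill
 * case the child is still blocked in write() when the signal arrives. */
static int run_demo(int blocks, int use_kill)
{
  char block[512];
  int fds[2], i, rc;
  pid_t pid;

  if (pipe(fds)) return -1;
  pid = fork();
  if (pid < 0) return -1;

  if (!pid) {
    /* Child: stand-in for a decompressor feeding the parent. */
    close(fds[0]);
    memset(block, 0, sizeof(block));
    for (i = 0; i < blocks; i++)
      if (write(fds[1], block, sizeof(block)) != (ssize_t)sizeof(block))
        _exit(1);
    _exit(0);
  }

  /* Parent: consume only the first block, like hitting end-of-archive. */
  close(fds[1]);
  if (read(fds[0], block, sizeof(block)) != (ssize_t)sizeof(block))
    use_kill = 1;

  if (use_kill) kill(pid, SIGKILL);
  else drain_fd(fds[0]);

  rc = reap_child(pid);
  close(fds[0]);

  return rc;
}
```

with a payload bigger than the pipe buffer, `run_demo(1024, 0)` should report a clean exit, while `run_demo(1024, 1)` should report the child as killed by SIGKILL, which is the kind of status a parent that checks its child's exit would flag as an error. whether a real kill() lands before or after the child finishes writing is a race, which fits the "really fast machine" observation in the quoted thread.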
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net