the manual tester of realpath is happy, and i've seen no flake from the bots with tar (but haven't yet heard back from the manual tester of `tar --sort=name`). so far, so good!
On Tue, Feb 7, 2023, 17:10 enh <e...@google.com> wrote:

> On Tue, Feb 7, 2023 at 1:34 PM Rob Landley <r...@landley.net> wrote:
> >
> > On 2/7/23 10:42, enh wrote:
> > > On Mon, Feb 6, 2023 at 7:35 PM Rob Landley <r...@landley.net> wrote:
> > >> On 2/6/23 12:07, enh wrote:
> > >> >> > -*-
> > >> >> >
> > >> >> > as for the new tar, i updated the prebuilts yesterday and we've seen
> > >> >> > enough OOM kills since that i've had to revert it.
> > >> >>
> > >> >> Grrr. I've got a couple of suspects for what I screwed up, but... Lemme see if I
> > >> >> can work the ASAN leak detector into my workflow at all here. (There's a lot of
> > >> >> commands that intentionally leak resources on the way out because the OS will
> > >> >> free them, but there's a category of commands that should NOT do that because
> > >> >> they process theoretically unbounded input and cannot be allowed to leak-per.)
> > >>
> > >> I haven't wired up the ASAN leak detector yet.
> > >
> > > i don't think there's a leak.
> >
> > There was on the create side, but it was just metadata not data, and I think I
> > fixed it. (I'm still unhappy with glibc's heap behavior, but I think that's just
> > glibc being glibc...)
> >
> > >> > read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
> > >> > kill(3992265, SIGKILL) = 0
> > >> > wait4(3992265, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3992265
> > >> > exit_group(0) = ?
> > >> > +++ exited with 0 +++
> > >>
> > >> ... OOM killer maybe?
> > >
> > > no, sorry --- this was the point i was trying to make. _you're_
> > > calling kill(). (if it was the OOM killer, you'd just see the signal
> > > appear out of nowhere.)
> > >
> > > i think this is probably the earlier race workaround coming back to
> > > bite us?
> > > https://github.com/landley/toybox/blob/master/toys/posix/tar.c#L1120
> >
> > *blink* *blink*
> >
> > Oh right, that first patch did two things. Not the dirtree fiddling, the
> > wait-for-child logic. Which is gonna report error if we killed the child, and
> > now we're listening for that.
> >
> > So instead I need a read-into-toybuf loop to discard trailing input? Hmmm, does
> > tar do the same concatenate logic cpio does... no it does not, it just ignores
> > the second tarball when you cat two .gz tars together on its input. I can do that.
> >
> > Try now?
>
> thanks --- that seems to work for me locally.
>
> i'll start the process, and let you know if the build servers notice
> anything... should know by the end of the week!
>
> > >> Seems to have worked with the test binary I made before that last checkin?
> > >> Unless the -C or explicit -j were doing something weird, or I need more to the
> > >> reproduction sequence. (Built with NDK or musl...?)
> > >
> > > it's definitely flaky for me, but i was able to reproduce most of the
> > > time. given my suspicion that it's related to your earlier attempt to
> > > deal with a race condition, the active ingredient between my 8/10
> > > failure rate and your inability to repro might just be "i have a
> > > really fast machine"?
> >
> > Yeah, "can it write it all into the pipe buffer" was a race in the first place,
> > now you were seeing the other side of that. :P
> >
> > Hopefully fixed now?
> >
> > >> Rob
> >
> > Still Rob
>
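
For anyone reading the archive later, the "read-into-toybuf loop to discard trailing input" idea from the quoted thread can be sketched roughly as below. This is not the actual toybox change: `drain_fd()` and `run_child()` are hypothetical names made up for illustration, a local 4096-byte buffer stands in for `toybuf`, and a forked child writing zeros stands in for the decompressor. The point is only that reading the pipe to EOF lets the child exit on its own, so the wait status is a clean exit, whereas kill() makes the wait report death by signal.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: read and discard everything left on fd.
// Returns 0 at EOF, -1 on a read error other than EINTR.
static int drain_fd(int fd)
{
  char buf[4096]; // scratch space, standing in for toybox's toybuf
  ssize_t len;

  for (;;) {
    len = read(fd, buf, sizeof(buf));
    if (!len) return 0;
    if (len < 0 && errno != EINTR) return -1;
  }
}

// Hypothetical demo: the child plays the decompressor and writes `total`
// zero bytes into a pipe. The parent consumes one 512-byte block, then
// either drains the rest or kills the child, and returns the wait status
// (or -1 if setup failed).
static int run_child(size_t total, int use_kill)
{
  int fds[2], status = -1;
  char block[512];
  pid_t pid;

  if (pipe(fds)) return -1;
  pid = fork();
  if (pid < 0) return -1;

  if (!pid) {
    // Child: write until done, exit 0 on success.
    close(fds[0]);
    memset(block, 0, sizeof(block));
    while (total) {
      size_t n = total < sizeof(block) ? total : sizeof(block);

      if (write(fds[1], block, n) != (ssize_t)n) _exit(1);
      total -= n;
    }
    _exit(0);
  }

  // Parent: close our copy of the write end so EOF can arrive.
  close(fds[1]);
  if (read(fds[0], block, sizeof(block)) < 0) {
    // Setup went wrong: reap the child and report failure.
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    close(fds[0]);
    return -1;
  }

  if (use_kill) kill(pid, SIGKILL); // old behavior: child dies by signal
  else drain_fd(fds[0]);            // new behavior: child finishes normally

  close(fds[0]);
  if (waitpid(pid, &status, 0) != pid) return -1;

  return status;
}
```

With more data than fits in the pipe buffer (1 MiB here is an arbitrary choice), the kill path always finds the child still writing, which is the "really fast machine" side of the race. With a small payload the child may already have exited before kill() runs, so that case is timing-dependent and the sketch makes no claim about it.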
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net