On 2/7/23 10:42, enh wrote:
> On Mon, Feb 6, 2023 at 7:35 PM Rob Landley <r...@landley.net> wrote:
>> On 2/6/23 12:07, enh wrote:
>> >> > -*-
>> >> >
>> >> > as for the new tar, i updated the prebuilts yesterday and we've seen
>> >> > enough OOM kills since that i've had to revert it.
>> >>
>> >> Grrr. I've got a couple of suspects for what I screwed up, but... Lemme see if I
>> >> can work the ASAN leak detector into my workflow at all here. (There's a lot of
>> >> commands that intentionally leak resources on the way out because the OS will
>> >> free them, but there's a category of commands that should NOT do that because
>> >> they process theoretically unbounded input and cannot be allowed to leak-per.)
>>
>> I haven't wired up the ASAN leak detector yet.
> 
> i don't think there's a leak.

There was on the create side, but it was just metadata not data, and I think I
fixed it. (I'm still unhappy with glibc's heap behavior, but I think that's just
glibc being glibc...)

>> > read(4, 
>> > "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
>> > 512) = 512
>> > kill(3992265, SIGKILL)                  = 0
>> > wait4(3992265, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3992265
>> > exit_group(0)                           = ?
>> > +++ exited with 0 +++
>>
>> ... OOM killer maybe?
> 
> no, sorry --- this was the point i was trying to make. _you're_
> calling kill(). (if it was the OOM killer, you'd just see the signal
> appear out of nowhere.)
> 
> i think this is probably the earlier race workaround coming back to
> bite us? https://github.com/landley/toybox/blob/master/toys/posix/tar.c#L1120

*blink* *blink*

Oh right, that first patch did two things. Not the dirtree fiddling, the
wait-for-child logic. Which is gonna report error if we killed the child, and
now we're listening for that.
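
To illustrate why the wait-for-child logic trips: a child we SIGKILL ourselves normally comes back from waitpid() as "terminated by signal", which any status check will treat as failure. This is a minimal sketch, not the code in tar.c; status_to_exit() and kill_and_reap() are hypothetical names made up for the example.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn a wait status into a shell-style exit code. */
int status_to_exit(int status)
{
  if (WIFEXITED(status)) return WEXITSTATUS(status);
  if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);

  return -1;
}

/* Hypothetical helper: fork a child that blocks forever, kill it the way
 * the parent did in the strace output, then reap it and report the result. */
int kill_and_reap(void)
{
  int status;
  pid_t pid = fork();

  if (pid < 0) return -1;
  if (!pid) for (;;) pause();   /* child: wait until a signal ends us */

  kill(pid, SIGKILL);
  if (waitpid(pid, &status, 0) != pid) return -1;

  return status_to_exit(status);
}
```

If the child happens to finish before the kill() lands, the status is a clean exit instead, which would fit the trace quoted above and the flaky reproduction.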

So instead I need a read-into-toybuf loop to discard trailing input? Hmmm, does
tar do the same concatenate logic cpio does... no it does not, it just ignores
the second tarball when you cat two .gz tars together on its input. I can do
that.
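
Roughly what I mean by a discard loop, as a sketch only: keep reading until EOF so the decompressor child can finish writing and exit on its own, rather than getting killed. drain_fd() is a hypothetical name and the local buffer stands in for toybuf; the actual commit may look different.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the toybox code: read and discard everything
 * left on fd until EOF. Returns the number of bytes discarded, or -1 on a
 * read error other than EINTR. */
long long drain_fd(int fd)
{
  char buf[4096];               /* stand-in for toybuf */
  long long total = 0;
  ssize_t len;

  for (;;) {
    len = read(fd, buf, sizeof(buf));
    if (len > 0) total += len;
    else if (!len) return total;          /* EOF: writer closed its end */
    else if (errno != EINTR) return -1;   /* real error */
  }
}
```

Once this returns, the parent can wait for the child normally and expect a real exit status.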

Try now?

>> Seems to have worked with the test binary I made before that last checkin?
>> Unless the -C or explicit -j were doing something weird, or I need more to the
>> reproduction sequence. (Built with NDK or musl...?)
> 
> it's definitely flaky for me, but i was able to reproduce most of the
> time. given my suspicion that it's related to your earlier attempt to
> deal with a race condition, the active ingredient between my 8/10
> failure rate and your inability to repro might just be "i have a
> really fast machine"?

Yeah, "can it write it all into the pipe buffer" was a race in the first place,
now you were seeing the other side of that. :P
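
For a feel of the window involved, here is a rough sketch that measures how much a writer can push into an empty pipe before it would block. pipe_capacity() is a hypothetical helper for illustration, and the number it returns is system dependent (I'm assuming a POSIX pipe here, nothing toybox specific).

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill a fresh pipe with non-blocking writes and
 * return how many bytes fit before write() refuses more, or -1 on error. */
long pipe_capacity(void)
{
  int fds[2], flags;
  char buf[512] = {0};
  long total = 0;
  ssize_t len;

  if (pipe(fds)) return -1;
  flags = fcntl(fds[1], F_GETFL);
  if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  for (;;) {
    len = write(fds[1], buf, sizeof(buf));
    if (len > 0) total += len;
    else if (len < 0 && errno == EINTR) continue;
    else break;                 /* EAGAIN (or an error): the pipe is full */
  }
  close(fds[0]);
  close(fds[1]);

  return total;
}
```

If whatever the child still has to write fits in that much space, it can finish and exit before the parent gets around to the kill(); otherwise it is still blocked in write() when the signal arrives.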

Hopefully fixed now?

>> Rob

Still Rob
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net
