Re: [Toybox] [musl] Re: Not sure how to debug this one.

2024-02-19 Thread Rich Felker
On Mon, Feb 19, 2024 at 11:54:00AM -0600, Rob Landley wrote:
> On 2/17/24 19:40, Rich Felker wrote:
> > The attached patch should fix it.
> 
> Confirmed: worked for me.
> 
> Thanks,

Thanks for confirming!
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Re: Not sure how to debug this one.

2024-02-18 Thread Rich Felker
On Sun, Feb 18, 2024 at 06:06:46PM +0300, Valery Ushakov wrote:
> On Sun, Feb 18, 2024 at 09:33:13 -0500, Rich Felker wrote:
> 
> > On Sun, Feb 18, 2024 at 03:55:36PM +0300, Valery Ushakov wrote:
> > > On Sat, Feb 17, 2024 at 20:40:50 -0500, Rich Felker wrote:
> > > 
> > > > due to incorrect base address register when attempting to reload the
> > > > saved value of r8, the caller's value of r8 was not preserved.
> > > > ---
> > > >  src/signal/sh/sigsetjmp.s | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/src/signal/sh/sigsetjmp.s b/src/signal/sh/sigsetjmp.s
> > > > index 1e2270be..f0f604e2 100644
> > > > --- a/src/signal/sh/sigsetjmp.s
> > > > +++ b/src/signal/sh/sigsetjmp.s
> > > > @@ -27,7 +27,7 @@ __sigsetjmp:
> > > >  
> > > > mov.l 3f, r0
> > > >  4: braf r0
> > > > -mov.l @(4+8,r4), r8
> > > > +mov.l @(4+8,r6), r8
> > > >  
> > > >  9: mov.l 5f, r0
> > > >  6: braf r0
> > > 
> > > That takes care of restoring caller's r8 for the first return from
> > > sigsetjmp, but isn't there still the problem that the jump buffer
> > > contains the wrong one, so on the second return from sigsetjmp the
> > > caller will have clobbered r8?
> > > 
> > > Sorry for a drive-by reply.  I'll try to take a closer look in the
> > > evening.
> > 
> > No, that's the return path for both returns.
> >
> > The whole reason a call-saved register like r8 is used here is so
> > that we can return twice into the body of sigsetjmp, in order to
> > tailcall __sigsetjmp_tail at both the first return and subsequent
> > return.
> 
> Doh, right!  Sorry.  A comment to that effect to alert the reader
> would certainly have helped :) Neat trick that I missed on the quick
> reading.

Yes. Perhaps a single comment in each asm file pointing to a common
document location (the dummy sigsetjmp.c file would be a good
candidate) would be a good approach. This could also document what
needs to be done when writing a new port.

> > This is what makes it possible to restore the signal mask from the
> > returned-to frame rather than the returning-from frame (which is why
> > the attached doesn't crash with stack overflow on musl like it does
> > on glibc).
> 
> Restoring the context in siglongjmp should not be a problem per-se.
> NetBSD libc does that and the example code doesn't crash there (quick
> unscientific test on a ppc that I happen to have a terminal open on).
> But then NetBSD libc doesn't bother to carefully factor that code to
> minimize the need for MD asm.
> 
> Thanks, and sorry for the noise.

If you restore the signal mask from the returning context rather than
in the returned-to context, there's always the possibility of stack
overflow; in the worst case, this happens on the sigaltstack where
you're specifically taking measures to avoid stack overflow being a
fatal error. The test program is artificial, but the real-world way
this would happen is getting a flood of signals like SIGINT or SIGTSTP
or something coming in faster than you can respond to them, so that
every time you try to return via siglongjmp, you actually consume
another stack frame on the signal stack.

If NetBSD didn't crash, maybe it just has a much larger default stack
size limit? Or maybe they reload sp before calling sigprocmask? That
would work too, but the reason musl doesn't do it that way is that our
setjmp/longjmp are compatible with an old ABI where there is no extra
space in the jmp_buf for sigjmp_buf stuff.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Re: Not sure how to debug this one.

2024-02-18 Thread Rich Felker
On Sun, Feb 18, 2024 at 03:55:36PM +0300, Valery Ushakov wrote:
> On Sat, Feb 17, 2024 at 20:40:50 -0500, Rich Felker wrote:
> 
> > due to incorrect base address register when attempting to reload the
> > saved value of r8, the caller's value of r8 was not preserved.
> > ---
> >  src/signal/sh/sigsetjmp.s | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/signal/sh/sigsetjmp.s b/src/signal/sh/sigsetjmp.s
> > index 1e2270be..f0f604e2 100644
> > --- a/src/signal/sh/sigsetjmp.s
> > +++ b/src/signal/sh/sigsetjmp.s
> > @@ -27,7 +27,7 @@ __sigsetjmp:
> >  
> > mov.l 3f, r0
> >  4: braf r0
> > -mov.l @(4+8,r4), r8
> > +mov.l @(4+8,r6), r8
> >  
> >  9: mov.l 5f, r0
> >  6: braf r0
> 
> That takes care of restoring caller's r8 for the first return from
> sigsetjmp, but isn't there still the problem that the jump buffer
> contains the wrong one, so on the second return from sigsetjmp the
> caller will have clobbered r8?
> 
> Sorry for a drive-by reply.  I'll try to take a closer look in the
> evening.

No, that's the return path for both returns.

The whole reason a call-saved register like r8 is used here is so that
we can return twice into the body of sigsetjmp, in order to tailcall
__sigsetjmp_tail at both the first return and subsequent return. This
is what makes it possible to restore the signal mask from the
returned-to frame rather than the returning-from frame (which is why
the attached doesn't crash with stack overflow on musl like it does on
glibc).

Rich
#include 
#include 

static volatile long cnt = 100;
sigjmp_buf jb;

void handle(int s)
{
if (!cnt--) return;
raise(s);
siglongjmp(jb, 1);
}

int main()
{
if (sigsetjmp(jb, 1)) return 0;
signal(SIGALRM, handle);
raise(SIGALRM);
}
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Re: Not sure how to debug this one.

2024-02-17 Thread Rich Felker
On Sat, Feb 17, 2024 at 08:34:28PM -0500, Rich Felker wrote:
> On Sun, Feb 18, 2024 at 12:45:35AM +0300, Valery Ushakov wrote:
> > On Fri, Feb 16, 2024 at 19:48:27 -0600, Rob Landley wrote:
> > 
> > > https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s
> > 
> > I haven't touched superh asm in a while and the code has zero
> > comments (*ugh*), but I *think* sigsetjmp clobbers caller's r8.
> > 
> > r8 is callee saved.  sigsetjmp wants to use it to save its env
> > argument across the call to __setjmp.  So it saves caller's r8 and
> > uses r8 to save its env b/c __setjmp it's about to call will clobber
> > it.  Then __setjmp saves this r8 = env in the jump buf, not caller's
> > r8.  The instruction in the delay slot of the tail call to
> > __sigsetjmp_tail vaguely looks like it might have been intended to
> > patch it, but it loads r8, not stores it. I'm not sure why it would
> > want to load r8 at that point.
> > 
> > Sorry, I only skimmed through the code and as I said, there're no
> > comments (which for asm code is borderline criminal, IMHO :) so I
> > might be completely misinterpreting what this code does...
> 
> I think you've stumbled across the bug even if you don't realize it.
> :-)
> 
> The idea of the sigsetjmp implementation is explained in commit
> 583e55122e767b1586286a0d9c35e2a4027998ab. As you've noted, comments
> would be nice, but I'm not sure where they belong since it's the same
> logic on every arch.
> 
> sigsetjmp wants to save a position in itself for longjmp to return to,
> but because sigsetjmp will be returning, it can't save any state on
> the stack; it doesn't have a stack frame. The only storage it has
> available is call-saved registers, which longjmp will necessarily
> restore for it. So, it picks one of its caller's call-saved registers
> (r8) and stores that inside the sigjmp_buf, then uses that register to
> preserve the pointer to the sigjmp_buf to access after setjmp returns
> (either the first time or the second time).
> 
> The only problem with the sh implementation is that, due to limited
> displacement range for load/store instructions, we have to add 60 to
> the sigjmp_buf base address to get the base register for accessing the
> saved return address and r8. This takes place in a temp register r6.
> However, the last line of this code path, the delay slot after the
> tail call to __sigsetjmp_tail, erroneously uses r4 (the base
> sigjmp_buf pointer) as the base for restoring r8:
> 
>mov.l @(4+8,r4), r8
> 
> This corrupts the caller's saved register file. If I'm not mistaken,
> it overwrites the caller's r8 with the caller's r11.
> 
> Presumably Rob didn't actually hit a crash in sigsetjmp, but just
> inferred the crash was there via printf debugging. The actual crash
> must be in the caller once it gets back control with the wrong value
> in r8.
> 
> This has been an area I've wanted to have testing for for a long time,
> but I don't have a really good idea for how to implement the test. We
> would want the compiler to generate code that puts lots of
> intermediates in as many call-saved registers as possible, calls
> setjmp, then calls another function that *also* puts pressure on the
> register allocation and then calls longjmp. If it's okay for the test
> to be arch-specific, this could just work via __asm__ register
> constraints and a call to setjmp from asm, but I'm not sure if that
> would be a good way to do things in libc-test...
> 
> In any case, thanks to everyone who participated in this. This is a
> rather bad bug and a nice one to get fixed up in the release window!

The attached patch should fix it.

Rich
>From 7020e85fd768be02e7f5971f1707229407cfa1e4 Mon Sep 17 00:00:00 2001
From: Rich Felker 
Date: Sat, 17 Feb 2024 20:36:42 -0500
Subject: [PATCH] sh: fix sigsetjmp corrupting call-saved register r8

due to incorrect base address register when attempting to reload the
saved value of r8, the caller's value of r8 was not preserved.
---
 src/signal/sh/sigsetjmp.s | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/signal/sh/sigsetjmp.s b/src/signal/sh/sigsetjmp.s
index 1e2270be..f0f604e2 100644
--- a/src/signal/sh/sigsetjmp.s
+++ b/src/signal/sh/sigsetjmp.s
@@ -27,7 +27,7 @@ __sigsetjmp:
 
mov.l 3f, r0
 4: braf r0
-mov.l @(4+8,r4), r8
+mov.l @(4+8,r6), r8
 
 9: mov.l 5f, r0
 6: braf r0
-- 
2.21.0

___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Re: Not sure how to debug this one.

2024-02-17 Thread Rich Felker
On Sun, Feb 18, 2024 at 12:45:35AM +0300, Valery Ushakov wrote:
> On Fri, Feb 16, 2024 at 19:48:27 -0600, Rob Landley wrote:
> 
> > https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s
> 
> I haven't touched superh asm in a while and the code has zero
> comments (*ugh*), but I *think* sigsetjmp clobbers caller's r8.
> 
> r8 is callee saved.  sigsetjmp wants to use it to save its env
> argument across the call to __setjmp.  So it saves caller's r8 and
> uses r8 to save its env b/c __setjmp it's about to call will clobber
> it.  Then __setjmp saves this r8 = env in the jump buf, not caller's
> r8.  The instruction in the delay slot of the tail call to
> __sigsetjmp_tail vaguely looks like it might have been intended to
> patch it, but it loads r8, not stores it. I'm not sure why it would
> want to load r8 at that point.
> 
> Sorry, I only skimmed through the code and as I said, there're no
> comments (which for asm code is borderline criminal, IMHO :) so I
> might be completely misinterpreting what this code does...

I think you've stumbled across the bug even if you don't realize it.
:-)

The idea of the sigsetjmp implementation is explained in commit
583e55122e767b1586286a0d9c35e2a4027998ab. As you've noted, comments
would be nice, but I'm not sure where they belong since it's the same
logic on every arch.

sigsetjmp wants to save a position in itself for longjmp to return to,
but because sigsetjmp will be returning, it can't save any state on
the stack; it doesn't have a stack frame. The only storage it has
available is call-saved registers, which longjmp will necessarily
restore for it. So, it picks one of its caller's call-saved registers
(r8) and stores that inside the sigjmp_buf, then uses that register to
preserve the pointer to the sigjmp_buf to access after setjmp returns
(either the first time or the second time).

The only problem with the sh implementation is that, due to limited
displacement range for load/store instructions, we have to add 60 to
the sigjmp_buf base address to get the base register for accessing the
saved return address and r8. This takes place in a temp register r6.
However, the last line of this code path, the delay slot after the
tail call to __sigsetjmp_tail, erroneously uses r4 (the base
sigjmp_buf pointer) as the base for restoring r8:

 mov.l @(4+8,r4), r8

This corrupts the caller's saved register file. If I'm not mistaken,
it overwrites the caller's r8 with the caller's r11.

Presumably Rob didn't actually hit a crash in sigsetjmp, but just
inferred the crash was there via printf debugging. The actual crash
must be in the caller once it gets back control with the wrong value
in r8.

This has been an area I've wanted to have testing for for a long time,
but I don't have a really good idea for how to implement the test. We
would want the compiler to generate code that puts lots of
intermediates in as many call-saved registers as possible, calls
setjmp, then calls another function that *also* puts pressure on the
register allocation and then calls longjmp. If it's okay for the test
to be arch-specific, this could just work via __asm__ register
constraints and a call to setjmp from asm, but I'm not sure if that
would be a good way to do things in libc-test...

In any case, thanks to everyone who participated in this. This is a
rather bad bug and a nice one to get fixed up in the release window!

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Not sure how to debug this one.

2024-02-17 Thread Rich Felker
On Fri, Feb 16, 2024 at 07:48:27PM -0600, Rob Landley wrote:
> While grinding away at release prep, I hit a WEIRD one. The qemu-system-sh4
> target got broken by commit 3e0e8c687eee (PID 1 exits trying to run the init
> script), which is the commit that changed the stdout buffering type.
> 
> It's not the kernel, if I use the last release kernel with the new root
> filesystem I see the problem, and newly built kernel from today's git with 
> last
> release's initramfs.cpio.gz boots to a shell prompt.
> 
> The actual _problem_ is that sigsetjmp() is faulting (in sh.c function
> run_command()), for NO OBVIOUS REASON. Calling memset() to zero the struct
> before the sigsetjmp() works fine, but the sigsetjmp() call (built against
> musl-libc) never returns.
> 
> Not siglongjmp, _sigsetjmp_. Which means it's failing somewhere in:
> 
> https://git.musl-libc.org/cgit/musl/tree/src/signal/sh/sigsetjmp.s
> 
> And I dunno how to stick a printf into superh assembly code.

Rather than "stick a printf in there", can you identify (with gdb or
strace or qemu user execution tracing) exactly which instruction it's
crashing at, and the register values at the point of crash?

Provided it was called with a valid pointer to the sigjmp_buf, there
should be no way the initial call to sigsetjmp can segfault. The only
memory accesses it makes are to that object. It does make a call to
setjmp, which in theory could clobber the call-saved r8 containing
sigjmp_buf address, but setjmp does not do that. It's possible that,
on second return, this has been clobbered; even a single-byte buffer
overflow into the sigjmp_buf would do that, and sh may be unique in
having the relevant register at the beginning of the buffer, which
could explain it happening only on sh. But that would affect second
return not the first.

> The sigjmp_buf lives on the stack, but I confirmed it's 8 byte aligned, and 
> not
> even straddling a page boundary. I can access variables I stick before and 
> after
> it, so it can't be some kind of "fault due to guard page" weirdness? (I 
> suppose
> the optimizer may be invalidating that test, I could try adding "volatile"...)
> 
> While debugging I made the problem GO AWAY more than once by sticking 
> printfs()
> and similar into the code, but that's not FIXING it. Adding another sigjmp_buf
> declaration and call to sigsetjmp() right at the start of the function works
> fine (although the other one in the place it's in now still fails). I 
> confirmed

This all suggests that there's a buffer overflow and shuffling things
around on the stack is preventing it. Have you tried running (even on
unaffected archs) under valgrind to look for such errors?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] The cut -C test is failing because bionic's wcwidth() doesn't match glibc or musl.

2021-10-28 Thread Rich Felker
On Thu, Oct 28, 2021 at 02:41:56AM -0500, Rob Landley wrote:
> On 10/27/21 7:27 PM, enh wrote:
> > and storage shouldn't be an issue
> > either (Walmart is currently claiming to have a 2TB usb3 stick on sale 
> > for $12,
> > and amazon says a 1TB sd card is $30. Can't vouch for either, but I 
> > bought a
> > 512G sd card at Target that works fine so far?)
> > 
> > and i'd strongly recommend against doing anything on an sd card --- the few
> > times i've thought i've killed my rpi400 have actually been asking it to do
> > something i/o intensive on an sd card. most of the time you don't notice 
> > they're
> > awful, but try something "medium heavy" like checking out the linux kernel 
> > onto
> > an sd card and, well, it isn't pretty...
> 
> Years ago Rich Felker found a bug in the raspberry pi sdcard driver (some kind
> of missing locking I think?) that would occasionally just STOMP the 
> transaction
> and lobotomize the card. I have no idea if it ever got fixed.
> 
> Rich: did that fix ever go upstream?

It wasn't in the pi driver. It was in the high level SPI subsystem
logic. And it was fixed around the same time I found it. (IIRC the
subsystem maintainer wrote the fix once the issue was explained and so
it went upstream immediately.)

> (Pi had a terrible habit of eating sd cards which we made darn sure Turtle 
> would
> not. We were using a pi as a VOIP server until it ate its SD card.)

There are still other reasons both can eat SD cards. Most shouldn't
happen without async loss of power. However one related issue (affects
my laptop's eMMC) I am aware of is that the kernel is overly strict
with the MMC-advertised timeout limits, and measures time incorrectly
(at next scheduling not immediately), such that a device that
advertises a 1ms timeout and responds in (a wrongly measured) 1.001ms
will be treated as having a fatal error, resulting in data loss.
There's a quirk flag to work around this on "known bad" devices, but
it should just be always on, because the issue is not device-specific
but a kernel logic bug.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] The cut -C test is failing because bionic's wcwidth() doesn't match glibc or musl.

2021-10-20 Thread Rich Felker
On Wed, Oct 20, 2021 at 01:56:21AM -0500, Rob Landley wrote:
> Those tables are just under 2k each, and could SERIOUSLY be compressed. For
> example, the first 512 bytes of the first table are all "16" except for a 
> sparse
> set of sequential values running from 18 to 59, so it could be initialized 
> from
> a bitmask. Let's see, quick and dirty bash to generate said bitmask from the 
> table:

If you know a compression sceheme that admits accessing them
efficiently in-place, that would be great. Expanding them into memory
is a big net loss. You'd be trading shareable rodata for per-process
dirty memory plus decompression code (text) that's comparable in size
if not larger than the uncompressed data.

ALWAYS prefer static const [] tables over runtime generation except
possibly in microcontroller context where the usual ROM vs RAM cost
analysis may not apply.

> Not sure if I _should_, but I _can_. (It was nice to leave this to libc. Then 
> it
> wasn't my problem to update it every time Microsoft wrote another check to the
> unicode committee. Both glibc and musl can do this when statically linked. 
> Sigh.)

I think it's better not to duplicate this information in more places
that become inconsistent if you don't have to. In theory glibc users
might even have locales where the ambiguous-width characters are
treated as wide -- I'm not sure if anyone does this anymore but it was
legacy CJK practice in some locales and honoring that is the polite
thing to do.

> P.S. Rich's other table has some 17s mixed in the 16s which... I think it 
> moves
> in runs of 8? Very small bitmap if so? It would be so much easier to work out
> the alignment if he'd wordwrapped his tables to a consistent number of entries
> per line, but no. Eh, runs of 4, 54 bits total. Plus two isolated weirdos.) 
> And

The tables are generated and the generator aims to format the output
such that diffs are small and readable when the output is checked in.
It wraps both at column overflow and at fixed power of two indices.

If you'd like to see it, the code (horribly ugly; this does not matter
because it's not something to be deployed anywhere) is at:
https://github.com/richfelker/musl-chartable-tools

> most of the nonzero values in the latter part are 255, so traverse a bitmap of
> THOSE and there's not much left to initialize afterwards. Yeah, a dozen lines
> per table of initialization is looking doable, within range of sticking it in
> portability.c...

I'm pretty sure you'll find this larger than the tables.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] FYI musl's support horizon.

2021-08-27 Thread Rich Felker
On Fri, Aug 27, 2021 at 10:21:51AM -0500, Rob Landley wrote:
> On 8/27/21 8:56 AM, Rich Felker wrote:
> > On Thu, Aug 26, 2021 at 03:56:47PM -0700, enh wrote:
> >> On Thu, Aug 26, 2021 at 2:34 PM Rob Landley  wrote:
> > 
> > Rob asked me for some input on this:
> > 
> >> > In my private emails somebody is trying to make the last aboriginal linux
> >> > release work and the old busybox isn't building anymore because makedev()
> >> > used
> >> > to be in #include  and now it's moved to .
> >> > (Why? I
> >> > dunno. Third base.)
> > 
> > This was glibc bug #19239, and clearly needed to be fixed. Generic
> > lowercase macro names like "major" and "minor" are *extreme* namespace
> > pollution for a standard header to be doing.
> 
> I stared at that a bit before figuring out you were objecting because they're
> MACROS. (Generic words being FUNCTION names from headers is common: stdio.h
> defines remove(), rename(), rewind()... stdlib.h has random(), free(), 
> abort(),
> exit()...)

Because those names are defined as being there by the C language and
expected to be there by anyone writing programs in C. Indeed, it makes
it far *worse* that major and minor were macros, since they break
things in more contexts, but if they were functions it would still be
inappropriate for sys/types.h to be exposing them.

> > sys/sysmacros.h was
> > always the correct way to get these macros on glibc, but for whatever
> > reason (compatibility with some ancient historic unices?) glibc was
> > making sys/types.h include sys/sysmacros.h implicitly, at least under
> > some profiles.
> 
> The problem is the man page said to #include  to get it, so 
> that's
> what people did.

That's unfortunate. There's been a lot of wrong stuff in man pages
but Michael is very good about taking reports and getting it
corrected.

> And even today the man page doesn't particularly call them out
> as macros, it gives them function prototypes and even calls them functions:
> 
>   The  major()  and  minor() functions perform the converse task: given a
>   device ID, they return, respectively, the major and  minor  components.

Yes. Historically they were always macros, but glibc made the macros
wrap functions because they did something more elaborate with munging
the bits to fit major and minor numbers larger than 8 bits, splitting
the extra ones across the high 48 bits...

> I thought it was like getc() being officially undecided about whether or not
> it's a macro. (Could be either way.)

Generally it's assumed that the standard headers may (at least in C,
and for standard functions) additionally define a macro anywhere a
function is specified as long as it doesn't evaluate arguments the
wrong number of times. Where the spec explicitly mentions a macro,
that means there's no obligation to avoid multiple evaluation. However
there are good reasons not to do this (breaking C++ callers, and when
the name is not a standard function, just breaking programs that don't
expect it to be there and which use the name for something else
locally.

> > What made this even worse was that, under BSD or GNU profile
> > (including by default), glibc's stdlib.h included sys/types.h. This
> > meant anything using stdlib.h got the namespace breakage.
> 
> Is it "breakage" if it's the standard for Linux and has been for years?

Yes.

> I had a function called throw() in a turbo C++ program I wrote for DOS back in
> college and a compiler upgrade turned that into a keyword and broke my stuff
> because they'd decided to gratuitously add stuff to the language AFTER I wrote
> my code. If it had been "compiling it on Linux behaves differently" I wouldn't
> have minded so much, it was a version upgrade of the same THING breaking stuff
> that bothered me.

That's why we're careful to honor the implementation's obligations on
namespace (fully in plain C or POSIX profile, and modulo well-known
extension functions in default/BSD profile). Avoiding this kind of
breakage requires both sides (implementation and application) to honor
their obligations though. If either doesn't, upgrades can break stuff.

> > In commit f552c792c7ce5a560f214e1104d93ee5b0833967 (2011) musl
> > emulated this glibc behavior in _GNU_SOURCE profile, but commit
> > a31a30a0076c284133c0f4dfa32b8b37883ac930 (2019) removed it since glibc
> > had already fixed the problem and pushed applications still depending
> > on the wrong behavior to fix their stuff.
> 
> So sys/types.h couldn't have prototyped the functions (inside #ifndef if
> necessary) and sys/macros.h couldn't have defined faster macro vesions then, 
> and
> you get whichever one you #include first?
> 
> Eh, wate

Re: [Toybox] FYI musl's support horizon.

2021-08-27 Thread Rich Felker
On Thu, Aug 26, 2021 at 03:56:47PM -0700, enh wrote:
> On Thu, Aug 26, 2021 at 2:34 PM Rob Landley  wrote:

Rob asked me for some input on this:

> > In my private emails somebody is trying to make the last aboriginal linux
> > release work and the old busybox isn't building anymore because makedev()
> > used
> > to be in #include  and now it's moved to .
> > (Why? I
> > dunno. Third base.)

This was glibc bug #19239, and clearly needed to be fixed. Generic
lowercase macro names like "major" and "minor" are *extreme* namespace
pollution for a standard header to be doing. sys/sysmacros.h was
always the correct way to get these macros on glibc, but for whatever
reason (compatibility with some ancient historic unices?) glibc was
making sys/types.h include sys/sysmacros.h implicitly, at least under
some profiles.

What made this even worse was that, under BSD or GNU profile
(including by default), glibc's stdlib.h included sys/types.h. This
meant anything using stdlib.h got the namespace breakage.

In commit f552c792c7ce5a560f214e1104d93ee5b0833967 (2011) musl
emulated this glibc behavior in _GNU_SOURCE profile, but commit
a31a30a0076c284133c0f4dfa32b8b37883ac930 (2019) removed it since glibc
had already fixed the problem and pushed applications still depending
on the wrong behavior to fix their stuff.

> the pain of dealing with that pointless deckchair crap with every glibc
> update is one reason why (a) i've vowed never to do that kind of thing
> again in bionic [we were guilty of the same crime in the past, even me
> personally; the most common example being transitive includes] and (b) i'm
> hoping musl will care a bit more about not breaking source compatibility
>  but realize he's a bit screwed because code expecting glibc might come
> to rely on the assumption that  *doesn't* contain makedev(),
> say --- i've had to deal with that kind of mess myself too. sometimes you
> can't win.

musl generally does not break source compatibility with sources that
are using things in the documented way. Things that are making
assumptions that are not part of the specified (for standard
interfaces; or "underspecified" for the extension ones, something we
hope to improve with better documentation as time progresses) behavior
can and will break now and then. A big part of the mission and success
of musl, and the systems integrators/distributions using musl, has
been getting stuff like that fixed in a huge number of upstream
packages.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


[Toybox] Junk printed on exit from top

2020-08-28 Thread Rich Felker
With toybox 0.8.3 on J2 I'm getting some junk printed along with the
final escape sequence when exiting top. strace shows:

writev(1, [{"\0\0\0\1\0\0\0\1\24\307\252\364\24\274!`\24\277\320\220\0\0 
\0H\33[K", 28}, {NULL, 0}], 2) = 28

I haven't tracked down the cause. I did look for relevant commits
since then that migth have fixed it but didn't see any so I haven't
retested with latest.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Android binaries?

2020-08-24 Thread Rich Felker
On Mon, Aug 24, 2020 at 10:03:11AM -0700, enh wrote:
> On Mon, Aug 24, 2020 at 9:53 AM Rich Felker  wrote:
> >
> > musl supports running on kernels back to 2.6.0, and historically uses
> > the earliest/simplest syscall that can provide the needed
> > functionality for the function being called. However, often it
> > simplifies code to have older functions implemented in terms of newer
> > more general ones, and in that case, order usually gets switched
> > around so that the newer syscall is tried first and the old one is
> > only used as a fallback if (1) the new one fails with ENOSYS, and (2)
> > arguments are such that the old syscall is suitable. The next release
> > is also moving all the socket functions from the multiplexed
> > socketcall to the newer individual syscalls, with socketcall as a
> > fallback, not for implementation ease reasons but because the separate
> > syscalls are the only ones that are filterable with seccomp.
> >
> 
> fwiw, i don't think that's actually true. afaik Android and ChromeOS both
> do this. looks like there's a CTS test too:
> 
> # socketcall: call=={SYS_CONNECT,SYS_SOCKET,SYS_GETSOCKOPT}
> socketcall: arg0 == 1 || arg0 == 3 || arg0 == 15; return EPERM
> 
> i thought the x86 non-socketcall socket stuff was 3.15 anyway? iirc that's
> why we're still using socketcall --- the only x86 device we shipped
> ourselves (Nexus Player) had a kernel that was too old.

The arguments to socketcall are all in a buffer in memory, which can
be changed asynchronously by another task with access to the memory
(i.e. any other thread in the process). Thus there is no security
property obtained by performing filtering on them; at best you can
prevent casual unintended behavior.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Android binaries?

2020-08-24 Thread Rich Felker
On Sun, Aug 23, 2020 at 11:54:12PM -0500, Rob Landley wrote:
> On 8/21/20 6:36 PM, enh wrote:
> > I'm writing a "reporting bugs" FAQ entry because of the recent github 
> > thread.
> > 
> > I've also had a todo item to salvage todo entries I wrote for busybox 
> > forever
> > ago, especially since the busybox devs crapped all over the current 
> > versions.
> > For example,
> > 
> > https://git.busybox.net/busybox/tree/docs/busybox.net/FAQ.html?h=95718b309169#n361
> > used to be project-agnostic (usable advice no matter what open source 
> > project
> > you were talking about), but in current 
> > https://busybox.net/FAQ.html#backporting
> > they've inserted a large digression in the middle about configuring 
> > busybox
> > source from the command line to make sure it doesn't apply to any other 
> > project
> > and can't be generally referenced as advice by other projects.
> > 
> > But anyway, the advice was "try to reproduce the bug on a current 
> > version before
> > poking the developers because there's a nonzero chance we already fixed 
> > it", and
> > for linux toybox I can point them at
> > https://landley.net/toybox/downloads/binaries for current versions 
> > (even if they
> > don't want to build it from source for their target)... but those are 
> > linked
> > against musl?
> > 
> > To check if it's been fixed on _android_, they need a bionic version. 
> > (I mean
> > the musl versions will run but all sorts of subtle behavior's 
> > different.) 
> > 
> > yeah, i don't know --- Android's seccomp system call filter is pretty 
> > narrow and
> > doesn't cover all of the "legacy" system calls that musl probably uses. i
> > suspect arm32 musl doesn't work at all, but arm64 musl might mostly work?
> 
> Rich is fastidious about only using new syscalls when he can get away with it,
> and keeping the set of used syscalls to a minimum. I'd have to ask him what 
> the
> actual set he's using is though.
> 
> But mostly I was thinking that when trying to reproduce a bug, swapping out 
> the
> libc probably shouldn't be part of the standard reproduction sequence. Even 
> when
> the bug isn't in libc, what will/won't trigger it can change.

musl supports running on kernels back to 2.6.0, and historically uses
the earliest/simplest syscall that can provide the needed
functionality for the function being called. However, often it
simplifies code to have older functions implemented in terms of newer
more general ones, and in that case, order usually gets switched
around so that the newer syscall is tried first and the old one is
only used as a fallback if (1) the new one fails with ENOSYS, and (2)
arguments are such that the old syscall is suitable. The next release
is also moving all the socket functions from the multiplexed
socketcall to the newer individual syscalls, with socketcall as a
fallback, not for implementation ease reasons but because the separate
syscalls are the only ones that are filterable with seccomp.

Aside from that, the syscalls used by a particular function are an
implementation detail that can and will change from version to
version, and sometimes these changes are out of necessity to fix a
bug. For example, it often comes up that newly realized
requirements/corner-cases for async-signal-safety, locking corner
cases, fork-related issues, etc. impose a requirement that a function
block signals during some critical section. Another case where this
happens is where it's discovered that there are kernel API bugs or
corner cases the kernel does not handle that need special treatment in
userspace, that require additional syscalls.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH 1/4] Implement hostname lookups in display_routes

2020-06-03 Thread Rich Felker
On Wed, Jun 03, 2020 at 08:49:47AM -0500, Rob Landley wrote:
> On 6/2/20 12:53 PM, Rich Felker wrote:
> > On Tue, Jun 02, 2020 at 12:42:41PM -0500, Rob Landley wrote:
> >> On 6/2/20 12:25 PM, Rich Felker wrote:
> >>> This likely needs a new __UAPI_* macro to suppress it. It's likely
> >>> that nobody has hit it because sys/sysinfo.h is a pretty obscure
> >>> header and it's usually not included anywhere except uptime(1), etc.
> >>> Having it in lib/portability.h, rather than just in the files that
> >>> need it, is probably not a good idea.
> >>
> >> It built fine with glibc, which is why it got committed unnoticed.
> >>
> >>> I'm not sure how kernel folks will want to fix this, if at all. Once
> >>> we get an idea of that I can include a patch in mcm for the kernel
> >>> header that matches what upstream is doing.
> >>
> >> Again, builds fine with glibc. I commited a workaround for the musl bug.
> > 
> > This is not a musl bug. I'm really really REALLY tired of you calling
> > things musl bugs when they obviously are not.
> 
> Sorry. I'm _really_ stressed right now.

Apology accepted.

> For context, I'm frustrated that musl-cross-make seems to be reproducing the
> "forked kernel header" era that Maszur did 15 years ago. Remember when Linux
> From Scratch tried to put together a FAQ about it?

This very intentionally is not what I'm doing. Use of the sabotage
linux-headers package is merely an optimization, and intended to be
functionally equivalent to the upstream headers. If there are actually
functional changes between them please let me know so I can try to get
it resolved. Because I really really don't want to go that direction.

> That was the context within which I hit _this_ problem, which is another 
> aspect
> of musl's unique refusal to #include kernel headers. Both glibc and bionic do 
> so
> here, bionic's sys/sysinfo.h has:
> 
>   #include 
>   #include 
> 
> I.E. This ONLY breaks on musl. My undestanding of your position on the issue 
> is
> "that's not a bug, that's a feature". Which is another way of saying it's a
> choice you made that broke stuff.

musl does not use kernel headers. It does not require them to build.
This is a standard and completely reasonable policy. It was also
supposed to be glibc policy from the early days (one of the big
improvements over libc5) but they kept randomly breaking it and
locking themselves into instances of breaking it.

In the specific case of sysinfo, we *can't* use the kernel header
because the type does not match. Not for x32 anyway. At the time x32
was created, glibc made the (possibly unconscious) decision to ignore
existing specifications of types, in both standard and linux-specific
interfaces, and replace long with __syscall_slong_t, etc. to match the
x86_64 types. struct sysinfo is such an interface. Note that if you
type "man sysinfo", you get:

struct sysinfo {
long uptime; /* Seconds since boot */
...

This means applications can write:

struct sysinfo si;
sysinfo();
printf("%ld", si.uptime);

But on glibc x32 which uses the x86_64 definition of sysinfo, this is
a type mismatch. (In practice it might not break with printf anyway
but it will break hard in other examples.)

This kind of x32-specific breakage was probably (maybe a small) part
of why nobody ever used x32 anyway. But on musl the work was done to
make it actually work right in hopes that x32 would be useful. The
musl struct sysinfo in sys/sysinfo.h is the documented userspace API,
the sysinfo() function fills it in to match, and any program that's
using the documented API works just as well on x32 as elsewhere.

> So I added a workaround for another unique idiosyncrasy in a project that made
> the also unique choice not to #define _MUSL_.

I don't get why you're repeatedly bothered by the fact that musl did
not do everything the same way as glibc. If we had done so there would
be no point in having musl. The whole premise is that there were a lot
of things in glibc that could/should have done better. Some of them
are fixable and glibc gradually adopts improved approaches from musl
where they are, and where there's consensus that the way musl is doing
it is better. In other instances, they remain different, and *that's
ok*.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH 1/4] Implement hostname lookups in display_routes

2020-06-02 Thread Rich Felker
On Tue, Jun 02, 2020 at 12:42:41PM -0500, Rob Landley wrote:
> On 6/2/20 12:25 PM, Rich Felker wrote:
> > This likely needs a new __UAPI_* macro to suppress it. It's likely
> > that nobody has hit it because sys/sysinfo.h is a pretty obscure
> > header and it's usually not included anywhere except uptime(1), etc.
> > Having it in lib/portability.h, rather than just in the files that
> > need it, is probably not a good idea.
> 
> It built fine with glibc, which is why it got committed unnoticed.

To clarify here, I was talking about why the kernel header clash went
unnotice for so long in the entire ecosysten, not in toybox. There's
never been a report of this before, and it's likely because there are
only 6 kernel headers which include the offending linux/kernel.h, and
none of them are things that would typically be included in the same
file using sysinfo(). This is among tens of thousands of packages.

That's not to say it's wrong to include both linux/netlink.h and
sys/sysinfo.h, just an explanation of how the bug survived so long.

> > I'm not sure how kernel folks will want to fix this, if at all. Once
> > we get an idea of that I can include a patch in mcm for the kernel
> > header that matches what upstream is doing.
> 
> Again, builds fine with glibc. I commited a workaround for the musl bug.

I checked out your workaround and it should work fine at present and
for the forseeable future. I'd still like to get this fixed properly,
but that will be a discussion with kernel folks.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH 1/4] Implement hostname lookups in display_routes

2020-06-02 Thread Rich Felker
On Tue, Jun 02, 2020 at 12:42:41PM -0500, Rob Landley wrote:
> On 6/2/20 12:25 PM, Rich Felker wrote:
> > This likely needs a new __UAPI_* macro to suppress it. It's likely
> > that nobody has hit it because sys/sysinfo.h is a pretty obscure
> > header and it's usually not included anywhere except uptime(1), etc.
> > Having it in lib/portability.h, rather than just in the files that
> > need it, is probably not a good idea.
> 
> It built fine with glibc, which is why it got committed unnoticed.
> 
> > I'm not sure how kernel folks will want to fix this, if at all. Once
> > we get an idea of that I can include a patch in mcm for the kernel
> > header that matches what upstream is doing.
> 
> Again, builds fine with glibc. I commited a workaround for the musl bug.

This is not a musl bug. I'm really really REALLY tired of you calling
things musl bugs when they obviously are not.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH 1/4] Implement hostname lookups in display_routes

2020-06-02 Thread Rich Felker
On Tue, Jun 02, 2020 at 10:56:37AM -0500, Rob Landley wrote:
> On 6/1/20 10:56 AM, Eric Molitor wrote:
> > ---
> >  toys/pending/route.c | 46 +++-
> >  1 file changed, 33 insertions(+), 13 deletions(-)
> 
> FYI route.c doesn't build with musl-cross-make because:
> 
> ~/ccc/sh4-linux-musl-cross/sh4-linux-musl/include/sys/sysinfo.h:10:8: error:
> redefinition of 'struct sysinfo'
>10 | struct sysinfo {
>   |^~~
> In file included from
> ~/ccc/sh4-linux-musl-cross/sh4-linux-musl/include/linux/kernel.h:5,
>  from
> ~/ccc/sh4-linux-musl-cross/sh4-linux-musl/include/linux/netlink.h:5,
>  from
> ~/ccc/sh4-linux-musl-cross/sh4-linux-musl/include/linux/rtnetlink.h:6,
>  from toys/pending/route.c:50:
> ~/ccc/sh4-linux-musl-cross/sh4-linux-musl/include/linux/sysinfo.h:8:8: note:
> originally defined here
> 
> Builds find with glibc, but not with musl.
> 
> I'm pretty sure this is because glibc does this in sys/sinfo.h:
> 
> /* Get sysinfo structure from kernel header.  */
> #include 
> 
> And if you DON'T do that, including linux/rtnetlink.h along with sys/sysinfo.h
> breaks the build.

This likely needs a new __UAPI_* macro to suppress it. It's likely
that nobody has hit it because sys/sysinfo.h is a pretty obscure
header and it's usually not included anywhere except uptime(1), etc.
Having it in lib/portability.h, rather than just in the files that
need it, is probably not a good idea.

With that said it *also* looks wrong that linux/netlink.h is including
linux/kernel.h, when the only purpose of the latter for userspace
seems to be serving as a legacy name for linux/sysinfo.h (to make
glibc happy). linux/netlink.h is doing this to get a definition for
__ALIGN_KERNEL which it uses in excactly one place:

#define NL_MMAP_MSG_ALIGN(sz)   __ALIGN_KERNEL(sz, 
NL_MMAP_MSG_ALIGNMENT)

Indeed the inclusion was introduced in commit
ccdfcc398594ddf3f77348c5a10938dbe9efefbe, without any regard for the
namespace pollution.

I'm not sure how kernel folks will want to fix this, if at all. Once
we get an idea of that I can include a patch in mcm for the kernel
header that matches what upstream is doing.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] The last working musl-cross-make version is commit aacd84ed2cd7.

2020-03-05 Thread Rich Felker
On Wed, Mar 04, 2020 at 07:31:17PM -0600, Rob Landley wrote:
> You may notice toybox's scripts/mcm-buildall.sh running against recent 
> versions
> of https://github.com/richfelker/musl-cross-make hasn't been producing working
> m68k or s390x compilers (no kernel headers were installed).
> 
> I bisected it to commit 38e52db8358c where Rich resurrected Mariusz Mazur's
> out-of-tree kernel headers package from 2005 (ala
> https://landley.net/notes-2005.html#17-03-2005) from before "make

That's not what these are.

> headers_install" was invented, and decided doing that _again_ was a good idea.

Because downloading 100 MB of kernel source and extracting it to a far
larger tree just to get the headers isn't really fun.

The linux-headers alternative has been supported for a long time; it
just wasn't default because the default had been untouched for a long
time. When it became necessary to bump the default (because old
headers are incompatible with musl 1.2) I switched to it.

> The commit before that builds them both fine.

If this is the problem, you can switch to using the full kernel
sources via LINUX_VER=4.19.90. I'd like to debug the root cause and
get it fixed, but unless there's a quick fix I'll just change the
default to LINUX_VER=4.19.90 for now.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


[Toybox] [PATCH] sh.c: fix memory corruption due to uninitialized sh_function in sh_run

2020-01-11 Thread Rich Felker
This code only happened to work due to being started on a pristine
zero-filled stack. With dynamic linking on i386/musl, I got:

Program received signal SIGSEGV, Segmentation fault.
0x08050fd1 in dlist_add_nomalloc (list=0xd9f0, new=0x80afef0) at 
lib/llist.c:90
90  (*list)->prev->next = new;
(gdb) bt
#0  0x08050fd1 in dlist_add_nomalloc (list=0xd9f0, new=0x80afef0) at 
lib/llist.c:90
#1  0x0806f913 in parse_line (line=line@entry=0x80a3aa0 "cd .", 
sp=sp@entry=0xd9ec) at toys/pending/sh.c:1142
#2  0x0807166d in sh_run (new=0x80a3aa0 "cd .") at toys/pending/sh.c:1693
#3  sh_main () at toys/pending/sh.c:1841
#4  0x080542ee in toy_exec_which (which=, argv=0xdba8) at 
main.c:175
#5  0x08054317 in toy_exec (argv=0xdba8) at main.c:182
#6  0x0805433f in toybox_main () at main.c:196
#7  0x080542ee in toy_exec_which (which=, argv=0xdba4) at 
main.c:175
#8  0x08054317 in toy_exec (argv=0xdba4) at main.c:182
#9  0x0805433f in toybox_main () at main.c:196
#10 0x0804c68f in main (argc=2, argv=0xdba4) at main.c:260
(gdb) print list
$1 = (struct double_list **) 0xd9f0
(gdb) print *list
$2 = (struct double_list *) 0xf7ffb1e0 <__stdout_FILE>

This was just what happened to be on the stack from before sh_run was
called, likely leftover from dynamic linker startup. With the attached
patch, it now runs without crashing.

Rich
>From 555871688b88dafd6d5b6f80bba7af01acfb8a7c Mon Sep 17 00:00:00 2001
From: Rich Felker 
Date: Sat, 11 Jan 2020 20:16:48 -0500
Subject: [PATCH] sh.c: fix memory corruption due to uninitialized sh_function
 in sh_run

---
 toys/pending/sh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/toys/pending/sh.c b/toys/pending/sh.c
index 482639da..32e33507 100644
--- a/toys/pending/sh.c
+++ b/toys/pending/sh.c
@@ -1685,7 +1685,7 @@ dprintf(2, "TODO skipped running for((;;)), need math 
parser\n");
 // Parse and run a self-contained command line with no prompt/continuation
 static int sh_run(char *new)
 {
-  struct sh_function scratch;
+  struct sh_function scratch = { 0 };
   int rc;
 
 // TODO: parse with len? (End early?)
-- 
2.21.0

___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


[Toybox] [PATCH] find.c: don't assume time_t is the same as long

2020-01-11 Thread Rich Felker
I did this with a cast to long long rather than intmax_t since I think
you prefer long long, but either should be fine. This incurs a slight
binary size cost on 32-bit targets with 32-bit time_t (glibc, uclibc,
old musl) that could be eliminated via a conditional of the form:

if (sizeof(time_t)>sizeof(long))
/* print by casting to long long */
else
/* print by casting to long */

but I would prefer not uglifying the code over saving a couple bytes
of binary.

Rich
>From 3751201107ee8b34f3154b974adc6d5d4ae333ba Mon Sep 17 00:00:00 2001
From: Rich Felker 
Date: Sat, 11 Jan 2020 11:07:35 -0500
Subject: [PATCH] find.c: don't assume time_t is the same as long

This is false on ILP32 targets with 64-bit time_t.
---
 toys/posix/find.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/toys/posix/find.c b/toys/posix/find.c
index 8a837385..3fb97b1c 100644
--- a/toys/posix/find.c
+++ b/toys/posix/find.c
@@ -636,7 +636,7 @@ static int do_find(struct dirtree *new)
   } else if (ch == 'p') ll = (long)(ff = dirtree_path(new, 0));
   else if (ch == 'T') {
 if (*++fmt!='@') error_exit("bad -printf %%T: %%T%c", *fmt);
-sprintf(buf, "%ld.%ld", new->st.st_mtim.tv_sec,
+sprintf(buf, "%lld.%ld", (long long)new->st.st_mtim.tv_sec,
  new->st.st_mtim.tv_nsec);
 ll = (long)buf;
   } else if (ch == 'Z') {
-- 
2.21.0

___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-07 Thread Rich Felker
On Mon, May 06, 2019 at 10:57:23PM -0700, enh wrote:
> > I know not everything is a macro that can be tested; both myself and
> > folks from other implementations are interested in developing an
> > agreed-upon way to report availability of other extensions via macros,
> > so that configure-style compile/link tests are not needed. This would
> > also facilitate reporting of some things that can't even properly be
> > tested because they would need runtime tests. Unfortunately efforts in
> > this area are stalled from lack of interest. If you or others would
> > like to raise a fuss about making it happen, I'd welcome it.
> 
> wouldn't it be better to have more stuff along the lines of clang's
> __has_builtin/__has_feature/__has_include? they're already very useful
> for getting rid of this kind of stuff, and a __has_function would
> cover most of what's missing.

It probably doesn't, and it's hard to figure out who would be
responsible for providing them. Are you suggesting the compiler would
do it automatically based on presence of a declaration? There's
already an evil hack to check for presence of a declaration and branch
on the result using _Generic, but of course that's after the
preprocessor level. OTOH the preprocessor doesn't and can't have
access to the presence or absence of declarations since everything is
just tokens at that stage.

I think it would be a lot simpler, more flexible, and compatible with
existing compilers (probably a requirement on our side) to simply have
the library define macros for the individual extensions or extension
groups it provides -- basically the same thing as the existing POSIX
macros in unistd.h, but for extensions outside of POSIX. One major
advantage here is that we can also report extensions that are not in
the form of symbols or flags, but some other type of interface -- for
example, support for %m in the printf family.

As a side benefit, this would probably require us to look at,
enumerate, and better document the extensions we support. :-)

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-07 Thread Rich Felker
On Tue, May 07, 2019 at 05:27:24PM +0300, makep...@firemail.cc wrote:
> 
> > wouldn't it be better to have more stuff along the lines of clang's
> > __has_builtin/__has_feature/__has_include? they're already very useful
> > for getting rid of this kind of stuff, and a __has_function would
> > cover most of what's missing.
> 
> Feature claims of clang and musl are giving us a conflict when
> building cmake btw, could anyone look at
> https://github.com/gentoo/musl/issues/141 please? I don't know
> whether to forward this to clang or musl or cmake or some other
> list, thus glad you brought up the topic.

Just replied on the tracker:

  "musl uses a BSD style header layout, where the system (libc) header
  location must come before the compiler's header location in the
  search path; it does not use and is not compatible with the
  compiler-provided versions of the standard headers, which should
  never get included."

Thanks for the heads-up.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rich Felker
On Mon, May 06, 2019 at 07:05:46PM -0500, Rob Landley wrote:
> On 5/6/19 12:48 PM, Rich Felker wrote:
> > On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote:
> >> Huh... I'll assume REG_STARTEND works in bionic since you're pointing me 
> >> at it.
> >> It's not in the regex man page but it _is_ in the glibc headers...
> >>
> >>   https://github.com/bminor/glibc/commit/6fefb4e0b16
> >>
> >> Looks like it went into glibc in 2004, which is way past 7 years. I should 
> >> poke
> >> Michael Kerrisk to update the man page.
> >>
> >> And musl explicitly refused to do it, but Rich makes bad calls all the 
> >> time:
> >>
> >>   https://www.openwall.com/lists/musl/2013/01/15/26
> >>
> >> I still haven't got an #ifdef __MUSL__ in portability because Rich insists 
> >> his
> >> libc is the only perfect piece of software ever written, but I can probe 
> >> for it
> > 
> > You can #ifdef REG_STARTEND and use it conditionally depending on
> > whether the functionality is offered. There is no reason to hard-code
> > an assumption that musl doesn't or does have this functionality;
> 
> I did (https://github.com/landley/toybox/commit/48162c4ee3fb) but it's still a
> musl-specific workaround. Glibc, uClibc, and bionic all adopted this about 15
> years ago, and it came from freebsd in the first place which presumably covers
> macosx (I can't currently check, but somebody would poke me before too long).

Regardless of your opinion on whether or why musl does or doesn't have
this extension, testing for it with #ifdef REG_STARTEND rather than
"testing for musl" is the right approach, because if/when it's added
to musl, you automatically get support for it. If you hard-coded an
assumption that musl doesn't have it, then users are stuck with that
even if it changes.

> Musl is the only thing left that _doesn't_ have this functonality, and that's
> because
  ~~~

I don't think it's particularly "because" of this. Rather, the pros
and cons of adding this extension have been under discussion for a
while, with nobody pushing for further action on the topic.

> you dismissed NUL matching as not a "meaningful use of sed", and called
> it "hideous hacks" and "nobody will notice the difference with it missing". (I
> implemented a regexec0() wrapper to provide NUL matching 3 years before this
> performance issue came up.)

Use on non-text data is outside the scope of the standard sed utility,
and I'm a bit hostile to insistence that the standard utilities be
useful with non-text because of what a disaster Austin Group issue 663
turned into. However this is really not a consideration in whether
REG_STARTEND is included going forward.

> > it's
> > been proposed and there's even a patch somewhere. (It's not costly to
> > support relative to the current bad regex implementation, but the
> > concern is that it might impose a nontrivial cost on a future good
> > implementation once we're locked into having it.)
> 
> We have different project philosophies. I wait for actual users to show up
> before implementing a lot of things, but when they do I don't tend to think I
> know better than they do about what their use cases should be.

If I accepted arbitrary requests for stuff in musl, it would just be a
clone of glibc. As a community we have developed pretty clear and
consistent criteria for what does or doesn't get included. I'm not
aware of any indication that REG_STARTEND shouldn't be included except
possible future runtime cost; it doesn't have any major complexity
cost or incompatibility between different historical implementations
AFAIK.

> $ echo -e 'hello\0blah' | ./sed 's/blah/boing/' | hd
>   68 65 6c 6c 6f 00 62 6f  69 6e 67 0a  |hello.boing.|
> 000c
> 
> The freebsd man page from last time
> (https://www.freebsd.org/cgi/man.cgi?query=regex=3=freebsd-release-ports)
> says that REG_PEND is explicitly for specifying the length of a string
> containing NULs (without moving the starting point like REG_STARTEND can), so
> the plumbing in between already has to support embedded NULs. The man page 
> could
> be clearer, hopefully Michael Kerrisk's will be.
> 
> > whether it treats the start/end as *additional* constraints,
> 
> Not in the systems I've been able to test. (I need to get my freebsd test 
> image
> set back up on the new machine...)
> 
> (And hey, congrats on musl not doing the strlen() before matching without 
> this.
> You don't have the N^2 performance issue glibc and bionic did, and thus don't
> need that reason this patch went in. But I didn't keep the previous "implement
> ma

Re: [Toybox] pathological case in sed s///g

2019-05-06 Thread Rich Felker
On Mon, May 06, 2019 at 12:42:44PM -0500, Rob Landley wrote:
> Huh... I'll assume REG_STARTEND works in bionic since you're pointing me at 
> it.
> It's not in the regex man page but it _is_ in the glibc headers...
> 
>   https://github.com/bminor/glibc/commit/6fefb4e0b16
> 
> Looks like it went into glibc in 2004, which is way past 7 years. I should 
> poke
> Michael Kerrisk to update the man page.
> 
> And musl explicitly refused to do it, but Rich makes bad calls all the time:
> 
>   https://www.openwall.com/lists/musl/2013/01/15/26
> 
> I still haven't got an #ifdef __MUSL__ in portability because Rich insists his
> libc is the only perfect piece of software ever written, but I can probe for 
> it

You can #ifdef REG_STARTEND and use it conditionally depending on
whether the functionality is offered. There is no reason to hard-code
an assumption that musl doesn't or does have this functionality; it's
been proposed and there's even a patch somewhere. (It's not costly to
support relative to the current bad regex implementation, but the
concern is that it might impose a nontrivial cost on a future good
implementation once we're locked into having it.)

> If this _does_ match NUL bytes it lets me remove regexec0() entirely from 
> libc,
> which would be very nice. And since it started life as a BSD extension 
> (they're
> the only man page _documenting_ it) I get support on FreeBSD and probably 
> MacOS too.

I'm not sure if the proposed patch supports matching NUL or not (i.e.
whether it treats the start/end as *additional* constraints, to work
with a substring of a necessarily null-terminated string, or whether
they replace the null-termination criterion. If we do adopt it we
should ensure we do whatever other implementations do here, especially
if that's important to use cases you or others want.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] date.test: add some tests.

2018-10-06 Thread Rich Felker
On Sat, Oct 06, 2018 at 01:20:06PM -0500, Rob Landley wrote:
> On 10/05/2018 05:18 PM, enh wrote:
> > These are reasonable examples I found in AOSP.
> > +
> > +# Test embedded TZ to take a date in one time zone and display it in 
> > another.
> > +testing "TZ=" "TZ='America/Los_Angeles' date -d 'TZ=\"Europe/London\"
> > 2018-10-04 08:00'" "Thu Oct  4 00:00:00 PDT 2018\n" "" ""
> > +testing "TZ= @" "TZ='America/Los_Angeles' date -d 'TZ=\"GMT\"
> > @1533427200'" "Sat Aug  4 17:00:00 PDT 2018\n" "" ""
> 
> Hmmm, slight concern that I'm not sure how many timezones we should expect to 
> be
> installed in a test environment.

You can use POSIX TZ strings rather than zoneinfo files for testing if
you don't know what will be installed. The format is:

stdoffset[dst[offset][,start[/time],end[/time]]]

with offsets in the format:

hh[:mm[:ss]]

start and end in the format:

Jn - one-based julian day number not counting leap days
n - zero-based julian day number counting leap days
Mm,n,d - the d'th day (0=Synday) of week n (1-5) of month m (1-12)

Optional /time specifies the hour of the dst change.

For example, current US/Eastern is:

TZ=EST5EDT,M3.2.0/2,M11.1.0/2

Full detail is here:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03

> I'm back poking at mkroot and trying to turn it into a test environment that 
> can
> run the full toybox test suite under qemu, so I can finally have some proper
> tests for mount and insmod and such. It's a musl-based environment, and I 
> dunno
> what musl does about timezones? (Glibc needs horrible generation and files and
> such, and uClibc _copied_glibc's_data_at_install_time_ off the build system,
> which was just crazy from a licensing perspective.)
> 
> Hey Rich: musl timezones? Do I have to do anything special, or what?

Most implementations including glibc and musl (not sure about Bionic)
use the standard zoneinfo format and data from:

https://www.iana.org/time-zones

and, if you don't want to have them installed in the standard system
locations ([/usr]/share/zoneinfo; musl also searches /etc/zoneinfo)
they also support setting TZ to an absolute zoneinfo file pathname as
long as the program was not invoked suid/sgid/setcap.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Does anyone here understand how unicode combining characters work?

2018-09-26 Thread Rich Felker
On Wed, Sep 26, 2018 at 03:42:25PM -0500, Rob Landley wrote:
> On 09/26/2018 03:00 PM, enh wrote:
> > if anyone's interested, here's how bionic translates from the actual
> > unicode properties to implement wcwidth:
> > https://android.googlesource.com/platform/bionic/+/master/libc/bionic/wcwidth.cpp
> > 
> > (we do this in general so that we can outsource all the actual
> > unicodet data to icu4c, and thereby guarantee consistency for
> > C/C++/Java regardless of which API is actually called.)
> 
> I think I've got the answer to my question now. what I needed to know was how
> much I can print before the cursor winds up on the next line (and scrolls the
> screen if it was at the bottom), and the answer is "print combining characters
> _after_ the last character, but stop before the next wcwidth>0 character that
> would overflow the line".

On a decent terminal (google "magic margins"), you can always print
the full width of the terminal, even on the last line, so if the
terminal width is 80, you print until the wcwidth of the next
character would throw the position strictly over 80 (81 or higher).

I'm not sure if there are still any non-magic-margin terminals that
are relevant. If so, and if you don't know what row you're on (e.g.
for shell line editing), you probably just need to stop at 1 column
less than the width to be safe. You could probably hardcode a list of
$TERM values for broken terminals though.

> (This is the logic I've needed to work out for screen, less, and vi as well. 
> At
> least when they're not doing the force escapes thing.)
> 
> The ansi escape parsing is still a todo item, but I note I wrote my own ansi
> escape parsing direct screen memory writer for DOS as one of my first C 
> programs
> back in 1990. :P
> 
> (And tabs. And the other low-ascii stuff that's also handled inconsistently 
> and
> which I might have watch and less and such filter out and just not print to 
> the
> tty. It'd be nice if TERM=linux specified consistent behavior here, but it's
> determined by the terminal display program consuming the output...)

I think most of this stuff is largely Unicode-agnostic, and is just a
matter of understanding classic terminal behavior and the idioms for
dealing with it.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Does anyone here understand how unicode combining characters work?

2018-09-26 Thread Rich Felker
On Wed, Sep 26, 2018 at 12:21:46PM -0700, enh wrote:
> in general ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt is
> pretty useful too. iirc plan9 had a code point lookup tool, but
> honestly i mainly type U+ into Google and end up at
> https://www.fileformat.info/info/unicode/char/2028/index.htm.
> 
> the wcwidth stuff isn't well defined (in that it's not a Unicode
> notion, and is under-specified by POSIX) but Unicode does have the

This is true; it's only defined by convention between implementations
and terminal emulators, and without their agreement, everything
breaks.

> "east asian width" data. see
> ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt for that.
> 
> the Unicode FAQs are often helpful too.
> http://unicode.org/faq/char_combmark.html
> 
> plus the full standard is freely available:
> http://www.unicode.org/versions/Unicode11.0.0/

Generally, implementations agree that characters with East Asian Width
property full or wide are wcwidth==2, and character classes Mn or Mc
(nonspacing or enclosing combining) are wcwidth==0. There are also a
number of class Cf characters that need to be treated as wcwidth==0
for the associated languages to work on a terminal.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Does anyone here understand how unicode combining characters work?

2018-09-26 Thread Rich Felker
On Wed, Sep 26, 2018 at 01:48:03PM -0500, Rob Landley wrote:
> On 09/26/2018 10:28 AM, Rob Landley wrote:
> > The crunch_str() logic is designed to escape nonprintable stuff and for 
> > watch.c
> > I need to write something that measures output but lets utf8 combining stuff
> > happen. (And measures tabs. And also parses at least the color change part 
> > of
> > ansi escapes, but we'll burn that bridge when we come to it...)
> > 
> > Using hexdump and echo -e's hex escapes to try to print minimal bits of the
> > combining character examples (which cut and paste appears to have horked
> > somewhat, but you get the idea):
> >
> >   $ cat tests/files/utf8/test1.txt
> >   l̴̗̞̠ȩ̸̩̥ṱ̴͍̻ ̴̲͜ͅt̷͇̗̮h̵̥͉̝e̴̡̺̼ ̸̤̜͜ŗ̴͓͉i̶͉͓͎t̷̞̝̻u̶̻̫̗a̴̺͎̯l̴͍͜ͅ 
> > ̵̩̲̱c̷̩̟̖o̴̠͍̻m̸͚̬̘ṃ̷̢͜e̵̗͎̫n̸̨̦̖c̷̰̩͎e̴̱̞̗
> >   $ echo -e '\xcc\xb4\xcc\x97\xcc\xa0e'
> >   e
> >   $ echo -e 'l\xcc\xb4\xcc\x97\xcc\xa0e'
> >   l̴̗̠e
> >   $ echo -e '\xcc\xb4\xcc\x97\xcc\xa0ee'
> >   ee
> >   $ echo -e 'l\xcc\xb4\xcc\x97\xcc\xa0'
> >   l̴̗̠
> >   $ echo -e '\xcc\xb4\xcc\x97\xcc\xa0'
> > 
> > So there needs to be a character _before_ the combining characters for them 
> > to
> > take effect, but they apply to the character _after_? Even when it's a 
> > newline?
> > (Which still works as a newline, but leaves trailing weirdness?)

Combining characters (at the terminal, any wcwidth==0 characters since
there is no finer-grained distinction) attach to the
previous/logical-left character cell.

> But if I have just enough characters to fill a line, the trailing weirdness 
> does
> _not_ go to the next line (it appears to get discarded), at least on my 80 
> char
> xfce Terminal:
> 
> echo -e
> 'xx0123456789091234567890123456789012345678901234567890123456789a\xcc\xb4\xcc\x97\xcc\xa0'

What you should see is:

xx0123456789091234567890123456789012345678901234567890123456789a̴̗̠

That is, the combining characters should be visible on the 'a' in the
last cell. I would not be surprised if some terminals get this wrong.

> I should look up what these escape sequences _do_. Hmmm... I could slowly and
> painfully do that by hand, but really I want a sort of unicode version of
> "hexdump -C" telling me what the codepoints are. (Ideally combined with a
> variant of the "ascii" program to then tell me what each one does.) Somebody 
> has
> to have written this already, but I dunno what to Google for. Hmm...
> 
> Hey Rich, I'm fiddling with unicode and lost/confused. Know any good tools 
> for this?

Does something like this help?

#include 
#include 
#include 
#include 
int main()
{
setlocale(LC_CTYPE, "");
wint_t c;
while ((c=getwchar())!=WEOF)
printf("U+%.4X wcwidth=%d\n", c, wcwidth(c));
}

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Stat %Z - What are valid values?

2017-01-25 Thread Rich Felker
On Wed, Jan 25, 2017 at 05:15:37PM -0600, Rob Landley wrote:
> On 01/24/2017 06:10 PM, Rich Felker wrote:
> >>> strptime with %s? I suspect there are some nasty underspecified issues
> >>> with how it interacts with timezones.
> >>
> >> I thought unix time was always UTS and the timezone just affected how it
> >> was displayed?
> > 
> > The problem is that strptime produces a struct tm. When the fields
> > it's parsing to produce this struct tm are "broken down time" fields,
> > strptime does not need to know/care whether the caller is going to
> > interpret the struct tm as UTC or local time; it's just a bunch of
> > numbers. But when strptime is going to take a unix time value (seconds
> > since epoch) and convert it to struct tm, it matters whether the
> > caller is supposed to treat that struct tm as UTC or local.
> 
> According to man 2 time:
> 
>  time()  returns  the  time  as  the  number of seconds since the Epoch,
>  1970-01-01 00:00:00 + (UTC).
> 
> I.E. the definition of unix time is UTC. So %s _not_ being UTC is silly.

Nobody is questioning whether the input read by %s is UTC. The
question is whether the output struct tm is UTC or local.

> My use case (and presumably most people's) is turning a time string into
> a struct timespec. What the actual struct tm fields are is irrelevant to
> that as long as mktime() translates the result back to the right number.

mktime interprets it as local, in which case strptime would have to be
doing the inverse operation, converting from UTC to local for the
struct tm output.

On the other hand strptime output may also be used with timegm (or a
hand-rolled version thereof). This is what you would do if you were
parsing a sting in UTC. With all the existing/standard fields,
strptime is completely agnostic with regards to what you want to do
with the output. Only with %s will it need a hard-coded assumption
about what you're going to do with the output.

> The struct tm has a timezone field in it. As long as strptime sets the
> timezone when it adjusts the hours so mktime() can adjust it _back_, the
> result is the same number you fed into %s. Which is the point of the
> exercise.

mktime cannot use the timezone field of struct tm. It's required to
assume the struct tm is in local time in the current timezone.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Stat %Z - What are valid values?

2017-01-24 Thread Rich Felker
On Tue, Jan 24, 2017 at 05:47:46PM -0600, Rob Landley wrote:
> 
> 
> On 01/21/2017 06:25 PM, Rich Felker wrote:
> > On Sat, Jan 21, 2017 at 03:18:28PM -0600, Rob Landley wrote:
> >>> Or if it's signed, that's -1346458162 which would be... sometime in the
> >>> 1930's? hmmm... "./date -D %s -d -1346458162" is failing under glibc,
> >>> and failing _differently_ under musl. (Wheee.)
> >>>
> >>> /me goes down tangent rathole debugging why.
> >> 
> >>>
> >>> (Answer: musl doesn't implement %s at all, and glibc doesn't allow the
> >>> %s value it converts to be negative.)
> >>
> >> Query: does bionic strptime() handle %s, and if so does it handle
> >> negative input values? (If not I suppose I can try to special case this
> >> in toybox, but ew.)
> >>
> >> Also, Rich: any interest in adding this to musl?
> > 
> > strptime with %s? I suspect there are some nasty underspecified issues
> > with how it interacts with timezones.
> 
> I thought unix time was always UTS and the timezone just affected how it
> was displayed?

The problem is that strptime produces a struct tm. When the fields
it's parsing to produce this struct tm are "broken down time" fields,
strptime does not need to know/care whether the caller is going to
interpret the struct tm as UTC or local time; it's just a bunch of
numbers. But when strptime is going to take a unix time value (seconds
since epoch) and convert it to struct tm, it matters whether the
caller is supposed to treat that struct tm as UTC or local.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Stat %Z - What are valid values?

2017-01-21 Thread Rich Felker
On Sat, Jan 21, 2017 at 03:18:28PM -0600, Rob Landley wrote:
> > Or if it's signed, that's -1346458162 which would be... sometime in the
> > 1930's? hmmm... "./date -D %s -d -1346458162" is failing under glibc,
> > and failing _differently_ under musl. (Wheee.)
> > 
> > /me goes down tangent rathole debugging why.
> 
> > 
> > (Answer: musl doesn't implement %s at all, and glibc doesn't allow the
> > %s value it converts to be negative.)
> 
> Query: does bionic strptime() handle %s, and if so does it handle
> negative input values? (If not I suppose I can try to special case this
> in toybox, but ew.)
> 
> Also, Rich: any interest in adding this to musl?

strptime with %s? I suspect there are some nasty underspecified issues
with how it interacts with timezones. All the standard specifiers work
with broken-down (struct tm) time so timezone is irrelevant to how
they operate.

So the answer isn't no, but "it's complicated", and needs more
research on how other implementations work, if they're consistent,
pros and cons of different possible behaviors, etc.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] imgtec patch: Fix static linkage of toybox binary.

2016-05-06 Thread Rich Felker
On Fri, May 06, 2016 at 10:16:46PM -0700, Evgenii Stepanov wrote:
> Sorry, I did not look at the problem hard enough.
> The real issue is interaction of this code with safestack
> (http://clang.llvm.org/docs/SafeStack.html), which splits the stack in
> 2 disjoint memory regions. If the two variables are allocated on
> different stacks, the comparison result is truly undefined.

No, the calculation is always defined unless it overflows; it's the
difference between two integers. If you're concerned about the
overflow case, which is possibly an issue, use unsigned integers and
cast the result to a signed type after the subtraction.

Of course whether the difference is meaningful for the purpose it's
being used for is another question, and depends on ABI stack
conventions which are affected by safe-stack. But it's not
"undefined".

> I don't really understand what this code is tying to do. Is it
> catching unlimited stack growth? Why does the comment speak about
> heap?

Yes I was confused by that too.

> Maybe we could use __builtin_frame_address(0) instead?

I don't think it's always available or meaningful, especially older
compilers.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] imgtec patch: Fix static linkage of toybox binary.

2016-05-06 Thread Rich Felker
On Fri, May 06, 2016 at 10:52:21PM -0500, Rob Landley wrote:
> On 05/06/2016 08:38 PM, enh wrote:
> > On Fri, May 6, 2016 at 5:12 PM, Rob Landley  wrote:
> >> On 05/06/2016 02:56 PM, enh wrote:
> >>> On Thu, May 5, 2016 at 8:15 PM, Rob Landley  wrote:
>  Applied, and that fetch+cherry-pick thing _also_ seems to avoid a
>  gratuitous merge commit, which is very nice.
> >>>
> >>> it also has the happy side-effect (because you keep the gerrit
> >>> change-id line) of appearing in the UI as if the originally uploaded
> >>> change was merged when i do my command-line merge from github. so if i
> >>> hadn't told the imgtec guy i was sending this patch upstream first, as
> >>> far as he knows it just got submitted here.
> >>>
> >>> (i'll still keep pointing folks upstream though, because the community
> >>> of those fiddling with toybox should be around upstream, not AOSP or
> >>> whichever other downstream they happen to use personally.)
> >>
> >> I'm happy to make better use of git, so if you care about the history of
> >> a specific commit being preserved I can do that again.
> > 
> > not particularly. the main advantage for me is that it's less work to
> > just send you the appropriate link and copy/paste git command than to
> > cherrypick myself and git format-patch (when you're just going to have
> > to do the same on your end anyway) :-)
> > 
> > by strange coincidence, i have another one for you today: "Fix UB in
> > stack depth calculation."
> > (https://android-review.googlesource.com/223547)
> 
> Except this is one of those "not taking the patch as-is" things, because
> it's got a variable declaration in the middle of a block, and I'd like
> an in-situ comment explaining why we do a non-obvious thing.
> 
> Is typecasting both pointers to (long) insufficient here? (That's being
> pretty darn explicit that I want to do math on the integer
> representations of these pointers. I know compiler writers are crazy
> these days, but how crazy are we talking?)
> 
> If we need to explicitly copy it into a volatile long I can do that, but
> I'd declare it at the top of the function...

I don't see any need for the volatile object. Subtraction of the
intptr_t's is perfectly valid and well-defined. And of course, since
generally toybox code assumes long has the same width as intptr_t and
avoids use of the latter, long would be more idiomatic here.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] ls new option : b

2016-03-11 Thread Rich Felker
On Fri, Mar 11, 2016 at 12:57:05AM -0600, Rob Landley wrote:
> On 03/09/2016 10:40 PM, Sameer Pradhan wrote:
> > Thanks for your comment Isaac.
> > 
> > I have modified the patch by addressing to your comments.
> > Please find the modified patch as attachment.
> 
> I'm already poking at ls to fix the date format posix thing mentioned
> earlier, but assuming this patch supercedes your first attempt, let's
> take a look...
> 
> You're not modifying strwidth, and ls with no arguments acts like ls -C
> when we have a tty, so "ls -b" is going to get column sizes wrong.
> 
> The heart of this patch is just 3 lines. I have more comments than lines.
> 
> +for (b = sort[next]->name; *b; b++)
> +if (isgraph(*b)) fputc(*b, stdout);
> + else printf("\\%3hho", *b);
> 
> 1) You need to say %03 or you'll have "\ 52" with a space in it.
> 
> 2) You're not escaping \ so you can't distinguish between a file called "\033"
> and a file containing an escape character.
> 
> 3) What does the "hh" accomplish exactly?
> 
> In C99, varargs promotes anything shorter than int to int, so that before
> the rise of 64 bit systems all your arguments were the same size on the stack.
> (On 64 bit systems it _didn't_ expand that to "long" because it didn't want
> to waste stack space, and thus the horrible need to typecast (void *)0 in
> varargs and keep int/long passing straight with %d vs %ld vs %lld...)
> 
> Anyway, my point is %o should work fine, is there a reason for the hh here?

This is true in C89 too; the hh prefix is not needed and is probably a
complete nop, though there may be dissenting views on what happens if
you use the hh prefix with a value that doesn't fit in unsigned char.

> 4) No UTF-8 support? I tested ls -b on a directory with japanese and arabic
> text and the ubuntu one didn't escape those.
> 
> You haven't really defined what "unprintable" means with regard to UTF8.
> Are combining characters printable? (They're zero length, but they do
> stuff.) How about the direction-switching sequences?

I think it should be determined by iswprint on the decoded characters,
but being that the intent of -b is to make it easy to identify the
contents of filenames you may not be able to read or that may be
ambiguous, some people might argue that -b should show everything but
ascii as escapes. I don't think this is necessary since, if you want
that behavior, you can do LC_CTYPE=C ls -b ...

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] include sys/sysmacros.h

2016-02-29 Thread Rich Felker
On Mon, Feb 29, 2016 at 05:53:02PM -0500, Mike Frysinger wrote:
> On 29 Feb 2016 16:24, Rob Landley wrote:
> > On 02/29/2016 12:36 PM, Mike Frysinger wrote:
> > > The major/minor macros are defined in sys/sysmacros.h.  This has
> > > historically been pulled in implicitly by sys/types.h,
> > 
> > http://man7.org/linux/man-pages/man3/major.3.html says sys/types.h, yes.
> 
> this is merely documenting what glibc has historically done, and glibc
> has historically tried to copy BSD in this area.  even then, the manpage
> probably should be adjusted to document sysmacros.h directly since that
> is where glibc has always placed things.

Defining macros named "major" and "minor" from a standard header is
serious namespace pollution.

Since sys/types.h is a rather quirky historical header that nobody
needs to include anyway, it's not such a horrible thing that glibc's
sys/types.h exposes sys/sysmacros.h However, also for historical
reasons, glibc's stdlib.h includes sys/types.h. This means that every
source file which includes stdlib.h gets the namespace pollution, and
that's really bad. That's why they're trying to fix it.

> > > but C libs
> > > are moving away from that as they aren't in POSIX.  Use the header
> > > directly as defined by BSD systems.
> > 
> > This adds a non-posix header to the posix header section of toys.h,
> 
> i really have no idea how toybox code is structured.  feel free to add
> the include to wherever you feel is appropriate.
> 
> > based on the premise that unspecified C libraries plan to break existing
> > programs at some unknown point in the future, for no obvious reason.
> 
> feel free to read the thread:
>   https://sourceware.org/ml/libc-alpha/2015-11/msg00253.html
> 
> glibc is the process of deprecating, and musl/uClibc would follow suit,
> if not just drop the include altogether.

musl has never had the issue. :-) because we don't implicitly include
sys/sysmacros.h at all. The set of programs that have any use for
these macros is basically ls, mknod, and archivers (tar, cpio, etc.)
that support archiving device nodes, and these programs can just
include the appropriate header to get them.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Implement wget

2016-02-29 Thread Rich Felker
On Tue, Mar 01, 2016 at 12:08:52PM +0900, Lipi C. H. Lee wrote:
> Thanks for your comments, Rich Felker.
> 
> I think 'get_info' function checked path length from url before calling
> strcpy as belows,
> 
> if (strlen(url+i) < 1024) strcpy(path, url+i);
> else error_exit("too long path in URL");
> 
> Do you think it is not enough?
> 
> And you commented whole string operations is unsafe in the code.
> Could you kindly tell me any test case it is unsafe?
> 
> It will be very helpful to me.

I don't know for sure that there are exploitable bugs, but having
string length checks that are non-local with respect to where the
length is assumed to fit is a very dangerous practice and does not
belong anywhere near network-facing code. Really strcpy does not
belong anywhere network-facing. This could be safe and worry-free with
no significant cost by using snprintf or similar functions.

Rich

> On Sat, Feb 27, 2016 at 8:28 AM, Rich Felker <dal...@libc.org> wrote:
> 
> > On Thu, Feb 25, 2016 at 05:28:20PM +0900, Lipi C. H. Lee wrote:
> > > Thanks for your comment, Felix.
> > >
> > > I am sorry my response is so late.
> > > As your comments, I modified some code.
> > >
> > > wget: clean up
> > >
> > > - shorten error messages
> > > - replace mk_rq with sprintf
> >
> > This is almost surely unsafe.
> >
> > > - remove struct and defines
> > > - change unsigned int to unsigned
> > >
> > >
> > > [...]
> > >  void wget_main(void)
> > >  {
> > >int sock;
> > > -  struct httpinfo hi;
> > >FILE *fp;
> > > -  size_t len, body_len;
> > > -  char *body, *result, *rc, *r_str, ua[18] = "toybox wget/", ver[6];
> > > +  ssize_t len, body_len;
> > > +  char *body, *result, *rc, *r_str;
> > > +  char ua[18] = "toybox wget/", ver[6], hostname[1024], port[6],
> > path[1024];
> > >
> > >// TODO extract filename to be saved from URL
> > > -  if (!(toys.optflags & FLAG_f))
> > > -help_exit("The filename to be saved should be needed.");
> > > -  if (fopen(TT.filename, "r"))
> > > -error_exit("The file(%s) you specified already exists.",
> > TT.filename);
> > > +  if (!(toys.optflags & FLAG_f)) help_exit("no filename");
> > > +  if (fopen(TT.filename, "r")) perror_exit("file already exists");
> > >
> > > -  if(!toys.optargs[0]) help_exit("The URL should be specified.");
> > > -  get_info(, toys.optargs[0]);
> > > +  if(!toys.optargs[0]) help_exit("no URL");
> > > +  get_info(toys.optargs[0], hostname, port, path);
> > >
> > > -  sock = conn_svr(hi.hostname, hi.port);
> > > +  sock = conn_svr(hostname, port);
> > >
> > >// compose HTTP request
> > > -  mk_rq(hi.path);
> > > -  mk_fld("Host", hi.hostname);
> > > +  sprintf(toybuf, "GET %s HTTP/1.1\r\n", path);
> >
> > Depending on the size of toybuf, it might be safe, but the above
> > get_info is almost certainly unsafe already; it strcpy's into path[]
> > with no limits on size.
> >
> > Right now this whole utility looks extremely unsafe to me -- all sorts
> > of unbounded string operations. Even if you want to say you're only
> > going to be passing 'safe' urls to it, it's likely vulnerable to
> > malicous HTTP redirects and other similar issues. I think it needs
> > some major overhaul with attention to safe string handling.
> >
> > Rich
> >

> ___
> Toybox mailing list
> Toybox@lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net

___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH 2/2] pending/file: reuse int

2016-02-28 Thread Rich Felker
On Sun, Feb 28, 2016 at 03:37:08PM -0800, Isaac Dunham wrote:
> ---
>  toys/pending/file.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/toys/pending/file.c b/toys/pending/file.c
> index 83e8c8e..789ef59 100644
> --- a/toys/pending/file.c
> +++ b/toys/pending/file.c
> @@ -126,7 +126,7 @@ static void do_regular_file(int fd, char *name)
>  {
>char *s;
>int len = read(fd, s = toybuf, sizeof(toybuf)-256);
> -  int magic;
> +  int peer;
>  
>if (len<0) perror_msg("%s", name);
>  
> @@ -177,8 +177,8 @@ static void do_regular_file(int fd, char *name)
>  else if (toybuf[5] == '2') cpioformat = "SVR4 with CRC";
>  xprintf("ASCII cpio archive (%s)\n", cpioformat);
>}
> -  else if (len>33 && (magic=peek(,2), magic==0143561 || magic==070707)) {
> -if (magic == 0143561) printf("byte-swapped ");
> +  else if (len>33 && (peer=peek(,2), peer==0143561 || peer==070707)) {
> +if (peer == 0143561) printf("byte-swapped ");
>  xprintf("cpio archive\n");
>}
>// tar archive (ustar/pax or gnu)
> @@ -187,10 +187,10 @@ static void do_regular_file(int fd, char *name)
>}
>// zip/jar/apk archive, ODF/OOXML document, or such
>else if (len>5 && strstart(, "PK\03\04")) {
> -int ver = (int)(char)(toybuf[4]);
> +peer = (int)(char)(toybuf[4]);
>  xprintf("Zip archive data");
> -if (ver)
> -  xprintf(", requires at least v%d.%d to extract", ver/10, ver%10);
> +if (peer)
> +  xprintf(", requires at least v%d.%d to extract", peer/10, peer%10);
>  xputc('\n');
>}
>else {
> -- 

What's the point of this patch? Does it actually change the generated
code for the better? I would expect it to be at best the same code,
and possibly worse.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Implement wget

2016-02-26 Thread Rich Felker
On Thu, Feb 25, 2016 at 05:28:20PM +0900, Lipi C. H. Lee wrote:
> Thanks for your comment, Felix.
> 
> I am sorry my response is so late.
> As your comments, I modified some code.
> 
> wget: clean up
> 
> - shorten error messages
> - replace mk_rq with sprintf

This is almost surely unsafe.

> - remove struct and defines
> - change unsigned int to unsigned
> 
> 
> [...]
>  void wget_main(void)
>  {
>int sock;
> -  struct httpinfo hi;
>FILE *fp;
> -  size_t len, body_len;
> -  char *body, *result, *rc, *r_str, ua[18] = "toybox wget/", ver[6];
> +  ssize_t len, body_len;
> +  char *body, *result, *rc, *r_str;
> +  char ua[18] = "toybox wget/", ver[6], hostname[1024], port[6], path[1024];
>  
>// TODO extract filename to be saved from URL
> -  if (!(toys.optflags & FLAG_f))
> -help_exit("The filename to be saved should be needed.");
> -  if (fopen(TT.filename, "r"))
> -error_exit("The file(%s) you specified already exists.", TT.filename);
> +  if (!(toys.optflags & FLAG_f)) help_exit("no filename");
> +  if (fopen(TT.filename, "r")) perror_exit("file already exists");
>  
> -  if(!toys.optargs[0]) help_exit("The URL should be specified.");
> -  get_info(, toys.optargs[0]);
> +  if(!toys.optargs[0]) help_exit("no URL");
> +  get_info(toys.optargs[0], hostname, port, path);
>  
> -  sock = conn_svr(hi.hostname, hi.port);
> +  sock = conn_svr(hostname, port);
>  
>// compose HTTP request
> -  mk_rq(hi.path);
> -  mk_fld("Host", hi.hostname);
> +  sprintf(toybuf, "GET %s HTTP/1.1\r\n", path);

Depending on the size of toybuf, it might be safe, but the above
get_info is almost certainly unsafe already; it strcpy's into path[]
with no limits on size.

Right now this whole utility looks extremely unsafe to me -- all sorts
of unbounded string operations. Even if you want to say you're only
going to be passing 'safe' urls to it, it's likely vulnerable to
malicous HTTP redirects and other similar issues. I think it needs
some major overhaul with attention to safe string handling.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Distinguish 32- and 64-bit variants in file(1) for x32.

2016-02-21 Thread Rich Felker
On Sun, Feb 21, 2016 at 08:42:06PM -0600, Rob Landley wrote:
> On 02/21/2016 03:39 PM, Rich Felker wrote:
> > On Sat, Feb 20, 2016 at 02:28:22PM -0600, Rob Landley wrote:
> >> On Wed, Feb 17, 2016 at 7:02 PM, enh <e...@google.com> wrote:
> >>> On Wed, Feb 17, 2016 at 3:32 PM, Rob Landley <r...@landley.net> wrote:
> >>>> On Wed, Feb 17, 2016 at 10:22 AM, enh <e...@google.com> wrote:
> >>>>> It's necessary to distinguish x86 and x86-64 to be able to recognize the
> >>>>> way x32 is encoded in ELF.
> >>>>
> >>>> Hmmm. That's not fun.
> >>>>
> >>>> I note that I spent the morning teaching the code to read/display the
> >>>> dynamic linker name, so this patch won't "git am" directly.
> >>>>
> >>>> Reading the patch, we're pretending that arrch64 has nothing to do
> >>>> with arm? No mention of arm in this architecture? Ok... (I guess
> >>>> Cortex-M isn't arm either, but don't currently have an example binary
> >>>> of that to test.)
> >>>
> >>> well, you're the one who removed my original "ARM aarch64" which is
> >>> what the regular desktop file(1) says :-P
> >>
> >> But not what the linux-kernel developers ever seem to say in their
> >> patch submissions.
> > 
> > Regardless of what you think about these naming choices, IMO there's
> > little value in a file(1) that does not print the names that scripts
> > using it expect to see.
> 
> _What_ scripts? I don't know what would be using this. (All for
> real-world tests, but I have yet to find a build script using the file
> command. Looking at /proc, sure, but not calling file...)

I suspect it's stuff like: case "$(file "$f")" in ...

I'm not thinking of build scripts for packages (this would not be
remotely portable usage) but things like private admin scripts,
perhaps printer filter scripts, file preview/thumbnail generation
scripts, etc. I don't have any good examples at hand but this was the
historical justification I always saw for the rather arcane/antiquated
forms for many of the names.

> If the script wants to match "Intel 80386" explicitly, then do I have to
> say that for i686?

I would think it makes sense to preserve the "Intel 80386" convention
here. There's not even a reliable way to detect that a binary is for
"i686" anyway.

> > The choice to use aarch64 instead of arm64 is
> > in some ways also a consequence of this, or rather an intentional
> > _mismatch_ with patterns that should not match. The fact that mips64
> > and powerpc64 match mips* and powerpc* was historically very
> > problematic.
> 
> grep -w? Test for 64 bit first?

Indeed, testing for 64 first is the right approach (see musl's
configure script for an example), but the problem arises when the
existing tests were written before the 64-bit version of the platform
existed. I suspect there are lots of scripts that match arm*-*-* in
the machine tuple or arm* in `uname -m` which would have wrongly
detected "arm64" as arm.

> Elliott was suggesting that the elf-em.h constants might be good enough,
> but that says 386 not 80386, and PPC instead of Powerpc...
> 
> Seriously, standards would be nice!

Yes, wouldn't they? :)

> (The mime types are designed to be programattically interpreted. The
> post here about a mime type database is very interesting and I have a
> tab open. But file type output seems to be for humans...?)

Yes, mime types are a lot more consistent but less informative than
the freeform file(1) strings.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Distinguish 32- and 64-bit variants in file(1) for x32.

2016-02-21 Thread Rich Felker
On Sat, Feb 20, 2016 at 02:28:22PM -0600, Rob Landley wrote:
> On Wed, Feb 17, 2016 at 7:02 PM, enh  wrote:
> > On Wed, Feb 17, 2016 at 3:32 PM, Rob Landley  wrote:
> >> On Wed, Feb 17, 2016 at 10:22 AM, enh  wrote:
> >>> It's necessary to distinguish x86 and x86-64 to be able to recognize the
> >>> way x32 is encoded in ELF.
> >>
> >> Hmmm. That's not fun.
> >>
> >> I note that I spent the morning teaching the code to read/display the
> >> dynamic linker name, so this patch won't "git am" directly.
> >>
> >> Reading the patch, we're pretending that arrch64 has nothing to do
> >> with arm? No mention of arm in this architecture? Ok... (I guess
> >> Cortex-M isn't arm either, but don't currently have an example binary
> >> of that to test.)
> >
> > well, you're the one who removed my original "ARM aarch64" which is
> > what the regular desktop file(1) says :-P
> 
> But not what the linux-kernel developers ever seem to say in their
> patch submissions.

Regardless of what you think about these naming choices, IMO there's
little value in a file(1) that does not print the names that scripts
using it expect to see. The choice to use aarch64 instead of arm64 is
in some ways also a consequence of this, or rather an intentional
_mismatch_ with patterns that should not match. The fact that mips64
and powerpc64 match mips* and powerpc* was historically very
problematic.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Correct Setting of uid and gid when extracting CPIO archive

2015-12-12 Thread Rich Felker
On Sun, Dec 06, 2015 at 04:04:22PM -0600, Rob Landley wrote:
> On 12/02/2015 07:39 AM, Mike Moreton wrote:
> > Hi, I’ve fixed a couple of bugs in cpio.
> 
> Thanks, I applied both patches. Several good fixes here in the "I don't
> know how this ever worked before" category, but:
> 
> 1) git am says it doesn't have a valid email address. (I fixed that up
> to get it to apply.)
> 
> 2) I don't understand this bit:
> 
> -  if (!S_ISREG(mode) && !S_ISLNK(mode) && !getpid()) {
> -int fd = open(name, O_WRONLY|O_NOFOLLOW);
> +  if (!S_ISREG(mode) && !S_ISLNK(mode) && !geteuid()) {
> +int fd = open(name, O_RDONLY|O_NOFOLLOW);
> 
> The purpose of the filehandle is to call fchown() on it. Can we change
> ownership through a read only filehandle? (The kernel not only allows
> this now but will continue to allow this in future?)

fchown/fchmod do not care about the access mode of the open file
description. Their behavior depends only on having ownership of (or
otherwise having permission to change) the actual file referenced.
This is a POSIX requirement and will not change; it's simply a
consequence of the text;

"The fchmod() function shall be equivalent to chmod() except
that the file whose permissions are changed is specified by
the file descriptor fildes."

This does not permit it to behave differently based on the access mode
the file was opened with.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pull: fix modprobe, login, switch_root, improve init, reboot

2015-10-04 Thread Rich Felker
On Sat, Oct 03, 2015 at 05:44:54AM -0500, Rob Landley wrote:
> > It would be nice to audit all the toys that are
> > intended to be long-running rather than commands that just do their
> > thing and exit to reduce or eliminate any fatal exits after they reach
> > the 'long-running' part.
> 
> Except strlower() calls xstrdup() in the I18N case and dlist_add() calls
> xmalloc() and dirtree_add_node() calls xzalloc()... You can _try_ to
> avoid it, but it's not a simple thing to audit. (And no, you can't check
> whether or not xexit() and such are linked in because common
> infrastructure like toy_init() and the option parsing logic use them.)

There should be static analysis tools that can show the call graph
restricted to actually-reachable code. As long as the false-positive
rate for reachability is low (and no false negatives) this should make
such auditing practical.

For strlower, how do you manage knowing whether to free the result, or
is it only used in places where you wouldn't care about freeing? An
in-place strlower is definitely possible if you just make sure to
allocate the right amount of memory to account for any possible
expansion on case conversion at the time of allocation. I'm not sure
if this would be practical but it might be able to simplify code and
eliminate failure cases.

> This is why I have the "longjmp back to a recovery point instead of
> exit" logic. It may leak resources (although we can _try_ to avoid and
> clean up after that) but it lets you recover from failures. Currently
> only toysh is using it (that was my prototype implementation and proof
> of concept) but the concept is genericizable.

In general this is a good approach.

> The standard idiom in toybox is to abort on fatal errors, which is the
> right thing to do 90% of the time and means we're not _ignoring_ errors
> by failing to check for them. I can't change that idiom for the
> remaining 10%, but I can convert it into exception handling with
> throw/catch. That's not ideal, but it's workable.
> 
> (I have actually thought about this before. It's on the todo list. And
> it affects the nommu stuff too because allocation failures are _much_
> more likely in a context where all allocations must be contiguous and
> memory fragmentation limits your maximum allocation size, so malloc
> failures aren't just due to resource exhaustion there...)

By "all allocations must be contiguous" do you just mean "if I
malloc(N), there must be N physically consecutive bytes free for it to
succeed"? Certainly the whole heap does not need to be contiguous, and
the physical contiguity requirement doesn't put any more constraints
on you than the virtual contiguity requirement does on MMU-ful systems
until you reach allocation sizes at least as large as page size (and
probably a good bit larger to make a practical difference).

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pull: fix modprobe, login, switch_root, improve init, reboot

2015-10-02 Thread Rich Felker
On Fri, Oct 02, 2015 at 08:00:24AM -0700, Isaac Dunham wrote:
> [in the context of long running processes]
> > Rich Felker wrote:
> > > Isaac Dunham wrote:
> > > > Agreed. But then, wouldn't xrun() also be the wrong thing?
> > > Why?
> If I'm not mistaken, xfork() and XVFORK() will perror_exit on failure.
> This results in a fork-bomb killing a long-running hotplug helper.

Indeed. I suspect being robust against resource-exhaustion conditions
is going to be harder than just avoiding these 'x' functions, but it's
a necessary condition. It would be nice to audit all the toys that are
intended to be long-running rather than commands that just do their
thing and exit to reduce or eliminate any fatal exits after they reach
the 'long-running' part.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] pull: fix modprobe, login, switch_root, improve init, reboot

2015-10-01 Thread Rich Felker
On Thu, Oct 01, 2015 at 07:34:25PM -0700, Isaac Dunham wrote:
> On Thu, Oct 01, 2015 at 04:42:34AM -0500, Rob Landley wrote:
> > Still going through old posts, I'm up to:
> > 
> > On 08/09/2015 12:24 AM, Isaac Dunham wrote:
> > >>> > > If hard-coding "modprobe -bq $MODALIAS" int mdev is desireable, I 
> > >>> > > can send
> > >>> > > that; but I suspect it's not quite fit for inclusion.
> > >> > 
> > >> > I'd be interested in seeing the patch.
> > > Attached!
> > > 
> > > Thanks,
> > > Isaac Dunham
> > > 
> > >>From 0acde7ff07603e7b485bc19ec4a49b3010c49966 Mon Sep 17 00:00:00 2001
> > > From: Isaac Dunham 
> > > Date: Mon, 3 Aug 2015 13:28:36 -0700
> > > Subject: [PATCH] mdev: autoload modules on hotplug.
> > > 
> > > ---
> > >  toys/pending/mdev.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/toys/pending/mdev.c b/toys/pending/mdev.c
> > > index a13a53d..b273d4a 100644
> > > --- a/toys/pending/mdev.c
> > > +++ b/toys/pending/mdev.c
> > > @@ -58,6 +58,8 @@ static void make_device(char *path)
> > >} else {
> > >  // if (!path), do hotplug
> > >  
> > > +if ((temp = getenv("MODALIAS")))
> > > +  xexec((char *[]){"modprobe", temp, 0});
> > >  if (!(temp = getenv("SUBSYSTEM")))
> > >return;
> > >  type = strcmp(temp, "block") ? S_IFCHR : S_IFBLK;
> > 
> > The problem here is that does xexec, which is an exec so it replaces the
> > current process, meaning the rest of processing doesn't happen.
> > 
> > Do you mean xrun() maybe? (Spawns child process and waits for it to
> > return, reporting exit code.) Or is modalias really the only thing this
> > call is expected to do?
>  
> (This was a preliminary patch that works for me but might not be
> appropriate for upstream.)
> 
> modprobe $MODALIAS is the only thing this is expected to do.
> As far as I know (which is not very far), there are no cases where a single
> uevent calls for loading a module and creating a device.
> 
> I suppose that it's improper to rely on that without finding docs;
> but it somehow seems wrong to do a fully blocking call to modprobe, and
> serially with that check if we need to create a device.
> 
> Maybe (1) try to fork(); (2) if we succeed, xexec() modprobe; (3) do the
> whole mknod/chmod/... bit if appropriate; (4) on completion of that,
> call wait().
> Note that a forkbomb, which could temporarily result in fork() failing,
> should not make the hotplugger die if it's long-running. Making module
> loading fail temporarily seems like it would be better.

This will not be nommu compatible. I think this is why Rob suggested
xrun.

> > (I note that I have a todo item to implement demon mode doing the
> > netlink thing, in which case xexec() would again be the wrong thing...)
> 
> Agreed. But then, wouldn't xrun() also be the wrong thing?

Why?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Make yesno printf-like.

2015-09-02 Thread Rich Felker
On Wed, Sep 02, 2015 at 07:58:18PM -0500, Rob Landley wrote:
> On 09/02/2015 10:09 AM, enh wrote:
> > On Tue, Sep 1, 2015 at 11:26 PM, Rob Landley  wrote:
> >> On 09/01/2015 08:35 PM, enh wrote:
> >>> Make yesno printf-like.
> >>
> >> This just makes me uncomfortable that it's nonobvious enough somebody's
> >> going to yesno(1, usersupplieddata);.
> > 
> > that seems unlikely, especially given that yesno doesn't supply the
> > "?". (and you really ought to get out of the habit of being so tricky
> > at the cost of readability all the time anyway!)
> > 
> > btw, though this distracts from my point and will probably only
> > encourage you to stick with the status quo... toybox already blindly
> > uses user-supplied strings as format strings. random example:
> > 
> > $ seq -f '%s' 1 3
> > seq: format ‘%s’ has unknown %s directive
> > 
> > $ ./toybox seq -f '%s' 1 3
> > Segmentation fault (core dumped)
> 
> That's a bug and I need to fix it.
> 
> Except... Grrr. No posix entry for this, and the LSB spec is crap as usual.
> 
> seq -f %d 1 3
> seq -f %s 1 3
> seq -f "" 1 3
> seq -f %f 1 3
> seq -f %e 1 3
> seq -f %g 1 3
> seq -f "boo %f yah" 1 3
> seq -f "boo %f %f yah" 1 3
> seq -f "% f" 1 3
> seq -f "%*f" 1 3
> seq -f "%-1.2f" 1 3
> seq -f "%-1.2.3f" 1 3
> seq -f "%1-f" 1 3
> seq -f "%1 f" 1 3
> seq -f "%+-f" 1 3
> seq -f "%+ - f" 1 3
> seq -f "%.2f" 1 3
> seq -f "%3.f" 1 3
> seq -f "%2..2f" 1 3
> seq -f "%'.2f" 1 3
> 
> Ok, it looks like the sanitization pass is using:
> 
> 1) search for one % (must exist in string)
> 2) skip all "+- "
> 3) skip all digits 0-9
> 4) skip at most one "."
> 5) skip all digits 0-9
> 6) ensure next char is [feg]
> 
> Now let's read man 3 printf and see what ELSE I forgot: accept %%
> literal, the target is aAeEfFgG, flags are ['#-+ ]...
> 
> seq -f "%LF" 1 3
> 
> Really? Sigh. I don't think I'm allowing that one because it's INSANE.

This should work:

([^%]|%%)*%[-#0 +']?[0-9]*.[0-9]*[aAeEfFgG]([^%]|%%)*

If you're already linking regcomp/regexec, using them to match it is
probably the smallest/simplest implementation. If not you can write a
state machine for it.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [Nommu] Week ending June 27ish.

2015-06-29 Thread Rich Felker
On Mon, Jun 29, 2015 at 03:26:13PM -0500, Rob Landley wrote:
 On 06/28/2015 09:57 PM, Rich Felker wrote:
  On Sun, Jun 28, 2015 at 06:15:26PM -0500, Rob Landley wrote:
  Are there perhaps any other shells that work on nommu? It would be
  nice to be able to put off toysh until you have time to do a really
  good job on it rather than rushing it because there's nothing else you
  can use...
 
  Eh, I can do it in stages. And setting up another shell is effort (and
  natural test/use cases) that's _not_ going into toysh development, so
  I'd rather do it right than do other things that get thrown away.
 
  That said, sure, http://www.cod5.org/archive/ links to es and pdksh for
  example, and uclibc had like 5 of them (all kinda crappy if I recall).
  None really an improvement on hush in terms of actually building stuff.
 
 I should have clarified these were other shells to look at with
 licensing I didn't mind distributing, not that I'd confirmed they work
 with nommu. (Except the uclinux shells, which are _crap_.)

Yes, I noticed. Whatever that shell from uclinux I was using at first
was, it was utterly awful.

  I'm pretty sure pdksh conforms to POSIX (at least the 1992 version) so
 
 Uh-huh. 1992 was 23 years ago. That would mean the shell's active
 development predates Linux. That standard is a decade older than the
 stale version of bash I'm trying to move off of because stuff keeps
 breaking when I use it.

I'm not aware of important changes in the shell spec since then, but I
may be mistaken.

 And on those grounds you expect it to be better than hush?

I recall reading that hush is known to be missing some important
functionality, though I don't remember what. It worked fine for my
interactive use and simple scripts though.

  it's probably sufficient for running portable scripts including all
  configure scripts.
 
 You think all configure scripts are portable?
 
 Really?

Anything produced by autotools that's not using custom shell script
level code (just the standard m4 macros, etc.) is theoretically
completely portable, and in practice very much so. I've run these
scripts on all kinds of crazy proprietary unices and when I've
encountered trouble, it hasn't been at the shell interpreter level.

 But sure, let's take a look at the actual pdksh code, in jobs.c it says:
 
  * The interface to the rest of the shell should probably be changed
  * to allow use of vfork() when available but that would be way too much
  * work :)
 
 ...
 
 /* create child process */
 forksleep = 1;
 while ((i = fork())  0  errno == EAGAIN  forksleep  32) {
 if (intrsig) /* allow user to ^C out... */
 break;
 sleep(forksleep);
 forksleep = 1;
 }
 
 So no, that doesn't work with nommu either. (And I have no idea what
 weird historical bug that was trying to work around back in 1992.)

OK, so that idea is out. Guess you need to write toysh. :-)

  Yes. If you call toybox as a command name, it gets the normal command
  setup done for it. I'm not optimizing for that case. Pilot error.
  
  OK. This was the source of the duplicates. I would still like to see
  no extra syscalls at all (much like my feeling about glibc startup),
  but the situation isn't nearly as bad as I thought.
  
  BTW what happens with the suid check/drop being done twice? Do
  commands that need suid work when you call them as toybox cmdname?
  Are they supposed to?
 
 in scripts/make.sh:
 
 echo USE_TOYBOX(NEWTOY(toybox, NULL, TOYFLAG_STAYROOT))  \
   generated/newtoys.h
 
 The stayroot tells the multiplexer not to drop privs. (Commit 15a8d71674b4.)

Looks good.

  It's not
  as bad without the duplication, but that's still 4 unnecessary
  round-trips to kernelspace for a lot of commands. Maybe the get[e]uid
  stuff is hard to remove when the suid support is compiled-in, but
  getauxval() could be used instead on systems that have it.
 
  There was talk a few years back of putting getuid and friends in vdso
  and the result was nobody bothered because the overhead wasn't high
  enough for anybody to care. (People tend to call gettimeofday() in tight
  loops because it constantly changes.)
  
  I could cache it in libc too, but I feel the risk of returning a stale
  value outweighs the performance benefits, so I didn't do it. But if
  you're checking it at startup (without having done uid changes),
  getauxval is reliable.
 
 $ man getauxval | grep CONFORMING -A 1
 CONFORMING TO
This function is a nonstandard glibc extension.
 
 So you just suggested I replace a posix function with a nonstandard
 glibc extension in the name of saving a system call that does an
 unlocked fetch on a single integer.
 
 Well that's certainly a point of view.

While glibc was the first to add it, it's conceptually something nice
to have on any ELF-based platform or platform with ELF-like process
startup semantics, and I worked with the glibc team on making

Re: [Toybox] [Nommu] Week ending June 27ish.

2015-06-28 Thread Rich Felker
On Sun, Jun 28, 2015 at 06:15:26PM -0500, Rob Landley wrote:
  Are there perhaps any other shells that work on nommu? It would be
  nice to be able to put off toysh until you have time to do a really
  good job on it rather than rushing it because there's nothing else you
  can use...
 
 Eh, I can do it in stages. And setting up another shell is effort (and
 natural test/use cases) that's _not_ going into toysh development, so
 I'd rather do it right than do other things that get thrown away.
 
 That said, sure, http://www.cod5.org/archive/ links to es and pdksh for
 example, and uclibc had like 5 of them (all kinda crappy if I recall).
 None really an improvement on hush in terms of actually building stuff.

I'm pretty sure pdksh conforms to POSIX (at least the 1992 version) so
it's probably sufficient for running portable scripts including all
configure scripts. I know your goal is much higher (bash
compatibility) but it sounds like pdksh could get you a working
environment until you finish toysh, with little or no effort spent on
the temporary solution.

  2) I'm teaching toybox to probe for the existence of fork(), set a nommu
  symbol if it's not there, and use sort of a fakeroot version of fork
  (re-exec yourself with some way of transferring state to the child,
  maybe an environment variable, maybe a pipe...).
 
  BTW toybox does a lot of expensive probing at startup already (before
  even entering the applet's main function) that makes simple commands
  take about twice the time of their busybox versions.
 
  Could you please be a little more vague?
 
  I'd prefer to respond to this on the toybox list, but since you felt it
  should be asked here instead...
  
  Sorry, it just came up as part of the overhead for self-exec topic. I
  agree the toybox list would have been a better place.
  
  Or do you mean something other by probing than system calls? (Memory
  usage? CPU usage? Could I have some sort of metric here, or at least the
  axis of interest?)
  
  I meant system calls. I have:
  
  # strace toybox true
  set_tid_address(0x15c9eeb4) = 161
  rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER|SA_RESTART, 0x15c4cb48}, 
  {SIG_DFL, [], 0}, 8) = 0
  getuid32()  = 0
  geteuid32() = 0
  umask(0)= 0755
  umask(0755) = 0
  getuid32()  = 0
  geteuid32() = 0
  umask(0)= 0755
  umask(0755) = 0
 
 Ah, that's two instances of setup. The setup for toybox_main() and the
 setup for true.
 
 It can normally skip one of those, but you're explicitly calling
 toybox so it treats it as a command and does setup for it.
 
  exit_group(0)   = ?
  +++ exited with 0 +++
  
  It looks like they're all repeated twice when the toybox command is
  used to invoke a command by name (rather than with symlinks).
 
 Yes. If you call toybox as a command name, it gets the normal command
 setup done for it. I'm not optimizing for that case. Pilot error.

OK. This was the source of the duplicates. I would still like to see
no extra syscalls at all (much like my feeling about glibc startup),
but the situation isn't nearly as bad as I thought.

BTW what happens with the suid check/drop being done twice? Do
commands that need suid work when you call them as toybox cmdname?
Are they supposed to?

  It's not
  as bad without the duplication, but that's still 4 unnecessary
  round-trips to kernelspace for a lot of commands. Maybe the get[e]uid
  stuff is hard to remove when the suid support is compiled-in, but
  getauxval() could be used instead on systems that have it.
 
 There was talk a few years back of putting getuid and friends in vdso
 and the result was nobody bothered because the overhead wasn't high
 enough for anybody to care. (People tend to call gettimeofday() in tight
 loops because it constantly changes.)

I could cache it in libc too, but I feel the risk of returning a stale
value outweighs the performance benefits, so I didn't do it. But if
you're checking it at startup (without having done uid changes),
getauxval is reliable.

  I don't want to replace fork _entirely_ because fork takes about 5% as
  long as exec on systems that _do_ support it, so the toybox shell being
  able to fork and run commands internally is a big potential speedup for
  shell scripts. But codepaths that are _not_ performance critical should
  use the long way round so it gets testing.
 
  This seems like a good approach, but I have a suggestion that will
  perform even better: use pthread_create instead of fork for
  implementing shell builtins
 
  Suppressing the gag refex for involving pthreads in something for no
  reason, I actually looked at that a long time ago (using clone()
  directly to create shared process contexts), and didn't like it. Random
  blog post (one of 

Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-06-10 Thread Rich Felker
On Wed, Jun 10, 2015 at 01:23:07PM -0500, Rob Landley wrote:
 On 05/03/2015 03:53 PM, enh wrote:
  On Sat, May 2, 2015 at 1:32 PM, Rich Felker dal...@libc.org wrote:
  On Sat, May 02, 2015 at 12:04:31PM -0700, enh wrote:
  Sure, outside toybox yay sigpipe. But inside, the question is what the
  correct behavior should be.
 
  It sounds like android wants _exit(0)
 
  no, not particularly. we're arguing internally about whether SIGPIPE
  is a crash or not. i think even undoing the dynamic linker's
  installation of a crash reporter by setting it back to SIG_DFL would
  be fine for toybox --- the real problem is that the dynamic linker
  doesn't know whether it's dealing with an Android app (where you
  probably do want the crash reporting) or a command-line tool (where
  you probably don't), let alone corner cases like a command-line tool
  that's implemented in Java.
 
  In that case the crash-reporting signal handler setup should be in the
  Android app init code, not the dynamic linker, no?
  
  there are a large number of native-only daemons running on the average
  Android device, most of which we only see as binary blobs.
 
 I have a todo item here, the posix guys have been proposing adding set
 -o pipefail and doing their usual insane bikeshedding where yes | head
 should always exit failure because yes couldn't successfully write all
 its (infinite) output. Meanwhile head discarding input is what head is
 _for_...
 
 http://austingroupbugs.net/view.php?id=789
 
 Not sure what the right answer is, error return-wise. The android
 approach is to return success when terminating output due to -EPIPE (as
 opposed to -ENOSPC or other errors), and I'm pretty happy to do the
 same... assuming I can figure out how to beat the error out of ferror().
 (You record that there _was_ an error, but not the errno. So I was
 happily doing printf(), printf(), printf(), and then fflush() and
 ferror() and if the earlier printf returned an errno you ate it and
 didn't save it for me thank you very much...)
 
 Anyway, opinions on this?

The problem with your reasoning is that the program writing to the
pipe cannot know it's piping to head or something that wants to throw
away input. Since it doesn't know, it _must_ exit with failure when
all output could not be written.

The 'yes' command is something of a stupid special case and nobody
should really care what it exits with; it's not even standard. But
something like 'sed' really needs to report write errors. Anything
else could lead to data loss. You can't say EPIPE is special but I'm
going to report errors on ENOSPC because the EPIPE could be _caused_
by ENOSPC in the command you're piping to, e.g.

sed s/a/b/g  foo.txt | tr -d c  out.txt  mv out.txt foo.txt

In the above example, the error status from tr will prevent mv from
running, but now consider something like:

sed s/a/b/g  foo.txt  my_fifo  rm foo.txt

Another example is running a command directly connected to a socket,
e.g. from inetd. If the socket breaks, you get EPIPE, and that needs
to be treated as an error that the calling script sees, not as
successful exit.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-05-02 Thread Rich Felker
On Sat, May 02, 2015 at 12:04:31PM -0700, enh wrote:
 On Wed, Apr 29, 2015 at 12:24 PM, Rob Landley r...@landley.net wrote:
  On 04/27/2015 11:26 AM, enh wrote:
  it would also result in things like top(1) not exiting.
 
  If top is using stdio, it's easy to add a single ferror(stdout) check
  to the main loop. If it's using write(), it _must_ be checking for
  errors anyway since write can always return with a short/partial
  write. But top is unlikely to be hooked up to a pipe or socket anyway;
  it normally needs a terminal.
 
  the point is that then people need to think. if that were a plausible
  solution, this thread wouldn't exist. the advantage of the signal
  handler is that humans can keep on being humans.
 
  I _so_ should have called this project Dorodango. (See
  http://www.dorodango.com.)
 
  The point of this project is to do things right. I want to _fix_
  everything I can. My insane cleanup review passes involve trying to find
  all this stuff and make sure we get it right.
 
  Sure, outside toybox yay sigpipe. But inside, the question is what the
  correct behavior should be.
 
  It sounds like android wants _exit(0)
 
 no, not particularly. we're arguing internally about whether SIGPIPE
 is a crash or not. i think even undoing the dynamic linker's
 installation of a crash reporter by setting it back to SIG_DFL would
 be fine for toybox --- the real problem is that the dynamic linker
 doesn't know whether it's dealing with an Android app (where you
 probably do want the crash reporting) or a command-line tool (where
 you probably don't), let alone corner cases like a command-line tool
 that's implemented in Java.

In that case the crash-reporting signal handler setup should be in the
Android app init code, not the dynamic linker, no?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-04-27 Thread Rich Felker
On Mon, Apr 27, 2015 at 12:30:05PM -0700, Isaac Dunham wrote:
 On Mon, Apr 27, 2015 at 08:36:53AM -0700, enh wrote:
  On Sun, Apr 26, 2015 at 7:30 PM, Rich Felker dal...@libc.org wrote:
   On Sat, Apr 25, 2015 at 12:14:44PM -0700, enh wrote:
   what's the plan wrt SIGPIPE? the desktop is pretty inconsistent. many
   (but not all) commands install a signal handler that does _exit(0).
   others (coreutils 8.21's ls, say) do nothing. normally what you do
   about SIGPIPE isn't a problem but on Android that leads to a crash
   report and people filing ls crashed bugs against me. (our default
   shell PS setup is also noisy about crashes.)
  
   Why not just *block* SIGPIPE (with sigprocmask) so that the write
   returns an error (EPIPE) and the program applies the same logic it
   would for any other write error?
  
  toybox does a little better than toolbox there thanks to xwrite, but
  i've yet never met anyone who checks the return value of printf...
 
 vmstat.c, cal.c, and see xprintf().
 In some places the choice of non-erroring functions is deliberate, so
 that a failure to write will not lead to an early exit.

Thanks for clarifying.

  plus there's the question of whether giving up because you're writing
  to a broken pipe is an error exit or not. you could add a special case
  to xwrite (and add xprintf and xputs and...), but since the whole idea
  is that code shouldn't have to care, it's easier just to install a
  signal handler that does _exit(0).
 
 There are xputs and xprintf already.
 
 Calling _exit() is wrong for toybox, since you cannot use atexit()
 type code (eg, delete tempfiles on exit) - xexit() is the recommended
 path for toybox.

Agreed. _exit was only proposed because exit (or xexit) would be
invalid from a signal handler in general.

 Using a fixed return value is wrong; for an example, grep returns
 0 for line matched, 1 for no match, and 1 for errors.

Agreed. This is another reason the logic needs to be local to the
tool or else xprintf/xwrite/etc. need a way to control the exit value
if they don't already have one.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-04-27 Thread Rich Felker
On Mon, Apr 27, 2015 at 08:36:53AM -0700, enh wrote:
 On Sun, Apr 26, 2015 at 7:30 PM, Rich Felker dal...@libc.org wrote:
  On Sat, Apr 25, 2015 at 12:14:44PM -0700, enh wrote:
  what's the plan wrt SIGPIPE? the desktop is pretty inconsistent. many
  (but not all) commands install a signal handler that does _exit(0).
  others (coreutils 8.21's ls, say) do nothing. normally what you do
  about SIGPIPE isn't a problem but on Android that leads to a crash
  report and people filing ls crashed bugs against me. (our default
  shell PS setup is also noisy about crashes.)
 
  Why not just *block* SIGPIPE (with sigprocmask) so that the write
  returns an error (EPIPE) and the program applies the same logic it
  would for any other write error?
 
 toybox does a little better than toolbox there thanks to xwrite, but
 i've yet never met anyone who checks the return value of printf...
 
 plus there's the question of whether giving up because you're writing
 to a broken pipe is an error exit or not. you could add a special case
 to xwrite (and add xprintf and xputs and...), but since the whole idea
 is that code shouldn't have to care, it's easier just to install a
 signal handler that does _exit(0).

_exit(0) is wrong and broken. It reports success when the command did
not succeed. Failure of the caller to recognize that it didn't succeed
can result in data loss. If anything the right behavior for the signal
handler would be _exit(1) not _exit(0). But I think xwrite/xprintf
solves the issue more elegantly. Note that even if you don't check
every printf, you still get the error status from ferror and/or fflush
at close time (if you check them) but that might result in a lot of
wasted cpu time if you continue processing/writing after a pipe broke.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-04-27 Thread Rich Felker
On Mon, Apr 27, 2015 at 09:26:11AM -0700, enh wrote:
 On Mon, Apr 27, 2015 at 9:22 AM, Rich Felker dal...@libc.org wrote:
  On Mon, Apr 27, 2015 at 09:17:31AM -0700, enh wrote:
  On Mon, Apr 27, 2015 at 9:11 AM, Rich Felker dal...@libc.org wrote:
   On Mon, Apr 27, 2015 at 08:36:53AM -0700, enh wrote:
   On Sun, Apr 26, 2015 at 7:30 PM, Rich Felker dal...@libc.org wrote:
On Sat, Apr 25, 2015 at 12:14:44PM -0700, enh wrote:
what's the plan wrt SIGPIPE? the desktop is pretty inconsistent. many
(but not all) commands install a signal handler that does _exit(0).
others (coreutils 8.21's ls, say) do nothing. normally what you do
about SIGPIPE isn't a problem but on Android that leads to a crash
report and people filing ls crashed bugs against me. (our default
shell PS setup is also noisy about crashes.)
   
Why not just *block* SIGPIPE (with sigprocmask) so that the write
returns an error (EPIPE) and the program applies the same logic it
would for any other write error?
  
   toybox does a little better than toolbox there thanks to xwrite, but
   i've yet never met anyone who checks the return value of printf...
  
   plus there's the question of whether giving up because you're writing
   to a broken pipe is an error exit or not. you could add a special case
   to xwrite (and add xprintf and xputs and...), but since the whole idea
   is that code shouldn't have to care, it's easier just to install a
   signal handler that does _exit(0).
  
   _exit(0) is wrong and broken. It reports success when the command did
   not succeed.
 
  coreutils seems divided on this issue.
 
  Could you provide some examples? I think POSIX is pretty clear that
  you can't return success if the output was not successfully written.
  As I already mentioned doing so results in data loss. I really doubt
  coreutils is doing this, but well, GNU software is not always a good
  example of doing the Right Thing...
 
 i did in my original mail.

OK, I'll look back to it.

   Failure of the caller to recognize that it didn't succeed
   can result in data loss. If anything the right behavior for the signal
   handler would be _exit(1) not _exit(0). But I think xwrite/xprintf
   solves the issue more elegantly. Note that even if you don't check
   every printf, you still get the error status from ferror and/or fflush
   at close time (if you check them) but that might result in a lot of
   wasted cpu time if you continue processing/writing after a pipe broke.
 
  it would also result in things like top(1) not exiting.
 
  If top is using stdio, it's easy to add a single ferror(stdout) check
  to the main loop. If it's using write(), it _must_ be checking for
  errors anyway since write can always return with a short/partial
  write. But top is unlikely to be hooked up to a pipe or socket anyway;
  it normally needs a terminal.
 
 the point is that then people need to think. if that were a plausible
 solution, this thread wouldn't exist. the advantage of the signal
 handler is that humans can keep on being humans.

Either way people need to think; otherwise you have a bug where write
failure is not being reported and data loss results. You won't have
that from EPIPE if you use the signal handler, but you'll instead have
it from ENOSPC (at the very least, and perhaps others). The main flow
of execution _NEEDS_ to check for write errors. If you can't trust
people to do this locally, just have a policy of using xwrite,
xprintf, etc.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] default SIGPIPE handler that calls _exit(0)?

2015-04-26 Thread Rich Felker
On Sat, Apr 25, 2015 at 12:14:44PM -0700, enh wrote:
 what's the plan wrt SIGPIPE? the desktop is pretty inconsistent. many
 (but not all) commands install a signal handler that does _exit(0).
 others (coreutils 8.21's ls, say) do nothing. normally what you do
 about SIGPIPE isn't a problem but on Android that leads to a crash
 report and people filing ls crashed bugs against me. (our default
 shell PS setup is also noisy about crashes.)

Why not just *block* SIGPIPE (with sigprocmask) so that the write
returns an error (EPIPE) and the program applies the same logic it
would for any other write error?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] PATCH telnetd

2015-03-17 Thread Rich Felker
On Tue, Mar 17, 2015 at 07:31:12PM +0900, 김혜진 wrote:
 Hi. All
 
 I found an issue of telnetd that it loops infinitely when client
 disconnected abnormally on working.
 like client is killed by outside.
 
 On embedded system, this issue make that cpu occupation goes high.
 
 In this case :
 select() returns 1, means socket fd has something changes.
 but, read()/write() after select() gets an error because client already
 disconnected.

read does not return an error; it returns 0, indicating EOF.

 attached patch help prevent this issue.
 plz, check.

The patch as written does not seem correct. There is no justification
for killing the child process with SIGKILL; this is a very bad
practice and precludes any sort of cleanup it may need to do. Unless
it's specifically written not to, the child running in the telnet
session should automatically terminate when it loses its controlling
terminal, which should happen when the telnet session exits.

Can you clarify what exactly is going wrong for you?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] perf(1)

2015-03-12 Thread Rich Felker
On Thu, Mar 12, 2015 at 05:24:00PM -0700, enh wrote:
 On Thu, Mar 12, 2015 at 4:18 PM, Rich Felker dal...@libc.org wrote:
  On Thu, Mar 12, 2015 at 03:12:18PM -0700, enh wrote:
  https://android-review.googlesource.com/#/c/128430/
 
  On Thu, Mar 12, 2015 at 3:10 PM, Rich Felker dal...@libc.org wrote:
   On Thu, Mar 12, 2015 at 03:07:39PM -0700, enh wrote:
   there's a tool in Android that packs relative relocations. saves
   several MiB on the chromium webview .so. long term we'd like to see
   gold support it directly, but in the meantime it's a post-processing
   phase.
  
   What kind of packing are you doing?
 
  OK, so it's via customization of the loader to process a different
  form of relocations. This seems dubious to me -- if the number of
  relocations is so large that it makes a practical difference to the
  file size, then you also have a large number of dirty pages that are
  wasting memory at runtime. Fixing the need for so many relative
  relocations, which are presumably a result of unnecessary and harmful
  global state containing pointers to functions or to other global
  state, seems like it would be a much more beneficial approach than
  packing the relocations.
 
 no one ever argued that chromium was svelte. but picking off 10%
 mechanically is a lot easier than rewriting all the code.

I just see it a bit differently -- shaving off some portion of the
relative relocations in chromium itself is a local change that's a
strict improvement, but hacking the ELF format to pack relative
relocations is a tradeoff -- a change that affects many different
tools and has permanent maintenance cost for all of them.

 the fact that this only affects LP64 makes me think it's not as simple
 as that anyway, but afaik no one's looked into that. aarch64 is
 especially bad.

Yes, that seems really suspicious -- perhaps it's a toolchain bug
that's generating all those relocations to begin with.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] perf(1)

2015-03-12 Thread Rich Felker
On Thu, Mar 12, 2015 at 03:07:39PM -0700, enh wrote:
 there's a tool in Android that packs relative relocations. saves
 several MiB on the chromium webview .so. long term we'd like to see
 gold support it directly, but in the meantime it's a post-processing
 phase.

What kind of packing are you doing?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] perf(1)

2015-03-12 Thread Rich Felker
On Thu, Mar 12, 2015 at 03:12:18PM -0700, enh wrote:
 https://android-review.googlesource.com/#/c/128430/
 
 On Thu, Mar 12, 2015 at 3:10 PM, Rich Felker dal...@libc.org wrote:
  On Thu, Mar 12, 2015 at 03:07:39PM -0700, enh wrote:
  there's a tool in Android that packs relative relocations. saves
  several MiB on the chromium webview .so. long term we'd like to see
  gold support it directly, but in the meantime it's a post-processing
  phase.
 
  What kind of packing are you doing?

OK, so it's via customization of the loader to process a different
form of relocations. This seems dubious to me -- if the number of
relocations is so large that it makes a practical difference to the
file size, then you also have a large number of dirty pages that are
wasting memory at runtime. Fixing the need for so many relative
relocations, which are presumably a result of unnecessary and harmful
global state containing pointers to functions or to other global
state, seems like it would be a much more beneficial approach than
packing the relocations.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] I aten't dead.

2015-02-24 Thread Rich Felker
On Tue, Feb 24, 2015 at 01:30:20PM -0800, enh wrote:
 On Tue, Feb 24, 2015 at 11:46 AM, Rob Landley r...@landley.net wrote:
  Sorry for the radio silence, $DAYJOB's been eating the majority of my
  programming time and what's left has gone to Aboriginal Linux recently,
  because I made a largeish design change over there (putting the base
  root filesystem in initmpfs and merging the native-compiler at runtime
  instead of compile time), and the aboriginal linux release soft-blocks
  the toybox release because I use built aboriginal and then built linux
  from scratch under it as my main toybox regression-smoketest.
 
  Last night's I hit a fun bug in toybox stat where it doesn't remotely
  work on arm. (Works fine on x86 host, but 3.18 kernel built against
  uClibc on armv5l: the -f numbers are way off, it thinks the blocksize of
  ext2fs is 32 bytes.)
 
 seems fine on armv7 3.10 with bionic. oh, specifically the -f output.
 yeah, that looks wrong :-)
 
 a lot of those fields are actually 64-bit, so you need an extra 'l' on LP32:
 
 diff --git a/toys/other/stat.c b/toys/other/stat.c
 index d603316..a96c1de 100644
 --- a/toys/other/stat.c
 +++ b/toys/other/stat.c
 @@ -106,11 +106,11 @@ static void print_stat(char type)
  static void print_statfs(char type) {
struct statfs *statfs = (struct statfs *)TT.stat;
 
 -  if (type == 'a') xprintf(%lu, statfs-f_bavail);
 -  else if (type == 'b') xprintf(%lu, statfs-f_blocks);
 -  else if (type == 'c') xprintf(%lu, statfs-f_files);
 -  else if (type == 'd') xprintf(%lu, statfs-f_ffree);
 -  else if (type == 'f') xprintf(%lu, statfs-f_bfree);
 +  if (type == 'a') xprintf(%llu, statfs-f_bavail);
 +  else if (type == 'b') xprintf(%llu, statfs-f_blocks);
 +  else if (type == 'c') xprintf(%llu, statfs-f_files);
 +  else if (type == 'd') xprintf(%llu, statfs-f_ffree);
 +  else if (type == 'f') xprintf(%llu, statfs-f_bfree);
else if (type == 'l') xprintf(%ld, statfs-f_namelen);
else if (type == 't') xprintf(%lx, statfs-f_type);
else if (type == 'i')

Rather than hard-coding an assumption that these have type unsigned
long long, the code should be casting them to unsigned long long or
uintmax_t and using an appropriate format specifier. I prefer
uintmax_t since it's idiomatic and can be used for formatting any
unsigned type, but I know Rob can be allergic to these nice semantic
types... :-)

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] I aten't dead.

2015-02-24 Thread Rich Felker
On Tue, Feb 24, 2015 at 01:50:03PM -0800, enh wrote:
 On Tue, Feb 24, 2015 at 1:48 PM, Rich Felker dal...@libc.org wrote:
  On Tue, Feb 24, 2015 at 01:30:20PM -0800, enh wrote:
  On Tue, Feb 24, 2015 at 11:46 AM, Rob Landley r...@landley.net wrote:
   Sorry for the radio silence, $DAYJOB's been eating the majority of my
   programming time and what's left has gone to Aboriginal Linux recently,
   because I made a largeish design change over there (putting the base
   root filesystem in initmpfs and merging the native-compiler at runtime
   instead of compile time), and the aboriginal linux release soft-blocks
   the toybox release because I use built aboriginal and then built linux
   from scratch under it as my main toybox regression-smoketest.
  
   Last night's I hit a fun bug in toybox stat where it doesn't remotely
   work on arm. (Works fine on x86 host, but 3.18 kernel built against
   uClibc on armv5l: the -f numbers are way off, it thinks the blocksize of
   ext2fs is 32 bytes.)
 
  seems fine on armv7 3.10 with bionic. oh, specifically the -f output.
  yeah, that looks wrong :-)
 
  a lot of those fields are actually 64-bit, so you need an extra 'l' on 
  LP32:
 
  diff --git a/toys/other/stat.c b/toys/other/stat.c
  index d603316..a96c1de 100644
  --- a/toys/other/stat.c
  +++ b/toys/other/stat.c
  @@ -106,11 +106,11 @@ static void print_stat(char type)
   static void print_statfs(char type) {
 struct statfs *statfs = (struct statfs *)TT.stat;
 
  -  if (type == 'a') xprintf(%lu, statfs-f_bavail);
  -  else if (type == 'b') xprintf(%lu, statfs-f_blocks);
  -  else if (type == 'c') xprintf(%lu, statfs-f_files);
  -  else if (type == 'd') xprintf(%lu, statfs-f_ffree);
  -  else if (type == 'f') xprintf(%lu, statfs-f_bfree);
  +  if (type == 'a') xprintf(%llu, statfs-f_bavail);
  +  else if (type == 'b') xprintf(%llu, statfs-f_blocks);
  +  else if (type == 'c') xprintf(%llu, statfs-f_files);
  +  else if (type == 'd') xprintf(%llu, statfs-f_ffree);
  +  else if (type == 'f') xprintf(%llu, statfs-f_bfree);
 else if (type == 'l') xprintf(%ld, statfs-f_namelen);
 else if (type == 't') xprintf(%lx, statfs-f_type);
 else if (type == 'i')
 
  Rather than hard-coding an assumption that these have type unsigned
  long long, the code should be casting them to unsigned long long or
  uintmax_t and using an appropriate format specifier. I prefer
  uintmax_t since it's idiomatic and can be used for formatting any
  unsigned type, but I know Rob can be allergic to these nice semantic
  types... :-)
 
 yeah, i almost said the same but because i know that you can't even
 rely on the signedness being the same between different architectures
 i didn't want to get into that argument :-)

I wouldn't worry about the signedness unless negative values are
actually meaningful. Here they're not so an unsigned type is safe; the
full positive range of any type fits in uintmax_t.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] mktemp fixes

2015-02-10 Thread Rich Felker
On Tue, Feb 10, 2015 at 01:47:01AM -0600, Rob Landley wrote:
  sort-of speaking of which... i didn't include this before since it
  wasn't really a bug fix but do you think we should use more
  randomness? 6 Xes is the minimum you're allowed to pass to the C
  library, and the desktop mktemp(1) defaults to 10.
 
 I don't have a strong opinion either way?
 
 Backing up: the man page for mkdtemp/mkstemp says the last six
 characters of template must be XX and the posix spec backs that up,
 meaning the libc functions seem hardwired to 6 characters.

POSIX is allowing 6 as an extension, but it's not terribly useful.

 If you can predict the random digits, we're toast anyway. If you can

For DoS only. The only times anything worse happens is when you
already have a fatal programming error with regards to file
creation/opening. These were problems that were solved 25+ years ago;
the only reason they (/tmp vulns) keep appearing is because people
fail to make correct use of the tools they already have like O_EXCL.

 rapidly respond to arbitrary file creation ala inotify, we're toast. So
 the attack vector would be... saturating the namespace with symlinks?
 (It'd be really nice if O_NOFOLLOW was more generally applicable, but
 we've had that fight. Also the posix function is specified NOT to use
 O_NOFOLLOW. So let's assume symlink attacks are doable here and somehow
 useful even with O_EXCL, creating the file at a known location you can
 spin to check for or something.)

Which POSIX function is specified not to use O_NOFOLLOW?

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] su patch

2015-02-02 Thread Rich Felker
On Tue, Feb 03, 2015 at 12:14:54AM -0600, Rob Landley wrote:
  A) I was just going to use the encryption the first entry (presumably
  root) had, given a choice. (Although again, if the android mechanism
  doesn't let you get a user list... presumably they still have uid 0 
  though.)
  
  This sounds unreliable; in hardened environments you might not be able
  to get another user's password hash.
 
 In which case I fault back to the first entry in the list because I know
 it's there?
 
 Newer algorithms may not be supported by the toolchain/libc they're
 building against, and if they're not we don't get a compile-time warning
 about it. Right now sha256 is recent enough (added to glibc 2007) that
 we can probably expect it to be there, but if sha3 gets added, how do we
 know if we've got it?

That's why I suggested trying them in sequence of strength -- at
runtime. Just have an array of known setting string forms (this is
tiny) and iterate over it until crypt doesn't spit out an error.

  Simply
  trying different setting strings for each known algorithm in
  decreasing order of strength and using the first one that works seems
  a lot smarter.
 
 Which means my code has to know about decreasing order of strength,
 and I need to change it when new algorithms are introduced, and then
 change it more for random libc forks that have their own experimental
 stuff because Red Hat wants to differentiate itself to justify the
 Enterprise per-seat cost...

You don't need to change it for Red Hat. Just include in the list all
known setting strings, and the ones that aren't supported won't be
used. The current order is something like:

  bcrypt, sha512, sha256, enhanced des, md5, classic des

 I suppose I can do a menuconfig option for it, but ugh. 95% of
 everybody's going to go with the default there anyway, this is way too
 esoteric to ask most people to make decisions about.

Pretty much anyone who does know/care/understand wants the strongest
crypt their libc/tools supports, and those who don't
know/care/understand should have the most secure option chosen for
them by default.

 
  B) des salt is 4096 possible entries, _anything_ else lets you use more
  salt.
  
  md5crypt has been EOL'd by its author and really should not be used as
  a default for anything new. See:
  
  http://phk.freebsd.dk/sagas/md5crypt_eol.html
 
 I actually go by:
 
   http://valerieaurora.org/hash.html
 
 Which basically says anything but sh3 is already toast, and that one's
 just too new to have much data on yet. (See the table at the end for
 explanations of the stages. That's the best part.)

No, that's unrelated to crypt. It's about a single application of the
underlying hash functions, not the KDF's used in crypt. Note that
password hash or KFD is not mentioned anywhere on that page
because that's not what it's about.

 My assumption is that if people can _get_ your hash, you've probably
 already lost. Hash checking is an embarassingly parallelizeable problem

Yes and no. From an individual standpoint defending your own accounts,
this is largely true. On the other hand, from a site-wide perspective
and general internet health perspective where you're concerned about
things like the portion of users whose accounts can be efficiently
compromised, the KDF used makes a big difference.

The extent to which this matters for unix login accounts is of course
open to discussion. Ideally security-conscious users are only using
ssh public key auth anyway and have passwords disabled. So from that
standpoint this is all more relevant to other places passwords are
used, which is mostly outside the scope of the toybox utilities. On
the other hand, for sites that still ARE using password-based unix
account logins, I would think you'd want to mitigate the risks.

 It doesn't matter what hash you're using if passwords being input have
 less bits than the keyspace. The set of keys humans press in sequence
 without 2 factor authentication is pretty darn attackable. My keyboard
 has something like 47 shiftable keys on it, plus space and tab. That's
 96^length, so anything less than 10 chars long is under 2^64 likely
 combinations _before_ you do statistical analysis of the likelihood of
 symbol occurrence to sort your keyspace traversal and so on. (For
 reference, distributed.net brute forced a 64 bit key using volunteered
 CPU time over a dozen years ago.)

That's why KDF's are designed to be expensive to compute.

 Me, I assume any box with cracked permissions to read /etc/shadow has a
 keylogger installed anyway. But I'm not a security expert...

So, like you say about non-experts in licensing trying to make/choose
licenses

 So yes if you think sh3 should be the default when nobody specifies, I'm
 all for it. Let me know when that shows up in crypt() and what number it
 gets assigned, the man page for crypt in ubuntu 14.04 hasn't got it and
 google isn't being helpful either.

sh3 (Keccak) itself would not be a KDF. You can build a KDF 

Re: [Toybox] su patch

2015-02-01 Thread Rich Felker
On Mon, Feb 02, 2015 at 10:33:24AM +0900, 김혜진 wrote:
 Hello. I share a patch of su command.
 
 
 -- Forwarded message --
 From: 김혜진 hj8...@gmail.com
 Date: 2015-01-29 21:35 GMT+09:00
 Subject: Re: su patch
 To: Rob Landley r...@landley.net
 
 
 Hi. Rob
 
 I Send you fixed patch.
 
 As I look at passwd and mkpasswd, default encryption method is des.
 Because useradd calls passwd with no option of encryption method, it make
 des encrypted password.
 It was the fact.
 So, I changed the default encryption method to md5.
 But, su cannot check if 0 index of password is $, because user can choose
 des at any time if they want.
 
 plz check my patch!
 
 And, If you have more time, plz review my questions regarding netcat.

md5 is not better and probably worse than des. You should be using
bcrypt if the system supports it and otherwise sha256, sha512, or the
enhanced des support some systems have. These could be tried as
sequential fallbacks.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] Seperate 'userdel' from testing syntax

2015-01-30 Thread Rich Felker
On Fri, Jan 30, 2015 at 03:55:02PM -0800, enh wrote:
 as i've said, Android doesn't have /etc/passwd (or /etc/group), and it
 doesn't have setpwent/getpwent/endpwent (or equivalents). you *can* do
 uid/gid or name lookups though, because they do make some degree of
 sense. (so id(1) works.) i've considered implementing getpwent so that
 it would cycle through the well known users, but we don't actually
 have an example of anything that would use it, and it's really not
 obvious that we'd be doing anyone any favors --- code calling getpwent
 that wants to run on Android needs to think long and hard about
 exactly what it means by that, and whether it makes any sense at all.

The main application I've seen that makes _any_ sense is
tab-completion of usernames in the shell.

 the fact that Android tends to be pretty locked down with SELinux
 means it's going to be tricky to test a lot of toybox things, but
 getting to a point where i can run the test suite is definitely a
 goal. just not a priority (because i can run the tests on the desktop
 with bionic).

Is there any guide to getting bionic built and working on a desktop
system, short of doing a complete android build? I tried a while back
with the gentoobionic repo which was setup to work without the android
build system, but didn't get anywhere with it. This is not something I
have a lot of time to spend on, but I'd like it if I could get
something working to evaluate differences between musl and bionic.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Is binutils neccessary?

2015-01-27 Thread Rich Felker
On Tue, Jan 27, 2015 at 02:29:52PM -0500, stephen Turner wrote:
 
 
  musl is written for Linux and as long as it doesn't support the *BSD
  syscall/ABI makes no sense speculate on such things like BSD make.
 
 
 can both be supported from the same source package or would musl have to
 fork?

The BSD syscall ABIs I've seen aren't very usable (no way to make a
syscall without a stack, which is mandatory, and likewise no futex),
but fortunately you have Linux syscall ABI available on some BSDs.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Is binutils neccessary?

2015-01-27 Thread Rich Felker
On Tue, Jan 27, 2015 at 01:52:37PM -0500, stephen Turner wrote:
 
  They've (FreeBSD) already made their choice (llvm/c++). Therefore, I
  believe that elftoolchain may be orphaned. The project is not too
  actively developed, but it is small, written in C and it fits
  perfectly with the pcc, tcc, libfirm/cparser etc... and musl. I think
  it could be a very useful addition to musl.
 
 
 I agree, it feels like a good fit for the non-gnu non-llvm crowd. From my
 poking around and i have been wrong before, i think musl and pcc are the
 only real and active alternatives to gnu and bsd. who heads up pcc anyways?
 i forget if pcc is on this email chain, maybe they are interested?
 
 pcc is linux/bsd focused, having elftoolchain be dual focused more so than
 it already is (add support for gnumakefile) and throw in a little musl, we
 have a nicely portable system. I like it!
 
 does musl support bsd makes?

No, because I'm not aware of any way to do the type of generic rules
we use that would be portable to BSD make. Having to add contents to
the makefile for every new source file is unacceptable redundancy (and
part of the reason hideous things like automake were invented).

If there is a way to do the equivalent in BSD make, I'm not opposed to
trying to support both; it might be something reasonable to work on at
the same time as other build system goals like support for out-of-tree
build.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Is binutils neccessary?

2015-01-27 Thread Rich Felker
On Tue, Jan 27, 2015 at 08:21:04PM -0500, stephen Turner wrote:
  The BSD syscall ABIs I've seen aren't very usable (no way to make a
  syscall without a stack, which is mandatory, and likewise no futex),
  but fortunately you have Linux syscall ABI available on some BSDs.
 
 So I read that unix dates back to the 60's...  who is more accurate to the
 unix syscall linux or bsd or other? Im curious who decided it was a good
 idea to be different. Gnu im betting...

I suspect this follows the BSD/SYSV split, but I may be mistaken.
AFAIK Linux copied the SYSV-style syscall ABIs (and other psABI
aspects) for the existing proprietary unices on the platforms it
supported (originally, just the SCO ABI on x86).

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Is binutils neccessary?

2015-01-26 Thread Rich Felker
On Mon, Jan 26, 2015 at 06:34:27PM +0100, Daniel Cegiełka wrote:
 2015-01-26 18:13 GMT+01:00 stephen Turner stephen.n.tur...@gmail.com:
  thanks for the response man! I found this and pointed it out in another
  email chain actually!  I haven't had a chance to test it and see how it
  compares to binutils but i definitely plan to very soon and will report back
  how it works. unfortunately i dont have a setup like rob yet for testing
  against a full lfs build (todo item) but i can test it against my aboriginal
  knock off.
 
 They have interesting intentions and a very fit with toybox (and musl).
 
 http://sourceforge.net/p/elftoolchain/wiki/Home/

Very interesting. Has anyone looked into the quality of the code? Any
ideas what design they're using for the linker? Linking large projects
with GNU ld and gold has gotten basically impossible without a 64-bit
memory space, so it would be really nice to see a new linker that's
efficient. I've worked on linker designs with some of the people on
#musl but none of us have the time to work on a linker, so I'd be
interested in hearing if they're taking a similar approach or if any
of the outcomes of our design discussions might be useful
contributions to their project.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] toybox in Android status

2015-01-18 Thread Rich Felker
On Sat, Jan 17, 2015 at 06:59:57PM -0600, Rob Landley wrote:
 On 01/17/2015 01:53 PM, Rob Landley wrote:
  On 01/17/2015 12:09 AM, enh wrote:
  So anyway I checked that in, and now I'm porting your patch on top of
  that change and I hit the _weirdest_ thing where ./touch nonexistent
  doesn't cause an error because utimensat() on a file that didn't exist
  returned ok (file still doesn't exist afterwards),
 
 Because if both timestamps have nanoseconds set to UTIME_OMIT it returns
 success even if the file did not exist.
 
 Possibly the man page should mention that.

No, it's a kernel bug that should be fixed, and that may be necessary
to work around in userspace if the kernel doesn't fix it. See
http://pubs.opengroup.org/onlinepubs/9699919799/functions/futimens.html

In particular:

The utimensat() and utimes() functions shall fail if:

[ENOENT]
A component of path does not name an existing file or path is an
empty string.

There is no option to ignore this error condition just because both
timestamps are UTIME_OMIT.

This is made even more clear in the description:

If both tv_nsec fields are set to UTIME_OMIT, no ownership or
permissions check shall be performed for the file, but other error
conditions may still be detected (including [EACCES] errors
related to the path prefix).

If you can confirm the incorrect behavior, we should report this to
the kernel folks.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2015-01-04 Thread Rich Felker
On Sun, Jan 04, 2015 at 12:39:25PM -0600, Rob Landley wrote:
  (Redirecting /bin/sh to point to dash instead of bash was still a dumber
  move, though.)
 
  I fail to see how this was dumb. It made shellshock a non-issue
 
  When they switched to dash I _segfaulted_ the thing multiple times. It
  was a buggy pile of crap for _years_.
 
  It's interesting you think shellshock was a non-issue, presumably that's
  why it didn't make the news or anything? (I'm consistently amused that
  
  I don't think shellshock was a non-issue. I just think it was mostly a
  non-issue for systems that don't use bash as /bin/sh or their login
  shell for restricted accounts (forced command in authorized_keys
  file), and this is one reason I think it was a good move not to make
  /bin/sh be a link to bash.
  
  Yes some people write crappy scripts that depend on bash but don't use
  #!/bin/bash. I don't and I don't like paying the penalty for people
  who do this. They should fix their broken scripts, and preferably they
  should write portable shell script instead of bash. I suspect you'll
  be more agreeable to this position once more bash-specific scripts
  start depending on post-GPLv3 versions of bash... :-)
 
 No, I'm implementing the non-crazy bash extensions people actually use
 in toysh. (Toybox should eventually be able to build itself including
 its own sed invocations and its own shell to run scripts/*.sh.) This
 is why I went to the advanced bash scripting talk at Texas LinuxFest,
 to see what tricks that speaker felt were worth knowing/using (and thus
 implementing).

Nothing wrong with that; if it helps eliminate bash from more systems
I'm quite happy to see it happen. But...

 Coming up with a bash replacement, yay. But dash didn't even bother to
 implement source as an alias for ., or implement the function
 keyword (it's basically a comment). Those were trivial bash flavoring
 things that would be like 5 minutes coding, and they explicitly chose
 _not_ to do that because they did not _want_ compatibility.

I see those things as roughly equivalent to sys/poll.h and
sys/fcntl.h in musl -- or maybe even sys/cdefs.h which musl
doesn't have. These are not useful features but coddling of spelling
mistakes made in non-portable code. And it's a matter of opinion,
where both sides have valid points, whether it's better to pressure
people to fix those spelling mistakes or simply work around them.

 (Stuff like
 implementing path/to/{one,two,three} curly bracket support would be more
 like a weekend of coding, but still: that not being there broke stuff.)

Additional features like that also cause incorrect behavior in
non-bash scripts unless they're off-by-default.

 I repeat: ubuntu made a bad technical decision, gratuitously breaking
 compatibility for its existing userbase for a stated reason that was
 inadequate to justify the fallout, one which could easily have been
 accomplished a different way without end user impact, and fairly quickly
 _was_ because it didn't accomplish its stated goal but the change was
 retained anyway.

I think the security and runtime-cost benefits were more than
sufficient to justify the fallout. The broken scripts are easily
remedied just by fixing the #! line, or they can be made portable if
that was the intent to begin with.

If there were cases where dash really was technically worse than bash
(like crashing, memleaks, etc.) then that's another matter, but I
think they just indicate that the switch was made prematurely rather
than that it was wrong in concept.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2015-01-04 Thread Rich Felker
On Sun, Jan 04, 2015 at 06:20:05PM -0600, Rob Landley wrote:
 
 
 On 01/04/2015 01:04 PM, Rich Felker wrote:
  On Sun, Jan 04, 2015 at 12:39:25PM -0600, Rob Landley wrote:
  Coming up with a bash replacement, yay. But dash didn't even bother to
  implement source as an alias for ., or implement the function
  keyword (it's basically a comment). Those were trivial bash flavoring
  things that would be like 5 minutes coding, and they explicitly chose
  _not_ to do that because they did not _want_ compatibility.
  
  I see those things as roughly equivalent to sys/poll.h and
  sys/fcntl.h in musl -- or maybe even sys/cdefs.h which musl
  doesn't have.
 
 http://landley.net/hg/aboriginal/rev/540
 
 I want stuff to _work_. Throw warnings sure, but if something is
 actually _using_ it...
 
  These are not useful features but coddling of spelling
  mistakes made in non-portable code. And it's a matter of opinion,
  where both sides have valid points, whether it's better to pressure
  people to fix those spelling mistakes or simply work around them.
 
 Indeed. I come down on the side of having a dos box you can confine old
 code to so it still works, and have -Wall to let you fix new stuff.
 
 You've seen the lengths I went to in aboriginal to do host/path and
 more/record-commands.sh and forensically determine exactly what was
 going _on_ in the builds and the specific requirements down to the bare
 metal. Heck, I worked out the minimal kernel .config that had the system
 calls I needed.
 
 But then what I _ship_ includes linux from scratch chroot tarballs
 because other people reproducing my minimal environment is science, but
 other people starting with a lfs tarball is engineering.
 
  (Stuff like
  implementing path/to/{one,two,three} curly bracket support would be more
  like a weekend of coding, but still: that not being there broke stuff.)
  
  Additional features like that also cause incorrect behavior in
  non-bash scripts unless they're off-by-default.
 
 Except that on Linux, non-bash scripts were almost nonexistent before
 2006 (ported from other OSes and you know where to find ksh or zsh if it
 says #!/bin/zsh), and even since then it's still a minority and not very
 _interesting_.

Every single practical configure script (autoconf generated or written
by hand) is a non-bash script. That's quite a huge portion of shell
scripts. And they've been around since the beginning on Linux.

 I view posix sh - bash much the way I System V - Linux. (System V
 didn't have procfs. I'm writing stuff that uses procfs. There are
 compatability  modes that implement that in BSD, and limiting yourself
 to /etc/mtab not being a symlink to /proc/mounts means you're not
 dealing with reality on  modern system. You _can't_ get that right
 entirely in userspace these days.)

These are all implementation details that matter only to a few
boot-time utilities and such. They're all irrelevant to applications.
It's stupid and inappropriate for applications to assume procfs or
mtab or even the concept of mounts, which won't exist on future
systems when Linux is considered a dinosaur... This is why the
standards omit them. Standards are about what an application can
expect to see, not how it gets done under the hood. They're about
moving us forward into compatibility and interoperability, not
backwards into lock-in to outdated implementation details.

  I repeat: ubuntu made a bad technical decision, gratuitously breaking
  compatibility for its existing userbase for a stated reason that was
  inadequate to justify the fallout, one which could easily have been
  accomplished a different way without end user impact, and fairly quickly
  _was_ because it didn't accomplish its stated goal but the change was
  retained anyway.
  
  I think the security and runtime-cost benefits were more than
  sufficient to justify the fallout.
 
 I see no benefits.

2x VSZRW, 5x actual ram usage (dirty pages), per instance.

And anecdotally (I don't have figures) performance is considerably
better.

And of course, most importantly, complete lack of function exports and
other dangerous code paths processing potentially untrusted data that
should never have existed in the first place.

  The broken scripts are easily
  remedied just by fixing the #! line, or they can be made portable if
  that was the intent to begin with.
 
 Every script I know changed the #! line, but that was a bug workaround
 for ubuntu breaking the /bin/sh symlink.

Every script you know? No, some small set of bash scripts that were
wrongly labeled as sh scripts. The vast majority of the scripts you
know are configure scripts and most of them even run on ancient
pre-POSIX shells.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2015-01-03 Thread Rich Felker
On Thu, Jan 01, 2015 at 10:41:22PM -0600, Rob Landley wrote:
 On 01/01/2015 01:04 PM, dmccunney wrote:
  On Thu, Jan 1, 2015 at 1:39 AM, David Seikel onef...@gmail.com wrote:
  
  I have a simple test to decide if I like an editor as a result of these
  decades of random editor usage.  If I can't sit down with the editor
  and figure out how to do basic editing and saving in less than a
  minute (sans documentation), then in my opinion it's a crap editor.
  Both TECO and vi fail this test miserably, though oddly enough I have a
  soft spot for TECO.
  
  These days, the general assumption is that you can open a file in an
  editor with editor filename, and that once up, cursor keys can
  be used to move around in the file and that text can be added where
  desired by typing it at the cursor location and deleted with Backspace
  or Delete keys.
  
  Vi originated in the days when some of those assumptions might not be
  true.  Some early terminals on Unix systems didn't *have* cursor keys
  or F-keys.  The vi command set and separation between input and
  command modes was a result.
 
 Indeed.
 
 However, ubuntu's decision to only allow you to cursor around in insert
 mode when you call vim and to _disable_ that when you call it as vi
 (so the cursor keys instead crap B[ and such all over your text) is
 insane and stupid. And the fix is to delete /etc/vim/vimrc.tiny and make
 it a symlink to just vimrc in the same directory. And the fact you
 _need_ to do that on each new ubuntu install is just one more way that
 Mark Shuttleworth is trying to cram his personal preferences down
 people's throats.

Yes this is idiotic.

 (Redirecting /bin/sh to point to dash instead of bash was still a dumber
 move, though.)

I fail to see how this was dumb. It made shellshock a non-issue and
massively reduced the memory requirements (and probably increased the
speed) of portable shell scripts. Bash-specific scripts should always
be using #!/bin/bash.

 If vi/uemacs/joe/nano are trivial extensions of the same basic
 infrastructure (sort of true modulo vi command mode), I have no problem
 implementing lots of sets of keybindings. But the first target is vi
 because it's the only one actually in posix.

My guess is that this is not so easy, and that attempting to do it
this way would have a lot of subtle failures that would just annoy
users. But it might be a lot less annoying than being stuck with
nothing but vi...

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2015-01-03 Thread Rich Felker
On Sun, Jan 04, 2015 at 02:24:19PM +1000, David Seikel wrote:
 On Sat, 3 Jan 2015 23:07:55 -0500 Rich Felker dal...@libc.org wrote:
 
   If vi/uemacs/joe/nano are trivial extensions of the same basic
   infrastructure (sort of true modulo vi command mode), I have no
   problem implementing lots of sets of keybindings. But the first
   target is vi because it's the only one actually in posix.  
  
  My guess is that this is not so easy, and that attempting to do it
  this way would have a lot of subtle failures that would just annoy
  users. But it might be a lot less annoying than being stuck with
  nothing but vi...
 
 It was actually quite easy once I had done my homework.  I don't see
 how it would have a lot of subtle failures that would just annoy
 users.

The how/why is just past experience, e.g. joe -- jmacs feels nothing
like emacs and it's frustrating because your brain has to be in emacs
mode to use it, but then nothing actually works like emacs. The only
emacs workalike I've seen that actually fits my emacs workflow is mg,
and even then only until I have a non-ASCII file, in which case it
breaks horribly...

 Though my design wasn't start with vi and bend it to look
 like other editors, it was research all the targeted editors /
 pagers, figure out what's common and what's not, then come up with a
 generic infrastructure that can handle them all.

That might work a lot better. I'd love to see it work out. I'm just
not holding my breath.. :-)

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] Screen vs. tmux

2014-12-29 Thread Rich Felker
On Tue, Dec 30, 2014 at 04:04:42AM +, Jason Spiro wrote:
  But before I could deal with THAT today I tried to swap in the linux
  3.18 kernel in aboriginal and build i686 and SOMEBODY ADDED PERL BACK as
  a build preprequisite. (I literally spent YEARS getting that removed
  last time, and they added it back in 3.18-rc6. Right at the END of the
  build cycle. Honestly...)
 
 I know you strongly dislike Perl.  But the Linux kernel maintainers don't
 mind it so much, and they probably don't care about Aboriginal Linux at all.
  Why don't you want to add Perl to Aboriginal Linux?  Perl isn't that big,
 and if you delete some or all of the modules which come with it, it's even
 smaller.

Um, I think Perl is about the size of all of Aboriginal combined.
Doubling the size of the package for the sake of gratuitous (i.e.
otherwise useless) crap needed for the kernel build system does not
make sense.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2014-12-23 Thread Rich Felker
On Tue, Dec 23, 2014 at 11:48:42AM -0800, enh wrote:
  Lots of us _do_ have to deal with serial terminals.
 
  Yes, I know.
 
  If you export $ROWS and $COLUMNS it'll do that instead of probe, and I
  can add a compile-time config symbol to switch it off if you care that 
  much.
 
  Exporting ROWS and COLUMNS precludes runtime size changes though. The
  correct way to handle size is TIOCGWINSZ. Perhaps you could use
  TIOCGWINSZ for device numbers that indicate virtual terminal or pty
  and only fall back to the ugly methods for serial ports?
 
 annoyingly, adb (Android Debug Bridge) doesn't currently pass window
 size changes through, so until someone fixes that Android users count
 as serial port users too :-(

From the Android side, does adb use a pty? Or is it a usb serial
port device? My impression was that it's the latter but I'm not very
familiar with it.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2014-12-22 Thread Rich Felker
On Mon, Dec 22, 2014 at 10:21:26PM -0600, Rob Landley wrote:
 If less is a priority, that actually helps prioritize the rest of
 them. Once I've written the basic navigate a line infrastructure (with
 querying screen size via ansi probe fallback, and reassembling escape
 sequences that got decoupled going over serial line; I fixed both of
 these in busybox several years ago), then stacking them isn't quite as
 big a deal.

I hope you'll have an option just to rely on TIOCGWINSZ. The escapes
that cause the terminal to echo back a response are considered a
security misdesign by some people (myself included) and they're also
problematic from the standpoint that you don't know if they'll be
supported and that you can't distinguish between a terminal that
doesn't support them and one which is just slow to respond. I realize
you don't have any other option on serial terminals, but thankfully
lots of us don't have to deal with serial terminals.

 I do follow a security researcher on twitter (@0xabad1dea) and for a
 while she had as her handle:
 
   echo -e Melissa \xe2\x80\xae ασσιλέΜ \xe2\x80\xed
 
 So it would be nice to get the reversal codes to work. But given that
 xfce's terminal doesn't handle it either...
 
 (Dalias wrote a terminal that probably does. I should compile/install
 that and try it out...)

Mine doesn't. Bidirectional text is the main (only?) multilingual
feature uuterm is missing.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2014-12-21 Thread Rich Felker
On Sun, Dec 21, 2014 at 01:07:02PM -0800, James McMechan wrote:
 I am would suggest something like GNU_QUIRKS and POSIX_QUIRKS config
 symbols to add in the odd features like gnu sed directly disagreeing
 with POSIX or POSIX's umm interesting xargs without -0 quoting
 conventions and any other why did they do it that way I know POSIX

There is nothing unusual or quirky about the way xargs without -0
works. It's the way it's always been on all historical systems, and
it's the only way xargs can actually process arbitrary lists of
filenames. Mistakenly treating the input as a newline-delimited list
of literals rather than a whitespace-delimited list of shell-quoted
strings is a toybox quirk that's a bug and just needs to be fixed.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2014-12-19 Thread Rich Felker
On Fri, Dec 19, 2014 at 02:05:10PM -0600, Rob Landley wrote:
 On 12/19/2014 01:54 PM, enh wrote:
  i know there's catv and nl already in toybox, but part of my goal with
  moving us to toybox is that people's muscle memory should just work.
 
 A good argument. :)

Could the cat toy just detect the -v option via a flag and jump to the
catv toy instead when it's set? This would avoid polluting its code
with that mess. The -v option and the code to jump could just be
omitted when catv is not included in the build.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [PATCH] optional fatter cat(1)

2014-12-19 Thread Rich Felker
On Fri, Dec 19, 2014 at 01:48:11PM -0800, enh wrote:
 I want to remove catv though.

Is there a reason you need to, though? This seems like purely an
aesthetic consideration to be balanced with the aesthetic
consideration of not having catv code in the main cat.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] archivers, read-write loops

2014-12-17 Thread Rich Felker
On Wed, Dec 17, 2014 at 02:33:36PM -0800, Isaac Dunham wrote:
   ssize_t according to man 2 sendfile. I just hadn't yet because nothing
   was using it. If I expand it, I'd want to move towards convergence with
   the syscall... except that gratuitously wants one of its arguments to be
   a network socket for NO OBVIOUS REASON.
   
   By indicate bytes written I mean return the total number of 
   bytes written.
   
   According to my man pages, In Linux kernels before 2.6.33, out_fd 
   must refer to a socket. Since Linux 2.6.33 it can be any file.
  
  That's still recent enough (2010) I need a probe, but yeah we should use it.
  
   Using sendfile will of course require a loop if you have a file larger
   than half the address space;
  
  Why? If you enable long file support in libc (hardwired on in musl,
  present in 2.4, _not_ enabling that is pilot error) then size_t should
  be 64 bit?
 
 size_t == maximum size *in memory* == long or unsigned long on *nix
 (musl defines size_t the same as ssize_t, since C sporadically
 intermixes the two...)

Huh? size_t is required to be an unsigned type and ssize_t is required
to be signed, so there's no way to define them the same. musl
certainly doesn't.

But yes they are both about sizes of objects in the C sense (in
memory), not file sizes. off_t is the type for file sizes.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] toybox - init?

2014-12-13 Thread Rich Felker
On Fri, Dec 12, 2014 at 11:19:43AM -0600, Rob Landley wrote:
 On 12/11/2014 11:16 AM, Rich Felker wrote:
  On Tue, Dec 09, 2014 at 06:21:04AM -0600, Rob Landley wrote:
  On 12/09/2014 03:38 AM, scsijon wrote:
  was just wondering on the status of init within toybox as I'm having a
  look at buildroot and maybe it would be useful there.
 
  thanks
  scsijon
 
  oneit is in and has worked for years. I'm pondering extending oneit with
  a pipe that writes exiting child process pids to a non-PID1 process that
  can deal with restarting them if it cares.
  
  For what it's worth, I like that concept. Do you have a way in mind
  for the non-PID1 process that's receiving these pids to match them up
  with services that need restarting?
 
 I haven't done a lot of design work here (I want to research what
 upstart and macosx launchd and so on are currently doing, and see what
 lxc's container launch program is doing, and yes look at systemd to
 understand the full horror; it's a can of worms I haven't opened yet
 because it'll eat my life for a bit, just like sed's been doing).
 
 But: the non-pid1 process should probably be the one that starts the
 other services in the first place.

Yes, we're on the same page there.

 Basically, oneit's design is launch a single child process, wait for it
 to exit, do a thing in response. That thing is currenty shutdown or
 reboot, but adding relaunch the child process is trivial.
 
 If oneit gets other things exiting at it, that's because they got
 reparented to init (I.E. their parent exited, which daemonize() does
 automatically). Right now it ignores them, but writing them to a pipe is
 also trivial.
 
 We should _not_ add anything to oneit that _isn't_ trivial. Policy
 belongs in pid 2.

Agreed.

  I don't see a good way to do this
  in general but it seems safe to use pid files (produced by the daemons
  themselves) here as long as the only purpose you use them for is
  detecting exit.
 
 If oneit launches another init-like program as PID 2, and that pid2
 parses its variant of inittab to launch a bunch of processes, then it
 can make a directory of pid files that contain the pid of the service,

That's the hard part -- how does it get these pids? If a process is
behaving properly and not daemonizing itself, then pid2 is the parent
and can use the return value of fork/vfork/posix_spawn directly with
waitpid to determine that the process exited, so there's no need for
the pipe from pid1 in that case. But in the common case, the daemon
daemonized itself (forked a second time internally) and pid2 has no
direct way of getting its pid. Most such daemons write pidfiles, so
pid2 could grab the pid from a pidfile, but this requires some
service-specific path knowledge/logic I think. :(

Is that what you have in mind? I think it probably works.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] [musl] Re: faccessat and AT_SYM_NOFOLLOW

2014-09-29 Thread Rich Felker
On Mon, Sep 29, 2014 at 08:27:19PM +0400, Alexander Monakov wrote:
 Isn't the reason for faccessat call before unlink is that rm without the -f
 flag is explicitely specified to ask for confirmation when the file is not
 writable?

This may be true (it was never stated when I asked about the purpose),
but in that case, faccessat still won't give the correct result unless
you use AT_EACCESS (which is broken with glibc and very expensive with
musl). The right way to achieve this would be to attempt to open the
file (or performing some other operation that would check for write
access with the correct effective/fs uid/gid) for writing before
unlinking it.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] faccessat and AT_SYM_NOFOLLOW

2014-09-28 Thread Rich Felker
On Sun, Sep 28, 2014 at 12:05:33PM -0500, Rob Landley wrote:
 On 09/25/14 11:01, Rich Felker wrote:
  This message is a follow-up to a discussion Rob Landley and I had on
  #musl regarding musl's returning an error of EINVAL when the
  AT_SYM_NOFOLLOW flag is passed to faccessat (a nonstandard usage),
  which is affecting toybox.
 
 Nonstandard, but documented in the Linux man page, and accepted by
 glibc, uclibc, and even klibc.
 
 The flag is currently ignored by existing libcs because the system call

No, it's not. As I described in the previous email, glibc implements
it in userspace, and does so incorrectly. I have no idea what uclibc
does. klibc probably does ignore it, but klibc is anything but a
guideline for correctness.

 In the meantime, I can either pass in the ignored flag requesting the
 behavior I _want_ (even if I can't currently get it), and then start
 getting the right behavior when libc upgrades, or I can add a when this
 happens, do this item to my todo list and never remember to go back and
 do it.

I'm still not sure what it is you want to get from this flag.

  Issue 1: Is the inclusion of AT_SYM_NOFOLLOW in the man page a total
  documentation error (not actually supported by glibc at all) or just a
  failure to mark it as a glibc extension?
 
  Here's the relevant glibc file:
  
  https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/faccessat.c;h=4a6048ec7930c8fc249ee629b1d2618cd81084b0;hb=HEAD
  
  As I read it:
  
  1. It sets EINVAL for flags that are invalid, but considers
 AT_SYM_NOFOLLOW valid. So far so good...
  
  2. It flag is 0 or the only flag set is AT_EACCESS and the binary is
 not suid, it just makes the syscall directly. OOPS, big bug -- it
 does not honor changes to the effective uid made by programs
 initially started as root!
 
 My lack of caring about AT_EACCESS is deep and profound, and it seems
 like setfsuid() would be the way to handle this if you did care? Rather
 than the weird child spawning thing musl is doing? But that's the quick
 glance reply...

access (always) and faccessat (by default) use the _real_ uid/gid to
check whether the file would be accessible as that user. setfsuid
would not do anything, because unlike functions that actually access
the filesystem, access/faccessat answer the question what would
happen IF I were actually running as the real uid/gid rather than the
current effective/fs uid/gid? This is (one reason) why they're
utterly the wrong thing to use.

As for why we can't use setfsuid, it's because it's not reversible.
Linux allocates kernelspace priv objects whenever you change any of
the uids, and any such operation can fail. This means after setfsuid,
it can be impossible to get back your original fsuid, from which there
is no way to recover. The only way to avoid this is to change the
fsuid in a newly-forked process. This is exactly what musl does,
except for real uid/gid rather than fsuid/fsgid, because
access/faccessat uses real uid/gid.

  3. Otherwise, it uses fstatat, possibly with the AT_SYM_NOFOLLOW flag,
 to get the file ownership and mode and performs its own access
 permissions check in userspace. This is imprecise and does not
 respect ACLs or any other advanced permission models provided by
 LSMs etc.
 
 OS/2 extended attributes (and NT's copy of them) aren't unix
 permissions. (I'd be more concerned about checking for exec when the
 filesystem is mounted noexec or a file where the dynamic linker doesn't
 exist or the file is arm and this is x86... and if you're not going
 there, why do you care what SELinux has to say about it?)

There are various valid uses for access/faccessat (as opposed to the
invalid uses affected by races) where you want to ensure that a
particular file is _not_ accessible by a particular user (e.g.
checking that the authorized_keys file and .ssh dir are not writable
by the user themselves for certain types of restricted accounts). It
would be bad if access/faccessat falsely reported inaccessible (safe)
when the file was actually accessible due to an ACL or other extended
means of granting access.

  I was not even aware that this imprecise emulation was used in the
  AT_SYM_NOFOLLOW case; I figured they would do it in an exact way,
  which is possible with some /proc/self/fd tricks.
 
 There are times when proc isn't mounted. While the system is booting can
 be one of those times.

Indeed, but Linux has lots of stuff that can't be made to work without
/proc mounted.. :(

  So my conclusion? There are some moderate-level documentation errors.
  glibc implements the flag, but not correctly. The changes I would
  recommend to the documentation:
 
 So it's still not your bug, it's the rest of the world's bug. I'll
 continue to code to the rest of the world then, and locally patch in a
 workaround which will require an #ifdef __MUSL__



  1. Document that AT_SYM_NOFOLLOW is not standard for this function,
 and is a glibc extension

Re: [Toybox] [Aboriginal] aboriginal linux

2014-09-28 Thread Rich Felker
On Sun, Sep 28, 2014 at 02:05:02PM -0700, Isaac Dunham wrote:
 On Sun, Sep 28, 2014 at 11:36:11AM -0500, Rob Landley wrote:
  Redirected to the toybox list, where I have a pending message to reply
  to on this topic already...
  
   Long ago I developed a strong dislike for the library emulating features 
   when GNU's glibc emulated aio by helpfully making my program 
   multi-threaded and hijacking signals.
  
  You think that's bad, dalias decided that implementing AT_EACCESS
  required spawning a new process for every system call. (setfsuid()
  exists, thanks to the samba guys, but apparently doesn't apply here. No
  idea why.)
 
 man 2 setfsuid says that 
 
The system call setfsuid() changes the value of the caller's filesystem
user ID--the user ID that the  Linux  kernel  uses  to  check  for  all
accesses to the filesystem.  Normally, the value of the filesystem user
ID will shadow the value of the effective user ID.  In  fact,  whenever
the  effective  user ID is changed, the filesystem user ID will also be
changed to the new value of the effective user ID.
 
Explicit calls to setfsuid() and setfsgid(2) are usually used  only  by
programs such as the Linux NFS server that need to change what user and
group ID is used for file access without a corresponding change in  the
real and effective user and group IDs.
 
 Reading the last sentence of the first paragraph, it would seem to be
 implied that AT_EACCESS is actually the default behavior...
 which I somehow find surprising.

In general, all filesystem access functions use the _effective_ (or,
on Linux, if it's been changed, the _filesystem_) uid/gid to determine
whether the process has access to a file. However, the
access/faccessat functions are special and default to using the _real_
uid/gid. The historical _intent_ was that suid-root programs can use
access() to determine whether the invoking user would be allowed to
access a file before actually performing the operation to access it.
However, such usage has a fundamental TOCTOU race which is always a
vulnerability. So, essentially, the access function is useless for its
original designed purpose.

 Anyhow, an alternate proposal for how to handle rm -rf:
 Try deleting files, and handle errors by the chmod path if errno == EPERM.
 
 Is there a reason not to do this?

This is the right way to do it. I just wrote to the musl list (and
CC'd toybox/landley) with an outline of how to do this that handles (I
think) all of the possible file/dir-replacement races.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


[Toybox] faccessat and AT_SYM_NOFOLLOW

2014-09-26 Thread Rich Felker
This message is a follow-up to a discussion Rob Landley and I had on
#musl regarding musl's returning an error of EINVAL when the
AT_SYM_NOFOLLOW flag is passed to faccessat (a nonstandard usage),
which is affecting toybox. Rob's idea for using it came from the Linux
man pages, which document this flag as supported and do not make it
clear that it's glibc, not Linux, providing the support.

Issue 1: Is the inclusion of AT_SYM_NOFOLLOW in the man page a total
documentation error (not actually supported by glibc at all) or just a
failure to mark it as a glibc extension?

Here's the relevant glibc file:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/faccessat.c;h=4a6048ec7930c8fc249ee629b1d2618cd81084b0;hb=HEAD

As I read it:

1. It sets EINVAL for flags that are invalid, but considers
   AT_SYM_NOFOLLOW valid. So far so good...

2. It flag is 0 or the only flag set is AT_EACCESS and the binary is
   not suid, it just makes the syscall directly. OOPS, big bug -- it
   does not honor changes to the effective uid made by programs
   initially started as root!

3. Otherwise, it uses fstatat, possibly with the AT_SYM_NOFOLLOW flag,
   to get the file ownership and mode and performs its own access
   permissions check in userspace. This is imprecise and does not
   respect ACLs or any other advanced permission models provided by
   LSMs etc.

I was not even aware that this imprecise emulation was used in the
AT_SYM_NOFOLLOW case; I figured they would do it in an exact way,
which is possible with some /proc/self/fd tricks.

So my conclusion? There are some moderate-level documentation errors.
glibc implements the flag, but not correctly. The changes I would
recommend to the documentation:

1. Document that AT_SYM_NOFOLLOW is not standard for this function,
   and is a glibc extension. (uclibc is just a copy of glibc code)

2. Document that AT_SYM_NOFOLLOW and AT_EACCESS are emulated and
   unreliable on glibc.

3. Document that the man page is covering the POSIX/glibc function
   details, and the kernel syscall does not support flags at all.
   (This might aid in getting the kernel folks to add a new faccessat4
   syscall that would do flags at the kernel level.)

Do these sound reasonable?

Issue 2: Should musl support or ignore the AT_SYM_NOFOLLOW with
faccessat?

If anyone would like to see support, I can consider it for a future
agenda item (and of course: patches welcome), but I'm not going to add
an implementation that reports incorrect information. Just ignoring
the flag would give the wrong results (possibly dangerous) and would
leave the application no way to know that the flag was ignored
(whereas failing with EINVAL makes it clear, and is explicitly
documented as an optional error condition for invalid flags).

However:

Issue 3: Does the AT_SYM_NOFOLLOW flag even make sense with faccessat?

The whole point of AT_SYM_NOFOLLOW is to avoid TOCTOU races checking
for symlinks. However, any use of faccessat is fundamentally a TOCTOU
race -- the information it obtained is not necessarily correct by the
time it returns. Reducing two TOCTOU issues in succession to one
TOCTOU issue does not seem useful for most purposes.

Issue 4: Does faccessat, with or without AT_SYM_NOFOLLOW, make sense
for implementing rm -rf?

No. At best it's a wasted syscall that slows you down, and at worst it
gives you wrong information. The efficient and correct implementation
is to simply _try_ the operation which might fail (openat?) and only
change file permissions and retry if it failed with EACCESS. This
requires zero extra syscalls in the success case (versus one extra
with faccessat).

As for AT_SYM_NOFOLLOW, I'm not clear why it was being used. Even if
the inaccessible file is a symlink target, you'd need to use
AT_SYM_NOFOLLOW or similar when doing the fchmodat, and that one's
what actually protects you from races used to trick rm into changing
permissions on files it shouldn't. Use of the result from faccessat to
make the decision is a TOCTOU race.

Rich
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net