Re: [musl] bbox: musl versus uclibc
On Mon, Aug 14, 2017 at 07:43:39PM +0200, Denys Vlasenko wrote: > As uclibc is increasingly aging, I am finally forced > to switch to musl: I'm bitten by a nasty bug in > getopt() - hush is using it in a slightly unusual way, > which uclibc does not expect. While I'm glad musl is working for you, this sounds fragile unless it's actually just a bug in uclibc, in which case it'll hopefully get fixed. If you're relying on weird internal-state of getopt there's a chance this could break in the future on musl or on other libcs. Can you explain the problem you encountered on uclibc? > I built a toolchain using > https://github.com/richfelker/musl-cross-make > (Rich, is this the thing I should be using?) > and it worked with no issues at all. Yes, it's the easiest and canonical way to get a full toolchain. > (I can probably only wish for the README > to also mention how to make this a _static_ > toolchain... I have a box with 32-bit userspace, > would be awesome to be able to copy this fresh > 64-bit toolchain to it and have it working). If you put LDFLAGS="-static --static" (second form is a hack to workaround libtool's stripping of -static :) in COMMON_CONFIG you should get a static build of the toolchain, but it's linked to whatever the libc in your build environment is. In my config.mak I use: ifeq ($(HOST),) COMMON_CONFIG += CC="i486-linux-musl-gcc -static --static" CXX="i486-linux-musl-g++ -static --static" endif COMMON_CONFIG += CFLAGS="-g0 -Os" CXXFLAGS="-g0 -Os" LDFLAGS="-s -static --static" to build the toolchain static-linked using an existing i486/musl-targeting toolchain. The static in $CC/$CXX is needed because some broken configure tests in GMP try to run test binaries built without honoring LDFLAGS, and will fail to run if you don't have ld-musl-i386 installed. Now that canadian-cross support works, you could probably instead treat i486-linux-musl as a cross compiler (host!=build) instead. If you prefer you can use x86_64 instead, too, but I like the toolchain binaries themselves being 32-bit simply because it saves roughly half the ram usage when running them. > Then I built busybox. Impressions: > > Only a few options did not build: > EXTRA_COMPAT and FEATURE_VI_REGEX_SEARCH > failed because they need GNU regexp extensions. Yes. There was a request for the corresponding version of some of these extensions in the POSIX regex API, which bb vi could be converted to use if we adopted them, but there is some nontrivial runtime cost. It probably doesn't matter now since the regexec core is fairly slow (well, good big-O but bad constant) right now, but it may impose a more noticable relative cost once we make regexec faster. > FEATURE_MOUNT_NFS and FEATURE_INETD_RPC do not build > because they need rpc/rpc.h. > Not complaining, since them being in libc was a mistake > in the first place. I think glibc has dropped these too now, haven't they? AFAIK they're just keeping the symbols for ABI compat but they're not an exposed API for linking new code; you're supposed to use tirpc. > Now, the good news - musl has smaller data! > 6695 bytes versus 7129 bytes for uclibc: > >text data bss dechex filename > 894902 465 6664 902031 dc38f busybox.uclibc > 912538 563 6132 919233 e06c1 busybox.musl Probably getpw*/getgr* static buffers or something. musl's backend for these just uses the buffer out of getline so as not to impose cost in apps that don't need the legacy (non-_r) functions or an arbitrary limit on record length. It would be interesting to know where the text size increase comes from. Not that big in abs difference, but given that a large portion of the text is busybox code and not libc, it's a fairly large relative difference. Come to think about it, it might actually be regex.. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Re: Possible Vulnerability in httpd.c
On Tue, Nov 22, 2016 at 02:30:50AM +0100, Denys Vlasenko wrote: > On Mon, Nov 21, 2016 at 8:18 PM, Simon Rettberg >wrote: > > On Mon, 21 Nov 2016 20:37:14 +0200 > > Timo Teras wrote: > >> > >> It is still good practice to fill it with snprintf. If this is done, > >> proper error checking should be done to check the final 'len' that it > >> does not exceed IOBUF_SIZE or you have information leak bug (since > >> snprintf returns the length it would generate if buffer was unbounded; > >> not the length it actually wrote to the buffer). > >> > > Exactly. It typically goes like this: Someone is using functions generally > > considered "unsafe" because they know for sure it is not exploitable the way > > it's being used in this specific instance. Then eventually someone else > > comes > > along, adds feature X using unsafe functions as well (you try to do it the > > same > > way the rest is written, right?), and boom, you suddenly got your exploit > > because X happens to enable the remote user to inject arbitrarily long data > > (think some %s the user can control). > > Yes, I am aware this happens. > > > It should be fixed properly, handling the case where the return value is > > either > >> BUFSIZ or even < 0. > > If you are afraid that there can be bugs of the "we thought it can't overflow, > but it in fact can", then why you aren't afraid that length checking code > wouldn't have bugs? The latter is more obvious and easier for people not familiar with the code to check. The former relies on a lot of hidden assumptions including the value of BUFSIZ and the sizes of types. > Different projects choose their paranoias differently. > >From its inception, bbox was paranoid about code size. > > If you see an actual bug where buffer can overflow, > I'm more than willing to fix it. > > But if there is no actual bug, and it's just a general concern > that "it looks unsafe", then code size trumps it. Have you stopped to consider the size from pulling in the deprecated sprintf function to begin with? If all references to it were removed, then static-linked busybox would only have snprintf, not sprintf. On musl/i386 this would only save about 50 bytes but it might save more on other archs or libcs. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Current git HEAD busybox segfaults on some applets
On Fri, Sep 16, 2016 at 04:35:23PM +, Laurent Bercot wrote: > > A full gdb output is available here: http://pastebin.com/3k6SENiX > > The issue comes from the fact that fflush(stream) is #define'd as > fflush_unlocked(stream), so fflush(0) actually runs > fflush_unlocked(0), which segfaults with the current version of > musl (but not glibc). > > fflush_unlocked() and friends are not part of the standard API > (only getc_unlocked and putc_unlocked are) and it is a mistake > to assume they 1. exist, 2. behave the same as their non-unlocked > counterpart. > > The culprit is this commit: > https://git.busybox.net/busybox/commit/include/libbb.h?id=aa3576a29b9619f4e1c1b131f5db53ad2bc2cb00 > > Later commits modify the additions in libbb.h, but those > additions are incorrect in the first place. I will send a > patch that removes them. More to the point, they're unnecessary and unsafe. The *_unlocked functions are not well-specified, and except for getc/putc, they're not going to yield you any performance benefits anyway. The issue here happened because musl's fflush_unlocked is just provided as an alias for the internal function fflush() calls once the file is locked, or for each function in the list of all open files when fflush(0) is used. This internal function is not capable of handling the fflush(0) behavior itself. Maybe it should be able to; I'm leaning towards getting rid of all the *_unlocked functions except the standard ones and making the rest aliases for the non-_unlocked version just to keep this kind of thing from happening in the future. But I don't think it's a good idea for applications to be using them. They were added purely for compatibility with broken software that assumes they exist, not because they're useful for anything. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Tue, Jul 12, 2016 at 01:09:42PM -0400, Michael Conrad wrote: > On 7/7/2016 11:49 AM, Rob Landley wrote: > >On 07/06/2016 11:41 AM, Etienne Champetier wrote: > >>>Now you really hate the fact that getrandom() is a syscall. > >I do not hate the fact getrandom is a syscall. I'm asking what the point > >is of a new applet to call this syscall. You have suggested it could > >block to show when /dev/urandom is producing a higher grade of > >randomness than it does before being properly seeded. That is, as far as > >I can tell, the only reason for your new applet to exist rather than > >upgrading $RANDOM in ash/hush. > > > Actually in my opinion the syscall is inferior to a new character > device, because blocking syscalls interfere with event-driven > programming. > > Suppose you want to write a single-thread event-driven web server > which initializes its SSL library with this randomness source (i.e. > won't allow SSL until enough entropy is collected for a good > initialization) but you still want it to be able to accept non-SSL > connections. In order to use the syscall you need a thread, or > child process. (haha, such as pipe to a "getrandom" applet...) Thanks for contributing your ideas about what color the bikeshed should be, but getrandom's already got you covered. Until your csPRNG is initialized, just call getrandom with the GRND_NONBLOCK once each time you get any event (i.e. each time poll() returns). There's no need to busy-wait or periodically check with timeouts since you don't need the results until there's an event to act on, anyway. There are very good reasons it's a syscall rather than a device: many use cases require a never-fails entropy source, and with the device node approach they're vulnerable to fd-exhaustion attacks. Most existing bad code, when faced with such a situation, falls back to some completely insecure seed like time(). The only reliable way to prevent such idiocy was to provide an interface that can't fail. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Mon, Jul 11, 2016 at 08:19:29PM +0200, Denys Vlasenko wrote: > On Mon, Jul 11, 2016 at 8:58 AM, Rob Landleywrote: > > Anyway, the point is you can't voluntarily collect _more_ entropy from > > these sources. They run when there's data to gather, calling the > > routines when there isn't new data to gather doesn't help much. > > And yet, interestingly, if you try to solve an opposite problem, > the predictable and accurate benchmarking, you suddenly discover that > the damn thing just can't be made to run completely deterministically. The fact that your benchmark timings vary from run to run doesn't have anything to do with whether they're deterministic or predictable. It's mostly a function of the full state of the kernel (and all running processes) -- aside from asynchronous hardware events, the main factors which influce the timings are the points at which the scheduler interrupts, the state of the cache, page tables, and tlb, etc. On modern x86's there's quite a degree of variation, for complicated reasons that basically all stem from (1) unpredictable asynchronous peripheral hardware and (2) internal unpredictability due to the tricks involved in running at such high clocks. On small embedded systems with little or no peripheral hardware, though, things are quite predictable. I don't have figures but my (educated, based on kernel debugging) guess would be that it's common for the number of possible states when enterring init to be on the order of tens or at most hundreds. > Let's boot in single user, and run just one process, you say. > Let's pin it to CPU0, you say. Switch off all dynamic CPU scaling, > you say. No way. Still not totally deterministic. Your logic is badly^H^H^H^H^Hdisastrously wrong here. Absolute or even near determinism is not needed for there to be a major vulnerability. Even if there are on the scale of 2^32 possible initial states, computing all the resulting keys for each and trying them all is a trivial computation even for individual attackers. With modern cloud computing you don't need to be a nation-state to do this kind of brute forcing. > I tried timing CPU cycle counter delta across a CPUID(0) instruction. > That alone had some jitter. I collected a few megabytes of it and ran > it through entropy estimator, it said there is ~0.9 bit of randomness per > byte. > (I'm not claiming this is true on all CPUs. I'm only sharing my experience > on my CPU). > > I tried sending signals to my own process and timing how long does that take. > Also varies. Your interactive experiments with an interactive system are meaningless because you have the biggest entropy input of all: the timing of your own human-generated input. In order for them to be meaningful you need to start from cold-boot and script the measurements to automatically run. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Wed, Jul 06, 2016 at 12:31:01AM -0500, Rob Landley wrote: > On 07/05/2016 11:49 PM, Rich Felker wrote: > > On the other hand, /dev/urandom has a problem that it will give > > results before sufficient entropy has been collected, resulting in > > duplicate sequences (and thus duplicate keys generated) on identical > > devices with a fairly high probability. > > Which is why you'd read a byte from /dev/random first if you need to > wait for the pool to have entropy in it? Given that there's an existing > mechanism that will block that'sbeen here for 20 years? The answer to why this is not a solution is in the previous paragraph that you didn't quote. > > The getrandom syscall was added both to address this deficiency in > > urandom, and to address the case where random bytes are needed but no > > file descriptors are available (to protect against fd exhaustion > > attacks undermining crypto, basically). > > If no file descriptors are available, you can't launch a getrandom command. This text is about the syscall not the proposed busybox command. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Tue, Jul 05, 2016 at 09:33:04PM -0500, Rob Landley wrote: > On 07/05/2016 06:42 AM, Etienne Champetier wrote: > >> You've implied that this new API can block until it's initialized, which > >> reading from /dev/random can already do, and presumably > >> select/poll/inotify could do on /dev/random without consuming entropy. > > > > As shown in my mail from 29 june 2016 at 15:54 > > [ 14.321536] ### getrandom ### > > [ 42.603677] ### dd ### > > > > getrandom() wait until /dev/urandom is initialized, which can be way > > before /dev/random is initialized > > Ok, this is a new assertion. How does that work? How would /dev/urandom > get random data to seed it without /dev/random having enough entropy for > one byte of output? The key point is that /dev/random is "tinfoil hat compliant". The things it does (and especially considering entropy to become "depleted") have no basis in reality but can't change because people will complain (bikeshedding). /dev/random is safe to use, but very slow and costly, _even after_ it has collected sufficient entropy (because it wrongly considers that entropy to become "depleted"). On the other hand, /dev/urandom has a problem that it will give results before sufficient entropy has been collected, resulting in duplicate sequences (and thus duplicate keys generated) on identical devices with a fairly high probability. The getrandom syscall was added both to address this deficiency in urandom, and to address the case where random bytes are needed but no file descriptors are available (to protect against fd exhaustion attacks undermining crypto, basically). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Bug 9076 - Whois using a non working host for queries by default
On Mon, Jul 04, 2016 at 09:36:07PM -0500, Rob Landley wrote: > On 07/04/2016 02:47 PM, Vito Mulè wrote: > > stopped using strcpy, > > Why? > > > +size_t query_len = strlen(argv_host); > > + char *str_token = malloc(query_len+1 * sizeof(char)); > > > + strncpy(str_token, argv_host, query_len+1); > > Ignoring the way you've taken 3 lines to say strdup() (and your > sizeof(char) is only applying to the 1, not query_len+1, but it's ok > because sizeof(char) is a constant 1 on every system linux has ever > supported)... Not just Linux. sizeof(char) is literally the answer to the question "how many chars in a char?" or "how big is char in units of char?" Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Sun, Jul 03, 2016 at 06:44:22PM +0200, Denys Vlasenko wrote: > On Tue, Jun 28, 2016 at 6:56 PM, Etienne CHAMPETIER >wrote: > > first user of this applet will be LEDE (OpenWrt) to save an urandom seed > > using getrandom() (so we are sure /dev/urandom pool is initialized) > > > Please be more specific which task you want to achieve with this tool. > > I googled for this name and no such tool exists (yet?) in distros, > I hesitate to introduce a new tool and then have an API collision > when they add something similar. > > BTW, I know that security people would scream bloody murder, > but wouldn't > > cd /proc && cat cpuinfo meminfo stat interrupts diskstats slabinfo > schedstat buddyinfo >/dev/random > > in practice work quite satisfactorily for adding some entropy at boot time? > If you don't think so, can you demonstrate a setup where the output is > predictable? Probably at most a few bits on SoCs without unpredictable peripherals involved. I could try running some tests on single-core J2 when I get a chance, but I would not expect this approch to be suitable for anything where you depend on the output being secret/unpredictable. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] getrandom: new applet
On Sun, Jul 03, 2016 at 06:46:32PM +0200, Denys Vlasenko wrote: > On Tue, Jun 28, 2016 at 7:47 PM, walter harmswrote: > > perhaps a better aim ist to improve the $RANDOM in ash ? > > $RAMDOM generator in my tests passed all "dieharder -g 200 -a" tests. > How much better than this do you need? That's not saying much. A tempered LCG with 32-bit state/period can pass all of dieharder except the period tests. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: kbuild overwrites $CC
On Fri, Jul 01, 2016 at 06:16:59PM +0200, Martin Kaiser wrote: > Dear all, > > I'm trying to compile busysbox with a toolchain created from > yoctoproject. This toolchain comes with a setup script to set $CC and > cross-compiler related environment variables. If the toolchain is > installed to a non-standard path, $CC contains the --sysroot parameter, > e.g. > > martin@vagrant-ubuntu-trusty-32:~/src/busybox$ echo $CC > arm-poky-linux-gnueabi-gcc -march=armv5te -marm -mthumb-interwork > --sysroot=/opt/poky-1.7.3/sysroots/armv5te-poky-linux-gnueabi > > Busybox' Makefile unconditionally overwrites CC with > > CC = $(CROSS_COMPILE)gcc > > CROSS_COMPILE ist set up correctly in the environment, so I get the > right compiler but the --sysroot parameter is stripped. As the include > path depends on sysroot, the compiler doesn't find some include files > and busybox compilation fails. > > Do you have an idea how this is supposed to work? Should I submit a > patch that replaces CC = ... with CC ?= ... ;-) In general, putting CC= on the make command line rather than in the environment will override any make variables (unless an obscure GNU make feature to force them is used, and it's not used here). If you already have it in the environment, this will work: make CC="$CC" or if you want _all_ vars from the environment to take precedence: make -e That can be dangerous though since makefiles might happen to use var names that happen to clash with something in your environment for purely internal purposes, and in a worst case it could end up leading to rm -rf of some directory you care about (I doubt this is likely with the busybox makefile but I still don't recommend -e). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH v3 3/4] nsenter: new applet
On Mon, Mar 14, 2016 at 02:27:56PM -0400, Mike Frysinger wrote: > On 14 Mar 2016 18:11, Bartosz Gołaszewski wrote: > > 2016-03-14 15:27 GMT+01:00 Mike Frysinger: > > > On 14 Mar 2016 11:07, Bartosz Golaszewski wrote: > > >> +#ifndef __BB_NAMESPACE_H > > >> +#define __BB_NAMESPACE_H > > > > > > use a naming style like other busybox headers > > > > > >> +/* > > >> + * Longest possible path to a procfs file used in namespace utils. Must > > >> be > > >> + * able to contain the '/proc/' string, the '/ns/user' string which is > > >> the > > >> + * longest namespace name and a 32-bit integer representing the process > > >> ID. > > >> + */ > > >> +#define NS_PROC_PATH_MAX (sizeof("/proc//ns/user") + sizeof(pid_t) * 3) > > > > > > using the sizeof pid_t as a proxy for how many chars it'd take to render > > > a decimal number in ASCII is wonky. just hardcode it as "10" since that > > > is the largest unsigned 32bit number ("4294967296"). > > > > Can you elaborate on how it's wonky? While your solution is perfectly > > fine I think that there's nothing wrong with the way I've done it > > neither. > > the code seems to assume that the byte size scales into the number of > chars required to represent the number in base 10 when it's really a > log scale. here's a table to show it's wonky: sizeof is already a log scale. 3*sizeof(pid_t) is 3*log_256(MAX_PID+1), which is greater than 3*ceil(log_1000(MAX_PID)), the actual space requirement. > pid_t |sizeof |*3 val|actual| > size |(bytes)|(char)| | > (bits)| | | | > --|---|--|--| > 8 | 1| 3 | 4 | > 16 | 2| 6 | 6 | > 32 | 4| 12 | 11 | > 64 | 8| 24 | 20 | > 128 | 16| 48 | 40 | > > the "actual" value is one higher than you might expect because pid_t > is a signed quantity. for 8bit, "-128" is 0x80 and takes 4 bytes. > for 32bit ("int"), "-2147483648" is 0x8000 and takes 11 bytes. While pid_t is signed, negative values are not meaningful as pids; they're pgids or special sentinels in some contexts. If this code is trying to print a negative pid into a pathname used in /proc, that's a bug. But in general, when using the 3*sizeof idiom you should add +1 for a sign if needed and another +1 for null termination if not already included elsewhere. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH v3 3/4] nsenter: new applet
On Mon, Mar 14, 2016 at 10:27:19AM -0400, Mike Frysinger wrote: > On 14 Mar 2016 11:07, Bartosz Golaszewski wrote: > > +#ifndef __BB_NAMESPACE_H > > +#define __BB_NAMESPACE_H > > use a naming style like other busybox headers And in particular, don't use leading underscores, ever. They're not available for use by applications. > > +/* > > + * Longest possible path to a procfs file used in namespace utils. Must be > > + * able to contain the '/proc/' string, the '/ns/user' string which is the > > + * longest namespace name and a 32-bit integer representing the process ID. > > + */ > > +#define NS_PROC_PATH_MAX (sizeof("/proc//ns/user") + sizeof(pid_t) * 3) > > using the sizeof pid_t as a proxy for how many chars it'd take to render > a decimal number in ASCII is wonky. just hardcode it as "10" since that > is the largest unsigned 32bit number ("4294967296"). I disagree that this is "wonky". It's a simple safe bound for number of characters needed to print an integer type. Hard-coding a number, while unlikely to affect any future Linux targets, is a bomb in the code that could become dangerous if someone reused it on a system with pid_t larger than 32-bit. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 0/2] ntpd: retry name-resolution until success
On Thu, Feb 11, 2016 at 05:56:06AM +0100, Denys Vlasenko wrote: > On Thu, Feb 4, 2016 at 2:43 PM, Mark O'Donovanwrote: > > This is an attempt to fix bug 8131: > > ntpd: should retry on name resolving error > > Should it? > Why doesn't ping do this? telnet? netcat? No network utility does this. > Adding it to ntpd only would be inconsistent. Because ping, telnet, and netcat are not daemons. A daemon should respond gracefully to transient failures rather than aborting. > I can see why this request comes specifically for ntpd. It is often started > early at boot. Sometimes, network isn't up for some tens of seconds > after boot, so ntpd fails to resolve names. > > There are other ways to fix this. > An utterly simplistic one would be to delay ntpd startup by, say, 20 seconds. That's utterly broken. You end up with an invalid or badly-wrong clock for the first 20 seconds after boot, which could lead to all sorts of problems with timestamps. > A slightly better one is to restart ntpd when it fails. I don't see how this is any better than having it retry internally. It's probably more work to implement and yields worse behavior. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH V2]networking: Allow dot at the end of the domain name in dhcp response
On Thu, Feb 11, 2016 at 05:59:39AM +0100, Denys Vlasenko wrote: > On Tue, Feb 9, 2016 at 6:23 PM, Balaji Punnuru> wrote: > > When a dhcp server responds with a domain name that ends with a ".", > > domain name validation is failing which leads to populating domain bad in > > resolv.conf > > > > Domain name ending with . is a valid syntax according RFC-1034. > > > > The Patch fixes the domain name validation which ends with "." > > Well, this may be technically allowed, but wouldn't you think that > such DHCP response is definitely odd, and it's better to know about it? Maybe a log message for the oddity, but it should not result in active misconfiguration. Ending dns names with a dot is valid and IIRC historically this practice was used to indicate that the name is anchored to the dns root rather than possibly being interpreted relative to some domain. The standard APIs for lookups all accept names ending with a dot, and this should suppress search domains. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Fix compiling with musl's utmp stubs
On Sun, Jan 31, 2016 at 01:02:27AM -0500, Kylie McClain wrote: > >From 4d76d9b69503b1ad61cf857d83cb8c4d5d92465b Mon Sep 17 00:00:00 2001 > From: Kylie McClain> Date: Sat, 30 Jan 2016 23:00:18 -0500 > Subject: [PATCH] Fix compiling with musl's utmp stubs > > This patch fixes compiling busybox with FEATURE_UTMP and _WTMP enabled. > musl, while not really support utmp/wtmp, provides stub functions, as well > as variables such as _PATH_UTMP, so that programs using utmp or wtmp can > still compile fine. > > My reasoning for this patch is that on Exherbo, I'm currently trying to get > us to be able to use the same busybox config file for both glibc and musl > systems, using utmp/wtmp on systems that support it, and using the stubs > on musl without needing two different configs. > > As of latest musl git, it provides all utmp functions needed; 1.1.12 doesn't, > but I sent a patch to Rich to add the utmp{,x}name functions expected to > exist, which was merged into musl upstream. > --- > include/libbb.h | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/include/libbb.h b/include/libbb.h > index a8ceb44..c1516a9 100644 > --- a/include/libbb.h > +++ b/include/libbb.h > @@ -20,6 +20,7 @@ > #include > #include > #include > +#include > #if defined __UCLIBC__ /* TODO: and glibc? */ > /* use inlined versions of these: */ > # define sigfillset(s)__sigfillset(s) > @@ -106,7 +107,11 @@ > # define updwtmpx updwtmp > # define _PATH_UTMPX _PATH_UTMP > # else > +# include > # include > +# if defined _PATH_UTMP && !defined _PATH_UTMPX > +# define _PATH_UTMPX _PATH_UTMP > +# endif > # endif > #endif > #if ENABLE_LOCALE_SUPPORT > -- > 2.7.0 I think we could just expose these from utmpx.h on musl if it helps. They're in the reserved namespace so that's certainly permissible. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Remove a compilation warning
On Fri, Jan 01, 2016 at 02:44:25PM -0500, Jody Bruchon wrote: > How about putting the left side in parentheses to silence the > warning instead of turning the warning off completely? My point was that the parentheses make the code less readable, not more readable, and that nobody would reasonably mistake the operator precedence in this case or in any case reported by -Wlogical-not-parentheses. It's a bogus warning. Rich > On January 1, 2016 2:14:55 PM EST, Rich Felker <dal...@libc.org> wrote: > >On Thu, Dec 31, 2015 at 11:01:24AM +0100, Cristian Ionescu-Idbohrn > >wrote: > >> Reported by gcc (Debian 5.3.1-4) 5.3.1 20151219 > >> > >> shell/ash.c: In function 'evaltree': > >> shell/ash.c:8432:19: warning: logical not is only applied to the left > >hand side of comparison [-Wlogical-not-parentheses] > >>if (!exitstatus == is_or) > >>^ > > > >I would prefer turning off this warning instead of making the code > >harder to read. If someone would read !a == b as !(a==b) then they > >have much bigger C comprehension problems they need to work on. > > > >Rich > >___ > >busybox mailing list > >busybox@busybox.net > >http://lists.busybox.net/mailman/listinfo/busybox > > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Remove a compilation warning
On Thu, Dec 31, 2015 at 11:01:24AM +0100, Cristian Ionescu-Idbohrn wrote: > Reported by gcc (Debian 5.3.1-4) 5.3.1 20151219 > > shell/ash.c: In function 'evaltree': > shell/ash.c:8432:19: warning: logical not is only applied to the left hand > side of comparison [-Wlogical-not-parentheses] >if (!exitstatus == is_or) >^ I would prefer turning off this warning instead of making the code harder to read. If someone would read !a == b as !(a==b) then they have much bigger C comprehension problems they need to work on. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: how to use ntp in busybox
On Tue, Nov 24, 2015 at 10:40:51AM +0100, walter harms wrote: > > > Am 23.11.2015 07:49, schrieb Wei, Catherine: > > Hi: > > I've checked the ntpclient, and found that it can monitor the time > > difference between server and local machine, but seems it cannot support > > runtime configuration as Michael D. Setzer II said. > > > > We use it busybox ntp for some time now, also with ntpclient since it is > convenient. > btw: ntpdate is removed from ntp some time ago. > > Why do you need a runtime configuration at all ? you can always restart with > a new setup, > write a small wrapper and you can use a file again. Indeed, this seems ideal. Proper use of ntp only makes extremely-small adjustments to the clock rate, so stopping the ntp client for a fraction of a second (or even several hours) and restarting it with new configuration is not going to have any noticable effect on the system clock. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] fix hush umask command
On Wed, Oct 07, 2015 at 12:31:15AM -0400, Rich Felker wrote: > The hush umask command is backwards; umask 022 sets the umask to 0755. > Attached patch should fix it. > > Rich > diff --git a/shell/hush.c b/shell/hush.c > index 96c739f..58c8dab 100644 > --- a/shell/hush.c > +++ b/shell/hush.c > @@ -8956,9 +8956,7 @@ static int FAST_FUNC builtin_umask(char **argv) > if (argv[0]) { > mode_t old_mask = mask; > > - mask ^= 0777; > rc = bb_parse_mode(argv[0], ); > - mask ^= 0777; > if (rc == 0) { > mask = old_mask; > /* bash messages: Hmm, this may break symbolic umask specs, a feature I've never seen used and wasn't aware even existed. I might look into it more later but it would be great if someone who already knows this stuff could figure it out and work out a correct patch. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
[PATCH] fix hush umask command
The hush umask command is backwards; umask 022 sets the umask to 0755. Attached patch should fix it. Rich diff --git a/shell/hush.c b/shell/hush.c index 96c739f..58c8dab 100644 --- a/shell/hush.c +++ b/shell/hush.c @@ -8956,9 +8956,7 @@ static int FAST_FUNC builtin_umask(char **argv) if (argv[0]) { mode_t old_mask = mask; - mask ^= 0777; rc = bb_parse_mode(argv[0], ); - mask ^= 0777; if (rc == 0) { mask = old_mask; /* bash messages: ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] su: support denying accounts with blank password
On Thu, Oct 01, 2015 at 03:42:40PM -0300, Alain Mouette wrote: > Why would you want to completely disable root login? > > If it is a security feature, how can it be used? It can be > interesting to avoid escalating priviledges, but then how to to > administrative tasks? (assuming ssh root login is disabled too) Requiring pubkey-based ssh login is the canonical best-practices way. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [RFC/PATCH v2 3/5] libbb: add ends_with() function
On Fri, Aug 21, 2015 at 06:26:27PM +0200, walter harms wrote: Am 21.08.2015 16:23, schrieb Bartosz Golaszewski: This function checks if given key can be found at the end of the string. Signed-off-by: Bartosz Golaszewski bartekg...@gmail.com --- include/libbb.h | 1 + libbb/compare_string_array.c | 30 ++ 2 files changed, 31 insertions(+) diff --git a/include/libbb.h b/include/libbb.h index a56b684..7b41c9b 100644 --- a/include/libbb.h +++ b/include/libbb.h @@ -422,6 +422,7 @@ const char *bb_basename(const char *name) FAST_FUNC; char *last_char_is(const char *s, int c) FAST_FUNC; const char* endofname(const char *name) FAST_FUNC; char *is_prefixed_with(const char *string, const char *key) FAST_FUNC; +int ends_with(const char *str, const char *key) FAST_FUNC; int ndelay_on(int fd) FAST_FUNC; int ndelay_off(int fd) FAST_FUNC; diff --git a/libbb/compare_string_array.c b/libbb/compare_string_array.c index e24815a..a2d77c7 100644 --- a/libbb/compare_string_array.c +++ b/libbb/compare_string_array.c @@ -23,6 +23,19 @@ char* FAST_FUNC is_prefixed_with(const char *string, const char *key) #endif } +int FAST_FUNC ends_with(const char *str, const char *key) +{ + size_t str_len = strlen(str), key_len = strlen(key); + + if (str_len = key_len) { + if (strcmp(str + str_len - key_len, key) == 0) { + return 1; + } + } + + return 0; +} + maybe its a bit late but ... the function name is a bit unfortunate whats about has_suffix() ? you can improve readability by returning strcmp directly and calculating the len immediately. int has_suffix(const char *str, const char *key) { size_t len = strlen(str)-strlen(key); if (len 0 ) return -1; No you can't. len0 is always false because len is unsigned. The original code in the patch was correct; your version is buggy. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: osuosl.org and Spamhaus PBL
On Wed, Jul 22, 2015 at 03:16:15PM +0200, Laurent Bercot wrote: X-Greylist: delayed 00:06:59 by SQLgrey-1.7.6 (For the record, I read the headers wrong, and it's osuosl.org that actually performs that greylisting. My apologies to Numericable, for once I accused them wrongly. Now to understand what that greylisting does for osuosl.org... if they're using PBL in the first place, and only accept SMTP connections from known ISP servers... oh well, some things are best left unexplained.) Greylisting (at least if passing results are cached) is a lot less hostile than outright blocking, but I share your sentiment and would strongly prefer that this list (and in general, all mail services used by multiple users, some of whom may not want this hostile blocking) not use the PBL. I've even had my (non-dynamic, business class) IP addresses blocked by it in the past and nearly had to pay ransom to get them delisted. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 1/2] Revert ash: use alloca to get rid of setjmp
On Wed, Jul 22, 2015 at 04:02:22PM +0100, Daniel Thompson wrote: 2015-07-22 5:19 GMT+02:00 Rich Felker dal...@libc.org: On Sun, Jul 19, 2015 at 11:07:13PM +0200, Denys Vlasenko wrote: I would rather keep it. What is the most horrible thing which can happen here? Arbitrary code execution due to stack overflow. Does this really need a PoC? alloca is _always_ unsafe unless the argument is bounded and tiny. It would interesting to know if ash ever automatically runs its tokenizer over environment variables. If the tokenizer can only run on the command stream then there's not much to be gained from overflowing the stack since anyone who can inject an evil token in to command stream already has shell access. This is not the case. A counterexample is eval acting on a string constructed from untrusted input that was already validated to be safe (e.g. to consist entirely of alphanumeric characters). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 1/2] Revert ash: use alloca to get rid of setjmp
On Sun, Jul 19, 2015 at 11:07:13PM +0200, Denys Vlasenko wrote: I would rather keep it. What is the most horrible thing which can happen here? Arbitrary code execution due to stack overflow. Does this really need a PoC? alloca is _always_ unsafe unless the argument is bounded and tiny. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 2/2] ash: use alloca to get rid of setjmp
On Mon, Jul 13, 2015 at 04:25:02AM +0200, Denys Vlasenko wrote: On Thu, Jul 2, 2015 at 10:01 AM, Ron Yorston r...@frippery.org wrote: Rich Felker wrote: In general alloca is unsafe. It's not obvious to me what the code here is doing, so I can't tell for sure if it's safe or not, but I think this needs a strong justification of safety before being acceptable. It's a parser for a POSIXy shell, I doubt that the code is obvious to anyone. My understanding is that it's reading a token and has got to the point where a command substitution has been detected. It wants to save the bit of the token it's already processed. So if we have echo very long string`date` the code would allocate space for the very long string. In practice, it would be a problem if very long string is some hundreds of kbytes, even a few mbytes long. Is this safe? In most cases it probably is, but not if the script is malicious. If the very long string is too big for your stack you get a seg fault or worse. With a suitably long string and small stack I can reliably crash dash. With a sufficiently small memory limits you can crash any shell. Let's go with this change, unless someone sees a way to not just crash, but do something worse. I suspect it can easily be made to do arbitrary code execution when otherwise-safe (e.g. checked against whitelist for special chars) strings from untrusted input are expanded inside eval commands. Any new use of VLA/alloca should be completely banned. It's basically always an exploitable bug. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 2/2] ash: use alloca to get rid of setjmp
On Wed, Jul 01, 2015 at 04:46:18PM +0100, Ron Yorston wrote: Now that the only thing protected by setjmp/longjmp is the saved string, we can allocate it on the stack to get rid of the jump. Based on commit bd35d8e from git://git.kernel.org/pub/scm/utils/dash/dash.git by Herbert Xu. In general alloca is unsafe. It's not obvious to me what the code here is doing, so I can't tell for sure if it's safe or not, but I think this needs a strong justification of safety before being acceptable. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Re: [EDT][PATCH 0/11] typo fixes in busybox
On Fri, May 22, 2015 at 03:39:18AM +, Maninder Singh wrote: EP-E9D7571734A347E2ADA07C4134AB97EA Hi, Please fix your MUA or MTA to omit that EP- thing in the mail-body. Hi Bernhard, Already requested for fixing, from Next time hopefullt it will be fixed. Do i need to resubmit the patches if changes are acceptable? You're also breaking threading by sending a new message rather than replying (the In-Reply-To: header is missing from this reply). From the headers in your messages, your mail software looks like it's seriously misconfigured.. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Bionic lacks transparent LFS migrations; provide a workaround
On Tue, Apr 28, 2015 at 09:53:26PM -0400, Matt Whitlock wrote: On Wednesday, 29 April 2015, at 1:26 am, Tanguy Pruvot wrote: For this case i generally cast the param to the biggest possible type. printf(llu, (uint64_t) val); You probably shouldn't assume that uint64_t is the largest possible integer type. In particular, off_t is not guaranteed to be = 64 bits. #include inttypes.h printf(%PRIuMAX, (uintmax_t) val); But I hate the idea of involving all the nasty large-integer arithmetic machinery if it's not strictly necessary. PRIuMAX is useless. There's a standard name for this, %ju. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Bionic lacks transparent LFS migrations; provide a workaround
On Tue, Apr 28, 2015 at 03:55:59PM -0400, Matt Whitlock wrote: On Tuesday, 28 April 2015, at 9:56 am, Rich Felker wrote: What I do think is a bad idea is providing a hackish environment where only some things work correctly with 64-bit off_t and others silently fail or even misinterpret their arguments and blow up. If that can be avoided and the ugliness can be contained in a pit that's isolated to Bionic, I'm not opposed to it. For what it's worth, BusyBox *presently* misinterprets arguments on Bionic because Bionic's struct stat has 64-bit st_ino and st_size, but BusyBox happily assigns the values of these fields into variables of type ino_t and off_t, which breaks at run-time. These Uhg. problems would be apparent at compile-time too, except that BusyBox's build system doesn't specify -Wconversion, so no warnings are produced. Specifying -Wconversion reveals numerous places throughout the BusyBox sources where rvalues are implicitly narrowed. My personal belief is that these warnings should always be enabled, and implicit narrowing conversions should always be avoided by using explicit casts, to make obvious that a change in value is possible, but I am not here to convince the BusyBox devs to eliminate all the implicit narrowing conversions in BusyBox. The implicit narrowing warnings are almost all noise. For instance any arithmetic with types smaller than int will result in implicit narrowing when you store the result. So I don't think it's practical to use them. Besides, I agree that Bionic is at fault for deviating from the standard here by declaring st_ino and st_size as unsigned long long and long long rather than as ino_t and off_t. Yes, this is very broken. This all really needs to be fixed on Bionic's side -- they should drop the support for 32-bit off_t and always remap the standard names (at the header level) to the *64 functions. It would have been better if they'd never defined the non-64-suffixed functions with the wrong 32-bit-off_t semantics to begin with but that mistake was already done. Still they can fix the compile-time behavior without breaking their ABI. I'll work on finishing the Large File Support in Bionic and adding support for _FILE_OFFSET_BITS==64. Is there any chance that a Bionic headers patch could be shipped with BusyBox? Again, there is no reason to require BusyBox users to compile against the latest Bionic headers just to get support for large files. In fact, LFS could be implemented on all versions of Bionic with a simple -include in CFLAGS. (That's almost no different than implementing it in platform.h, but maybe you'd like it better in a separate file.) For Busybox in the short term, I'd say just use whatever works well and doesn't uglify non-Bionic support. But considering how broken this is there should really be a push to fix Bionic in a way all apps can use, not just Busybox. Whether that means upstreaming a patch to fix it or having a patch/preinclude that you can just apply/add to CFLAGS probably depends on what's practical with upstream... Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 5/5] Bionic lacks transparent LFS migrations; provide a workaround
On Mon, Apr 27, 2015 at 06:30:59AM +, dietmar.schind...@manroland-web.com wrote: On Friday, 24 April 2015, at 4:12 am, Matt Whitlock wrote: +#if defined(__BIONIC__) _FILE_OFFSET_BITS == 64 The preprocessor needs to test whether _FILE_OFFSET_BITS is defined before using it in an expression. That's not true. INTERNATIONAL STANDARD ©ISO/IEC ISO/IEC 9899:201x, Programming languages - C, 6.10.1 Conditional inclusion, §4: ... After all replacements due to macro expansion and the *defined* unary operator have been performed, all remaining identifiers ... are replaced with the pp-number 0... Thus, it's not needed to write #if defined(__BIONIC__) defined(_FILE_OFFSET_BITS) _FILE_OFFSET_BITS == 64 It's a matter of policy. While it's valid C, -Wundef generates warnings for such usage and thus policy is to avoid it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Bionic lacks transparent LFS migrations; provide a workaround
On Sat, Apr 25, 2015 at 04:06:23PM -0400, Matt Whitlock wrote: On Saturday, 25 April 2015, at 11:38 am, Rich Felker wrote: On Sat, Apr 25, 2015 at 04:53:33AM -0400, Matt Whitlock wrote: This solves some of the problems arising from Bionic's off_t being 32 bits wide despite _FILE_OFFSET_BITS==64. See BusyBox bug #6908. Note that this doesn't solve all such problems since Bionic lacks 64-bit variants of many standard I/O functions: open, creat, lockf, posix_fadvise, posix_fallocate, truncate, sendfile, getrlimit, setrlimit, fopen, freopen, fseeko, ftello, fgetpos, fsetpos, mkstemp. As long as that's the case, I think this approach of trying to fake 64-bit off_t on Bionic is dangerous and probably best omitted. IMO we should either be pushing for Bionic to fix these things or sticking with 32-bit off_t on Bionic and treating that as a limitation of the platform. Don't throw the baby out with the bathwater. Bionic *does* support large files. In fact, it's mandatory in some cases: there are no 32-bit versions of stat(), lstat(), fstat(), fstatat(), statfs(), fstatfs(), or getdents(); and struct stat, struct statfs, and struct dirent are equivalent to struct stat64, struct statfs64, and struct dirent64 from Glibc, respectively. Furthermore, while Bionic does support 32-bit lseek(), pread(), pwrite(), and ftruncate() with 32-bit off_t, it does also have lseek64(), pread64(), pwrite64(), and ftruncate64() with off64_t. In short, Bionic has most of the syscall-level support introduced by _LARGEFILE64_SOURCE in Glibc, but they did not implement the transparent migration introduced by _FILE_OFFSET_BITS==64. This is easy to implement atop Bionic, however, as it is simply a matter of mapping the relevant 32-bit types and functions to their 64-bit equivalents. (See feature_test_macros(7).) This really can and should be fixed at the header level for Bionic. Nobody should be writing foo64 in sources except as a transparent workaround for broken systems. While you could do the workaround in Busybox, I think it would be much more useful as a patch to the Bionic headers that anyone can apply to the installed headers to fix them, and a patch to send upstream to get the source of the problem fixed. Bionic does *not* implement the 64-bit stdio functions, but those use a different type (fpos_t instead of off_t) and so are not affected by this patch. Would you be more willing to apply this patch if I were to produce a unit test suite demonstrating that 64-bit file syscalls work with off_t while stdio functions remain 32-bit with fpos_t? That's something of a mischaracterization. The only useful stdio functions use off_t: fseeko and ftello. fgetpos and fsetpos are rather useless since all they can do is save/restore a position you've already reached as an abstract position object. They cannot work with any kind of sequential addresses. As for being willing to apply the patch, it's not my decision. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Bionic lacks transparent LFS migrations; provide a workaround
On Sat, Apr 25, 2015 at 04:53:33AM -0400, Matt Whitlock wrote: This solves some of the problems arising from Bionic's off_t being 32 bits wide despite _FILE_OFFSET_BITS==64. See BusyBox bug #6908. Note that this doesn't solve all such problems since Bionic lacks 64-bit variants of many standard I/O functions: open, creat, lockf, posix_fadvise, posix_fallocate, truncate, sendfile, getrlimit, setrlimit, fopen, freopen, fseeko, ftello, fgetpos, fsetpos, mkstemp. As long as that's the case, I think this approach of trying to fake 64-bit off_t on Bionic is dangerous and probably best omitted. IMO we should either be pushing for Bionic to fix these things or sticking with 32-bit off_t on Bionic and treating that as a limitation of the platform. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 1/2] Bionic lacks ttyname_r; provide a workaround
On Fri, Apr 24, 2015 at 05:22:57PM -0400, Matt Whitlock wrote: I think providing an alternative implementation of ttyname_r() in missing_syscalls.c won't work. Busybox cannot be statically linked when ttyname_r is defined in missing_syscalls.o: ld: error: /usr/local/arm-linux-androideabi/bin/../sysroot/usr/lib/libc.a(stubs.o): multiple definition of 'ttyname_r' ld: libbb/lib.a(missing_syscalls.o): previous definition here This is only happening because another function from stubs.o is being pulled in. Passing -Wl,-t might be able to show the reason. But it's probably best to #define ttyname_r to something else when providing a fallback to avoid the problem. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] libbb: remove unnecessary argument to nonblock_immune_read
On Sun, Apr 19, 2015 at 10:50:25AM +0100, Ron Yorston wrote: The loop_on_EINTR argument to nonblock_immune_read is always set to 1. function old new delta xmalloc_reads200 195 -5 pgetc488 483 -5 argstr 13131308 -5 nonblock_immune_read 123 86 -37 -- (add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-52) Total: -52 bytes An even better question is whether the looping code could be eliminated entirely. Are any of the bb commands that use this code installing interurpting signal handlers? If not then the code for supporting EINTR is useless. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [musl] Re: Busybox on musl is affected by CVE-2015-1817
On Wed, Apr 01, 2015 at 10:49:40AM +0200, u-w...@aetey.se wrote: Hi Harald, On Wed, Apr 01, 2015 at 10:11:51AM +0200, Harald Becker wrote: There is a big difference if you talk about suid *root* programs or other suid usage. The former is definitely very dangerous and should be used with extreme care (I think this is the case we are talking about), the later use may even be used to drop privileges (not to raise), or to temporarily hop to the You are right about a remarkable difference between suid root and suid somethingelse, just because root has the very special role in Unix. The difference is there but it's not as remarkable as you think. As soon as you can get control of uid==suid_target_uid, you can do nasty things as that user like modifying the binary that's suid to that user, so that the next user who invokes it invokes malicious code. suid is pretty much across-the-board evil. Besides root vs non-root I doubt there is any clear line between raising and dropping privileges, you replace one set of allowed operations by a different one, that's it. Note that root does not need suid to drop its privileges. Dropping privileges in the sense of suid just means setting the effective and saved uids for the process to the real (invoking, returned by getuid()) uid. privileges of a different user (may be allowing access to some files only by using specific commands). As soon as a program is setuid it has to be written for the purpose of very reliably checking and limiting what it does on behalf of who, independently of how it can potentially be invoked out of context. This is known to be hard, I believe it is harder to do reliably than e.g. issue a request to a daemon - talking about the complexity level. Yes. Said differently, the difference is a huge and perpetually expanding attack surface (all process state inheritable across exec) versus a tiny attack surface (one unix socket). Please don't misinterpret this -- I'm not saying that busybox should be implementing a ping daemon. Rather we should be using the functionality the kernel gave us for avoiding suid (SOCK_DGRAM ICMP) when it's available and making it possible for users to get a fully functional busybox without the need for suid. When used with care and as intended, suid and sgid is a nice feature, but Unfortunately I can not really appreciate its beauty which appears to hide the complexity and/or move it to other parties (like the dynamic linker or the software maintenance infrastructure). Yes it looks simple and efficient but is it, really? It's not. nowadays there are too many Unix novices, who misunderstand or misuse this, punching big holes in every security concern. Unfortunately even seasoned gurus easily create / fail to notice holes! :( I'd much rather eliminate the opportunity for the hole from the start. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Busybox on musl is affected by CVE-2015-1817
For details on CVE-2015-1817, see: http://www.openwall.com/lists/musl/2015/03/30/1 With musl-linked Busybox installed setuid and ping enabled, exploiting this issue is trivial. While CVE-2015-1817 is certainly musl's fault, there are two changes to Busybox I'd like to propose that would have prevented it from being exploitable: 1. Having setuid utilities like ping obtain the resource they need (in the case of ping, SOCK_RAW) without processing user input at all, then fully dropping root (setuid(getuid())) before doing anything. This has been standard practice for setuid programs since the 90s and it feels bad that busybox is not doing it. 2. Reconsider the rejection of the patch to add SOCK_DGRAM support for ping, which allows it to run without root. Do either or both of these sound acceptable? Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [RFC] Proof-of-concept for netlink listener for mdev -i
On Thu, Mar 12, 2015 at 05:51:00PM +0100, Laurent Bercot wrote: On 12/03/2015 16:19, Natanael Copa wrote: netlink listener code that needs to be in memory all the time: http://git.alpinelinux.org/cgit/ncopa/nldev/tree/nldev.c A few comments: [...] - It may be worth it to write functions similar to die(), edie() and dbg() that don't use stdio, and call _exit() instead of exit(). If you manage to scrap stdio, you gain 2 pages of text (with the musl implementation). Could you elaborate on how you measure that? With musl only the parts of stdio you actually use will be linked, and use of exit does not result in linking of any additional code, since the startup code has to call exit(main(...)) anyway. In any case exit and its dependencies are tiny. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [RFC] Proof-of-concept for netlink listener for mdev -i
On Sun, Mar 15, 2015 at 01:22:35AM +0100, Laurent Bercot wrote: On 14/03/2015 20:23, Rich Felker wrote: Could you elaborate on how you measure that? With musl only the parts of stdio you actually use will be linked, and use of exit does not result in linking of any additional code, since the startup code has to call exit(main(...)) anyway. In any case exit and its dependencies are tiny. The die(), edie() and dbg() functions/macros in the example use printf format strings and variable arguments, I assumed they are built around a function of the printf family. gcc 4.8.1, musl from 2014-12-23, Linux 3.10, x86_64. $ cat nostdio.c #include unistd.h int main (void) { write(1, Hello World!\n, 13) ; return 0 ; } $ gcc -O2 -static -o nostdio nostdio.c strip nostdio $ wc -c nostdio 2952 nostdio $ cat stdio.c #include stdio.h int main (void) { printf(%s!\n, Hello World) ; /* avoids gcc magic */ return 0 ; } $ gcc -O2 -static -o stdio stdio.c strip stdio $ wc -c stdio 15224 stdio That's almost 3 pages. Yes. That's printf though, not stdio in general. printf and its dependencies are a good 8k or so. Anyway if that was your intended meaning then you're right. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Fwd: [PATCH] ls: no longer assume a 4-digit year on file timestamp.
On Tue, Feb 17, 2015 at 02:29:05PM +0100, Laurent Bercot wrote: On 17/02/2015 13:52, Explorer wrote: My original motivation was to avoid that annoying comment saying that it's buggy after year , and I believe it should be written right in the first place. Would you prefer the following comment: It's buggy after year 9 ? (Not saying that the patch is a bad idea, but if you want to write it right, then it's a bit more complex. And totally not worth it.) Read the new code again. It works fine for that case as far as I can tell. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] linedit, deluser: use POSIX getpwent instead of getpwent_r
On Sat, Feb 07, 2015 at 09:49:19AM -0800, Isaac Dunham wrote: On Thu, Feb 05, 2015 at 03:52:24PM -0500, Rich Felker wrote: On Thu, Feb 05, 2015 at 09:42:08PM +0100, Denys Vlasenko wrote: struct passwd *getpwent() { static char *line; static struct passwd pw; size_t size=0; if (!f) f = fopen(/etc/passwd, rbe); if (!f) return 0; return __getpwent_a(f, pw, line, size); } I would prefer that even struct passwd is malloced... I don't think it would make much practical difference. It could be changed though. But more importantly, bbox can't optimize only for musl. Other libc'es may have static line buffers there. And musl will eventually be forced to implement getpwent_r() if it wants to be usable for more packages... so... getpwent_r makes no sense; the _r functions are for thread-safe versions of their corresponding legacy functions, but getpwent_r has inherent global state -- the iterator. Whoever made it just wasn't thinking. To make a correct interface like this the caller would need to have an iterator object to pass to the function, but I can't see much merit in inventing a new interface for this. Besides having hidden global state, the man page notes: Other systems use the prototype struct passwd *getpwent_r(struct passwd *pwd, char *buf, int buflen); or, better, int getpwent_r(struct passwd *pwd, char *buf, int buflen, FILE **pw_fp); In other words, according to the manpage, getpwent_r() is decidedly unportable. Per my investigations, Dragonfly/Net/FreeBSD seem to use the same prototype as glibc; apparently Solaris uses the first alternate prototype; and the last mentioned seems to be a reference to Tru64, which uses pw_fp to keep track of its position instead of an iterator. OpenBSD and MirBSD do not implement getpwent_r, as far as I can tell. It should be noted here that multiple conflicting historical definitions of a nonstandard interface are one of the big exclusion criteria musl goes by. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] linedit, deluser: use POSIX getpwent instead of getpwent_r
On Thu, Feb 05, 2015 at 09:42:08PM +0100, Denys Vlasenko wrote: struct passwd *getpwent() { static char *line; static struct passwd pw; size_t size=0; if (!f) f = fopen(/etc/passwd, rbe); if (!f) return 0; return __getpwent_a(f, pw, line, size); } I would prefer that even struct passwd is malloced... I don't think it would make much practical difference. It could be changed though. But more importantly, bbox can't optimize only for musl. Other libc'es may have static line buffers there. And musl will eventually be forced to implement getpwent_r() if it wants to be usable for more packages... so... getpwent_r makes no sense; the _r functions are for thread-safe versions of their corresponding legacy functions, but getpwent_r has inherent global state -- the iterator. Whoever made it just wasn't thinking. To make a correct interface like this the caller would need to have an iterator object to pass to the function, but I can't see much merit in inventing a new interface for this. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] linedit, deluser: use POSIX getpwent instead of getpwent_r
On Thu, Feb 05, 2015 at 02:09:36PM +0100, Denys Vlasenko wrote: On Fri, Apr 25, 2014 at 9:36 AM, Natanael Copa nc...@alpinelinux.org wrote: Prefer use POSIX getpwent over glibc extension getpwent_r. This fixes building with musl libc with CONFIG_USE_BB_PWD_GRP disabled. Sorry, I missed this patch. How much static data (data or bss increase according to size busybox) do these functions pull in in static build? The comments specifically say they use getpwent_r because that avoids static data size penalty, usually about 1k. The static size penalty on musl is a few bytes. We don't have full static buffers for the strings but just use the buffer getline() produces while reading the file, so only this pointer and the passwd struct itself, not the strings it points to, have static storage. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
[PATCH] fix gzip applet not to store timestamps
Storing the original file's modification time in the output file is harmful (precludes deterministic results) and unlike official gzip, the busybox version provides no way to suppress this behavior; the -n option is silently ignored. Rather than trying to make -n work, this patch just removes the timestamp-storing functionality. It should be considered deprecated anyway; it's not Y2038-safe and gunzip ignores it by default. Per RFC 1952, 0 is the correct value to store to indicate that there is no timestamp. Rich diff --git a/archival/gzip.c b/archival/gzip.c index 1e779c9..903577b 100644 --- a/archival/gzip.c +++ b/archival/gzip.c @@ -2007,7 +2007,7 @@ static void ct_init(void) * IN assertions: the input and output buffers are cleared. */ -static void zip(ulg time_stamp) +static void zip() { ush deflate_flags = 0; /* pkzip -es, -en or -ex equivalent */ @@ -2018,7 +2018,7 @@ static void zip(ulg time_stamp) /* compression method: 8 (DEFLATED) */ /* general flags: 0 */ put_32bit(0x00088b1f); - put_32bit(time_stamp); + put_32bit(0); /* Write deflated file to zip file */ G1.crc = ~0; @@ -2044,8 +2044,6 @@ static void zip(ulg time_stamp) static IF_DESKTOP(long long) int FAST_FUNC pack_gzip(transformer_aux_data_t *aux UNUSED_PARAM) { - struct stat s; - /* Clear input and output buffers */ G1.outcnt = 0; #ifdef DEBUG @@ -2077,9 +2075,7 @@ IF_DESKTOP(long long) int FAST_FUNC pack_gzip(transformer_aux_data_t *aux UNUSED G2.bl_desc.max_length = MAX_BL_BITS; //G2.bl_desc.max_code= 0; - s.st_ctime = 0; - fstat(STDIN_FILENO, s); - zip(s.st_ctime); + zip(); return 0; } ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] udhcpd: Handle auto_time timeout overflow
On Wed, Jan 21, 2015 at 11:00:03PM +0100, Tim Hentenaar wrote: Howdy y'all, I've noticed an interesting issue with udhcpd and auto_time. Some paths within the while loop don't go through continue_with_autotime. Thus, if it takes a bit too long to reset timeout_end, the monotonic timer may be far enough along that the subtraction which sets tv.tv_sec will overflow, like so: Jan 21 19:38:13 10.0.0.1 udhcpd[75]: Waking from select() Jan 21 19:38:13 10.0.0.1 udhcpd[75]: tv_sec = 10 Jan 21 19:38:21 10.0.0.1 udhcpd[75]: Waking from select() Jan 21 19:38:23 10.0.0.1 udhcpd[75]: Sending OFFER of 10.0.0.2 Jan 21 19:38:23 10.0.0.1 udhcpd[75]: tv_sec = -1 Jan 21 19:38:23 10.0.0.1 udhcpd[75]: Waking from select() Jan 21 19:38:23 10.0.0.1 udhcpd[75]: Sending ACK to 10.0.0.2 Jan 21 19:38:23 10.0.0.1 udhcpd[75]: tv_sec = -1 Jan 21 19:38:43 10.0.0.2 udhcpc[47]: Sending renew... Jan 21 19:38:43 10.0.0.2 udhcpc[47]: Lease of 10.0.0.2 obtained, lease time 30 Jan 21 19:38:43 10.0.0.1 udhcpd[75]: Waking from select() Jan 21 19:38:43 10.0.0.1 udhcpd[75]: Sending ACK to 10.0.0.2 Jan 21 19:38:43 10.0.0.1 udhcpd[75]: tv_sec = -21 Jan 21 19:39:03 10.0.0.2 udhcpc[47]: Sending renew... Jan 21 19:39:03 10.0.0.1 udhcpd[75]: Waking from select() Jan 21 19:39:03 10.0.0.1 udhcpd[75]: Sending ACK to 10.0.0.2 Jan 21 19:39:03 10.0.0.2 udhcpc[47]: Lease of 10.0.0.2 obtained, lease time 30 Jan 21 19:39:03 10.0.0.1 udhcpd[75]: tv_sec = -41 This patch adds a quick and easy check for it, resetting tv_sec to 0, which should fall through to write_leases() and continue_with_autotime, resetting timeout_end again. Tim --- networking/udhcp/dhcpd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/networking/udhcp/dhcpd.c b/networking/udhcp/dhcpd.c index 4b3ed24..d56763f 100644 --- a/networking/udhcp/dhcpd.c +++ b/networking/udhcp/dhcpd.c @@ -415,6 +415,9 @@ int udhcpd_main(int argc UNUSED_PARAM, char **argv) if (server_config.auto_time) { tv.tv_sec = timeout_end - monotonic_sec(); tv.tv_usec = 0; + + if ((unsigned)tv.tv_sec server_config.auto_time) + tv.tv_sec = 0; I don't think this is a valid fix. If overflow occurs, there has already been an invocation of undefined behavior (assuming it's actually an overflow and not an implicit conversion into a signed type from legal unsigned arithmetic, but in that case the result would still be nonsense and might not look negative!). The check belongs before the arithmetic that would invoke UB (or just give a non-meaningful result) rather than after the computation. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] udhcpd: Handle auto_time timeout overflow
On Tue, Jan 27, 2015 at 04:02:16PM +0100, Tim Hentenaar wrote: On Tue, Jan 27, 2015 at 08:41:30AM -0500, Rich Felker wrote: On Wed, Jan 21, 2015 at 11:00:03PM +0100, Tim Hentenaar wrote: --- networking/udhcp/dhcpd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/networking/udhcp/dhcpd.c b/networking/udhcp/dhcpd.c index 4b3ed24..d56763f 100644 --- a/networking/udhcp/dhcpd.c +++ b/networking/udhcp/dhcpd.c @@ -415,6 +415,9 @@ int udhcpd_main(int argc UNUSED_PARAM, char **argv) if (server_config.auto_time) { tv.tv_sec = timeout_end - monotonic_sec(); tv.tv_usec = 0; + + if ((unsigned)tv.tv_sec server_config.auto_time) + tv.tv_sec = 0; I don't think this is a valid fix. If overflow occurs, there has already been an invocation of undefined behavior (assuming it's actually an overflow and not an implicit conversion into a signed type from legal unsigned arithmetic, but in that case the result would still be nonsense and might not look negative!). The check belongs before the arithmetic that would invoke UB (or just give a non-meaningful result) rather than after the computation. Apologies for the double-response, but I revised the patch to use a stack variable for safety, and to save 5 bytes. To be honest, I think this particular bit of code could benefit from some refactoring... Better? --- networking/udhcp/dhcpd.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/networking/udhcp/dhcpd.c b/networking/udhcp/dhcpd.c index 4b3ed24..75ddc12 100644 --- a/networking/udhcp/dhcpd.c +++ b/networking/udhcp/dhcpd.c @@ -399,6 +399,7 @@ int udhcpd_main(int argc UNUSED_PARAM, char **argv) fd_set rfds; struct dhcp_packet packet; int bytes; + unsigned int msec; struct timeval tv; uint8_t *server_id_opt; uint8_t *requested_ip_opt; @@ -413,8 +414,12 @@ int udhcpd_main(int argc UNUSED_PARAM, char **argv) max_sock = udhcp_sp_fd_set(rfds, server_socket); if (server_config.auto_time) { - tv.tv_sec = timeout_end - monotonic_sec(); tv.tv_usec = 0; + tv.tv_sec = 0; + + msec = monotonic_sec(); + if (msec timeout_end) + tv.tv_sec = timeout_end - msec; Why are we using a 32-bit type for seconds in 2015? msec should be time_t, and monotonic_sec should be returning time_t, or they could just use int64_t or something instead (but I feel like there's not much point in supporting 64-bit time if the underlying system's time_t isn't 64-bit anyway, and it's more costly, so time_t is probably best). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: ntpdate-like functionality in ntpd
On Tue, Jan 27, 2015 at 10:26:47PM +0100, Denys Vlasenko wrote: On Tue, Jan 27, 2015 at 7:27 PM, Guillermo Rodriguez Garcia guille.rodrig...@gmail.com wrote: Hello all, Is there a way to make ntpd work just like ntpdate (just use the first response received to set the time)? This is to set the approximate time at boot as quickly as possible before starting other time sensitive services. The closest I can get is ntpd -nqp server but this seems to need 5 valid samples in order to set the time. Would it work for you if you simply background it and let it do its work in parallel with the rest of the boot? I doubt having the boot process continue with the wrong time would be acceptable. File timestamps will be wrong until the background process completes. You really want to wait synchronously for an answer. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] udhcpd: Handle auto_time timeout overflow
On Tue, Jan 27, 2015 at 10:28:26PM +0100, Tim Hentenaar wrote: On Tue, Jan 27, 2015 at 09:51:29PM +0100, Denys Vlasenko wrote: Hmm, I think it's a sign-extension bug. Can you try replacing tv.tv_sec = timeout_end - monotonic_sec(); with tv.tv_sec = (int)(timeout_end - monotonic_sec()); I suspect this will fix the behavior. When I make that change, I get: movq$0, -872(%rbp) #, tv.tv_usec subl%eax, %ecx # D.8486, D.8486 testl %r14d, %r14d# movslq %ecx, %rax # D.8486, movq%rax, -880(%rbp)# D.8494, tv.tv_sec je .L101 #, testq %rax, %rax # D.8494 jle .L192 #, Hmm... Looking at the assembly before the change, it's moving eax - edx instead of sign-extending, while here (with the explicit cast) it sign-extends the result. It then generates the proper jump instruction to boot. Perhaps it wrongly assumes that since the operands for the subtraction are 32-bit unsigned integers, that the result will be also unsigned. Then, the sign-extension gets optimized away. Then, making the cast explicit forces gcc to sign-extend the result. This is not a wrong assumption. The result of unsigned subtraction is unsigned, and it's computed via modular arithmetic. The problem is that the semantics of the C code as written are wrong, not anything weird done by the compiler in translating it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] udhcpd: Handle auto_time timeout overflow
On Tue, Jan 27, 2015 at 04:51:33PM -0500, Cathey, Jim wrote: At the bottom, some of C's arithmetical rules are 'stupid'. Especially as regards type promotion. At least they're well-defined. No, unsigned types are just modular arithmetic. There's nothing 'stupid' or unexpected about how they behave unless you're not aware of that, at least not any more stupid than = and == (and even === in some langs :) having different meanings or left-shifting cout in C++... Absolutely true in a mathematical sense is that the difference between two unsigned numbers is SIGNED! It depends on what number system you're working in. Objects of a fixed size cannot represent the integers so you have to either pick a number system that can be represented (C's unsigned types) or work with an approximation of the integers (C's signed types). But that's not what C does. You can get around this, easy enough, but you do have to understand exactly what is going on. It helps if you are working on a 2's-complement machine. Twos complement is irrelevant to the operation of unsigned arithmetic. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: bug-report: ash: fix a memory leak
On Wed, Dec 24, 2014 at 10:32:19AM +0800, yhu2 wrote: The script which triggers the leak: while true do while true do break; done/dev/null done someone had fixed this bug, the commit is: http://git.busybox.net/busybox/commit/shell/ash.c?id=4ba6c5d3ba3d2c7922aff6b5c2e73b8325f1cf17 but this commit results in crash running some shell scripts, so it was reverted. I am trying to resolve this defect,any commnent would be appreciated! There is no easy fix. The patch you sent is incorrect. The problem is that the stack model used for allocation in ash is semantically wrong, and needs to be replaced with a dependency-tree model or something else that can handle freeing one object without freeing all subsequently-allocated objects. As a workaround, it might be possible to code a hack to reuse the existing copy of the string on the allocation stack rather than allocating a new copy whenever an existing copy already exists. This would not solve the general case (e.g. when the redirection uses a different file on each iteration) but it would solve the case above, I think. It might be worth pursuing this approach if nobody is available to work on the proper fix in the near future. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: problem of memory leak in re-direction (shell/ash.c: expredir)
On Mon, Dec 15, 2014 at 02:55:41PM +0800, shengyong wrote: hi, all I meet the memory leak problem when use re-direction in a reading-loop, like: while true do while true do break done /dev/null done There is some discussion on this issue: * http://lists.busybox.net/pipermail/busybox/2012-December/078738.html * https://bugs.busybox.net/show_bug.cgi?id=5822 But it is still not fixed in the later versions? AFAIK it is not fixed. As I stated in the bug report you linked, a proper fix would require dropping the (semantically invalid) use of the current stack-like allocation strategy and replacing it with some sort of reference-tree structure that allows freeing of allocations other than the most-recent. The easiest way I can see to do this would be writing an allocator similar to talloc or adapting talloc itself if the license and code size are acceptable. But in any case I think it's a non-trivial task and nobody's volunteered to do it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] mktemp: don't use mktemp() function
On Thu, Dec 11, 2014 at 06:01:57PM +0100, Bartosz Golaszewski wrote: The linker emits this warning: warning: the use of `mktemp' is dangerous, better use `mkstemp' or `mkdtemp' Fix it by using mkstemp() instead of mktemp(). function old new delta mktemp_main 214 233 +19 -- (add/remove: 0/0 grow/shrink: 1/0 up/down: 19/0) Total: 19 bytes Signed-off-by: Bartosz Golaszewski bartekg...@gmail.com --- debianutils/mktemp.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/debianutils/mktemp.c b/debianutils/mktemp.c index 983d7a2..5ef557a 100644 --- a/debianutils/mktemp.c +++ b/debianutils/mktemp.c @@ -59,6 +59,8 @@ int mktemp_main(int argc UNUSED_PARAM, char **argv) const char *path; char *chp; unsigned opts; + int fd; + enum { OPT_d = 1 0, OPT_q = 1 1, @@ -93,8 +95,9 @@ int mktemp_main(int argc UNUSED_PARAM, char **argv) chp = concat_path_file(path, chp); if (opts OPT_u) { - chp = mktemp(chp); - if (chp[0] == '\0') + close(fd = mkstemp(chp)); + unlink(chp); + if (fd 0) How is this an improvement? It increases the code size and performs unnecessary and potentially harmful filesystem operations. And it's just covering up the dangerous issue rather than fixing it -- using mkstemp then deleting the file and reusing the name is even MORE dangerous than using mktemp, since creating the file even momentarily exposed its name to an attacker. Of course if the code using the mktemp utility is written correctly, neither is dangerous anyway. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Writing unicode ncurses applications for busybox
On Tue, Nov 18, 2014 at 12:47:36PM +0300, Pugnator wrote: Hi all, I write ncurses based application which utilizes russian characters. It looks like this: wchar_t *unicode_string = LЭто юникод; mvwprintw(stdscr,1,5,%ls, unicode_string); And it works pretty well on my desktop. I enabled all UNCIODE_ config options I found I enabled terminal: export TERM=linux2.2 I copied terminfo into my busybox system (kernel 3.2) When I run my application, only ascii is printed. Unicode chars are absent: no garbage or whatever. How it can be workarounded if it is possible. Application is dynamically linked and all libraries were transferred too Busybox has nothing to do with how your own applications behave, unless they're calling out to the system utilities to do things. I suspect either your libc (uclibc?) was built without locale/UTF-8 support, or your ncurses was built without wide character support. If you're using Buildroot or a similar tool, it should offer you the option to configure these aspects of the build. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: nslookup's Server: 0.0.0.0 but resolves correctly
On Sun, Nov 02, 2014 at 09:02:57PM +, Steven Honeyman wrote: On 2 November 2014 18:05, strob...@gmail.com wrote: musl: 24 25 /* unused; purely for broken apps */ 26 typedef struct __res_state { 27 int retrans; sorry you lost me. is this an nslookup.c from your own repo? I thought musl only needs header patches with bb. No - that's in the dns related source code of musl libc (which is also broken like you describe android bionic) I should really say now, I know almost nothing about android/bionic which is why I can't offer any direct suggestions for that... but I use musl quite a bit so can try to approach that way. Basically, busybox tries to get dns settings through an (old?) method that is not supported in musl (and possibly bionic). The code in musl behind the scenes is just a bunch of return 0; to keep broken apps happy in thinking the res functions work, but meanwhile it uses its own method of parsing resolv.conf - which might be the easiest way I can think of to work around this issue... but I'm sure there is a much better way that I don't know about. ^ that's my current understanding at least, but if anyone can explain better then I'm sure we'd all benefit! I'll try to explain what it's like for musl. The above comment in regards to __res_state is referring to the idea that this structure was never meant to be a public API, but got abused as one by applications attempting to override nameservers or settings or otherwise fiddle with the resolver's internal state. musl's resolver is completely stateless and parses resolv.conf on each request. This both ensures that it's always using fresh data (e.g. on systems where the DNS configuration changes frequently) and avoids all the issues of having global state that's not thread-safe or library-safe. (IIRC glibc avoids this by making the resolver state thread-local, and by providing the new n-prefixed versions of the res functions which can use a caller-provided __res_state.) The res_* functions are not stubs/dummies in musl. As of a few versions ago, they're all intended to work. res_init is a nop simply because there is no state. The right way to do lookups with a custom DNS is to call res_mkquery and then sendto() your preferred DNS. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Considering porting acpi
On Sun, Oct 12, 2014 at 03:08:03AM -0500, Rob Landley wrote: Heh. The ping.c one is particularly strange because clause 2, Redistributions in binary form must reproduce the above notice but busybox does not include the word Regents in any text string, so how it would emit it at runtime I couldn't tell you. That said it's been there like that for _years_ and nobody has ever cared except me. The subject of the verb produce there is not the program, when executed but [the] redistribution. This just means you have to have a text file or paper distributed with the binary containing the notice. It does not mean that the program itself needs to embed the notice and be able to display it on an output device; however, doing so is an easy way to ensure that it stays attached unless somebody intentionally removes it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Considering porting acpi
On Fri, Oct 10, 2014 at 08:43:15AM -0700, Isaac Dunham wrote: The code I have is currently using dirtree_read(), which corresponds to recursive_action(). Basically it looks in /sys/class/*/*/, /sys/class/power_supply, and /sys/class/thermal for device status. I could make it shorter if I use glob() instead of a recursive callback; does anyone know whether glob() is likely to be a bloat problem? There's a cut-down glob function you could borrow from the linux kernel ( lib/glob.c ) if you really wanted it slim. That would make a smaller static binary where only the acpi applet is concerned, but it might well increase the net size; glob uses fnmatch, which is use in several parts of busybox. Yes, I would think replacing glob would be a net increase in size unless you have an extremely minimal bb config to begin with. This reminds me: I keep thinking about writing fngrep, a grep-like tool that uses fnmatch instead of regexes. Not widely useful, perhaps on occasion. Since grep already has -E and -F, wouldn't it make more sense just to add this as a nonstandard option to normal grep? Then you'd get all the other grep functionality for free. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: netcat nc only ipv4 (ping6 ipv6 already present)
On Sun, Aug 31, 2014 at 02:35:33PM +1000, Jaro Gress wrote: It would be great to have netcat nc upgraded to handle ipv6. the usual option is nc -6 -l x or nc -6 -l -p x. The -p option could be done without. It is kind of non standard. ping6 is handling ipv6 already well. (BeagleBone arm debian is also backward on ipv6 netcat) Why should an option like -6 be needed at all? If the specified address is an ipv6 literal, or a hostname that resolves to ipv6, then it should just work without needing a special option. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Re: Re: missing format string in applets/usage_pod.c
On Thu, Aug 14, 2014 at 11:58:56PM -0700, Isaac Dunham wrote: On Thu, Aug 14, 2014 at 04:15:50PM +0100, Laszlo Papp wrote: On Thu, Aug 14, 2014 at 4:05 PM, Rich Felker dal...@libc.org wrote: On Thu, Aug 14, 2014 at 03:28:51PM +0100, Laszlo Papp wrote: On Thu, Aug 14, 2014 at 2:44 PM, Tanguy Pruvot tanguy.pru...@gmail.com wrote: its the same with bionic libc (arm) printf(test) is ok but not printf(buf) with char buf[] = test; printf(%s, buf) is ok Yeah, I guess it is about personal preference. I personally do not like the extended code just to make some smart option silent. IMHO, it makes the code needlessly longer by adding another small layer for the content of the variable. If there is no need for formatting, why use printf in the first place and not fputs with stdout? IMHO, stdout is nicer than %s :-) On the other hand, it is too bad that there is nothing between fputs and puts where you do not need to use the file descriptor and you do not get a new line either. In principle stdout is more expensive than . stdout is a GOT reference in PIC, and is just PC-relative. For non-PIC it makes no difference though. Then my PC is probably lying since the practice does seem to disagree with you, apparently as well as this post: http://stackoverflow.com/questions/17238784/c-puts-without-newline#comment24978835_17238840 (and the one below, ok, it is not critical, but it performs better for others, too) Raw strings are also uglier IMHO than well-defined identifiers, like stdout, but that is personal taste, so I do not mind that one. printf() is almost always going to call vfprintf(stdout, fmt, ap); I've checked musl, uclibc, glibc, and netbsd; dietlibc is the only exception I've found, and it's ugly and does not have locking/retry/... newlib calls _vfprintf_r, which is almost the same concept. Meanwhile, printf has to do the formatting. In other words: Just because printf(%s, string) doesn't name stdout, don't assume that it doesn't use a reference to stdout. So an ACK for Laszlo Papp's recommendation of fputs(). I was talking about code size overhead, not run time overhead. Of course there's a reference to stdout in printf. When you reference it again yourself, that's a second reference. It's not a big deal but apparently busybox folks care about this kind of micro-optimization... Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Possible Unicode Problems in Busybox - Collect and Discussion
On Fri, Aug 15, 2014 at 12:31:15AM +0200, Harald Becker wrote: and how want you behave in case of invalid UTF-8 sequences? My functions just skip over stray codes of 0x80..0xBF and synchronize on next valid UTF-8 leading byte. How would you count invalid sequences? In general, I would count the whole operation as a failure, returning some value such as -1 reserved for failure, since the string is not actually UTF-8 and thus how many characters? has no meaning. For specific uses, there might be other preferred behaviors. If your goal is display, you may want to simply replace illegal sequences with U+FFFD in which case you'd count each such sequence as 1, but if you're using this character-counting to allocate a buffer for the converted string, you need to be sure your conversion function and character-counting function agree on how illegal sequences are counted, or you might overflow your buffer or end up having to truncate the output. Rich, will you ever use the result of counting the numbers of UTF-8 characters to allocate a buffer? I don't think so. That would be very ill behavior. To allocate buffer space you need the number of bytes occupied by a string, not the number of UTF-8 characters. If your intent is to convert the string to UTF-32/wchar_t/whatever, then yes, you use the result for allocating a buffer. In my mind that's the main point of counting characters (since otherwise you usually care about either bytes, for storage, or columns, for presentation), and while I personally consider it better to work character-at-a-time and keep the string as UTF-8, some APIs require a string in a different format, especially ones that work with a whole string and prepare it for visual presentation. The main other place counting characters makes sense is for implementing languages that do substring operations with character indexes, which I think is the one you care about. So the big question is: Is there anybody who still needs the BB internal Unicode handling and can't use the locale functions of a libc. Why and for what purpose is this needed? In which environment? I think the intent was to let uClibc users (and possibly eglibc users?) omit locale support from the libc, which reduces libc size quite a bit, and use the UTF-8 code in busybox instead. As far as I know, the beginning of those BB internal functions, where at times where only glibc had locale support and there where no alternatives for small environments. But things changed and there are now alternatives. So have we reached a point, where we are able to simplify things in BB (which means to focus on correct mb function usage everywhere and to strip unnecessary decisions, configs and helper code)? I wouldn't object to this change. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Possible Unicode Problems in Busybox - Collect and Discussion
On Wed, Aug 13, 2014 at 07:06:38PM +0200, Harald Becker wrote: Hi Denys ! The world seems to be standardizing on utf-8. Thank God, supporting gazillion of encodings is no fun. You say this, but libbb/unicode.c contains a unicode_strlen calling this complex mb to wc conversion function to count the number of characters. Those multi byte functions tend to be highly complex and slow (don't know if they have gone better). For just UTF-8, things can be optimized. This depends on your libc. In musl, the only thing slow about them is having to account for some idiotic special-cases the standard allows (special meanings for null pointers in each of the arguments) and even that should not be slow on machines with proper branch prediction. e.g. size_t utf8len( const char* s ) { size_t n = 0; while (*s) if ((*s++ ^ 0x40) 0xC0) n++; return n; } This function is only valid if the string is known to be valid UTF-8. Otherwise it hides errors, which may or may not be problematic depending on what you're using it for. Another fast function I use for UTF-8 ... skip to Nth UTF-8 character in a string (returns a pointer to trailing \0 if N number of UTF-8 chars in string): char *utf8skip( char const* s, size_t n ) { for ( ; n *s; --n ) while ((*++s ^ 0x40) = 0xC0); return (char*)s; } This code is invalid; it's assuming char is unsigned. In practice, *++s ^ 0x40 is going to be negative on most archs. Better would be doing an unsigned range check like (unsigned char)*++s-0x800x40U. Of course it also gets tripped up badly on invalid sequences. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Re: Re: missing format string in applets/usage_pod.c
On Thu, Aug 14, 2014 at 03:28:51PM +0100, Laszlo Papp wrote: On Thu, Aug 14, 2014 at 2:44 PM, Tanguy Pruvot tanguy.pru...@gmail.com wrote: its the same with bionic libc (arm) printf(test) is ok but not printf(buf) with char buf[] = test; printf(%s, buf) is ok Yeah, I guess it is about personal preference. I personally do not like the extended code just to make some smart option silent. IMHO, it makes the code needlessly longer by adding another small layer for the content of the variable. If there is no need for formatting, why use printf in the first place and not fputs with stdout? IMHO, stdout is nicer than %s :-) On the other hand, it is too bad that there is nothing between fputs and puts where you do not need to use the file descriptor and you do not get a new line either. In principle stdout is more expensive than . stdout is a GOT reference in PIC, and is just PC-relative. For non-PIC it makes no difference though. As for the topic at hand, I don't think it's necessary to make this change if the messages are DOCUMENTED as being format strings that are required not to contain any format specifiers, and there's a big warning to this effect in the source wherever they appear. printf allows format strings chosen (or even constructed) at runtime and the -W option to make this an error is bogus; at least it cannot be applied to code that was not written for a specific coding style that forbids such usage. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Possible Unicode Problems in Busybox - Collect and Discussion
On Thu, Aug 14, 2014 at 07:16:36PM +0200, Tanguy Pruvot wrote: size_t utf8len( const char* s ) { size_t n = 0; while (*s) if ((*s++ ^ 0x40) 0xC0) n++; return n; } you need to test s != NULL, else *s will crash Says who? NULL is not a valid pointer. Should you also check for things like s != (char *)-1 ? What value would you return then, anyway? Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Possible Unicode Problems in Busybox - Collect and Discussion
On Thu, Aug 14, 2014 at 07:14:52PM +0200, Harald Becker wrote: Hi Rich! You say this, but libbb/unicode.c contains a unicode_strlen calling this complex mb to wc conversion function to count the number of characters. Those multi byte functions tend to be highly complex and slow (don't know if they have gone better). For just UTF-8, things can be optimized. This depends on your libc. that is, why I added don't know if gone better ... really good when musl is fast on this ... the problem is BB is more likely linked with glibc or uClibc ... there the results are not so great :( I think uClibc is pretty fast at this too. It's glibc that's horribly slow. Rough comparison: For processing a full string buffer, musl is roughly twice as fast as uClibc, and uClibc is roughly twice as fast as glibc. For byte-by-byte processing: musl is roughly 3x as fast as uClibc and roughly 4x as fast as glibc. Source: my comparison at http://www.etalabs.net/compare_libcs.html Presumably you would use a full string operation here (mbstowcs with null output pointer) for computing length in characters. size_t utf8len( const char* s ) { size_t n = 0; while (*s) if ((*s++ ^ 0x40) 0xC0) n++; return n; } This function is only valid if the string is known to be valid UTF-8. Yes, I told it's for UTF-8. Yes, but there's a difference between nominally UTF-8 and known-valid UTF-8. Otherwise it hides errors, which may or may not be problematic depending on what you're using it for. If you know you are using UTF-8 you do not need to check every string over and over again, else it's pure paranoia. It is robust, as it will not run away on anything which is valid C string. Well if the string comes from a source outside of your control, you need to check it at least once. But you might not want to check and reject it at the original point of input, e.g. if you want to be able to preserve arbitrary byte sequences that might not be UTF-8, e.g. an argument that's a filename in an invalid encoding which you're trying to delete or rename to fix. So IMO it makes a lot more sense to do your checking at the point of treating the string as a sequence of characters, even if it happens multiple times. The cost is not high if your implementation is efficient. Of course it also gets tripped up badly on invalid sequences. How can it get tripped? It silently skip over invalid sequences (of 0x80 to 0xBF until next leading of a sequence). It shall not get stuck in any way. Or tell me exactly how ... By itself it's not a problem, but the interaction with other code may be a problem if the other code does not follow exactly the same conventions. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Possible Unicode Problems in Busybox - Collect and Discussion
On Thu, Aug 14, 2014 at 09:09:02PM +0200, Harald Becker wrote: Hi Rich! I think uClibc is pretty fast at this too. It's glibc that's horribly slow. Rough comparison: Pretty fast is still slower than UTF-8 optimized functions. The standard functions certainly can be UTF-8-optimized, and they are in at least several implementations. I think glibc still has a pretty slow path to get to the UTF-8 decoding but hopefully that will be fixed eventually. I'll remind myself to pursue that in the future. For processing a full string buffer, musl is roughly twice as fast as uClibc, and uClibc is roughly twice as fast as glibc. You don't need to make ads for musl here, I' would like to see prebuild versions of BB statically linked with musl. It's not an ad. It's just pointing out that uClibc is probably not significantly slower for what you care about. My interpretation was that you trusted me that musl is fast here, but thought other more commonly used implementations might be slow, so I stated the relative speeds as a basis for evaluating that. Presumably you would use a full string operation here (mbstowcs with null output pointer) for computing length in characters. Do you remember my question only UTF-8 or full multi byte locale? It is exactly this decision. The former may be optimized the later more accurate. Yes I remember the question. Assuming the standard function has a fast path for UTF-8, which it should, the only reason to expect the standard multibyte functions to be significantly slower than your custom ones is that they detect illegal sequences rather than blindly assuming the input is valid. If you know you are using UTF-8 you do not need to check every string over and over again, else it's pure paranoia. It is robust, as it will not run away on anything which is valid C string. Well if the string comes from a source outside of your control, you need to check it at least once. But you might not want to check and reject it at the original point of input, e.g. if you want to be able to preserve arbitrary byte sequences that might not be UTF-8, e.g. an argument that's a filename in an invalid encoding which you're trying to delete or rename to fix. So IMO it makes a lot more sense to do your checking at the point of treating the string as a sequence of characters, even if it happens multiple times. The cost is not high if your implementation is efficient. and how want you behave in case of invalid UTF-8 sequences? My functions just skip over stray codes of 0x80..0xBF and synchronize on next valid UTF-8 leading byte. How would you count invalid sequences? In general, I would count the whole operation as a failure, returning some value such as -1 reserved for failure, since the string is not actually UTF-8 and thus how many characters? has no meaning. For specific uses, there might be other preferred behaviors. If your goal is display, you may want to simply replace illegal sequences with U+FFFD in which case you'd count each such sequence as 1, but if you're using this character-counting to allocate a buffer for the converted string, you need to be sure your conversion function and character-counting function agree on how illegal sequences are counted, or you might overflow your buffer or end up having to truncate the output. Of course it also gets tripped up badly on invalid sequences. How can it get tripped? It silently skip over invalid sequences (of 0x80 to 0xBF until next leading of a sequence). It shall not get stuck in any way. Or tell me exactly how ... By itself it's not a problem, but the interaction with other code may be a problem if the other code does not follow exactly the same conventions. Sure, you can't mix multi byte functions with pure UTF-8 functions, you always need to look what type of function you call in your code. So what's different here. Interaction with other code was not about mixing your own pure UTF-8 functions with the standard C multibyte functions in possibly-non-UTF-8 locales. It was about mixing them with other code that's processing UTF-8 but handling errors differently. One such example would be the standard C multibyte functions when nl_langinfo(CODESET) has already been determined to be UTF-8 (so you know they're processing UTF-8), but pure UTF-8 code outside of the standard functions might also be handling errors differently from what you're doing, and mixing it with your handling _could_ be dangerous, depending on what you do. and the convention is just UTF-8 (even with invalid sequences) not a mixture with other multi byte codes. Not so much requirement of a convention? The functions have bean designed carefully to be not trapped on invalid sequences. I know they look extreme simple, but this is part of the optimization. remember: We are not talking about the ability to work with other multi byte locales. The assumption was pure ASCII or UTF-8. I'm fine with assuming
Re: How do I (unconditionally) enable unicode support in busybox?
On Mon, Aug 11, 2014 at 05:15:21PM +0200, Harald Becker wrote: IMO there is still something very strange with sed and unicode YES! I did not stop looking for this. Looks like this is a problem in the regular expression parser. s /./x/g shall match every character and replace with a single x, but indeed it matches every byte of UTF-8 characters too (which is wrong). But this doesn't seam to depend on setting of LANG (which confused me). Is it possible, it only worked when BB is linked with glibc in a fully functional environment. Maybe than an UTF-8 aware regex scanner is used. We need to look further on this! I think this is the result of using uclibc with a broken regex implementation -- either as a result of a build time option (omitting locale? omitting full regex?) or just a deficiency in uclibc. Using glibc or musl would solve it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: How do I (unconditionally) enable unicode support in busybox?
On Wed, Aug 06, 2014 at 02:51:21PM +0200, Harald Becker wrote: Hi James! The problem is mainly when I use a busybox script as /init inside an initrd (initramfs). It runs as process 1. The /init script is called directly by the bootloader. Its environment is controlled by the command line parameters which are set by the *user*. Worse, in this situation: export LANG=... DOES NOT WORK for changing subsequent unicode behavior. Why shall it not work in this situation? It works for me. At the start of the init script I use export LANG=... whatever I need. As it is exported it goes to the environment of every process started from init script, so UTF-8 support is kept. Perhaps the issue is that it doesn't work in the initial shell itself -- for example, glob bracket expressions treat a character as a set of bytes and match any one of those bytes rather than the character. The behavior is set and locked in by the value of LANG in the initial environment which I don't have control over. No control? Why? It is usally not needed, but in case you really want to achieve this: first init script: export LANG=... exec /bin/sh second_init_script MAY EVEN SET COMMAND LINE ARGUMENTS second init script runs as process 1 with environment set as you like. I agree this is a good workaround. If exporting worked then the original code I posted would work: echo -n $x | LANG=utf sed 's/./x/g' | wc -c Argh ... this kind of exporting makes trouble with ash (and many other Unix shells, not all behave correct as expected). Always set it at the beginning of your script: Are you sure? It's well-documented, required by the standard, and has always worked fine with ash for me. I would consider it the preferred way of exporting when you don't want to affect subsequent commands or the state of the current shell, only the current command. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: How do I (unconditionally) enable unicode support in busybox?
On Sun, Aug 03, 2014 at 11:50:29PM -0600, James Bowlin wrote: I run busybox in an initrd (initramfs) environment using both legacy Grub and isolinux as boot loaders. I want to be able to get the correct length of unicode strings in characters, not bytes. I always have these two options set: CONFIG_UNICODE_SUPPORT=y CONFIG_UNICODE_WIDE_WCHARS=y and I've played with the 3 combinations of: CONFIG_UNICODE_USING_LOCALE CONFIG_FEATURE_CHECK_UNICODE_IN_ENV both off, and one or the other on. At best I get very erratic results that depend on the value the LANG variable when the first busybox shell (/init script) starts and it seems to be immune to later using export LANG=xxx. One of the things I've tried is: echo -n $x | LANG=en_US.UTF-8 sed 's/./x/g' | wc -c This works *sometimes* but seems to depend on the value of LANG when the first busybox shell (/init script) is started but it has been flakey at best. Also, I don't have absolute control over that initial value of LANG because it is can be set by users with a lang=xxx boot parameter. Unicode strings always print fine. I'm just struggling with getting the length of a string that has unicode characters. Using ${#x} has always failed. So has wc -c which is what led me to the sed trick above. The code above to get the length always seems to work consistently in my development environment (using the right busybox .config). Oddly enough it fails in my development environment when CONFIG_FEATURE_CHECK_UNICODE_IN_ENV=y which is exactly the opposite of what I would expect. This option is utterly broken and should never be used. It searches for the string .utf or .UTF in $LANG to determine if UTF-8 should be enabled. There is no reason that this string needs to appear in the name of a locale for the locale to be UTF-8-based (plenty of locales have no legacy encoding). This option should really be removed from Busybox, or at least get a big warning slapped on it that it's broken and doesn't do what it should. The correct (and only correct) way to determine if UTF-8 should be used is to call setlocale(LC_CTYPE, ) and then check nl_langinfo(CODESET). it seems to mostly fail when I choot into a plain busybox environment or inside my initrd (initramfs) during boot. If you're having trouble even with CONFIG_FEATURE_CHECK_UNICODE_IN_ENV turned off and CONFIG_UNICODE_USING_LOCALE turned on, please let me know which libc you're using -- this could matter too -- and if it's uClibc, whether you have locale properly enabled for it. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: How do I (unconditionally) enable unicode support in busybox?
On Mon, Aug 04, 2014 at 03:00:07PM -0400, Michael Conrad wrote: On 8/4/2014 2:48 PM, James Bowlin wrote: BTW, the following code is an infinite loop in my initrd: [ $LANG = en_US.UTF-8 ] || LANG=en_US.UTF-8 exec $0 $@ I think you should focus on this bug. Which busybox version? Which libc version? Because my busybox runs it just fine. (as should any conforming shell) I suspect it's an issue with some option to run busybox commands in the shell process itself rather than forking and execing. Seeing a full config file would be useful. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] top: fix parsing of /proc/meminfo
On Tue, Jul 22, 2014 at 08:24:30PM +0300, Timo Teras wrote: On Tue, 22 Jul 2014 13:07:15 -0400 Cathey, Jim jcat...@ciena.com wrote: Am I missing something here? There is no structure, just a character pointer. If you leave off static it will be compiled as an instruction that pushes a constant onto the stack, by any/every compiler. Which is probably _larger_ code than just referring to something stored in the .text/.data segment. It's copying something that is found in text/data into the stack space, then referring to that address thereafter. The compiler has to do: Copy A-B Then use B. -vs- Use A. You do this when you want the ability to modify B. If A is fixed and inviolate, you mark it static const and be done with it. What this turns into, exactly, depends upon your compiler, CPU, and ABI, but I can't think of any case where the extra copy is going to turn out _better_ than the simpler case. Usage of 'static' depends whether it's const char * or const char[]. When defining const char * static does not make sense, as in theory it should enforce *also* the pointer value to go to .rodata (the string literal of course goes there too). So you might end up wasting one extra pointer in .rodata. OTOH, with const char[] you definitely want the static as other wise you'd end up with the copy described above. I only suggested removing 'static' since I had it a pointer. The latest patch has array, and thus is static. I agree that using a static const array is the best approach here, and that with a *single* pointer, making a static object to hold the pointer rather than just an automatic const-qualified variable is a pessimization, if anything. However static const pointers are useful when they're arrays of pointers, or part of structures, etc., but their usefulness is limited except with static linking and non-PIE main programs, since they'll need relocations for PIC/PIE which would throw them out of rodata. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Use the standard libc types in fatattr
On Sat, Jul 05, 2014 at 12:20:51PM +0100, Laszlo Papp wrote: To be pedantic, uint32_t was introduced in the Open Group Base Specifications, Issue 5 (released in 1997, basis for UNIX98). At that point it was defined in inttypes.h, which only defined (u)int*_t. As far as when inttypes.h was available, it was first shipped in glibc 2.1 (and prereleases thereof), meaning it was first packaged in Debian 2.1 (slink) on sparc. Slink shipped with kernel 2.0.36. But the names of sized types in the kernel _could_ have been switched over when the standard was released, or rather during any of the unstable series after that; this is not a compiler change but a header change, and the kernel has its own headers. Kernel 2.0 was released in 1996 (before the standard), but 2.2, 2.4, and 2.6 were released in 1999, 2001, and 2003. Glibc is not the only thing, in fact it has not been used much in embedded, and even today, it is mostly for computing embedded. From a header standpoint, uClibc effectively is (an old version of) glibc. From well after the standard types were added. So this is not an argument against using them. The real reason is mentioned in the kernel coding style manual: some people don't like the new types. Tes... for good: do not fix what is not broken. Do not fix what is not broken does not apply here. The kernel headers have ALWAYS been broken for use in userspace. The UAPI effort has partly fixed this (and is evidence that they _are_ trying to fix this, albeit poorly). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Use the standard libc types in fatattr
On Fri, Jul 04, 2014 at 04:25:11PM +0200, Denys Vlasenko wrote: On Fri, Jul 4, 2014 at 4:06 PM, Laszlo Papp lp...@kde.org wrote: On Fri, Jul 4, 2014 at 2:47 PM, Denys Vlasenko vda.li...@googlemail.com wrote: On Fri, Jul 4, 2014 at 3:26 PM, Laszlo Papp lp...@kde.org wrote: -# define FAT_IOCTL_GET_ATTRIBUTES_IOR('r', 0x10, __u32) -# define FAT_IOCTL_SET_ATTRIBUTES_IOW('r', 0x11, __u32) +# define FAT_IOCTL_GET_ATTRIBUTES_IOR('r', 0x10, uint32_t) +# define FAT_IOCTL_SET_ATTRIBUTES_IOW('r', 0x11, uint32_t) #endif /* Currently supports only the FAT flags, not the NTFS ones. Applied, thanks! (why kernel doesn't just use std types?...) What do you mean by std types? Like uint32_t As indicated before, it was only introduced in C99. The kernel project predates that for one. I understand that. It's 2014. 15 years to convert to (now-)standard type should be doable. Yes. If the kernel would convert to using the stdint.h types and provide its own stdint.h for use with -nostdinc when building the kernel itself, the kernel headers would automatically be compatible with userspace when used directly by applications like busybox. On the other hand, if the kernel headers are intended to be used for implementing some libc headers, they cannot depend on or pull in stdint.h unless the corresponding libc header is allowed to do so; if they did so, it would be namespace pollution. I suspect this is one of the reasons they may be hesitant to use these types. Right now, all of their wacky types are in the reserved namespace (double-underscore prefix) and thus they don't have to worry about it. Of course libc implementations should not be using kernel headers -- I thought we learned that lesson 20 years ago with libc5 -- but uclibc does so heavily and glibc even resurrected the practice at some point for various things (especially socket stuff, and lots more on some archs). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [Question] Questions about the BusyBox specification.
On Tue, Jul 01, 2014 at 08:47:45PM +0200, Harald Becker wrote: Hi Rich! Obviously something like that isn't acceptable for inclusion. It was probably just a hacked-up version of upstream iptables. Just as a question. I did not look into that very deep. You are talking about iptables. I thought newer kernel have a different firewall, with a complete different language/interpreter. Is that really intentional to look still at the old iptables? Wouldn't it be better to implement applets of the new firewall rules, giving also other users a push to use the new firewall infrastructure. I was under the impression that most users/products are still using the iptables interface, despite it having a new backend that they could use directly. It wouldn't hurt to have both, but a command-line-compatible version of iptables is probably more important from a user perspective. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [Question] Questions about the BusyBox specification.
On Mon, Jun 30, 2014 at 07:57:56PM -0700, Isaac Dunham wrote: On Fri, Jun 27, 2014 at 09:26:27AM -0400, Rich Felker wrote: On Fri, Jun 27, 2014 at 10:06:07AM +0200, Frank Ihle wrote: (6) Is there a (stateless/statefull) firewall for BusyBox ? think this is not related to busybox. Use iptables? The lack of an iptables command in Busybox is something that would be nice to fix, especially since the official iptables is bloated and (last I checked) requires dynamic linking. But this would still not be a firewall for Busybox (because Busybox is NOT AN OS), just an alternate implementation of the low-level firewall configuration tool. At one point someone ported iptables to busybox. However, that was in the busybox 1.00/1.1 era, so it's probably missing several bugfixes and features, as well as having a lot of bitrot; it also runs ~14000 lines and requires rpc/rpc.h. The copyright notices say (C) 2000-2002. (Rob Landley was not interested.) Obviously something like that isn't acceptable for inclusion. It was probably just a hacked-up version of upstream iptables. The right way to add iptables would be to reverse engineer it via strace (i.e. only look at the syscalls it makes to perform the actual commands, not the source code, which is likely hideous) and write code that does the same thing. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [Question] Questions about the BusyBox specification.
On Fri, Jun 27, 2014 at 10:06:07AM +0200, Frank Ihle wrote: (6) Is there a (stateless/statefull) firewall for BusyBox ? think this is not related to busybox. Use iptables? The lack of an iptables command in Busybox is something that would be nice to fix, especially since the official iptables is bloated and (last I checked) requires dynamic linking. But this would still not be a firewall for Busybox (because Busybox is NOT AN OS), just an alternate implementation of the low-level firewall configuration tool. (7.1) Does BusyBox provide a Network stack ? Run: make menuconfig and look under Networking utilities Alright, according to my question, is the answer in this Thread (http://stackoverflow.com/questions/19349923/tcp-ip-stack-in-busybox) then wrong? There they say there is no TCP/IP stack in BusyBox. There's not, because Busybox is not an OS/kernel. Busybox is simply an alternate implementation of some of the essential userspace utilities for providing things like standard POSIX commands, network configuration, filesystem creation and repair, (de)compression and archive utilities, etc. The TCP/IP stack (a somewhat dated term, BTW) is in the kernel. (10) What is the long-term availability of BusyBox? Can it be expected to be available for at least 10 years ? You download the source, you will have it forever. Maybe I should have asked differently: will BusyBox be developped and supported in future or e.g.: will there be a successor program and therefore BusyBox would be left on a final state ? There's little guarantee of anything like this, but considering how widely Busybox is used, I would be really surprised if somebody else didn't pick up development if the current maintainers got tired of it. Of course there's no guarantee that someone who did so would continue to develop in a direction that's appealing to your needs, but since it's free software, you're always free to do your own fork or hire someone to do it. There's also Toybox, which I suspect Rob would call a successor to Busybox.. ;-) Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: ftpd authentication [PATCH] ftpd: NOMMU/chroot fix
On Fri, Jun 27, 2014 at 01:55:26PM +0200, Denys Vlasenko wrote: On Fri, Jun 27, 2014 at 1:46 PM, Morten Kvistgaard m...@pch-engineering.dk wrote: A small detail, why do you check if the root_fd is valid? Eg. ... if (G.root_fd = 0) { ... It shouldn't be possible for it to be invalid and the old code didn't validate it either. I added this in the next patch: #if !BB_MMU - G.root_fd = xopen(/, O_RDONLY | O_DIRECTORY); - close_on_exec_on(G.root_fd); + G.root_fd = -1; #endif argv += optind; if (argv[0]) { +#if !BB_MMU + G.root_fd = xopen(/, O_RDONLY | O_DIRECTORY); + close_on_exec_on(G.root_fd); +#endif It would be nice to have an assume O_CLOEXEC build option and use O_CLOEXEC wherever close_on_exec_on is currently used. Might save a little bit of bloat if nothing else. This could probably be achieved just by adding O_CLOEXEC at the call sites, defining it to 0 if the system doesn't define it, and defining close_on_exec_on(x) away to nothing if the build option is selected. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH 0/6] Add a unit-testing framework to Busybox
On Thu, Jun 26, 2014 at 12:35:53PM +0200, Denys Vlasenko wrote: On Thu, Jun 26, 2014 at 12:00 PM, Bartosz Gołaszewski bartekg...@gmail.com wrote: 2014-06-22 16:31 GMT+02:00 Denys Vlasenko vda.li...@googlemail.com: Applied all patches with some editing. Thanks a lot! Hi Denys, I see, that you removed the INIT_FIRST macro from my implementation. Are you sure llist_t *tests will always be initialized before calling the test registering functions? Global data is initialized to zero at program load time. And even if you need a non-zero initial value, static initialization is usually suitable; there's no need for code for this kind of thing. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: ftpd authentication [PATCH] ftpd: NOMMU/chroot fix
On Thu, Jun 26, 2014 at 04:47:04PM +0200, Denys Vlasenko wrote: On Thu, Jun 26, 2014 at 1:37 PM, Morten Kvistgaard m...@pch-engineering.dk wrote: ... execve(proc/self/exe, [ftpd, -l, /], [/* 9 vars */]) = -1 ENOENT (No such file or directory) ... This is strange. Any ideas why this fails on your machine? Yes, the fchdir(G.root_fd) is not enough to break the jail. (And it's not just my machine. It's all of our Ubuntu versions and all of our uClinux versions. Which made me assume that it was a general issue.) There's a nice quote, I think: Ref: http://m.oschina.net/blog/113399. (One of the first hits on google. There're prolly better sources.) === /* Partially break out of the chroot jail by doing an fchdir() This only partially breaks out of the chroot() jail since whilst our current working directory is outside the chroot jail, our root directory is still within it. Thus anything which refers to / will refer to files under the chroot point. */ if (fchdir(dir_fd)0) { fprintf(stderr, Failed to fchdir - %s\n, strerror(errno)); exit(1); } The point is, we *do not* refer to /. We exec proc/self/exe, NOT /proc/self/exe. It does work on my machine. How come it doesn't work on your machine? If proc/self/exe make be relative to the working directory, but /lib/ld-whatever.so from proc/self/exe's PT_INTERP header is certainly not relative. This is the cause of the ENOENT. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Add coreutils/unlink.c
On Mon, Jun 23, 2014 at 07:44:23PM +0200, Denys Vlasenko wrote: On Mon, Jun 23, 2014 at 7:24 PM, Cathey, Jim jcat...@ciena.com wrote: The entire point of unlink, the reason it even exists, is that it never takes _any_ options. Anything you feed it is a filename, and it will delete it. coreutils disagree: $ unlink --version unlink (GNU coreutils) 8.17 Copyright (C) 2012 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Michael Stone. $ unlink -qwerty unlink: invalid option -- 'q' Try 'unlink --help' for more information. This seems to be a bug in coreutils then. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/unlink.html Under OPTIONS it says None. For other utilities that take options, the text reads something like: The rm utility shall conform to XBD Utility Syntax Guidelines. I think the specification is clear in that the coreutils behavior is not permitted. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] Add coreutils/unlink.c
On Mon, Jun 23, 2014 at 07:36:50PM +0100, Laurent Bercot wrote: Well, unlink takes '--version' and '--help' as options. I think there's a conflict between open standard and coreutils' oddity to bring command syntax and version information with command line switches. unlink is not the only nonconforming executable in GNU coreutils. true and false, for instance, also accept GNU options. (Because being able to perform true --help and true --version is vital.) It's just a symptom of GNU's tendency to think itself as better than the standard while pretending to conform to it. Busybox has to make a choice: either be conformant or mimic GNU coreutils' behaviour. Both paths are valid, but Busybox has to be consistent with it: all utilities have to make the same choice. I vote for a general compile-time option BB_MIMIC_GNU, if there isn't anything similar already. If this is the path taken, I think the config option should be clearly name and documented to explain that it is not adding useful features but rather compatibility with gratuitous GNU brokenness. For example in the case of unlink, the coreutils brokenness defeats the whole purpose of having the utility. Of course having the utility is rather useless anyway since rm -- filename is just as good. I would argue that unlink is useless and could just be replaced by such a script if needed. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
netstat output issues
Two small issues I found in busybox's netstat vs non-busybox versions: 1. The any address 0.0.0.0 or :: is not printed as '*' but actually looked up via the resolver. If nothing else this is a waste of time; it's also (at least in my opinion) uglier and harder to read. 2. IPv6 literals are printed without brackets around them, resulting in confusion between the port number and the ':' characters within the address. Neither is a big deal, but these might be nice to fix at some point if others agree. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Issues removing files with certain characters in their names.
On Thu, May 29, 2014 at 06:41:17PM -0400, Joshua Judson Rosen wrote: But why is ls able to match the files when rm is not able to remove them? I have no idea. Have you tried running them under strace and seeing where the failure occurs? Is it perhaps because ls is not actually doing any operations on the files themselves (not even a stat?), and just reporting the dirent-d_name strings that it got from readdir()? In which case ls -l * would fail on the same files even when ls * doesn't? I doubt it. Or is there something deeper whereby stat() succeeds but unlink() fails? I'm guessing it's a failure at in the userspace code for rm, not the syscall. Again strace could help you confirm this and possibly determine where rm's idea of the filename is getting corrupted. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Proposition: use a hashtable instead of bsearch to locate applets
On Fri, May 30, 2014 at 12:13:41PM +0100, Laurent Bercot wrote: On 05/30/2014 10:16 AM, Bartosz Gołaszewski wrote: I've checked the times just by looking up all the applets in a loop and measuring the time using gettimeofday() - the results are: ~220 microseconds for bsearch and ~150 microseconds for hashtable on my linux laptop. Is it significant? I think so. Is it noticeable? Probably not. The idea came to me, when thinking about unifying the hashtables used in busybox. I guess you're right and it isn't really worth the size increase. I think it would definitely be a worthwhile improvement if all busybox is doing was looking up the applets in a loop. ;) A more realistic test, however, would be to fork+exec a trivial applet (true, for instance) in a loop, and measure the times with bsearch vs. with the hash table. And I'm certain you'll find that the difference becomes quite insignificant. The lowest time I've ever seen for exec (not even counting fork; measured from immediately before the exec syscall to entry into main) is over 200µs, and can easily exceed 1ms with dynamic linking. So even if the program did nothing but applet lookup and exit, I think we'd be looking at a performance increase of ~30% at best. Realistically, as soon as you throw in some actual work and other syscalls that actually do something, I suspect the difference to be less than 5%. If there is a desire to change the way applet lookup is done, I would suggest trying to make it optimal in both size and performance. 150µs is not impressive at all. A perfect hash constructed at build time for the list of busybox applet-name strings should make it possible to do the applet lookup in ~1µs with trivial code/table size (all constant in .text). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Issues removing files with certain characters in their names.
On Fri, May 30, 2014 at 08:06:24PM +0200, Harald Becker wrote: Hi Joshua ! On 29-05-2014 18:41 Joshua Judson Rosen jro...@harvestai.com wrote: But why is ls able to match the files when rm is not able to remove them? The problem happens on the opposite direction you expect. It is not the create/open/unlink which modify the file name it is the directory scan. ls doesn't really match the filename. It just scans the directory, filters with given file name masks, then display what it got. ls never performas any filtering. If ls is given a directory name, it just reads the entries with readdir(); no 'matching' needs to take place at all. If it's given a glob pattern, that glob never reaches ls; it's expanded by the shell. If you quote the glob to actually pass it to ls, ls will not process it as a glob pattern; it's treated as a literal filename containing *'s or ?'s. So I don't understand what you're claiming happens. But on some unusual file names the directory scan gives names which does not exactly match the name stored on file system. They can be displayed, but used for an remove or move operation the stat/unlink fails. This usually happens when the names contain control or unprintable characters (e.g. a byte with all zero) which get removed by the kernel/file system driver. Printability has nothing to do with processing the filename. And a zero byte fundamentally cannot be in a filename (the filename in the directory table consists of those characters up to, and not including, any zero byte stored). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Issues removing files with certain characters in their names.
On Fri, May 30, 2014 at 09:18:19PM +0200, Harald Becker wrote: Hi Rich ! My statement was imprecise; of course to support users still stuck on legacy locales, nl_langinfo(CODESET) should be consulted. How do you determine the correct code set of a foreign file system on an external drive? How can you tell if all systems which accessed this drive has handled translations in the correct way? All modern filesystems used on external devices (fat32, ntfs, udf, ...) use Unicode-based encodings for filenames, so the foreign encoding is known and fixed. and not only unzip may produce such results. Think of using an USB stick at an Windows machine, then carry that over to an Linux machine. The filenames are stored in UCS-2. No problem. UCS-2 with different code page translations from an 8 bit charset. Translations which leave name mapping in inconsistent state when further translations occur. I don't follow what you think the problem is. If you mount it incorrectly, then this is user error. Correct, all those trouble arrives due to anybody having an incorrect setup. This will ripple trough and may produce trouble on other ends. All modern Linux-based systems use the utf8 option by default when mounting filesystems that don't store filenames as pure byte strings but in a Unicode-based form. You have to be rolling your own or else actively breaking your system's default setup to get this wrong. All programs are not affected. Only programs which read filenames as byte strings from foreign sources (such as the directory table of a zip file) are affected. but how do you know the code page the zip archive uses. How do you know you need to do translations? I'm unsure if the archiv contains this information, so it needs to be provided by a much more error prone user. When encountering such an archive, the unzip utility could simply exit with an error when there are non-ASCII names unless the user specifies the encoding. To be less error-prone, it could print the names as interpreted in several different encodings as part of the error message, to help the user identify which one is correct. IMO it should also automatically assume UTF-8 and suppress the error condition if the names all decode as valid UTF-8, since the probability of meaningful non-UTF-8 text decoding successful as UTF-8 is negligible. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: busybox nslookup slow on x86_64
On Sat, May 31, 2014 at 02:02:35AM +0200, Denys Vlasenko wrote: On Thursday 29 May 2014 20:47, Rich Felker wrote: On Wed, May 28, 2014 at 04:34:23PM +0200, Denys Vlasenko wrote: On Tue, May 27, 2014 at 9:34 AM, muddyboot emu...@gmail.com wrote: Hi, I found nslookup resolve very slow on x86_64 system , it cost 5 seconds or longer almost everytime. Tested OS: Debian 7 x86_64 with kernel 3.2.5 LFS x86_64 with kernel 3.12 No IPv6 enabled in kernel config. DNS server works fine nslookup program from bind-9.7 works fine nslookup from busybox test on i686 system OK target busybox version: 1.17.4、1.20.2、1.21.1、1.22.1 Any response for this problem is great appreciated. bbox nslookup uses libc to perform the lookup. glibc maintainers known to be quite.. er.. stubborn about how DNS should work. For example, they insist that IPv6 DNS requests must be sent even if the machine has no IPv6 support in kernel (let alone a more typical case where machine has no IPv6 connectivity). Your DNS server does not respond to IPv6 requests, but glibc waits for them. Unless the caller requested AI_ADDRCONFIG or requested AF_INET explicitly as opposed to AF_INET6, it's required to do this. And I don't think it's a bug. It may be useful to know all DNS results even if some of them (v6) won't be used for your current client setup. The bug is in whatever broken nameserver is ignoring requests rather than properly looking them up and returning a result. Well, in many cases users have no power over that nameserver. Forcing them to suffer instead of giving them ways to disable requests is arrogant. As I said an option in nslookup to do this would be nice. I believe glibc also has an option in resolv.conf (mentioned earlier in this thread, IIRC) to disable the requests globally. But I don't think it makes sense to complain about glibc doing both lookups by default, since that is the behavior specified/required by the standard. IMO this is one of the few cases of glibc doing the right thing (properly supporting modern usage rather than focusing on bug-compatibility for broken legacy setups). Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: busybox nslookup slow on x86_64
On Wed, May 28, 2014 at 04:34:23PM +0200, Denys Vlasenko wrote: On Tue, May 27, 2014 at 9:34 AM, muddyboot emu...@gmail.com wrote: Hi, I found nslookup resolve very slow on x86_64 system , it cost 5 seconds or longer almost everytime. Tested OS: Debian 7 x86_64 with kernel 3.2.5 LFS x86_64 with kernel 3.12 No IPv6 enabled in kernel config. DNS server works fine nslookup program from bind-9.7 works fine nslookup from busybox test on i686 system OK target busybox version: 1.17.4、1.20.2、1.21.1、1.22.1 Any response for this problem is great appreciated. bbox nslookup uses libc to perform the lookup. glibc maintainers known to be quite.. er.. stubborn about how DNS should work. For example, they insist that IPv6 DNS requests must be sent even if the machine has no IPv6 support in kernel (let alone a more typical case where machine has no IPv6 connectivity). Your DNS server does not respond to IPv6 requests, but glibc waits for them. Unless the caller requested AI_ADDRCONFIG or requested AF_INET explicitly as opposed to AF_INET6, it's required to do this. And I don't think it's a bug. It may be useful to know all DNS results even if some of them (v6) won't be used for your current client setup. The bug is in whatever broken nameserver is ignoring requests rather than properly looking them up and returning a result. However, it may be nice to have an option for bb nslookup to turn off v6 lookups if such an option doesn't already exist. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Issues removing files with certain characters in their names.
On Thu, May 29, 2014 at 09:18:26AM +0200, Harald Becker wrote: Hi Denys ! For what it's worth the users with this problem were unable to remove the files using wildcards. For example, one user had a file named: På hjul.mkv ls P* displayed the file. rm P* returned the error can't remove 'På Hjul.mkv': No such file or directory I have hard time believing this. Wildcard expansion is done by the shell, not by ls and rm. IOW: ls and rm see exactly the same expanded names. Since they don't mangle the names in any way (e.g. no UTF-8 decoding) before feeding them to system calls, it should work. I know this problem very well. It happens about every few month, that I get a ZIP packaged file from a Windows system. As the maintainer is a bit stupid, he can't manage to avoid foreign characters and I end up with unusual file names after unzip. This sounds like a bug in the unzip utility. If it encounters byte sequences which are not UTF-8, it should convert them from whatever legacy encoding they're in to UTF-8, possibly issuing an error that the user needs to specify this encoding if it can't be determined. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Issues removing files with certain characters in their names.
On Thu, May 29, 2014 at 11:27:07PM +0200, Harald Becker wrote: Hi Rich ! I know this problem very well. It happens about every few month, that I get a ZIP packaged file from a Windows system. As the maintainer is a bit stupid, he can't manage to avoid foreign characters and I end up with unusual file names after unzip. This sounds like a bug in the unzip utility. If it encounters byte sequences which are not UTF-8, it should convert them from whatever legacy encoding they're in to UTF-8, possibly issuing an error that the user needs to specify this encoding if it can't be determined. Then you need to consider all programs buggy which don't mangle with the file names. There are so many programs which just copy filenames through and let the kernel decide what to do. And I do not mean BB unzip here, normally I'm using the upstream unzip. and how can you consider all names being UTF-8 ... nowadays may be, but what when using 8 bit locales with different charsets? UTF-8 mangling would be wrong on those. My statement was imprecise; of course to support users still stuck on legacy locales, nl_langinfo(CODESET) should be consulted. and not only unzip may produce such results. Think of using an USB stick at an Windows machine, then carry that over to an Linux machine. The filenames are stored in UCS-2. No problem. Depending on how the file system is mounted you may get unusual file names when copying names with foreign characters. Now who is bad? If you mount it incorrectly, then this is user error. Note that correct versus incorrect does not depend on the contents of the storage device, only the encoding the local system where you're mounting it is using. Would be nice to have them all fixed ... get them all fixed the same way when doing some mapping ... but can that ever reach all programs? This is a so long standing problem, nobody really cares. All programs are not affected. Only programs which read filenames as byte strings from foreign sources (such as the directory table of a zip file) are affected. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: busybox nslookup slow on x86_64
On Thu, May 29, 2014 at 11:30:58PM +0200, Harald Becker wrote: Hi Rich ! bbox nslookup uses libc to perform the lookup. However, it may be nice to have an option for bb nslookup to turn off v6 lookups if such an option doesn't already exist. The problem has been solved by placing single-request option in /etc/resolv.conf. So it was a glibc related problem. This option is a workaround for buggy nameserver software on some routers that hangs when you perform multiple requests at the same time. It's far from being a complete workaround since multiple processes/threads (or even different machines behind the router) might make simultaneous requests in a way that hangs the router. The correct fix is not to use the built-in nameserver on such routers but to instead either configure a local nameserver on 127.0.0.1 or use a third-party one (e.g. 8.8.8.8). Or replace the router's firmware with OpenWRT if possible. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Shell test or quoting mechanism breaks on parens
On Mon, May 19, 2014 at 05:02:48PM -0500, dung_ngu...@dell.com wrote: Dell - Internal Use - Confidential Hello, I used to use busybox 1.00 and this command line returns 0 or at least the output is the same as RHEL 5.3 shell: # [ '(' = '(' ] ; rc=$? ; echo $rc ; [ $rc -ne 0 ] echo EVIL SH TEST/QUOTE 0 However, after I upgraded to busybox 1.20.2 and I see different output: # [ '(' = '(' ] ; rc=$? ; echo $rc ; [ $rc -ne 0 ] echo EVIL SH TEST/QUOTE sh: closing paren expected 2 EVIL SH TEST/QUOTE Does anyone know why the output is different with busybox 1.00 or is this a security issue from cmd line ? As far as I can tell the issue is in the test command (both standalone and in the shell), not in the shell itself. The standard test/[ command has very specific rules for how its arguments are interpreted and Busybox does not seem to be honoring them. See: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html In particular, if there are 3 arguments and the second is a binary primary (which '=' is) then, regardless of the contents of the first and third, the binary primary is applied to them. Hopefully someone familiar with the Busybox source can look into this. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: bug in busybox sed with non-ascii chars
On Mon, May 05, 2014 at 08:08:32PM +0100, Sam Liddicott wrote: One of the advantages of utf-8 encoding was that it was easy to re-sync after an invalid sequence. It's a bit of a waste to then not do that. Minus points for musl. An application can resync, although the C multibyte interfaces are not really designed to be used this way (and you have to be careful if the locale's encoding might be state-dependent, e.g. some legacy CJK encodings). However the implementation cannot silently resync behind your back. Doing so introduces serious bugs, some of which may be security-relevant, since you either silently miss seeing some bytes from the input when processing input via conversion to wide characters, or some invalid sequences appear to the application as valid. Either possibility is dangerous. In particular, it's wrong for the regex . to match anything that's an illegal sequence, and wrong for ^.*$ to match a line containing any illegal sequences (since the . can't match it). Can you not run sed with LANG=C or LANG=POSIX? That's not what they're doing, but it's not a solution anyway. ISO C leaves the character encoding of the C locale implementation-defined, and the Rationale text from the 1995 amendments to C explicitly allows for the possibility that the C locale's character encoding has multibyte characters (e.g. is UTF-8). musl presently does not support byte-based characters at all, only UTF-8. This conforms to the current versions of ISO C and POSIX, but the Austin Group has adopted a requirement that the C locale be 8 bit clean as a future requirement, which musl will probably support at some time in the future. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: bug in busybox sed with non-ascii chars
On Sun, May 04, 2014 at 04:44:10PM +0200, Denys Vlasenko wrote: On Sat, May 3, 2014 at 5:07 PM, Rich Felker dal...@libc.org wrote: Lets refuse to find end of line if there is a non UTF-8 sequence inside that line? Sounds wrong to me... sed (also regcomp and regexec) requires text input. Byte streams with illegal sequences are not text. Actually since the regex is not trying to match the illegal sequence, just the end-of-line, it would theoretically be possible to make this work (and it will once we overhaul the regex implementation to work with byte-based DFA's rather than character-based ones), but that doesn't change the fact that it's an invalid test. Language lawyering is less important that real world usage. Indeed it's nice to support additional real-world usage when doing so does not harm any other usage. But we're not talking about real-world usage here. We're talking about a buggy configure test. I'd love to improve or even rewrite the regex engine but that's a lot of work and lower priority than a number of other things on the musl roadmap. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: bug in busybox sed with non-ascii chars
On Sat, May 03, 2014 at 03:17:49PM +0200, Denys Vlasenko wrote: On Saturday 03 May 2014 05:10, Rich Felker wrote: On Wed, Apr 30, 2014 at 10:31:00AM +0200, Natanael Copa wrote: Hi, I came across a bug (or feature) in busybox sed when trying to build firefox-29. Testcase based on what firefox's configure scripts does: ASCII='AA' NONASCII=$'\246\246' echo -e ($ASCII)\n($NONASCII) | busybox sed 's/$/,/' The above script is invalid; \246\246 is an illegal sequence and thus is rejected by regexec. It will work only on non-UTF-8 systems/locales (which musl does not support). Lets refuse to find end of line if there is a non UTF-8 sequence inside that line? Sounds wrong to me... sed (also regcomp and regexec) requires text input. Byte streams with illegal sequences are not text. Actually since the regex is not trying to match the illegal sequence, just the end-of-line, it would theoretically be possible to make this work (and it will once we overhaul the regex implementation to work with byte-based DFA's rather than character-based ones), but that doesn't change the fact that it's an invalid test. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: Changing default shell for Cron
On Mon, Apr 28, 2014 at 05:23:07PM +, Rob Anderson wrote: Hello, I have compiled Busybox for Android with a fully working cron. However, when I try to run crond/crontab, I get the following output in the log; crond: can't execute '/bin/sh' for user root Unfortunately, I have tried all the usual suggestions on the internet e.g. adding PATH and SHELL to the cron but to no avail (I can get cron working perfectly by symlinking /system/bin/sh to /bin/sh but I do not like this method). Is it possible to hardcode a different default shell path into the Busybox source for cron to execute e.g. the default /system/bin/sh for Android? If so, could you please advise whereabouts in the source this change needs to be made? This is really just android being gratuitously broken; it would be really nice to pressure them to fix this instead of adapting everything to their brokenness... Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: bug in busybox sed with non-ascii chars
On Wed, Apr 30, 2014 at 10:31:00AM +0200, Natanael Copa wrote: Hi, I came across a bug (or feature) in busybox sed when trying to build firefox-29. Testcase based on what firefox's configure scripts does: ASCII='AA' NONASCII=$'\246\246' echo -e ($ASCII)\n($NONASCII) | busybox sed 's/$/,/' The above script is invalid; \246\246 is an illegal sequence and thus is rejected by regexec. It will work only on non-UTF-8 systems/locales (which musl does not support). Please file a bug with Firefox. Rich P.S. I think you got my response to this on #musl but it's nice to have the resolution here for the record anyway. ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] memset 0 in obscure is optimized away by compiler
On Wed, Apr 16, 2014 at 07:14:05PM +0200, Harald Becker wrote: Hi Tito ! I've tried to find out if memset is really optimized away in this case with some test code that I've compiled with : What is wrong with optimization of code, e.g. replacing call to memset by a quick loop which does same thing even faster than a function call? ... beside that such optimization needs normally a bit more of space it is faster and does same thing as original function call. Try using size optimization (-Os) to avoid replacing function calls by inline replacements. The quick loop will equally be optimized away, as neither it nor the memset have any observable effect. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH] memset 0 in obscure is optimized away by compiler
On Wed, Apr 16, 2014 at 07:51:42PM +0200, Harald Becker wrote: Hi Tito ! my fear is/was that the call to memset is totally optimized away when optimization is turned on and therefore the memory containing the password strings is not zeroed at all. This would be a completely ill behavior of compiler optimization. No it's not. It's identical to optimizing the following to an empty function: void foo() { int x; x=1; } This function is equivalent to the empty function because it has no observable behavior. Someone who expects it to leave a 1 at some position on the stack that the caller could later observe is smoking the undefined behavior pipe and needs to forget everything wrong they learned about the C language and Read The Fine Standard. I do only know one case where the memset may be optimized to nothing, that is, when you fill content of an auto variable near end of a function, and don't further access that memory space. That would be same as something like: int func( int i, j ) { int x; ... // put any code here using variable x x = 0; // this may be optimized away, as x is not accessed after setting it to a value return i + j; } That's exactly the situation here. The lifetime of the object being cleared by memset ends sufficiently close to the memset that the compiler is able to achieve the same observable effects that would be seen in the abstract machine without actually performing any memory fill. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: [PATCH v2 1/2] mktemp: avoid removed mktemp()
On Mon, Apr 14, 2014 at 02:29:21PM +0200, Denys Vlasenko wrote: On Mon, Apr 14, 2014 at 12:20 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: This is a change in behavior - now we would actually create, and then immediately delete the directory if run as mktemp -du. I have mixed feelings about that. Do you think it's ok? I think it's ok, yes. The race-window is about the same as before, the creat()/unlink() do make it a bit slower but that is IMO ok. There was no race at all before for mktemp -n case: if (opts OPT_u) { chp = mktemp(chp); if (chp[0] == '\0') goto error; } else ... puts(chp); return EXIT_SUCCESS; No race here. The 'race' is in the caller, but it's not always a race. If the caller properly uses O_EXCL or similar and retries on fail, there's no race. Note that there are legitimate uses of mktemp -u, such as along with mkfifo or for choosing a name for a socket or shared memory object or similar. An alternative would be to export a uClibc-specific __gen_tempname() function How about exporting a function named mktemp()? ;) POSIX removing it from its text doesn't mean that a conforming libc must *not* provide it. Agreed. I don't see any value in removing these functions. If support for libcs that lack them is desired, a proper implementation should be provided in busybox rather than poor hacks that involve creating and deleting the temp file, and which could fail due to odd permission setups, etc. Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox
Re: dc hitting a compiler bug, or undefined behavior
On Mon, Mar 31, 2014 at 02:17:33PM +0200, Denys Vlasenko wrote: On Sun, Mar 30, 2014 at 11:18 AM, Lauri Kasanen cur...@operamail.com wrote: Hi, I'm seeing busybox dc acting funny when compiled with some versions of gcc. This is on busybox git. The x86 binary busybox_unstripped and config are attached. gcc 4.2.2 - ok gcc 4.7.2: nc 10 1 add p 2.738e+93 So either bb is hitting a compiler bug, Looks like a compiler bug: d = strtod(argument, end); if (end != argument *end == '\0') { push(d); is compiled to this: 0x08049490 +27:call 0x80488b0 strtod@plt 0x08049495 +32:mov-0x14(%ebp),%eax 0x08049498 +35:pop%ecx 0x08049499 +36:pop%ebx 0x0804949a +37:cmp%esi,%eax 0x0804949c +39:je 0x80494b1 stack_machine+60 0x0804949e +41:cmpb $0x0,(%eax) 0x080494a1 +44:jne0x80494b5 stack_machine+64 0x080494a3 +46:push %eax 0x080494a4 +47:push %eax 0x080494a5 +48:fstpl (%esp) HERE 0x080494a8 +51:call 0x8048f10 push Note how push(double a) argument gets passed on stack. But push() starts with this: Dump of assembler code for function push: = 0x08048f10 +0:fldl (%edi) which reads argument from some bogus location instead of stack: (gdb) nexti === EXECUTE fldl 0x08048f12 in push () (gdb) p $st0 $1 = 2.0554264135011661055418432229752339e-314 That's not 10! 10 exists on stack all right: (gdb) p *(double*)($esp+4) $5 = 10 the program just doesn't fetch it correctly. It really looks to me like some non-default optimization is going on and misbehaving. I've never seen gcc emit this kind of 'bare' function without prologue/epilogue (or in this case, the ugly prologue moved inside the conditional just before the function call). Is the busybox build system doing anything behind the scenes (or perhaps with #pragma or hidden attribte tags?) to get gcc to adjust the calling convention for some functions? Rich ___ busybox mailing list busybox@busybox.net http://lists.busybox.net/mailman/listinfo/busybox