On Sunday 24 October 2010 06:47:32 Denys Vlasenko wrote:
> On Sunday 24 October 2010 11:02, Rob Landley wrote:
> > On Saturday 23 October 2010 20:20:01 Denys Vlasenko wrote:
> > > I wanted to post a very similar rant about uclibc. But did not. Ranting
> > > usually doesn't help. Doing something about things you want to see
> > > improved is the way forward.
> >
> > For uClibc, I tended to mail cakes to people.  It seems to work, for some
> > reason.
> >
> > Alas at this point, it's hard to consider the uClibc project anything
> > other than a zombie.  NPTL for uClibc has been in development for
> > _five_years_ and still hasn't shipped.  It took the original NPTL
> > developers what, six months to get a working implementation in glibc? 
> > And they had to do the kernel-side infrastructure while they were at it. 
> > The uClibc web page hasn't been updated in over 6 months...
> >
> > I know, I should download the git snapshots and regression test and go
> > engage with that project more...  It's on the todo list.
>
> So let's do it.

It's one of the things I mean to do with Aboriginal Linux.  I've got the 
ability to run exactly the same setup on a dozen different architectures, and 
to rebuild with arbitrary package versions.  And it all runs under qemu so you 
don't need special hardware to test the result.

I need to document it a bit better so people other than _me_ can do this.  
(I have enough trouble keeping up with new kernel versions.)

> My problem with working on NPTL is that I dislike everything
> thread-related.

I went from the C64 to DOS to desqview to OS/2 to Java to Linux.  You can't go 
through OS/2 without getting a pretty thorough grounding in threads, it was 
sort of their thing.  (Java was just a refresher course.)

That said, I spent almost ten years programming for Linux before bothering to 
actually look up the pthreads API.  You usually don't _need_ threads on Linux.

And _that_ said, I've pondered doing threaded bunzip2 and bzip2 
implementations, because the separate 900k blocks make the sucker almost 
trivially parallelizable, and SMP is available even on ARM these days.
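
A rough sketch of why independent blocks parallelize so easily: each worker 
claims the next unprocessed block index under a mutex, then works on that block 
without touching anybody else's data.  This is not bzip2 code.  
process_block() is a stand-in (a byte sum) for real per-block work, and the 
names struct job, worker(), and run_blocks() are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define BLOCK_SIZE 900000	// assumed block size, from the "900k" above

// Hypothetical job description shared by all workers.
struct job {
	pthread_mutex_t lock;
	const unsigned char *data;
	size_t len, next_block, block_count;
	unsigned long *result;	// one slot per block
};

// Placeholder for the real per-block work: sums the bytes of one block.
static unsigned long process_block(const unsigned char *p, size_t len)
{
	unsigned long sum = 0;

	while (len--) sum += *p++;

	return sum;
}

static void *worker(void *arg)
{
	struct job *job = arg;

	for (;;) {
		size_t block, start, len;

		// Claim the next block.  The lock only covers the counter.
		pthread_mutex_lock(&job->lock);
		block = job->next_block++;
		pthread_mutex_unlock(&job->lock);
		if (block >= job->block_count) break;

		start = block * BLOCK_SIZE;
		len = job->len - start;
		if (len > BLOCK_SIZE) len = BLOCK_SIZE;
		job->result[block] = process_block(job->data + start, len);
	}

	return NULL;
}

// Process every block using up to 16 threads.  Returns 0 when done.
int run_blocks(const unsigned char *data, size_t len, unsigned long *result,
	int threads)
{
	struct job job;
	pthread_t tid[16];
	int i, started = 0;

	if (threads < 1) threads = 1;
	if (threads > 16) threads = 16;
	pthread_mutex_init(&job.lock, NULL);
	job.data = data;
	job.len = len;
	job.next_block = 0;
	job.block_count = (len + BLOCK_SIZE - 1) / BLOCK_SIZE;
	job.result = result;

	for (i = 0; i < threads; i++)
		if (!pthread_create(&tid[started], NULL, worker, &job)) started++;
	// If no thread could be created, do the work in the caller instead.
	if (!started) worker(&job);
	for (i = 0; i < started; i++) pthread_join(tid[i], NULL);
	pthread_mutex_destroy(&job.lock);

	return 0;
}
```

The caller supplies a result array with one slot per block.  Since no two 
workers ever write the same slot, the only shared state that needs locking is 
the block counter.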

Attached is my threaded hello world program, which exercises pretty much all 
of the thread infrastructure.  It contains all you'd need to know to add 
thread support to pretty much any program that could benefit from it.  I use it 
to smoke test the threading infrastructure: if this builds/works then the 
threading is at least sane.

> The only piece of software of the 1024 CPU machine which
> _has to be_ threaded is the kernel, everything else is easier to
> parallelize on a task basis: don't waste time developing, say, 1024-CPU
> parallelized gzip - instead, run thousands of gzip copies!

Which then wind up talking to each other through pipes or fifos or sockets, and 
you have a scalability bottleneck in a select statement sending the data back.  
Less so now with the new pipe infrastructure, but still, you have to copy data 
between process contexts because fiddling with page tables for non-persistent 
shared memory is _more_ expensive than copying, and it's a cache flush either 
way...

Threads are a tool, just like object orientation.  There are times when it's 
appropriate and very helpful, and times when shoehorning it onto a problem 
makes things worse.  I haven't wanted to introduce it to busybox for the same 
reason I haven't wanted to introduce an ncurses dependency.

> It will be somewhat hard for me to work on NPTL in uclibc...

I'm not too worried about it, I just haven't gotten around to it.

The LLVM/Clang guys just built a working Linux kernel:

  http://lists.cs.uiuc.edu/pipermail/cfe-dev/2010-October/011711.html

To quote Scotty from Star Trek 2, "It's bypassed like a christmas tree", but 
it did actually work for one architecture.  It's not yet to the point where 
finding more bugs is useful if you can't fix them yourself (they've got plenty, 
thanks), but in another release or two it'll be worth trying that out on a 
bunch of architectures, possibly even nightly regression testing.

Meaning that's yet _another_ todo item added to my list. :)

> > > So why are you so grumpy? :D
> >
> > Eh, I'm usually grumpy.  I yell at my own code all the time.  (That's why
> > I keep rewriting it so much.)
>
> Well, this reduces the number of people who want to cooperate with you.

I'm aware of this.  Didn't say it was a virtue.

> People are illogical. They take criticism better if it is
> sugar-coated.
>
> > Once upon a time I spent months each rewriting individual busybox apps,
> > over and over, until I was pleased with the results.  And umount was one
> > of them, so it seemed safe to go in and make a specific change to busybox
> > umount, to add a small and simple feature that should have been maybe a
> > three line patch.
> >
> > Instead I got sidetracked into wondering why dietlibc #ifdefs and hurd
> > patches were splattered all over the code since the last time I looked at
> > it.  "git show fbedacfc8caa1" is a huge mess replacing realpath(), which
> > is a posix.1-2001 function.
>
> realpath to a char buf[PATH_MAX] has a problem of requiring 4k large
> buffer. In almost all cases, xmalloc_realpath() will be MUCH smaller. I
> liked this patch mostly for this reason.

Behind the scenes, the uClibc implementation of realpath, with NULL as the 
second argument, mallocs a 4k buffer.  (Because they're going for small code 
over small data set, and assuming that the realpath() return value is 
generally temporary.)  So you're mallocing a 4k buffer, allocating a second 
buffer, copying the data, and freeing the first buffer (probably fragmenting 
the heap).

If you wind up fighting the C library that much, I'd get a better C library.  
It's the kind of added complexity for a micro-optimization that really doesn't 
appeal to me.
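
For reference, a minimal sketch of calling realpath() with NULL as the second 
argument and letting the C library do the allocation, which is the behavior 
POSIX.1-2008 specifies.  The function name print_realpath() is invented for 
this example, it is not a busybox function.

```c
#define _XOPEN_SOURCE 700	// ask for the POSIX.1-2008 declarations
#include <stdio.h>
#include <stdlib.h>

// Resolve a path, letting the C library allocate the result.
// Returns 0 on success, -1 if the path can't be resolved.
int print_realpath(const char *path)
{
	char *resolved = realpath(path, NULL);

	if (!resolved) {
		perror(path);
		return -1;
	}
	puts(resolved);
	free(resolved);

	return 0;
}
```

The NULL return here means the path couldn't be resolved (or memory ran out), 
so the caller still has to check it.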

> Maybe we need to have allocating version of getmntent too.

More likely we need an mmap() based version, but mostly I trust the C library 
to do its thing.

Linux won't boot in less than 4 megs of ram.  It _will_ boot from less than 4 
megs of persistent storage.  And the smaller amount of code is easier to audit, 
easier to maintain, easier to learn/understand, is more likely to fit in L1 and 
L2 caches...

*shrug*  Different goals.

> > We weren't even using the NULL extension we were doing
> > our own malloc, and it got replaced with:
> >
> > +               char *resolved_path = xmalloc_realpath(*argv);
> > +               if (resolved_path != NULL) {
> >                         puts(resolved_path);
> > +                       free(resolved_path);
> >
> > So we have an xmalloc function and we're checking the null return.  Huh?
> > Either it's badly named, or the caller is wasting bytes.
>
> NULL here happens only when realpath fails to "realpathize" its argument.
> Say, xmalloc_realpath("/it/does/not/exist") is NULL.

So it's an xfunction() which can return failure.  Personally I found that non-
obvious.

> > -                       safe_strncpy(path, m->dir, PATH_MAX);
> > +                       path = xstrdup(m->dir);
> >
> > What was wrong with xstrdup() here?
>
> The change replaces safe_strncpy with xstrdup, not the other way around.

*shrug*  Ok.

> > And what's "safe" about randomly
> > truncating a string (which isn't necessary on Linux), this arbitrarily
> > breaks umount on really long paths when it might otherwise work fine.
>
> "safe" in safe_strncpy means "it will always be NUL-terminated" (unlike
> strncpy).

There used to be a strlcpy() that would always null terminate.  I forget where 
that was from.  (Ok, I wrote my own under DOS in 1991, but I later found out 
that other C libraries already included it.  Ulrich Drepper thought it was a 
horrible idea, but then he thinks static linking is a horrible idea...)
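
To make the "always NUL-terminated" semantics concrete, here is a small sketch 
of a copy that truncates but always terminates.  The name copy_terminated() is 
hypothetical, and this is not busybox's actual safe_strncpy() source, just the 
behavior described above.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy at most size-1 bytes and always terminate.
// Returns dst.  Silently truncates if src doesn't fit.
char *copy_terminated(char *dst, const char *src, size_t size)
{
	if (!size) return dst;
	// strncpy() alone leaves dst unterminated when src is too long.
	strncpy(dst, src, size - 1);
	dst[size - 1] = 0;

	return dst;
}
```

The result is always a valid string, but a too-long source is cut short 
without any indication, which is the truncation complained about above.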

> > -                       realpath(zapit, path);
> > -                       for (m = mtl; m; m = m->next)
> > -                               if (!strcmp(path, m->dir) ||
> > !strcmp(path, m->device))
> >
> > -                                       break;
> > +                       path = xmalloc_realpath(zapit);
> > +                       if (path) {
> > +                               for (m = mtl; m; m = m->next)
> > +                                       if (strcmp(path, m->dir) == 0 ||
> > strcmp(path, m->device) == 0)
> > +                                               break;
> > +                       }
> >
> > Posix 2008 says we can rely on realpath() doing a malloc with NULL now,
> > so wrapping it seems like something we'd want to _undo_ now.  As for the
> > rest of it, the previous code was better.  The new code is pointlessly
> > verbose to perform the same function.
>
> Readability. !strcmp() reads as "not strcmp" - ?

In the shell, 0 is success and nonzero is failure.  In C, 0 is failure and 
nonzero is success.  In strcmp() there's greater than, less than, and zero.  
Anybody who can't keep all these straight is going to have to be very good at 
writing test suites.

> strcmp() != 0 reads as "not equal" - much closer to what it actually does.

The linux kernel style guys actually edit out those kinds of useless 
appendages as part of their style checking during code review.  So you're 
adding stuff to make the code look less like the kernel does.

Do you similarly bloat "if (x)" to say "if (x != FALSE)"?  I see that kind of 
thing in people who are new to C, but not much in people who've been doing it 
for a while.

>
> I prefer more "spelled out" code on the source level exactly because
> it's easier to read. Your code is usually more densely written.
> This is not a significant difference.

*shrug*  Your call.

There are some studies out there that say there's an optimal module size, 
above which defect density increases because you can't keep all the code in 
your head at once, and below which defect density increases because the code 
is chopped up into such small pieces that you can't see the forest for the 
trees and it spends all its time interfacing with itself:

  http://catb.org/esr/writings/taoup/html/ch04s01.html

One of the interesting things about those studies is actually that defect 
density is not strongly influenced by language.  (Variation in individual 
programmers is larger than variation across languages.)  So once you're 
reasonably proficient at both, 1000 lines of python and 1000 lines of C are 
likely to have about the same number of bugs to work out, but 1000 lines of 
python _does_ way more.

What I took away from that is "making the code bigger makes it buggier, 
because people can't keep as much of it in their heads".  Which is part of the 
reason I personally consider making the code "terse but clear" to be a good 
call.  Even comments need to pull their own weight, there are times when 
deleting unnecessary comments and just letting you see all the CODE at once 
makes things easier to understand, and I say that as a compulsive commenter. 
:)

> > Which means if I go in and start fiddling with umount, the first thing
> > I'm likely to do is break Hurd support.  I haven't got a regression test
> > environment, and I actively fail to care about that target, so I pretty
> > much expect it.
>
> Don't worry about it. It's up to hurd people (if any) to fix up.

Ok.

I vaguely wonder if there should be a place we list busybox features that 
other implementations don't have.  For example, busybox mount never needs to 
say "-o loop", because you can trivially autodetect when you're being asked to 
mount a file on a directory.  (And directory on directory or file on file are 
bind mounts, although I don't remember if I made it autodetect that.  --move 
still needs to be explicitly specified, and overrides the default "directory on 
directory" behavior.)
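
A sketch of what that kind of autodetection could look like, using stat() on 
both arguments.  This is an illustration, not the busybox mount source: the 
enum, the guess_mount() name, and the choice to fall back to a normal mount 
when stat() fails are all assumptions made for the example.

```c
#include <sys/stat.h>

enum mount_guess { GUESS_NORMAL, GUESS_LOOP, GUESS_BIND };

// Hypothetical classifier: guess the mount type from what the source
// and target are.  Anything we can't stat (such as a "proc" or "tmpfs"
// pseudo-source) is left to the normal mount path.
enum mount_guess guess_mount(const char *source, const char *target)
{
	struct stat src, dst;

	if (stat(source, &src) || stat(target, &dst)) return GUESS_NORMAL;

	// Regular file on a directory: needs a loop device.
	if (S_ISREG(src.st_mode) && S_ISDIR(dst.st_mode)) return GUESS_LOOP;
	// Directory on directory, or file on file: bind mount.
	if (S_ISDIR(src.st_mode) && S_ISDIR(dst.st_mode)) return GUESS_BIND;
	if (S_ISREG(src.st_mode) && S_ISREG(dst.st_mode)) return GUESS_BIND;

	// Block devices and everything else.
	return GUESS_NORMAL;
}
```

An explicit option like --move would be checked before calling something like 
this, since it overrides the directory-on-directory guess.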

Adding the ability to umount everything under a given directory is another 
extension.

> > A
> > significant part of my frustration is not knowing whether or not I'm
> > _allowed_ to clean out that crap to get us back to simple, or whether
> > maintaining hurd support (which I can't test) is important.
> >
> > I don't understand what the requirements are for this code.  Which means
> > my approach would have to be "make a small localized change and don't
> > change anything else because it's brittle black magic"... which takes all
> > the fun out of a project like busybox.
>
> If you want to change something but not sure whether it is ok,
> post the patch(es) to the ml.

Gotta write 'em first, which is where the question arises.

> > Attached is the sha1sum.c I wrote for toybox a few years back.  (Never
> > did quite get around to cleaning it up, extending it to support the
> > various other shaXXXsum sizes, or implementing md5sum.  My todo list
> > runneth over.)  But what I was going for was _simple_, and I confirmed it
> > produced the right output on all the data I threw at it.  It's 185 lines
> > including the actual applet implementation and help text and everything. 
> > The busybox one is 900 lines for the engine.
>
> I saw it.
> Wondered "how the hell Rod managed to squeeze sha1 into that few lines" :D

Read the darn algorithm, understood what it was doing (if not why) long enough 
to implement it.

*shrug*  The usual.

The problem is this kind of thing can't be done in 15 minute increments (or at 
least I can't do it), it needs nice solid multi-hour blocks to sit down and 
think a big thing through, where I start out well rested.  And spare ones of 
those are hard to come by these days, they tend to get used for other 
things...

> > Busybox has three or four different implementations of each piece of
> > infrastructure in the sha1sum/md5sum stuff, depending on the "size vs
> > speed" knob.  I focused on doing it once in a way that was easiest to
> > read and to understand.
> >
> > I.E. my implementation was simple first, worrying about small and fast
> > second. The code you were referring to (current as of October 19th, git
> > says) has a size vs speed config entry leading to a heavily redundant
> > implementation, at the expense of simplicity, with rather a lot of what
> > it's doing hidden in various macros.
>
> This "size vs speed" thing in md5 is not mine. busybox-1.00 has it too.
> As you can see, sha256/512 which I added don't have them.

I need to sit down and read through busybox's implementation, figure out what 
bits of it are md5 and what are shaXXX and what benefit it gets from sharing...

Won't be today, though.  I'm backporting "#pragma visibility" (introduced in 
gcc 4.0) to gcc 3.4.6 in hopes that's all uClibc++ needs to build there, then 
dusting off and finishing up the strace port.

Tonight I might have the energy to finish up the lfs-bootstrap stuff so 
aboriginal can natively build the Linux From Scratch 6.7 packages as part of 
its nightly automation.  Last night I meant to, but wound up just watching 
"Meerkat Manor" instead.  (Imagine a live action lolcats soap opera, narrated 
by Samwise Gamgee.)

Rob
-- 
GPLv3: as worthy a successor as The Phantom Menace, as timely as Duke Nukem 
Forever, and as welcome as New Coke.
// Threaded hello world program that uses mutex and event semaphores to pass
// the string to print from one thread to another, and waits for the child
// thread to return a result before exiting the program.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Thread, semaphores, and mailbox
struct thread_info {
	pthread_t thread;
	pthread_mutex_t wakeup_mutex;
	pthread_cond_t wakeup_send, wakeup_receive;
	unsigned len;
	char *data;
};

// Create a new thread with associated resources.
struct thread_info *newthread(void *(*func)(void *))
{
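	struct thread_info *ti = calloc(1, sizeof(struct thread_info));

	if (!ti) return NULL;
	// Initialize explicitly: zeroed memory isn't guaranteed to be a
	// valid mutex or condition variable.
	pthread_mutex_init(&(ti->wakeup_mutex), NULL);
	pthread_cond_init(&(ti->wakeup_send), NULL);
	pthread_cond_init(&(ti->wakeup_receive), NULL);

	if (pthread_create(&(ti->thread), NULL, func, ti)) {
		free(ti);
		return NULL;
	}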

	return ti;
}

// Send a block of data through mailbox.
void thread_send(struct thread_info *ti, char *data, unsigned len)
{
	pthread_mutex_lock(&(ti->wakeup_mutex));
	// If consumer hasn't consumed yet, wait for them to do so.  Loop
	// because pthread_cond_wait() can wake up spuriously.
	while (ti->len)
		pthread_cond_wait(&(ti->wakeup_send), &(ti->wakeup_mutex));
	ti->data = data;
	ti->len = len;
	pthread_cond_signal(&(ti->wakeup_receive));
	pthread_mutex_unlock(&(ti->wakeup_mutex));
}

// Receive a block of data through mailbox.
void thread_receive(struct thread_info *ti, char **data, unsigned *len)
{
	pthread_mutex_lock(&(ti->wakeup_mutex));
	// Loop because pthread_cond_wait() can wake up spuriously.
	while (!ti->len)
		pthread_cond_wait(&(ti->wakeup_receive), &(ti->wakeup_mutex));
	*data = ti->data;
	*len = ti->len;
	// If sender is waiting to send us a second message, wake 'em up.
	// Note that "if (ti->len)" can be used as an unlocked/nonblocking test
	// for pending data, although you still need to call this function to
	// read the data.
	ti->len = 0;
	pthread_cond_signal(&(ti->wakeup_send));
	pthread_mutex_unlock(&(ti->wakeup_mutex));
}

// Function for new thread to execute.
void *hello_thread(void *thread_data)
{
	struct thread_info *ti = (struct thread_info *)thread_data;

	for (;;) {
		unsigned len;
		char *data;

		thread_receive(ti, &data, &len);
		if (!data) break;
		printf("%.*s", len, data);
		free(data);
	}

	return 0;
}

int main(int argc, char *argv[])
{
	void *result;
	char *data = strdup("Hello world!\n");
	struct thread_info *ti = newthread(hello_thread);

	// Bail out if the string or the thread couldn't be allocated.
	if (!data || !ti) return 1;

	// Send one line of text.
	thread_send(ti, data, strlen(data));
	// Signal thread to exit and wait for it to do so.
	thread_send(ti, NULL, 1);
	pthread_join(ti->thread, &result);

	return (long)result;
}
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox
