On Friday 05 November 2010 20:27:47 Denys Vlasenko wrote:
> On Wed, Nov 3, 2010 at 7:09 PM, Rob Landley <[email protected]> wrote:
> >> > I thought it was inherent in the mandate of the project, but
> >> > apparently not. The focus these days is on features, adding more and
> >> > more, always making the project bigger and more complicated.
> >> >
> >> > I look around and everywhere see things that aren't that hard to clean
> >> > up,
> >>
> >> Which ones (except those mentioned in TODO)?
> >
> > It's sort of a constant background thing.
> >
> > If you want a specific example, there's bound to be a way to simplify
> > editors/vi.c. Or miscutils/less.c.
>
> Ohh, I *gladly* would take patches which simplify these.
> Or patches which fix them wrt Unicode. Or both.

My point was that finding this stuff is easy. Dealing with it is the part that requires a lot of time and careful thought. Making any change to busybox requires reading through the code that's there to gain a broad enough understanding of it that you're not making it worse. And I can't do that anymore without coming across buckets of tangents that need doing, and I tend to lose track of my original goal.

Last time I seriously engaged with BusyBox it took over my life for a couple years. Which meant the rest of my Linux work essentially rolled to a stop for a while.

Now I'm back getting a minimal native development environment to boot and run on over a dozen different hardware architectures that QEMU emulates, and getting existing architectures to _keep_ working is a heck of a Red Queen's race:

  http://kerneltrap.org/mailarchive/linux-kernel/2010/9/4/4615621/thread
  http://www.mail-archive.com/[email protected]/msg27071.html
  http://www.openfirmware.info/pipermail/openbios/2009-March/003601.html
  http://lkml.indiana.edu/hypermail/linux/kernel/0705.1/1962.html
  http://kerneltrap.org/mailarchive/linux-kernel/2010/2/22/4540565

And so on. Not counting the perl removal patches I still haven't gotten upstreamed into the kernel, or the pending uClibc NPTL mess, or my supposed goal of bootstrapping Linux From Scratch, Gentoo, Fedora, and Ubuntu to build natively under the resulting system.

Or other things I _want_ to do like learn Lua and reimplement toybox in it, turn tinycc into qcc by ripping the back-end off and replacing it with QEMU's TCG, or testing out all the new device tree stuff that's going into the kernel, or helping out with the new llvm/clang work to come up with a viable replacement compiler for the political morass GCC has become, or do a "hello world" kernel for each target stripped down enough that with a bit of kexec magic you could use Linux as its own bootloader, or reinstalling my laptop with Gentoo instead of an Ubuntu version so stale the update manager gave me a "no more updates, upgrade already" pop-up last week...

(Or the whole "get a new day job" thing since my contract with Qualcomm ran out on the 31st and the department's new budget won't be approved before January at the earliest so they couldn't renew it. But I'm used to having time between contracts, that's when I get the bulk of my open source programming done. :)

It's not that I don't want to work on busybox, it's that the scope of the problem is beyond the time commitment I can offer. The project is pervasively messy, and continuing to get messier, to the point where just poking at it a couple days a month can't hope to keep up with the continuing influx of mess.

I keep bookmarking things like this:

  http://lists.busybox.net/pipermail/busybox/2010-October/073518.html

Which was a 30 second fix: all those #ifdefs could be if() statements. I realize you don't see this as a problem, but I do.
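
The #ifdef-to-if() cleanup mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the linked patch: ENABLE_FEATURE_VERBOSE, format_ifdef() and format_if() are hypothetical names, and the sketch assumes the config symbol is always defined as a constant 0 or 1, so it can be used in an ordinary if() and the compiler can discard the dead branch.

```c
#include <stdio.h>

/* Hypothetical config switch, for illustration only. The assumption is that
 * the build system always defines it as a constant 0 or 1. */
#ifndef ENABLE_FEATURE_VERBOSE
#define ENABLE_FEATURE_VERBOSE 0
#endif

/* Preprocessor style: the disabled branch is never seen by the compiler,
 * so it can rot unnoticed (syntax errors, stale variable names). */
int format_ifdef(char *buf, size_t len, const char *name, long size)
{
#if ENABLE_FEATURE_VERBOSE
	return snprintf(buf, len, "%s (%ld bytes)", name, size);
#else
	(void)size; /* unused in this configuration */
	return snprintf(buf, len, "%s", name);
#endif
}

/* if() style: both branches are parsed and type-checked in every
 * configuration. The condition is a compile-time constant, so an
 * optimizing compiler can drop the branch that is never taken. */
int format_if(char *buf, size_t len, const char *name, long size)
{
	if (ENABLE_FEATURE_VERBOSE)
		return snprintf(buf, len, "%s (%ld bytes)", name, size);
	return snprintf(buf, len, "%s", name);
}
```

Both functions produce the same output for a given configuration. The difference is in what the compiler gets to check, and in how the source reads: the second version is ordinary C control flow with no preprocessor blocks interrupting it.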

Henry Spencer nicely sums up why here:

  http://doc.cat-v.org/henry_spencer/ifdef_considered_harmful

And Greg Kroah-Hartman covered it in his kernel coding style talk (this slide and the next two, and page 6 of the corresponding paper):

  http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/mgp00029.html
  http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_paper/codingstyle.ps

But by the time I read that message on the mailing list you'd already applied it, and by the time I sat down to deal with the resulting code it had changed again to an even denser forest of #ifdefs, and if I have to argue about _why_ removing them is a good thing that takes even more time...

And when people ask where the mess needing cleanup is as if they can't see it, or act like #ifdef removal is black magic it takes special talent to do, or when you say you wonder how I came up with such a small sha1sum implementation... I find that really depressing.

I am not a very good coder. By my standards, I suck at this. I really do. I just don't let sucking at it stop me from trying to figure out how to make it suck _less_. The fact that I can't always manage doesn't make the goal any less worthwhile, and I would LOVE if other people could do a better job at this so I didn't have to.

You don't see the code I throw away, or all the time I spend _thinking_ before coding. One of the reasons I tend to have three or so open source projects ongoing at once (when I'm not just banging out some barely functional schlock to make a deadline and HOPE they throw it away afterwards) is that I get writer's block. Not because I can't figure out how to make it work but because I can't figure out how to do it RIGHT. Because I haven't yet convinced myself I've minimized the suck. I haven't got the DESIGN right, which means I'm not thinking about the problem the right way yet, and it's far easier to tell I've got it wrong than to figure out what right is.

You'd think the BusyBox project would be the right place for computational Dorodango if anywhere was, especially five years after its 1.0 release where it's supposedly code complete and presumably implementing all of the Single Unix Specification's command line stuff it cares to. You'd think the focus would switch to doing what it already does better.

But no, the focus is on adding more to the project. New commands, new features, new complexity...

*shrug*

> All these ideas seem like good ones to me.

Again, see "belling the cat". Ideas are not the limiting factor.

> >> I wouldn't say 'nobody'.
> >
> > It is no longer the majority opinion.
>
> I actually look at code size VERY closely. See my other recent mail
> where I show that there is, on average, net reduction in size
> since 1.00 on the same config.

Great, but there's a pitfall here.

You know how pointy haired managers love quoting the phrase "You can't manage what you can't measure"?

  http://www.galorath.com/wp/you-can-manage-what-you-cant-measure.php

To which the rebuttal is Einstein's quote, "Not everything that counts can be counted, and not everything that can be counted counts":

  http://www.anecdote.com.au/archives/2006/09/if_you_cant_mea.html

The failure mode is "managing what you can measure". Focus on what you can measure and consider it more important than what you can't.

BusyBox doesn't have a metric for simplicity, but it does have a metric for size (and to a lesser extent functionality: you can enumerate new features and even add regression tests for them). Thus size/features are what you think about, what you constantly check, it becomes more important over time, and comes to eclipse simplicity.

This gets you in big trouble when the metric you've got is only a proxy for the thing you really want, because people game the system.
When IBM focused on KLOCS (thousands of lines of code) as a programmer productivity metric, their employees cut and pasted their way to productivity bonuses and the actual code quality suffered tremendously until they stopped incentivizing that metric.

This isn't just a programming thing, it's the same failure mode which leads corporations to take away the free towels in the gym, because the cost of providing the service can be easily measured but the morale boost it gives the employees can't, therefore one is "real" and the other isn't.

Even when the metric is real and important, it tends to lead to the things you can't as easily measure getting ignored, because they're harder to think about. You have to make an effort to see them.

It's an easy trap to fall into and a hard problem to solve, especially when the things you can measure are good things and important to get right, so spending time on them isn't necessarily _bad_... They're just not the whole story.

That's part of the reason I personally valued simplicity _above_ the other two. Precisely because it's harder to measure.

> I'm just doing it not in Rob's way ("rewrite this crap!"),
> but in "let's simplify this crap!" way.
>
> I invite Rob to rewrite any part he likes. Then I will
> try to simplify his rewrite. It's a win-win situation.

Been there, done that.

> >> > I've come to the conclusion I'm not helping here.
> >>
> >> From my point of view, you _are_ helping. In your own way 8).
> >
> > No, I'm telling _myself_ to "shut up and show me the code".
> >
> > I just don't see it making a difference here with the amount of time I
> > have to put into it. It's like trying to mop up a river, the new
> > arrivals bury any small gains I could make.
>
> New arrivals do help in one area: they reduce dependencies
> in LFS-type systems. You know it yourself since that's precisely
> the reason you use busybox in Aboriginal Linux:
> you want to have fewer packages.
>
> And when busybox acquires a new feature _you_ need, you see it
> as a win. (For example, 1.18.x will have brace expansion in hush).

I'm not saying more features is bad. I'm saying the loss of simplicity is bad. All three things trade off for each other, but one of them has taken it on the chin in favor of the other two.

It makes the project uncomfortable for me to work on, and I don't believe the amount of time/energy I have available to put in would even keep up with the ongoing degradation of the quality. And being the only one who sees the imbalance as an actual _problem_ is discouraging enough that I'd rather just not watch, thanks.

> Which makes it more useful. Which is good.
>
> Why you fail to extrapolate your feeling of a win when
> *others* submit stuff, and it is accepted?

I love it when other people solve my problems. But having to solve a second set of problems other people created while solving the first set of problems isn't always a net win.

I'm already running a Red Queen's race over on the kernel and qemu side, and LLVM will be another, and when I start caring about the unreleased uClibc code that will be another, and when I get distros natively bootstrapped the bootstrapping logic will bit rot too.

Mostly I want to push this stuff upstream. If nothing else, automate the bug reporting and git bisect to the commit that broke test case X. (That's half of what the cron job stuff is for. Alas impactlinux.com went away this past week and it's all back on landley.net now which hasn't got the bandwidth for heavy use... Yet another todo item I hadn't planned on...)

Cleaning up busybox is only an issue if I'm going to be developing on busybox. It's just my weird aesthetic sensibility that obviously isn't important to the metrics the project uses to measure itself, and I have enough self-imposed tasks for the moment, thanks.

> They are in exactly
> the same position as you: they don't want to cross-compile
> a $HUGE_BLOATED_PACKAGE, so they reimplement or port
> part of it to busybox.
>
> Same thing.

Actually in my case I want to keep the set of environmental dependencies as simple as possible. The thing that really appeals to me about busybox is I can build it on a wide range of host systems without having to worry about whether that system has lex and bison and autoconf and automake and perl and python and zlib and internationalization support and...

It lets me worry about what I'm cross compiling _to_ and not have to worry about what I'm cross compiling _from_. And that's valuable.

But it's the _simplicity_ that appeals to me, not the size or speed. I'm building busybox "defconfig" because that makes my build scripts conceptually very simple, even though that literally enables a hundred more apps than my build actually needs. Micro-managing busybox's config to strip it down would make it smaller, and make the _result_ simpler, but it would make my build more complicated, harder to maintain, harder for new people to learn, more likely to bit-rot...

I'm glad busybox does more things for more people, but I'd already implemented most of the feature set I personally needed to do this back in 1.2.2, and the remaining bits I've added since (oneit, patch, nbd-client, ccwrap, etc.) could live as individual files in sources/toys/*.c if necessary.

(You'll note I haven't pushed oneit into busybox, even though an init program designed to launch a single executable with a proper controlling TTY and signal handling, reap zombies until that executable exits, and then shut the system down... once upon a time that simplicity would have been in busybox's purview. But it would have been a unique facility of busybox, not a copy of an existing program somebody else already wrote and maintains externally, and thus not part of busybox's _current_ mandate.

I never saw busybox as a shadow of other projects, but I was weird. I went to a Weird Al concert last night where I bought the tour T-shirt I'm wearing now and shouted the civic motto "Keep Austin Weird" at him. And the only two songs that were new to me were the "Polka face" medley at the beginning and the one about cellphones. That's how weird we're talking here.)

I'm happy that busybox is well maintained, and I'm happy that if I post a bug report here it tends to get addressed promptly. But my goals and busybox's goals have drifted apart over the years, and I'd rather spend the majority of my time elsewhere now.

Rob
-- 
GPLv3: as worthy a successor as The Phantom Menace, as timely as Duke Nukem
Forever, and as welcome as New Coke.
