On Thursday 12 August 2010 11:39:34 Bernhard Reutner-Fischer wrote:
> Hi Rob,
>
> Can you provide bloat-o-meter stats, please?
> TIA,

Until I build it as part of busybox, I can't tell you how much size it adds to 
busybox, no.  (My question was about how to splice it into busybox using the 
new infrastructure.  I could add it to Kbuild.src and Config.src and applets.h 
and so on, but I'd like to get the new way documented in the FAQ.)

Also, at the moment size comparisons are kinda useless.  it's probably around 
3x bigger, something ridiculous.  I'm not proposing replacing busybox patch 
with this any time soon.

There's several reasons for this:

1) Denys sucked the toybox common code that toybox_patch was calling into the 
.c file, even for things like linked list handling which busybox presumably 
already has.  That needs cleanup before you can even ballpark estimate things.  
(That's the main reason it's 559 lines vs 306 lines.  The actual patch code 
starts around line 225.)

2) The designs aren't remotely similar, and I know my approach is bigger.  
That's because the current busybox patch is a toy, which can't apply a patch 
at an offset.  (I tried using busybox patch before writing my own.  Re-diffing 
patches with every -rc got really old really fast.)

Trusting the location information is a simplifying design decision for the 
busybox version, but it essentially means it's a toy version of patch that's 
not very useful in the real world.  It doesn't (can't) search the file for 
places to apply a hunk, and _most_ patches have offsets.  (Which are not the 
same as fuzz.)

Handling offsets requires a different (search-based) design, cacheing potential 
hit sites and then flushing that cache when you've confirmed it doesn't apply 
here.  (But not flushing _all_ the cache, you flush just one line and then 
retest the hunk against your cache because if, for example, your first line of 
hunk context is two blank lines and a curly bracket, and your patch site has 
_three_ blank lines and a curly bracket, you match two blank lines, _don't_ 
match the curly bracket, flush the two blank lines you'd previously matched out 
of your cache, and now miss the hunk application site because it started with 
the second blank line you flushed.  Hit that, fixed it a couple years ago.)

It might be fairly easy to overlay a brain-dead "pass-through $X lines, check 
this one location and either patch it or fail, now pass through $Y lines..." 
codepath with a config symbol.  The question is whether it's worth it, and 
whether the code could be refactored to get enough size back (dead code 
eliminating the search bits) to be worth it.  Patch is already more complex 
than I'm happy with, and there's a point where making code _more_ complex to 
reduce size isn't a win.  (Hand-optimized assembly is smaller than busybox 
too.)

3) Because the toybox patch _is_ useful in real-world situations, it got 
bloated handling a lot more corner cases that happen in the real-world patches 
people have used it to apply over the years.  Things like "comparing against a 
file dated the epoch is a synonym for /dev/null and implies the file wasn't 
just 
empty should be created/deleted" (which a patch fished out of the gentoo 
repository was doing), and due to time zones the epoch sometimes claims to be 
1969 and sometimes claims to be 1970 (which a patch off kernel.org was doing).

Or this little gem:

$ echo hello > test
$ diff -u /dev/null test
--- /dev/null   2009-08-01 20:56:12.000000000 -0500
+++ test        2010-08-12 13:11:19.000000000 -0500
@@ -0,0 +1 @@
+hello

Notice that the ,1 on the _second_ coordinate got dropped.  Busybox currently 
only handles dropping it on the first one.  Regression introduced in 64a76d7b a 
couple years, ago, and never quite fixed up.

4) The feature sets differ the other way too, I never implemented things like 
--dry-run, or patch_level -1 that the current busybox version does, which 
would be trivial to implement but I just never got around to it.  That makes 
toybox patch look smaller in comparison but isn't useful to compare.  (And a 
lot of the stuff the toybox patch does currently do, which the current busybox 
one doesn't, could get a CONFIG symbol.)

There's a lot of cleanup needed to merge these, and the designs differ enough 
that the big questions are design ones,  "what do you want patch to _do_?"

But right now I'm just asking the proper way to get the code built in the new 
context.

Rob
-- 
GPLv3: as worthy a successor as The Phantom Menace, as timely as Duke Nukem 
Forever, and as welcome as New Coke.
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to