On Tuesday, February 27th, 2024 at 23:02, Rob Landley <r...@landley.net> wrote:
> On 2/27/24 21:46, Oliver Webb wrote:
> 
> > On Tuesday, February 27th, 2024 at 19:19, Rob Landley r...@landley.net 
> > wrote:
> > 
> > > On 2/27/24 13:46, Oliver Webb via Toybox wrote:
> > > 
> > > > Since we build toybox with -funsigned-char, there is no reason to have 
> > > > a type for unnsigned chars in bc.c,
> > > > since that is the default for all chars. And removing it gets rid of a 
> > > > typedef
> > > 
> > > Sigh, the bc in the tree is "fraught". The guy who submitted it said he 
> > > was
> > > escaping from a cloud of drama at the time (I didn't care), and then was
> > > surrounded by a fresh cloud of drama after "escaping" and tried to suck 
> > > me into
> > > it (I declined). He wandered off to busybox and also maintaining his 
> > > command as
> > > a separate project.
> > > 
> > > According to "wc toys//.c | sort -n" it is the largest command in the 
> > > entire
> > > toybox tree. Larger than the shell despite not having a dozen subcommands 
> > > nor
> > > needing to manage terminal control or child processes. It's 5 times the 
> > > size or
> > > sed or tar, and at least 3 times the size it NEEDS to be. I'd far rather 
> > > clean
> > > up xzcat or implement awk (each of which is a similar effort ballpark) 
> > > than
> > > spend months cleaning it up.
> > 
> > Just a idea, but could we clean up the lexxer for bc, rip some of it out 
> > and make it it's
> > own separate thing (lib/lex.c?), then base awk off of that?
> 
> 
> Toybox needs a "lex.c" that works like flex (it's a posix command) and a yacc
> that works like bison (because linux screwed up kconfig in 2018).
> 
> But I haven't seen anything salvageable in bc yet, and I've written my own
> parser multiple times:
> 
> https://github.com/landley/toybox/blob/master/toys/pending/sh.c#L2976
> 
> > I have barely looked through
> > the bc.c code and so far have only gotten the really obvious things cleaned 
> > up. But since
> > there is a lexxer somewhere in there it is theoretically possible.
> 
> 
> My understanding is lex is just parallel grep: search for these patterns in
> parallel and tell me which one occurred next.
> 
> I tend to use if/else staircases because it's easy to do, cache local, and
> doesn't leave you with the problem of marshalling data between two 
> disconnected
> lists. (Here's a table of string to magic number value, then here's a case
> statement on the magic numbers. The table and the case statement have two 
> halves
> of the same data separated from each other, as we were complaining about 
> earlier
> today...)
> 
> Whereas:
> 
> if (strstart(&x, "potato")) {
> } else if (strstart(&x, "walrus)) {
> } else if (strstart(&x, "nictitating") {
> } ...
> 
> Is all together in one place, reasonably cache local, and mostly checking the
> first character of the string on failure anyway. (No point getting fancy to
> check ~30 strings, the overhead of one system call is probably more work.)
> 
> > I know the syntax differs in
> > a lot of places, but they both share C style functions, control flow 
> > statements, logical operators,
> > etc. So they could share at least some common base?
> 
> 
> The first user is not "reuse".
> 
> > When skimming through bc.c, I've found a lot of code for the -w (warn if 
> > GNU extension)
> > and -s (no GNU extensions) options. Since they only make sure the program 
> > does NOT provide
> > functionality, and enforce stupid things like single letter variable names 
> > (because bc was created
> > when someone in the 70s duct taped a lexxer to dc to give it C like syntax 
> > instead of rpn. and dc could
> > only store 255 variables as one character "registers")
> 
> 
> A whole lot of work went into this, I agree.
> 
> "Is that what we should be doing" does not appear to have been asked much.
> 
> > I have never seen -w or -s used by anything, and mktimeconst.bc (the bc 
> > code in the linux-kernel)
> > uses plently of GNU extensions in it's code.
> 
> 
> https://landley.net/bin/mkroot/0.8.10/linux-patches/0004-Replace-timeconst.bc-with-mktimeconst.c.patch
> 
> I note that my C replacement could be way smaller, I wrote it that way
> intentionally to mirror the (perl) code it was replacing.
> 
> And my FIRST attempt at replacing it was written in shell, it just required 64
> bit math which wasn't 100% guaranteed at the time. (I had to poke Denys to
> update busybox ash to always use long long for $((math)) even on 32 bit.)
> 
> > Would you mind if I stripped the logic of these options out of bc.c? 
> > Nothing USES them from my knowledge
> > and that would free up at least a few hundred lines of code (Probably 
> > more). And we would lose no
> > functionality from doing so as long as we keep the options there
> 
> 
> I have no issue,

Thanks, I usually wouldn't ask but I'd like to know if either of the patches
I've submitted for bc.c will be merged, before I start doing things like 
ripping out
typedefs and "is this strictly POSIX compliment?" code. So that nothing 
conflicts

> but I'm still not convinced bc is a command we should keep?

It is needed to build a (unpatched) linux kernel. Also android uses bc in a few 
places:
https://cs.android.com/search?q=file:sh$%20%5Cbbc%5Cb%20-file:external%2F(bc%7Ctoybox%7Cmksh)%2F%20case:yes
since floating point shell math or expr math isn't available. 

If any way of doing floating point math existed without bc or python or perl 
that was well standardized 
and the linux-kernel bureaucrats removed bc from the kernel bc could die and we 
could ignore it like 
"fort77" or "qdel". But the process of everyone agreeing on ONE solution and 
implementing it usually 
takes years. And bc has been in the kernel for over a decade at this point

> Rob

-   Oliver Webb <aquahobby...@proton.me>
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net

Reply via email to