Re: [dev] [sbase] New branch for sbase+ubase

2024-03-14 Thread Elie Le Vaillant

On 2024-03-14 11:49, Roberto E. Vargas Caballero wrote:

> The idea is that everything is portable. Let's take the example of
> ps.  We can extract the non portable bits of ps into the libsys
> library (or whatever name we like), so the code in posix/ is portable.


OK, that is evidently better. Keeping all the nonportable bits in
libsys/ is feasible, and better than re-implementing each tool for each
new platform.


I think some tools such as sysctl are going to be far more complex than
that, though. Linux kindly provides /proc/sys, and the current sysctl
merely replaces '.' with '/' and writes to the corresponding file. Such
an implementation is impossible on OpenBSD, which does not provide any
systematic way of mapping a "key" (such as "vm.drop_caches" or
"vm.loadavg") to the underlying array of integers that uniquely
identifies it for sysctl(2). We would need to write thousands of lines
of OS-dependent code just to _parse_ the command line.
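
To make the Linux side of this concrete, here is a minimal sketch of the
key-to-path mapping described above. The helper name keytopath() is made
up for this example and is not taken from the ubase sources; it only
shows the '.' to '/' rewrite under /proc/sys, not the actual read or
write of the value.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from ubase): turn a sysctl key such as
 * "vm.drop_caches" into the matching path under /proc/sys by replacing
 * every '.' with '/'. Returns 0 on success, -1 if buf is too small.
 */
int
keytopath(const char *key, char *buf, size_t len)
{
	static const char prefix[] = "/proc/sys/";
	char *p;
	int n;

	n = snprintf(buf, len, "%s%s", prefix, key);
	if (n < 0 || (size_t)n >= len)
		return -1;
	/* only rewrite the key part; the prefix contains no dots */
	for (p = buf + strlen(prefix); *p; p++)
		if (*p == '.')
			*p = '/';
	return 0;
}
```

Nothing this simple is possible where the kernel only accepts numeric
names through sysctl(2), which is the point of the paragraph above.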

But sysctl _is_ fundamental for a working userland. Should we thus put
it in linux/, or find a way to make it portable nonetheless?


> I don't get why we should have categories like "shell-only", or why
> we should have such a high level of customization.


Well, this is just a suggestion. I can see why it would be a bad idea
to have such a high level of customization.

But I still believe that those categories should be defined in the
Makefile rather than in directories. It is _very_ easy to pick and
choose which utilities to build with the master branch: it only requires
modifying the BIN list, i.e. removing the tools that don't fit. With a
directory-based separation, such a process would be more complex, and
would require modifying each Makefile individually, which is arguably
cumbersome.

This is basically just overlapping lists of utilities, defined by
standards, type of utility, etc.


> What is the reasoning for categories like "shell-only"?


The categories you've outlined are very clear and make sense, but they
aren't the only way of grouping tools. "shell-only" was just an example,
but some tools only make sense when used in a terminal, for displaying
data to the user. Sometimes this is not needed (on routers, for example,
where there is very little need for tools like cal, cols or clear).
Maybe this is way too specific (and way too arbitrary), but it was just
an example. But if we ever implement tools such as gzip or xz, ping or
nc, etc., it _could_ be useful to have sets such as "compression" or
"net" that may overlap with other categories.

This proposed scheme is quite inspired by the way busybox separates
its utilities, but maybe it is wrong for this precise reason. I find
the directory separation both in toybox and busybox quite useful
and elegant, so that's why I'm suggesting a system that would allow
for such separation in a simple way (Makefile lists).

> I put only linux because at this moment we only have linux specific tools;
> if we add OpenBSD tools then we can have the openbsd directory. Of course,
> it would mean that if you add linux and openbsd at the same time
> your build will fail, just don't do stupid things.
>
> >   - Allow for categories _inside the Makefile_. We could have something
> > like:
> >    POSIXTOOLS = portable/ls unportable/$(OS)/ps.c ...
> >    MISCTOOLS = portable/sponge.c portable/cols.c portable/rev.c ...
> >    INTERACTIVETOOLS = portable/cols.c portable/clear.c ...
> >    ...
> >    BIN = $(POSIXTOOLS) $(MISCTOOLS) $(LINUXTOOLS)



> At this stage and with the configuration that we want to have I
> think we have to get rid of the monolithic approach and just define
> the directories that we want to include in our build:
>
> all: posix misc curses
>
> and every directory with simple Makefiles.


As commented before, I think this is problematic for users who want
more control over which tools they build. Also, I'm really not
sure how you could get rid of the monolithic approach while still
being able to build sbase-box.

The current script-based method is quite good I think, except that it
builds all the C files in the current directory. The Makefile recipe
would need to tell the script which files to build, i.e. pass it
lists of C files. The root Makefile would have to know about the
C files in each subdirectory, and pass them to mkbox (along with which
libsys version to use). We would have a non-monolithic build for
individual tools, but a monolithic one for sbase-box, which means
duplicating lists of utilities.

If we get rid of the monolithic approach, then what I propose
(overlapping categories as lists of utilities in a Makefile)
would be inapplicable.



> As commented, the idea is to have all the tools written with portable
> code. Tools in POSIX are implemented in all the systems, so they
> have a way to be implemented.  At this moment I just began to
> classify the tools, that is not easy, and I didn't care about fixing
> the portability problems, but after deciding the categories then it
> should be the next step.


Of course! I think some 

Re: [dev] [sbase] New branch for sbase+ubase

2024-03-14 Thread Roberto E. Vargas Caballero
Hi,

On Wed, Mar 13, 2024 at 11:51:46PM +0100, Elie Le Vaillant wrote:
> I think this layout has a few problems:
>
>   - we lose the meaningful separation that the sbase/ubase layout
>     allowed, i.e. the distinction between what is portable and what
>     isn't. This _could_ be included as part of the Makefile (PORTABLE
>     set vs NONPORTABLE), but I think it is better to explicitly
>     separate portable from non-portable tools in different
>     directories. If we don't, we are making sbase a Linux-only project
>     (the only implementation for ps becomes a Linux-only one, and we
>     would need to fork sbase to make it available to other platforms;
>     same for all the other tools from ubase), which I find a bit sad
>     considering the goals of portability of sbase

The idea is that everything is portable. Let's take the example of
ps.  We can extract the non portable bits of ps into the libsys
library (or whatever name we like), so the code in posix/ is portable.
In some cases it would mean removing some functionality or accepting
worse performance, but that is not a problem because we can keep the
optimized versions in ubase.  For example, sbase has a portable dd and
ubase has a customized linux dd version.

>   - I think Mattias Andrée's scheme is better than this one. With a
>     directory-based separation (between categories of utilities, based
>     on somewhat arbitrary factors (standards, rather than say,
>     minimalness, or use-case, or platform or implementation, etc.)),
>     we cannot have overlapping categories, which is quite problematic.
>     For example, if we wished to have a category for "shell-only
>     usage" (such as clear, or cols), we couldn't implement it in an
>     easy way, because this category overlaps with misc/,
>     curses-dummy/, and maybe others too.

I don't get why we should have categories like "shell-only", or why
we should have such a high level of customization. Having some categories
makes sense, but going to the level of user custom lists is too
much in my opinion. The separation between posix/ and misc/ is there
because many people raised the concern that they never use tools
like sponge, tac or shuf, so it is questionable to include
them, but as there are other people that use them we add the option
to include them.  The curses tools have the problem that they carry a
strong dependency on a complex library, and it is desirable to be able
to skip them, because building them can be very problematic in some
cases. The unix/ category is there for all the tools that
were part of POSIX previously and/or were part of UNIX historically
but for different reasons are not part of POSIX today.  Having a
POSIX-only set of tools has the advantage of being a way to check
whether your shell script is portable or not.

What is the reasoning for categories like "shell-only"?

> What I suggest to fix both problems:
>   - separate on the grounds of portability/nonportability. In other
>     words, something along the lines of:
>     libutil/
>     portable/
>       ls.c
>       cols.c

As commented before, all the tools must be portable, except of
course linux specific tools. After talking with quinq he liked the
idea because then tools for OpenBSD can be easily added just by adding
a new openbsd directory.

> This more or less reproduces the sbase/ubase separation, but allows
> other OSes to come in the future, rather than just Linux (so, for
> example, *maybe* OpenBSD

I put only linux because at this moment we only have linux specific tools;
if we add OpenBSD tools then we can have the openbsd directory. Of course,
it would mean that if you add linux and openbsd at the same time
your build will fail, just don't do stupid things.

>   - Allow for categories _inside the Makefile_. We could have something
> like:
>    POSIXTOOLS = portable/ls unportable/$(OS)/ps.c ...
>    MISCTOOLS = portable/sponge.c portable/cols.c portable/rev.c ...
>    INTERACTIVETOOLS = portable/cols.c portable/clear.c ...
>    ...
>    BIN = $(POSIXTOOLS) $(MISCTOOLS) $(LINUXTOOLS)

At this stage and with the configuration that we want to have I
think we have to get rid of the monolithic approach and just define
the directories that we want to include in our build:

all: posix misc curses

and every directory with simple Makefiles.

> This allows for such grouping, while also allowing overlapping
> categories. This also doesn't hinder the useful semantic sbase/ubase
> separation (which is especially handy when working on non-Linux
> OSes). I think overall that sbase/ubase is the most useful
> distinction, so it should be treated specially, and not as part of
> the build-system (but more as part of the directory organization).

As commented, the idea is to have all the tools written with portable
code. Tools in POSIX are implemented in all the systems, so they
have a way to be implemented.  At this moment I just began to
classify the tools, that is 

Re: [dev] [sbase] New branch for sbase+ubase

2024-03-13 Thread Elie Le Vaillant

On 2024-03-13 11:27, Roberto E. Vargas Caballero wrote:

> I am thinking about the new layout, and for now I was considering
> something like:
>
> - posix (tools defined by POSIX)
> - misc (helpful tools not defined by POSIX).
> - linux (tools that only make sense in linux)
> - curses-terminfo (tools with a dependency on terminfo, like clear and
>   reset).
> - curses-dummy (curses tools implementation without using terminfo).
> - libutil (library with utility functions)
> - libsys (library with system dependent functions).



I think this layout has a few problems:

  - we lose the meaningful separation that the sbase/ubase layout
    allowed, i.e. the distinction between what is portable and what
    isn't. This _could_ be included as part of the Makefile (PORTABLE
    set vs NONPORTABLE), but I think it is better to explicitly
    separate portable from non-portable tools in different
    directories. If we don't, we are making sbase a Linux-only project
    (the only implementation for ps becomes a Linux-only one, and we
    would need to fork sbase to make it available to other platforms;
    same for all the other tools from ubase), which I find a bit sad
    considering the goals of portability of sbase

  - I think Mattias Andrée's scheme is better than this one. With a
    directory-based separation (between categories of utilities, based
    on somewhat arbitrary factors (standards, rather than say,
    minimalness, or use-case, or platform or implementation, etc.)),
    we cannot have overlapping categories, which is quite problematic.
    For example, if we wished to have a category for "shell-only
    usage" (such as clear, or cols), we couldn't implement it in an
    easy way, because this category overlaps with misc/,
    curses-dummy/, and maybe others too.


What I suggest to fix both problems:
  - separate on the grounds of portability/nonportability. In other
    words, something along the lines of:

       libutil/
       portable/
         ls.c
         cols.c
       unportable/
         linux/
           lsmod.c
           ps.c
           libsys/
         maybe-other-os-in-the-future/

    This more or less reproduces the sbase/ubase separation, but allows
    other OSes to come in the future, rather than just Linux (so, for
    example, *maybe* OpenBSD could have some tools in the future. This
    is a suggestion).

  - Allow for categories _inside the Makefile_. We could have something
    like:

       POSIXTOOLS = portable/ls unportable/$(OS)/ps.c ...
       MISCTOOLS = portable/sponge.c portable/cols.c portable/rev.c ...
       INTERACTIVETOOLS = portable/cols.c portable/clear.c ...
       ...
       BIN = $(POSIXTOOLS) $(MISCTOOLS) $(LINUXTOOLS)

    This allows for such grouping, while also allowing overlapping
    categories. This also doesn't hinder the useful semantic sbase/ubase
    separation (which is especially handy when working on non-Linux
    OSes). I think overall that sbase/ubase is the most useful
    distinction, so it should be treated specially, and not as part of
    the build-system (but more as part of the directory organization).

Possible drawbacks which I've thought about could be:
  - This requires occasionally writing and updating different,
    overlapping lists of tools. I believe this is not too much of an
    issue. Adding new items should be quite easy, and you've already
    done a form of grouping which we could use for this proposed
    alternative layout. This task could also be aided by the way
    busybox groups its different utilities, and the different
    standards and packages listed on the toybox website.

  - The Makefile would be a bit more complex. If we truly add support
    for multiple platforms (which is a mere suggestion), it could
    become a bit complex (as in: platform-specific bits of Makefile to
    properly compile libsys, or platform-specific utilities (lsmod,
    sysctl...)). If we don't, and just keep a portable/linux
    separation at the directory level, we still need a bit more
    complexity (defining tool categories, properly distinguishing
    between libutil and libsys at compile time, and only for
    platform-specific tools, etc.) than what we currently have, but
    this complexity would be very similar to the one we would have
    with your proposed layout.


Overall, I think that the benefits of the proposed alternative layout
(sbase/ubase primary distinction, reflected at the directory level; and
Makefile-specific lists as secondary distinction, only reflected at the
build-system level, which allows very great flexibility) outweigh the
inconveniences. Moreover, I think that your layout has issues, such as
the fact that sbase becomes a de facto Linux-only project, considering
that tools whose implementations are Linux-specific (ps, mknod, etc.)
are not treated differently from tools which are inherently portable
(sponge, sed...), or the fact that the categories you've made so far
forbid any kind of

[dev] [sbase] New branch for sbase+ubase

2024-03-13 Thread Roberto E. Vargas Caballero
Hi,

I have pushed a new branch called ubase-merge that has all
the files from ubase in the directory ubase of the root directory
in the repository. You can still see the full history of a file
in that directory using something like git log --follow.

I am thinking about the new layout, and for now I was considering
something like:

- posix (tools defined by POSIX)
- misc (helpful tools not defined by POSIX).
- linux (tools that only make sense in linux)
- curses-terminfo (tools with a dependency on terminfo, like clear and
  reset).
- curses-dummy (curses tools implementation without using terminfo).
- libutil (library with utility functions)
- libsys (library with system dependent functions).


The separation between curses-terminfo and curses-dummy is because we
currently have some curses tools implemented using hardcoded sequences
instead of using terminfo to locate the correct sequences for the
terminal used. While this would work in the majority of terminal
emulators (because almost all of them emulate vt100 compatible
terminals) it would not work in many cases. It is a good idea
to keep them because it reduces the dependencies of the project,
and it makes it easier to bootstrap systems. But I think we can also
implement the correct solutions and select which option to use
at build time.
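
As an illustration of the hardcoded approach, here is a minimal sketch of
a clear-style routine. The names clearseq and clearscreen() are invented
for this example and are not taken from the tree; the sequences are the
usual vt100/ANSI ones. A terminfo-based variant would look up the clear
capability for the current terminal instead of using a fixed string.

```c
#include <stdio.h>

/*
 * Hardcoded vt100/ANSI sequences: ESC [ 2 J erases the whole display,
 * ESC [ H moves the cursor to the top-left corner. This assumes a
 * vt100 compatible terminal and does not consult terminfo at all.
 */
static const char clearseq[] = "\033[2J\033[H";

/* Write the clear sequence to fp; returns 0 on success, -1 on error. */
int
clearscreen(FILE *fp)
{
	if (fputs(clearseq, fp) == EOF || fflush(fp) == EOF)
		return -1;
	return 0;
}
```

A tool built this way needs nothing beyond the C library, which is what
makes it attractive for bootstrapping.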

It was suggested that we have functions that isolate the system
dependencies between different systems, and I thought that instead of
adding them to libutil it was better to define a new library only for
that purpose.
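
A rough sketch of what one such function could look like, assuming a
made-up entry point called sys_uptime(); neither the name nor the
interface is decided, it is only meant to show the shape. The tool calls
the portable prototype and each system supplies its own implementation.

```c
#include <stdio.h>

/*
 * Hypothetical libsys entry point: store the seconds since boot in
 * *secs. Returns 0 on success, -1 if unsupported or on error. Tools
 * would call only this function and never touch /proc themselves.
 */
int sys_uptime(double *secs);

#if defined(__linux__)
int
sys_uptime(double *secs)
{
	FILE *fp;
	int ok;

	if (!(fp = fopen("/proc/uptime", "r")))
		return -1;
	/* the first field of /proc/uptime is the uptime in seconds */
	ok = fscanf(fp, "%lf", secs) == 1;
	fclose(fp);
	return ok ? 0 : -1;
}
#else
int
sys_uptime(double *secs)
{
	(void)secs;
	return -1; /* placeholder until this system gets a backend */
}
#endif
```

In a real tree the two implementations would more likely live in
separate files chosen by the Makefile; the #if is only there to keep
the example in one piece.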

Please, give your opinion.

Regards,