Re: [Toybox] find(1) -name vs -wholename

2024-03-05 Thread enh via Toybox
On Mon, Mar 4, 2024 at 6:09 PM Rob Landley  wrote:
>
> On 3/4/24 18:03, enh wrote:
> > On Mon, Mar 4, 2024 at 3:31 PM Rob Landley  wrote:
> >>
> >> On 3/4/24 12:19, enh via Toybox wrote:
> >> > obviously the patch is trivial, but i can't think of an existing
> >> > toybox tool that has one of these "you're holding it wrong" errors,
> >> > but this is one that i do find useful:
> >>
> >> I thought there was one in tar but couldn't find it. Gzip has "need -f to
> >> read TTY".
> >>
> >> I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
> >> error, it prints to stderr and then exits with 0. Seems a bit indecisive...)
> >
> > /facepalm
> >
> > (probably no-one noticed yet because it's most likely to be hit
> > interactively. still seems like a bug though!)
>
> Well, a warning isn't exactly an error...?
>
> It's design-level indecision. Is this a problem or not?
>
> >> > where i'm left wondering why
> >> > it can't just do the right thing... since `/` is illegal in a POSIX
> >> > name, what other interpretation could there be? but, still, better
> >> > than nothing.)
> >>
> >> I'd be happy to do the right thing instead? Fairly minor code change 
> >> either way.
>
> Thinking about it more, the "right thing" might be for -name to match the
> trailing whole entries, so if you "find toybox -name pending/git.c" it could
> come up with "toybox/toys/pending/git.c".

yeah, that's what i always assume it does until it doesn't work. (i've
never really seen the use for -path given its various limitations, and
although there's value to -wholename, that's a terrible name for "the
regex version of -name, but on the whole path".)

> Or in my case, my ~/toybox work directory has... 21 toybox repo directories
> under it (basically instead of branches+stash), so "find . -name toys/*/git.c"
> could find all the instances of that file using a more shell-like expansion
> syntax, even if there are subdirectories in the way (such as, real example,
> ~/toybox/android/toybox/toys/pending/git.c).
>
> This is getting us away from "minor code change", though.

yeah, that's why i was wondering whether we should just do the warning :-(

> But I think I already
> have code for this in... tar.c maybe? All that --anchored stuff calling
> do_filter() calling fnmatch. Not that hard to do it again with a model at 
> hand. :)
>
> > yeah, i'm not sure why coreutils doesn't do that --- perhaps to avoid
> > the question of whether `-name bits/syscall.h` means `-wholename
> > .*bits/syscall.h` or `-wholename .*/bits/syscall.h`?
> >
> > (`-path` with a trailing `/` is a similarly unhelpful sharp corner.)
>
> My brain is FRIED by packing to move, and I would need test cases with proposed
> output to follow the distinctions you're making there.
>
> I'm also a little confused about what "-wholename" is for given that the shell
> already does path expansion? I guess it's so you don't get your wildcards back
> as a result when there's no match? Hmmm... ah, I see, * can eat slashes here,
> and the shell won't do that.
>
> And... what does -path do again? In the toybox directory, none of these produce
> a result with the host find or toybox find:
>
>   find . -path pending
>   find . -path toys/pending
>   find . -path toys/pending/ip.c
>
> It's been some years since I implemented this stuff. What do the tests do...
>
>   $ mkdir dir
>   $ touch dir/file
>   $ find . -wholename 'dir*e'
>   find: ‘./dirtest/subdir’: Permission denied
>   $ find dir -wholename 'dir*e'
>   dir/file
>
> Ah, the ./ at the start is preventing both -wholename and -path from matching,
> because of course.
>
> >> We could even ping the coreutils guys about that, since they recently agreed
> >> to add -x when I grumped at them. (I'm moving house! It's very stressful!)
> >> Speaking of, I just remembered to ping busybox list about that... Alas, still
> >> no cut -DF in coreutils, last I checked...
> >
> > tbh, since starting to read the coreutils list i'm _less_ convinced
> > that anyone really thinks about anything,
>
> Coreutils is gnu.
>
> > and especially not about interactions between things.
>
> It's very gnu.
>
> Your expectations weren't low enough. There's a reason I started poking at
> busybox back in 2002. The gnu project was announced in 1983, and gnu/hurd
> remains unusable essentially today.
>
> > (i saw the -x,--swap thread but didn't
> > have the energy to point out the -x,--exchange would have been quite a
> > bit less unclear...)
>
> I know, but I only cared about A) the short option, B) having mv finally able to
> call that rename() functionality the linux kernel added ten years ago. (Which
> came up again recently with a patch implementing atomic exchange in the VFAT
> driver.)
>
> Letting the --longopt smell like that particular gatekeeper is fine with me, I
> never voluntarily use them, and it's sort of the opposite of an ablative duck:
>
>   https://bwiggs.com/notebook/queens-duck/
>
> He got to 

Re: [Toybox] find(1) -name vs -wholename

2024-03-04 Thread Rob Landley
On 3/4/24 18:03, enh wrote:
> On Mon, Mar 4, 2024 at 3:31 PM Rob Landley  wrote:
>>
>> On 3/4/24 12:19, enh via Toybox wrote:
>> > obviously the patch is trivial, but i can't think of an existing
>> > toybox tool that has one of these "you're holding it wrong" errors,
>> > but this is one that i do find useful:
>>
>> I thought there was one in tar but couldn't find it. Gzip has "need -f to 
>> read TTY".
>>
>> I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
>> error, it prints to stderr and then exits with 0. Seems a bit indecisive...)
> 
> /facepalm
> 
> (probably no-one noticed yet because it's most likely to be hit
> interactively. still seems like a bug though!)

Well, a warning isn't exactly an error...?

It's design-level indecision. Is this a problem or not?

>> > where i'm left wondering why
>> > it can't just do the right thing... since `/` is illegal in a POSIX
>> > name, what other interpretation could there be? but, still, better
>> > than nothing.)
>>
>> I'd be happy to do the right thing instead? Fairly minor code change either 
>> way.

Thinking about it more, the "right thing" might be for -name to match the
trailing whole entries, so if you "find toybox -name pending/git.c" it could
come up with "toybox/toys/pending/git.c".

Or in my case, my ~/toybox work directory has... 21 toybox repo directories
under it (basically instead of branches+stash), so "find . -name toys/*/git.c"
could find all the instances of that file using a more shell-like expansion
syntax, even if there are subdirectories in the way (such as, real example,
~/toybox/android/toybox/toys/pending/git.c).
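
A rough sketch of that "match the trailing whole entries" idea, to make the
proposal concrete. This is not toybox code: trailing_match() is a hypothetical
helper, and it assumes the pattern's component count is just one more than its
number of '/' characters (no escaped slashes). It cuts the path down to that
many trailing components and hands the result to fnmatch() with FNM_PATHNAME,
so each * stays inside one component.

```c
#include <fnmatch.h>
#include <string.h>

// Hypothetical helper (not in toybox): match pattern against the last N
// components of path, where N is the number of components in pattern.
// Returns 1 on match, 0 otherwise.
int trailing_match(const char *pattern, const char *path)
{
  const char *s, *tail = path + strlen(path);
  int want = 1;

  // One component, plus one more for each '/' in the pattern.
  for (s = pattern; *s; s++) if (*s == '/') want++;

  // Walk backwards until that many trailing components are collected,
  // or the start of the path is reached.
  while (tail > path) {
    if (tail[-1] == '/' && !--want) break;
    tail--;
  }

  // FNM_PATHNAME keeps '*' from crossing a '/', so components line up.
  return !fnmatch(pattern, tail, FNM_PATHNAME);
}
```

With this, trailing_match("pending/git.c", "toybox/toys/pending/git.c") and
trailing_match("toys/*/git.c", "./android/toybox/toys/pending/git.c") both
return 1, while a path ending in xpending/git.c does not match the first
pattern.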

This is getting us away from "minor code change", though. But I think I already
have code for this in... tar.c maybe? All that --anchored stuff calling
do_filter() calling fnmatch. Not that hard to do it again with a model at
hand. :)

> yeah, i'm not sure why coreutils doesn't do that --- perhaps to avoid
> the question of whether `-name bits/syscall.h` means `-wholename
> .*bits/syscall.h` or `-wholename .*/bits/syscall.h`?
> 
> (`-path` with a trailing `/` is a similarly unhelpful sharp corner.)

My brain is FRIED by packing to move, and I would need test cases with proposed
output to follow the distinctions you're making there.

I'm also a little confused about what "-wholename" is for given that the shell
already does path expansion? I guess it's so you don't get your wildcards back
as a result when there's no match? Hmmm... ah, I see, * can eat slashes here,
and the shell won't do that.
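
The "* can eat slashes" difference can be shown with fnmatch() directly. A
small sketch, assuming -wholename behaves like fnmatch() with no flags (which
is how POSIX describes -path) and that shell globbing behaves like
FNM_PATHNAME. Both function names are made up for illustration.

```c
#include <fnmatch.h>

// find-style whole-path match: no flags, so '*' may match '/'.
int wholename_style(const char *pattern, const char *path)
{
  return !fnmatch(pattern, path, 0);
}

// Shell-style match: FNM_PATHNAME means '/' must be matched explicitly.
int shell_style(const char *pattern, const char *path)
{
  return !fnmatch(pattern, path, FNM_PATHNAME);
}
```

So wholename_style("dir*e", "dir/file") matches but shell_style() with the
same arguments does not; the shell-style version needs "dir/*e".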

And... what does -path do again? In the toybox directory, none of these produce
a result with the host find or toybox find:

  find . -path pending
  find . -path toys/pending
  find . -path toys/pending/ip.c

It's been some years since I implemented this stuff. What do the tests do...

  $ mkdir dir
  $ touch dir/file
  $ find . -wholename 'dir*e'
  find: ‘./dirtest/subdir’: Permission denied
  $ find dir -wholename 'dir*e'
  dir/file

Ah, the ./ at the start is preventing both -wholename and -path from matching,
because of course.
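
To spell out the ./ problem: the pattern has to match the entire path string
find builds, including the starting point. A sketch, again assuming plain
fnmatch() with no flags stands in for -path/-wholename; path_matches() is a
hypothetical name, not a toybox function.

```c
#include <fnmatch.h>

// Hypothetical stand-in for the -path/-wholename test: the pattern must
// match the whole path as find builds it, leading "./" included.
int path_matches(const char *pattern, const char *path)
{
  return !fnmatch(pattern, path, 0);
}
```

Under "find ." the candidate string is "./toys/pending/ip.c", so the pattern
"toys/pending/ip.c" fails, while "./toys/pending/ip.c" or
"*/toys/pending/ip.c" match. Likewise "dir*e" matches "dir/file" but not
"./dir/file".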

>> We could even ping the coreutils guys about that, since they recently agreed
>> to add -x when I grumped at them. (I'm moving house! It's very stressful!)
>> Speaking of, I just remembered to ping busybox list about that... Alas, still
>> no cut -DF in coreutils, last I checked...
> 
> tbh, since starting to read the coreutils list i'm _less_ convinced
> that anyone really thinks about anything,

Coreutils is gnu.

> and especially not about interactions between things.

It's very gnu.

Your expectations weren't low enough. There's a reason I started poking at
busybox back in 2002. The gnu project was announced in 1983, and gnu/hurd
remains unusable essentially today.

> (i saw the -x,--swap thread but didn't
> have the energy to point out the -x,--exchange would have been quite a
> bit less unclear...)

I know, but I only cared about A) the short option, B) having mv finally able to
call that rename() functionality the linux kernel added ten years ago. (Which
came up again recently with a patch implementing atomic exchange in the VFAT
driver.)

Letting the --longopt smell like that particular gatekeeper is fine with me, I
never voluntarily use them, and it's sort of the opposite of an ablative duck:

  https://bwiggs.com/notebook/queens-duck/

He got to keep --swap. He _wanted_ it to be called --swap. Which meant he wanted
the feature to go in, because otherwise it couldn't be called --swap.

>> Rob

Rob

P.S. Oddly enough, while Linux beat gnu because gnu sucked, Linux beat BSD using
the standard disruptive technology playbook Clayton Christensen described in the
Innovator's Dilemma in 1997.

The problem FreeBSD had back in 1991 was it was big iron tech ported down to PCs
and didn't fit comfortably in the smaller space. A bit like IBM's OS/2, which
was full of System Object Model implementations of Common Object Request Broker
Architecture, in triplicate. 

Re: [Toybox] find(1) -name vs -wholename

2024-03-04 Thread enh via Toybox
On Mon, Mar 4, 2024 at 3:31 PM Rob Landley  wrote:
>
> On 3/4/24 12:19, enh via Toybox wrote:
> > obviously the patch is trivial, but i can't think of an existing
> > toybox tool that has one of these "you're holding it wrong" errors,
> > but this is one that i do find useful:
>
> I thought there was one in tar but couldn't find it. Gzip has "need -f to 
> read TTY".
>
> I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
> error, it prints to stderr and then exits with 0. Seems a bit indecisive...)

/facepalm

(probably no-one noticed yet because it's most likely to be hit
interactively. still seems like a bug though!)

> Alas my find.c is dirty because of the whole pending environment measuring mess:
>
> -  TT.max_bytes = sysconf(_SC_ARG_MAX) - environ_bytes();
> +  TT.max_bytes = child_env_free(0);
>
> Yet another open can of worms where I need to do heavy lifting to close a tab...
>
> > ~/aosp-main-with-phones/prebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.17-4.8$
> > find . -name bits/syscall.h
> > find: warning: ‘-name’ matches against basenames only, but the given
> > pattern contains a directory separator (‘/’), thus the expression will
> > evaluate to false all the time.  Did you mean ‘-wholename’?
>
> Code change is easy enough, something like:
>
>   dprintf(2, "%s: -name can't match paths, try -wholename\n", toys.which->name);
>
> > (of course, it's also a bit like the macOS `grep -r` "hey, i'm just
> > going to sit here doing nothing because -r defaults to stdin rather
> > than the `.` that you obviously intended"
>
> Which debian fixed ages ago. :)

it was probably debian that caused me to get out of the habit of
typing the `.` :-)

> > where i'm left wondering why
> > it can't just do the right thing... since `/` is illegal in a POSIX
> > name, what other interpretation could there be? but, still, better
> > than nothing.)
>
> I'd be happy to do the right thing instead? Fairly minor code change either 
> way.

yeah, i'm not sure why coreutils doesn't do that --- perhaps to avoid
the question of whether `-name bits/syscall.h` means `-wholename
.*bits/syscall.h` or `-wholename .*/bits/syscall.h`?
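
a sketch of the two readings as test cases, assuming fnmatch() with no flags
stands in for -wholename, and writing the prefix as the glob `*` rather than
`.*` since fnmatch() takes globs. match_loose() and match_component() are
hypothetical names.

```c
#include <fnmatch.h>
#include <stdio.h>

// Reading 1: "-name X" acts like "-wholename '*X'".
int match_loose(const char *name, const char *path)
{
  char pat[4096];
  int len = snprintf(pat, sizeof(pat), "*%s", name);

  if (len < 0 || len >= (int)sizeof(pat)) return 0;
  return !fnmatch(pat, path, 0);
}

// Reading 2: "-name X" acts like "-wholename '*/X'", so X has to start
// at a component boundary.
int match_component(const char *name, const char *path)
{
  char pat[4096];
  int len = snprintf(pat, sizeof(pat), "*/%s", name);

  if (len < 0 || len >= (int)sizeof(pat)) return 0;
  return !fnmatch(pat, path, 0);
}
```

both readings accept "./include/bits/syscall.h" for the name
"bits/syscall.h", but only the loose one accepts
"./include/rabbits/syscall.h". the second reading has its own corner: it
rejects the bare path "bits/syscall.h" because nothing precedes the slash.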

(`-path` with a trailing `/` is a similarly unhelpful sharp corner.)

> We could even ping the coreutils guys about that, since they recently agreed
> to add -x when I grumped at them. (I'm moving house! It's very stressful!)
> Speaking of, I just remembered to ping busybox list about that... Alas, still
> no cut -DF in coreutils, last I checked...

tbh, since starting to read the coreutils list i'm _less_ convinced
that anyone really thinks about anything, and especially not about
interactions between things. (i saw the -x,--swap thread but didn't
have the energy to point out the -x,--exchange would have been quite a
bit less unclear...)

> Rob
___
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net


Re: [Toybox] find(1) -name vs -wholename

2024-03-04 Thread Rob Landley
On 3/4/24 12:19, enh via Toybox wrote:
> obviously the patch is trivial, but i can't think of an existing
> toybox tool that has one of these "you're holding it wrong" errors,
> but this is one that i do find useful:

I thought there was one in tar but couldn't find it. Gzip has "need -f to read 
TTY".

I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
error, it prints to stderr and then exits with 0. Seems a bit indecisive...)

Alas my find.c is dirty because of the whole pending environment measuring mess:

-  TT.max_bytes = sysconf(_SC_ARG_MAX) - environ_bytes();
+  TT.max_bytes = child_env_free(0);

Yet another open can of worms where I need to do heavy lifting to close a tab...

> ~/aosp-main-with-phones/prebuilts/gcc/linux-x86/host/x86_64-linux-glibc2.17-4.8$
> find . -name bits/syscall.h
> find: warning: ‘-name’ matches against basenames only, but the given
> pattern contains a directory separator (‘/’), thus the expression will
> evaluate to false all the time.  Did you mean ‘-wholename’?

Code change is easy enough, something like:

  dprintf(2, "%s: -name can't match paths, try -wholename\n", toys.which->name);
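
If it should be decisive rather than print-and-exit-0, one option is a check
that reports whether the pattern is usable so the caller can fail. A minimal
sketch only: name_pattern_ok() is a hypothetical helper rather than anything
in find.c, and it uses plain fprintf() instead of the toybox plumbing.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not in toybox): returns 1 if a -name pattern can
// match something, 0 if it contains '/' and so can never match a basename.
// The diagnostic goes to stderr; the return value lets the caller exit
// nonzero instead of carrying on with status 0.
int name_pattern_ok(const char *cmd, const char *pattern)
{
  if (strchr(pattern, '/')) {
    fprintf(stderr, "%s: -name can't match paths, try -wholename\n", cmd);
    return 0;
  }

  return 1;
}
```

Here name_pattern_ok("find", "syscall.h") returns 1 and
name_pattern_ok("find", "bits/syscall.h") prints the message and returns 0.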

> (of course, it's also a bit like the macOS `grep -r` "hey, i'm just
> going to sit here doing nothing because -r defaults to stdin rather
> than the `.` that you obviously intended"

Which debian fixed ages ago. :)

> where i'm left wondering why
> it can't just do the right thing... since `/` is illegal in a POSIX
> name, what other interpretation could there be? but, still, better
> than nothing.)

I'd be happy to do the right thing instead? Fairly minor code change either way.

We could even ping the coreutils guys about that, since they recently agreed to
add -x when I grumped at them. (I'm moving house! It's very stressful!) Speaking
of, I just remembered to ping busybox list about that... Alas, still no cut -DF
in coreutils, last I checked...

Rob