Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread David Wright
On Fri 15 Dec 2023 at 08:58:10 (-0500), Greg Wooledge wrote:
> On Fri, Dec 15, 2023 at 02:30:21PM +0100, Nicolas George wrote:
> > Greg Wooledge (12023-12-15):
> > > readarray -d '' fndar < <(
> > > find "$sdir" ... -printf 'stuff\0' |
> > > sort -z --otherflags
> > > )
> 
> > It is possible to do it safely in bash plus command-line tools, indeed.
> > But in such a complex case, it is better to use something with a
> > higher-level interface. I am sure File::Find and Version::Compare can
> > let Perl do the same thing in a much safer way.
> 
> Equally safe, perhaps.  Not safer.  I don't know those particular perl
> modules -- are they included in a standard Debian system, or does
> one need to install optional packages?  And then there's a learning
> curve for them as well.
> 
> By the way, your MUA is adding 10000 years to its datestamps.

Don't knock it: beats using the French Republican calendar.
But I miss the hours:minutes used by most MUAs (the minutes
being relatively unaffected by time zones). They can help
with following threads stored in different locations.

Cheers,
David.



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Nicolas George
Greg Wooledge (12023-12-15):
> Equally safe, perhaps.  Not safer.  I don't know those particular perl
> modules -- are they included in a standard Debian system, or does
> one need to install optional packages?  And then there's a learning
> curve for them as well.

File::Find is a standard module, Version::Compare is packaged.

I consider it safer because I factor mistakes in my estimate: if you get
the Perl version working without using strange constructs in your code,
the odds that it will break on special characters are vanishingly thin.
With shell, unless we tested for it, there are chances we forgot a
corner case.

> By the way, your MUA is adding 10000 years to its datestamps.

It is called the Holocene calendar, the principle being that everything
that happened that might deserve to be expressed as a year happened in
the last 12K years.

See:

https://en.wikipedia.org/wiki/Holocene_calendar

Or possibly:

https://www.youtube.com/watch?v=czgOWmtGVGs

Regards,

-- 
  Nicolas George



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Greg Wooledge
On Fri, Dec 15, 2023 at 02:30:21PM +0100, Nicolas George wrote:
> Greg Wooledge (12023-12-15):
> > readarray -d '' fndar < <(
> > find "$sdir" ... -printf 'stuff\0' |
> > sort -z --otherflags
> > )

> It is possible to do it safely in bash plus command-line tools, indeed.
> But in such a complex case, it is better to use something with a
> higher-level interface. I am sure File::Find and Version::Compare can
> let Perl do the same thing in a much safer way.

Equally safe, perhaps.  Not safer.  I don't know those particular perl
modules -- are they included in a standard Debian system, or does
one need to install optional packages?  And then there's a learning
curve for them as well.

By the way, your MUA is adding 10000 years to its datestamps.



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Nicolas George
Greg Wooledge (12023-12-15):
> On Fri, Dec 15, 2023 at 01:42:14PM +0100, Nicolas George wrote:
> > Also, note that file names can also contain newlines in general. The
> > only robust delimiter is the NUL character.
> 
> True.  In order to be 100% safe, the OP's code would need to look
> more like this:
> 
> readarray -d '' fndar < <(
> find "$sdir" ... -printf 'stuff\0' |
> sort -z --otherflags
> )
> 
> The -d '' option for readarray requires bash 4.4 or higher.  If this
> script needs to run on bash 4.3 or older, you'd need to use a loop
> instead of readarray.
> 
> This may look a bit inscrutable, but the purpose is to ensure that
> a NUL delimiter is used at every step.  First, find -printf '...\0'
> will print a NUL character after each filename-and-stuff.  Second,
> sort -z uses NUL as its record separator (instead of newline), and
> produces sorted output that also uses NUL.  Finally, readarray -d ''
> uses the NUL character as its record separator.  The final result is
> an array containing each filename-and-stuff produced by find, in the
> order determined by sort, even if some of the filenames contain
> newline characters.

It is possible to do it safely in bash plus command-line tools, indeed.
But in such a complex case, it is better to use something with a
higher-level interface. I am sure File::Find and Version::Compare can
let Perl do the same thing in a much safer way.

Regards,

-- 
  Nicolas George



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Greg Wooledge
On Fri, Dec 15, 2023 at 01:42:14PM +0100, Nicolas George wrote:
> Also, note that file names can also contain newlines in general. The
> only robust delimiter is the NUL character.

True.  In order to be 100% safe, the OP's code would need to look
more like this:

    readarray -d '' fndar < <(
        find "$sdir" ... -printf 'stuff\0' |
        sort -z --otherflags
    )

The -d '' option for readarray requires bash 4.4 or higher.  If this
script needs to run on bash 4.3 or older, you'd need to use a loop
instead of readarray.

This may look a bit inscrutable, but the purpose is to ensure that
a NUL delimiter is used at every step.  First, find -printf '...\0'
will print a NUL character after each filename-and-stuff.  Second,
sort -z uses NUL as its record separator (instead of newline), and
produces sorted output that also uses NUL.  Finally, readarray -d ''
uses the NUL character as its record separator.  The final result is
an array containing each filename-and-stuff produced by find, in the
order determined by sort, even if some of the filenames contain
newline characters.
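
For the bash 4.3-or-older case mentioned above, the loop could look
roughly like the sketch below. It is an illustration only: the scratch
directory, the file names, the simplified -printf format and the
variable name "entry" are assumptions made for the demo, not part of
the original script.

```shell
#!/usr/bin/env bash
# Sketch only: the scratch directory and file names are made up for the demo.
sdir=$(mktemp -d)
touch "$sdir/b.txt" "$sdir/a"$'\n'"newline.txt"

fndar=()
# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal;
# -d '' makes read stop at each NUL byte instead of at a newline.
while IFS= read -r -d '' entry; do
    fndar+=("$entry")
done < <(find "$sdir" -type f -printf '%P|%s\0' | sort -z)

echo "${#fndar[@]}"
rm -rf -- "$sdir"
```

The file whose name contains a newline ends up as a single array
element, because NUL is the only separator used anywhere in the
pipeline.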



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Albretch Mueller
On 12/15/23, Greg Wooledge wrote:
> More to the point, bash has a 'readarray' command which does what you
> *actually* want:
>
> readarray -t fndar < <(find "$sdir" ...)
>
 Yes, that was what I actually needed!

 lbrtchx



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Nicolas George
Albretch Mueller (12023-12-15):
> sdir="$(pwd)"
> #fndar=($(IFS=$'\n'; find "$sdir" -type f -printf '%P|%TY-%Tm-%Td
> %TI:%TM|%s\n' | sort --version-sort --reverse))
> #fndar=($(IFS='\n'; find "$sdir" -type f -printf '%P|%TY-%Tm-%Td
> %TI:%TM|%s\n' | sort --version-sort --reverse))
> fndar=($(find "$sdir" -type f -printf '%P|%TY-%Tm-%Td %TI:%TM|%s\n' |
> sort --version-sort --reverse))
> fndarl=${#fndar[@]}
> echo "// __ \$fndarl: |${fndarl}|${fndar[0]}"
> 
> the array construct ($( ... )) is using the space (between the date
> and the time) also to split array elements, but file names and paths
> may contain spaces, so ($( ... )) should have a way to reset its
> parsing metadata, or, do you know of any other way to get each whole
> -printf ... line out of find as part of array elements?

You set IFS in the subshell, but the subshell is doing nothing related
to IFS, it is just calling find and sort. You need to set IFS on the
shell that does the splitting.

Also, note that file names can also contain newlines in general. The
only robust delimiter is the NUL character.

Also, ditch bash. For simple scripts, do standard shell. For complex
scripts and interactive use, zsh rulz:

fndar=(${(f)"$(...)"})
fndar=(${(ps:\0:)"$(...)"})
fndar=(**/*(O))

(I do not think zsh can sort version numbers easily, though.)

Regards,

-- 
  Nicolas George



Re: setting IFS to new line doesn't work while searching?

2023-12-15 Thread Greg Wooledge
On Fri, Dec 15, 2023 at 12:33:01PM +, Albretch Mueller wrote:
> #fndar=($(IFS=$'\n'; find "$sdir" -type f -printf '%P|%TY-%Tm-%Td
> %TI:%TM|%s\n' | sort --version-sort --reverse))

> the array construct ($( ... )) is using the space (between the date
> and the time) also to split array elements,

Yeah, no.  That's not how it works.

You're setting IFS *inside* the command substitution whose value is
what you're trying to word-split.  It needs to be set outside.

In addition to word splitting, an unquoted command substitution's
output is going to undergo filename expansion (globbing).  So you
would also need to disable that.
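
A minimal sketch of what that would involve, for illustration only.
The function demo_lines is a hypothetical stand-in for the find | sort
pipeline, and old_ifs is a made-up variable name; the sketch also
assumes IFS was set and globbing was enabled beforehand.

```shell
#!/usr/bin/env bash
# Sketch only: demo_lines is a hypothetical stand-in for the find | sort output.
demo_lines() {
    printf '%s\n' 'a b.txt|2023-12-15 08:58|10' '*.log|2023-12-14 07:00|20'
}

old_ifs=$IFS        # remember the current IFS (assumed to be set)
IFS=$'\n'           # set in the shell that performs the word splitting
set -f              # disable globbing so '*.log' is not expanded
fndar=( $(demo_lines) )
set +f              # re-enable globbing (assumed to be on before)
IFS=$old_ifs        # restore the previous IFS

echo "${#fndar[@]}"
printf '<%s>\n' "${fndar[@]}"
```

Each line stays intact as one element, spaces and asterisk included,
but the save/restore bookkeeping is easy to get wrong.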

More to the point, bash has a 'readarray' command which does what you
*actually* want:

readarray -t fndar < <(find "$sdir" ...)

This avoids all of the issues with word splitting and globbing and
setting/resetting the IFS variable, and is more efficient as well.

BTW, readarray is a synonym for 'mapfile'.  You may use either spelling.
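
A small self-contained illustration of the readarray -t form, as a
sketch under assumptions: the scratch directory, the three file names
and the shortened -printf format are invented for the demo, and GNU
find is assumed for -printf.

```shell
#!/usr/bin/env bash
# Sketch only: the scratch directory and file names are made up for the demo.
sdir=$(mktemp -d)
touch "$sdir/plain.txt" "$sdir/with space.txt" "$sdir/star*.txt"

# One array element per line of output; no word splitting, no globbing.
# -t strips the trailing newline from each element.
readarray -t fndar < <(find "$sdir" -type f -printf '%P|%s\n' | sort)

echo "${#fndar[@]}"
printf '<%s>\n' "${fndar[@]}"
rm -rf -- "$sdir"
```

Names containing spaces or glob characters come through unchanged.
Names containing newlines would still be split, which is what the NUL
variant discussed elsewhere in the thread addresses.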



setting IFS to new line doesn't work while searching?

2023-12-15 Thread Albretch Mueller
sdir="$(pwd)"
#fndar=($(IFS=$'\n'; find "$sdir" -type f -printf '%P|%TY-%Tm-%Td %TI:%TM|%s\n' | sort --version-sort --reverse))
#fndar=($(IFS='\n'; find "$sdir" -type f -printf '%P|%TY-%Tm-%Td %TI:%TM|%s\n' | sort --version-sort --reverse))
fndar=($(find "$sdir" -type f -printf '%P|%TY-%Tm-%Td %TI:%TM|%s\n' |
sort --version-sort --reverse))
fndarl=${#fndar[@]}
echo "// __ \$fndarl: |${fndarl}|${fndar[0]}"

the array construct ($( ... )) is using the space (between the date
and the time) also to split array elements, but file names and paths
may contain spaces, so ($( ... )) should have a way to reset its
parsing metadata, or, do you know of any other way to get each whole
-printf ... line out of find as part of array elements?

lbrtchx