Re: awk not just using the Field separator as such. it is using the blank space as well ...
lbrt...@gmail.com wrote:
>On 2/21/23, Greg Wooledge wrote:
>> I have a funny feeling Albretch might be using Microsoft file systems
>> (FAT, NTFS) for a large chunk of his system. Those have a much larger
>> set of restricted characters.
>
> Certainly not FAT32 and definitely not FAT, but at work (I work as a
>Math teacher and most schools use Microsoft) I have had to use WSL and
>NTFS. I always thought that FSs used length-defined raster data
>structures in order to avoid messing with points and such things.

Different filesystems can vary massively here; you can't really assume
anything. All of the following can vary among the filesystems supported
by Linux:

 * allowed characters in filenames
 * allowed filename lengths
 * allowed full-path lengths
 * character encodings for filenames
 * case-sensitivity
 * max number of files per directory
 * max number of files per filesystem
 * timestamps (minimum, maximum and resolution)
 * support for symlinks and hardlinks
 * support for extended attributes, permissions and ACLs
 * ...

The VFS layer does a very good job of hiding the complexity and giving
you a reasonably consistent view, but it's not difficult to find edges
if you look. :-)

--
Steve McIntyre, Cambridge, UK. st...@einval.com
< sladen> I actually stayed in a hotel and arrived to find a post-it
          note stuck to the mini-bar saying "Paul: This fridge and
          fittings are the correct way around and do not need altering"
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On 2/21/23, Greg Wooledge wrote:
> I have a funny feeling Albretch might be using Microsoft file systems
> (FAT, NTFS) for a large chunk of his system. Those have a much larger
> set of restricted characters.

 Certainly not FAT32 and definitely not FAT, but at work (I work as a
Math teacher and most schools use Microsoft) I have had to use WSL and
NTFS. I always thought that FSs used length-defined raster data
structures in order to avoid messing with points and such things.

 lbrtchx
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Tue, Feb 21, 2023 at 05:19:13AM +, Tim Woodall wrote:
> On Mon, 20 Feb 2023, Albretch Mueller wrote:
>
> > On 2/15/23, Greg Wooledge wrote:
> >
> > The reason why I use pipes as field delimiter is because it is an
> > excellent meta character when you are working with filesystems. Pipes
> > would not be accepted for files or directory names for good reasons,
> > anyway.
> >
>
> tim@einstein(7):~ (none)$ touch 'i|use|pipes'
> tim@einstein(7):~ (none)$ ls -l i*use*
> -rw-rw-r-- 1 tim tim 0 Feb 21 05:14 'i|use|pipes'
> tim@einstein(7):~ (none)$ rm i\|use\|pipes
> tim@einstein(7):~ (none)$
>
> AFAIR only / and nul are prohibited in file names.

In Unix-like file systems, including Debian's default ext4, this is true.

I have a funny feeling Albretch might be using Microsoft file systems
(FAT, NTFS) for a large chunk of his system. Those have a much larger
set of restricted characters.
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Mon, 20 Feb 2023, Albretch Mueller wrote:

> On 2/15/23, Greg Wooledge wrote:
>
> The reason why I use pipes as field delimiter is because it is an
> excellent meta character when you are working with filesystems. Pipes
> would not be accepted for files or directory names for good reasons,
> anyway.
>

tim@einstein(7):~ (none)$ touch 'i|use|pipes'
tim@einstein(7):~ (none)$ ls -l i*use*
-rw-rw-r-- 1 tim tim 0 Feb 21 05:14 'i|use|pipes'
tim@einstein(7):~ (none)$ rm i\|use\|pipes
tim@einstein(7):~ (none)$

AFAIR only / and nul are prohibited in file names.
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Mon, Feb 20, 2023 at 09:12:08PM +, Albretch Mueller wrote:
>  However this would rightly split that line based on the pipe delimiter:
>
> $ echo "${_PTH}" | awk -F '|' '{for (i=1; i<=NF; i++) print $i;}'
> 83847547
> 2
> dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf

So you're just converting pipes to newlines? You can do that with tr.

    tr '|' '\n'

>  There should be a sane way ;-) to feed those three lines into a bash array.

    mapfile -t myarray < <(...)

But calling multiple processes just to split *one* line of input is
rather inefficient.
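
The tr and mapfile suggestions above can be put together roughly as
follows. This is only a sketch, assuming bash 4 or later (for mapfile)
and fields that contain no newlines; the names `line` and `fields` are
illustrative and do not come from the thread.

```shell
#!/usr/bin/env bash
# Sample input taken from the thread; 'line' and 'fields' are illustrative names.
line='83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf'

# tr turns every pipe into a newline; mapfile -t stores each resulting line
# as one array element and strips the trailing newline from each.
mapfile -t fields < <(tr '|' '\n' <<< "$line")

printf '%s\n' "${#fields[@]}"    # number of fields
printf '[%s]\n' "${fields[@]}"   # one bracketed field per line
```

As noted above, this starts an extra process per input line, so it is
better suited to a one-off split than to a loop over many lines.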
Re: awk not just using the Field separator as such. it is using the blank space as well ...
 Thank you! I noticed my mistake and yes, once again it was a hack which
I thought to be a typo. I had removed the pipe you had included in the
last part of the input string!: "${_PTH}|"

_PTH="83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
IFS="|" read -ra _PTH_AR <<< "${_PTH}|"
_PTH_AR_L=${#_PTH_AR[@]}
echo "// __ \$_PTH_AR_L: |${_PTH_AR_L}|, \"${_PTH}\""
for(( IX=0; IX<${_PTH_AR_L}; ++IX )); do
 echo "// __ [$IX/$_PTH_AR_L): |${_PTH_AR[$IX]}|"
done

// __ $_PTH_AR_L: |3|, "83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
// __ [0/3): |83847547|
// __ [1/3): |2|
// __ [2/3): |dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf|

 With awk I used to do things like this:

_PTH_AR=($( echo "${_PTH}" | awk -F '|' '{for (i=1; i<=NF; i++) print $i;}' ))
echo "// __ \$_PTH_AR_L: |${_PTH_AR_L}|, \"${_PTH}\""

// __ $_PTH_AR_L: |1|, "83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"

 However this would rightly split that line based on the pipe delimiter:

$ echo "${_PTH}" | awk -F '|' '{for (i=1; i<=NF; i++) print $i;}'
83847547
2
dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf
$

 There should be a sane way ;-) to feed those three lines into a bash array.
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Mon, Feb 20, 2023 at 07:24:01PM +, Albretch Mueller wrote:
> > https://mywiki.wooledge.org/BashPitfalls#pf47
>
>  what I am trying to do is split a string using as delimiter a pipe

The web page you cited tells you how, doesn't it?

Assuming your string is a line (e.g. something you pulled out of a
*simplified* CSV file, where there are no delimiters inside fields),
and that you want to store the fields in an array, you can simply do:

    IFS="|" read -ra myarray <<< "$mystring|"

Demonstration:

unicorn:~$ mystring='foo|bar|last|field|is|empty|'
unicorn:~$ IFS="|" read -ra myarray <<< "$mystring|"
unicorn:~$ declare -p myarray
declare -a myarray=([0]="foo" [1]="bar" [2]="last" [3]="field" [4]="is" [5]="empty" [6]="")

> I used to do that with awk,

I don't understand how awk helps you populate the elements of a bash
array. Awk can write a new string to stdout, but then you still have
to parse that string in bash...? I don't see what benefit awk gives
you here.

>  How do you split a string using as delimiter a pipe these days
> without using a bloody hack?

You cited a bash web page. So, everything you're doing is a hack.
That's the nature of bash.
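
The reason for the extra "|" in the demonstration above may be easier
to see side by side. The following is a small sketch of the pitfall
(BashPitfalls #47); the sample string and the array names `short` and
`full` are hypothetical.

```shell
#!/usr/bin/env bash
# 'with_empty_last' is a hypothetical sample: three fields, the last one empty.
with_empty_last='a|b|'

# Without the extra delimiter, read treats the final "|" as a terminator
# and the empty last field is silently dropped.
IFS='|' read -ra short <<< "$with_empty_last"

# Appending one more "|" preserves the empty last field.
IFS='|' read -ra full <<< "$with_empty_last|"

echo "without extra delimiter: ${#short[@]} elements"
echo "with extra delimiter:    ${#full[@]} elements"
```

When the last field is non-empty, the appended "|" makes no difference
to the result, which is why it can be added unconditionally.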
Re: awk not just using the Field separator as such. it is using the blank space as well ...
> https://mywiki.wooledge.org/BashPitfalls#pf47

 What I am trying to do is split a string using a pipe as the delimiter.
I used to do that with awk, but it doesn't work anymore after someone
had the great idea of substituting awk with mawk, it seems; and Hey!
They could have done it with python!:

$ which awk
/usr/bin/awk
$ which mawk
/usr/bin/mawk
$ awk -W version
mawk 1.3.4 20200120
Copyright 2008-2019,2020, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan

random-funcs:       srandom/random
regex-funcs:        internal
compiled limits:
sprintf buffer      8192
maximum-integer     2147483647
$ mawk -W version
mawk 1.3.4 20200120
Copyright 2008-2019,2020, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan

random-funcs:       srandom/random
regex-funcs:        internal
compiled limits:
sprintf buffer      8192
maximum-integer     2147483647
$

 How do you split a string using a pipe as the delimiter these days
without using a bloody hack?
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Mon, Feb 20, 2023 at 07:10:11AM +, Albretch Mueller wrote:
> On 2/15/23, Greg Wooledge wrote:
> > If you want to read FIELDS of a SINGLE LINE as array elements, use
> > read -ra:
> >
> > read -ra myarray <<< "$one_line"
>
>  It didn't work. I tried different options. I am getting: "bash: read:
> ... : not a valid identifier"
>
> _PTH="83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
> echo "// __ \$_PTH: \"${_PTH}\""
>
> # read -ra -d "\\|" _PTH_AR <<< "${_PTH}"
> # read -ra -d "\|" _PTH_AR <<< "${_PTH}"
> # read -ra -d "|" _PTH_AR <<< "${_PTH}"

The -a option has to be followed by the array name. The -d option has
to be followed by the delimiter.

However, you do NOT want -d "|" here. The -d delimiter tells read where
to stop reading entirely. For you, that's the newline character, which
is the default for read, and which is added by the <<< operator.

If you wish to do field splitting when using read, that's what IFS is for.
However, beware of the atrociously stupid pitfall regarding IFS with
non-whitespace values.

unicorn:~$ _PTH="83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
unicorn:~$ declare -p _PTH
declare -- _PTH="83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
unicorn:~$ IFS="|" read -ra _PTH_AR <<< "${_PTH}|"
unicorn:~$ declare -p _PTH_AR
declare -a _PTH_AR=([0]="83847547" [1]="2" [2]="dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf")

That, I believe, is what you were trying to accomplish.

Note that I added a trailing | character on the <<< "${_PTH}|" command.
That's because of this pitfall:

https://mywiki.wooledge.org/BashPitfalls#pf47

Now we just need to teach you to stop using _ALL_CAPS variable names,
especially ones with leading underscores.
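
The difference between -d and IFS described above can be seen in a
short sketch. The sample string and the names `s`, `first` and `all`
are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
s='83847547|2|last field'   # hypothetical sample in the thread's format

# -d '|' makes read stop at the first pipe: only the first field is consumed.
read -r -d '|' first <<< "$s"

# IFS='|' keeps newline as the record terminator and splits the whole line
# into fields; the appended "|" guards against a dropped empty last field.
IFS='|' read -r -a all <<< "$s|"

echo "read -d stopped after: $first"
echo "IFS split produced:    ${#all[@]} fields"
```

Writing the options as `-r -a all` rather than `-ra -d ...` also keeps
the array name directly after -a, which is what the "not a valid
identifier" error was complaining about.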
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On 2/15/23, Greg Wooledge wrote:
> If you want to read FIELDS of a SINGLE LINE as array elements, use
> read -ra:
>
> read -ra myarray <<< "$one_line"

 It didn't work. I tried different options. I am getting: "bash: read:
... : not a valid identifier"

_PTH="83847547|2|dli.ernet.449320/449320-Seduction Of The Innocent_text.pdf"
echo "// __ \$_PTH: \"${_PTH}\""

# read -ra -d "\\|" _PTH_AR <<< "${_PTH}"
# read -ra -d "\|" _PTH_AR <<< "${_PTH}"
# read -ra -d "|" _PTH_AR <<< "${_PTH}"

# read -ra -d '\\|' _PTH_AR <<< "${_PTH}"
# read -ra -d '\|' _PTH_AR <<< "${_PTH}"
# read -ra -d '|' _PTH_AR <<< "${_PTH}"

_PTH_AR_L=${#_PTH_AR[@]}
echo "// __ \$_PTH_AR_L: |${_PTH_AR_L}|, \"${_PTH}\""

 The reason why I use pipes as field delimiter is because it is an
excellent meta character when you are working with filesystems. Pipes
would not be accepted for files or directory names for good reasons,
anyway.
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Wed, Feb 15, 2023 at 12:09:28PM +, Albretch Mueller wrote:
> On 2/15/23, DdB wrote:
> > $ echo "Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\"" | awk
> > -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
> > Adams, Fred, and Ken Aizawa
> > The Bounds of Cognition
>
>  yes and this also works:
>
> _L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
> echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
> Adams, Fred, and Ken Aizawa
> The Bounds of Cognition
>
>  but I wasn't able to write the output into an array

If you want to read LINES of a STREAM as array elements, use mapfile:

    mapfile -t myarray < <(
      printf '%s\n' "$stuff" | awk -F'\"' '...'
    )

If you want to read FIELDS of a SINGLE LINE as array elements, use
read -ra:

    read -ra myarray <<< "$one_line"

Note the caveats associated with each of these, especially the second
one. Very few things in bash ever work as you expect once you start
poking at the corner cases.

https://mywiki.wooledge.org/BashPitfalls#pf47
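
Filled in with the string from this thread, the mapfile form above
might look like the sketch below. It assumes bash 4 or later and any
POSIX awk; the array name `parts` and the step that drops the empty
last element are additions for illustration, not part of the advice
above.

```shell
#!/usr/bin/env bash
# Sample string from the thread; 'parts' is an illustrative array name.
stuff='Adams, Fred, and Ken Aizawa "The Bounds of Cognition"'

# awk splits on the double quote and prints one field per line;
# mapfile -t then stores each line, spaces included, as one element.
mapfile -t parts < <(printf '%s\n' "$stuff" | awk -F'"' '{for (i=1; i<=NF; i++) print $i}')

# The input ends with a quote, so awk emits an empty last field.
# Drop it if it is not wanted (assumes the array is non-empty).
last=$(( ${#parts[@]} - 1 ))
[[ -z ${parts[last]} ]] && unset 'parts[last]'

declare -p parts
```

Because the awk output is read through process substitution, no
temporary file is needed.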
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On 2/15/23, DdB wrote:
> $ echo "Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\"" | awk
> -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
> Adams, Fred, and Ken Aizawa
> The Bounds of Cognition

 yes and this also works:

_L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
Adams, Fred, and Ken Aizawa
The Bounds of Cognition

 but I wasn't able to write the output into an array

> $ awk --version

 I also discovered that there seems to be something wrong with the
version of awk I am working with:

$ awk --version
awk: not an option: --version
$ which awk
/usr/bin/awk
$ awk -W version
mawk 1.3.4 20200120
Copyright 2008-2019,2020, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan

random-funcs:       srandom/random
regex-funcs:        internal
compiled limits:
sprintf buffer      8192
maximum-integer     2147483647
$

On 2/15/23, David wrote:
> Start reading here:
> http://mywiki.wooledge.org/BashFAQ/005

 which helped me find a hack around it that I am comfortable with:

_DT=$(date +%Y%m%d%H%M%S)
_TMPFL=$(basename "$(pwd)")_$(mktemp ${_DT}.XX)

_L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i;}' > "${_TMPFL}"
mapfile -t _AR < "${_TMPFL}"
_AR_L=${#_AR[@]}
echo "// __ \$_AR_L: |${_AR_L}|"

rm --force --verbose "${_TMPFL}"

 I think the problem is that whatever bash is using as "awk" is also
treating a blank space as a delimiter when splitting the string.

 lbrtchx
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On Wed, 15 Feb 2023 at 18:22, DdB wrote:
> On 15.02.2023 at 07:25, Albretch Mueller wrote:

> > $ _L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
> > echo "// __ \$_L: |${_L}|"
> > _AR=($(echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i}' ))
> > _AR_L=${#_AR[@]}
> > echo "// __ \$_AR_L: |${_AR_L}|"
> > for(( _IX=0; _IX<${_AR_L}; _IX++ )); do
> > echo "// __ [$_IX/$_AR_L): |${_AR[$_IX]}|"
> > done

> what awk are you using? gnu awk works fine. see:

The complaint has nothing to do with awk.

The reason this is happening is because when the shell creates the
elements of the array _AR, it parses those elements as separated by any
whitespace. Whereas the OP expects the elements to be separated by
newlines.

Just looking at this made my eyes bleed so that, combined with the
total lack of troubleshooting effort, means that my answer ends as
follows:

Start reading here:
  http://mywiki.wooledge.org/BashFAQ/005
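
The whitespace-versus-newline point above can be reproduced without
awk at all. In this sketch the function `emit` is a hypothetical
stand-in that prints two lines similar to the awk output in the
thread; the array names are likewise illustrative.

```shell
#!/usr/bin/env bash
# Two lines of output standing in for what awk printed in the thread.
emit() { printf '%s\n' 'Adams, Fred, and Ken Aizawa' 'The Bounds of Cognition'; }

# Unquoted $(...) undergoes word splitting on spaces, tabs and newlines
# (the default IFS), so every word becomes its own element.
by_words=( $(emit) )

# mapfile splits on newlines only, one element per line.
mapfile -t by_lines < <(emit)

echo "unquoted \$(...): ${#by_words[@]} elements"
echo "mapfile:         ${#by_lines[@]} elements"
```

The unquoted form is also subject to pathname expansion, so a field
such as `*` would be replaced by matching file names; mapfile avoids
that as well.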
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On 15.02.2023 at 08:21, DdB wrote:
> $ awk --version
> GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2)
> Copyright © 1989, 1991-2018 Free Software Foundation.

even mawk would. see:

$ mawk -W version
compiled limits:
max NF             32767
sprintf buffer      2040
$ echo "Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\"" | mawk -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
Adams, Fred, and Ken Aizawa
The Bounds of Cognition
Re: awk not just using the Field separator as such. it is using the blank space as well ...
On 15.02.2023 at 07:25, Albretch Mueller wrote:
> $ _L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
> echo "// __ \$_L: |${_L}|"
> _AR=($(echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i}' ))
> _AR_L=${#_AR[@]}
> echo "// __ \$_AR_L: |${_AR_L}|"
> for(( _IX=0; _IX<${_AR_L}; _IX++ )); do
> echo "// __ [$_IX/$_AR_L): |${_AR[$_IX]}|"
> done

what awk are you using? gnu awk works fine. see:

$ echo "Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\"" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i;}'
Adams, Fred, and Ken Aizawa
The Bounds of Cognition

$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2)
Copyright © 1989, 1991-2018 Free Software Foundation.

This program is free software. You can redistribute and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation. Version 2 of this license applies, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/.
awk not just using the Field separator as such. it is using the blank space as well ...
 Once again one of my silly problems ;-). I have searched and searched
for an answer or the reason why this is happening.

$ _L="Adams, Fred, and Ken Aizawa \"The Bounds of Cognition\""
echo "// __ \$_L: |${_L}|"
_AR=($(echo "${_L}" | awk -F'\"' '{for (i=1; i<=NF; i++) print $i}' ))
_AR_L=${#_AR[@]}
echo "// __ \$_AR_L: |${_AR_L}|"
for(( _IX=0; _IX<${_AR_L}; _IX++ )); do
 echo "// __ [$_IX/$_AR_L): |${_AR[$_IX]}|"
done
// __ $_L: |Adams, Fred, and Ken Aizawa "The Bounds of Cognition"|
// __ $_AR_L: |9|
// __ [0/9): |Adams,|
// __ [1/9): |Fred,|
// __ [2/9): |and|
// __ [3/9): |Ken|
// __ [4/9): |Aizawa|
// __ [5/9): |The|
// __ [6/9): |Bounds|
// __ [7/9): |of|
// __ [8/9): |Cognition|
$

 This is the result I am looking for (the last empty string could
probably be discarded):

// __ $_L: |Adams, Fred, and Ken Aizawa "The Bounds of Cognition"|
// __ $_AR_L: |3|
// __ [0/3): |Adams, Fred, and Ken Aizawa |
// __ [1/3): |The Bounds of Cognition|
// __ [2/3): ||

 lbrtchx