Re: Handling files with CRLF line ending

2022-12-06 Thread Yair Lenga
 Valid question.

I believe a major goal of bash should be to interoperate with other tools.
In this case, being able to read text files generated by Python when
running under WSL seems like something bash should do.

On the question of minimal changes: I believe many bash users (some of whom
are not hard-core developers, just devops) are tasked with transferring
existing solutions to WSL. I suspect those users are underrepresented in
this forum.

I admit I have no hard data to support any of this.

On Mon, Dec 5, 2022, 15:36 Chet Ramey  wrote:

> On 12/3/22 8:53 AM, Yair Lenga wrote:
> > Thank you for the suggestions. I want to emphasize: I do not need help in
> > stripping the CR from the input files - it's simple.
> >
> > The challenge is executing a working bash/python solution from Linux on
> > WSL, with MINIMAL changes to the scripts.
>
> That's certainly your priority. But is it a compelling enough reason to
> change bash to accomplish it?
>
> It seems easy enough to set up a pipeline on WSL to provide input in the
> form the script authors assume.
>
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>
>


Re: Handling files with CRLF line ending

2022-12-03 Thread Yair Lenga
Thank you for the suggestions. I want to emphasize: I do not need help in
stripping the CR from the input files - it's simple.

The challenge is executing a working bash/python solution from Linux on
WSL, with MINIMAL changes to the scripts.

Specifically in my case, the owners of the various modules are working in
Linux. They are research people, with no access to Windows dev boxes. I
would also mention: the research people have little interest in
cross-platform portability issues.

Yair

On Sat, Dec 3, 2022 at 8:44 AM Greg Wooledge  wrote:

> On Sat, Dec 03, 2022 at 05:40:02AM -0500, Yair Lenga wrote:
> > I was recently asked to deploy a bash/python based solution to windows
> > (WSL2).  The solution was developed on Linux. Bash is being used as a
> glue
> > to connect the python based data processing (pipes, files, ...).
> Everything
> > works as expected with a small BUT: files created by python can not be
> read
> > by bash `read` and `readarray`.
> >
> > The root cause is the CRLF line ending ("\r\n") - python on windows uses
> > the platform CRLF line ending (as opposed to LF line ending for Linux).
>
> The files can be read.  You just need to remove the CR yourself.  Probably
> the *easiest* way would be to replace constructs like this:
>
> readarray -t myarray < "$myfile"
>
> with this:
>
> readarray -t myarray < <(tr -d \\r < "$myfile")
>
> And replace constructs like this:
>
> while read -r line; do
> ...
> done < "$myfile"
>
> with either this:
>
> while read -r line; do
> ...
> done < <(tr -d \\r < "$myfile")
>
> or this:
>
> while read -r line; do
> line=${line%$'\r'}
> ...
> done < "$myfile"
>
> > The short term (Dirty, but very quick) solution was to add dos2unix pipe
> > when reading the files.
>
> dos2unix wants to "edit" the files in place.  It's not a filter.
> I'd steer clear of dos2unix, unless that's what you truly want.  Also,
> dos2unix isn't a standard utility, so it might not even be present on
> the target system.
>
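Greg's point about dos2unix generalizes: CR stripping is easy to do as a plain stdin-to-stdout filter with only POSIX utilities. A minimal sketch (the function names are illustrative, not an established idiom):

```shell
# CR stripping as a plain stdin->stdout filter, using only POSIX utilities.

strip_cr() {
    # Deletes every CR in the stream; fine for text with no embedded \r.
    tr -d '\r'
}

strip_cr_eol() {
    # Removes only a CR that sits immediately before the newline. The
    # $'\r' ANSI-C quoting embeds a literal CR, so this works even where
    # sed does not understand the \r escape.
    sed $'s/\r$//'
}

printf 'one\r\ntwo\r\n' | strip_cr
```

Either function can replace the `tr -d \\r < "$myfile"` process substitutions above when the same filter is needed in several places.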


Handling files with CRLF line ending

2022-12-03 Thread Yair Lenga
Hi,

I was recently asked to deploy a bash/Python-based solution to Windows
(WSL2). The solution was developed on Linux. Bash is being used as glue
to connect the Python-based data processing (pipes, files, ...). Everything
works as expected, with one small BUT: files created by Python cannot be read
by bash `read` and `readarray`.

The root cause is the CRLF line ending ("\r\n") - Python on Windows uses
the platform's CRLF line ending (as opposed to the LF line ending on Linux).
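To make the failure mode concrete, here is a minimal reproduction (illustrative only): `read -r` keeps the CR as part of the last field, so comparisons quietly fail.

```shell
# `read -r` does not strip the CR, so the variable holds "hello" plus a
# trailing CR and the comparison fails.
printf 'hello\r\n' | {
    read -r line
    if [ "$line" = "hello" ]; then
        echo "match"
    else
        echo "no match: got ${#line} characters"   # 6 characters, not 5
    fi
}
```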

The short-term (dirty, but very quick) solution was to add a dos2unix pipe
when reading the files. However, I'm wondering about a better solution:
adding "autocrlf" to basic input/output.

Basically, a new option "autocrlf" (set -o autocrlf), which would allow bash
scripts (under Unix and Windows) to work with text files that have CRLF
line endings. The goal should be to minimize the number of changes that are
needed.

A possibly better alternative would be an environment variable
("BASH_AUTOCRLF"?) that does the same, with the advantage that it
would be inherited by subprocesses, making it possible to activate the
behavior without having to modify the actual code. A huge plus in certain
situations.

Specifically:
* For "read": remove the CR before the line ending if autocrlf is on.
* For "readarray": the '-t' option should also strip CRLF line endings if
autocrlf is on.
* No impact on other commands, in particular echo/printf. It's impossible
to know whether a specific printf/echo should produce CRLF, as it is unknown
whether the program that will read the data can handle CRLF line endings.
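As a thought experiment, the proposed `read` behavior can be emulated today with a wrapper function. This is my own sketch (the name read_crlf and the nameref trick are not part of any proposal) and it only handles the common `read -r var` shape:

```shell
# Sketch of the proposed autocrlf semantics for the common `read -r var`
# pattern: read normally, then strip one trailing CR from the target
# variable. Requires bash 4.3+ for namerefs; breaks if the caller's
# variable is itself named __target.
read_crlf() {
    local -n __target=${!#}     # nameref to the last argument (the variable name)
    builtin read "$@" || return
    __target=${__target%$'\r'}
}

while read_crlf -r line; do
    printf '%s\n' "$line"
done < <(printf 'one\r\ntwo\r\n')
```

Scripts would still need `read` changed to `read_crlf`, which is exactly the kind of edit the proposal is trying to avoid; a shell option or environment variable would need no edits at all.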

Feedback/comments welcome.

Yair


Re: Supporting structured data (was: Re: bug-bash Digest, Vol 238, Issue 2)

2022-09-07 Thread Yair Lenga
Another comment:

While it's important to use "natural" access, I believe it is OK to have a
command to set values inside the h-value. It does not have to be supported as
part of …=… , which has a lot of history, rules, interactions with env vars, etc. I
think something like:

hset var.foo.bar=value
hset var.{complex.$x}=value

are OK. It does not have to be hset - I just borrowed the name from Redis. :-). Having a
separate command can simplify the implementation - less risk of breaking existing code.
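For comparison, a separate-command interface can already be mocked up in pure bash over an associative array with dotted keys. hset/hget here are my illustrative stand-ins, taking the array name as an explicit first argument rather than extending assignment syntax:

```shell
# Mock-up of an hset-style command pair over an ordinary associative
# array keyed by dotted paths. Requires bash 4.3+ for namerefs.
hset() {    # hset <array-name> <dotted.key> <value>
    local -n __h=$1
    __h[$2]=$3
}

hget() {    # hget <array-name> <dotted.key>
    local -n __h=$1
    printf '%s\n' "${__h[$2]}"
}

declare -A var=()
hset var foo.bar value
hget var foo.bar    # prints "value"
```

The command form sidesteps the parsing questions entirely; the open question is only whether `${var.foo.bar}`-style *read* access is still wanted on top of it.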

Yair

Sent from my iPad

> On Sep 7, 2022, at 3:19 AM, Martin D Kealey  wrote:
> 
> So may I suggest a compromise syntax: take the ${var.field} notation from 
> Javascript, and the {var.$key} as above, but also extend it to allow 
> ${var.{WORD}} (which mimics the pattern of allowing both $var and ${var}) 
> with the WORD treated as if double-quoted. Then we can write 
> ${var.{complex.$key/$expansion}} and var.{complex.$key/$expansion}=value, 
> which are much more reasonable propositions for parsing and reading



Re: Supporting structured data (was: Re: bug-bash Digest, Vol 238, Issue 2)

2022-09-07 Thread Yair Lenga
Thanks for providing feedback and expanding with new ideas.

I believe the summary is:

${a.key1.key2} - Static fields
${a.key1.$key2} - Mixed dynamic/static,  simple substitution.
${a.key1.{complex.$key2}} - For complex keys that may contain anything
${a.key1[expr].key2} - expr is evaluated in numeric context

Did I get it right?

Yair

On Wed, Sep 7, 2022 at 3:19 AM Martin D Kealey 
wrote:

> Some things do indeed come down to personal preference, where there are no
> right answers. Then Chet or his successor gets to pick.
>
> Keep in mind that most or all of my suggestions are gated on not being in
> backwards-compatibility mode, and that compat mode itself would be
> lexically scoped. With that in mind, I consider that we're free to *stop*
> requiring existing antipatterns that are harmful to comprehension or
> stability.
>
> I would choose to make parsing numeric expressions happen at the same time
> as parsing whole statements, not as a secondary parser that's always
> deferred until runtime. This would improve unit testing and debugging,
> starting with bash -n being able to complain about syntax errors in
> expressions. (Yes that precludes $(( x $op y )) unless you're in compat
> mode.)
>
> On Mon, 5 Sept 2022 at 19:55, Yair Lenga  wrote:
>
>> Personally, I think adopting Javascript/Python like approach (${a.b.c} )
>> is preferred over using Perl approach ( ${a{b}{c}} ), or sticking with the
>> existing bash approach. The main reason is that it is better to have a
>> syntax similar/compatible with current/future directions, and not the past.
>>
>
> By having var['key'] and var.key as synonyms, Javascript already sets the
> precedent of allowing multiple ways to do the same thing.
>
> PPS: I'm under no illusions that it will take a LOT of work to move Bash
> this far. But we'll never get there if we keep taking steps in the opposite
> direction, piling on ever more stuff that has to be accounted for in
> "compat" mode.
>


Re: bug-bash Digest, Vol 238, Issue 2

2022-09-05 Thread Yair Lenga
Martin brings up several good points, and I think it's worth figuring out
the direction of the implementation. Bash currently does not have good
syntax for h-values, so a new one is needed. It does not make sense to
invent a completely new one, as there are a few accepted syntaxes - Python,
JavaScript, Perl, and Java, to name a few. Ideally, bash will decide on a
direction and then apply bash-style changes.

Wanted to emphasize - those are my own preferences. There is NO right
answer here.

Personally, I think adopting Javascript/Python like approach (${a.b.c} ) is
preferred over using Perl approach ( ${a{b}{c}} ), or sticking with the
existing bash approach. The main reason is that it is better to have a
syntax similar/compatible with current/future directions, and not the past.
JavaScript/Python knowledge is much more common nowadays than Perl or Bash
arrays (regular and associative). Using '.' is also in line with many
scripting/compiled languages - C/C++/Java/Groovy/Lua all support '.' for
selecting fields from a structure - developers with this background will feel
comfortable with this grammar.

I believe supporting bracket notation can be a plus. One tricky issue: is
the content inside the brackets (${a[index_expression]}) evaluated or not?
Martin highlighted some backward-compatibility issues; my vote goes to
evaluating the expressions.

Bottom line - IMHO - there is no point in joining a losing camp (Perl), or
creating a "bash" camp.

That brings up the second question of bash adaptations. My own opinion is
that it would be great to support multiple approaches:
* a.$b.$c - bash-style substitution; should include ANY substitution,
including command, arithmetic, ...
* a[expr1][expr2] - a nice alternative, where expr1 and expr2 are
evaluated/interpolated.
* a.$b[$expr] - mixed usage?

As far as supporting '.' in the key, I do not see this as a major issue.
For me, the main goal is to support lightweight structure-like values, not
deep hash tables. Bash will not be the preferred solution for complex
processing of data structures, even if it can support '.' in the
key.

Needless to say - those are my own preferences. There is NO right answer
here.

Yair

On Mon, Sep 5, 2022 at 4:15 AM Martin D Kealey 
wrote:

> Rather than var[i1.i2.i3] I suggest a C-like var[i1][i2][i3] as that
> avoids ambiguity for associative arrays whose keys might include ".", and
> makes it simpler to add floating point arithmetic later.
>
> I would like to allow space in the syntax to (eventually) distinguish
> between an object with a fairly fixed set of fields and a map with an
> arbitrary set of keys. Any C string - including the empty string - should
> be a valid key, but a field name should have the same structure as a
> variable name.
>
> Moreover I'm not so keen on ${var.$key}; I would rather switch the
> preferred syntax for associative arrays (maps) to a Perl-like ${var{key}}
> so that it's clear from the syntax that arithmetic evaluation should not
> occur.
>
> Then we can write ${var[index_expression].field{$key}:-$default}.
>
> Retaining var[key] for associative arrays would be one of the backwards
> compatibility options that's only available for old-style (single-level)
> lookups.
>
> These might seem like frivolous amendments, but they deeply affect future
> features; I can only highlight a few things here.
>
> Taken together they enable expression parsing to be a continuation of the
> rest of the parser, rather than a separate subsystem that has to be
> switched into and out of, and so bash -n will be able to tell you about
> syntax errors in your numeric expressions.
>
> Then there won't be separate code paths for "parse and evaluate" and
> "skip" when handling conditionals; instead there will be just "parse",
> with "evaluate" as a separate (bypassable) step. That improves reliability
> and simplifies maintenance. And caching of the parse tree could improve
> performance, if that matters.
>
> Backwards compatibility mode would attempt to parse expressions but also
> keep the literal text, so that when it later turns out that the variable is
> an assoc array, it can use that rather than the expression tree. This would
> of course suppress reporting expression syntax errors using bash -n.
>
> -Martin
>
> On Mon, 5 Sep 2022, 05:49 Yair Lenga,  wrote:
>
>> Putting aside the effort to implement, it might be important to think on
>> how the h-data structure will be used by users. For me, the common use
>> case
>> will be to implement a simple, small "record" like structure to make it
>> easier to write readable code. Bash will never be able to compete with
>> Python/Node for large scale jobs, or for performance critical services,
>> etc. However, there are many devops/cloud tasks

Re: bug-bash Digest, Vol 238, Issue 2

2022-09-04 Thread Yair Lenga
Putting aside the effort to implement, it might be important to think about
how the h-data structure will be used by users. For me, the common use case
will be to implement a simple, small "record" like structure to make it
easier to write readable code. Bash will never be able to compete with
Python/Node for large scale jobs, or for performance critical services,
etc. However, there are many devops/cloud tasks where bash + cloud CLI
(aws/google/azure) could be a good solution, eliminating the need to build
"hybrids". In that context, being able to consume, process and produce data
structures relevant to those tasks can be useful. Clearly, JSON and YAML
are the most relevant formats.

As a theoretical exercise, looking for feedback for the following, assuming
that implementation can be done. Suggesting the following:
* ${var.k1.k2.k3}  -> value   # Looks up an item via h-data,
supporting the regular modifiers ('-' for default values, '+' for
alternate, ...)
* var[k1.k2.k3]=value   # Sets a specific key, replacing
sub-documents, if any - e.g. removing any var[k1.k2.k3.*]
* var[k1.k2.k3]=(h-value)  # Sets a specific key to a new h-value
* ${var.k1.k2.k3.*}   -> h-value   # Extracts the h-value string that
represents the sub-document k1.k2.k3

The 'h-value' representation may be the same format that is currently used
by the associative array. No need to reinvent here.
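The "replace sub-documents" semantics above can be approximated today by a prefix scan over an associative array's keys. A sketch under the assumption that dotted keys encode the hierarchy (hset_node is my name, not part of the proposal):

```shell
# Approximation of `var[k1.k2.k3]=value` with sub-document removal:
# delete every key strictly below the node, then set the node itself.
# Requires bash 4.3+ for namerefs.
hset_node() {    # hset_node <array-name> <dotted.key> <value>
    local -n __h=$1
    local key
    for key in "${!__h[@]}"; do
        # Drop descendants such as k1.k2.k3.x
        [[ $key == "$2".* ]] && unset '__h[$key]'
    done
    __h[$2]=$3
}

declare -A doc=([k1.k2.k3]=old [k1.k2.k3.x]=1 [k1.k2.k3.y]=2 [other]=kept)
hset_node doc k1.k2.k3 new
echo "${#doc[@]}"    # 2 keys remain: k1.k2.k3 and other
```

The scan is O(total keys) per assignment, which is one argument for native support rather than emulation.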

Assuming the above are implemented, the missing pieces are "converters" to
common formats: JSON, YAML, and possibly XML (yes, there is still a lot of
it out there). In theory, following the 'printf' style:
* printjson [-v var] h-value
* readjson var # or even var.k1
* printyaml [-v var] h-value
* readyaml var # or even var.k1
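Until such converters exist, the readjson direction can be prototyped with jq (assumed installed). The jq program below flattens every scalar path into a dotted key, which is my approximation of the proposed behavior, not an agreed design:

```shell
# Prototype of `readjson var`: flatten a JSON document on stdin into an
# associative array keyed by dotted paths. Requires bash 4.3+ and jq.
readjson() {    # readjson <array-name> < file.json
    local -n __h=$1
    local key val
    while IFS=$'\t' read -r key val; do
        __h[$key]=$val
    done < <(jq -r '
        paths(scalars) as $p
        | [($p | map(tostring) | join(".")), (getpath($p) | tostring)]
        | @tsv')
}

declare -A foo=()
readjson foo <<< '{"temp_f": 71.3, "wind": {"mph": 10}}'
printf 'temperature(F)=%s wind(MPH)=%s\n' "${foo[temp_f]}" "${foo[wind.mph]}"
```

Array indices become numeric path components (a.0.n), matching the layout sketched earlier in the thread; values containing tabs or newlines would need a more careful encoding than @tsv.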

To summarize:
* Using '.' to identify the hierarchy of the h-data - an extension to bash
syntax.
* Allowing a "node" to be set to a new value, or a new sub-document - may be
an extension.
* Converters to/from standard formats - can be extensions.

Looking for feedback
Yair


Date: Fri, 2 Sep 2022 09:38:35 +1000
From: Chris Dunlop 
To: Chet Ramey 
Cc: tetsu...@scope-eye.net, bug-bash@gnu.org
Subject: Hierarchical data (was: Light weight support for JSON)
Message-ID: <20220901233835.ga2826...@onthe.net.au>
Content-Type: text/plain; charset=us-ascii; format=flowed

On Wed, Aug 31, 2022 at 11:11:26AM -0400, Chet Ramey wrote:
> On 8/29/22 2:03 PM, tetsu...@scope-eye.net wrote:
>> It would also help greatly if the shell could internally handle
>> hierarchical data in variables.
>
> That's a fundamental change. There would have to be a better reason to
> make it than handling JSON.

I've only a little interest in handling JSON natively in bash (jq usually
gets me there), but I have a strong interest in handling hierarchical data
(h-data) in bash.

I admit I've only had a few cases where I've been jumping through hoops to
manage h-data in bash, but that's because, once it's clear h-data is a
natural way to manage an issue, I would normally handle the problem in
perl rather than trying to force clunky constructs into a bash script. In
perl I use h-data all the time. I'm sure if h-data were available in bash
I'd be using it all the time there as well.

Chris


>
>


Re: Light weight support for JSON

2022-08-28 Thread Yair Lenga
First, thanks for taking the time to read and provide your thoughts. This
is the real value of the discussion.

Second: I'm NOT trying to argue that there isn't a valid use for
combining bash/curl/jq, nor do I suggest adding JSON as a first-class object
to bash (Python/Node/Perl/Groovy are way ahead ...).

I hope to get feedback from other readers of the newsgroup who may find
this approach useful.

I'll take the very common use case of the AWS CLI, which produces a
JSON response for most calls. Processing the response, while possible with
jq, is challenging for many junior (and intermediate) developers. In many
cases, they fall into the traps that I mentioned above - performance
(excessive forking or fork/exec), or code that is hard to read (I've seen
really bad code combining pipes of jq/awk/sed). I'm trying to address
those cases. Almost always, they fail to properly handle values with white
space, newlines, etc.

To be practical, I'll try to follow the loadable-extension path, and see
how far I can get that way. It will possibly make sense to continue the
discussion with a concrete implementation. I believe the necessary commands
are:

json_read -a data-array -m meta-array -r root-obj
Parses stdin into data-array with the items (as described
above), and into meta-array with helper information (length, list of
properties under each node) - to help with iterating.

json_write -v variable -a data-array -m meta-array -r root
The reverse - generates the JSON for an associative array following the '.'
naming convention.

json_add -a data-array [-m meta-array] [-r root] key1=value1 key2=value2
key3=value3
A helper to add items into an associative array representing a JSON object.
Auto-detects the type, with the ability to force stringification using a
format string.
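A rough prototype of the json_write direction, flat objects only, leaning on jq for correct quoting. The command name matches the proposal; everything else (flags dropped, all values emitted as strings) is my guess at a minimal first cut:

```shell
# Flat prototype of `json_write`: emit a one-level JSON object from an
# associative array. All values come out as JSON strings (no type
# detection yet). Requires bash 4.3+ and jq. Keys and values must not
# contain tabs or newlines in this simple encoding.
json_write() {    # json_write <array-name>
    local -n __h=$1
    local key
    for key in "${!__h[@]}"; do
        printf '%s\t%s\n' "$key" "${__h[$key]}"
    done | jq -Rcs '
        split("\n")
        | map(select(length > 0) | split("\t") | {(.[0]): .[1]})
        | add // {}'
}

declare -A rec=([name]="a b" [n]=3)
json_write rec
```

The proposed -m meta-array is exactly what this sketch lacks: without type metadata there is no way to emit 3 instead of "3".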


On Sun, Aug 28, 2022 at 3:22 PM John Passaro 
wrote:

> interfacing with an external tool absolutely seems like the correct answer
> to me. a fact worth mentioning to back that up is that `jq` exists. billed
> as a sed/awk for json, it fills all the functions you'd expect such an
> external tool to have and many many more. interfacing from curl to jq to
> bash is something i do on a near daily basis.
>
> https://stedolan.github.io/jq/
>
> On Sun, Aug 28, 2022, 09:25 Yair Lenga  wrote:
>
>> Hi,
>>
>> Over the last few years, JSON data has become an integral part of processing.
>> In many cases, I find myself having to automate tasks that require
>> inspection of a JSON response, and in a few cases, construction of JSON. So
>> far, I've taken one of two approaches:
>> * For simple parsing, using 'jq' to extract elements of the JSON
>> * For more complex tasks, switching to Python or JavaScript.
>>
>> Wanted to get feedback about the following "extensions" to bash that will
>> make it easier to work with simple JSON object. To emphasize, the goal is
>> NOT to "compete" with Python/Javascript (and other full scale language) -
>> just to make it easier to build bash scripts that cover the very common
>> use
>> case of submitting REST requests with curl (checking results, etc), and to
>> perform simple processing of JSON files.
>>
>> Proposal:
>> * Minimal - Lightweight "json parser" that will convert JSON files to bash
>> associative array (see below)
>> * Convert bash associative array to JSON
>>
>> To the extent possible, prefer to borrow from jsonpath syntax.
>>
>> Parsing JSON into an associative array.
>>
>> Consider the following, showing all possible JSON values (boolean, number,
>> string, object and array).
>> {
>> "b": false,
>> "n": 10.2,
>> "s": "foobar",
>>  "x": null,
>> "o" : { "n": 10.2,  "s": "xyz" },
>>  "a": [
>>  { "n": 10.2,  "s": "abc", "x": false },
>>  { "n": 10.2,  "s": "def", "x": true }
>>  ]
>> }
>>
>> This should be converted into the following array:
>>
>> -
>>
>> # Top level
>> [_length] = 6    # Number of keys in object/array
>> [_keys] = b n s x o a    # Direct keys
>> [b] = false
>> [n] = 10.2
>> [s] = foobar
>> [x] = null
>>
>> # This is object 'o'
>> [o._length] = 2
>> [o._keys] = n s
>> [o.n] = 10.2
>> [o.s] = xyz
>>
>> # Array 'a'
>> [a._count] =  2   # Number of elements in array
>>
>> # Element a[0] (object)
>> [a.0._length] = 3
>> [a.0._keys] = n s x
>> [a.0.n] = 10.2
>> [a.0.s] = abc
>> [a.0.x] = false
>>
>> ---

Re: Light weight support for JSON

2022-08-28 Thread Yair Lenga
I do not think that JSON (and REST) are a "data exchange format of the
month". Those are established formats that are here to stay, like YAML.
They are "cornerstones" of cloud computing/configuration. I do not have to
argue for them; they can speak for themselves.

As for using external utilities, the main issues are:
* Performance - processing data inside the bash process can be 100X faster
than using external tools; the fork/exec is expensive. To emphasize, the
intention is not to build ETL processes with bash - those should still use
dedicated tools (or Python, or frameworks).
* Readability - each tool has its own syntax, escapes, etc. The final
result of mixing jq and bash is not pretty (just look up jq/bash questions
on Stack Overflow).
* It is not easy to construct valid JSON documents with bash by
concatenating strings. Many other tools used for automation have
support to ensure correctness; it would be nice to have the same - it would
make bash more useful for the proper use cases.
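The third point can be shown in two lines: naive concatenation produces invalid JSON the moment a value contains a quote, while jq's --arg escaping keeps the document valid. A minimal illustration, assuming jq is available:

```shell
v='say "hi"'

# Naive concatenation: the inner quotes are not escaped, so this is not
# valid JSON.
bad="{\"msg\": \"$v\"}"
jq -e . <<< "$bad" >/dev/null 2>&1 || echo "bad is invalid JSON"

# jq escapes the value for us (-c for compact output).
good=$(jq -cn --arg msg "$v" '{msg: $msg}')
echo "$good"    # {"msg":"say \"hi\""}
```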

Having them as a loadable extension seems like a good practical solution.
They do not have to be "built-in".

Yair

On Sun, Aug 28, 2022 at 2:11 PM Lawrence Velázquez  wrote:

> On Sun, Aug 28, 2022, at 9:24 AM, Yair Lenga wrote:
> > Wanted to get feedback about the following "extensions" to bash that will
> > make it easier to work with simple JSON object. To emphasize, the goal is
> > NOT to "compete" with Python/Javascript (and other full scale language) -
> > just to make it easier to build bash scripts that cover the very common
> use
> > case of submitting REST requests with curl (checking results, etc), and
> to
> > perform simple processing of JSON files.
>
> I do not think bash needs to sprout functionality to support every
> data-exchange format of the month.  A loadable module might be okay,
> I guess.
>
> Why are people so allergic to just using specific utilities for
> specific tasks, as appropriate?  (This question is rhetorical.
> Please do not respond with an impassioned plea about why JSON is
> so special that it deserves first-class shell support.  It's not.)
>
> --
> vq
>


Re: bug-bash Digest, Vol 237, Issue 30

2022-08-28 Thread Yair Lenga
Yes, you are correct - (most/all) of those examples are "OK".

However, given bash's important role in modern computing - isn't it time to
take advantage of new language features? This can make code more readable,
efficient, and reliable. Users who are on old platforms are most likely
using a "snapshot" of tools - e.g., old gcc, make, etc. I doubt that
many users are trying to install a new bash on a system that was
built/configured 15 years ago.

Many Java/Python/C++ projects that want to move forward do it as part of
a "major" release, in which they indicate that Java 7 (or Java 8) support
will be phased out. The same goes for C++ and Python.


On Sun, Aug 28, 2022 at 12:00 PM  wrote:

> Send bug-bash mailing list submissions to
> bug-bash@gnu.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.gnu.org/mailman/listinfo/bug-bash
> or, via email, send a message with subject or body 'help' to
> bug-bash-requ...@gnu.org
>
> You can reach the person managing the list at
> bug-bash-ow...@gnu.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of bug-bash digest..."
>
>
> Today's Topics:
>
>1. Bash Coding style - Adopting C99 declarations (Yair Lenga)
>2. Re: Light weight support for JSON (Yair Lenga)
>3. Re: Bash Coding style - Adopting C99 declarations (Greg Wooledge)
>
>
> ------
>
> Message: 1
> Date: Sun, 28 Aug 2022 10:47:38 -0400
> From: Yair Lenga 
> To: bug-bash 
> Subject: Bash Coding style - Adopting C99 declarations
> Message-ID:
>  io-antwp9vxaxjvac0elnp2tm4...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Hi,
>
> I've noticed Bash code uses "old-style" C89 declarations:
> * Parameters are separated from the prototype
> * Variables declared only at the beginning of the function
> * No mixed declaration/statements
> * No block local variables
>
> intmax_t
> evalexp (expr, flags, validp)
>  char *expr;
>  int flags;
>  int *validp;
> {
>   intmax_t val;
>   int c;
>   procenv_t oevalbuf;
>
>   val = 0;
>   noeval = 0;
>   already_expanded = (flags & EXP_EXPANDED);
>
>
> ---
> Curious as to the motivation of sticking to this standard for new
> development/features. Specifically, is there a requirement to keep bash
> compatible with C89 ? I believe some of those practices are discouraged
> nowadays.
>
> Yair
>
>
> --
>
> Message: 2
> Date: Sun, 28 Aug 2022 10:51:33 -0400
> From: Yair Lenga 
> To: Alex fxmbsw7 Ratchev 
> Cc: bug-bash 
> Subject: Re: Light weight support for JSON
> Message-ID:
> <
> cak3_kppv5xnwbctxacmktvgqahegubm1y7bowa7j6ygpvwo...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> Interesting point. Using (optional) separate array can also address the
> problem of "types" - knowing which values are quoted, and which one are
> not. This can also provide enough metadata to convert modified associative
> table back to JSON.
>
> On Sun, Aug 28, 2022 at 9:51 AM Alex fxmbsw7 Ratchev 
> wrote:
>
> >
> >
> > On Sun, Aug 28, 2022, 15:46 Yair Lenga  wrote:
> >
> >> Sorry for not being clear. I'm looking for feedback. The solution that I
> >> have is using python to read the JSON, and generate the commands to
> build
> >> the associative array. Will have to rewrite in "C"/submit if there is
> >> positive feedback from others readers. Yair.
> >>
> >
> > ah, cool
> > i just have a suggestion, .. to store the keys in a separate array, space
> > safe
> >
> > On Sun, Aug 28, 2022 at 9:42 AM Alex fxmbsw7 Ratchev 
> >> wrote:
> >>
> >>>
> >>>
> >>> On Sun, Aug 28, 2022, 15:25 Yair Lenga  wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Over the last few years, JSON data becomes a integral part of
> >>>> processing.
> >>>> In many cases, I find myself having to automate tasks that require
> >>>> inspection of JSON response, and in few cases, construction of JSON.
> So
> >>>> far, I've taken one of two approaches:
> >>>> * For simple parsing, using 'jq' to extract elements of the JSON
> >>>> * For more complex tasks, switching to python or Javascript.
> >>>>
> >>>> Wanted to get feedback about the following "extensions" to bash that
> >>

Re: Light weight support for JSON

2022-08-28 Thread Yair Lenga
Interesting point. Using an (optional) separate array can also address the
problem of "types" - knowing which values are quoted and which ones are
not. This can also provide enough metadata to convert a modified
associative table back to JSON.

On Sun, Aug 28, 2022 at 9:51 AM Alex fxmbsw7 Ratchev 
wrote:

>
>
> On Sun, Aug 28, 2022, 15:46 Yair Lenga  wrote:
>
>> Sorry for not being clear. I'm looking for feedback. The solution that I
>> have is using python to read the JSON, and generate the commands to build
>> the associative array. Will have to rewrite in "C"/submit if there is
>> positive feedback from others readers. Yair.
>>
>
> ah, cool
> i just have a suggestion, .. to store the keys in a separate array, space
> safe
>
> On Sun, Aug 28, 2022 at 9:42 AM Alex fxmbsw7 Ratchev 
>> wrote:
>>
>>>
>>>
>>> On Sun, Aug 28, 2022, 15:25 Yair Lenga  wrote:
>>>
>>>> Hi,
>>>>
>>>> Over the last few years, JSON data becomes a integral part of
>>>> processing.
>>>> In many cases, I find myself having to automate tasks that require
>>>> inspection of JSON response, and in few cases, construction of JSON. So
>>>> far, I've taken one of two approaches:
>>>> * For simple parsing, using 'jq' to extract elements of the JSON
>>>> * For more complex tasks, switching to python or Javascript.
>>>>
>>>> Wanted to get feedback about the following "extensions" to bash that
>>>> will
>>>> make it easier to work with simple JSON object. To emphasize, the goal
>>>> is
>>>> NOT to "compete" with Python/Javascript (and other full scale language)
>>>> -
>>>> just to make it easier to build bash scripts that cover the very common
>>>> use
>>>> case of submitting REST requests with curl (checking results, etc), and
>>>> to
>>>> perform simple processing of JSON files.
>>>>
>>>> Proposal:
>>>> * Minimal - Lightweight "json parser" that will convert JSON files to
>>>> bash
>>>> associative array (see below)
>>>> * Convert bash associative array to JSON
>>>>
>>>> To the extent possible, prefer to borrow from jsonpath syntax.
>>>>
>>>> Parsing JSON into an associative array.
>>>>
>>>> Consider the following, showing all possible JSON values (boolean,
>>>> number,
>>>> string, object and array).
>>>> {
>>>> "b": false,
>>>> "n": 10.2,
>>>> "s": "foobar",
>>>>  "x": null,
>>>> "o" : { "n": 10.2,  "s": "xyz" },
>>>>  "a": [
>>>>  { "n": 10.2,  "s": "abc", "x": false },
>>>>  { "n": 10.2,  "s": "def", "x": true }
>>>>  ]
>>>> }
>>>>
>>>> This should be converted into the following array:
>>>>
>>>> -
>>>>
>>>> # Top level
>>>> [_length] = 6# Number of keys in
>>>> object/array
>>>> [_keys] = b n s x o a# Direct keys
>>>> [b] = false
>>>> [n] = 10.2
>>>> [s] = foobar
>>>> [x] = null
>>>>
>>>> # This is object 'o'
>>>> [o._length] = 2
>>>> [o._keys] = n s
>>>> [o.n] = 10.2
>>>> [o.s] = xyz
>>>>
>>>> # Array 'a'
>>>> [a._count] =  2   # Number of elements in array
>>>>
>>>> # Element a[0] (object)
>>>> [a.0._length] = 3
>>>> [a.0._keys] = n s x
>>>> [a.0.n] = 10.2
>>>> [a.0.s] = abc
>>>> [a.0.x] = false
>>>>
>>>> -
>>>>
>>>> I hope that example above is sufficient. There are few other items that
>>>> are
>>>> worth exploring - e.g., how to store the type (specifically, separate
>>>> the
>>>> quoted strings vs value so that "5.2" is different than 5.2, and "null"
>>>> is
>>>> different from null.
>>>>
>>>
>>> did you forget to send the script along ? or am i completly loss
>>>
>>> a small thing i saw, a flat _keys doesnt do the job..
>>>
>>> I will leave the second part to a different post, once I have some
>>>> feedback. I have some prototype that i've written in python - POC - that
>>>> make it possible to write things like
>>>>
>>>> declare -a foo
>>>> curl http://www.api.com/weather/US/10013 | readjson foo
>>>>
>>>> printf "temperature(F) : %.1f Wind(MPH)=%d" ${foo[temp_f]}, ${foo[wind]}
>>>>
>>>> Yair
>>>>
>>>


Bash Coding style - Adopting C99 declarations

2022-08-28 Thread Yair Lenga
Hi,

I've noticed Bash code uses "old-style" C89 declarations:
* Parameters are separated from the prototype
* Variables declared only at the beginning of the function
* No mixed declaration/statements
* No block local variables

intmax_t
evalexp (expr, flags, validp)
 char *expr;
 int flags;
 int *validp;
{
  intmax_t val;
  int c;
  procenv_t oevalbuf;

  val = 0;
  noeval = 0;
  already_expanded = (flags & EXP_EXPANDED);


---
Curious as to the motivation for sticking to this style for new
development/features. Specifically, is there a requirement to keep bash
compatible with C89? I believe some of those practices are discouraged
nowadays.

Yair


Re: Light weight support for JSON

2022-08-28 Thread Yair Lenga
Sorry for not being clear. I'm looking for feedback. The solution that I
have uses Python to read the JSON and generate the commands to build
the associative array. I will have to rewrite it in C and submit it if there
is positive feedback from other readers. Yair.

On Sun, Aug 28, 2022 at 9:42 AM Alex fxmbsw7 Ratchev 
wrote:

>
>
> On Sun, Aug 28, 2022, 15:25 Yair Lenga  wrote:
>
>> Hi,
>>
>> Over the last few years, JSON data has become an integral part of processing.
>> In many cases, I find myself having to automate tasks that require
>> inspection of a JSON response, and in a few cases, construction of JSON. So
>> far, I've taken one of two approaches:
>> * For simple parsing, using 'jq' to extract elements of the JSON
>> * For more complex tasks, switching to python or Javascript.
>>
>> Wanted to get feedback about the following "extensions" to bash that will
>> make it easier to work with simple JSON objects. To emphasize, the goal is
>> NOT to "compete" with Python/Javascript (and other full-scale languages) -
>> just to make it easier to build bash scripts that cover the very common
>> use
>> case of submitting REST requests with curl (checking results, etc.), and to
>> perform simple processing of JSON files.
>>
>> Proposal:
>> * Minimal - Lightweight "json parser" that will convert JSON files to bash
>> associative array (see below)
>> * Convert bash associative array to JSON
>>
>> To the extent possible, prefer to borrow from jsonpath syntax.
>>
>> Parsing JSON into an associative array.
>>
>> Consider the following, showing all possible JSON values (boolean, number,
>> string, object and array).
>> {
>>   "b": false,
>>   "n": 10.2,
>>   "s": "foobar",
>>   "x": null,
>>   "o": { "n": 10.2, "s": "xyz" },
>>   "a": [
>>     { "n": 10.2, "s": "abc", "x": false },
>>     { "n": 10.2, "s": "def", "x": true }
>>   ]
>> }
>>
>> This should be converted into the following array:
>>
>> -
>>
>> # Top level
>> [_length] = 6    # Number of keys in object/array
>> [_keys] = b n s x o a    # Direct keys
>> [b] = false
>> [n] = 10.2
>> [s] = foobar
>> [x] = null
>>
>> # This is object 'o'
>> [o._length] = 2
>> [o._keys] = n s
>> [o.n] = 10.2
>> [o.s] = xyz
>>
>> # Array 'a'
>> [a._count] =  2   # Number of elements in array
>>
>> # Element a[0] (object)
>> [a.0._length] = 3
>> [a.0._keys] = n s x
>> [a.0.n] = 10.2
>> [a.0.s] = abc
>> [a.0.x] = false
>>
>> -
>>
>> I hope that example above is sufficient. There are few other items that
>> are
>> worth exploring - e.g., how to store the type (specifically, separate the
>> quoted strings vs value so that "5.2" is different than 5.2, and "null" is
>> different from null.
>>
>
> did you forget to send the script along ? or am i completly loss
>
> a small thing i saw, a flat _keys doesnt do the job..
>
> I will leave the second part to a different post, once I have some
>> feedback. I have some prototype that i've written in python - POC - that
>> make it possible to write things like
>>
>> declare -A foo
>> curl http://www.api.com/weather/US/10013 | readjson foo
>>
>> printf "temperature(F): %.1f Wind(MPH): %d\n" "${foo[temp_f]}" "${foo[wind]}"
>>
>> Yair
>>
>


Light weight support for JSON

2022-08-28 Thread Yair Lenga
Hi,

Over the last few years, JSON data has become an integral part of processing.
In many cases, I find myself having to automate tasks that require
inspection of a JSON response, and in a few cases, construction of JSON. So
far, I've taken one of two approaches:
* For simple parsing, using 'jq' to extract elements of the JSON
* For more complex tasks, switching to python or Javascript.

Wanted to get feedback about the following "extensions" to bash that will
make it easier to work with simple JSON objects. To emphasize, the goal is
NOT to "compete" with Python/Javascript (and other full-scale languages) -
just to make it easier to build bash scripts that cover the very common use
case of submitting REST requests with curl (checking results, etc.), and to
perform simple processing of JSON files.

Proposal:
* Minimal - a lightweight "JSON parser" that will convert JSON files to a bash
associative array (see below)
* Convert a bash associative array to JSON

To the extent possible, prefer to borrow from jsonpath syntax.

Parsing JSON into an associative array.

Consider the following, showing all possible JSON value types (boolean,
number, string, null, object and array).
{
  "b": false,
  "n": 10.2,
  "s": "foobar",
  "x": null,
  "o": { "n": 10.2, "s": "xyz" },
  "a": [
    { "n": 10.2, "s": "abc", "x": false },
    { "n": 10.2, "s": "def", "x": true }
  ]
}

This should be converted into the following array:

-

# Top level
[_length] = 6    # Number of keys in object/array
[_keys] = b n s x o a    # Direct keys
[b] = false
[n] = 10.2
[s] = foobar
[x] = null

# This is object 'o'
[o._length] = 2
[o._keys] = n s
[o.n] = 10.2
[o.s] = xyz

# Array 'a'
[a._count] =  2   # Number of elements in array

# Element a[0] (object)
[a.0._length] = 3
[a.0._keys] = n s x
[a.0.n] = 10.2
[a.0.s] = abc
[a.0.x] = false

-

I hope that example above is sufficient. There are few other items that are
worth exploring - e.g., how to store the type (specifically, separate the
quoted strings vs value so that "5.2" is different than 5.2, and "null" is
different from null.
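To make the proposed layout concrete, here is a hand-populated associative array matching the example above — not a parser, just the target data structure a hypothetical "readjson" helper would produce, showing that jsonpath-like lookups become plain subscripting:

```shell
# Hand-populated associative array following the proposed key scheme.
declare -A json=(
    [_length]=6  [_keys]="b n s x o a"
    [b]=false [n]=10.2 [s]=foobar [x]=null
    [o._length]=2 [o._keys]="n s" [o.n]=10.2 [o.s]=xyz
    [a._count]=2
    [a.0._length]=3 [a.0._keys]="n s x" [a.0.n]=10.2 [a.0.s]=abc [a.0.x]=false
)

# jsonpath-style lookups are plain subscript lookups:
echo "o.s=${json[o.s]} a.0.n=${json[a.0.n]}"

# Iterating an object's direct keys via the _keys entry:
for k in ${json[o._keys]}; do
    echo "o.$k=${json[o.$k]}"
done
```

Note that an associative array (`declare -A`) is required here; an indexed array (`declare -a`) would evaluate keys like `o.s` arithmetically.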

I will leave the second part to a different post, once I have some
feedback. I have a prototype that I've written in python - a POC - that
makes it possible to write things like

declare -A foo
curl http://www.api.com/weather/US/10013 | readjson foo

printf "temperature(F): %.1f Wind(MPH): %d\n" "${foo[temp_f]}" "${foo[wind]}"

Yair


Re: Revisiting Error handling (errexit)

2022-07-12 Thread Yair Lenga
(typo correction).
Thanks for sharing your thoughts. I admit that my goals are
significantly less ambitious compared with what you described (lexical
scope, etc.). I do not think that it's possible to stretch my proposal to
meet all the use cases you describe. For me, the 'errfail' option is similar
to the 'pipefail' option - a practical solution for real problems. The
suggested 'errfail' is opt-in - anyone who wants the old way (errexit) can
use it without saying anything. As you said, errexit was not a 'good'
solution when conceived - no point in trying to match it (IMHO).

Yair

On Tue, Jul 12, 2022 at 6:08 PM Martin D Kealey 
wrote:

>
>
> On Sun, 10 Jul 2022 at 05:39, Yair Lenga  wrote:
>
>> Re: command prefaced by ! which is important:
>> * The '!' operator 'normal' behavior is to reverse the exit status of a
>> command ('if ! check-something ; then ...').
>>
>
>


Re: Revisiting Error handling (errexit)

2022-07-12 Thread Yair Lenga
Thanks for sharing your thoughts. I admit that my goals are
significantly less ambitious compared with what you described (lexical
scope, etc.). I do not think that it's possible to stretch my proposal to
meet all the use cases you describe. For me, the 'errfail' option is similar
to the 'pipefail' option - a practical solution for real problems. The
suggested 'errfail' is opt-in - anyone who wants the old way (errexit) can
use it without saying anything. As you said, errexit was not a 'good'
solution when conceived - no point in trying to match it (IMHO).

Yair

On Tue, Jul 12, 2022 at 6:08 PM Martin D Kealey 
wrote:

>
>
> On Sun, 10 Jul 2022 at 05:39, Yair Lenga  wrote:
>
>> Re: command prefaced by ! which is important:
>> * The '!' operator 'normal' behavior is to reverse the exit status of a
>> command ('if ! check-something ; then ...').
>>
>
> Unless that status is ignored, in which case, well, it's still ignored.
>
>
>> * I do not think it's a good idea to change the meaning of '!' when
>> running with 'error checking'.
>> * I think that the existing structures ('|| true', or '|| :') to force
>> success status are good enough and well understood by beginner and advanced
>> developers.
>>
>
> I'm not suggesting a change; rather I'm suggesting that your new errfail
> should honour the existing rule for "!" (as per POSIX.1-2008 [
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html],
> under the description of the "set" built-in):
>
> 2. The *-e* setting shall be ignored when executing the compound list
>> following the *while*, *until*, *if*, or *elif* reserved word, *a
>> pipeline beginning with the ! reserved word*, or any command of an
>> AND-OR list other than the last.
>>
>
> So the exit status of a command starting with "!" (being the inverse of
> the command it prefaces) is *not* considered by errexit, regardless of
> whether it in turn is "tested".
>
> It follows that
>
>>  ! (( expression ))
>>
> and
>
>>  (( expression )) || true
>>
> are equivalent under errexit; the former is the preferred idiom in some
> places precisely because it is expressly called out in that clause of the
> standard.
>
> If you propose to make the former unsafe under errfail, then I suggest
> that the onus is on you to explain why you would break code where the
> author has clearly indicated that its exit status should not be taken as
> indicating success or failure.
>
> Question: What is the penalty for "|| true" ? true is bash builtin, and in
>> bash it's equivalent to ':'. (see:
>> https://unix.stackexchange.com/questions/34677/what-is-the-difference-between-and-true
>> )
>>
>
> Even if there were no performance difference (and I'll admit, the
> performance difference is very small), there's the visual clutter and the
> cognitive load this places on subsequent maintainers. (One can adopt a
> strategy of pushing the "|| true" off to the right with lots of whitespace,
> but then there is the converse problem that any change to the expression is
> just that bit harder to read in "git diff".)
>
> Re: yet another global setting is perpetuating the "wrong direction".
>> Most other scripting solutions that I'm familiar with are using dynamic
>> (rather than lexical) scoping for 'try ... catch.'.
>>
>
> You're quite right that throw+catch is dynamically scoped, though
> try+catch is of course a lexically scoped block.
>
> The problem here is that you're retrofitting; in effect, you're making a
> global declaration that *removes* an implicit try+catch+ignore around
> every existing statement; *that's* what I want to have under
> lexically-scoped control.
>
> Considering that bash is stable, I do not think that it is realistic to
>> try to expect major changes to 'bash'.
>>
>
> Expect, maybe not. Hope for? certainly. Fork it and do it myself? I'll
> think about it.
>
>
>> For a more sophisticated environment, python, groovy, javascript or (your
>> favorite choice) might be a better solution.
>>
>
> Agreed that there are better languages for most complex tasks.
>
> However they're not as pervasively available as Bash; the only language
> that comes close to the same availability is Perl, and much as I like Perl,
> it's abjectly detested by many folk. (The Shell would be *as* abjectly
> detested if people actually understood it, but they're under the delusion
> that it's "simple", and so they don't bother to hate it until it bites
> them, and by the time they understand why it bit them, they're hooked and
> can't leave even if they do hate it.)
>

Re: Revisiting Error handling (errexit)

2022-07-09 Thread Yair Lenga
Hi Martin,

Long answer - my own view: The old 'errexit' logic was spec'ed many years
ago. As far as I can tell, it existed in bash 2.0, from 1996. I think
requirements/expectations are different now. The 'exit on error' handling
was good for 1996 - it does not meet today's requirements - using bash
to glue complex systems together. For that reason, I do not think that
replicating the small details of errexit is a good strategy. In fact, I think
that most developers are looking for something that is closer to the "try
... catch" that they have in other environments. The 'errfail' option is my
proposal to get there, without introducing significant changes, or risk.

Re: command prefaced by ! which is important:
* The '!' operator 'normal' behavior is to reverse the exit status of a
command ('if ! check-something ; then ...').
* I do not think it's a good idea to change the meaning of '!' when running
with 'error checking'.
* I think that the existing structures ('|| true', or '|| :') to force
success status are good enough and well understood by beginner and advanced
developers.

Question: What is the penalty for "|| true" ? true is bash builtin, and in
bash it's equivalent to ':'. (see:
https://unix.stackexchange.com/questions/34677/what-is-the-difference-between-and-true
)
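A quick check of the claim above — both `true` and `:` are shell builtins, so either swallows a failure status without forking a process (a minimal sketch):

```shell
# Confirm both are builtins, then show either one resets the status to 0.
type -t true
type -t :
false || true ; status_true=$?
false || :    ; status_colon=$?
echo "after '|| true': $status_true ; after '|| :': $status_colon"
```

So the choice between `|| true` and `|| :` is stylistic; any performance difference is negligible.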

Re: yet another global setting is perpetuating the "wrong direction".
I prefer not to get into "flame war" on the issue of dynamic scope vs.
lexical scope. Most other scripting solutions that I'm familiar with are
using dynamic (rather than lexical) scoping for 'try ... catch.'.
Considering that bash is stable, I do not think that it is realistic to try
to expect major changes to 'bash'. Also, Bash code was developed with
specific design/requirements in mind. It can not be easily stretched to
integrate new ideas - sometimes it is better to go for practical solutions
than for the best solution. There is a big range of applications where bash
can play an important role. For a more sophisticated environment, python,
groovy, javascript or (your favorite choice) might be a better solution.

Question: Out of curiosity, can you share your idea for a better solution ?

Thanks for taking the time !
Yair


Sent from my iPad

On Jul 8, 2022, at 3:31 PM, Martin D Kealey  wrote:


The old errexit explicitly exempts any command prefaced by ! which is
important so that one can write ! (( x++ )) without it blowing up when x is
0 and without paying the penalty for  "|| true".
Does this new proposal honour that?

Aside from that, I still think that yet another global setting is
perpetuating the "wrong direction"; "local -" has the same dynamic scope as
any other "local", which means magic action at a distance, and makes
brittle code that can abort unexpectedly in production.

-Martin


Re: Revisiting Error handling (errexit)

2022-07-08 Thread Yair Lenga
Oğuz,

Thanks for taking the time to look at my proposal. I believe that you brought
up an excellent question - one that many readers will be asking themselves. I
opted for a longer answer - given the importance of this question.

While you can achieve results similar to 'errfail' with existing bash
commands (you will need more than '&&'), the required effort is not
practical:
* Consider a common scenario (for me) - a script with 1000 lines, complex
logic, commands that span multiple lines - there is no practical way to
review the scripts and add '&&' to hundreds of lines, in the hope
of not breaking anything, including scripts that generate code on the fly
(the 'evil' eval :-).
* While '&&' works for simple sequences, sometimes the functionality that
is needed requires injecting 'break', 'return', etc. See example below.
* The reality (at least for me) is that most developers who are making
changes to my bash scripts are not bash experts. In many cases, they are
wrapping functionality from other languages into a process. They do not
have the knowledge, expertise and patience to implement the above
constructs correctly.

Items #1 and #3 above are not purely technical items - but they are real
issues. Without being too philosophical, I would argue that the lack of
easy-to-implement error handling in bash (which I hope to address with
errfail) leads a lot of teams (and companies) to opt out of bash, into much
more complex solutions for simple "glue" jobs - even in cases where the
ideal solution is bash. I've seen many 100-line python scripts written to
implement a 10-line bash script - just because the team could not figure
out how to make a bash script handle errors consistently.

On the technical aspect - using '&&' vs. 'errfail': consider the earlier
example where the "copy logic" is embedded in a while (or for) loop; the
desired behavior in those cases is usually to "break" out of the enclosing
loop. To implement it, you will need something like below:

function copy-files {
for f in f1 f2 f3 ; do
cp ... && cp ... && cp ... && important-job || break
done
} ;
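A runnable variant of the pattern above — the step names are made up, standing in for the elided `cp`/`important-job` commands, so the break-on-failure behavior can be observed directly:

```shell
# The '&& chain || break' pattern: the loop stops at the first failing
# step of an iteration (the 'test' call simulates a failure at f2).
copy_files_demo() {
    local f
    for f in f1 f2 f3 ; do
        echo "start $f" && test "$f" != f2 && echo "done $f" || break
    done
}
copy_files_demo
```

Iteration f1 completes, iteration f2 fails mid-chain and breaks the loop, and f3 is never started.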

And for the original function, if you do not use errfail, you will need to
inject a 'return' in the RIGHT place (see below). I think that this level
of detailed implementation will hurt the productivity of using bash for
glue jobs.

Hope it makes sense.

Yair

function copy-files {
 # ALL files are critical; the script will not continue on failure
 cp /new/file1 /production/path/ &&
 cp /new/file2 /production/path/ &&
 # Use the files for something - e.g., count the lines.
 important-job /production/path/file1 /production/path/file2 || return $?

 ls -l /production/path | mail -s "all-good" not...@company.com
}


On Fri, Jul 8, 2022 at 1:22 PM Oğuz  wrote:

> 8 Temmuz 2022 Cuma tarihinde Yair Lenga  yazdı:
>>
>> Practical Example - real life. A job has to copy 3 critical data files. It
>> then sends notification via email (non-critical).
>>
>> #! /bin/bash
>> set -o errfail
>> function copy-files {
>>  # ALL files are critical; the script will not continue on failure
>>  cp /new/file1 /production/path/
>>  cp /new/file2 /production/path/
>>  # Use the files for something - e.g., count the lines.
>>  important-job /production/path/file1 /production/path/file2
>>
>>  ls -l /production/path | mail -s "all-good" not...@company.com ||
>> true    # Not critical
>> }
>>
>> if copy-files ; then
>>  more-critical-jobs
>>   echo "ALL GOOD"
>> else
>>   mail -s "PROBLEM" nor...@company.com < /dev/null
>> fi
>>
>> What is the difference ? Consider the case where /new/file1 does not
>> exist, which is a critical error.
>> * Without errfail, an error message will be sent to script stderr, but the
>> script will continue to copy the 2nd file, and to perform the
>> important-job, even though the data is not ready.
>
>
> How is this any better than doing `cp ... && cp ... && important-job ...'?
>
>
> --
> Oğuz
>
>


Re: Revisiting Error handling (errexit)

2022-07-08 Thread Yair Lenga
Greetings,

First, I wanted to thank all the people that took the time to provide
comments on the proposed improvements to the error handling. I believe that
the final proposal is ready for evaluation (both behavior and
implementation).

Summary:
* Proposing a new option, 'errfail' (sh -o errfail). The new option
makes it significantly easier to build production-quality scripts, where
every error condition must be handled and errors will not silently be
ignored by default. With the new option, it's possible to improve the error
handling of existing scripts, as it follows common patterns. In general,
the implementation works along the lines of the 'try ... catch'
functionality supported by most modern scripting languages (Python,
Javascript, Perl, Groovy, ... to name a few). I believe the solution
implements what the 'errexit' option should really be - abort sequences of
commands whenever an unhandled error occurs. The solution avoids (hopefully)
all the glitches that exist with the 'errexit' option - which cannot be
fixed due to backward compatibility issues.

To activate: sh -o errfail
To disable: sh +o errfail

Short summary: when 'errfail' is on (sh -o errfail), each command must
succeed; otherwise, the enclosing construct fails.
More details:
* Functions that run under 'errfail' will return immediately on the first
statement that returns a non-zero error code.
* Sequence of commands separated by ';' or new line will stop on the first
statement that returns non-zero error code.
* The while and until statements will 'break' if a command in the "body"
returns a non-zero error code.
* The for-each and arithmetic for statements will "break" if a command in
the body returns a non-zero error code.
* The top level read-eval-loop will exit if a top level command returns a
non-zero error code.
* Behavior inherited by default into sub-shells
* Possible to explicitly turn on/off in a function scope, using the 'local
-' construct.

For users who are familiar with the 'errexit' behavior and limitations:
the limitations of 'errexit' are covered extensively on Stack Overflow, on
this mailing list, and in numerous other locations - no need to rehash them.

What is an "unhandled error condition"? How are errors handled? How are
non-critical errors ignored?
* In bash, error conditions are signaled with a non-zero return code from a
command, function or external process.
* Two common ways to specify error handling are the 'if-then-else'
statement and the '||' connector (see below).
* A common method for ignoring a non-critical error is to add '|| true'
(assuming the command itself will produce a meaningful error message, if
needed).
---
# Using if-then-else
if command-that-can-fail ; then
   echo "ALL GOOD"
else
   try-to-recover               # Try to handle the error
   if FATAL ; then exit 1 ; fi  # If not possible to recover
fi

# Using the '||' connector
command-that-can-fail || try-to-recover || exit 1


# Handling non-critical errors
non-critical-command-that-can-fail || true    # silent error handling
non-critical-command-that-can-fail || :       # harder to notice the ':', but does the job
non-critical-command-that-can-fail || echo "warning doing X - continuing" >&2    # with some message


Practical Example - real life. A job has to copy 3 critical data files. It
then sends a notification via email (non-critical).

#! /bin/bash
set -o errfail
function copy-files {
 # ALL files are critical; the script will not continue on failure
 cp /new/file1 /production/path/
 cp /new/file2 /production/path/
 # Use the files for something - e.g., count the lines.
 important-job /production/path/file1 /production/path/file2

 ls -l /production/path | mail -s "all-good" not...@company.com ||
true    # Not critical
}

if copy-files ; then
 more-critical-jobs
  echo "ALL GOOD"
else
  mail -s "PROBLEM" nor...@company.com < /dev/null
fi

What is the difference ? Consider the case where /new/file1 does not
exist, which is a critical error.
* Without errfail, an error message will be sent to script stderr, but the
script will continue to copy the 2nd file, and to perform the
important-job, even though the data is not ready.
* With errexit, we hit one of the pitfalls: an error message will be sent
to the script stderr, but the script will continue, same as without
'errexit'. errexit will only get triggered for 'more-critical-jobs'.
* With errfail, copy-files will stop after the first failure in the 'cp',
it will then continue to the 'else' section to send the alert.
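The errexit pitfall described above is reproducible today: POSIX requires `set -e` to be ignored inside the condition of an `if`, so a function called there keeps running past a failed command. A minimal sketch (the function and messages are made up):

```shell
# Demonstrates the errexit pitfall: set -e is suppressed inside a
# function invoked as an 'if' condition.
set -e
copy_files_pitfall() {
    false                # stands in for a failing 'cp'
    echo "still ran"     # reached anyway, despite set -e
}
if copy_files_pitfall ; then
    echo "reported success"
fi
set +e
```

Both lines print: the failure is silently ignored, and the function's exit status ends up being that of the final `echo`.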

Thanks for taking the time to review. Patch on bash-devel attached. For
those interested: 50 lines of code, most of them are comments. 8 hours of
development, including automated test script.

Looking for advice on how to "officially" submit.

For developers who want more stylish coding:
alias try=''
alias catch='||'

try { copy-files ; }

Re: Revisiting Error handling (errexit)

2022-07-06 Thread Yair Lenga
Koichi - Thanks for highlighting this 'local -'.

This feature effectively eliminates the need to support scoped 'errfail'.
If it's needed in a limited context, the 'local -' can be used.

Yair

On Wed, Jul 6, 2022 at 1:21 PM Koichi Murase  wrote:

> On Wed, Jul 6, 2022 at 7:05 PM Yair Lenga  wrote:
> > Function foo will run with errfail logic. But bash will revert back to no
> > errfail when control revert back to bar, and zap will run WITHOUT
> errfail.
> > I remember reading about other bash setting that are automatically
> restored
> > on function exits. Can not remember which one.
>
> That should be `local -' of Bash 4.4+. E.g.,
>
> function foo { local -; set -e; ...; }
>
> --
> Koichi
>
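A small sketch of the `local -` behavior Koichi describes (requires bash 4.4+; the function name and the option-snapshot variables are made up):

```shell
# 'local -' makes shell options function-local: the 'set -e' inside foo
# is undone automatically when foo returns.
foo() {
    local -            # save shell options for the duration of foo
    set -e
    inside_opts=$-     # record options while inside foo
}
before_opts=$-
foo
after_opts=$-
echo "errexit inside foo: $([[ $inside_opts == *e* ]] && echo on || echo off)"
echo "options restored:   $([[ $before_opts == "$after_opts" ]] && echo yes || echo no)"
```

This is what allows a function to opt in to stricter error handling without leaking the setting to its caller.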


Re: Revisiting Error handling (errexit)

2022-07-06 Thread Yair Lenga
Hi. Thanks for proposal.

For my use case, where I have a large existing code base in which I want to
improve error checking and avoid unhandled errors, a per-function setting
will not work. This will also deviate from the try ... catch pattern used
in many other scripting solutions, making it hard for developers to adapt.
Asking for a large set of changes (touching each function) is not realistic
for many production environments.

That said, there might be some other use cases for limited scope. It
probably makes sense to restore errfail when a function exits, practically
allowing more scoped execution.

function foo { set -o errfail ; ... ; }

function bar { foo ; zap ; }

Function foo will run with errfail logic, but bash will revert back to no
errfail when control returns to bar, and zap will run WITHOUT errfail.
I remember reading about other bash settings that are automatically restored
on function exit. I cannot remember which one.

Another interesting option would be to limit errfail to "blocks",
allowing something like:
{ set -o errfail ; foo ; bar ; } ; baz ;

Instead of:
set -o errfail
{ foo ; bar ; } || true
set +o errfail

Reflecting on my own usage, either will work, since I am planning to apply
it globally. Looking for community feedback. I also want to mention the
scope of changes needed for each option - the goal is to keep changes to a
minimum, avoiding the risk of impact on existing scripts.

As for implementing -E, that seems like a big deviation from existing bash
style, and is likely to cause issues for developers targeting solutions
that should be somewhat portable between *sh implementations.

Yair.


On Wed, Jul 6, 2022, 11:24 AM Martin D Kealey 
wrote:

> Modules can have conflicting requirements of global settings that require
> weird contortions to interoperate; so starting out by making this yet
> another global setting seems like a retrograde step.
>
> So I would like to propose an alternative approach.
>
> Firstly, I would reduce the action to simply "return from function upon
> unchecked failure"; the issue is primarily with *invisible* behaviour at
> a distance making this difficult to debug.
>
> Secondly I would set this as a per-function attribute, rather than as a
> global setting outside the function.
>
> Perhaps
>
> function -E funcname { … }
>


Re: bug-bash Digest, Vol 236, Issue 8

2022-07-05 Thread Yair Lenga
Hi.

I agree that bash local variables are less than ideal (dynamic scope vs.
lexical scope). However, we have to use what we have. In that context, using
'main' has a lot of value - documentation, declarative style, etc.

In my projects, we use a "named" main to create reusable code (e.g., a date
calculator can expose date_calc_main, which can be called as a function
after the file is sourced). Ideal? No. Productive? Yes. Fewer globals/bugs?
Yes.
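The dynamic-scope caveat mentioned above can be seen in a short sketch — a `local` declared in `main` is visible to, and modifiable by, everything `main` calls (the `process`/`main` names are made up):

```shell
# bash 'local' is dynamically scoped: the callee sees the caller's local.
process() { echo "process sees x=$x" ; x=changed ; }
main() {
    local x=orig
    process
    echo "main now has x=$x"
}
main
```

From the point of view of anything called by `main`, the "local" variable is indistinguishable from a global; it only disappears when `main` returns.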

One day I will rewrite the code in python :-) - probably that day will
never come. It will be interesting to look at alternatives, in a different
thread. Until then, bash is my tool.

As stated before, I will extend errfail to the top level, as not
everyone uses bash in the same way (with respect to (not) placing logic at
the top level).


Yair




>
> Message: 5
> Date: Wed, 6 Jul 2022 13:23:14 +1000
> From: Martin D Kealey 
> To: Yair Lenga 
> Cc: Lawrence Velázquez , Martin D Kealey
> , bug-bash 
> Subject: Re: Revisiting Error handling (errexit)
> Message-ID:
>  0dd4vdyl...@mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> On Wed, 6 Jul 2022 at 08:34, Yair Lenga  wrote:
>
> > in general, for complex scripts, I prefer to move the ‘main’ logic into a
> > function (‘main’, ‘run’,…). This make it possible to keep clean name
> space.
> > Otherwise, all the main variables are becoming global: see below for ‘x’.
> > With many variables, it can be ugly.
> >
> > function main () {
> > local x    # x is local
> > for x in a b ; do process $x ; done
> > }
> >
> > vs.
> > # x is global; all functions will see it.
> > for x in a b ; do process $x ; done
> >
>
> Unfortunately that's not how "local" works. In both cases the variable "x"
> is visible to (and modifiable by) everything that is called from "main"; so
> everything is inside the "main" function, "local" there is quite
> indistinguishable from true global.
>
> -Martin
>
>
> --
>
> Subject: Digest Footer
>
> ___
> bug-bash mailing list
> bug-bash@gnu.org
> https://lists.gnu.org/mailman/listinfo/bug-bash
>
>
> --
>
> End of bug-bash Digest, Vol 236, Issue 8
> 
>


Re: Revisiting Error handling (errexit)

2022-07-05 Thread Yair Lenga
My opinion is that we should be looking at the expected behavior - for a
developer who wants to implement "strong" error handling: any error will break
execution until explicitly handled. This is in the same spirit as the
'try ... catch' known from JavaScript, python, groovy, etc.

So, assuming f1 does not exist:
for f in f1 f2 f3 ; do cat $f ; done || echo "bad cat"

One would expect the "exception" from "cat f1" to "break" the loop.

Same way, one would expect the "exception" in a function to "return" from
the function:
function f { cat f1 ; cat f2 ; cat f3 ; }
f || echo "bad cat"

And for a block, one would expect the block to 'unwind' (there is no
existing command in bash for early exit from a block).
{ cat f1 ; cat f2 ; cat f3 ; } || echo "bad cat"

To push the limit, using function f from above will result in the
"exception" triggering both return and break.
for x in a b ; do f ; done || echo "bad cat"

I think that all four cases, while each one matches different existing bash
command(s) (break, return, 'unwind', and return+break), are consistent with
the accepted / expected pattern for try ... catch, and most developers will
understand how to use it.

Hope it makes sense. In any case, I would like to ask for some time until
additional community input/comments are sorted out. While the final result may
be different from where we started, I hope it will be better.

Side comment: I think the following should work:
alias try='if !'
alias catch='then'

try { cat f1 ; cat f2 ; cat f3 ; }
catch { echo "bad cat" ; }
fi
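As written, the aliases also need `shopt -s expand_aliases` in a non-interactive script, and the `if !` spelling still needs a closing `fi`. A runnable sketch (the missing file path and the `caught` marker variable are made up):

```shell
# try/catch via aliases; alias expansion must be enabled in scripts.
shopt -s expand_aliases
alias try='if !'
alias catch='then'

try { cat /nonexistent-file 2>/dev/null ; }
catch { caught=yes ; echo "bad cat" ; }
fi
```

This works because bash attempts alias expansion on words in command position, which includes the position where `then` is expected after an `if` condition list.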

Yair.

Sent from my iPad

> On Jul 6, 2022, at 2:19 AM, Lawrence Velázquez  wrote:
> 
> On Tue, Jul 5, 2022, at 6:34 PM, Yair Lenga wrote:
>> I probably did not describe the intended behavior clearly. I believe 
>> both cases should behave identically under errfail. The loop will 'break' 
>> on the first iteration (false when word = a). Same for all the looping 
>> commands. I believe this is consistent with if-then-else-if, where an 
>> error in the then or else block will result in terminating ('breaking') 
>> the if. 
> 
> It's only consistent if your notion of consistency is "terminate
> all compound commands immediately".  This kind of works for "if"
> but changes "for" and "while" in a very fundamental way.  Your
> initial description of "treat a; b; c like a && b && c" implies
> that
> 
>if condition; then a; b; c; fi
> 
> should behave like
> 
>if condition; then a && b && c; fi
> 
> and
> 
>for word in words; do a; b; c; done
> 
> should behave like
> 
>for word in words; do a && b && c; done
> 
> but it turns out what you apparently want is
> 
>for word in words; do a && b && c || ! break; done
> 
> which is a larger change than you let on.
> 
> -- 
> vq



Re: Revisiting Error handling (errexit)

2022-07-05 Thread Yair Lenga



Sent from my iPad

> On Jul 6, 2022, at 1:07 AM, Lawrence Velázquez  wrote:
> 
> On Tue, Jul 5, 2022, at 5:18 PM, Yair Lenga wrote:
>> I’m not in front of my desktop, so I can not verify behavior, but here 
>> is my expectation - let me know if it make sense, in the context of the 
>> project goal - every unhandled failed statement should unwind execution 
>> up, until explicitly handled.
> 
> But you've said that this won't apply to the topmost level, so
> you've already introduced a distinction between that level and every
> other level, which will have to be explained.
> 
By top level, I refer to the REPL (read-eval-print loop). It can be handled as 
well. My personal opinion is to leave the REPL loop untouched (see below). 
However, if there is agreement that this is the right implementation, I 
will add it to the next patch set.

Why addressing the REPL is not important: in general, for complex scripts, I 
prefer to move the 'main' logic into a function ('main', 'run', ...). This 
makes it possible to keep a clean name space. Otherwise, all the main 
variables become global: see below for 'x'. With many variables, it can get ugly.

function main () {
local x    # x is local
for x in a b ; do process $x ; done
}

vs.
# x is global; all functions will see it.
for x in a b ; do process $x ; done

> 
>>> if false ; echo Foo ; [[ notempty ]] ; then echo Full ; else echo Empty ; 
>>> fi || echo Fail
>>> 
>> Output: Empty
>> If false … should fail on the first false, going to the else part, 
>> printing Empty. Since the ‘echo Empty’ succeeded, Fail will not be 
>> printed.
>> 
>>> or
>>> 
>>> if { false ; echo Foo ; [[ notempty ]] ; } ; then echo Full ; else echo 
>>> Empty ; fi || echo Fail
>>> 
>> Output: Empty
>> Same as above. Grouping does not change behavior.
>>> or
>>> 
>>> if ( false ; echo Foo ; [[ notempty ]] ) ; then echo Full ; else echo Empty 
>>> ; fi || echo Fail
>>> 
>> Output: empty
>> Same as above. Running in a subshell does not change behavior.
>>> or 
>>> 
>>> for x in a b ; do if false ; echo Foo ; [[ notempty ]] ; then echo Bar ; 
>>> else echo Empty ; fi ; done
>> Output: Empty (for a), Empty (for b)
>> Since the if statement succeeds, the code will flow from the first 
>> iteration to the second iteration.
> 
> In a nutshell, you are proposing that this:
> 
>set -o errfail
>if false; true; then cmd1; else cmd2; fi
> 
> ...should behave like this:
> 
>if false; then cmd1; else; cmd2; fi
> 
> Okay, but...
> 
> 
>> The last case has an interesting variant: the case when the loop body fails 
>> (for x in a b ; do echo $x ; false ; done).
>> Output: a
>> In this case, the failed body will result in an effective ‘break’ out of 
>> the loop, and the whole for statement will fail. I believe this case 
>> is not currently covered in the implementation (2 or 3 lines to cover 
>> each of the loop constructs: for (list and arithmetic), until, and while).
> 
> ...now you are applying different logic to "for" commands.  You
> want this to stop iterating:
> 
>set -o errfail
>for word in a b c; do false; true; done
> 
> ...which is NOT what this does:
> 
>for word in a b c; do false; done
> 
> Yet another inconsistency that will have to be explained.

I probably did not describe the intended behavior clearly. I believe both 
cases should behave identically under errfail. The loop will ‘break’ on the first 
iteration (false when word = a). Same for all the looping commands. I believe 
this is consistent with if-then-else-if, where an error in the then or else 
block will result in terminating (‘breaking’) the if. 
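For readers following along, the intended loop behavior can be approximated in 
stock bash today. This is only a sketch of the proposed semantics, not the 
patch itself - the names (word, status) are illustrative:

```shell
# Emulating the proposed errfail loop semantics in stock bash:
# the body stops at the first failure, the loop breaks, and the
# failure status propagates out of the loop itself.
status=0
for word in a b c; do
    { echo "$word" && false && echo after; } || { status=$?; break; }
done
echo "loop status: $status"
# prints: a
#         loop status: 1
```

The `|| { status=$?; break; }` suffix is the manual equivalent of what errfail 
would do automatically for every loop body.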
> 
> Contrary to your pitch, this option is shaping up to be about as
> incoherent as errexit.  You should heed the suggestion to iron out
> the semantics.
> 
> 
See my notes. It might be too early to judge the solution, as we are only 
starting to get community feedback. The comments from Martin helped me realize 
the need to apply consistent behavior to the ‘body’ part of if, until, while, 
and both for variants (list and arithmetic). I am looking for additional 
community feedback that will make the proposal and implementation useful, 
clean, and productive for the bash community and users.

Regards,
Yair.
> -- 
> vq



Re: Revisiting Error handling (errexit)

2022-07-05 Thread Yair Lenga
Hi Martin,

Thanks for taking the time to review my proposal.

Wanted to highlight that the implementation was a less-than-3-hour (fun) job - 
no trouble at all. Credit should go to the current bash dev team (I am told it 
is a team of 1) - for keeping the code organized, well written, and documented!

I’m not in front of my desktop, so I cannot verify the behavior, but here is my 
expectation - let me know if it makes sense, in the context of the project goal 
- every unhandled failed statement should unwind execution up, until explicitly 
handled. “Up” means the current semicolon-connected statement - in a group, a 
subshell, or a list.

> if false ; echo Foo ; [[ notempty ]] ; then echo Full ; else echo Empty ; fi 
> || echo Fail
> 
Output: Empty
If false … should fail on the first false, going to the else part and printing 
Empty. Since the ‘echo Empty’ succeeded, Fail will not be printed.
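For contrast, here is what the same line does in stock bash today, without 
errfail (a runnable check, not part of the patch):

```shell
# Current bash behavior: only the LAST command of the condition
# list decides the branch, so the early "false" is ignored and
# the then-branch runs.
if false ; echo Foo ; [[ notempty ]] ; then echo Full ; else echo Empty ; fi || echo Fail
# prints: Foo
#         Full
```

The difference (Full today vs. Empty under errfail) is exactly the change the 
proposal makes to semicolon-connected condition lists.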

> or
> 
> if { false ; echo Foo ; [[ notempty ]] ; } ; then echo Full ; else echo Empty 
> ; fi || echo Fail
> 
Output: Empty
Same as above. Grouping does not change behavior.
> or
> 
> if ( false ; echo Foo ; [[ notempty ]] ) ; then echo Full ; else echo Empty ; 
> fi || echo Fail
> 
Output: empty
Same as above. Running in a subshell does not change behavior.
> or 
> 
> for x in a b ; do if false ; echo Foo ; [[ notempty ]] ; then echo Bar ; else 
> echo Empty ; fi ; done
Output: Empty (for a), Empty (for b)
Since the if statement succeeds, the code will flow from the first iteration 
to the second iteration. 

The last case has an interesting variant: the case when the loop body fails 
(for x in a b ; do echo $x ; false ; done).
Output: a
In this case, the failed body will result in an effective ‘break’ out of the 
loop, and the whole for statement will fail. I believe this case is not 
currently covered in the implementation (2 or 3 lines to cover each of the loop 
constructs: for (list and arithmetic), until, and while).

Thank you for proposing those test cases. I will add a small verification 
script with all those cases to my submission - I need to prepare the patch 
against the dev branch.

Regards,

Yair.

Sent from my iPad

> On Jul 5, 2022, at 7:27 PM, Martin D Kealey  wrote:
> 
> 
> Before going to the trouble of implementing this, I think it's worth having a 
> discussion about the precise semantics.
> 
> The examples given suggest that if a command has an un-tested non-zero exit 
> status, that it will cause the current function to return that status 
> immediately.
> 
> But what about other compound statements?
> 
> What do you propose should be the output (if any) from:
> 
> In short, how far does unwinding this propagate, and what stops it?
> 
> -Martin 



Re: Revisiting Error handling (errexit)

2022-07-05 Thread Yair Lenga
Hi,

Below is the patch for the new 'errfail' option.

Please note that this is a MINIMAL implementation. It will cover the cases
below. One way to think about it: every ';' (or newline that terminates a
statement) behaves like '&&' - forcing execution to break on the first failure.
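The ';' versus '&&' equivalence can be checked in stock bash by writing the 
'&&' form by hand - a sketch of the intended semantics, not the patch itself:

```shell
# Under errfail, the list   { echo BEFORE ; false ; echo AFTER ; }
# is intended to behave like this explicit '&&' chain:
{ echo BEFORE && false && echo AFTER ; }
echo "status: $?"
# prints: BEFORE
#         status: 1
```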
The patch is minimal (fewer than 15 lines of code changes) and low-risk -
all changes are conditional on the (global) flag setting. No complex logic,
no rewriting of functions.

Feedback/comments are welcomed.

set -o errfail
function foo { echo A ; false ; echo B ; }
function bar { echo C ; foo ; echo D ; }

 # Will print A, return non-zero status
foo

# Return from function on first error
# Will print A, CATCH
foo || { echo CATCH  ; }

# Failures will propagate through function calls, unless explicitly "caught"
# Will print C, A, CATCH
bar || echo CATCH

# Fancier: "throw"
function throw { echo "${0##*/}: $*" >&2 ; false ; }

function foo {
    if [ ! -f required-file.txt ] ; then
        throw "Missing required file"
    fi
    echo "YES"
}

Fine print:
* errfail does NOT cover "top-level" errors - only "connected"
statements. Either create a 'main' function, or create a top-level block:

On Tue, Jul 5, 2022 at 12:00 AM Lawrence Velázquez  wrote:

> On Mon, Jul 4, 2022, at 3:55 PM, Yair Lenga wrote:
> > I'm sorry - I misunderstood your original comments. I'll prepare the
> > patched version (at least, I would like to add comments before
> > publishing...) , and share it.
> > Where/how can I post it ?
>
> Send it to this list as an attachment [1] with a .txt suffix [2].
>
> [1] Gmail will mangle the patch if you send it inline.
> [2] Alleviates issues with clients on the receiving end.
>
> > I did not see anyone else dropping source
> > code/patches into the group ?
>
> Code contributions are not as common as you might think, given
> bash's prominence.
>
> --
> vq
>
diff -ru orig/bash-master/builtins/set.def new/bash-master/builtins/set.def
--- orig/bash-master/builtins/set.def   2022-01-05 00:03:45.0 +0200
+++ new/bash-master/builtins/set.def2022-07-05 11:54:31.545828400 +0300
@@ -76,6 +76,8 @@
   emacsuse an emacs-style line editing interface
 #endif /* READLINE */
   errexit  same as -e
+  errfail  execution of command lists will stop whenever
+   a single command returns a non-zero status
   errtrace same as -E
   functracesame as -T
   hashall  same as -h
@@ -196,6 +198,7 @@
  { "emacs", '\0', (int *)NULL, set_edit_mode, get_edit_mode },
 #endif
  { "errexit",   'e', (int *)NULL, (setopt_set_func_t *)NULL, (setopt_get_func_t *)NULL  },
+  { "errfail",   '\0', &errfail_opt, (setopt_set_func_t *)NULL, (setopt_get_func_t *)NULL  },
  { "errtrace",  'E', (int *)NULL, (setopt_set_func_t *)NULL, (setopt_get_func_t *)NULL  },
  { "functrace", 'T', (int *)NULL, (setopt_set_func_t *)NULL, (setopt_get_func_t *)NULL  },
  { "hashall",   'h', (int *)NULL, (setopt_set_func_t *)NULL, (setopt_get_func_t *)NULL  },
@@ -655,6 +658,7 @@
 {
   pipefail_opt = 0;
   ignoreeof = 0;
+/*  errfail_opt = 0 ;   errfail IS inherited by subshells */
 
 #if defined (STRICT_POSIX)
   posixly_correct = 1;
diff -ru orig/bash-master/execute_cmd.c new/bash-master/execute_cmd.c
--- orig/bash-master/execute_cmd.c  2022-01-05 00:03:45.0 +0200
+++ new/bash-master/execute_cmd.c   2022-07-05 13:03:18.379533800 +0300
@@ -2706,7 +2706,7 @@
   QUIT;
 
 #if 1
-  execute_command (command->value.Connection->first);
+  exec_result = execute_command (command->value.Connection->first);
 #else
   execute_command_internal (command->value.Connection->first,
  asynchronous, pipe_in, pipe_out,
@@ -2714,10 +2714,15 @@
 #endif
 
   QUIT;
-  optimize_fork (command); /* XXX */
-  exec_result = execute_command_internal (command->value.Connection->second,
+
+  /* With errfail, the ';' is similar to '&&' */
+  /* Execute the second part, only if first part was OK */
+  if ( !errfail_opt || exec_result == EXECUTION_SUCCESS ) {
+  optimize_fork (command); /* XXX */
+  exec_result = execute_command_internal (command->value.Connection->second,
  asynchronous, pipe_in, pipe_out,
  fds_to_close);
+  } ;
   executing_list--;
   break;
 
diff -ru orig/bash-master/flags.c new/bash-master/flags.c
--- orig/bash-master/flags.c2022-01-05 00:03:45.0 +0200
+++ new/bash-master/flags.c 2022-07-05 13:42:12.287799400 +0300
@@ -156,6 +156,12 @@
with a 0 status, the status of the pipeline is

Re: bug-bash Digest, Vol 236, Issue 5

2022-07-05 Thread Yair Lenga
Greg,

I agree with you 100%. Not trying to fix errexit behavior. The new errfail (if 
accepted) will provide better error handling (via opt-in) without breaking 
existing code.

Yair.

Sent from my iPad

> On Jul 4, 2022, at 10:00 PM, bug-bash-requ...@gnu.org wrote:
> 
> From: Greg Wooledge 
> To: bug-bash@gnu.org
> Subject: Re: Revisiting Error handling (errexit)
> Message-ID: 
> Content-Type: text/plain; charset=us-ascii
> 
>> On Mon, Jul 04, 2022 at 09:33:28PM +0300, Yair Lenga wrote:
>> Thanks for taking the time to review my post. I do not want to start a
>> thread about the problems with ERREXIT. Instead, I'm trying to advocate for
>> a minimal solution.
> 
> Oh?  Then I have excellent news.  The minimal solution for dealing with
> the insurmountable problems of errexit is: do not use errexit.
> 
> It exists only because POSIX mandates it.  And POSIX mandates it only
> because it has been used historically, and historical script would
> break if it were to be removed or changed.



Re: Revisiting Error handling (errexit)

2022-07-04 Thread Yair Lenga
Hi Lawrence,

I'm sorry - I misunderstood your original comments. I'll prepare the
patched version (at least, I would like to add comments before
publishing...) and share it. Where/how can I post it? I did not see
anyone else dropping source code/patches into the group?

Yair

On Mon, Jul 4, 2022 at 10:00 PM Lawrence Velázquez  wrote:

> On Mon, Jul 4, 2022, at 2:33 PM, Yair Lenga wrote:
> > Thanks for taking the time to review my post. I do not want to start a
> > thread about the problems with ERREXIT.
>
> Neither do I.
>
> > Instead, I'm trying to advocate for
> > a minimal solution.
> >
> > [...]
> >
> > Please take a look at the specific short example that I listed (below).
> > I believe fair summary is the the behavior defined by the POSIX spec is
> > less than ideal.
> >
> > #! /bin/bash
> > set -e
> > function foo { echo BEFORE ; false ; echo AFTER ; }
> > foo || echo FAIL
> >
> > Where the expected output is "BEFORE ... FAIL", but the actual output is
> > "BEFORE ... AFTER". The root cause is the way "errexit" (-e) is handled
> in
> > functions, statements-lists, ... when combined with control statements
> (if,
> > while, &&, ...).
>
> So what does your proposed option actually *do*?  You're continuing
> to provide examples of changed behavior, instead of explicitly
> *describing the changes*.
>
> If you don't understand what I'm asking for, just send a patch.
>
> > As for my question about code being accepted - I'm trying to figure out
> if
> > the bash development team is open to outside contributions. Some projects
> > are not open to contribution. If bash is open to contribution, I'll
> prepare
> > the code for submission/review (style, comments, test cases, ...) as
> > needed.
>
> The "development team" is one person.  (That person isn't me, to
> be clear.)  He's not averse to outside contributions, but there's
> no guarantee that he'll agree with your goal in the first place.
>
> Again, you might as well send what you have without worrying too
> much about polishing it.  You can always submit refinements later.
>
> --
> vq
>


Re: Revisiting Error handling (errexit)

2022-07-04 Thread Yair Lenga
Lawrence,

Thanks for taking the time to review my post. I do not want to start a
thread about the problems with ERREXIT. Instead, I'm trying to advocate for
a minimal solution. There are already many threads in bash mailing lists,
stack overflow, and numerous articles related to advanced bash programming
(google "bash errexit problems", e.g.
https://stackoverflow.com/questions/13870291/trouble-with-errexit-in-bash
). Please take a look at the specific short example that I listed (below).
I believe a fair summary is that the behavior defined by the POSIX spec is
less than ideal.

#! /bin/bash
set -e
function foo { echo BEFORE ; false ; echo AFTER ; }
foo || echo FAIL

Where the expected output is "BEFORE ... FAIL", but the actual output is
"BEFORE ... AFTER". The root cause is the way "errexit" (-e) is handled in
functions, statements-lists, ... when combined with control statements (if,
while, &&, ...).
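Until something like errfail exists, one common workaround is to run the
function body in a child shell, where set -e applies unconditionally. A sketch,
reusing the example's names:

```shell
# Running the body via "bash -ec" restores the expected output,
# because errexit inside the child shell is not suppressed by the
# caller's "foo || ..." context.
foo() { bash -ec 'echo BEFORE; false; echo AFTER'; }
foo || echo FAIL
# prints: BEFORE
#         FAIL
```

The cost is an extra process per call and the loss of shared shell state, which
is why an in-shell option like errfail is attractive.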

As for my question about code being accepted - I'm trying to figure out
whether the bash development team is open to outside contributions. Some
projects are not open to contribution. If bash is open to contribution, I'll
prepare the code for submission/review (style, comments, test cases, ...) as
needed. I was not asking for blanket approval ;-)

Regards,
Yair

On Mon, Jul 4, 2022 at 7:41 PM Lawrence Velázquez  wrote:

> On Mon, Jul 4, 2022, at 8:20 AM, Yair Lenga wrote:
> > I was able to change Bash source and build a version that supports the
> new
> > option 'errfail' (following the 'pipefail' naming), which will do the
> > "right" thing in many cases - including the above - 'foo' will return 1,
> > and will NOT proceed to action2. The implementation changes the
> processing
> > of a command list ('{ action1 ; action2 ; ... }') to break out of the list
> > if any command returns a non-zero code; that is,
> >
> > set -o errfail
> > { echo BEFORE ; false ; echo AFTER ; }
> >
> > Will print 'BEFORE', and return 1 (false), when executed under 'errfail'
>
> This is already how errexit behaves in most contexts.
>
> % bash -ec '{ echo BEFORE; false; echo AFTER; }'; echo "$?"
> BEFORE
> 1
>
> So what did you actually change?  Presumably you tweaked the special
> treatment given to the conditional portion of "if" commands, but
> we shouldn't have to guess.
>
> > I'm looking for feedback on this implementation. Will be happy to share
> the
> > code if there is a chance that this will be accepted into the bash core
> > code
>
> I don't think there's a chance unless you share the code first.
> Your description is wholly inadequate for understanding the true
> nature of your proposal.
>
> --
> vq
>


Revisiting Error handling (errexit)

2022-07-04 Thread Yair Lenga
Hi,

In my projects, I'm using bash to manage large-scale jobs. It works very well,
especially when access to servers is limited to ssh. One annoying issue is
error handling - the limits/shortcomings of 'errexit', which have
been documented and discussed to the Nth degree in multiple forums.

Needless to say, trying to extend bash to support try/catch clauses (like
other scripting languages: Python, Groovy, Perl, ...) would be a major
effort, which may never happen. Instead, I've tried to find a minimal
solution that addresses the most common pitfall of errexit, where many
sequences (e.g., a series of commands in a function) will not properly
"break" under 'errexit'. For example:

function foo {
    cat /missing/file   # e.g., cat a non-existing file.
    action2             # Executed even if the previous command fails.
    action3
}

set -o errexit   # want to catch errors in 'foo'
if ! foo ; then
    # Error handling for foo failure
fi
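A runnable version of the pitfall (with the placeholder actions replaced by
echoes) shows that neither errexit nor the error handler fires:

```shell
# The failing cat does not stop foo, because errexit is suspended
# while foo runs as the condition of "if !". foo then returns the
# (successful) status of the last echo, so the handler never runs.
set -e
foo() {
    cat /missing/file 2>/dev/null    # fails, but execution continues
    echo "action2 ran anyway"
}
if ! foo; then
    echo "error handling for foo failure"
fi
# prints only: action2 ran anyway
```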

I was able to change Bash source and build a version that supports the new
option 'errfail' (following the 'pipefail' naming), which will do the
"right" thing in many cases - including the above - 'foo' will return 1,
and will NOT proceed to action2. The implementation changes the processing
of a command list ('{ action1 ; action2 ; ... }') to break out of the list if
any command returns a non-zero code; that is,

set -o errfail
{ echo BEFORE ; false ; echo AFTER ; }

Will print 'BEFORE', and return 1 (false), when executed under 'errfail'

I'm looking for feedback on this implementation. Will be happy to share the
code, if there is a chance that this will be accepted into the bash core
code - I believe it will make it easier to strengthen many production
systems that use Bash.

To emphasize, this is a minimal proposal, with no intention of expanding it
into full support for exception handling, finally blocks, or any of the
other features implemented in other (scripting) languages.

Looking for any feedback.

Yair