On Saturday, 24 March 2018 at 16:11:18 UTC, Andrei Alexandrescu wrote:
Anyhow. Right now the order of processing is the same as the lexical order in which flags are passed to getopt. There may be use cases for which that's the more desirable way to go about things, so if you author a PR to change the order you'd need to build an argument on why command-line order is better. FWIW the traditional POSIX doctrine makes behavior of flags independent of their order, which would imply the current choice is more natural.

Several of the TSV tools I built rely on command-line order. There is an enhancement request here: https://issues.dlang.org/show_bug.cgi?id=16539.

A few of the tools use a paradigm where the user enters a series of instructions on the command line, and there are times when the user-entered order matters. Two general cases:

* Display/output order - The tool produces delimited output, and the user wants to control the order of the output fields. The order of the command-line options determines the output order.

* Short-circuiting - tsv-filter in particular allows numeric tests like less-than, but also allows the user to short-circuit the test by first checking that the data contains a valid number before making the numeric test. This is done by evaluating the command-line arguments in left-to-right order.

Short-circuiting is also supported by the Unix `find` utility.
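
To make the short-circuiting case concrete, here is a minimal sketch in Python (chosen only to keep the example language-neutral; it is not the tsv-filter code and does not use std.getopt). The flag names `--is-numeric` and `--lt`, the `FIELD:VALUE` operand format, and all function names are assumptions made for the illustration. The point is only that the tests are stored and evaluated in the order the user typed them:

```python
def parse_tests(argv):
    """Collect tests in the order they appear on the command line."""
    tests = []
    i = 0
    while i < len(argv):
        flag, operand = argv[i], argv[i + 1]
        if flag == "--is-numeric":          # hypothetical flag: FIELD
            tests.append(("is-numeric", int(operand), None))
        elif flag == "--lt":                # hypothetical flag: FIELD:VALUE
            field, value = operand.split(":")
            tests.append(("lt", int(field), float(value)))
        else:
            raise ValueError(f"unknown flag: {flag}")
        i += 2
    return tests


def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def run_test(test, fields):
    name, field, value = test
    cell = fields[field - 1]                # fields are 1-based, as on the CLI
    if name == "is-numeric":
        return is_number(cell)
    return float(cell) < value              # raises ValueError on non-numeric text


def keep_row(tests, fields):
    # all() stops at the first failing test, so a later numeric test never
    # sees a value that an earlier --is-numeric test has already rejected.
    return all(run_test(t, fields) for t in tests)
```

With the arguments `--is-numeric 2 --lt 2:10`, a row whose second field is `abc` is simply rejected. With the same two flags in the opposite order, the numeric comparison runs first and fails on the non-numeric text, which is why the order has to be preserved.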

I have used this approach for CLI tools I've written in other languages. Perl's Getopt::Long processes args in command-line order, so it supports this.

I considered submitting a PR to getopt to change this, but decided against it. The approach used looks like it is central to the design, and changing it in a backward compatible way would be a meaningful undertaking. Instead I wrote a cover to getopt that processes arguments in command-line order. It is here: https://github.com/eBay/tsv-utils-dlang/blob/master/common/src/getopt_inorder.d. It handles most of what std.getopt handles.

The TSV utilities documentation should help illustrate these cases. tsv-filter uses short-circuiting: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-filter-reference. Look for "Short-circuiting expressions" toward the bottom of the section.

tsv-summarize obeys the command-line order for output/display. See: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-summarize-reference.
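
The output-order case can be sketched the same way. This is a Python illustration under assumptions, not the tsv-summarize implementation: the `OPERATORS` table, the `parse_ops` and `summarize` names, and the "flag followed by a field number" argument shape are all made up for the example. Each operator is recorded as it is encountered, and the output columns follow that list:

```python
# Hypothetical operator table; the real tool has many more operators.
OPERATORS = {"--sum": sum, "--min": min, "--max": max}


def parse_ops(argv):
    """Return (flag, field) pairs in the order given on the command line."""
    ops = []
    for flag, operand in zip(argv[0::2], argv[1::2]):
        if flag not in OPERATORS:
            raise ValueError(f"unknown flag: {flag}")
        ops.append((flag, int(operand)))
    return ops


def summarize(ops, rows):
    """Produce one output value per operator, in command-line order."""
    return [
        OPERATORS[flag]([row[field - 1] for row in rows])   # 1-based fields
        for flag, field in ops
    ]
```

Here `--max 2 --sum 1` and `--sum 1 --max 2` compute the same two values but emit them in different column order, which is exactly what the user is controlling.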

There's one other general limitation I encountered with the current compile-time approach to command-line argument processing. I couldn't find a clean way to allow it to be extended in a plug-in manner.

In particular, the original goal for the tsv-summarize tool was to allow users to create custom operators. The tool has a fair number of built-in operators, like median, sum, min, max, etc. Each of these operators has a getopt arg invoking it, e.g. '--median', '--sum', etc. However, it is common for people to have custom analysis needs, so allowing extension of the set would be quite useful.

The code is set up to allow this. People would clone the repo, write their own operator, place it in a separate file they maintain, and rebuild. However, I couldn't figure out a clean way to allow additions to the command-line argument set. There may be a reasonable way and I just couldn't find it, but my current thinking is that I need to write my own command-line argument handler to support this idea.

I think handling command-line argument processing at run-time would make this simpler, at the cost of losing some compile-time validation.
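
A rough sketch of what I mean by run-time handling, again in Python and purely illustrative: `REGISTRY`, `register`, `run`, and the `--range` operator are hypothetical names, and none of this reflects how std.getopt or tsv-summarize is written. Operators add themselves to a table when their file is loaded, and the argument handler only consults that table:

```python
import statistics

REGISTRY = {}   # flag -> function, filled in at run-time


def register(flag):
    """Decorator that makes an operator available under a command-line flag."""
    def decorator(func):
        if flag in REGISTRY:
            raise ValueError(f"flag already registered: {flag}")
        REGISTRY[flag] = func
        return func
    return decorator


# Built-in operators.
@register("--sum")
def op_sum(values):
    return sum(values)


@register("--median")
def op_median(values):
    return statistics.median(values)


# A user-maintained file could add its own operator the same way,
# without touching the argument handler.
@register("--range")
def op_range(values):
    return max(values) - min(values)


def run(argv, values):
    """Apply each requested operator, in command-line order."""
    results = []
    for flag in argv:
        if flag not in REGISTRY:
            # The trade-off: a bad flag is only caught here, at run-time.
            raise ValueError(f"unknown flag: {flag}")
        results.append((flag, REGISTRY[flag](values)))
    return results
```

The handler never needs to know the full set of flags in advance, which is what makes the plug-in style easy. The price is the one noted above: a mistyped or duplicate flag is found when the program runs, not when it is built.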

--Jon
