I still have a philosophical objection to the idea that we are going to
standardize some kind of tabular format for utilities to "dump" their
data for further massaging ("parsing") by shell scripts.
I'm not opposed to the idea that shell scripts need to access the data
that is in these "databases"; I just don't think that a general opinion
providing a way to dump the "whole" database for subsystem X (whatever
the subsystem is) is really the best approach, and I'm fairly confident
that whatever format we settle upon, native parsing will become
difficult in at least some dialect. (E.g., dealing with escaped
characters may be easy for a particular version of sh, but what about
for awk or perl or, for that matter, Java? There already seems to be
anecdotal evidence that even sh and ksh93 have some annoying
differences in their handling of read.)
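To illustrate the kind of dialect trouble I mean, here's a minimal
POSIX sh sketch (the record data is made up) showing how a producer's
backslash-escaped delimiter fares depending on whether the consumer
remembers read's -r flag:

```shell
# Suppose a utility escapes an embedded delimiter as '\:' in its output.
# Whether that escaping survives depends on how the consumer calls read.

line='a\:b:c'    # one record: fields "a:b" and "c", colon-delimited

# Raw read (-r): the backslash is literal, so the field splits wrongly.
IFS=: read -r x y <<EOF
$line
EOF
printf 'with -r:    x=%s y=%s\n' "$x" "$y"    # x=a\  y=b:c

# Default read: the backslash escapes the colon, so the split is right,
# but the backslash itself is consumed.
IFS=: read x y <<EOF
$line
EOF
printf 'without -r: x=%s y=%s\n' "$x" "$y"    # x=a:b y=c
```

The point is only that every consumer (sh read, awk -F, perl split)
has its own escaping rules, so one escaped format won't parse natively
everywhere.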
If we need programmatic access to this data from shell scripts and such,
then let's quit trying to solve the problem by dumping the entire state
at once to the shell, and offer utilities to extract the state and
present it in a format so that shell scripts *don't* have to "parse" it.
My favored option is still the -o type of solution, with some other
option indicating a lookup key (assuming that is pertinent). I don't
think the ability to choose different delimiters is really that
important here, nor, IMO, is the ability to dump more than a single
field in an invocation. Both of those wind up raising the whole
"parsing" question, because you have to find a neutral delimiter, which
in turn requires token parsing of some sort. (Hmm... that does still
leave open the issue of listing all the records a la zfs list, but
*probably* it's safe to assume that we can separate records with
newlines.)
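To make the shape of that concrete, here's a sketch using a stand-in
shell function demo_db (the name, fields, and data are all invented)
in place of a real subsystem utility. The caller asks for exactly one
field of one record, so the value arrives verbatim and nothing ever
needs to be token-parsed:

```shell
# demo_db FIELD KEY -- stand-in for "utility -o FIELD KEY": print one
# field of one record and nothing else. All data here is invented.
demo_db() {
    case "$2:$1" in
        web01:state) printf '%s\n' 'running' ;;
        web01:mount) printf '%s\n' '/export/a b/c' ;;  # spaces do no harm
        *) return 1 ;;
    esac
}

# The caller never splits anything; command substitution is enough.
state=$(demo_db state web01)
mountpt=$(demo_db mount web01)
printf '%s\n' "$state" "$mountpt"
```

Even an embedded space or delimiter character in the value is harmless,
because the whole output of one invocation *is* the value.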
That said, if, as a one-off solution, there is a desire to dump more
information at once, I don't see a problem with inventing a special
format for it. I just don't think we're likely to standardize on one
that works everywhere. Instead, we should, IMO, discourage the creation
of solutions which require token separation to be performed by shell
scripts.
Alternatively, we can provide tools which perform general format
parsing on behalf of the shells, and have the parseable output come in
a format those tools understand. (The tools I'm talking about would
perform lookup and field extraction on behalf of the calling script.)
I'd advise in such a case against inventing yet another new file
format, though. (I think I already mentioned XML. Likely something
simpler, such as CSV or tab-delimited fields, would be more palatable.
It would certainly make processing easier for languages that don't
already have XML support.)
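As a sketch of what such a helper could look like (the tsv_get name
and the sample table are invented for illustration), a small awk
wrapper can do the lookup and field extraction so the calling script
never touches a delimiter itself:

```shell
# tsv_get COLUMN KEY -- read a tab-separated table with a header row on
# stdin; print COLUMN for the record whose first field equals KEY.
# The helper name and the sample data below are invented.
tsv_get() {
    awk -F'\t' -v col="$1" -v key="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
        c && $1 == key { print $c }
    '
}

# A table some hypothetical "command -p" might emit:
table=$(printf 'name\tstate\tmount\nweb01\trunning\t/export/web01\ndb01\tstopped\t/export/db01\n')

printf '%s\n' "$table" | tsv_get state db01    # prints "stopped"
```

The script asks for a column by name and a record by key; the one place
that knows the delimiter and escaping rules is the helper.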
-- Garrett
John Plocher wrote:
> Darren Reed wrote:
>> To bring this back to where it started, the issues are (for PSARC):
>> - given that there will be future work that wants to generate
>> parsable output, do we need an opinion written up (for this case)
>> to serve as the notice of our decision about it or is it sufficient
>> to just cite this case?
>
> No opinion should be needed - though a best practice (written by Joe
> or Garrett or Nico or you or...) that summarizes this into something
> reusable would be good.
>
> Unlike Joe, I do not believe this is a one-off - we need structure
> and consistency in this area, and this case (like zoneadm) presents
> a reasonable way to provide it *if*, in fact, the project team can
> solve the escape-sequence parsing problem.
>
> To me, that structure is:
>
> We (the ARC, Sun,...) do not want every utility to do
> one-off parsable output formats if we can help it - or
> to use different CLI utterances to obtain it. We want
> the output to be easily usable in the places where we
> expect it to be commonly used - shells, scripting languages,
> etc. And we don't need to handle every conceivable future
> possibility as part of this case.
>
> A spec that would work for me would say simply
> use
> command -t ':' -p -o xx,yy,zz
> to get tabular, ':' delimited and properly escaped output
>
> Here are examples of how to use this output:
>
> ksh93: ...
> perl: ....
> fortran: ... :-) ...
>
>> - if we're going to use this case as the foundation for all future
>> cases that are presenting output from commands, such as these,
>> that is meant to be parsable, do we:
>> 1) decide that we insist that commands use -o/-p unless history
>> prevents it? (i.e. new commands *MUST* use this combination)
>
> If new commands choose to provide parsable output, the CLIP
> guidelines strongly suggest use of a common CLI term. "-p" seems
> to be the one we have de facto standardized upon.
>
> Same for "-o aaa,bbb,ccc". And ":" as a separator.
>
> Wishing that we didn't have to parse command output so we wouldn't
> have to address this issue is IMO naive. The fact remains that
> it is common, useful and expedient to provide this type of data
> in tabular multiline form. If it turns out that it isn't easily
> parsable in shell, then we'll all just use perl or whatever - and
> not lose any sleep over it. Getting access to the data is the key
> enabler here - its exact format is secondary - if I can't get the
> data in the first place, it doesn't matter what format it isn't in.
>
> A revised spec would be good.
>
> -John
>