On 6 May 2013 at 17:56, Tim Kientzle <t...@kientzle.com> wrote:
>
> On May 6, 2013, at 6:46 AM, Marek Kielar wrote:
>
> > Colorizations made using piping to external tools suffer from lack of
> > semantic information about the output. Take colordiff for example - in many
> > places it parses (regex's) the output hoping it does The Good Thing, but
> > still it's just hoping (and fails sometimes).
> >
> > If there was an output version that was semantically complete (e.g. some
> > kind of markup on normal output) for automatic interpretation, it would
> > always be easy to achieve colorization (and other mangling) through
> > external tools. It would also make those tools way simpler, since they
> > would just re-interpret and not heuristically regex through the
> > human-readable output. Moreover, in such case, there would be no problems
> > with signaling etc.
>
> So you are basically suggesting that tar have an XML/json/yaml/etc
> output format so that external tools can robustly utilize the data.
> (E.g., format it, present it in a GUI, etc.)
>
> Tim
>

Not really, no. Rather, I was trying to point out that the goal should be an unambiguous output syntax. Ambiguous output is a weak spot of many programs, because it keeps them from being easily reused later as modules. Proper markup, in the general sense of the word, is just one good way to achieve this.

tar's output is quite readable and already quite marked up, especially in the verbose modes. Its syntax has worked well so far, and tar has been used as a module several times, so there is no need to overthrow it. For machine readability, however, the output syntax has to be unambiguous, because ambiguity is exactly what human readers overcome easily and machines choke on. This might extend to marking up error messages, since both output streams may end up merged. Interpreters also tend to like Polish (prefix) notation. Still, the main goal is to make sure that the output syntax is absolutely unambiguous. After that, it's up to the parsers.
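
To illustrate how simple an external tool becomes once the syntax is unambiguous, here is a minimal sketch in Python. It assumes a hypothetical output in which every line starts with a one-character tag followed by a space; the tags E, W and F and the colorize function are invented for this example and are not an existing tar format.

```python
# Hypothetical one-character line tags (an assumption, not real tar output):
# E = error, W = warning, F = file entry.
ANSI = {"E": "\033[31m", "W": "\033[33m", "F": "\033[32m"}
RESET = "\033[0m"


def colorize(line: str) -> str:
    """Wrap a tagged line in an ANSI color; pass anything else through."""
    # Prefix notation: the tag comes first, so one split decides everything.
    tag, sep, _ = line.partition(" ")
    color = ANSI.get(tag) if sep else None
    return f"{color}{line}{RESET}" if color else line


if __name__ == "__main__":
    for sample in ("F usr/bin/tar", "E cannot open: foo", "X unknown tag"):
        print(colorize(sample))
```

The tool only looks up the leading tag and never guesses at the rest of the line, which is the difference from regex-based colorizers. This works only if the producer guarantees that every line carries a tag.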

As a side note, well-designed markups, just like tar's, generally come out short, and are natural to a human reader, or easy to ignore, even when they appear in normal output:

- diff's markup, using single-character line beginners, is pretty good and readable, though its "error" messages lack any markup,
- HTTP's error codes are easy to understand and include both the machine-readable number and human-readable information, and the rest of its general syntax is both understandable and flexible,
- the first Python pickle format is another good example of short and readable markup.

Best regards,
Marek Kielar