Julian Foad wrote on Thu, Nov 07, 2019 at 17:53:43 +0000:
> We have said the outputs (except XML) simply are not intended to be parsed.

We've said that? I think what we said is that the non-XML outputs _may_
be parsed, but with care, since they may change in minor releases as
functionality gets added. (Example: when we added tree conflicts, we
added a 7th column to the output of 'svn st'.) That's why we say
«%d lines» at the end of every «svn log | grep '^r[0-9]* '» line.

> Tab-separated output seems a rather arbitrary addition to the flock, though
> it's an OK choice in isolation for this particular use.

The rationale was to offer a line-based format for use in shell scripts
and pipelines. Shell scripts are basically the reason we added
--show-item in the first place, too.

We could use NUL separators instead of tabs.

> Better consistency with other subcommands would be achieved by using
> space-separated output with field widths chosen per field, like we chose
> them for "status" and "list".

That's not possible in this case: just consider
«--show-item=foo,url,relative-url,bar».

> Or item-per-line format like the original "info" output.

Firstly, I'm not sure this would be worth implementing; the default
'info' output serves well enough for that.

Secondly, that would be an unusual/unnatural interface for scripts.
Line-based formats usually operate on the premise that all lines are
parsed the same way. (Example: the output of «netstat».) «svn info»,
with or without the RFC822 headers at the start of each line, is the
exception to the rule. The corresponding parsing idioms (for example,
Python's 'fileinput' module, and «for line in open('foo')») operate one
line at a time, not N lines at a time.

> Why should selecting items not be orthogonal to the output format?

There's a distinction between the default and XML outputs on the one
hand, and TSV output on the other hand: In the latter it's harder to
select just the particular fields one cares about.
Without the ability to select items to be output, a consumer of the
tab-separated format that cares just about a particular field would have
to either hardcode magic numbers (as in «awk '{print $1, $13}'») or deal
with a header line (meaning the parsing would need to be stateful, which
is exactly the problem we started with, and things like
«xargs -n1 svn info --show-item=foo,bar --» would print multiple header
lines). On the other hand, in the default and XML formats selecting
just the fields one cares about is easy.

So, selecting output fields is more important for TSV than for RFC822 or
XML.

> The XML output should also support --show-item. It's arbitrarily
> inconsistent that it presently doesn't.

See above: consumers of the XML output can easily ignore the parts they
don't care about, even without --show-item support.

On the other hand, I could see a case for having
«svn info --rfc822 --show-item=foo,bar», which would generate the
default output format but print only some of the lines, for interactive
use.

More generally, I see your point that selecting fields and selecting
output format should be orthogonal. That does make sense from an
abstract (pure mathematics) point of view, but more pragmatically,
I don't think it's as high priority to support «--show-item=… --xml» as
to support multiple arguments to «--show-item=…» in TSV mode. In fact,
with info-cmd.c as it stands, supporting «--show-item=… --xml» would not
_reduce_ complexity but _increase_ it, since several different receiver
functions would need to become aware of --show-item.

> These days, JSON would be a reasonable choice of output format as an
> alternative to XML. Consider if we offered it, how would we select it?
> Perhaps a --json flag? Then why not a --tsv flag for TSV format,

Yes, we could do this. We could add --json and --tsv, as well as
--rfc822 that would select the default output mode explicitly. We could
generalize that to --output-mode=(rfc822|json|xml|tsv).
We could make --show-item work in conjunction with any of
json/xml/tsv/rfc822. For compatibility reasons, «svn info» would
default to --rfc822, but --show-item would imply --tsv unless one of
--json/--xml/--rfc822 was passed. We would then have
«svn info --rfc822 --show-item=depth» that prints "Depth: empty" (with
the RFC822 header) and no other lines, and we'd be able to do
«svn info --json --show-item=foo» to save a couple of microseconds to
the process on the other end of the pipe.

I don't see how any of these visions (leaving aside whether they're good
ideas or not) is a blocker to the patch I posted. You're saying we
could add other features, and I'm sure we could, but that's not the
right question to ask. The question to ask is whether the patch adds
value, and whether it adds something that we won't want to support until
2.x. I don't see any concern of the sort in all your points. The patch
as it stands is forward-compatible with all your ideas.

> and expect that too to be made available to other subcommands?

Which other commands might use TSV?

> We might want to allow using multiple options, like "--show-item=revision
> --show-item=kind" in addition to comma-separated values.

I don't think this is high priority.

Daniel
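
P.S. To make the "one line at a time" point concrete, here is a minimal
sketch of what a consumer of the tab-separated output might look like.
It is only an illustration under assumptions: the function name
parse_items is hypothetical, the sample lines are made up rather than
real svn output, and the field order is assumed to match the order of
the names given to --show-item. It parses text only and does not invoke
svn.

```python
import io


def parse_items(lines, fields):
    """Yield one dict per line of tab-separated output.

    `fields` lists the item names in the order they were requested
    (assumed to mirror a comma-separated --show-item argument).
    Every line is parsed the same way; no state is carried over
    from one line to the next.
    """
    for line in lines:
        values = line.rstrip("\n").split("\t")
        if len(values) != len(fields):
            raise ValueError("expected %d fields, got %d: %r"
                             % (len(fields), len(values), line))
        yield dict(zip(fields, values))


# Made-up sample standing in for the output of a hypothetical
# `svn info --show-item=kind,revision` run on two targets.
sample = io.StringIO("file\t42\ndir\t17\n")
records = list(parse_items(sample, ["kind", "revision"]))

for record in records:
    print(record["kind"], record["revision"])
```

Because the caller names the fields it asked for, there are no magic
column numbers and no header line to skip, so the same loop works
whether the input comes from one invocation or from several
concatenated by xargs.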