On Mon, 12 Aug 2019 09:52:40 -0700
Alec Warner <anta...@gentoo.org> wrote:

> CSV, JSON and YAML are both popular machine-and-people readable
> specifications with broad support.

No, not CSV. There isn't really "a spec" for that. Even though there is
a "proposed spec", "CSV editors" and things that emit CSV just make up
their own rules.

The more I know about CSV, the less I want anything to do with it.

In essence, to make CSV (or any other delimiter-separated format)
viable, you have to locally redefine the format as a limited
subset of the spec.

For instance, forbid the feature where the first line is the string
"Sep=,"[1], which tools like Excel may generate, but which isn't spec
compliant and leads to ... interesting things.
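
To illustrate the "interesting things": a minimal sketch, assuming
Python's standard csv module as the consumer. The sample data and the
reject_sep_hint() helper are made up for illustration, not an existing
tool.

```python
import csv
import io

# Hypothetical export that starts with a separator hint line.
raw = "sep=,\nname,version\nfoo,1.0\n"

rows = list(csv.reader(io.StringIO(raw)))
# The hint is not understood; it is parsed as an ordinary record,
# so the real header ends up as the second row.
hint_row, header_row = rows[0], rows[1]


def reject_sep_hint(text):
    """Refuse input whose first line is a 'sep=' hint (hypothetical policy)."""
    first_line = text.split("\n", 1)[0]
    if first_line.strip().lower().startswith("sep="):
        raise ValueError("'sep=' hint lines are not allowed in this format")
    return text
```

A reader that takes the first row as the header would get "sep=" as a
column name here, which is the kind of thing a locally restricted
format can simply forbid up front.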

But pretty much you can take it for granted that a '.csv' extension
will make *somebody* make assumptions about the format that aren't true.

For instance, is leading/trailing whitespace around delimiters
significant? The spec says yes[2], but implementations may want it to
be no. (And Gentoo probably prefers it not to be significant, for
alignment reasons.)
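
As a rough sketch of how this plays out in one implementation, here is
Python's standard csv module on some made-up, column-aligned input. The
strip() pass at the end is an assumed local rule, not something any
spec requires.

```python
import csv
import io

# Hypothetical data, padded so the columns line up visually.
raw = "name , version\nfoo  , 1.0\n"

# Default behaviour: whitespace around the delimiter is kept as data.
default_rows = list(csv.reader(io.StringIO(raw)))

# skipinitialspace=True drops spaces right after the delimiter only;
# trailing spaces before the delimiter are still kept.
skip_rows = list(csv.reader(io.StringIO(raw), skipinitialspace=True))

# A local rule of "whitespace is never significant" has to be applied
# by hand on top of the parser.
stripped_rows = [[field.strip() for field in row] for row in default_rows]
```

Three readings of the same two lines, and only the last one matches
what somebody aligning columns by hand would expect.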

As for JSON/YAML being people-readable ... eh ... that may be the case
for, like, 4-line files.

But once you have hundreds of entries, that becomes less true.

And both of those can have "fun" merge conflict issues due to the
requirements around record delimiters and syntax.

E.g.: you're using JSON. Does your JSON formatter emit every record on
its own line? No? That's going to create annoying merge conflicts.
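
A small sketch of the difference, assuming Python's standard json
module as the formatter. The records and the one-record-per-line layout
are invented for illustration.

```python
import json

# Hypothetical records.
records = [
    {"name": "foo", "version": "1.0"},
    {"name": "bar", "version": "2.0"},
]

# Default output: the whole list on a single line, so any change to any
# record touches the same line as every other change.
compact = json.dumps(records)

# One record per line, with sorted keys so the output is stable.
body = ",\n".join(json.dumps(record, sort_keys=True) for record in records)
per_line = "[\n" + body + "\n]\n"
```

With the per-line layout, two people editing different records usually
touch different lines; with the compact layout they always collide.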

Does your formatter/decoder support a trailing ","?
No? That's going to introduce problems.
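
For example, Python's standard json decoder does not. A minimal check,
with a made-up document:

```python
import json

# Hypothetical document with a trailing comma after the last record.
text = '[\n  {"name": "foo"},\n  {"name": "bar"},\n]'

try:
    json.loads(text)
    accepted = True
except json.JSONDecodeError:
    # The trailing comma makes the whole document invalid.
    accepted = False
```

So the last record can't end in "," like the others, which means
appending a record also modifies the line before it, and two people
appending at the same time get a conflict.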

That's why I'd rather have a narrower, less general, domain-specific
format, instead of throwing these general tools at the problem.

1: https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality
2: https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules
