>>>> but what are the advantages of journald's representation compared
>>>> to a naive one?
>>>
>>> in short: querability without text parsing. That's about it.
>>
>> They have to parse the binary format, so that's not in and of itself
>> an upside compared to parsing CSV.
>>
>> I've made my share of bad design decisions that don't pan out. But
>> there's always an upside to my decision (even when it turns out it
>> speeds up only those cases which can never occur, because of some
>> other aspect of the system).
>>
>> AFAICT the format is *not* just a plain sequence of log entries, so
>> there's some additional structure which is intended to speed up some
>> operations.
>>
>> IOW, even if contrived, there should be *some* use case where it
>> does better than CSV, no?
>
> I can think of two possibilities, just offhand, in no particular order:
>
> * No need to parse the timestamps, et cetera, and take the risk that
>   someone put in one that's in a format you don't expect; the times are
>   stored internally in a consistent guaranteed format, so you can just use
>   internal reader functions (paired with, and updated alongside, the
>   internal writer functions) and be done with it.

Can't think of any reason why the same wouldn't apply to CSV: if someone
messes up the timestamps by hand, they're on their own.

> * No need to worry about handling log entries that *contain* commas, or
>   whatever other element was chosen as the separator.

That's just a very minor convenience issue and it does not require
a structure any more complex than a plain sequence of log entries.

Same for FSS, it doesn't seem to require the more complex structure used
by journald.

There must have been some other use-case they had in mind where they
thought they could avoid the linear-time scan or something in a way that
they expected would be algorithmically beneficial.

I just can't see what it is they had in mind.


        Stefan