On Thu, Oct 31, 2019 at 12:23:58PM -0400, Cleber Rosa wrote:
> On Tue, Oct 29, 2019 at 01:14:30PM -0300, Beraldo Leal wrote:
> > Hi all,
> >
> > So, we have a Trello card [1] to discuss what date/time format we are
> > going to adopt when saving date/time on a file.
> >
>
> Hi Beraldo,
>
> I don't think I meant that the date/time format to be discussed and
> defined was to be primarily saved to a file. The mention of it
> being used as a "wire-format" was my attempt to signal the primary
> use. But, let me start with a clearer definition of the
> current state of the "nrunner" code.
>
> The "avocado nrun" command is, right now, an ad-hoc implementation of
> something similar to an Avocado job. A loose definition of an Avocado
> job's role is that it runs and collects results for one or more tests.
> The closest thing to collecting results from many jobs that there is in
> nrunner right now is the status server[1], which waits for those
> status messages on a TCP socket. Those messages are currently encoded
> as JSON, so the date/time would have to be encoded as
> either a JSON string or a JSON number.
>
> Note: I'm already working on alternative implementations that
> integrate the nrunner execution into the existing Avocado Job code, by
> writing a "nrunner based" test runner implementation, whose interface
> has now been defined[2] and is used even by the regular runner[3].
>
> The "nrunner" based runners, then, have the responsibility of publishing
> relevant events, including test start and test end time. It's this date/time
> format that I'm most concerned with, because, once those are collected
> by the results server (or job, depending on the implementation) they can
> certainly be stored or presented in an alternative format if it makes
> sense to do so.
>
> > I'm moving the discussion here because it seems better to discuss here
> > than on Trello.
> >
>
> For sure!
>
> > When it comes to date/time storage format, I can think of two very
> > well-used standards: 1. Unix Time and 2. ISO 8601.
> >
> > I’m in favor of the “disambiguation” feature. Reading a date/time and not
> > having to guess the timezone is a plus.
> >
> > I think that a few questions should be answered before we decide this:
> >
> > 1. Is storage a problem?
>
> I would certainly like to save a few bytes on each message that
> contains a date/time, provided everything else is equal.
>
> But, to be honest, I don't think reading a date as a JSON number (say
> for Unix time) or as a string (say for ISO 8601) would have a significant
> impact on the transmission/processing/storage costs. I think if we
> come to the point of needing to optimize the communication, a more
> comprehensive change, such as replacing the protocol/encoding
> altogether, would probably yield the best results.
>
> > 2. Is parsing this date/time a CPU-bound problem?
>
> Like I said before, I doubt that the "status server" would have its
> CPU pressured just by parsing the date/time, no matter the format. I
> think it's more important that the test runner is given as little work
> as possible, though, so that it causes as little disturbance as
> possible on the test and on the tested system. Think of low powered
> embedded systems running a test, for instance. Being able to use a
> native data type and cheap encoding would be favorable IMO.
>
> > 3. Who is going to read this information? Machine or human?
> >
>
> Initially the "raw" info is machine readable, even though most people
> would agree that JSON is quite human readable. When it comes to the
> date/time format itself, a Unix time has poor human readability.
>
> > I believe that by answering these questions, we can go smoothly with
> > one format or another, as all languages have libraries to handle it.
> >
>
> Agreed. I hope I was able to give my general impression of the
> requirements above and to answer those points.
>
> > I have listed below the advantages and disadvantages that I have been
> > able to collect so far. Feel free to add to them or comment on them.
> >
> > # Unix Time / Posix Time / Epoch Time
> > ## Advantages:
> > * Better for machine readability;
> > * Optimized for storage;
> > * Very well-known with builtin libraries in many languages;
> >
> > ## Disadvantages:
> > * No timezone support (assumes UTC);
> > * Leap seconds are ignored;
>
> That was news to me. After reading an article[4] I think it doesn't
> impact our use case.
>
> > * Cannot store values before “1970-01-01 00:00:00 UTC”;
>
> Shouldn't be a problem, as we're not supposed to store tests that
> started or ended before that. :)
>
> > * On 32-bit systems there is the “Year 2038 problem”;
>
> This is trickier... and I hate to feel cornered by it. Even if, to
> the best of my knowledge and assumptions, we won't be dealing with
> 32-bit systems by then, or the problem will have been solved /
> worked around at another layer.
>
> <joke>TBH, you shouldn't have mentioned this!</joke>
>
> >
> > ## Examples using Unix Time:
> > * 915148800.25
> > * 1095379201.00
> >
>
> The presentation aspect is really what bothers me, which is in direct
> conflict with the fact that the primary consumers of the nrunner
> messages are not humans. But, given that one can easily see that output
> by running, say, "avocado runnable-run ...", I was bothered by it.
>
> Anyway, I'm going to dismiss those feelings on the basis of the
> primary use cases.
>
> > # ISO 8601
> > ## Advantages:
> > * Better for human readability;
>
> For sure.
>
> > * Very well-known international standard with builtin libraries in
> >   many languages;
> >   (First edition in 1988 and updated until nowadays);
> > * UTC time zone can be represented by only one “Z” char;
>
> Interesting.
>
> > * The lexicographical order of the representation thus corresponds
> >   to chronological order;
>
> Also interesting.
>
> >   (except for date representations involving negative years or time
> >   offset);
> > * A fraction may be added to the lowest order time element in the
> >   representation.
> >   (A decimal mark, either a comma or a dot, can be used);
> > * There is no limit on the number of decimal places for the decimal
> >   fraction;
>
> Does this mean that a very high time resolution can be used? This was
> one of the questions/concerns I had in the back of my mind...
>
> > * Has support for a basic format (without - or : ) and an extended
> >   format with separators added to enhance human readability
> >   (The standard notes that: "The basic format should be avoided in
> >   plain text.");
> >
> > ## Disadvantages:
> > * Needs more time to parse (not so optimal for machine parsing);
>
> True, but as I've said before, I think the cost of producing it is
> more important than the cost of parsing it (as the results server
> should have many more resources than the test runner).
>
> > * Needs more space to store;
> >
>
> True... for instance, Python's time.time() gives me:
>
> >>> import datetime, json, time
> >>> len(json.dumps(time.time()))
> 18
>
> While for ISO 8601 with
>
> >>> len(json.dumps(datetime.datetime.utcnow().replace(tzinfo=datetime.timezone.utc).isoformat()))
> 34
>
> > ## Examples using ISO 8601:
> > * 2019-10-29T11:22:32+00:00
> > * 2019-10-29T11:22:32Z
> > * 20191029T112232Z
> >
>
> I like the last example a lot, but that is the one the standard's
> notes suggest not using, right?
>
> > If the answers to questions 1 and 2 are "no", I think that I would go
> > with ISO 8601 using 'Z' as UTC timezone, always.
> >
> > And you? Any thoughts? Do you have a third option?
>
> I think those two are the real contenders indeed. I'm wondering if
> both formats shouldn't be supported by the status server when reading
> the messages, so that the writing of native runners would be
> facilitated and the load on them would be minimized.
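
As a rough illustration of the "accept both formats when reading" idea,
here is a minimal sketch in Python. It is not the nrunner status server
code: the function name parse_event_time() is hypothetical, and treating
an ISO 8601 string without an offset as UTC is an assumption made only
for this example.

```python
import datetime

UTC = datetime.timezone.utc


def parse_event_time(value):
    """Return an aware UTC datetime from a Unix time or an ISO 8601 string.

    Hypothetical helper: a JSON number is read as Unix time, a JSON
    string as ISO 8601 in the extended format.
    """
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool):
        raise TypeError("unsupported date/time value: %r" % (value,))
    if isinstance(value, (int, float)):
        return datetime.datetime.fromtimestamp(value, tz=UTC)
    if isinstance(value, str):
        # datetime.fromisoformat() only accepts a trailing "Z" on
        # Python 3.11 and later, so normalize it to an explicit offset
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        parsed = datetime.datetime.fromisoformat(value)
        if parsed.tzinfo is None:
            # Assumption for this sketch: no offset means UTC
            parsed = parsed.replace(tzinfo=UTC)
        return parsed.astimezone(UTC)
    raise TypeError("unsupported date/time value: %r" % (value,))
```

With this, parse_event_time(915148800.25) and
parse_event_time("2019-10-29T11:22:32Z") both return timezone-aware
datetime objects in UTC, so the reader side does not need to care which
format a given runner produced.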
>
> For the runners producing UNIX times, we could even have something like:
>
>  $ avocado runnable-run ... |
>      ./contrib/scripts/avocado-beautify-status-messages
>
> In the best UNIX tradition.
>
> Thanks for the thorough analysis!
> - Cleber.
>
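
For reference, a filter like the one suggested above could be sketched
roughly as follows. This is only an illustration under stated
assumptions: no such script is known to exist in the tree, the "time"
key is a hypothetical name for the field carrying the Unix time, and one
JSON message per line is assumed.

```python
import datetime
import json


def beautify(line, key="time"):
    """Rewrite a Unix time stored under `key` as an ISO 8601 UTC string.

    Both the default key name and the one-message-per-line framing are
    assumptions of this sketch.
    """
    message = json.loads(line)
    if isinstance(message, dict):
        value = message.get(key)
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            moment = datetime.datetime.fromtimestamp(
                value, tz=datetime.timezone.utc)
            # Use the short "Z" suffix instead of "+00:00" for UTC
            message[key] = moment.isoformat().replace("+00:00", "Z")
    return json.dumps(message)


def beautify_lines(lines):
    """Yield the beautified form of every non-blank line."""
    for line in lines:
        if line.strip():
            yield beautify(line)
```

To use it as a pipe filter, the script would print each item of
beautify_lines(sys.stdin). For example, a message carrying
"time": 1095379201.0 comes out with "time": "2004-09-17T00:00:01Z",
while the runner itself keeps emitting the cheaper Unix time.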

To bring closure to this topic: my understanding is that, since this is
a "wire-format", we can keep using Unix time.

- Cleber.

> >
> > [1] - https://trello.com/c/w4iFhDfM
> >
> > Regards,
> > --
> > Beraldo Leal
> > Senior Software Engineer, Virtualization Team
> > Red Hat
> >
>
> [1] https://github.com/avocado-framework/avocado/blob/f1cdf81284e01ae2c20b2392b1e3718aefbeec2c/avocado/core/nrunner.py#L522
> [2] https://github.com/avocado-framework/avocado/blob/f1cdf81284e01ae2c20b2392b1e3718aefbeec2c/avocado/core/plugin_interfaces.py#L290
> [3] https://github.com/avocado-framework/avocado/blob/f1cdf81284e01ae2c20b2392b1e3718aefbeec2c/setup.py#L128
> [4] https://derickrethans.nl/leap-seconds-and-what-to-do-with-them.html
>