Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
Thanks all! Super helpful. Ben On Wed, Feb 6, 2019 at 11:55 AM Ben Asher wrote: > Hi there! We're having a debate at my company about date storage in > SQLite. SQLite has builtin support for ISO8601 in its date functions, so > some folks have started storing dates as ISO8601 SQLite-compatible date > strings. Are there pitfalls to storing dates this way compared to a unix > timestamp? I'm curious to know if anyone has experience and would highly > recommend sticking to one or the other for a particular reason. I'd also be > grateful if anyone could point me to any articles exploring this subject. > > Thanks! > > Ben > -- Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
On Wednesday, 6 February, 2019 12:55, Ben Asher wrote: > Hi there! We're having a debate at my company about date storage in > SQLite. SQLite has builtin support for ISO8601 in its date functions, > so some folks have started storing dates as ISO8601 SQLite-compatible > date strings. > Are there pitfalls to storing dates this way compared to a unix > timestamp? Well, the most common pitfalls are the failure to store "instant time strings" and instead to only store ambiguous data. This is amplified by the fact that the built-in datetime functions only produce (output) ambiguous timestrings and then only with limited precision (ie, usually not suitable for putting back into a database). You can work around this by always storing such strings in UT1 (UTC). For example, a full IS8601 "instant time string" looks something like: 2019-02-06T15:45:47.305872603-07:00 This format can be handled on input to the datetime functions producing a correct UT1 instant time. The input routines are general enough that they can accept varients such as a space in place of the "T", a space before the + or - timezone specifier, and an arbitrary number of decimal seconds (though the internal storage is Julian milliseconds, so the internal representation is limited to millisecond precision. Z can be used to denote +0:00 and -0:00. The "output" formats however are somewhat lacking as they do not specify decimal seconds (unless you use your own strftime format string) and do not include timezone information (thus making the time ambiguous). The "localtime" and "utc" modifiers are processed by the underlying OS and are subject to the vagaries of such handling on the OS, which varies by OS. For example, Windows conversions are only accurate for the current daylight rules. Linux works properly. I do not know about other OSes. "localtime" means the timezone of the computer. Storing a "Unix Timestamp" or other numeric format of course is not subject to these vagaries since it is an offset in POSIX seconds (or days) from a fixed epoch and that epoch is almost always UT1 (UTC). Note that you can store floating point unix timestamps, you just need to tell the internal datetime functions that they are working with unix epochs. Storing the stamps as ISO8601 strings has the advantage that anyone can recognize them and use the standard SQLite3 shell tool to work with them. Numeric epoch offsets are not so "human readable" and you need to be more knowledgeable to see and work with them using the shell tools, but they are harder to screw up. > I'm curious to know if anyone has experience and would highly recommend > sticking to one or the other for a particular reason. I'd also be > grateful if anyone could point me to any articles exploring this subject. I prefer working with numeric epoch dates myself because they are (a) more compact and (b) inherently sortable. ISO8601 strings are "mostly sortable" so long as they all have the same offset from UT1 -- the timestring format though does require about 31 bytes to store the same information as can be stored in an 8-byte float. I have some patches that modify the datetime function library to always output full instant timestrings and that can use the Olsen database (which needs to be loaded into the database) to do timezone conversions inside SQLite3 and to maintain the offsets within the datetime objects. (Unixtime uses the VDBE current statement time and simply convert the internal Julian Milliseconds into Unix Epoch Seconds as a double. UnixInstant calls the GetSystemTimePreciseAsFileTime (windows) API and returns the current machine time as a unix epoch float (it does the Precise variant gets the currrent time, not the time as of the last tick). >sqlite tz.db SQLite version 3.27.0 2019-02-06 01:18:36 Enter ".help" for usage hints. sqlite> select datetime('now', 'Canada/Mountain'); 2019-02-06 16:09:08.039 -07:00 sqlite> select datetime('now', 'Europe/Moscow'); 2019-02-07 02:09:23.943 +03:00 sqlite> select unixtime(), unixinstant(); 1549496112.409|1549496112.40904 sqlite> select datetime(unixinstant(), 'unixepoch', 'America/Regina'); 2019-02-06 17:35:46.849 -06:00 sqlite> Or using the geopoly extension find the Olsen timezone for a given lat/long: >gettz -110 50 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] SQlite3 Source ID 2019-02-06 01:18:36 3087a0c31e9fbfaedb7cf10a2fda59bc22318ff178614aefcc00ac83d9a7alt2 Find timezone at Location -110.00 50.00 Exact Row(tzid='America/Regina') 00:00:00.004773 st = time.time() for row in db.cursor().execute("""select tzid from tz_geopoly where geopoly_overlap(_shape, geopoly_regular(?1, ?2, 0.0002699785, 8)) and geopoly_contains_point(_shape, ?1, ?2) order by abs(geopoly_area(_shape)), instr(lower(tzid),
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
> On Feb 6, 2019, at 2:21 PM, J Decker wrote: > > From a JS point of view new Date( ISOString )and .toISOString() are quick > and available…. Available, yes, but expensive (compared to using a number.) > ISO format parsing is NOT that hard it's just a minor varient of > parsing floats. (maybe the conversion from parts into numeric?) Most date parsing (and formatting) functions are general-purpose and take format strings. That adds to the overhead. (SQLite does use a parser that’s specific to ISO-8661 and looks a lot faster.) —Jens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
From a JS point of view new Date( ISOString )and .toISOString() are quick and available ISO format parsing is NOT that hard it's just a minor varient of parsing floats. (maybe the conversion from parts into numeric?) Haven't bothered to benchmark it. Date Diffs easily avaialble. On Wed, Feb 6, 2019 at 1:59 PM Jens Alfke wrote: > > > > On Feb 6, 2019, at 11:55 AM, Ben Asher wrote: > > > > Hi there! We're having a debate at my company about date storage in > SQLite. > > SQLite has builtin support for ISO8601 in its date functions, so some > folks > > have started storing dates as ISO8601 SQLite-compatible date strings. Are > > there pitfalls to storing dates this way compared to a unix timestamp? > > Date-string parsing can be surprisingly expensive. I’ve seen it as a > hot-spot when profiling a program of mine that analyzed some XML-based data > sets, and I’ve also seen it slow down CouchDB map functions. > > Date strings are many times larger than the equivalent numeric timestamps. > > On the plus side, they’re much more readable if someone has to look at the > raw data in the database. > > —Jens > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
> On Feb 6, 2019, at 11:55 AM, Ben Asher wrote: > > Hi there! We're having a debate at my company about date storage in SQLite. > SQLite has builtin support for ISO8601 in its date functions, so some folks > have started storing dates as ISO8601 SQLite-compatible date strings. Are > there pitfalls to storing dates this way compared to a unix timestamp? Date-string parsing can be surprisingly expensive. I’ve seen it as a hot-spot when profiling a program of mine that analyzed some XML-based data sets, and I’ve also seen it slow down CouchDB map functions. Date strings are many times larger than the equivalent numeric timestamps. On the plus side, they’re much more readable if someone has to look at the raw data in the database. —Jens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
Hi, Integer unix timestamps are only accurate to one second, where ISO8601 (at least as implemented by SQLite) can go to 1 millisecond. Also you have to know the epoch to interpret a unix timestamp - not everybody uses 1970-01-01 00:00:00. Will people be able to figure out what the field value means when somebody discovers your data in 100 years? The SQLite implementation is *not* subject to the 2038-01-19 integer overflow problem. But other systems that might interact with SQLite are and so that is something to keep in mind as well. On the other hand, integer timestamps take up less space in the file. The third option is a fractional julian day number stored as an 8-byte floating point value. Do you want to store timestamps from the future (for example, a calendar-style application) or will all your timestamps always represent points in the past? If you want to be able to accurately store future timestamps then the integer seconds and julian day number representations require some extra metadata. Whilst past timestamps can always be deterministically converted from their local timezone to UTC and back again, this is not the case for timestamps in the future. This is because the timezone and daylight saving rules change from time-to-time. In the integer seconds and julian day number representations, future timestamps must be stored in local time along with their timezone so that information is not lost. In a perfect world you would store local time and location (rather than timezone) as places do also occasionally move timezone. However, this is difficult to work with because databases that map from suitably represented locations to timezones are more difficult to come by than databases that describe the relationships between timezones. This is not a hypothetical problem or one where the error is small. Bangladesh cancelled their daylight savings observances in 2010. In 2011 Russia made their daylight savings time permenant; moving onto it in the Spring and never moving back. If you had prematurely converted timestamps from these places to UTC then your error would be measured on the order of an hour. In 1994 Eastern Kiribati crossed the International Date Line (bureaucratically speaking). If you had prematurely converted a timestamp from there then your error would be on the order of a day! Doing consistent arithmentic on future dates is left as an exercise for the reader (sorry)! Regards, @ndy -- andy...@ashurst.eu.org http://www.ashurst.eu.org/ 0x7EBA75FF ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
On 2/6/19 9:10 PM, Richard Hipp wrote: On 2/6/19, Ben Asher wrote: Hi there! We're having a debate at my company about date storage in SQLite. SQLite has builtin support for ISO8601 in its date functions, so some folks have started storing dates as ISO8601 SQLite-compatible date strings. Are In my own work, I have variously used ISO8601 text dates, unix timestamp integers, and fractional Julian Day numbers to represent dates and times, according to whichever worked best in that particular application. Since it is easy to convert between them all, this has never been a big problem. Why not merely use the data from : struct timespec tn; ec = clock_gettime( CLOCK_REALTIME, ); That should give some sort of data down to the nanosec and if you have decent ntp in place ( and black magic ) it may even be accurate. :-) -- Dennis Clarke RISC-V/SPARC/PPC/ARM/CISC UNIX and Linux spoken GreyBeard and suspenders optional ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
On 2/6/19, Ben Asher wrote: > Hi there! We're having a debate at my company about date storage in SQLite. > SQLite has builtin support for ISO8601 in its date functions, so some folks > have started storing dates as ISO8601 SQLite-compatible date strings. Are > there pitfalls to storing dates this way compared to a unix timestamp? I'm > curious to know if anyone has experience and would highly recommend > sticking to one or the other for a particular reason. I'd also be grateful > if anyone could point me to any articles exploring this subject. Integer unix timestamps are only accurate to one second, where ISO8601 (at least as implemented by SQLite) can go to 1 millisecond. Also you have to know the epoch to interpret a unix timestamp - not everybody uses 1970-01-01 00:00:00. Will people be able to figure out what the field value means when somebody discovers your data in 100 years? The SQLite implementation is *not* subject to the 2038-01-19 integer overflow problem. But other systems that might interact with SQLite are and so that is something to keep in mind as well. On the other hand, integer timestamps take up less space in the file. The third option is a fractional julian day number stored as an 8-byte floating point value. Such values are accurate to about 1 millisecond during the modern era, and it is much easier to compute the number of days by which two dates differ (you just subtract). There are no overflow crises pending, though precision does decay as you move further and further away from the epoch, though for dates in nearby centuries this is not a factor. Space requirements are in between integer unix timestamps and ISO8601 strings. In my own work, I have variously used ISO8601 text dates, unix timestamp integers, and fractional Julian Day numbers to represent dates and times, according to whichever worked best in that particular application. Since it is easy to convert between them all, this has never been a big problem. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage
On 2/6/19 7:55 PM, Ben Asher wrote: Hi there! We're having a debate at my company about date storage in SQLite. SQLite has builtin support for ISO8601 in its date functions, so some folks have started storing dates as ISO8601 SQLite-compatible date strings. Are there pitfalls to storing dates this way compared to a unix timestamp? I'm curious to know if anyone has experience and would highly recommend sticking to one or the other for a particular reason. I'd also be grateful if anyone could point me to any articles exploring this subject. Thanks! Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users Isn't ISO 8601 designed for communications with humans in an international and standard way? It is not for storage of data. At least in my opinion one needs a data element that one may store and later fetch and then perform computation and comparisons with. That would be the unix timestamp way of things. The ISO 8601 format is for display to human beings and other soft squishy creatures. I don't see how you can check two dates readily unless you have a pile of libs in your pocket that do that. So .. this works real well : l$ date -u ; tn; sleep 4; date -u; tn -f Wed Feb 6 20:40:54 UTC 2019 1549485654 Wed Feb 6 20:40:58 UTC 2019 1549485658.659547276 l$ Easy to compare those unix timestamps ripped out of an struct timespec. Dennis ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] ISO8601 vs Numeric Timestamp for Date Storage
Hi there! We're having a debate at my company about date storage in SQLite. SQLite has builtin support for ISO8601 in its date functions, so some folks have started storing dates as ISO8601 SQLite-compatible date strings. Are there pitfalls to storing dates this way compared to a unix timestamp? I'm curious to know if anyone has experience and would highly recommend sticking to one or the other for a particular reason. I'd also be grateful if anyone could point me to any articles exploring this subject. Thanks! Ben ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users