Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-08 Thread Ben Asher
Thanks all! Super helpful.

Ben

On Wed, Feb 6, 2019 at 11:55 AM Ben Asher  wrote:

> Hi there! We're having a debate at my company about date storage in
> SQLite. SQLite has builtin support for ISO8601 in its date functions, so
> some folks have started storing dates as ISO8601 SQLite-compatible date
> strings. Are there pitfalls to storing dates this way compared to a unix
> timestamp? I'm curious to know if anyone has experience and would highly
> recommend sticking to one or the other for a particular reason. I'd also be
> grateful if anyone could point me to any articles exploring this subject.
>
> Thanks!
>
> Ben
>


-- 
Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Keith Medcalf

On Wednesday, 6 February, 2019 12:55, Ben Asher wrote:

> Hi there! We're having a debate at my company about date storage in
> SQLite.  SQLite has builtin support for ISO8601 in its date functions, 
> so some folks have started storing dates as ISO8601 SQLite-compatible 
> date strings.

> Are there pitfalls to storing dates this way compared to a unix
> timestamp?

Well, the most common pitfalls are the failure to store "instant time strings" 
and instead to only store ambiguous data.  This is amplified by the fact that 
the built-in datetime functions only produce (output) ambiguous timestrings and 
then only with limited precision (ie, usually not suitable for putting back 
into a database).  You can work around this by always storing such strings in 
UT1 (UTC).

For example, a full IS8601 "instant time string" looks something like: 
2019-02-06T15:45:47.305872603-07:00

This format can be handled on input to the datetime functions producing a 
correct UT1 instant time.  The input routines are general enough that they can 
accept varients such as a space in place of the "T", a space before the + or - 
timezone specifier, and an arbitrary number of decimal seconds (though the 
internal storage is Julian milliseconds, so the internal representation is 
limited to millisecond precision.  Z can be used to denote +0:00 and -0:00.

The "output" formats however are somewhat lacking as they do not specify 
decimal seconds (unless you use your own strftime format string) and do not 
include timezone information (thus making the time ambiguous).

The "localtime" and "utc" modifiers are processed by the underlying OS and are 
subject to the vagaries of such handling on the OS, which varies by OS.  For 
example, Windows conversions are only accurate for the current daylight rules.  
Linux works properly.  I do not know about other OSes.  "localtime" means the 
timezone of the computer.

Storing a "Unix Timestamp" or other numeric format of course is not subject to 
these vagaries since it is an offset in POSIX seconds (or days) from a fixed 
epoch and that epoch is almost always UT1 (UTC).  Note that you can store 
floating point unix timestamps, you just need to tell the internal datetime 
functions that they are working with unix epochs.

Storing the stamps as ISO8601 strings has the advantage that anyone can 
recognize them and use the standard SQLite3 shell tool to work with them.  
Numeric epoch offsets are not so "human readable" and you need to be more 
knowledgeable to see and work with them using the shell tools, but they are 
harder to screw up.

> I'm curious to know if anyone has experience and would highly recommend
> sticking to one or the other for a particular reason. I'd also be
> grateful if anyone could point me to any articles exploring this subject.

I prefer working with numeric epoch dates myself because they are (a) more 
compact and (b) inherently sortable.  ISO8601 strings are "mostly sortable" so 
long as they all have the same offset from UT1 -- the timestring format though 
does require about 31 bytes to store the same information as can be stored in 
an 8-byte float.

I have some patches that modify the datetime function library to always output 
full instant timestrings and that can use the Olsen database (which needs to be 
loaded into the database) to do timezone conversions inside SQLite3 and to 
maintain the offsets within the datetime objects.  (Unixtime uses the VDBE 
current statement time and simply convert the internal Julian Milliseconds into 
Unix Epoch Seconds as a double.  UnixInstant calls the 
GetSystemTimePreciseAsFileTime (windows) API and returns the current machine 
time as a unix epoch float (it does the Precise variant gets the currrent time, 
not the time as of the last tick).

>sqlite tz.db
SQLite version 3.27.0 2019-02-06 01:18:36
Enter ".help" for usage hints.
sqlite> select datetime('now', 'Canada/Mountain');
2019-02-06 16:09:08.039 -07:00
sqlite> select datetime('now', 'Europe/Moscow');
2019-02-07 02:09:23.943 +03:00
sqlite> select unixtime(), unixinstant();
1549496112.409|1549496112.40904
sqlite> select datetime(unixinstant(), 'unixepoch', 'America/Regina');
2019-02-06 17:35:46.849 -06:00
sqlite>

Or using the geopoly extension find the Olsen timezone for a given lat/long:

>gettz -110 50
3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit 
(AMD64)]
SQlite3 Source ID 2019-02-06 01:18:36 
3087a0c31e9fbfaedb7cf10a2fda59bc22318ff178614aefcc00ac83d9a7alt2

Find timezone at Location -110.00 50.00
Exact Row(tzid='America/Regina') 00:00:00.004773

st = time.time()
for row in db.cursor().execute("""select tzid
from tz_geopoly
   where geopoly_overlap(_shape, 
geopoly_regular(?1, ?2, 0.0002699785, 8))
 and geopoly_contains_point(_shape, ?1, ?2)
order by abs(geopoly_area(_shape)), 
instr(lower(tzid), 

Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Jens Alfke


> On Feb 6, 2019, at 2:21 PM, J Decker  wrote:
> 
> From a JS point of view new Date( ISOString )and .toISOString() are quick
> and available….

Available, yes, but expensive (compared to using a number.)

> ISO format parsing is NOT that hard it's just a minor varient of
> parsing floats.  (maybe the conversion from parts into numeric?)

Most date parsing (and formatting) functions are general-purpose and take 
format strings. That adds to the overhead. (SQLite does use a parser that’s 
specific to ISO-8661 and looks a lot faster.)

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread J Decker
From a JS point of view new Date( ISOString )and .toISOString() are quick
and available
ISO format parsing is NOT that hard it's just a minor varient of
parsing floats.  (maybe the conversion from parts into numeric?)
Haven't bothered to benchmark it.
Date Diffs easily avaialble.

On Wed, Feb 6, 2019 at 1:59 PM Jens Alfke  wrote:

>
>
> > On Feb 6, 2019, at 11:55 AM, Ben Asher  wrote:
> >
> > Hi there! We're having a debate at my company about date storage in
> SQLite.
> > SQLite has builtin support for ISO8601 in its date functions, so some
> folks
> > have started storing dates as ISO8601 SQLite-compatible date strings. Are
> > there pitfalls to storing dates this way compared to a unix timestamp?
>
> Date-string parsing can be surprisingly expensive. I’ve seen it as a
> hot-spot when profiling a program of mine that analyzed some XML-based data
> sets, and I’ve also seen it slow down CouchDB map functions.
>
> Date strings are many times larger than the equivalent numeric timestamps.
>
> On the plus side, they’re much more readable if someone has to look at the
> raw data in the database.
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Jens Alfke


> On Feb 6, 2019, at 11:55 AM, Ben Asher  wrote:
> 
> Hi there! We're having a debate at my company about date storage in SQLite.
> SQLite has builtin support for ISO8601 in its date functions, so some folks
> have started storing dates as ISO8601 SQLite-compatible date strings. Are
> there pitfalls to storing dates this way compared to a unix timestamp?

Date-string parsing can be surprisingly expensive. I’ve seen it as a hot-spot 
when profiling a program of mine that analyzed some XML-based data sets, and 
I’ve also seen it slow down CouchDB map functions.

Date strings are many times larger than the equivalent numeric timestamps.

On the plus side, they’re much more readable if someone has to look at the raw 
data in the database.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Andy Bennett

Hi,


Integer unix timestamps are only accurate to one second, where ISO8601
(at least as implemented by SQLite) can go to 1 millisecond.  Also you
have to know the epoch to interpret a unix timestamp - not everybody
uses 1970-01-01 00:00:00.  Will people be able to figure out what the
field value means when somebody discovers your data in 100 years?

The SQLite implementation is *not* subject to the 2038-01-19 integer
overflow problem.  But other systems that might interact with SQLite
are and so that is something to keep in mind as well.

On the other hand, integer timestamps take up less space in the file.

The third option is a fractional julian day number stored as an 8-byte
floating point value.


Do you want to store timestamps from the future (for example, a 
calendar-style application) or will all your timestamps always represent 
points in the past?


If you want to be able to accurately store future timestamps then the 
integer seconds and julian day number representations require some extra 
metadata.


Whilst past timestamps can always be deterministically converted from their 
local timezone to UTC and back again, this is not the case for timestamps 
in the future.


This is because the timezone and daylight saving rules change from 
time-to-time. In the integer seconds and julian day number representations, 
future timestamps must be stored in local time along with their timezone so 
that information is not lost.


In a perfect world you would store local time and location (rather than 
timezone) as places do also occasionally move timezone. However, this is 
difficult to work with because databases that map from suitably represented 
locations to timezones are more difficult to come by than databases that 
describe the relationships between timezones.



This is not a hypothetical problem or one where the error is small.

Bangladesh cancelled their daylight savings observances in 2010. In 2011 
Russia made their daylight savings time permenant; moving onto it in the 
Spring and never moving back.


If you had prematurely converted timestamps from these places to UTC then 
your error would be measured on the order of an hour.


In 1994 Eastern Kiribati crossed the International Date Line 
(bureaucratically speaking). If you had prematurely converted a timestamp 
from there then your error would be on the order of a day!




Doing consistent arithmentic on future dates is left as an exercise for the 
reader (sorry)!






Regards,
@ndy

--
andy...@ashurst.eu.org
http://www.ashurst.eu.org/
0x7EBA75FF
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Dennis Clarke

On 2/6/19 9:10 PM, Richard Hipp wrote:

On 2/6/19, Ben Asher  wrote:

Hi there! We're having a debate at my company about date storage in SQLite.
SQLite has builtin support for ISO8601 in its date functions, so some folks
have started storing dates as ISO8601 SQLite-compatible date strings. Are


In my own work, I have variously used ISO8601 text dates, unix
timestamp integers, and fractional Julian Day numbers to represent
dates and times, according to whichever worked best in that particular
application.  Since it is easy to convert between them all, this has
never been a big problem.



Why not merely use the data from :

struct timespec tn;
ec = clock_gettime( CLOCK_REALTIME,  );

That should give some sort of data down to the nanosec and if you have
decent ntp in place ( and black magic ) it may even be accurate. :-)


--
Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
GreyBeard and suspenders optional
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Richard Hipp
On 2/6/19, Ben Asher  wrote:
> Hi there! We're having a debate at my company about date storage in SQLite.
> SQLite has builtin support for ISO8601 in its date functions, so some folks
> have started storing dates as ISO8601 SQLite-compatible date strings. Are
> there pitfalls to storing dates this way compared to a unix timestamp? I'm
> curious to know if anyone has experience and would highly recommend
> sticking to one or the other for a particular reason. I'd also be grateful
> if anyone could point me to any articles exploring this subject.

Integer unix timestamps are only accurate to one second, where ISO8601
(at least as implemented by SQLite) can go to 1 millisecond.  Also you
have to know the epoch to interpret a unix timestamp - not everybody
uses 1970-01-01 00:00:00.  Will people be able to figure out what the
field value means when somebody discovers your data in 100 years?

The SQLite implementation is *not* subject to the 2038-01-19 integer
overflow problem.  But other systems that might interact with SQLite
are and so that is something to keep in mind as well.

On the other hand, integer timestamps take up less space in the file.

The third option is a fractional julian day number stored as an 8-byte
floating point value.  Such values are accurate to about 1 millisecond
during the modern era, and it is much easier to compute the number of
days by which two dates differ (you just subtract).  There are no
overflow crises pending, though precision does decay as you move
further and further away from the epoch, though for dates in nearby
centuries this is not a factor. Space requirements are in between
integer unix timestamps and ISO8601 strings.

In my own work, I have variously used ISO8601 text dates, unix
timestamp integers, and fractional Julian Day numbers to represent
dates and times, according to whichever worked best in that particular
application.  Since it is easy to convert between them all, this has
never been a big problem.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Dennis Clarke

On 2/6/19 7:55 PM, Ben Asher wrote:

Hi there! We're having a debate at my company about date storage in SQLite.
SQLite has builtin support for ISO8601 in its date functions, so some folks
have started storing dates as ISO8601 SQLite-compatible date strings. Are
there pitfalls to storing dates this way compared to a unix timestamp? I'm
curious to know if anyone has experience and would highly recommend
sticking to one or the other for a particular reason. I'd also be grateful
if anyone could point me to any articles exploring this subject.

Thanks!

Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



Isn't ISO 8601 designed for communications with humans in an
international and standard way?  It is not for storage of data.
At least in my opinion one needs a data element that one may store
and later fetch and then perform computation and comparisons with.
That would be the unix timestamp way of things. The ISO 8601 format
is for display to human beings and other soft squishy creatures. I
don't see how you can check two dates readily unless you have a pile
of libs in your pocket that do that.

So .. this works real well :

l$ date -u ; tn; sleep 4; date -u; tn -f
Wed Feb  6 20:40:54 UTC 2019
1549485654
Wed Feb  6 20:40:58 UTC 2019
1549485658.659547276
l$

Easy to compare those unix timestamps ripped out of an struct timespec.

Dennis
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] ISO8601 vs Numeric Timestamp for Date Storage

2019-02-06 Thread Ben Asher
Hi there! We're having a debate at my company about date storage in SQLite.
SQLite has builtin support for ISO8601 in its date functions, so some folks
have started storing dates as ISO8601 SQLite-compatible date strings. Are
there pitfalls to storing dates this way compared to a unix timestamp? I'm
curious to know if anyone has experience and would highly recommend
sticking to one or the other for a particular reason. I'd also be grateful
if anyone could point me to any articles exploring this subject.

Thanks!

Ben
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users