Re: [sqlite] Data types for date and time functions

2017-09-25 Thread Keith Medcalf

On Monday, 25 September, 2017 06:20, R Smith  wrote:

>On 2017/09/25 10:12 AM, David Wellman wrote:

>All of these have pro's and cons. Integer storage is usually most
>efficient, but it takes some calculation to interpret, however SQLite
>is very efficient at it, but if you want to do it in your own code, 
>you will need some standard functions to do so.

>REAL/FLOAT is used by many systems (such as MS Excel) and it is a
>really great storage method for doing time-intensive calculations 
>and being easy to understand by humans (well, a bit more-so than 
>Unix stamps anyway). Adding a day is as simple as adding 1 to the 
>value (with the Unix timestamp you have to add 86400), adding 12 
>hours (half a day) is as easy as adding 0.5, etc. Subtracting 
>3 o'clock from 9 o'clock leaves you with 0.25, which is a quarter 
>of a day, so 6 hours exactly, and so on.

>Strings take up some more space (though not that much actually) but
>has the one significant advantage when trouble-shooting or debugging,
>that the date is completely human readable as it is stored in the 
>table, a mere glance at the data will already tell you if the date 
>is a factor in whatever problem you are solving, whereas the other 
>formats need a bit of interpretation first (though very easy with 
>a basic format statement in the query). It has the disadvantage 
>that, although SQLite provides great and easy-to-use date-time 
>adding/calculating functions, they do take some more CPU cycles 
>parsing the date string. (To be fair, I've measured this at 
>some point, and it was negligible, and even though it was long 
>ago, I doubt SQLite got worse at parsing dates.)

>I always use ISO8601 strings storing UTC (non-offset) dates and
>times, UNLESS data storage space is a big problem or date-time 
>calculation is a constant feature of the intended use of the 
>table, then an Integer Unix timestamp. If the date is simply 
>stored and referenced in a normal way, ISO date strings it is.

To add to this, you can use any standard supported format, timestrings 
(ISO8601), julian day numbers (float), or Unix timestamps.  The advantage of 
using either the Julian or Unix timestamp is that they are already UTC based 
(or ought to be) and just about everything can handle them.  Using the 
timestring format can cause issues, however, because unless you "know" that the 
string is UTC, someone "just looking" or "editing" the database by hand may try 
to store a localtime value.  In this case, obfuscation is better (I believe).

Of course, an issue that you will run into, no matter what format you choose, 
is that Microsoft Products cannot convert between UTC and Localtime except for 
the current set of timezone rules (and assumes that the current rules are 
propeleptic), and you will need an external library and tables to do this 
conversion correctly.  This is a limitation of the core runtime libraries and 
cannot be fixed if you let Windows Platform API's do the conversions.

For this reason it is generally better to store a format such as the julian day 
or the unix timestamp so you do not have to deal with the vagaries of 
processing/converting "localtime" on Microsoft platforms.




___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread R Smith

On 2017/09/25 3:46 PM, R Smith wrote:


PS: I refer to "Excel" only, but the problem probably persists in all 
of MS Office, though I didn't check.




Thanks to Igor's post and some quick testing, I can confirm that it 
seems to only affect Excel, not all of MS Office.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread R Smith

On 2017/09/25 2:23 PM, Stephan Buchert wrote:

I was just going to write that you can easily convert an MS serial date
value stored in Sqlite to a date string (using 40777 as example):

sqlite> select date('1899-12-31', 40777||' days');
2011-08-23

However, according to

https://support.office.com/en-us/article/DATEVALUE-function-df8b07d4-7761-4a93-bc33-b7471bbff252

the answer should be 2011-08-22.

1900 was not a leap year, 2000 was, can it be, that the MS Excel software
has this wrong?


That is not the only thing they have wrong. The standard is good though, 
you will find that other software using this scheme actually starts with 
day 0 as 30 December 1899, so that day 1 is 31 December 1899 and day 2 
only turns out to be 1 Jan 1900. In this way it continues to February 28 
1900 being day 60 (whereas in Excel day 60 is 29 Feb 1900) which means 
from Day 61 the two date-systems are in alignment, so the misalignment 
only exists for the first 2 months of the date range Still, that is 
a really weird way of doing things for Excel.


Which means... if you really want your SQlite dates to tie up to or 
output Excel-happy time values, you have to do SELECT date('1899-12-30', 
daycount||' days') which will pop out a date that is the same as other 
software using this scheme and also same as Excel's date so long as 
daycount > 60.


Another crazy thing Excel does - Day 0 is NOT given as 31-December-1899, 
but indeed as 00-Jan-1901. Yes, that's a month with a ZERO day in it, 
which means that day -1 would be 31-Dec-1899 while in every other system 
it is 29 Dec 1899... How did Excel fix this? Simple, any date with 
negative value gets a #VALUE! - i.e: "Sorry boss, can't compute!".


I can't be sure if Excel was the root cause of this bug, perhaps they 
tried to conform to an earlier mistake, or perhaps long ago software in 
general wasn't aware (or not in agreement) over whether 1900 was a leap 
year or not - but in my travels I only found Excel showing 1900 as a 
leap year along with these other funnies, so I'm leaning towards blaming 
them.


Here is a quick copy-paste from Excel with one column (A) showing 
integer numbers and the second (B) the same but formatted as dates using 
the formula: =TEXT(A1,"-MM-DD")
(A1 being the column to the left, increasing in row number for every 
subsequent row.). Dates used by everyone else given in C. As you can 
see, things normalize after row 60, but before then Excel is just weird.


A   B   C
-2  #VALUE! 1899/12/28
-1  #VALUE! 1899/12/29
0   1900-01-00  1899/12/30
1   1900-01-01  1899/12/31
2   1900-01-02  1900/01/01
3   1900-01-03  1900/01/02
4   1900-01-04  1900/01/03
5   1900-01-05  1900/01/04
6   1900-01-06  1900/01/05
7   1900-01-07  1900/01/06
8   1900-01-08  1900/01/07
9   1900-01-09  1900/01/08
10  1900-01-10  1900/01/09
11  1900-01-11  1900/01/10
12  1900-01-12  1900/01/11
13  1900-01-13  1900/01/12
14  1900-01-14  1900/01/13
15  1900-01-15  1900/01/14
16  1900-01-16  1900/01/15
17  1900-01-17  1900/01/16
18  1900-01-18  1900/01/17
19  1900-01-19  1900/01/18
20  1900-01-20  1900/01/19
21  1900-01-21  1900/01/20
22  1900-01-22  1900/01/21
23  1900-01-23  1900/01/22
24  1900-01-24  1900/01/23
25  1900-01-25  1900/01/24
26  1900-01-26  1900/01/25
27  1900-01-27  1900/01/26
28  1900-01-28  1900/01/27
29  1900-01-29  1900/01/28
30  1900-01-30  1900/01/29
31  1900-01-31  1900/01/30
32  1900-02-01  1900/01/31
33  1900-02-02  1900/02/01
34  1900-02-03  1900/02/02
35  1900-02-04  1900/02/03
36  1900-02-05  1900/02/04
37  1900-02-06  1900/02/05
38  1900-02-07  1900/02/06
39  1900-02-08  1900/02/07
40  1900-02-09  1900/02/08
41  1900-02-10  1900/02/09
42  1900-02-11  1900/02/10
43  1900-02-12  1900/02/11
44  1900-02-13  1900/02/12
45  1900-02-14  1900/02/13
46  1900-02-15  1900/02/14
47  1900-02-16  1900/02/15
48  1900-02-17  1900/02/16
49  1900-02-18  1900/02/17
50  1900-02-19  1900/02/18
51  1900-02-20  1900/02/19
52  1900-02-21  1900/02/20
53  1900-02-22  1900/02/21
54  1900-02-23  1900/02/22
55  1900-02-24  1900/02/23
56  1900-02-25  1900/02/24
57  1900-02-26  1900/02/25
58  1900-02-27  1900/02/26
59  1900-02-28  1900/02/27
60  1900-02-29  1900/02/28
61  1900-03-01  1900/03/01
62  1900-03-02  1900/03/02
63  1900-03-03  1900/03/03
64  1900-03-04  1900/03/04
65  1900-03-05  1900/03/05
66  1900-03-06  1900/03/06
67  1900-03-07  1900/03/07
68  1900-03-08  1900/03/08
69  1900-03-09

Re: [sqlite] Data types for date and time functions

2017-09-25 Thread Igor Tandetnik

On 9/25/2017 8:23 AM, Stephan Buchert wrote:

I was just going to write that you can easily convert an MS serial date
value stored in Sqlite to a date string (using 40777 as example):

sqlite> select date('1899-12-31', 40777||' days');
2011-08-23

However, according to

https://support.office.com/en-us/article/DATEVALUE-function-df8b07d4-7761-4a93-bc33-b7471bbff252

the answer should be 2011-08-22.

1900 was not a leap year, 2000 was, can it be, that the MS Excel software
has this wrong?


Yes, it's a famous Excel date bug, originally introduced in Lotus 1-2-3 and 
ported into Excel for compatibility:

http://support.microsoft.com/kb/214326

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread J. King

On 2017-09-25 08:19:52, "R Smith"  wrote:


On 2017/09/25 10:12 AM, David Wellman wrote:

C - Storing a string with a date or date and time, typically the 
standard form is ISO8601 which looks like '-MM-DDTHH:NN:SS.MSS 
+ZZ:ZZ' with the T optionally being a space and the + could be + or - 
to indicate offset (Zulu time) etc. Some parts are optional, there is a 
good argument to not store Zulu time (offsets) but revert to UTC and 
let the displaying app format the time to whatever time-frame the user 
expects (much like Unix timestamps), so most actual stored dates end up 
as simply '-MM-DD HH:NN:SS' or even just '-MM-DD' if not 
interested in the time.


It's worth noting that the CURRENT_TIMESTAMP keyword uses "-MM-DD 
HH:MM:SS" format, and implied UTC.


--
J. King


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Data types for date and time functions

2017-09-25 Thread Stephan Buchert
I fired up an MS Excel 2013 and yes, there 1900-02-29 exists and counts for
the serial date value!

My original comment was, that also storing in Sqlite the MS serial date
values would be possible (as well as Matlab date numbers, etc.), and the
Sqlite date/time functions allow quite easily to do the necessary
conversions. But such bugs outside Sqlite of course do complicate things...
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Data types for date and time functions

2017-09-25 Thread Stephan Buchert
I was just going to write that you can easily convert an MS serial date
value stored in Sqlite to a date string (using 40777 as example):

sqlite> select date('1899-12-31', 40777||' days');
2011-08-23

However, according to

https://support.office.com/en-us/article/DATEVALUE-function-df8b07d4-7761-4a93-bc33-b7471bbff252

the answer should be 2011-08-22.

1900 was not a leap year, 2000 was, can it be, that the MS Excel software
has this wrong?

Sqlite does the leap years correctly:

sqlite> select date('1900-02-28', 1||' days');
1900-03-01
sqlite> select date('2000-02-28', 1||' days');
2000-02-29
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread R Smith

On 2017/09/25 10:12 AM, David Wellman wrote:

Hi,
  


We're designing a new feature which will involve a lot of date/time
calculations which we intend to do within the SQLite engine//



The question has come up as to how we should store date/time values in our
tables? Basically how should we define our date/time columns?


A short summary of considerations for that decision:

There are three standard date-time storage options (there are others 
too, but these are the defacto go-to for most systems, all of which 
SQLite handles well):
A - Storing an INTEGER Unix timestamp (Seconds or Milliseconds since a 
base UTC date) or Julian day value,
B - Storing FLOAT/REAL where the Integer part describes the days since a 
base date, typically 1/1/1900 or 1/1/1970, depending on who you roll 
with and
C - Storing a string with a date or date and time, typically the 
standard form is ISO8601 which looks like '-MM-DDTHH:NN:SS.MSS 
+ZZ:ZZ' with the T optionally being a space and the + could be + or - to 
indicate offset (Zulu time) etc. Some parts are optional, there is a 
good argument to not store Zulu time (offsets) but revert to UTC and let 
the displaying app format the time to whatever time-frame the user 
expects (much like Unix timestamps), so most actual stored dates end up 
as simply '-MM-DD HH:NN:SS' or even just '-MM-DD' if not 
interested in the time.


All of these have pro's and cons. Integer storage is usually most 
efficient, but it takes some calculation to interpret, however SQLite is 
very efficient at it, but if you want to do it in your own code, you 
will need some standard functions to do so.


REAL/FLOAT is used by many systems (such as MS Excel) and it is a really 
great storage method for doing time-intensive calculations and being 
easy to understand by humans (well, a bit more-so than Unix stamps 
anyway). Adding a day is as simple as adding 1 to the value (with the 
Unix timestamp you have to add 86400), adding 12 hours (half a day) is 
as easy as adding 0.5, etc. Subtracting 3 o'clock from 9 o'clock leaves 
you with 0.25, which is a quarter of a day, so 6 hours exactly, and so on.


Strings take up some more space (though not that much actually) but has 
the one significant advantage when trouble-shooting or debugging, that 
the date is completely human readable as it is stored in the table, a 
mere glance at the data will already tell you if the date is a factor in 
whatever problem you are solving, whereas the other formats need a bit 
of interpretation first (though very easy with a basic format statement 
in the query). It has the disadvantage that, although SQLite provides 
great and easy-to-use date-time adding/calculating functions, they do 
take some more CPU cycles parsing the date string. (To be fair, I've 
measured this at some point, and it was negligible, and even though it 
was long ago, I doubt SQLite got worse at parsing dates.)


I always use ISO8601 strings storing UTC (non-offset) dates and times, 
UNLESS data storage space is a big problem or date-time calculation is a 
constant feature of the intended use of the table, then an Integer Unix 
timestamp. If the date is simply stored and referenced in a normal way, 
ISO date strings it is.


Note that these are my preferences when balancing data storage 
considerations with ease of use and working/debugging efficiency, they 
are by no means the "industry standard", if such a thing exists.


PS: If you do store ISO date strings, the suggested column TYPE in 
SQLite terms is "NUMERIC" (not "TEXT"). At least that is what SQLite 
itself relays DATE and DATETIME types to. I doubt it matters greatly, 
but I use NUMERIC for any kind of date and it seems to work well, and if 
I decide to change the date-time storage format later, it requires 
merely an UPDATE query as opposed to a schema change.
Ref: Section 3.1.1 on this page: 
http://www.sqlite.org/datatype3.html#affinity_name_examples


Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread Paul Sanderson
Internally SQLite stores and process numbers as Julian day numbers, the
number of days since noon in Greenwich on November 24, 4714 B.C.

I have not examined the code in any depth but would assume that if you
store the data in the same format it would save on any processing overhead
for calculations but would just require a conversion for display.





Paul
www.sandersonforensics.com
skype: r3scue193
twitter: @sandersonforens
Tel +44 (0)1326 572786
http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit
-Forensic Toolkit for SQLite
email from a work address for a fully functional demo licence

On 25 September 2017 at 09:12, David Wellman 
wrote:

> Hi,
>
>
>
> We're designing a new feature which will involve a lot of date/time
> calculations which we intend to do within the SQLite engine. As far as we
> can tell it has the functions that we need. Basically we'll be loading data
> into SQLite and performing analysis and calculations using SQL.
>
>
>
> The question has come up as to how we should store date/time values in our
> tables? Basically how should we define our date/time columns?
>
>
>
> For us an obvious choice is to store dates/times as REAL. We're working in
> a
> Windows environment and so in the application code we're dealing with MS
> serial date values.
>
>
>
> Looking at the 'date and tine functions' page
> (http://www.sqlite.org/lang_datefunc.html ) it says "All five date and
> time
> functions take a time string as an argument". So my initial reaction is to
> store dates and times as TEXT. I think this means that when passing such
> date/time values into the functions there is one less conversion to do.
>
>
>
> But then looking at some of the examples on that page I came across the
> following:
>
> Compute the date and time given a unix timestamp 1092941466.
>
> SELECT datetime(1092941466, 'unixepoch');
>
>
>
> In the sql syntax that I'm used to (which I thought was the same for
> SQLite)
> a series of digits like that shown above is treated as a numeric data value
> (and type), not text (i.e. a  "time string"). If that was meant to be
> string
> data (i.e. text) then I'd have expected:
>
>
>
> SELECT datetime('1092941466', 'unixepoch');
>
>
>
> So to clarify: Is our idea of storing date/time values as TEXT data a
> sensible one?
>
>
>
> We're essentially thinking of performance and there are almost certainly
> pros and cons to doing this.
>
> Pro: (I think) more efficient processing as the data is supplied to these
> functions as text which is what they're expecting/require - and therefore
> there is no additional conversion required.
>
> Con: Our full timestamps will be 19 bytes compared to a REAL which is only
> 8
> bytes. So they will require more storage/disk space which ultimately means
> more I/O to read the same number of rows and columns.
>
>
>
> I accept that from a performance perspective there may not be much in it,
> but I'd be interested in people's thoughts.
>
>
>
> In anticipation, many thanks.
>
>
>
> Cheers,
>
> Dave
>
>
>
>
>
>
>
> Ward Analytics Ltd - information in motion
>
> Tel: +44 (0) 118 9740191
>
> Fax: +44 (0) 118 9740192
>
> www:   http://www.ward-analytics.com
>
>
>
> Registered office address: The Oriel, Sydenham Road, Guildford, Surrey,
> United Kingdom, GU1 3SR
>
> Registered company number: 3917021 Registered in England and Wales.
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Data types for date and time functions

2017-09-25 Thread Clemens Ladisch
David Wellman wrote:
> The question has come up as to how we should store date/time values in our
> tables? Basically how should we define our date/time columns?

SQLite does not have a separate date/time type.

If you want to use the built-in date/time function, you can store values
in one of three different formats:
http://www.sqlite.org/datatype3.html#datetime

> in the application code we're dealing with MS serial date values.

That is not one of SQLite's supported formats.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Data types for date and time functions

2017-09-25 Thread David Wellman
Hi,

 

We're designing a new feature which will involve a lot of date/time
calculations which we intend to do within the SQLite engine. As far as we
can tell it has the functions that we need. Basically we'll be loading data
into SQLite and performing analysis and calculations using SQL.

 

The question has come up as to how we should store date/time values in our
tables? Basically how should we define our date/time columns?

 

For us an obvious choice is to store dates/times as REAL. We're working in a
Windows environment and so in the application code we're dealing with MS
serial date values.

 

Looking at the 'date and tine functions' page
(http://www.sqlite.org/lang_datefunc.html ) it says "All five date and time
functions take a time string as an argument". So my initial reaction is to
store dates and times as TEXT. I think this means that when passing such
date/time values into the functions there is one less conversion to do.

 

But then looking at some of the examples on that page I came across the
following:

Compute the date and time given a unix timestamp 1092941466.

SELECT datetime(1092941466, 'unixepoch'); 

 

In the sql syntax that I'm used to (which I thought was the same for SQLite)
a series of digits like that shown above is treated as a numeric data value
(and type), not text (i.e. a  "time string"). If that was meant to be string
data (i.e. text) then I'd have expected:

 

SELECT datetime('1092941466', 'unixepoch'); 

 

So to clarify: Is our idea of storing date/time values as TEXT data a
sensible one?

 

We're essentially thinking of performance and there are almost certainly
pros and cons to doing this.

Pro: (I think) more efficient processing as the data is supplied to these
functions as text which is what they're expecting/require - and therefore
there is no additional conversion required.

Con: Our full timestamps will be 19 bytes compared to a REAL which is only 8
bytes. So they will require more storage/disk space which ultimately means
more I/O to read the same number of rows and columns.

 

I accept that from a performance perspective there may not be much in it,
but I'd be interested in people's thoughts.

 

In anticipation, many thanks.

 

Cheers,

Dave

 

 

 

Ward Analytics Ltd - information in motion

Tel: +44 (0) 118 9740191

Fax: +44 (0) 118 9740192

www:   http://www.ward-analytics.com

 

Registered office address: The Oriel, Sydenham Road, Guildford, Surrey,
United Kingdom, GU1 3SR

Registered company number: 3917021 Registered in England and Wales.

 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users