Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
Allen Fowler wrote: > > > >> You could be storing event duration, not stop time. Or perhaps store >> both. >> >> > > Here is what I have so far: > > sqlite> create table events (id INTEGER PRIMARY KEY AUTOINCREMENT, name, > kind, start, end); > > # Now add some events for "tomorrow" > sqlite> > insert into events values (null, 'tom', 'hour', datetime('now', '+1 > day','start of day', '+11 hours'), datetime('now', '+1 day','start of > day', '+12 hours')); > sqlite> insert into events values (null, > 'tom', 'hour', datetime('now', '+1 day','start of day', '+9 hours'), > datetime('now', '+1 day','start of day', '+10 hours')); > sqlite> > insert into events values (null, 'joe', 'hour', datetime('now', '+1 > day','start of day', '+9 hours'), datetime('now', '+1 day','start of > day', '+10 hours')); > > # Now add an all-day event for tomorrow. (It overlaps a couple of above > events.) > sqlite> insert into events values (null, > 'tom', 'day', datetime('now', '+1 day','start of day'), datetime('now', > '+1 day','start of day', '+1 day')); > > # Show all events and duration: > sqlite> select *, (strftime('%s', end) - strftime('%s', start)) as length > from events; > idname kind start > end length > > -- > 1 tom hour 2009-06-13 > 11:00:00 2009-06-13 12:00:00 3600 > 2 tom hour 2009-06-13 > 09:00:00 2009-06-13 10:00:00 3600 > 3 joe hour 2009-06-13 > 09:00:00 2009-06-13 10:00:00 3600 > 4 tom day 2009-06-13 > 00:00:00 2009-06-14 00:00:00 86400 > > > # > # And now, I want to get a result table with one row per user showing the > "shortest active at 9:30 AM event" for each user. > > sqlite> > select *, min((strftime('%s', end) - strftime('%s', start))) as length > from >...> events where >...> start < datetime('now', '+1 day','start of day', > '+9 hours','+30 minutes') >...> and end > datetime('now', '+1 day','start > of day', '+9 hours','+30 minutes') >...> group by name; > > idname kind start > end length > > -- > 3 joe hour 2009-06-13 > 09:00:00 2009-06-13 10:00:00 3600 > 4 tom day 2009-06-13 > 00:00:00 2009-06-14 00:00:00 3600 > sqlite> > > However this result returned is very wrong.The length col is correct but > the other cols for "tom" are wrong. (It should be "2 | tom | hour | > 2009-06-13 09:00:00 | 2009-06-13 10:00:00 | 3600" ) > > What am I missing here? Am I doing the query wrong? > > Allen, Yes, you are doing something wrong. It is probably the most common mistake made when using grouping in SQL. When grouping, keep in mind that all columns that appear in your SELECT column list, that are not aggregated (used along with one of the SQL aggregate functions), have to appear in the GROUP BY clause too. The only valid columns in your output (i.e. with a select *, min()) is the column named in the group by clause (i.e. the name) and the aggregate value (i.e. the min()). You have determined the length of the minimum event for each name that meets your other conditions and nothing else. You do not know its id, what kind it is, or when it starts or ends. The values shown for those other fields are randomly selected from the set of rows in the matching group. SQLite and some other database programs don't complain when you to break this rule because it is sometimes useful to get a randomly selected value for a column in the group. Usually it just leads to the confusion you are seeing. HTH Dennis Cote ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
Allen Fowler wrote: > # Show all events and duration: > sqlite> select *, (strftime('%s', end) - strftime('%s', start)) as > length from events; > idname kind > start end length > > -- 1 > tom hour 2009-06-13 11:00:00 > 2009-06-13 12:00:00 3600 2 tom > hour 2009-06-13 09:00:00 2009-06-13 10:00:00 > 3600 3 joe hour > 2009-06-13 09:00:00 2009-06-13 10:00:00 3600 4 > tom day 2009-06-13 00:00:00 > 2009-06-14 00:00:00 86400 > > > # > # And now, I want to get a result table with one row per user showing > the "shortest active at 9:30 AM event" for each user. select * from events e1 where id in ( select id from events e2 join (select datetime('now', '+1 day','start of day', '+9 hours','+30 minutes') x) where e1.name = e2.name and start < x and end > x order by (julianday(end) - julianday(start)) limit 1 ); The variant below is more complicated, but may run faster: select * from events e1 where id in ( select ( select id from events e2 where e2.name = names.name and start < x and end > x order by (julianday(end) - julianday(start)) limit 1 ) from (select distinct name from events) names join (select datetime('now', '+1 day','start of day', '+9 hours','+30 minutes') x) ); Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Allen Fowler wrote: > Indeed, I am aware that SQL is not a "traditional" programming language > per-se and have will now be writing the calendar logic at the application > level. (Looking at Python...) You may want to get a good understanding of what SQL actually is (relational calculus). Mike Owens' "The Definitive Guide to SQLite" covers it in depth. Quite simply if you can't express what you want in another general purpose language like Python then you have zero hope of expressing it in SQL. As to how to write something like Python that is easy to translate to SQL, use for loops as I stated. For example: for row in events: if row["start-time"]>xxx: continue ... group[row["id"]].append(row) Would turn into a select with where and groupby phrases. > In this application, there are going to be a very large number of concurrent > "readers" that want to know at any given time who is doing what. Where > "doing what" is defined as the most specific event the user currently has > scheduled. There is a pseudo "free/busy" standard for that which many existing calendaring systems can generate. > The "event table" I have been asking about is something that I envision my > app will generate on a daily/hourly basis as the *output* of all the > calendering logic. Note that is effectively premature optimisation :-) It seemed unclear if you have worked out how to do the underlying calculations first. You may find that you can do them fast enough without caching the results. In any event if you do choose Python then you can use the APSW wrapper to write a virtual table. (Disclosure: I am the author of APSW). You can present the results of the calculations as the virtual table contents. The BestIndex callback tells you what the constraints are. This approach means that you will never be out of date since the calculated data is not physically stored in the database. It has the drawback that the virtual tables are only available in processes that have loaded your virtual table code. > In this way, I hope to let sqlite, and eventually an SQL server, handle all > the multi-user connection and scaling issues. Are you certain you will have scaling issues? You'll certainly have some if calculated generated event items are stuff into the database (how far into the past and future will they go, how many updates would need to be done if an event is changed?) You may want to investigate using an existing calendar server rather than reinventing most of the wheel. As all the messages keep pointing out, calendaring is a horrible messy thing with lots of tedious petty details and several projects have already done that work for you :-) Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkoy79oACgkQmOOfHg372QTmMgCdGky6/ipTjNN/T4kmDdvQj+2o ksEAn1aHs/gy4F7j2qoBXhhgFZrQ6sMq =hE/G -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
On 13/06/2009 9:05 AM, Allen Fowler wrote: > Indeed, I am aware that SQL is not a "traditional" > programming language per-se and have will now be writing > the calendar logic at the application level. (Looking at Python...) Don't look any further :-) Check out the dateutil module... http://labix.org/python-dateutil ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
> Simon Slavin wrote: > > > I'm not sure you appreciate what Roger (please be more careful about > your quoting, by the way) is telling you. SQL is not a programming > language. It's a way of accessing a database. The two are not at all > equivalent: everything you can do in SQL you can do in any good > programming language, but not vice versa. > > > You're being urged to write your program, not to try to use as many > SQL tricks as you can. > Hmm... Sorry if I mis-quoted thought I had it right. Indeed, I am aware that SQL is not a "traditional" programming language per-se and have will now be writing the calendar logic at the application level. (Looking at Python...) In this application, there are going to be a very large number of concurrent "readers" that want to know at any given time who is doing what. Where "doing what" is defined as the most specific event the user currently has scheduled. The "event table" I have been asking about is something that I envision my app will generate on a daily/hourly basis as the *output* of all the calendering logic. In this way, I hope to let sqlite, and eventually an SQL server, handle all the multi-user connection and scaling issues. This is also especially useful as there is other data in the DB that will need to be retrieved anyway for every currently active event. (This is why I am trying to generate that list of event IDs inside the DB to start with.) > Write code to manipulate dates and times, to > implement repeating events, and to present a decent user-interface to > your users. SQL is going to do lots of work for you but it's not > going to come up with all the answers you want. Sooner or later > you're going to have to do some filtering or sorting in your software. > Indeed, Thank you for your help, :) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
On 12 Jun 2009, at 11:14pm, Allen Fowler wrote: >> You'll probably find it easier to write the processing algorithm in >> the >> programming language of your choice. This will allow you to print >> out >> diagnostics as you go along, play with a debugger etc. To make it >> easy >> to map back to SQL you should structure it as a for loop over the >> rows >> of data with inclusion/exclusion conditions at the top of the >> loop. If >> you decide to express it in SQL again you can then run your correct >> language code against SQL and compare the results as your test >> framework. (Strictly speaking you are already doing this but the >> programming language is your brain :-) >> > > > Thank you for the advice Can you point me to some examples of > code in a traditional language and the equivalent in SQL? I'm not sure you appreciate what Roger (please be more careful about your quoting, by the way) is telling you. SQL is not a programming language. It's a way of accessing a database. The two are not at all equivalent: everything you can do in SQL you can do in any good programming language, but not vice versa. You're being urged to write your program, not to try to use as many SQL tricks as you can. Write code to manipulate dates and times, to implement repeating events, and to present a decent user-interface to your users. SQL is going to do lots of work for you but it's not going to come up with all the answers you want. Sooner or later you're going to have to do some filtering or sorting in your software. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
> > > > What am I missing here? Am I doing the query wrong? > > Yes. The "group by" doesn't know which rows to use for columns that > are not either aggregate functions (such as min) or grouped columns > (such as name). You know what min() does, but the query processor > doesn't. > Ok, this I understand. > You can do this in two steps by getting the min time length per name, > and joining that with the original table augmented with length. > How should I go about this? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
> > I'd recommend continuing down the path you are exploring which is having > test data and tweaking/tuning/correcting your queries until they are > acceptable. > > You'll probably find it easier to write the processing algorithm in the > programming language of your choice. This will allow you to print out > diagnostics as you go along, play with a debugger etc. To make it easy > to map back to SQL you should structure it as a for loop over the rows > of data with inclusion/exclusion conditions at the top of the loop. If > you decide to express it in SQL again you can then run your correct > language code against SQL and compare the results as your test > framework. (Strictly speaking you are already doing this but the > programming language is your brain :-) > Thank you for the advice Can you point me to some examples of code in a traditional language and the equivalent in SQL? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
On Jun 12, 2009, at 3:46 PM, Allen Fowler wrote: > sqlite> > select *, min((strftime('%s', end) - strftime('%s', start))) as > length > from > ...> events where > ...> start < datetime('now', '+1 day','start of day', > '+9 hours','+30 minutes') > ...> and end > datetime('now', '+1 day','start > of day', '+9 hours','+30 minutes') > ...> group by name; > > idname kind > start end length > > -- > 3 joe hour > 2009-06-13 09:00:00 2009-06-13 10:00:00 3600 > 4 tom day > 2009-06-13 00:00:00 2009-06-14 00:00:00 3600 > sqlite> > > However this result returned is very wrong.The length col is > correct but the other cols for "tom" are wrong. (It should be "2 | > tom | hour | 2009-06-13 09:00:00 | 2009-06-13 10:00:00 | 3600" ) > > What am I missing here? Am I doing the query wrong? Yes. The "group by" doesn't know which rows to use for columns that are not either aggregate functions (such as min) or grouped columns (such as name). You know what min() does, but the query processor doesn't. You can do this in two steps by getting the min time length per name, and joining that with the original table augmented with length. e ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Allen Fowler wrote: > What am I missing here? Am I doing the query wrong? It would take a considerable number of iterations to help get the queries exactly correct, especially as there are lots of details as you note such as sort order etc :-) I'd recommend continuing down the path you are exploring which is having test data and tweaking/tuning/correcting your queries until they are acceptable. You'll probably find it easier to write the processing algorithm in the programming language of your choice. This will allow you to print out diagnostics as you go along, play with a debugger etc. To make it easy to map back to SQL you should structure it as a for loop over the rows of data with inclusion/exclusion conditions at the top of the loop. If you decide to express it in SQL again you can then run your correct language code against SQL and compare the results as your test framework. (Strictly speaking you are already doing this but the programming language is your brain :-) Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkoywAAACgkQmOOfHg372QQtRgCgwXukh19LMNZmgW/PlNWtIykM rVYAnA+kcTHJVEfVzXcl/kev2PlKW7vR =TySB -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Shortest time interval? [Was: Re: repeating events?]
> > You could be storing event duration, not stop time. Or perhaps store > both. > Here is what I have so far: sqlite> create table events (id INTEGER PRIMARY KEY AUTOINCREMENT, name, kind, start, end); # Now add some events for "tomorrow" sqlite> insert into events values (null, 'tom', 'hour', datetime('now', '+1 day','start of day', '+11 hours'), datetime('now', '+1 day','start of day', '+12 hours')); sqlite> insert into events values (null, 'tom', 'hour', datetime('now', '+1 day','start of day', '+9 hours'), datetime('now', '+1 day','start of day', '+10 hours')); sqlite> insert into events values (null, 'joe', 'hour', datetime('now', '+1 day','start of day', '+9 hours'), datetime('now', '+1 day','start of day', '+10 hours')); # Now add an all-day event for tomorrow. (It overlaps a couple of above events.) sqlite> insert into events values (null, 'tom', 'day', datetime('now', '+1 day','start of day'), datetime('now', '+1 day','start of day', '+1 day')); # Show all events and duration: sqlite> select *, (strftime('%s', end) - strftime('%s', start)) as length from events; idname kind start end length -- 1 tom hour 2009-06-13 11:00:00 2009-06-13 12:00:00 3600 2 tom hour 2009-06-13 09:00:00 2009-06-13 10:00:00 3600 3 joe hour 2009-06-13 09:00:00 2009-06-13 10:00:00 3600 4 tom day 2009-06-13 00:00:00 2009-06-14 00:00:00 86400 # # And now, I want to get a result table with one row per user showing the "shortest active at 9:30 AM event" for each user. sqlite> select *, min((strftime('%s', end) - strftime('%s', start))) as length from ...> events where ...> start < datetime('now', '+1 day','start of day', '+9 hours','+30 minutes') ...> and end > datetime('now', '+1 day','start of day', '+9 hours','+30 minutes') ...> group by name; idname kind start end length -- 3 joe hour 2009-06-13 09:00:00 2009-06-13 10:00:00 3600 4 tom day 2009-06-13 00:00:00 2009-06-14 00:00:00 3600 sqlite> However this result returned is very wrong.The length col is correct but the other cols for "tom" are wrong. (It should be "2 | tom | hour | 2009-06-13 09:00:00 | 2009-06-13 10:00:00 | 3600" ) What am I missing here? Am I doing the query wrong? Thank you, :) ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users