Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-14 Thread Dennis Cote
Allen Fowler wrote:
>
>
>   
>> You could be storing event duration, not stop time.  Or perhaps store  
>> both.
>>
>> 
>
> Here is what I have so far:
>
> sqlite> create table events (id INTEGER PRIMARY KEY AUTOINCREMENT, name, 
> kind, start, end);
>
> # Now add some events for "tomorrow"
> sqlite>
> insert into events values (null, 'tom', 'hour', datetime('now', '+1
> day','start of day', '+11 hours'), datetime('now', '+1 day','start of
> day', '+12 hours'));
> sqlite> insert into events values (null,
> 'tom', 'hour', datetime('now', '+1 day','start of day', '+9 hours'),
> datetime('now', '+1 day','start of day', '+10 hours'));
> sqlite>
> insert into events values (null, 'joe', 'hour', datetime('now', '+1
> day','start of day', '+9 hours'), datetime('now', '+1 day','start of
> day', '+10 hours'));
>
> # Now add an all-day event for tomorrow.  (It overlaps a couple of above 
> events.)
> sqlite> insert into events values (null,
> 'tom', 'day', datetime('now', '+1 day','start of day'), datetime('now',
> '+1 day','start of day', '+1 day'));
>
> # Show all events and duration:
> sqlite> select *, (strftime('%s', end) -  strftime('%s', start)) as length 
> from events;
> idname  kind  start   
>   end   length
>       
>     --
> 1 tom   hour  2009-06-13 
> 11:00:00   2009-06-13 12:00:00   3600  
> 2 tom   hour  2009-06-13 
> 09:00:00   2009-06-13 10:00:00   3600  
> 3 joe   hour  2009-06-13 
> 09:00:00   2009-06-13 10:00:00   3600  
> 4 tom   day   2009-06-13 
> 00:00:00   2009-06-14 00:00:00   86400
>
>
> #
> # And now, I want to get a result table with one row per user showing the 
> "shortest active at 9:30 AM event" for each user.
>
> sqlite>
> select *, min((strftime('%s', end) -  strftime('%s', start))) as length
> from
>...> events where 
>...> start < datetime('now', '+1 day','start of day',
> '+9 hours','+30 minutes') 
>...> and end >  datetime('now', '+1 day','start
> of day', '+9 hours','+30 minutes')
>...> group by name;
>
> idname  kind  start   
>   end   length
>       
>     --
> 3 joe   hour  2009-06-13 
> 09:00:00   2009-06-13 10:00:00   3600  
> 4 tom   day   2009-06-13 
> 00:00:00   2009-06-14 00:00:00   3600  
> sqlite> 
>
> However this result returned is very wrong.The length col is correct but 
> the other cols for "tom" are wrong.  (It should be "2 | tom | hour | 
> 2009-06-13 09:00:00 | 2009-06-13 10:00:00 | 3600" )
>
> What am I missing here?  Am I doing the query wrong?
>
>   
Allen,

Yes, you are doing something wrong. It is probably the most common 
mistake made when using grouping in SQL.

When grouping, keep in mind that all columns that appear in your SELECT 
column list, that are not aggregated (used along with one of the SQL 
aggregate functions), have to appear in the GROUP BY clause too.

The only valid columns in your output (i.e. with a select *, min()) is 
the column named in the group by clause (i.e. the name) and the 
aggregate value (i.e. the min()). You have determined the length of the 
minimum event for each name that meets your other conditions and nothing 
else. You do not know its id, what kind it is, or when it starts or 
ends. The values shown for those other fields are randomly selected from 
the set of rows in the matching group.

SQLite and some other database programs don't complain when you to break 
this rule because it is sometimes useful to get a randomly selected 
value for a column in the group. Usually it just leads to the confusion 
you are seeing.

HTH
Dennis Cote


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Igor Tandetnik
Allen Fowler wrote:
> # Show all events and duration:
> sqlite> select *, (strftime('%s', end) -  strftime('%s', start)) as
> length from events;
> idname  kind
> start end   length
>      
>     -- 1
> tom   hour  2009-06-13 11:00:00
> 2009-06-13 12:00:00   3600 2 tom
> hour  2009-06-13 09:00:00   2009-06-13 10:00:00
> 3600 3 joe   hour
> 2009-06-13 09:00:00   2009-06-13 10:00:00   3600 4
> tom   day   2009-06-13 00:00:00
> 2009-06-14 00:00:00   86400
>
>
> #
> # And now, I want to get a result table with one row per user showing
> the "shortest active at 9:30 AM event" for each user.

select * from events e1
where id in (
  select id from events e2 join
(select datetime('now', '+1 day','start of day', '+9 hours','+30 
minutes') x)
  where e1.name = e2.name and start < x and end > x
  order by (julianday(end) - julianday(start)) limit 1
);

The variant below is more complicated, but may run faster:

select * from events e1
where id in (
  select (
select id from events e2
where e2.name = names.name and start < x and end > x
order by (julianday(end) - julianday(start)) limit 1
  )
  from (select distinct name from events) names join
(select datetime('now', '+1 day','start of day', '+9 hours','+30 
minutes') x)
);

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Allen Fowler wrote:
> Indeed, I am aware that SQL is not a "traditional" programming language 
> per-se and have will now be writing the calendar logic at the application 
> level.  (Looking at Python...)

You may want to get a good understanding of what SQL actually is
(relational calculus).  Mike Owens' "The Definitive Guide to SQLite"
covers it in depth.

Quite simply if you can't express what you want in another general
purpose language like Python then you have zero hope of expressing it in
SQL.

As to how to write something like Python that is easy to translate to
SQL, use for loops as I stated.  For example:

  for row in events:
 if row["start-time"]>xxx: continue
 ...
 group[row["id"]].append(row)

Would turn into a select with where and groupby phrases.

> In this application, there are going to be a very large number of concurrent 
> "readers" that want to know at any given time who is doing what.   Where 
> "doing what" is defined as the most specific event the user currently has 
> scheduled.

There is a pseudo "free/busy" standard for that which many existing
calendaring systems can generate.

> The "event table" I have been asking about is something that I envision my 
> app will generate on a daily/hourly basis as the *output* of all the 
> calendering logic.  

Note that is effectively premature optimisation :-)  It seemed unclear
if you have worked out how to do the underlying calculations first.  You
may find that you can do them fast enough without caching the results.

In any event if you do choose Python then you can use the APSW wrapper
to write a virtual table.  (Disclosure: I am the author of APSW).  You
can present the results of the calculations as the virtual table
contents.  The BestIndex callback tells you what the constraints are.
This approach means that you will never be out of date since the
calculated data is not physically stored in the database.  It has the
drawback that the virtual tables are only available in processes that
have loaded your virtual table code.

> In this way, I hope to let sqlite, and eventually an SQL server, handle all 
> the multi-user connection and scaling issues.

Are you certain you will have scaling issues?  You'll certainly have
some if calculated generated event items are stuff into the database
(how far into the past and future will they go, how many updates would
need to be done if an event is changed?)

You may want to investigate using an existing calendar server rather
than reinventing most of the wheel.  As all the messages keep pointing
out, calendaring is a horrible messy thing with lots of tedious petty
details and several projects have already done that work for you :-)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkoy79oACgkQmOOfHg372QTmMgCdGky6/ipTjNN/T4kmDdvQj+2o
ksEAn1aHs/gy4F7j2qoBXhhgFZrQ6sMq
=hE/G
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread John Machin
On 13/06/2009 9:05 AM, Allen Fowler wrote:

> Indeed, I am aware that SQL is not a "traditional"
 > programming language per-se and have will now be writing
 > the calendar logic at the application level.  (Looking at Python...)

Don't look any further :-)
Check out the dateutil module...
http://labix.org/python-dateutil
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Allen Fowler

> Simon Slavin wrote:
> 
> 
> I'm not sure you appreciate what Roger (please be more careful about  
> your quoting, by the way) is telling you.  SQL is not a programming  
> language.  It's a way of accessing a database.  The two are not at all  
> equivalent: everything you can do in SQL you can do in any good  
> programming language, but not vice versa.
> 
>
> You're being urged to write your program, not to try to use as many  
> SQL tricks as you can.  
>

Hmm... Sorry if I mis-quoted thought I had it right.

Indeed, I am aware that SQL is not a "traditional" programming language per-se 
and have will now be writing the calendar logic at the application level.  
(Looking at Python...)

In this application, there are going to be a very large number of concurrent 
"readers" that want to know at any given time who is doing what.   Where "doing 
what" is defined as the most specific event the user currently has scheduled.

The "event table" I have been asking about is something that I envision my app 
will generate on a daily/hourly basis as the *output* of all the calendering 
logic.  In this way, I hope to let sqlite, and eventually an SQL server, handle 
all the multi-user connection and scaling issues.  This is also especially 
useful as there is other data in the DB that will need to be retrieved anyway 
for every currently active event.  (This is why I am trying to generate that 
list of event IDs inside the DB to start with.)


> Write code to manipulate dates and times, to  
> implement repeating events, and to present a decent user-interface to  
> your users.  SQL is going to do lots of work for you but it's not  
> going to come up with all the answers you want.  Sooner or later  
> you're going to have to do some filtering or sorting in your software.
> 

Indeed,   


Thank you for your help,
:)



  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Simon Slavin

On 12 Jun 2009, at 11:14pm, Allen Fowler wrote:

>> You'll probably find it easier to write the processing algorithm in  
>> the
>> programming language of your choice.  This will allow you to print  
>> out
>> diagnostics as you go along, play with a debugger etc.  To make it  
>> easy
>> to map back to SQL you should structure it as a for loop over the  
>> rows
>> of data with inclusion/exclusion conditions at the top of the  
>> loop.  If
>> you decide to express it in SQL again you can then run your correct
>> language code against SQL and compare the results as your test
>> framework.  (Strictly speaking you are already doing this but the
>> programming language is your brain :-)
>>
>
>
> Thank you for the advice   Can you point me to some examples of  
> code in a traditional language and the equivalent in SQL?

I'm not sure you appreciate what Roger (please be more careful about  
your quoting, by the way) is telling you.  SQL is not a programming  
language.  It's a way of accessing a database.  The two are not at all  
equivalent: everything you can do in SQL you can do in any good  
programming language, but not vice versa.

You're being urged to write your program, not to try to use as many  
SQL tricks as you can.  Write code to manipulate dates and times, to  
implement repeating events, and to present a decent user-interface to  
your users.  SQL is going to do lots of work for you but it's not  
going to come up with all the answers you want.  Sooner or later  
you're going to have to do some filtering or sorting in your software.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Allen Fowler




> >
> > What am I missing here?  Am I doing the query wrong?
> 
> Yes. The "group by" doesn't know which rows to use for columns that  
> are not either aggregate functions (such as min) or grouped columns  
> (such as name). You know what min() does, but the query processor  
> doesn't.
> 

Ok, this I understand.

> You can do this in two steps by getting the min time length per name,  
> and joining that with the original table augmented with length.
> 

How should I go about this?


  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Allen Fowler




> 
> I'd recommend continuing down the path you are exploring which is having
> test data and tweaking/tuning/correcting your queries until they are
> acceptable.
> 
> You'll probably find it easier to write the processing algorithm in the
> programming language of your choice.  This will allow you to print out
> diagnostics as you go along, play with a debugger etc.  To make it easy
> to map back to SQL you should structure it as a for loop over the rows
> of data with inclusion/exclusion conditions at the top of the loop.  If
> you decide to express it in SQL again you can then run your correct
> language code against SQL and compare the results as your test
> framework.  (Strictly speaking you are already doing this but the
> programming language is your brain :-)
> 


Thank you for the advice   Can you point me to some examples of code in a 
traditional language and the equivalent in SQL?



  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Doug Currie
On Jun 12, 2009, at 3:46 PM, Allen Fowler wrote:

> sqlite>
> select *, min((strftime('%s', end) -  strftime('%s', start))) as  
> length
> from
>   ...> events where
>   ...> start < datetime('now', '+1 day','start of day',
> '+9 hours','+30 minutes')
>   ...> and end >  datetime('now', '+1 day','start
> of day', '+9 hours','+30 minutes')
>   ...> group by name;
>
> idname  kind   
> start end   length
>        
>     --
> 3 joe   hour   
> 2009-06-13 09:00:00   2009-06-13 10:00:00   3600
> 4 tom   day
> 2009-06-13 00:00:00   2009-06-14 00:00:00   3600
> sqlite>
>
> However this result returned is very wrong.The length col is  
> correct but the other cols for "tom" are wrong.  (It should be "2 |  
> tom | hour | 2009-06-13 09:00:00 | 2009-06-13 10:00:00 | 3600" )
>
> What am I missing here?  Am I doing the query wrong?

Yes. The "group by" doesn't know which rows to use for columns that  
are not either aggregate functions (such as min) or grouped columns  
(such as name). You know what min() does, but the query processor  
doesn't.

You can do this in two steps by getting the min time length per name,  
and joining that with the original table augmented with length.

e
  
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Allen Fowler wrote:
> What am I missing here?  Am I doing the query wrong?

It would take a considerable number of iterations to help get the
queries exactly correct, especially as there are lots of details as you
note such as sort order etc :-)

I'd recommend continuing down the path you are exploring which is having
test data and tweaking/tuning/correcting your queries until they are
acceptable.

You'll probably find it easier to write the processing algorithm in the
programming language of your choice.  This will allow you to print out
diagnostics as you go along, play with a debugger etc.  To make it easy
to map back to SQL you should structure it as a for loop over the rows
of data with inclusion/exclusion conditions at the top of the loop.  If
you decide to express it in SQL again you can then run your correct
language code against SQL and compare the results as your test
framework.  (Strictly speaking you are already doing this but the
programming language is your brain :-)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkoywAAACgkQmOOfHg372QQtRgCgwXukh19LMNZmgW/PlNWtIykM
rVYAnA+kcTHJVEfVzXcl/kev2PlKW7vR
=TySB
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Shortest time interval? [Was: Re: repeating events?]

2009-06-12 Thread Allen Fowler




> 
> You could be storing event duration, not stop time.  Or perhaps store  
> both.
> 

Here is what I have so far:

sqlite> create table events (id INTEGER PRIMARY KEY AUTOINCREMENT, name, kind, 
start, end);

# Now add some events for "tomorrow"
sqlite>
insert into events values (null, 'tom', 'hour', datetime('now', '+1
day','start of day', '+11 hours'), datetime('now', '+1 day','start of
day', '+12 hours'));
sqlite> insert into events values (null,
'tom', 'hour', datetime('now', '+1 day','start of day', '+9 hours'),
datetime('now', '+1 day','start of day', '+10 hours'));
sqlite>
insert into events values (null, 'joe', 'hour', datetime('now', '+1
day','start of day', '+9 hours'), datetime('now', '+1 day','start of
day', '+10 hours'));

# Now add an all-day event for tomorrow.  (It overlaps a couple of above 
events.)
sqlite> insert into events values (null,
'tom', 'day', datetime('now', '+1 day','start of day'), datetime('now',
'+1 day','start of day', '+1 day'));

# Show all events and duration:
sqlite> select *, (strftime('%s', end) -  strftime('%s', start)) as length from 
events;
idname  kind  start 
end   length
      
    --
1 tom   hour  2009-06-13 
11:00:00   2009-06-13 12:00:00   3600  
2 tom   hour  2009-06-13 
09:00:00   2009-06-13 10:00:00   3600  
3 joe   hour  2009-06-13 
09:00:00   2009-06-13 10:00:00   3600  
4 tom   day   2009-06-13 
00:00:00   2009-06-14 00:00:00   86400


#
# And now, I want to get a result table with one row per user showing the 
"shortest active at 9:30 AM event" for each user.

sqlite>
select *, min((strftime('%s', end) -  strftime('%s', start))) as length
from
   ...> events where 
   ...> start < datetime('now', '+1 day','start of day',
'+9 hours','+30 minutes') 
   ...> and end >  datetime('now', '+1 day','start
of day', '+9 hours','+30 minutes')
   ...> group by name;

idname  kind  start 
end   length
      
    --
3 joe   hour  2009-06-13 
09:00:00   2009-06-13 10:00:00   3600  
4 tom   day   2009-06-13 
00:00:00   2009-06-14 00:00:00   3600  
sqlite> 

However this result returned is very wrong.The length col is correct but 
the other cols for "tom" are wrong.  (It should be "2 | tom | hour | 2009-06-13 
09:00:00 | 2009-06-13 10:00:00 | 3600" )

What am I missing here?  Am I doing the query wrong?

Thank you,
:)


  

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users