On Tuesday, 5 February, 2019 15:12, Gerlando Falauto wrote: > I could've just used directories and logfiles instead of abusing > a relational database but I just thought it would more convenient > to issue a query and use a cursor.
Well, the "abusing a relational database" is the correct terminology because you have not normalized the data before storing it. Failure to normalize causes all sorts of problems when you attempt to "retrieve" and "store" data -- if the data is not normalized you should not really expect anything more performant than would be obtained by simply sequentially scanning sequential log files. Also, relational databases do not have cursors (cursors are an illusion). Relational databases operate on "sets" of data all at one get-go. The appearance of "sequential execution row by row" is merely an artifice of the current limits of computer technology, and what some refer to a "cursors" are merely implementation details or "magic programming" allowing access to a "set" in a row by row fashion. You probably want separate tables for "source1" and for "source2" because these are not keys of the events, they are attributes of the events, and contain duplicate data. Similarly the timestamp is an attribute of an event, not a key of an event. The only true key of the event is a pseudo-key conveying the order in which the events happened and nothing more (for which a simple record number or "integer primary key") will suffice. Now you have to decide whether or not "source2" is dependant on "source1" or not. This is rather simple. If the same "source2" can occur in combination with several "source1" then it is independant. (This also depends on what you want to do with the data after you have it.) So now you have the following minimally normalized schema: create table Source1 ( id integer primary key, name text collate nocase unique ); create table Source2 ( id integer primary key, name text collate nocase unique ); create table Events ( id integer primary key, Source1id integer not null references Source1, Source2id integer not null references Source2, ts real not null, --- other data items --- ); create index EventSource1 on Events (Source1id, Source2id); create index EventSource2 on Events (Source2id, Source1id); create index EventTS on Events (ts); So now if you want all the combinations of source1 and source2 it is simply: select source1.name, source2.name from source1, source2 order by 1, 2; If you only want them where there are events then: select source1.name, source2.name from source1, source2 where exists (select * from events where events.source1id == source1.id and events.source2id == source2.id) order by 1, 2; If you want all the stuff between two ts you can do: select source1.name, source2.name, events.* from source1, source2, events where source1.id == events.source1id and source2.id == events.source2id and events.id between coalesce((select max(id) from events where ts <= :StartTime), (select min(id) from events)) and coalesce((select min(id) from events where ts >= :StartTime), (select max(id) from events)) order by events.id; If you only want the data for source2 = 'GATE' you can do the same and add: and source2.name == 'GATE' or probably more efficient: and events.source2id == (select id from source2 where name == 'GATE') and even other combinations: and events.source1id in (select id from source1 where name in ('ABA', 'AXX')) and events.source2id in (select id from source2 where name == 'GATE') and so on and so forth, and let the query optimizer decide the best way to actually retrieve the data you requested. --- The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume. _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users