On 27 May 2016, at 3:17am, Michael Hari <michael.h...@slalom.com> wrote:

> I have a database that contains 2014,2015 and 2016’s worth of data broken 
> down by year and quarter. In total, this db was 2.36 GB. Because of a 
> replication requirement where the DB has to be under 2 GB, I’ve split the DB 
> into 3 smaller databases by year (2014.db,2015.db,2016.db).
> 
> I would need to access at most two databases in one query for a year over 
> year calculation over 6 quarters. Do I have the same performance as having 
> one table with all the data if I write my sql this way?
> 
> Attach 2014.db as 2014
> Attach 2015.db as 2015
> 
> (..Calculation..) FROM (SELECT * FROM 2014.datatable  UNION ALL SELECT * FROM 
> 2015.datatable)
> 
> You can assume the separate databases have the same indexing as the single 
> db, where it’s indexed by year and quarter.

Thank you for your very clear explanation of your setup and your question, 
which has saved lots of back-and-forth.  Your performance will be a little 
slower than it would be if all the data was in one database file, but not much. 
 Given what you're doing and why you're doing it I think you have found a good 
solution and would not do it any other way.  A couple of notes:

A) Using a database alias that starts with a digit makes me uncomfortable.  I 
can't point at any documentation or particular problem this will cause but if I 
was doing it I'd put at least a 'Y' for 'year' in front of the database name, 
as in "ATTACH 2014.db AS Y2014".

B) For equal treatment you may want to create an empty database and use that as 
the one you connect to.  Then attach the yearly databases to this connection 
including the one for the current year.  This may simplify what needs to be 
done at year-end.

C) Once you have attached all your databases you can create a view which will 
simplify your access to the data:

CREATE TEMPORARY VIEW datatableall AS SELECT * FROM Y2014.datatable UNION ALL 
SELECT * FROM Y2015.datatable UNION ALL SELECT * FROM Y2016.datatable

From then on you don't need to mention the separate tables in code which 
consults the tables, just treat datatableall as if it's one table.

D) VACUUM databases which will no longer change.  VACUUM the current year's 
database at year-end.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to