Re: [sqlite] DB Partitioning and Performance

Michael Hari Fri, 27 May 2016 09:56:39 -0700

Thanks for the reply Simon. 

We have database pooling since our databases are read only and we wanted to 
take advantage of multithreading. Each database connection had the temp view 
defined, but I saw a huge performance drop when it came to queries.


I’m going to try sub-selects for now with the suggestion from R Smith that I 
filter those sub-selects. Hopefully it works. 



Thanks Simon and R Smith!

On 5/27/16, 5:29 AM, "sqlite-users-boun...@mailinglists.sqlite.org on behalf of 
Simon Slavin" <sqlite-users-boun...@mailinglists.sqlite.org on behalf of 
slav...@bigfraud.org> wrote:

>
>On 27 May 2016, at 3:17am, Michael Hari <michael.h...@slalom.com> wrote:
>
>> I have a database that contains 2014,2015 and 2016’s worth of data broken 
>> down by year and quarter. In total, this db was 2.36 GB. Because of a 
>> replication requirement where the DB has to be under 2 GB, I’ve split the DB 
>> into 3 smaller databases by year (2014.db,2015.db,2016.db).
>> 
>> I would need to access at most two databases in one query for a year over 
>> year calculation over 6 quarters. Do I have the same performance as having 
>> one table with all the data if I write my sql this way?
>> 
>> Attach 2014.db as 2014
>> Attach 2015.db as 2015
>> 
>> (..Calculation..) FROM (SELECT * FROM 2014.datatable  UNION ALL SELECT * 
>> FROM 2015.datatable)
>> 
>> You can assume the separate databases have the same indexing as the single 
>> db, where it’s indexed by year and quarter.
>
>Thank you for your very clear explanation of your setup and your question, 
>which has saved lots of back-and-forth.  Your performance will be a little 
>slower than it would be if all the data was in one database file, but not 
>much.  Given what you're doing and why you're doing it I think you have found 
>a good solution and would not do it any other way.  A couple of notes:
>
>A) Using a database alias that starts with a digit makes me uncomfortable.  I 
>can't point at any documentation or particular problem this will cause but if 
>I was doing it I'd put at least a 'Y' for 'year' in front of the database 
>name, as in "ATTACH 2014.db AS Y2014".
>
>B) For equal treatment you may want to create an empty database and use that 
>as the one you connect to.  Then attach the yearly databases to this 
>connection including the one for the current year.  This may simplify what 
>needs to be done at year-end.
>
>C) Once you have attached all your databases you can create a view which will 
>simplify your access to the data:
>
>CREATE TEMPORARY VIEW datatableall AS SELECT * FROM Y2014.datatable UNION ALL 
>SELECT * FROM Y2015.datatable UNION ALL SELECT * FROM Y2016.datatable
>
>From then on you don't need to mention the separate tables in code which 
>consults the tables, just treat datatableall as if it's one table.
>
>D) VACUUM databases which will no longer change.  VACUUM the current year's 
>database at year-end.
>
>Simon.
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>https://urldefense.proofpoint.com/v2/url?u=http-3A__mailinglists.sqlite.org_cgi-2Dbin_mailman_listinfo_sqlite-2Dusers&d=CwIGaQ&c=fa_WZs7nNMvOIDyLmzi2sMVHyyC4hN9WQl29lWJQ5Y4&r=Za210bkAr8T8OGv7-RSux2dLtMfHgyYbTEN9AO6LXXg&m=Iv0ThknGqZ6jlx5nm2YCXf3G5PvjrMtxSMJCSiOMmcI&s=T4x-97XOSoM43DLhZGThEZOOht0kLdrxKTkHQv6_Sd0&e=
> 
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] DB Partitioning and Performance

Reply via email to