Re: [sqlite] DB rotate / incremental
> I currently use a single database and it works fine. > The trouble is replication. > Or the sharing of an ever growing database. I think your question is more architectural than sqlite-specific. If your user base / data set is going to be getting more serious and if you have some time on your hands, it doesn't seem like sqlite is the right tool for you. You may be better off with a nice fat traditional rdbms with built-in replication, load balancing, etc. So what follows is a not-as-good hack, but you could probably get it up in a few hours. If you're only doing inserts and selects then (as far as I understand) existing pages of the database should largely remain constant, regardless of your schema. I'm sure the sqlite experts can expound on that or explain in what circumstances it's wrong. If that's the case, then you could leverage an incremental file transfer program like rsync(1) to do replication. Keep two copies of your database around, one with all the latest stuff (call it 'current') and one replication candidate that you'll only update every once in a while (call it 'historical'). When you want, attach 'historical' and sync the latest stuff from 'current' into it. Obviously you want that to be fast (not scan the whole table), so hopefully your schema permits that. Then detach 'historical' and rsync 'historical' to wherever it needs to go. Only the new bytes will go over the wire. You implied you'd already tried something like that -- just wanted to point out rsync in particular if you hadn't looked into it. Another point is that, depending on the specifics of your data set and the metrics you're calculating on it, you may be able to accumulate the answers to your big 'select' queries as you go along, in a separate metrics table. E.g. use triggers, or run the 'select' only on each new chunk of data and update the metrics table accordingly. Then you can replace your big select on the main table with a simple (hopefully constant-time) select on the metrics table. This only works if your metrics have the nice property that they can be incrementally calculated using O(1) space. E.g. sum, count, mean, stddev, but not (afaik) median. In fact, if you are only inserting (never deleting), and you already know what all your metrics are, and they have that property, then you don't need to keep the main table at all -- just the metrics table :) Eric -- Eric A. Smith Keeping Young #1: Avoid fried meats which angry up the blood. -- Satchel Paige ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DB rotate / incremental
There is just one table, data is simple, fits well in CSV, but I like to do SQL query on this data and sqlite3 packs it pretty well. The data comes from external source and changes constantly, I just grab a snapshot of it every minute. I currently use a single database and it works fine. The trouble is replication. Or the sharing of an ever growing database. This data is accumulated by my server, and I have several computers on which would like to analyse the data. I don't need up-to-the-minute freshness of the data I analyse, I'm alright at analysing "all i got until yesterday at midnight". Being able to rotate the database or use it in an incremental manner would allow me to copy only what has changed since last time without the complication of writting a replication protocol (right now it appears as if the whole db changes all the time, so the only safe alternative I have is perhaps some kind of delta patch, but I still need to compare 2 states of the DB file all the time, and I would need to patch the target machine properly also). As for grepping, I could do that, but I would loose the power SQL offers to help at some analysis and calculations. So this comes back to the idea of accumulation in rotation every day, then when reading, just copy the result of many selects (like a recursive grep) and insert that into a temp table for further analysis. I think this is the simplest option I have for now, but I wonder how viable it can be when the DB grows bigger. Simon On Thu, Oct 28, 2010 at 12:03 PM, Black, Michael (IS)wrote: > My first question is: > > Why are you using a database? Sounds to me like grep might do what you want. > All you're doing is inserts and selects. > > Can you show what kind of selects you're doing or your database schema? > > And why do you need to rotate the database? Why can't you just keep it all > in one? > > > Michael D. Black > Senior Scientist > Advanced Analytics Directorate > Northrop Grumman Information Systems > > > > > From: sqlite-users-boun...@sqlite.org on behalf of Simon > Sent: Thu 10/28/2010 10:38 AM > To: General Discussion of SQLite Database > Subject: EXTERNAL:[sqlite] DB rotate / incremental > > > > Hi, i'm not sure what exact vocabulary I should use for this... > > I'm looking for a way or technique, that allows rotating the database > just like some tools will rotate my /var/log/messages on Linux. So, > to rotate a database, I'm guessing the current DB needs to be renamed > and kept nearby, new DB will contain additions and modification to the > previous DB. But this would need to take care of many many older DBs > like this. > > This would be complicated if the rotation/increments need to take care > of all scenarios... but my scenario is very similar to the Linux > system log. The calls that modify the DB are just inserts and rare > but huge selects, zero deletes and zero updates. With rotation I can > simply copy all "previous" files that will never change on different > computers, and only "today" will change constantly (but it's currently > less than 1MB per day). Writing to this "rotated DB" is fine, reading > from it is the catch. > > I've looked into ATTACH, but the doc says "the number of attached > databases cannot be increased above 30 on a machines with a 32-bit > integer". I would have to merge the "old" database at some point, but > this is less optimal than writting a sync protocol i think. > > I've looked into using virtual tables on the reader end. Inserter > would just open a new db file each day, with no regards to the past. > But the reader could use vtables to go around the ATTACH limit. I > think this approach is overly complicated though... > > Maybe I should program my own means of "attaching" the results of > multiple queries on multiple databases. I guess I could use a > :MEMORY: database to accumulate the result, but then perhaps I should > just use a temp table and "copy" the result of a series of > attach-select-detach operations on all the databases. > > I'm looking for suggestions. I don't have much experience with sqlite > so I believe it's likely I misunderstood a few things and will > probably run into some other limits. Let me know if you think some of > the above is good or just can't work at all, and let me know if you > have any other ideas. > > Thanks! > Simon > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] DB rotate / incremental
My first question is: Why are you using a database? Sounds to me like grep might do what you want. All you're doing is inserts and selects. Can you show what kind of selects you're doing or your database schema? And why do you need to rotate the database? Why can't you just keep it all in one? Michael D. Black Senior Scientist Advanced Analytics Directorate Northrop Grumman Information Systems From: sqlite-users-boun...@sqlite.org on behalf of Simon Sent: Thu 10/28/2010 10:38 AM To: General Discussion of SQLite Database Subject: EXTERNAL:[sqlite] DB rotate / incremental Hi, i'm not sure what exact vocabulary I should use for this... I'm looking for a way or technique, that allows rotating the database just like some tools will rotate my /var/log/messages on Linux. So, to rotate a database, I'm guessing the current DB needs to be renamed and kept nearby, new DB will contain additions and modification to the previous DB. But this would need to take care of many many older DBs like this. This would be complicated if the rotation/increments need to take care of all scenarios... but my scenario is very similar to the Linux system log. The calls that modify the DB are just inserts and rare but huge selects, zero deletes and zero updates. With rotation I can simply copy all "previous" files that will never change on different computers, and only "today" will change constantly (but it's currently less than 1MB per day). Writing to this "rotated DB" is fine, reading from it is the catch. I've looked into ATTACH, but the doc says "the number of attached databases cannot be increased above 30 on a machines with a 32-bit integer". I would have to merge the "old" database at some point, but this is less optimal than writting a sync protocol i think. I've looked into using virtual tables on the reader end. Inserter would just open a new db file each day, with no regards to the past. But the reader could use vtables to go around the ATTACH limit. I think this approach is overly complicated though... Maybe I should program my own means of "attaching" the results of multiple queries on multiple databases. I guess I could use a :MEMORY: database to accumulate the result, but then perhaps I should just use a temp table and "copy" the result of a series of attach-select-detach operations on all the databases. I'm looking for suggestions. I don't have much experience with sqlite so I believe it's likely I misunderstood a few things and will probably run into some other limits. Let me know if you think some of the above is good or just can't work at all, and let me know if you have any other ideas. Thanks! Simon ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users