Re: [sqlite] very large SQLite tables

2009-07-18 Thread Christian Smith
On Wed, Jun 24, 2009 at 02:21:09PM -0500, Matthew O'Keefe wrote: > We are using SQLite for indexing a huge number (i.e., 100 million to 1 billion) of key pairs that are represented by an 88-byte key. We are using a single table with a very large number of rows (one for each data

Re: [sqlite] very large SQLite tables

2009-07-07 Thread Matthew O'Keefe
We appreciate all the feedback on our questions regarding large SQLite tables. We are running a variety of performance tests and hope to post the results soon. This might provide the start of a thread regarding performance tuning SQLite for this particular workload. Thanks, Matt

Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
I have actually implemented such a structure, and it worked well. Kosenko Max wrote: > You're talking about a db size much less than 1 billion records. > In a 1-billion-record db with the described scenario, the cache hit ratio is so small that everything you're talking about makes very close to zero

Re: [sqlite] very large SQLite tables

2009-06-27 Thread Kosenko Max
You're talking about a db size much less than 1 billion records. In a 1-billion-record db with the described scenario, the cache hit ratio is so small that everything you're talking about makes very close to zero difference in effect, because 1 uncached random IO operation is ~10ms. Any reasonable calculations
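
A quick back-of-envelope in Python makes the point (the 10 ms figure is from the message above; the page count is a hypothetical stand-in for however many pages an uncached B-tree descent actually touches):

    # Worst case: every page touched on the way down the B-tree misses the
    # cache and costs one uncached random IO of ~10 ms.
    io_ms = 10          # one uncached random IO, in milliseconds
    pages_touched = 5   # hypothetical B-tree depth for a billion-row table

    ms_per_insert = io_ms * pages_touched
    print(ms_per_insert, "ms per insert,",
          round(1000 / ms_per_insert), "inserts/sec")
    # -> 50 ms per insert, 20 inserts/sec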

Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
Quite wrong. Searching a B-tree is relatively inexpensive, but node splits are expensive. Inserting a non-terminal key into a part-filled leaf node is cheap, inserting a terminal key is more expensive, and a split is more expensive again. The reason we spend the extra resources maintaining

Re: [sqlite] very large SQLite tables

2009-06-27 Thread Kosenko Max
The expense in a B-tree is not in the node splitting (which really doesn't happen that often and takes a small amount of time). As I've said, it's in finding the right place to insert. A root level which takes 1 page will do the same as your hash index, and will use much less space in the cache. This root page in such a DB

Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
This technique is used extensively in disk caching and in maintaining file directories with huge numbers of files. I would expect it to increase key insertion speed because it removes a level of index in the B-tree of each index. The expensive activity in a B-tree index insertion is a node

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
Well, I understand the idea in general and how it works. But as you described in the second part of your letter, this won't help. Even if you create 100 tables, that will save you just 1 step out of 5-7 IO steps, but it won't make the cache hit ratio significantly higher. And I'm pretty sure that even

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Jay A. Kreibich
On Fri, Jun 26, 2009 at 10:06:48AM -0700, Kosenko Max scratched on the wall: > Doug Fajardo wrote: >> No, I admit I haven't tried this under SQLite. >> Whether this approach will help for the specific application will depend on data usage patterns, which we haven't delved into for

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
Doug Fajardo wrote: > No, I admit I haven't tried this under SQLite. > Whether this approach will help for the specific application will depend on data usage patterns, which we haven't delved into for this application. > Call me simple: since the main issue is degraded performance with

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Douglas E. Fajardo
a suggestion :-) *** Doug Fajardo Kosenko Max wrote (Friday, June 26, 2009 4:06 AM): > Have you ever tested

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
John Stanton-3 wrote: > Why would it not work? It is just adding an extra top level to the index. A tried and true method. It will work, but it won't give a performance benefit. And from my understanding it will even slow things down. You can place parts of the index in different DBs and on different

Re: [sqlite] very large SQLite tables

2009-06-26 Thread John Stanton
Why would it not work? It is just adding an extra top level to the index. A tried and true method. Kosenko Max wrote: > Have you ever tested such a proposal? I believe that doesn't work. > Doug Fajardo wrote: >> One approach might be to split the big, monolithic table into some

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
I forgot to say something about the hash... My personal choice would be the MurmurHash2 64-bit function: http://murmurhash.googlepages.com/ http://en.wikipedia.org/wiki/MurmurHash2 (lots of implementations there). It's fast (even in managed impls), has good characteristics, and is free. Don't use CRC64... P.S. You
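
For reference, a pure-Python port of Austin Appleby's public-domain MurmurHash64A (a readability sketch; any of the native implementations linked above will be much faster):

    def murmurhash64a(data: bytes, seed: int = 0) -> int:
        # Port of the C reference, which reads 8-byte blocks in native
        # (little-endian) order; all arithmetic is masked to 64 bits.
        m = 0xc6a4a7935bd1e995
        r = 47
        mask = (1 << 64) - 1
        h = (seed ^ (len(data) * m)) & mask
        nblocks = len(data) // 8
        for i in range(nblocks):
            k = int.from_bytes(data[8 * i:8 * i + 8], "little")
            k = (k * m) & mask
            k ^= k >> r
            k = (k * m) & mask
            h ^= k
            h = (h * m) & mask
        tail = data[8 * nblocks:]   # remaining 1-7 bytes, if any
        if tail:
            h ^= int.from_bytes(tail, "little")
            h = (h * m) & mask
        h ^= h >> r
        h = (h * m) & mask
        h ^= h >> r
        return h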

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
Matthew O'Keefe wrote: > We wanted to post to the mailing list to see if there are any obvious, first-order things we can try to improve performance for such a large table. The problem with slow inserts, generally speaking, lies in cache misses. Imagine that each new insert in

Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max
Have you ever tested such a proposal? I believe that doesn't work. Doug Fajardo wrote: > One approach might be to split the big, monolithic table into some number of hash buckets, where each 'bucket' is a separate table. When doing a search, the program calculates the hash and accesses
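
For concreteness, a minimal sketch of the proposal being quoted, using Python's sqlite3 module. Table and column names are hypothetical, and md5 stands in for whatever hash is finally chosen (MurmurHash2 is suggested elsewhere in this thread):

    import hashlib
    import sqlite3

    NBUCKETS = 64  # arbitrary

    def bucket_for(key: str) -> int:
        # Route each key to one of NBUCKETS tables by hashing it.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % NBUCKETS

    db = sqlite3.connect("chunks.db")
    for i in range(NBUCKETS):
        db.execute(f"CREATE TABLE IF NOT EXISTS chunks_{i} "
                   "(key TEXT PRIMARY KEY, chunk_id INTEGER)")

    def insert(key: str, chunk_id: int) -> None:
        db.execute(f"INSERT INTO chunks_{bucket_for(key)} VALUES (?, ?)",
                   (key, chunk_id))

    def lookup(key: str):
        cur = db.execute(
            f"SELECT chunk_id FROM chunks_{bucket_for(key)} WHERE key = ?",
            (key,))
        row = cur.fetchone()
        return row[0] if row else None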

Re: [sqlite] very large SQLite tables

2009-06-25 Thread Ken
Along the same lines, the buckets could be created in their own unique SQLite DB, thus improving concurrency as well! --- On Thu, 6/25/09, Douglas E. Fajardo <dfaja...@symark.com> wrote:
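
A sketch of this variant, reusing bucket_for() and NBUCKETS from the bucketed-table example earlier in this digest (file and table names are hypothetical). One database file per bucket means one lock per bucket, so writers to different buckets need not block each other:

    import sqlite3

    conns = [sqlite3.connect(f"bucket_{i:03d}.db") for i in range(NBUCKETS)]
    for conn in conns:
        conn.execute("CREATE TABLE IF NOT EXISTS chunks "
                     "(key TEXT PRIMARY KEY, chunk_id INTEGER)")

    def insert(key: str, chunk_id: int) -> None:
        conn = conns[bucket_for(key)]
        with conn:  # commit each insert in its own small transaction
            conn.execute("INSERT INTO chunks VALUES (?, ?)", (key, chunk_id))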

Re: [sqlite] very large SQLite tables

2009-06-25 Thread Simon Slavin
On 25 Jun 2009, at 5:43pm, P Kishor wrote: > heck! Do two comparisons -- SQLite v. BerkeleyDB v. Tokyo Cabinet. > Nothing like thorough testing for the purpose of science. :-) Yes, there is: keeping to departmental budget and project deadlines. (I know, the argument is that the research will

Re: [sqlite] very large SQLite tables

2009-06-25 Thread P Kishor
On Thu, Jun 25, 2009 at 12:36 PM, David Fletcher wrote: > >> "PK" == P Kishor writes: > >>> As expected, as the table grows, the underlying B-tree implementation for SQLite means that the number of disk accesses to (a) find, and (b) add a chunk, grows larger and

Re: [sqlite] very large SQLite tables

2009-06-25 Thread David Fletcher
> "PK" == P Kishor writes: >> As expected, as the table grows, the underlying B-tree >> implementation for SQLite means that the number of disks accesses >> to (a) find, and (b) add a chunk, grows larger and larger. We¹ve >> tested up to 20 million chunks represented in the table: as >>

Re: [sqlite] very large SQLite tables

2009-06-25 Thread P Kishor
On Wed, Jun 24, 2009 at 3:21 PM, Matthew O'Keefe wrote: > We are using SQLite for indexing a huge number (i.e., 100 million to 1 billion) of key pairs that are represented by an 88-byte key. We are using a single table with a very large number of rows (one for

Re: [sqlite] very large SQLite tables

2009-06-25 Thread Douglas E. Fajardo
Matthew O'Keefe wrote (Wednesday, June 24, 2009 12:21 PM): We are using SQLite for indexing a huge number (i.e., 100 million to 1 billion) of key pairs that are represented by an 88-byte key. We are using a single table with a very large

Re: [sqlite] very large SQLite tables

2009-06-25 Thread D. Richard Hipp
On Jun 24, 2009, at 3:21 PM, Matthew O'Keefe wrote: > disk accesses to (a) find, and (b) add a chunk, grows larger and larger. We've tested up to 20 million chunks represented in the table: as expected, performance exponentially decreases as the number of table entries grows. Did

[sqlite] very large SQLite tables

2009-06-25 Thread Matthew O'Keefe
We are using SQLite for indexing a huge number (i.e., 100 million to 1 billion) of key pairs that are represented by an 88-byte key. We are using a single table with a very large number of rows (one for each data chunk). The table has two columns: one is of type "text" and the
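
The message is cut off above, so here is a purely hypothetical rendering of the schema described (the column names and the second column's type are guesses):

    import sqlite3

    db = sqlite3.connect("chunks.db")
    # One row per data chunk; the 88-byte key is stored as text, and the
    # TEXT PRIMARY KEY creates the B-tree index whose growth the thread
    # is discussing.
    db.execute("CREATE TABLE chunks (key TEXT PRIMARY KEY, chunk_id INTEGER)")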