On Wed, Jun 24, 2009 at 02:21:09PM -0500, Matthew O'Keefe wrote:
>
>
> We are using SQLite for indexing a huge number (i.e., 100 million to 1
> billion) of key pairs
> that are represented by an 88-byte key. We are using a single table with a
> very large number of rows (one for each data chunk),
>
We appreciate all the feedback on our questions regarding large SQLite
tables. We are running a variety of performance tests and hope to post the
results soon. This might provide the start of a thread regarding performance
tuning SQLite for this particular workload.
Thanks,
Matt
I have actually implemented such a structure, and it worked well.
Kosenko Max wrote:
> You're talking about a db size much less than 1 billion records.
>
> In a 1 billion record db with the described scenario, the cache hit ratio is so
> small that everything you're talking about makes very close to zero difference
You're talking about a db size much less than 1 billion records.
In a 1 billion record db with the described scenario, the cache hit ratio is so
small that everything you're talking about makes very close to zero difference
in effect. Because 1 uncached random IO operation is 10ms. Any reasonable
calculations (i
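To make that concrete, here is a rough back-of-envelope sketch of what an
uncached operation against a 1-billion-row B-tree costs. The page fanout is my
assumption; only the 10ms-per-random-IO figure comes from the message above.

    import math

    rows = 1_000_000_000   # ~1 billion keys in one table
    fanout = 100           # assumed: roughly 100 keys per interior B-tree page
    io_ms = 10             # one uncached random IO, per the figure above

    depth = math.ceil(math.log(rows, fanout))   # B-tree levels touched per lookup
    # With a cold cache nearly every level is a random read, so one insert or
    # lookup costs roughly depth * io_ms milliseconds.
    print(depth, "levels =>", depth * io_ms, "ms/op =>",
          round(1000 / (depth * io_ms)), "ops/sec")

With those numbers that is about 5 levels, 50 ms per operation, and only ~20
operations per second once the cache stops helping, which is consistent with
the 5-7 IO steps mentioned later in this thread.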
Quite wrong. Searching a B-Tree is relatively inexpensive, but node
splits are expensive.
Inserting a non-terminal key into a part-filled leaf node is cheap; inserting a
terminal key is more expensive, and a split is more expensive
again.
The reason we spend the extra resources maintaining B-tree
The expense in a B-Tree is not in the node splitting (that really isn't that
frequent and takes a small amount of time). As I've said, it's in finding the
right place to insert.
The root level, which takes 1 page, will do the same as your hash index, and will
use much less space in cache. This root page in such a DB wi
This technique is used extensively in disk caching and in maintaining
file directories with huge numbers of files.
I would expect it to increase key insertion speed because it removes a
level of index in the B-tree of each index. The expensive activity in a
B-tree index insertion is a node split
Well, I understand the idea in general and how it works. But as you have
described in the second part of your letter, this won't help. Even if you
create 100 tables, that will save you just 1 step out of 5-7 IO steps, but
won't make the cache hit ratio significantly higher. And I'm pretty sure that
even ha
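The "1 step out of 5-7" estimate follows from how little the tree depth shrinks
when the data is sharded; a quick sketch, again assuming a fanout of roughly
100 keys per page:

    import math

    rows, fanout, buckets = 1_000_000_000, 100, 100

    one_table = math.ceil(math.log(rows, fanout))             # one monolithic table
    per_bucket = math.ceil(math.log(rows / buckets, fanout))  # each of 100 bucket tables
    print(one_table, "levels vs", per_bucket, "levels per bucket")

Since log_100(100) = 1, splitting into 100 buckets removes only about one level
from each tree, while the total working set that would need to stay cached is
essentially unchanged.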
On Fri, Jun 26, 2009 at 10:06:48AM -0700, Kosenko Max scratched on the wall:
>
>
> Doug Fajardo wrote:
> > No, I admit I haven't tried this under SQLITE.
> >
> > Whether this approach will help for the specific application will depend
> > on data usage patterns, which we haven't delved into for
Doug Fajardo wrote:
> No, I admit I haven't tried this under SQLITE.
>
> Whether this approach will help for the specific application will depend
> on data usage patterns, which we haven't delved into for this application.
> Call me simple: since the main issue is degraded performance with large
That's why it is only a suggestion :-)
*** Doug Fajardo
-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org]
On Behalf Of Kosenko Max
Sent: Friday, June 26, 2009 4:06 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] very large SQLite tables
Have
John Stanton-3 wrote:
> Why would it not work? It is just adding an extra top level to the
> index. A tried and true method.
It will work. But it won't give a performance benefit. And from my understanding
it will even slow things down.
You can place parts of the index in different DBs and on different
Why would it not work? It is just adding an extra top level to the
index. A tried and true method.
Kosenko Max wrote:
> Have you ever tested such a proposal?
> I believe that doesn't work.
>
>
> Doug Fajardo wrote:
>
>> One approach might be to split the big, monolithic table into some number
I forgot to say about the hash...
My personal choice would be the MurmurHash2 64-bit function
http://murmurhash.googlepages.com/
http://en.wikipedia.org/wiki/MurmurHash2 - lots of implementations here
It's fast (even in managed impls), has good characteristics and is free.
Don't use CRC64...
P.S. You stil
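For illustration, a minimal sketch of turning the 88-byte key into a bucket
number via a 64-bit hash. MurmurHash2 itself needs a third-party binding, so
this sketch substitutes hashlib's blake2b truncated to 64 bits; the bucket
count is an assumption, not something from the thread.

    import hashlib

    NUM_BUCKETS = 256  # assumed bucket count

    def bucket_for(key: bytes) -> int:
        # Any well-distributed 64-bit hash works here; blake2b is only a
        # dependency-free stand-in for MurmurHash2.
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        return h % NUM_BUCKETS

    print(bucket_for(b"x" * 88))   # 88-byte key -> bucket index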
Matthew O'Keefe wrote:
> We wanted to post to the mailing list to see if there are any obvious,
> first-order things we can try to improve performance for such a large
> table.
The problem with slow inserts, generally speaking, lies in cache misses.
Imagine that each new insert in ind
Have you ever tested such a proposal?
I believe that doesn't work.
Doug Fajardo wrote:
>
> One approach might be to split the big, monolithic table into some number
> of hash buckets, where each 'bucket' is a separate table. When doing a
> search, the program calculates the hash and accesses/reads
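A minimal sketch of what that bucket layout could look like using Python's
sqlite3 module; the table names, column names, and bucket count below are
illustrative assumptions, not details from the thread.

    import hashlib
    import sqlite3

    NUM_BUCKETS = 256  # assumed

    def bucket_for(key: str) -> int:
        # 64-bit hash of the key (blake2b as a stand-in for whichever hash is chosen)
        h = hashlib.blake2b(key.encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % NUM_BUCKETS

    db = sqlite3.connect("chunks_buckets.db")
    for i in range(NUM_BUCKETS):
        # many small tables instead of one monolithic one
        db.execute(f"CREATE TABLE IF NOT EXISTS chunks_{i:03d} "
                   "(key TEXT PRIMARY KEY, chunk_id INTEGER)")

    def put(key: str, chunk_id: int) -> None:
        table = f"chunks_{bucket_for(key):03d}"
        db.execute(f"INSERT INTO {table} VALUES (?, ?)", (key, chunk_id))

    def get(key: str):
        table = f"chunks_{bucket_for(key):03d}"
        return db.execute(f"SELECT chunk_id FROM {table} WHERE key = ?",
                          (key,)).fetchone()

The same routing works if each bucket lives in its own database file (one
connection per file, or ATTACH), which is what the follow-up below suggests
for better concurrency.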
Along the same lines, the buckets could be created in their own unique SQLite
DBs, thus improving concurrency as well!
--- On Thu, 6/25/09, Douglas E. Fajardo wrote:
> From: Douglas E. Fajardo
> Subject: Re: [sqlite] very large SQLite tables
> To: "sqlite-users@sqlite.org"
On 25 Jun 2009, at 5:43pm, P Kishor wrote:
> heck! Do two comparisons -- SQLite v. BerkeleyDB v. Tokyo Cabinet.
>
> Nothing like thorough testing for the purpose of science. :-)
Yes, there is: keeping to departmental budget and project deadlines.
(I know, the argument is that the research will
On Thu, Jun 25, 2009 at 12:36 PM, David Fletcher wrote:
>
>> "PK" == P Kishor writes:
>
>>> As expected, as the table grows, the underlying B-tree
>>> implementation for SQLite means that the number of disk accesses
>>> to (a) find, and (b) add a chunk, grows larger and larger. We've
>>> tes
> "PK" == P Kishor writes:
>> As expected, as the table grows, the underlying B-tree
>> implementation for SQLite means that the number of disk accesses
>> to (a) find, and (b) add a chunk, grows larger and larger. We've
>> tested up to 20 million chunks represented in the table: as
>> exp
On Wed, Jun 24, 2009 at 3:21 PM, Matthew
O'Keefe wrote:
>
>
> We are using SQLite for indexing a huge number (i.e., 100 million to 1
> billion) of key pairs
> that are represented by an 88-byte key. We are using a single table with a
> very large number of rows (one for each data chunk), and two co
On Behalf Of Matthew O'Keefe
Sent: Wednesday, June 24, 2009 12:21 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] very large SQLite tables
We are using SQLite for indexing a huge number (i.e., 100 million to 1
billion) of key pairs
that are represented by an 88-byte key. We are using a single table wi
On Jun 24, 2009, at 3:21 PM, Matthew O'Keefe wrote:
> disk accesses to (a) find, and (b) add a chunk, grows larger and
> larger. We’ve tested up to 20 million chunks represented in the
> table: as expected performance exponentially decreases as the number
> of table entries grows.
Did yo
We are using SQLite for indexing a huge number (i.e., 100 million to 1
billion) of key pairs
that are represented by an 88-byte key. We are using a single table with a
very large number of rows (one for each data chunk), and two columns.
The table has two columns. One is of type "text" and the
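For anyone wanting to reproduce the baseline being benchmarked, a minimal
sketch of a single-table schema along these lines; the table name and the
second column's type are my assumptions, since the message is cut off here in
the archive.

    import sqlite3

    db = sqlite3.connect("chunks.db")
    # One row per data chunk: an 88-byte text key plus a second column,
    # assumed here to be an integer id (the original post is truncated).
    db.execute("""
        CREATE TABLE IF NOT EXISTS chunks (
            key      TEXT PRIMARY KEY,   -- the 88-byte key
            chunk_id INTEGER
        )""")
    db.execute("INSERT OR REPLACE INTO chunks VALUES (?, ?)", ("k" * 88, 1))
    print(db.execute("SELECT chunk_id FROM chunks WHERE key = ?",
                     ("k" * 88,)).fetchone())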