Re: [sqlite] very large SQLite tables

2009-07-18 Thread Christian Smith
On Wed, Jun 24, 2009 at 02:21:09PM -0500, Matthew O'Keefe wrote:
> 
> 
> We are using SQLite for indexing a huge number (i.e., 100 million to 1
> billion) of key pairs
> that are represented by an 88-byte key. We are using a single table with a
> very large number of rows (one for each data chunk), and two columns.
> 
> The table has two columns.  One is of type "text" and the other is type
> "integer".
> > 
> > The table is created with:
> > 
> > CREATE TABLE chunks
> > (
> >   name text primary key,
> >   pid integer not null
> );
> 
> As expected, as the
> table grows, the underlying B-tree implementation for SQLite means that the
> number of
> disks accesses to (a) find, and (b) add a chunk, grows larger and larger.
> We've tested up
> to 20 million chunks represented in the table: as expected performance
> exponentially 
> decreases as the number of table entries grows.
> 
> We wanted to post to the mailing list to see if there are any obvious,
> first-order things
> we can try to improve performance for such a large table.

Bit late to the game...

Try increasing your page size. A larger page size will result in greater
fan-out of the btree, giving a shallower tree and fewer IO requests.
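
For illustration, a minimal sketch of that change (using Python's built-in
sqlite3 module; the 8 KB value and the file name are placeholders, not
recommendations):

import sqlite3

# autocommit mode, so the VACUUM below is not wrapped in an implicit transaction
conn = sqlite3.connect("chunks.db", isolation_level=None)

# A bigger page holds more keys per B-tree node, so the tree is shallower and
# a lookup touches fewer pages. The setting only takes effect on a new, empty
# database, or after VACUUM rebuilds an existing one.
conn.execute("PRAGMA page_size = 8192")
conn.execute("VACUUM")

print(conn.execute("PRAGMA page_size").fetchone()[0])   # should now report 8192
conn.close()

Note that cache_size is counted in pages, so doubling the page size also
roughly doubles the cache footprint for the same page count.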

Christian


Re: [sqlite] very large SQLite tables

2009-07-07 Thread Matthew O'Keefe
>

We appreciate all the feedback on our questions regarding large SQLite  
tables. We
are running a variety of performance tests and hope to post the  
results soon. This might
provide the start of a thread regarding performance tuning SQLite for  
this particular workload.

Thanks,
Matt

>
> heck! Do two comparisons -- SQLite v. BerkeleyDB v. Tokyo Cabinet.
>
> Nothing like thorough testing for the purpose of science. :-)
>
>
>
> -- 
> Puneet Kishor
> "assertions are politics... backing up assertions with evidence is  
> science"
>



Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
I have actually implemented such a structure, and it worked well.

Kosenko Max wrote:
> You're talking about a db size much smaller than 1 billion records.
>
> In a 1-billion-record db with the described scenario, the cache hit ratio is
> so small that everything you're talking about makes very close to zero
> difference in practice, because one uncached random IO operation costs about
> 10ms. Any reasonable computation (in the current scenario), or even saving a
> new page near the current page, costs far less than 10ms.
>
> And that's what I've said - that proposal won't help and will make things a
> bit worse and more complex.
>
> In the future, when we have all forgotten the 100 IOPS wall and are hitting
> 100K-1M IOPS, your assumptions might hold for a large DB. But I'm sure that
> tricks like splitting the table into 100-1 tables with a hash wheel in mind
> won't give any significant additional benefit. Hash tables can be faster when
> you don't need range operations, but placing a hash on top of a B-Tree to
> eliminate a single B-Tree page shouldn't give any speedup.
>
> If you have proof that this trick still works, I will be glad to see a code
> sample with benchmarks.
>
> Thanks.
> Max.
>
>
> John Stanton-3 wrote:
>   
>> Quite wrong.  Searching a B-Tree is relatively inexpensive, but node 
>> splits are expensive.
>>
>> Inserting a non-terminal key into a part-filled leaf node is cheap, 
>> inserting a terminal key is more expensive, and a split is more expensive 
>> again.
>>
>> The reason we spend the extra resources maintaining B-tree indices is 
>> that they keep the keys in sorted sequence.  If maintaining keys 
>> in order is not required, a hashing method can be faster.
>>
>> Our fastest B-Tree indices use the virtual memory capability of the OS 
>> as a cache and perform very well by avoiding buffer shadowing and 
>> maximizing utilization of physical memory.
>> 



Re: [sqlite] very large SQLite tables

2009-06-27 Thread Kosenko Max

You're talking about a db size much smaller than 1 billion records.

In a 1-billion-record db with the described scenario, the cache hit ratio is
so small that everything you're talking about makes very close to zero
difference in practice, because one uncached random IO operation costs about
10ms. Any reasonable computation (in the current scenario), or even saving a
new page near the current page, costs far less than 10ms.

And that's what I've said - that proposal won't help and will make things a
bit worse and more complex.

In the future, when we have all forgotten the 100 IOPS wall and are hitting
100K-1M IOPS, your assumptions might hold for a large DB. But I'm sure that
tricks like splitting the table into 100-1 tables with a hash wheel in mind
won't give any significant additional benefit. Hash tables can be faster when
you don't need range operations, but placing a hash on top of a B-Tree to
eliminate a single B-Tree page shouldn't give any speedup.

If you have proof that this trick still works, I will be glad to see a code
sample with benchmarks.

Thanks.
Max.


John Stanton-3 wrote:
> 
> Quite wrong.  Searching a B-Tree is relatively inexpensive, but node 
> splits are expensive.
> 
> Inserting a non-terminal key into a part-filled leaf node is cheap, 
> inserting a terminal key is more expensive, and a split is more expensive 
> again.
> 
> The reason we spend the extra resources maintaining B-tree indices is 
> that they keep the keys in sorted sequence.  If maintaining keys 
> in order is not required, a hashing method can be faster.
> 
> Our fastest B-Tree indices use the virtual memory capability of the OS 
> as a cache and perform very well by avoiding buffer shadowing and 
> maximizing utilization of physical memory.


Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
Quite wrong.  Searching a B-Tree is relatively inexpensive, but node 
splits are expensive.

Inserting a non-terminal key into a part-filled leaf node is cheap, 
inserting a terminal key is more expensive, and a split is more expensive 
again.

The reason we spend the extra resources maintaining B-tree indices is 
that they keep the keys in sorted sequence.  If maintaining keys 
in order is not required, a hashing method can be faster.

Our fastest B-Tree indices use the virtual memory capability of the OS 
as a cache and perform very well by avoiding buffer shadowing and 
maximizing utilization of physical memory.

Kosenko Max wrote:
> The expense in a B-Tree is not in the node splitting (which really doesn't
> happen that often and takes little time). As I've said, it's in finding the
> right place to insert.
>
> The root level, which takes one page, will do the same job as your hash
> index, and will use much less space in the cache. That root page in such a
> DB will always be in the cache, so you won't gain any advantage at all.
> Multi-threading also won't benefit from multiple tables, at least in SQLite.
>
> That method is called partitioning. It gives advantages when the partitions
> are divided by some logic and there is a high chance of hitting fewer
> partitions on average. It can also help a bit when the RDBMS supports real
> parallel execution and you have a lot of hard drives. That is not the case
> with SQLite (well, you can compile without thread safety and try to do your
> own locks).
>
> I have actually posted a real proposal to make the DB much faster. That will
> work.
> The proposal with 100 tables as hash buckets doesn't work, and I checked
> that a long time ago.
> Do you have a sample where it works and gives any visible benefit? I'd like
> to see that.
>
> Another addition to my proposal is to use an SSD with as small an average
> access time as possible. Some of them can easily be 50-100x faster, and
> that will give 20-50x faster inserts.
>
> Thank you.
> Max.
>
>
> John Stanton-3 wrote:
>   
>> This technique is used extensively in disk caching and in maintaining 
>> file directories with huge numbers of files.
>>
>> I would expect it to increase key insertion speed because it removes a 
>> level of index in the B-tree of each index.  The expensive activity in a 
>> B-tree index insertion is a node split, which requires that key 
>> information be updated at each internal node level and possibly a new 
>> level added.  Fewer levels mean faster performance.
>>
>> This method could also be used to add parallelism by having multiple 
>> threads or processes perform insertions concurrently.  Having each 
>> table in a separate database would help this approach.
>> It would also help with concurrent read accesses.
>>
>> If this application only has one table and does not need SQL, then there 
>> are better solutions than using SQLite and paying the price for its many 
>> features without using them.
>> 
>
>   



Re: [sqlite] very large SQLite tables

2009-06-27 Thread Kosenko Max

The expense in a B-Tree is not in the node splitting (which really doesn't
happen that often and takes little time). As I've said, it's in finding the
right place to insert.

The root level, which takes one page, will do the same job as your hash
index, and will use much less space in the cache. That root page in such a
DB will always be in the cache, so you won't gain any advantage at all.
Multi-threading also won't benefit from multiple tables, at least in SQLite.

That method is called partitioning. It gives advantages when the partitions
are divided by some logic and there is a high chance of hitting fewer
partitions on average. It can also help a bit when the RDBMS supports real
parallel execution and you have a lot of hard drives. That is not the case
with SQLite (well, you can compile without thread safety and try to do your
own locks).

I have actually posted a real proposal to make the DB much faster. That will
work.
The proposal with 100 tables as hash buckets doesn't work, and I checked
that a long time ago.
Do you have a sample where it works and gives any visible benefit? I'd like
to see that.

Another addition to my proposal is to use an SSD with as small an average
access time as possible. Some of them can easily be 50-100x faster, and that
will give 20-50x faster inserts.

Thank you.
Max.


John Stanton-3 wrote:
> This technique is used extensively in disk caching and in maintaining 
> file directories with huge numbers of files.
> 
> I would expect it to increase key insertion speed because it removes a 
> level of index in the B-tree of each index.  The expensive activity in a 
> B-tree index insertion is a node split, which requires that key 
> information be updated at each internal node level and possibly a new 
> level added.  Fewer levels mean faster performance.
> 
> This method could also be used to add parallelism by having multiple 
> threads or processes perform insertions concurrently.  Having each 
> table in a separate database would help this approach.
> It would also help with concurrent read accesses.
> 
> If this application only has one table and does not need SQL, then there 
> are better solutions than using SQLite and paying the price for its many 
> features without using them.



Re: [sqlite] very large SQLite tables

2009-06-27 Thread John Stanton
This technique is used extensively in disk caching and in maintaining 
file directories with huge numbers of files.

I would expect it to increase key insertion speed because it removes a 
level of index in the B-tree of each index.  The expensive activity in a 
B-tree index insertion is a node split, which requires that key 
information be updated at each internal node level and possibly a new 
level added.  Fewer levels mean faster performance.

This method could also be used to add parallelism by having multiple 
threads or processes perform insertions concurrently.  Having each 
table in a separate database would help this approach.
It would also help with concurrent read accesses.

If this application only has one table and does not need SQL, then there 
are better solutions than using SQLite and paying the price for its many 
features without using them.

Kosenko Max wrote:
> John Stanton-3 wrote:
>   
>> Why would it not work?  It is just adding an extra top level to the 
>> index.  A tried and true method.
>> 
>
> It will work, but it won't give a performance benefit. And from my
> understanding, it will even slow things down.
> You can place parts of the index in different DBs and on different HDDs,
> thinking it will boost the performance.
>
> But the problem is that we aren't talking about multiple selects at the same
> time... We are talking about updating an index in SQLite, which is
> single-threaded, and even under load that wouldn't give you any advantage.
>
> Moreover, you're emulating part of the B-Tree with that approach and making
> it slower and more space-consuming. So will it work as a solution? No.
>
> Do you have an idea why it should work? Tell me.
>   



Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max

Well, I understand the idea in general and how it works. But, as you
described in the second part of your letter, this won't help. Even if you
create 100 tables, that saves you just 1 step out of 5-7 IO steps, and it
won't make the cache hit ratio significantly higher. And I'm pretty sure
that even with a Most Recently Used cache instead of a Most Often Used one
in SQLite, the underlying OS will cache that step (the root page) for you.
But having 100 extra tables adds a lot of overhead, so as a result you won't
see any benefit in theory and, I'm sure, in practice.

BTW, the default page size is now not less than the cluster size, which most
of the time is > 4K.

Thanks. Max.


Jay A. Kreibich-2 wrote:
> 
>   Assuming we have a huge number of data points and that our operations
>   are on random rows, it would be possible to quickly develop the
>   situation you describe: the cache hit-ratio crashes, and each and
>   every B-tree traversal requires us to pull a page off disk.  This
>   makes a deep tree traversal very expensive, as moving across
>   each level of the tree requires I/O.
> 
>   Now consider the hash system.  We set up 100 tables in a database and
>   use a hash of the key to figure out which table to access.  From
>   there, it is more or less the same thing.
> 
>   What we're doing is cutting the top of the B-tree off and reducing it
>   to 100 (or whatever) sub-trees.  The whole "super-tree" that we've cut
>   off is replaced by the hash algorithm, allowing us to jump right to
>   the correct sub-tree and start our tree traversal there.  This skips
>   traversal of the layers that make up the "super-tree" structure,
>   saving on the I/O.
> 
>   At least in theory.
> 
>   The problem is twofold.  This depends on the cache-replacement
>   algorithm used by SQLite (about which I know nothing), but in
>   theory the nodes that make up the "super-tree" are exactly the nodes
>   you would expect to remain in the cache.  They're used by every
>   lookup on the dataset, after all.  Even if the replacement algorithm
>   is random and they're written over, they're going to be pulled back
>   in soon enough.
> 
>   Second, if I understand the BTree node structure used in SQLite, a
>   single node can hold a fairly hefty number of children.  This is not a
>   binary tree, that's for sure. This means you're not cutting off all
>   that many layers, even with 100 or more tables, which means you're
>   not actually going to see a ton of savings.
> 
>   Overall, I agree that the OP will likely see a noticeable improvement 
>   if they boost the cache size.  Bumping it up 10x or even 100x on a
>   modern workstation is not that big of a deal (100x ~= 300MB if you
>   start with the default 1K page and 2000 page cache).  We see the same
>   thing when you build a big index (which is accessing semi-random
>   data and doing a lot of tree node shuffling).  The best way to
>   increase index build performance is to boost the cache size so that
>   as much of the Btree as possible is always in RAM.  I suspect this is
>   much the same case.
> 



Re: [sqlite] very large SQLite tables

2009-06-26 Thread Jay A. Kreibich
On Fri, Jun 26, 2009 at 10:06:48AM -0700, Kosenko Max scratched on the wall:
> 
> 
> Doug Fajardo wrote:
> > No, I admit I haven't tried this under SQLITE.
> > 
> > Whether this approach will help for the specific application will depend
> > on data usage patterns, which we haven't delved into for this application.
> > Call me simple: since the main issue is degraded performance with larger
> > groupings of data, it seemed to make sense that breaking the data into
> > smaller groupings would help. 
> > 
> > Of course it's very possible that the size of the database in question may
> > mean that the number of hash buckets needed to reap significant benefits
> > makes this approach counter-productive. That's why it is only a suggestion
> > :-)
> 
> I think your assumptions are wrong to begin with. I just can't imagine a
> scenario where your proposal will give any benefit, short of a bad B-Tree
> implementation, which is not the case with SQLite.
> 
> As I posted in my answer to the thread-starter, the performance degradation
> comes from the cache hit ratio shrinking as the amount of data grows. Your
> proposal may work if the incoming keys are not random and hit only a few
> tables, which gives a higher cache hit ratio. But if that's the case, the
> same will happen with a single B-Tree. And if that's the case, why not feed
> the data sorted? By the same logic, and as has been shown, SQLite will then
> insert the data without any performance degradation.
> 
> Could you describe a situation in which your proposal would help, and why?


  Assuming we have a huge number of data points and that our operations
  are on random rows, it would be possible to quickly develop the
  situation you describe: the cache hit-ratio crashes, and each and
  every B-tree traversal requires us to pull a page off disk.  This
  makes a deep tree traversal very expensive, as moving across
  each level of the tree requires I/O.

  Now consider the hash system.  We set up 100 tables in a database and
  use a hash of the key to figure out which table to access.  From
  there, it is more or less the same thing.

  What we're doing is cutting the top of the B-tree off and reducing it
  to 100 (or whatever) sub-trees.  The whole "super-tree" that we've cut
  off is replaced by the hash algorithm, allowing us to jump right to
  the correct sub-tree and start our tree traversal there.  This skips
  traversal of the layers that make up the "super-tree" structure,
  saving on the I/O.

  At least in theory.

  The problem is twofold.  This depends on the cache-replacement
  algorithm used by SQLite (about which I know nothing), but in
  theory the nodes that make up the "super-tree" are exactly the nodes
  you would expect to remain in the cache.  They're used by every
  lookup on the dataset, after all.  Even if the replacement algorithm
  is random and they're written over, they're going to be pulled back
  in soon enough.

  Second, if I understand the BTree node structure used in SQLite, a
  single node can hold a fairly hefty number of children.  This is not a
  binary tree, that's for sure. This means you're not cutting off all
  that many layers, even with 100 or more tables, which means you're
  not actually going to see a ton of savings.



  Overall, I agree that the OP will likely see a noticeable improvement 
  if they boost the cache size.  Bumping it up 10x or even 100x on a
  modern workstation is not that big of a deal (100x ~= 300MB if you
  start with the default 1K page and 2000 page cache).  We see the same
  thing when you build a big index (which is accessing semi-random
  data and doing a lot of tree node shuffling).  The best way to
  increase index build performance is to boost the cache size so that
  as much of the Btree as possible is always in RAM.  I suspect this is
  much the same case.
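
  As a rough sketch of that tuning (Python's sqlite3 module here; the
  200,000-page figure is only an example along the lines of the estimate
  above):

import sqlite3

conn = sqlite3.connect("chunks.db")

# The default cache is about 2000 pages; with 1 KB pages that is only ~2 MB.
# A 100x larger cache keeps far more of the B-tree's interior nodes in RAM.
conn.execute("PRAGMA cache_size = 200000")        # counted in pages, per connection

page_size  = conn.execute("PRAGMA page_size").fetchone()[0]
cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
print("approximate cache footprint: %.0f MB" % (page_size * cache_size / 1024.0 / 1024.0))

conn.close()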

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson


Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max


Doug Fajardo wrote:
> No, I admit I haven't tried this under SQLITE.
> 
> Whether this approach will help for the specific application will depend
> on data usage patterns, which we haven't delved into for this application.
> Call me simple: since the main issue is degraded performance with larger
> groupings of data, it seemed to make sense that breaking the data into
> smaller groupings would help. 
> 
> Of course it's very possible that the size of the database in question may
> mean that the number of hash buckets needed to reap significant benefits
> makes this approach counter-productive. That's why it is only a suggestion
> :-)

I think your assumptions are wrong to begin with. I just can't imagine a
scenario where your proposal will give any benefit, short of a bad B-Tree
implementation, which is not the case with SQLite.

As I posted in my answer to the thread-starter, the performance degradation
comes from the cache hit ratio shrinking as the amount of data grows. Your
proposal may work if the incoming keys are not random and hit only a few
tables, which gives a higher cache hit ratio. But if that's the case, the
same will happen with a single B-Tree. And if that's the case, why not feed
the data sorted? By the same logic, and as has been shown, SQLite will then
insert the data without any performance degradation.

Could you describe a situation in which your proposal would help, and why?



Re: [sqlite] very large SQLite tables

2009-06-26 Thread Douglas E. Fajardo
No, I admit I haven't tried this under SQLITE.

Whether this approach will help for the specific application will depend on 
data usage patterns, which we haven't delved into for this application. Call me 
simple: since the main issue is degraded performance with larger groupings of 
data, it seemed to make sense that breaking the data into smaller groupings 
would help. 

Of course it's very possible that the size of the database in question may mean 
that the number of hash buckets needed to reap significant benefits makes this 
approach counter-productive. That's why it is only a suggestion :-)

*** Doug Fajardo

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Kosenko Max
Sent: Friday, June 26, 2009 4:06 AM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] very large SQLite tables


Have you ever tested such a proposal?
I believe it doesn't work.


Doug Fajardo wrote:
> 
> One approach might be to split the big, monolithic table into some number
> of hash buckets, where each 'bucket' is a separate table. When doing a
> search, the program calculates the hash and reads only the bucket
> that is needed.
> 
> This approach also has the potential for allowing multiple databases,
> where tables would be spread across the different databases. The databases
> could be spread across multiple drives to improve performance.
> 


Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max


John Stanton-3 wrote:
> Why would it not work?  It is just adding an extra top level to the 
> index.  A tried and true method.

It will work, but it won't give a performance benefit. And from my
understanding, it will even slow things down.
You can place parts of the index in different DBs and on different HDDs,
thinking it will boost performance.

But the problem is that we aren't talking about multiple selects at the same
time... We are talking about updating an index in SQLite, which is
single-threaded, and even under load that wouldn't give you any advantage.

Moreover, you're emulating part of the B-Tree with that approach and making
it slower and more space-consuming. So will it work as a solution? No.

Do you have an idea why it should work? Tell me.


Re: [sqlite] very large SQLite tables

2009-06-26 Thread John Stanton
Why would it not work?  It is just adding an extra top level to the 
index.  A tried and true method.

Kosenko Max wrote:
> Have you ever tested such a proposal?
> I believe it doesn't work.
>
>
> Doug Fajardo wrote:
>   
>> One approach might be to split the big, monolithic table into some number
>> of hash buckets, where each 'bucket' is a separate table. When doing a
>> search, the program calculates the hash and reads only the bucket
>> that is needed.
>>
>> This approach also has the potential for allowing multiple databases,
>> where tables would be spread across the different databases. The databases
>> could be spread across multiple drives to improve performance.
>>
>> 



Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max

I forgot to mention the hash...

My personal choice would be the MurmurHash2 64-bit function:
http://murmurhash.googlepages.com/
http://en.wikipedia.org/wiki/MurmurHash2 - lots of implementations here

It's fast (even in managed implementations), has good characteristics, and is
free. Don't use CRC64...

P.S. You still have a chance of roughly 1/10`000`000`000 that two strings in a
1-billion-entry dictionary will have the same hash. So you probably should
keep a very small table cached in memory that holds collision resolutions - a
string key that was remapped to another string key without a collision.
That's simple to do and removes the chance of collision while keeping the
additional checks very fast (due to the small size of the collision check
table - I believe you will never see anything in that table at all).
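
A minimal sketch of that scheme (Python's sqlite3 module, with the first 8
bytes of an MD5 digest standing in for MurmurHash2, and a simplified side
table for the rare collisions; it assumes each name is inserted only once):

import hashlib
import sqlite3
import struct

def key_hash(name):
    # Stand-in for a 64-bit MurmurHash2: first 8 bytes of MD5, as a signed
    # 64-bit integer so it fits an INTEGER PRIMARY KEY.
    return struct.unpack("<q", hashlib.md5(name.encode("utf-8")).digest()[:8])[0]

conn = sqlite3.connect("chunks.db", isolation_level=None)
conn.executescript("""
CREATE TABLE IF NOT EXISTS chunks     (nameHash INTEGER PRIMARY KEY, pid INTEGER NOT NULL);
CREATE TABLE IF NOT EXISTS collisions (name TEXT PRIMARY KEY, pid INTEGER NOT NULL);
""")

def insert_chunk(name, pid):
    try:
        conn.execute("INSERT INTO chunks VALUES (?, ?)", (key_hash(name), pid))
    except sqlite3.IntegrityError:
        # A different name already owns this hash value (astronomically rare):
        # fall back to the tiny side table keyed by the full 88-byte name.
        conn.execute("INSERT INTO collisions VALUES (?, ?)", (name, pid))

def lookup_pid(name):
    # Check the (normally empty) collision table first, then the main table.
    row = conn.execute("SELECT pid FROM collisions WHERE name = ?", (name,)).fetchone()
    if row is None:
        row = conn.execute("SELECT pid FROM chunks WHERE nameHash = ?",
                           (key_hash(name),)).fetchone()
    return None if row is None else row[0]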


Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max


Matthew O'Keefe wrote:
> We wanted to post to the mailing list to see if there are any obvious,
> first-order things we can try to improve performance for such a large
> table.

Generally speaking, the problem with slow inserts lies in cache misses.
Imagine that each new insert into the index is random. After some time, that
means the cache hit ratio becomes close to zero, which in turn means that on
an HDD you have to spend ~10ms on each B-Tree page read. Even if you don't
have any flushing at all, that gives you ~15 search/insert operations per
second (assuming a tree depth of ~6 levels). The more data you have, the
closer you will be to that limit.
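
As a quick back-of-the-envelope check (the 10 ms seek and ~6 levels are just
the assumptions above):

# ~6 uncached B-tree page reads per operation, ~10 ms per random HDD read
levels, seek_ms = 6, 10.0
print(1000.0 / (levels * seek_ms))   # ~16 search/insert operations per second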

There are several inefficiencies in the SQLite cache due to its simplicity,
and having several tables won't help. Having several files will spread the
cache across connections. A more efficient cache, better defragmentation, or
another approach to laying out the data can help, but not radically; that
just moves the wall, raising the floor from, say, 15 ops to 50 ops.

So what you should do instead:
1. Make the SQLite cache as large as possible.
2. Compact your data as much as possible - one option is to convert the
string key to an int64 hash value, which radically compacts the data.
3. Improve SQLite caching and send a patch to D. Richard Hipp :)

If your cache is larger than the DB, almost every insert will be very fast.
But with your requirement of 1B records, you will have something like 10-20GB
of SQLite db.
I don't know whether SQLite allows such large buffers to be defined in 64 bit,
and I also don't know whether you have that amount of RAM.

SOLUTION PROPOSAL

The first proposal is to feed the data sorted by key. That is always fast.

Another approach is to make delayed inserts. If you have peaks of inserts,
you can create a similar empty table and place new rows there. Queries should
use a UNION, and after some time, when you have i.e. 1 items in there, you
can insert them into the main table at once, in sorted-by-key order, and then
empty the side table again. That will be MUCH faster.


SAMPLE:

PRAGMA cache_size = 

CREATE TABLE chunks (nameHash integer primary key, pid integer not null);
CREATE TABLE chunksUpdates (nameHash integer primary key, pid integer not
null);

1. Insert with:
INSERT INTO chunksUpdates VALUES(CustomHashFunction('some 88 bytes long key
for whatever reason I need it'), 34234234);

2. Select with:
SELECT * FROM chunks WHERE nameHash = CustomHashFunction('some 88 bytes long
key for whatever reason I need it') UNION SELECT * FROM chunksUpdates WHERE
nameHash = CustomHashFunction('some 88 bytes long key for whatever reason I
need it') LIMIT 1;

3. From time to time, when the size of chunksUpdates reaches something like
1, do the following:

BEGIN EXCLUSIVE;
INSERT INTO chunks SELECT * FROM chunksUpdates ORDER BY nameHash;
DELETE FROM chunksUpdates;
END;

Updates and deletes are a different story with the same principles...

Making a custom hash function is really important for you, and it really
should be 64-bit based; otherwise you will get duplicates on collections of
1B items.
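
An end-to-end sketch of the same idea (Python's sqlite3 module; custom_hash()
is only a stand-in for CustomHashFunction, and the 10,000-row flush threshold
is an arbitrary choice, not a figure from this thread):

import hashlib
import sqlite3
import struct

FLUSH_THRESHOLD = 10000      # arbitrary; tune for your workload

# autocommit mode so BEGIN EXCLUSIVE / COMMIT below are under our control
conn = sqlite3.connect("chunks.db", isolation_level=None)
conn.executescript("""
CREATE TABLE IF NOT EXISTS chunks        (nameHash INTEGER PRIMARY KEY, pid INTEGER NOT NULL);
CREATE TABLE IF NOT EXISTS chunksUpdates (nameHash INTEGER PRIMARY KEY, pid INTEGER NOT NULL);
""")

def custom_hash(name):
    # stand-in for a real 64-bit hash such as MurmurHash2; must be stable across runs
    return struct.unpack("<q", hashlib.md5(name.encode("utf-8")).digest()[:8])[0]

def insert(name, pid):
    # new rows go into the small side table, which stays cache-resident
    conn.execute("INSERT INTO chunksUpdates VALUES (?, ?)", (custom_hash(name), pid))
    if conn.execute("SELECT count(*) FROM chunksUpdates").fetchone()[0] >= FLUSH_THRESHOLD:
        flush()

def lookup(name):
    h = custom_hash(name)
    return conn.execute(
        "SELECT pid FROM chunks WHERE nameHash = ? "
        "UNION SELECT pid FROM chunksUpdates WHERE nameHash = ? LIMIT 1",
        (h, h)).fetchone()

def flush():
    # merge the side table into the big table in key order, then empty it
    conn.execute("BEGIN EXCLUSIVE")
    conn.execute("INSERT INTO chunks SELECT * FROM chunksUpdates ORDER BY nameHash")
    conn.execute("DELETE FROM chunksUpdates")
    conn.execute("COMMIT")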


Re: [sqlite] very large SQLite tables

2009-06-26 Thread Kosenko Max

Have you ever tested such a proposal?
I believe it doesn't work.


Doug Fajardo wrote:
> 
> One approach might be to split the big, monolithic table into some number
> of hash buckets, where each 'bucket' is a separate table. When doing a
> search, the program calculates the hash and reads only the bucket
> that is needed.
> 
> This approach also has the potential for allowing multiple databases,
> where tables would be spread across the different databases. The databases
> could be spread across multiple drives to improve performance.
> 


Re: [sqlite] very large SQLite tables

2009-06-25 Thread Ken

Along the same lines, the buckets could be created in their own separate SQLite 
DBs, thus improving concurrency as well!!!
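
A minimal sketch of that layout (Python's sqlite3 module; four buckets, the
file names, and the CRC32 bucket-picker are arbitrary placeholders):

import sqlite3
import zlib

NUM_BUCKETS = 4   # arbitrary for the sketch

# One database file per bucket. SQLite locking is per file, so writers working
# on different buckets (e.g. from different processes) do not block one another.
buckets = [sqlite3.connect("bucket%d.db" % i) for i in range(NUM_BUCKETS)]
for c in buckets:
    c.execute("CREATE TABLE IF NOT EXISTS chunks (name TEXT PRIMARY KEY, pid INTEGER NOT NULL)")
    c.commit()

def bucket_for(name):
    # a stable hash of the key picks the bucket; zlib.crc32 is only a stand-in
    return zlib.crc32(name.encode("utf-8")) % NUM_BUCKETS

def insert(name, pid):
    c = buckets[bucket_for(name)]
    with c:                                   # commits on success
        c.execute("INSERT INTO chunks VALUES (?, ?)", (name, pid))

def lookup(name):
    c = buckets[bucket_for(name)]
    row = c.execute("SELECT pid FROM chunks WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]

The concurrency gain only materializes when different connections or processes
work on different bucket files, since SQLite locks whole database files.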

--- On Thu, 6/25/09, Douglas E. Fajardo  wrote:

> From: Douglas E. Fajardo 
> Subject: Re: [sqlite] very large SQLite tables
> To: "sqlite-users@sqlite.org" 
> Date: Thursday, June 25, 2009, 11:24 AM
> One approach might be to split the
> big, monolithic table into some number of hash buckets,
> where each 'bucket' is a separate table. When doing a search,
> the program calculates the hash and reads only the
> bucket that is needed.
> 
> This approach also has the potential for allowing multiple
> databases, where tables would be spread across the different
> databases. The databases could be spread across multiple
> drives to improve performance.
> 
> *** Doug Fajardo
> 
> 
> -Original Message-
> From: sqlite-users-boun...@sqlite.org
> [mailto:sqlite-users-boun...@sqlite.org]
> On Behalf Of Matthew O'Keefe
> Sent: Wednesday, June 24, 2009 12:21 PM
> To: sqlite-users@sqlite.org
> Subject: [sqlite] very large SQLite tables
> 
> 
> 
> We are using SQLite for indexing a huge number (i.e., 100
> million to 1
> billion) of key pairs
> that are represented by an 88-byte key. We are using a
> single table with a
> very large number of rows (one for each data chunk), and
> two columns.
> 
> The table has two columns.  One is of type "text"
> and the other is type
> "integer".
> > 
> > The table is created with:
> > 
> > CREATE TABLE chunks
> > (
> >   name text primary key,
> >   pid integer not null
> );
> 
> As expected, as the
> table grows, the underlying B-tree implementation for
> SQLite means that the
> number of
> disks accesses to (a) find, and (b) add a chunk, grows
> larger and larger.
> We've tested up
> to 20 million chunks represented in the table: as expected
> performance
> exponentially 
> decreases as the number of table entries grows.
> 
> We wanted to post to the mailing list to see if there are
> any obvious,
> first-order things
> we can try to improve performance for such a large table.
> 
> We really appreciate the efforts of the SQLite developer
> community!
> 
> Matt O'Keefe
> 
> sqlite-users@sqlite.org
> 


Re: [sqlite] very large SQLite tables

2009-06-25 Thread Simon Slavin

On 25 Jun 2009, at 5:43pm, P Kishor wrote:

> heck! Do two comparisons -- SQLite v. BerkeleyDB v. Tokyo Cabinet.
>
> Nothing like thorough testing for the purpose of science. :-)

Yes, there is: keeping to departmental budget and project deadlines.

(I know, the argument is that the research will pay for itself.)

Simon.


Re: [sqlite] very large SQLite tables

2009-06-25 Thread P Kishor
On Thu, Jun 25, 2009 at 12:36 PM, David Fletcher wrote:
>
>> "PK" == P Kishor  writes:
>
>>> As expected, as the table grows, the underlying B-tree
>>> implementation for SQLite means that the number of disks accesses
>>> to (a) find, and (b) add a chunk, grows larger and larger.  We've
>>> tested up to 20 million chunks represented in the table: as
>>> expected performance exponentially decreases as the number of table
>>> entries grows.
>
> PK> Why don't you use, or at least test, with BerkeleyDB? Since you have
> PK> only one table, you can hardly benefit from the SQLness of an rdb. If
> PK> nothing, you will have a point of comparison with another technology,
> PK> and know for sure if SQLite is the appropriate solution for you.
>
> Perhaps http://tokyocabinet.sf.net might be a better choice.
>


heck! Do two comparisons -- SQLite v. BerkeleyDB v. Tokyo Cabinet.

Nothing like thorough testing for the purpose of science. :-)



-- 
Puneet Kishor
"assertions are politics... backing up assertions with evidence is science"


Re: [sqlite] very large SQLite tables

2009-06-25 Thread David Fletcher

> "PK" == P Kishor  writes:

>> As expected, as the table grows, the underlying B-tree
>> implementation for SQLite means that the number of disks accesses
>> to (a) find, and (b) add a chunk, grows larger and larger.  We've
>> tested up to 20 million chunks represented in the table: as
>> expected performance exponentially decreases as the number of table
>> entries grows.

PK> Why don't you use, or at least test, with BerkeleyDB? Since you have
PK> only one table, you can hardly benefit from the SQLness of an rdb. If
PK> nothing, you will have a point of comparison with another technology,
PK> and know for sure if SQLite is the appropriate solution for you.

Perhaps http://tokyocabinet.sf.net might be a better choice.

David


Re: [sqlite] very large SQLite tables

2009-06-25 Thread P Kishor
On Wed, Jun 24, 2009 at 3:21 PM, Matthew
O'Keefe wrote:
>
>
> We are using SQLite for indexing a huge number (i.e., 100 million to 1
> billion) of key pairs
> that are represented by an 88-byte key. We are using a single table with a
> very large number of rows (one for each data chunk), and two columns.
>
> The table has two columns.  One is of type "text" and the other is type
> "integer".
>>
>> The table is created with:
>>
>> CREATE TABLE chunks
>> (
>>   name text primary key,
>>   pid integer not null
> );
>
> As expected, as the
> table grows, the underlying B-tree implementation for SQLite means that the
> number of
> disks accesses to (a) find, and (b) add a chunk, grows larger and larger.
> We've tested up
> to 20 million chunks represented in the table: as expected performance
> exponentially
> decreases as the number of table entries grows.

Why don't you use, or at least test, with BerkeleyDB? Since you have
only one table, you can hardly benefit from the SQLness of an rdb. If
nothing, you will have a point of comparison with another technology,
and know for sure if SQLite is the appropriate solution for you.

Your experience would also likely make for a good case study on the
SQLite wiki that might help others in the future.


>
> We wanted to post to the mailing list to see if there are any obvious,
> first-order things
> we can try to improve performance for such a large table.
>
> We really appreciate the efforts of the SQLite developer community!
>
> Matt O'Keefe
>
> sqlite-users@sqlite.org
>



-- 
Puneet Kishor http://www.punkish.org/
Carbon Model http://carbonmodel.org/
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org/
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor/
Nelson Institute, UW-Madison http://www.nelson.wisc.edu/
---
collaborate, communicate, compete
===


Re: [sqlite] very large SQLite tables

2009-06-25 Thread Douglas E. Fajardo
One approach might be to split the big, monolithic table into some number of 
hash buckets, where each 'bucket' is a separate table. When doing a search, the 
program calculates the hash and reads only the bucket that is needed.

This approach also has the potential for allowing multiple databases, where 
tables would be spread across the different databases. The databases could be 
spread across multiple drives to improve performance.
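
A minimal sketch of what that might look like (Python's sqlite3 module here;
the bucket count, the table naming, and the CRC32 stand-in hash are all
arbitrary, and whether this helps at all is exactly what gets debated later in
the thread):

import sqlite3
import zlib

NUM_BUCKETS = 100   # arbitrary

conn = sqlite3.connect("chunks.db")
for i in range(NUM_BUCKETS):
    # one small table per hash bucket instead of a single huge one
    conn.execute("CREATE TABLE IF NOT EXISTS chunks_%03d "
                 "(name TEXT PRIMARY KEY, pid INTEGER NOT NULL)" % i)
conn.commit()

def bucket_table(name):
    # the hash of the key selects the bucket table; zlib.crc32 is only a stand-in
    return "chunks_%03d" % (zlib.crc32(name.encode("utf-8")) % NUM_BUCKETS)

def insert(name, pid):
    conn.execute("INSERT INTO %s VALUES (?, ?)" % bucket_table(name), (name, pid))
    conn.commit()

def lookup(name):
    row = conn.execute("SELECT pid FROM %s WHERE name = ?" % bucket_table(name),
                       (name,)).fetchone()
    return None if row is None else row[0]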

*** Doug Fajardo


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Matthew O'Keefe
Sent: Wednesday, June 24, 2009 12:21 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] very large SQLite tables



We are using SQLite for indexing a huge number (i.e., 100 million to 1
billion) of key pairs
that are represented by an 88-byte key. We are using a single table with a
very large number of rows (one for each data chunk), and two columns.

The table has two columns.  One is of type "text" and the other is type
"integer".
> 
> The table is created with:
> 
> CREATE TABLE chunks
> (
>   name text primary key,
>   pid integer not null
);

As expected, as the
table grows, the underlying B-tree implementation for SQLite means that the
number of
disks accesses to (a) find, and (b) add a chunk, grows larger and larger.
We've tested up
to 20 million chunks represented in the table: as expected performance
exponentially 
decreases as the number of table entries grows.

We wanted to post to the mailing list to see if there are any obvious,
first-order things
we can try to improve performance for such a large table.

We really appreciate the efforts of the SQLite developer community!

Matt O'Keefe

sqlite-users@sqlite.org



Re: [sqlite] very large SQLite tables

2009-06-25 Thread D. Richard Hipp

On Jun 24, 2009, at 3:21 PM, Matthew O'Keefe wrote:
> disks accesses to (a) find, and (b) add a chunk, grows larger and  
> larger.  We’ve tested up to 20 million chunks represented in the  
> table: as expected performance exponentially decreases as the number  
> of table entries grows.


Did you mean to say "logarithmically" where you wrote "exponentially"?


D. Richard Hipp
d...@hwaci.com



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] very large SQLite tables

2009-06-25 Thread Matthew O'Keefe


We are using SQLite for indexing a huge number (i.e., 100 million to 1
billion) of key pairs
that are represented by an 88-byte key. We are using a single table with a
very large number of rows (one for each data chunk), and two columns.

The table has two columns.  One is of type "text" and the other is type
"integer".
> 
> The table is created with:
> 
> CREATE TABLE chunks
> (
>   name text primary key,
>   pid integer not null
);

As expected, as the
table grows, the underlying B-tree implementation for SQLite means that the
number of
disks accesses to (a) find, and (b) add a chunk, grows larger and larger.
We've tested up
to 20 million chunks represented in the table: as expected performance
exponentially 
decreases as the number of table entries grows.

We wanted to post to the mailing list to see if there are any obvious,
first-order things
we can try to improve performance for such a large table.

We really appreciate the efforts of the SQLite developer community!

Matt O'Keefe

sqlite-users@sqlite.org
