Re: [sqlite] Cache invalidation after insert statements.

2007-06-22 Thread Eduardo Morras

At 11:20 22/06/2007, you wrote:



Hi all.

Thanks for everyone's help; the problem is now solved. The memory drive worked
like a bomb. Basically, the problem on that server was that the insanely high
IO prevented the OS from caching the file, which slowed down performance.
After installing a memory drive (using mfs) and reducing the cache size,
multiple connections are now flying. I'm getting insane speeds. SQLite FTW!


It's better to build the RAM-drive behaviour into SQLite itself than to run 
SQLite on top of a RAM drive. That way the in-memory layer can coordinate 
locking with the database and write a copy of the RAM contents back to disk.
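
For reference, one way to get that behaviour with SQLite itself is to keep an
in-memory copy of the database and periodically write it back to disk. Later
SQLite releases (3.6.11 and up, so after this thread) added an online backup
API that makes the copying step simple; a minimal sketch, with the file name
and flush policy as placeholders:

    /* Sketch only: load a disk database into an in-memory copy, serve queries
     * from it, and flush it back periodically. Requires SQLite 3.6.11+. */
    #include <sqlite3.h>

    static int copy_db(sqlite3 *dst, sqlite3 *src)
    {
        sqlite3_backup *b = sqlite3_backup_init(dst, "main", src, "main");
        if (b == NULL) return sqlite3_errcode(dst);
        sqlite3_backup_step(b, -1);        /* copy all pages in one pass */
        return sqlite3_backup_finish(b);
    }

    int main(void)
    {
        sqlite3 *disk, *mem;
        sqlite3_open("lookup.db", &disk);  /* on-disk database */
        sqlite3_open(":memory:", &mem);    /* in-memory working copy */

        copy_db(mem, disk);                /* load the RAM copy at startup */
        /* ... serve lookups and updates against mem ... */
        copy_db(disk, mem);                /* write the RAM copy back to disk */

        sqlite3_close(mem);
        sqlite3_close(disk);
        return 0;
    }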




--
"General error, hit any user to continue." 



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-22 Thread pompomJuice


Hi all.

Thanks for everyone's help; the problem is now solved. The memory drive worked
like a bomb. Basically, the problem on that server was that the insanely high
IO prevented the OS from caching the file, which slowed down performance.
After installing a memory drive (using mfs) and reducing the cache size,
multiple connections are now flying. I'm getting insane speeds. SQLite FTW!

Regards.

Werner

pompomJuice wrote:
> 
> 
> Hello there.
> 
> I need some insight into how SQLite's caching works. I have a database that
> is quite large (5Gb) sitting on a production server whose IO is severely
> taxed. This causes my SQLite db to perform very poorly. Most of the time my
> application just sits there and uses about 10% of a CPU, where it would use
> 100% on test systems with idle IO. Effectively, the application constantly
> does lookups as fast as it can.
> 
> To counteract this I increased the page size to 8192 (Unix server with advfs
> having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h to
> 512000. This worked. My application starts at low memory usage and gradually
> gains more memory. As it gains more memory it uses more CPU, until it finally
> uses 100% CPU and 5Gb of RAM.
> 
> Every now and then the lookup table is updated. As soon as the application
> does this, however, performance drops back to a crawl and slowly builds up
> again as described in the previous paragraph. The memory usage stays at 5Gb.
> All that I can think of is that the update invalidates the cache. The update
> is not very big, say 20 rows in a table that has about 45 million rows.
> 
> What exactly is happening here?
> 
> Regards.
> -- 
> View this message in context:
> http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944881.html#a11190285
> Sent from the SQLite mailing list archive at Nabble.com.
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11248815
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-21 Thread pompomJuice

That sounds like an awesome trick. I will definitely do as you suggest and
decrease cache_size, since even at the moment it does not really seem to help
much.

With regards to the memory being volatile: that is not really a problem for
me, as a complete loss of the lookup table is not a big loss; it can be
recovered from its source in a matter of hours. That machine has not rebooted
or lost power in the three years I have been writing programs for it, so I
think that risk is acceptable.

Thanks for the awesome tips.


Joe Wilson wrote:
> 
>> mmm, I was thinking that I decrease the cache_size to like 20 when using
>> the
>> ram drive since I dont need caching anymore then.
>> 
>> I have inserted more timeing code and I am now convinced I have an IO
>> problem. When I coax a OS to fully cache my (smaller 40 rows) db file
>> (
>> which takes like 2-3 runs ) sqlite can do lookups at about 5 per
>> second.
>> With the file uncached this value falls as low as 500.
>> 
>> Now I need to get the system administrators to make me that ram drive.
> 
> The obvious problem with a RAM drive is that the data is not persisted,
> so if you lose power... you get the idea.
> 
> I may be drowned as a witch for suggesting this, but since you have ample
> RAM and CPUs and you want the file to be in OS cache all day for quick
> ad-hoc lookups, just put the following in cron to be run every few
> minutes:
> 
>cat your.db > /dev/null
> 
> If the file is already in OS cache, this is a very quick operation.
> 
> There may be OS-specific ways to keep the image of the file in RAM 
> without the cron/cat hack.  Some modern smart OS pagers may not keep 
> the file cached in memory if it suspects it will not be used again, so 
> see what cat alternative works on your OS.
> 
> Whether you're using the RAM drive approach or the keep-the-db-in-OS-cache
> approach, do keep the cache_size low for all your sqlite processes, as 
> you mention. Having large caches for multiple processes is a waste of 
> system RAM, due to duplication.
> 
> 
>
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11231083
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread Martin Jenkins

Joe Wilson wrote:

A non-volatile RAM drive is the way to go if you got the bucks.


16 Processor machine
~40Gb ram
EMC storage

suggests he does. ;)

I worked on a project where the end client had Sun kit of this spec, and 
they claimed the systems cost 7 figures GBP back in 2005.


Martin

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread Joe Wilson
From his wording, I assumed he meant a volatile system RAM "drive", as opposed
to a non-volatile external RAM drive. But there's no point speculating about
what he meant. A non-volatile RAM drive is the way to go if you've got the bucks.

--- Ken <[EMAIL PROTECTED]> wrote:
>  I think the performance of the RAM drive (I'm guessing SCSI based) will not 
> be as good as physical system RAM. But certainly better than the I/O speed of disk.

> pompomJuice <[EMAIL PROTECTED]> wrote: 
> Now I need to get the system administrators to make me that ram drive.



   


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread Joe Wilson
> Mmm, I was thinking that I would decrease the cache_size to something like 20
> when using the ram drive, since I don't need caching any more then.
> 
> I have inserted more timing code and I am now convinced I have an IO
> problem. When I coax the OS into fully caching my (smaller 40 rows) db file
> (which takes about 2-3 runs), sqlite can do lookups at about 5 per second.
> With the file uncached this value falls as low as 500.
> 
> Now I need to get the system administrators to make me that ram drive.

The obvious problem with a RAM drive is that the data is not persisted,
so if you lose power... you get the idea.

I may be drowned as a witch for suggesting this, but since you have ample
RAM and CPUs and you want the file to be in OS cache all day for quick
ad-hoc lookups, just put the following in cron to be run every few minutes:

   cat your.db > /dev/null

If the file is already in OS cache, this is a very quick operation.

There may be OS-specific ways to keep the image of the file in RAM 
without the cron/cat hack.  Some modern smart OS pagers may not keep 
the file cached in memory if they suspect it will not be used again, so 
see what cat alternative works on your OS.
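
If a cron entry is awkward, the same warm-up can be done from a small program;
a rough C equivalent of the cat trick, with the path as a placeholder:

    /* Read the database file sequentially and discard the data, purely to
     * pull its pages into the OS page cache. Sketch only. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("your.db", "rb");
        if (f == NULL) { perror("fopen"); return 1; }

        char buf[1 << 16];                 /* 64 KB read buffer */
        while (fread(buf, 1, sizeof buf, f) > 0)
            ;                              /* discard; the reads warm the cache */

        fclose(f);
        return 0;
    }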

Whether you're using the RAM drive approach or the keep-the-db-in-OS-cache
approach, do keep the cache_size low for all your sqlite processes, as 
you mention. Having large caches for multiple processes is a waste of 
system RAM, due to duplication.


   


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread Ken
I understand where you are heading by putting the entire db on a RAM drive.

I think the performance of the RAM drive (I'm guessing SCSI based) will not be
as good as physical system RAM, but certainly better than the I/O speed of disk.

Let us know how it turns out.
 

pompomJuice <[EMAIL PROTECTED]> wrote: 
Mmm, I was thinking that I would decrease the cache_size to something like 20
when using the ram drive, since I don't need caching any more then.

I have inserted more timing code and I am now convinced I have an IO
problem. When I coax the OS into fully caching my (smaller 40 rows) db file
(which takes about 2-3 runs), sqlite can do lookups at about 5 per second.
With the file uncached this value falls as low as 500.

Now I need to get the system administrators to make me that ram drive.


ken-33 wrote:
> 
> The Ram drive is unlikely to work. It will still have the same cache
> invalidation. 
>   You need to get things logically working first. Ram drives are great to
> help improve performance where seeks are and rotational access
> requirements dictate.
>   
> 
> pompomJuice  wrote:
>   
> AArrgh.
> 
> That is the one thing that I wont be able to do. It would require a
> complete
> system redesign. I can adapt my program easy but now to get it to work in
> the greater scheme of things would be a nightmare.
> 
> My current efforts are being focussed into making a ram drive and putting
> the file in there. I hope it works.
> 
> 
> ken-33 wrote:
>> 
>> Can you consolidate your multiple binaries to a Single Binary?
>> Then Use threading and sqlite's shared caching to perform the Lookups and
>> updates.
>> That way the cache wouldn't get invalidated???
>> 
>> Someone else here correct me if this is a bad idea!!!
>> 
>> 
>> 
>> pompomJuice wrote: 
>> I suspected something like this, as it makes sense.
>> 
>> I have multiple binaries/different connections ( and I cannot make them
>> share a connection ) using this one lookup table and depending on which
>> connection checks first, it will update the table. 
>> 
>> My question is then, if any one connection makes any change to the
>> database
>> ( not neccesarily to the huge lookup table ) will all the other
>> connections
>> invalidate their entire cache? Or is it per table/btree that the cache is
>> dropped?
>> 
>> Thanks for that reponse. Already I can move ahead now with better
>> knowlege
>> of how the caching works.
>> 
>> Regards.
>> 
>> 
>> 
>> Dan Kennedy-4 wrote:
>>> 
>>> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
 Hello there.
 
 I need some insight into how SQLite's caching works. I have a database
 that
 is quite large (5Gb) sitting on a production server that's IO is
 severely
 taxed. This causes my SQLite db to perform very poorly. Most of the
 time
 my
 application just sits there and uses about 10% of a CPU where it would
 use a
 100% on test systems with idle IO. Effectively what the application
 does
 is
 constantly doing lookups as fast as it can.
 
 To counteract this I increased the page size to 8192 (Unix server with
 advfs
 having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
 to
 512000. This worked. My application starts at low memory usage and as
 it
 gradually gains more memory. As it gains more memory it uses more CPU
 and
 reaches a point where it finally uses 100% CPU and 5Gb of ram.
 
 Every now and then the lookup table is udpated. As soon as the
 application
 does this however the performance goes back to a crawl and slowly
 builds
 up
 again as described in the previous paragraph. The memory usage stays at
 5Gb.
 All that I can think of is that the update invalidates the cache. 
>>> 
>>> Probably right.
>>> 
 The update
 is not very big, say 20 rows in a table that has about 45 million
 rows.
 
 What exactly is happening here?
>>> 
>>> Are you using 3.3.17? And is it an external process (or at least a
>>> different connection doing) doing the update?
>>> 
>>> If so, the update is modifying the pager change-counter, invalidating
>>> the pager cache held by the lookup application. The lookup app has
>>> to start loading pages from the disk again, instead of just reading
>>> it's cache.
>>> 
>>> The only way around this performance hit is to do the UPDATE through
>>> the lookup app, using the same database connection.
>>> 
>>> Dan.
>>> 
 Regards.
>>> 
>>> 
>>> -
>>> To unsubscribe, send email to [EMAIL PROTECTED]
>>> -
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
>> Sent from the SQLite mailing list archive at Nabble.com.
>> 
>> 
>> 

Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread pompomJuice

Mmm, I was thinking that I would decrease the cache_size to something like 20
when using the ram drive, since I don't need caching any more then.

I have inserted more timing code and I am now convinced I have an IO
problem. When I coax the OS into fully caching my (smaller 40 rows) db file
(which takes about 2-3 runs), sqlite can do lookups at about 5 per second.
With the file uncached this value falls as low as 500.

Now I need to get the system administrators to make me that ram drive.


ken-33 wrote:
> 
> The RAM drive is unlikely to work. It will still have the same cache
> invalidation problem.
> You need to get things logically working first. RAM drives are great for
> improving performance where seek and rotational access requirements
> dictate.
>   
> 
> pompomJuice <[EMAIL PROTECTED]> wrote:
>   
> AArrgh.
> 
> That is the one thing that I wont be able to do. It would require a
> complete
> system redesign. I can adapt my program easy but now to get it to work in
> the greater scheme of things would be a nightmare.
> 
> My current efforts are being focussed into making a ram drive and putting
> the file in there. I hope it works.
> 
> 
> ken-33 wrote:
>> 
>> Can you consolidate your multiple binaries to a Single Binary?
>> Then Use threading and sqlite's shared caching to perform the Lookups and
>> updates.
>> That way the cache wouldn't get invalidated???
>> 
>> Someone else here correct me if this is a bad idea!!!
>> 
>> 
>> 
>> pompomJuice wrote: 
>> I suspected something like this, as it makes sense.
>> 
>> I have multiple binaries/different connections ( and I cannot make them
>> share a connection ) using this one lookup table and depending on which
>> connection checks first, it will update the table. 
>> 
>> My question is then, if any one connection makes any change to the
>> database
>> ( not neccesarily to the huge lookup table ) will all the other
>> connections
>> invalidate their entire cache? Or is it per table/btree that the cache is
>> dropped?
>> 
>> Thanks for that reponse. Already I can move ahead now with better
>> knowlege
>> of how the caching works.
>> 
>> Regards.
>> 
>> 
>> 
>> Dan Kennedy-4 wrote:
>>> 
>>> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
 Hello there.
 
 I need some insight into how SQLite's caching works. I have a database
 that
 is quite large (5Gb) sitting on a production server that's IO is
 severely
 taxed. This causes my SQLite db to perform very poorly. Most of the
 time
 my
 application just sits there and uses about 10% of a CPU where it would
 use a
 100% on test systems with idle IO. Effectively what the application
 does
 is
 constantly doing lookups as fast as it can.
 
 To counteract this I increased the page size to 8192 (Unix server with
 advfs
 having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
 to
 512000. This worked. My application starts at low memory usage and as
 it
 gradually gains more memory. As it gains more memory it uses more CPU
 and
 reaches a point where it finally uses 100% CPU and 5Gb of ram.
 
 Every now and then the lookup table is udpated. As soon as the
 application
 does this however the performance goes back to a crawl and slowly
 builds
 up
 again as described in the previous paragraph. The memory usage stays at
 5Gb.
 All that I can think of is that the update invalidates the cache. 
>>> 
>>> Probably right.
>>> 
 The update
 is not very big, say 20 rows in a table that has about 45 million
 rows.
 
 What exactly is happening here?
>>> 
>>> Are you using 3.3.17? And is it an external process (or at least a
>>> different connection doing) doing the update?
>>> 
>>> If so, the update is modifying the pager change-counter, invalidating
>>> the pager cache held by the lookup application. The lookup app has
>>> to start loading pages from the disk again, instead of just reading
>>> it's cache.
>>> 
>>> The only way around this performance hit is to do the UPDATE through
>>> the lookup app, using the same database connection.
>>> 
>>> Dan.
>>> 
 Regards.
>>> 
>>> 
>>> -
>>> To unsubscribe, send email to [EMAIL PROTECTED]
>>> -
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
>> Sent from the SQLite mailing list archive at Nabble.com.
>> 
>> 
>> -
>> To unsubscribe, send email to [EMAIL PROTECTED]
>> -
>> 
>> 
>> 
>> 
> 

Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread Ken
The RAM drive is unlikely to work. It will still have the same cache 
invalidation problem.
You need to get things logically working first. RAM drives are great for 
improving performance where seek and rotational access requirements dictate.
  

pompomJuice <[EMAIL PROTECTED]> wrote:
  
Aargh.

That is the one thing that I won't be able to do. It would require a complete
system redesign. I can adapt my program easily, but getting it to work in the
greater scheme of things would be a nightmare.

My current efforts are focused on making a ram drive and putting the file in
there. I hope it works.


ken-33 wrote:
> 
> Can you consolidate your multiple binaries to a Single Binary?
> Then Use threading and sqlite's shared caching to perform the Lookups and
> updates.
> That way the cache wouldn't get invalidated???
> 
> Someone else here correct me if this is a bad idea!!!
> 
> 
> 
> pompomJuice wrote: 
> I suspected something like this, as it makes sense.
> 
> I have multiple binaries/different connections ( and I cannot make them
> share a connection ) using this one lookup table and depending on which
> connection checks first, it will update the table. 
> 
> My question is then, if any one connection makes any change to the
> database
> ( not neccesarily to the huge lookup table ) will all the other
> connections
> invalidate their entire cache? Or is it per table/btree that the cache is
> dropped?
> 
> Thanks for that reponse. Already I can move ahead now with better knowlege
> of how the caching works.
> 
> Regards.
> 
> 
> 
> Dan Kennedy-4 wrote:
>> 
>> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
>>> Hello there.
>>> 
>>> I need some insight into how SQLite's caching works. I have a database
>>> that
>>> is quite large (5Gb) sitting on a production server that's IO is
>>> severely
>>> taxed. This causes my SQLite db to perform very poorly. Most of the time
>>> my
>>> application just sits there and uses about 10% of a CPU where it would
>>> use a
>>> 100% on test systems with idle IO. Effectively what the application does
>>> is
>>> constantly doing lookups as fast as it can.
>>> 
>>> To counteract this I increased the page size to 8192 (Unix server with
>>> advfs
>>> having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
>>> to
>>> 512000. This worked. My application starts at low memory usage and as it
>>> gradually gains more memory. As it gains more memory it uses more CPU
>>> and
>>> reaches a point where it finally uses 100% CPU and 5Gb of ram.
>>> 
>>> Every now and then the lookup table is udpated. As soon as the
>>> application
>>> does this however the performance goes back to a crawl and slowly builds
>>> up
>>> again as described in the previous paragraph. The memory usage stays at
>>> 5Gb.
>>> All that I can think of is that the update invalidates the cache. 
>> 
>> Probably right.
>> 
>>> The update
>>> is not very big, say 20 rows in a table that has about 45 million
>>> rows.
>>> 
>>> What exactly is happening here?
>> 
>> Are you using 3.3.17? And is it an external process (or at least a
>> different connection doing) doing the update?
>> 
>> If so, the update is modifying the pager change-counter, invalidating
>> the pager cache held by the lookup application. The lookup app has
>> to start loading pages from the disk again, instead of just reading
>> it's cache.
>> 
>> The only way around this performance hit is to do the UPDATE through
>> the lookup app, using the same database connection.
>> 
>> Dan.
>> 
>>> Regards.
>> 
>> 
>> -
>> To unsubscribe, send email to [EMAIL PROTECTED]
>> -
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
> Sent from the SQLite mailing list archive at Nabble.com.
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11208520
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-




Re: [sqlite] Cache invalidation after insert statements.

2007-06-20 Thread pompomJuice

Aargh.

That is the one thing that I won't be able to do. It would require a complete
system redesign. I can adapt my program easily, but getting it to work in the
greater scheme of things would be a nightmare.

My current efforts are focused on making a ram drive and putting the file in
there. I hope it works.


ken-33 wrote:
> 
> Can you consolidate your multiple binaries into a single binary?
> Then use threading and SQLite's shared caching to perform the lookups and
> updates.
> That way the cache wouldn't get invalidated?
> 
> Someone else here, correct me if this is a bad idea!
>  
>  
> 
> pompomJuice <[EMAIL PROTECTED]> wrote: 
> I suspected something like this, as it makes sense.
> 
> I have multiple binaries/different connections ( and I cannot make them
> share a connection ) using this one lookup table and depending on which
> connection checks first, it will update the table. 
> 
> My question is then, if any one connection makes any change to the
> database
> ( not neccesarily to the huge lookup table ) will all the other
> connections
> invalidate their entire cache? Or is it per table/btree that the cache is
> dropped?
> 
> Thanks for that reponse. Already I can move ahead now with better knowlege
> of how the caching works.
> 
> Regards.
> 
> 
> 
> Dan Kennedy-4 wrote:
>> 
>> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
>>> Hello there.
>>> 
>>> I need some insight into how SQLite's caching works. I have a database
>>> that
>>> is quite large (5Gb) sitting on a production server that's IO is
>>> severely
>>> taxed. This causes my SQLite db to perform very poorly. Most of the time
>>> my
>>> application just sits there and uses about 10% of a CPU where it would
>>> use a
>>> 100% on test systems with idle IO. Effectively what the application does
>>> is
>>> constantly doing lookups as fast as it can.
>>> 
>>> To counteract this I increased the page size to 8192 (Unix server with
>>> advfs
>>> having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
>>> to
>>> 512000. This worked. My application starts at low memory usage and as it
>>> gradually gains more memory. As it gains more memory it uses more CPU
>>> and
>>> reaches a point where it finally uses 100% CPU and 5Gb of ram.
>>> 
>>> Every now and then the lookup table is udpated. As soon as the
>>> application
>>> does this however the performance goes back to a crawl and slowly builds
>>> up
>>> again as described in the previous paragraph. The memory usage stays at
>>> 5Gb.
>>> All that I can think of is that the update invalidates the cache. 
>> 
>> Probably right.
>> 
>>> The update
>>> is not very big, say 20 rows in a table that has about 45 million
>>> rows.
>>> 
>>> What exactly is happening here?
>> 
>> Are you using 3.3.17? And is it an external process (or at least a
>> different connection doing) doing the update?
>> 
>> If so, the update is modifying the pager change-counter, invalidating
>> the pager cache held by the lookup application. The lookup app has
>> to start loading pages from the disk again, instead of just reading
>> it's cache.
>> 
>> The only way around this performance hit is to do the UPDATE through
>> the lookup app, using the same database connection.
>> 
>> Dan.
>> 
>>> Regards.
>> 
>> 
>> -
>> To unsubscribe, send email to [EMAIL PROTECTED]
>> -
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
> Sent from the SQLite mailing list archive at Nabble.com.
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11208520
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Ken
1. Review your Oracle 10g db and fix the "HUGE I/O" issues.
2. Why not do the lookups using Oracle? Allocate the extra 5 GB to the Oracle
buffer cache.
3. If you want good lookup performance, try to use the array-level interface
so that you don't need to take multiple trips (context switches) to the DB.
4. Use a multithreaded approach for throughput with Oracle, since you have 16
CPUs.
 

Andrew Finkenstadt <[EMAIL PROTECTED]> wrote:
On 6/19/07, pompomJuice wrote:
>
>
> Running a huge Oracle 10G database
> Running a 3rd party application that generates HUGE IO.
> Part of this 3rd party application is my application that does lookups.
>
> 1.) Data comes in in the form of files.
> 2.) 3rd party application decodes and prepares these tickets or records
> for
> insert into Oracle
> 3.) As these records flow by, we use key fields inside them to do lookups
> into this lightweight SQL database as doing it into oracle would be even
> slower.
>


When doing lookup tables in Oracle (9i or later stable versions), you are
better off doing an index-organized table to reduce by 1 seek the amount of
I/O Oracle does, and using a hash-key index-organized table to reduce the
seeks to (usually) ONLY 1.  Without knowing where in your box the Oracle
instance is, it's difficult to say for sure, but assuming it's on the same
machine, I would try this out, based on my experience:

PIN the lookup table (index) into the buffer cache, giving Oracle the extra
5GB of memory space to do so.  Assuming you have a persistent connection to
Oracle from your fly-by-update process and your fly-by-lookup process, you
might find Oracle performs adequately.  Either way I'd measure what's taking
the time in doing lookups in Oracle that would be even slower;  I assume you
use bind parameters and cached SQL cursors (you get this automatically with
most PL/SQL constructs [excepting dynamic sql type 4]), and basically have 1
parse per cursor prepare at application startup time.



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Ken
Can you consolidate your multiple binaries into a single binary?
Then use threading and SQLite's shared caching to perform the lookups and
updates. That way the cache wouldn't get invalidated.

Someone else here, correct me if this is a bad idea!
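
For what it's worth, a minimal sketch of that single-binary idea: one process,
several threads, each with its own connection, all sharing one in-process page
cache via SQLite's shared-cache mode. A threadsafe SQLite build is assumed, and
the file name, table and thread count are placeholders.

    /* Sketch only: enable shared-cache mode, then let several threads open
     * their own connections to the same database. */
    #include <sqlite3.h>
    #include <pthread.h>

    static void *worker(void *arg)
    {
        sqlite3 *db;
        (void)arg;
        /* With shared cache enabled, these connections share one page cache,
         * so a write by one of them does not force the others to re-read
         * everything from disk. */
        if (sqlite3_open("lookup.db", &db) != SQLITE_OK) return NULL;
        sqlite3_exec(db, "SELECT count(*) FROM lookup;", NULL, NULL, NULL);
        sqlite3_close(db);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        sqlite3_enable_shared_cache(1);   /* before any connections are opened */
        for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
        return 0;
    }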
 
 

pompomJuice <[EMAIL PROTECTED]> wrote: 
I suspected something like this, as it makes sense.

I have multiple binaries/different connections (and I cannot make them
share a connection) using this one lookup table, and depending on which
connection checks first, that connection will update the table.

My question then is: if any one connection makes any change to the database
(not necessarily to the huge lookup table), will all the other connections
invalidate their entire cache? Or is the cache dropped per table/btree?

Thanks for that response. Already I can move ahead now with better knowledge
of how the caching works.

Regards.



Dan Kennedy-4 wrote:
> 
> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
>> Hello there.
>> 
>> I need some insight into how SQLite's caching works. I have a database
>> that
>> is quite large (5Gb) sitting on a production server that's IO is severely
>> taxed. This causes my SQLite db to perform very poorly. Most of the time
>> my
>> application just sits there and uses about 10% of a CPU where it would
>> use a
>> 100% on test systems with idle IO. Effectively what the application does
>> is
>> constantly doing lookups as fast as it can.
>> 
>> To counteract this I increased the page size to 8192 (Unix server with
>> advfs
>> having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
>> to
>> 512000. This worked. My application starts at low memory usage and as it
>> gradually gains more memory. As it gains more memory it uses more CPU and
>> reaches a point where it finally uses 100% CPU and 5Gb of ram.
>> 
>> Every now and then the lookup table is udpated. As soon as the
>> application
>> does this however the performance goes back to a crawl and slowly builds
>> up
>> again as described in the previous paragraph. The memory usage stays at
>> 5Gb.
>> All that I can think of is that the update invalidates the cache. 
> 
> Probably right.
> 
>> The update
>> is not very big, say 20 rows in a table that has about 45 million
>> rows.
>> 
>> What exactly is happening here?
> 
> Are you using 3.3.17? And is it an external process (or at least a
> different connection doing) doing the update?
> 
> If so, the update is modifying the pager change-counter, invalidating
> the pager cache held by the lookup application. The lookup app has
> to start loading pages from the disk again, instead of just reading
> it's cache.
> 
> The only way around this performance hit is to do the UPDATE through
> the lookup app, using the same database connection.
> 
> Dan.
> 
>> Regards.
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-




Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Joe Wilson
Does every single process (however insignificant) that reads or writes 
to that sqlite database file run on the same 16 processor machine?

> 16 Processor machine
> ~40Gb ram
> EMC storage
> Running a huge Oracle 10G database
> Running a 3rd party application that generates HUGE IO. 
> Part of this 3rd party application is my application that does lookups.


  

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Andrew Finkenstadt

On 6/19/07, pompomJuice <[EMAIL PROTECTED]> wrote:



Running a huge Oracle 10G database
Running a 3rd party application that generates HUGE IO.
Part of this 3rd party application is my application that does lookups.

1.) Data comes in in the form of files.
2.) 3rd party application decodes and prepares these tickets or records
for
insert into Oracle
3.) As these records flow by, we use key fields inside them to do lookups
into this lightweight SQL database, as doing them in Oracle would be even
slower.




When doing lookup tables in Oracle (9i or later stable versions), you are
better off using an index-organized table, which saves one seek's worth of
the I/O Oracle does, and using a hash-keyed index-organized table to reduce
the seeks to (usually) only one.  Without knowing where in your box the
Oracle instance is, it's difficult to say for sure, but assuming it's on the
same machine, I would try this out, based on my experience:

Pin the lookup table (index) into the buffer cache, giving Oracle the extra
5GB of memory space to do so.  Assuming you have a persistent connection to
Oracle from both your fly-by-update process and your fly-by-lookup process,
you might find Oracle performs adequately.  Either way, I'd measure what is
actually taking the time in the "lookups in Oracle that would be even
slower"; I assume you use bind parameters and cached SQL cursors (you get
this automatically with most PL/SQL constructs [excepting dynamic SQL type
4]), and basically have one parse per cursor, prepared at application
startup time.


Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread pompomJuice

That's exactly why I thought SQLite would work for this.

16 Processor machine
~40Gb ram
EMC storage
Running a huge Oracle 10G database
Running a 3rd party application that generates HUGE IO. 
Part of this 3rd party application is my application that does lookups.

1.) Data comes in in the form of files.
2.) 3rd party application decodes and prepares these tickets or records for
insert into Oracle
3.) As these records flow by, we use key fields inside them to do lookups
into this lightweight SQL database, as doing them in Oracle would be even
slower.

When I deployed my solution I initially set the cache size very small, as I
thought we could gain more by letting the OS cache the SQLite db file instead.
This strategy failed miserably. The 3rd party app generates so much IO that my
SQLite file simply has no chance of staying cached.

So I then modified my program to adjust PRAGMA cache_size to see if that
would help. Initially it did nothing until I edited
SQLITE_DEFAULT_CACHE_SIZE in sqliteInt.h (which I see is now in limits), and
only then would my program start consuming massive amounts of RAM. That worked
brilliantly for a while, until the lookup table was updated. All programs
went dead slow again, and it would take them a good 2 hours to fully cache
that 5GB file again.

I am panicking now, as I have been working on this project for a few months
and it seems to be doomed. The only course of action I see is finding a way to
somehow increase the IO performance, as that is where I believe the problem
resides. Copying the DB file to and from the file system is very fast, so I am
at a loss as to why my application is not getting the throughput. Maybe
because it is not accessing the file sequentially... I don't know.

Thanks for the help.

Regards.



Christian Smith-4 wrote:
> 
> pompomJuice uttered:
> 
>>
>> I suspected something like this, as it makes sense.
>>
>> I have multiple binaries/different connections ( and I cannot make them
>> share a connection ) using this one lookup table and depending on which
>> connection checks first, it will update the table.
> 
> 
> What is your working set like? Are all processes on the same machine?
> 
> Sounds like you might benefit from increasing the amount of RAM on this
> machine. You may be thrashing the OS cache, as your lookup process hogs
> memory for its own cache, pushing out old pages from the OS filesystem
> cache.
> 
> If a RAM upgrade is not feasible, then try reducing the cache of the lookup
> process, so that the OS cache isn't forced out of memory so easily. Then,
> when the lookup process has a cache miss, the missed page is more likely
> to be in the OS memory cache, and can be copied to the lookup process at
> memory-copy speed.
> 
> As you may have guessed, choosing the correct cache size for the lookup 
> process may involve several tuning iterations.
> 
> 
> Christian
> 
> --
>  /"\
>  \ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
>   X   - AGAINST MS ATTACHMENTS
>  / \
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11193389
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Christian Smith

pompomJuice uttered:



I suspected something like this, as it makes sense.

I have multiple binaries/different connections (and I cannot make them
share a connection) using this one lookup table, and depending on which
connection checks first, that connection will update the table.



What is your working set like? Are all processes on the same machine?

Sounds like you might benefit from increasing the amount of RAM on this
machine. You may be thrashing the OS cache, as your lookup process hogs
memory for its own cache, pushing out old pages from the OS filesystem
cache.

If a RAM upgrade is not feasible, then try reducing the cache of the lookup
process, so that the OS cache isn't forced out of memory so easily. Then,
when the lookup process has a cache miss, the missed page is more likely to
be in the OS memory cache, and can be copied to the lookup process at
memory-copy speed.


As you may have guessed, choosing the correct cache size for the lookup 
process may involve several tuning iterations.



Christian

--
/"\
\ /ASCII RIBBON CAMPAIGN - AGAINST HTML MAIL
 X   - AGAINST MS ATTACHMENTS
/ \

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread Dan Kennedy

> My question then is: if any one connection makes any change to the database
> (not necessarily to the huge lookup table), will all the other connections
> invalidate their entire cache?

Yes. The entire cache, regardless of what table was modified etc.


Dan.
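
To make Dan's earlier suggestion in this thread concrete: the hit can be
avoided by routing the occasional UPDATE through the same connection that
serves the lookups, so that connection's pager cache stays valid. A minimal
sketch; the file name, schema and SQL are placeholders.

    /* One connection does both the lookups and the periodic refresh of the
     * lookup table, so its page cache is never invalidated by an external
     * writer. Sketch only. */
    #include <sqlite3.h>

    int main(void)
    {
        sqlite3 *db;
        sqlite3_stmt *lookup;

        if (sqlite3_open("lookup.db", &db) != SQLITE_OK) return 1;
        sqlite3_exec(db, "PRAGMA cache_size=512000;", NULL, NULL, NULL);

        sqlite3_prepare_v2(db, "SELECT value FROM lookup WHERE key=?;", -1,
                           &lookup, NULL);

        /* ... bind, step and reset the statement for each incoming record ... */

        /* Periodic refresh, done on the SAME connection rather than by a
         * separate updater process. */
        sqlite3_exec(db, "UPDATE lookup SET value='new' WHERE key='k';",
                     NULL, NULL, NULL);

        sqlite3_finalize(lookup);
        sqlite3_close(db);
        return 0;
    }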



-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Cache invalidation after insert statements.

2007-06-19 Thread pompomJuice

I suspected something like this, as it makes sense.

I have multiple binaries/different connections (and I cannot make them
share a connection) using this one lookup table, and depending on which
connection checks first, that connection will update the table.

My question then is: if any one connection makes any change to the database
(not necessarily to the huge lookup table), will all the other connections
invalidate their entire cache? Or is the cache dropped per table/btree?

Thanks for that response. Already I can move ahead now with better knowledge
of how the caching works.

Regards.



Dan Kennedy-4 wrote:
> 
> On Tue, 2007-06-19 at 01:06 -0700, pompomJuice wrote:
>> Hello there.
>> 
>> I need some insight into how SQLite's caching works. I have a database
>> that
>> is quite large (5Gb) sitting on a production server that's IO is severely
>> taxed. This causes my SQLite db to perform very poorly. Most of the time
>> my
>> application just sits there and uses about 10% of a CPU where it would
>> use a
>> 100% on test systems with idle IO. Effectively what the application does
>> is
>> constantly doing lookups as fast as it can.
>> 
>> To counteract this I increased the page size to 8192 (Unix server with
>> advfs
>> having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h
>> to
>> 512000. This worked. My application starts at low memory usage and as it
>> gradually gains more memory. As it gains more memory it uses more CPU and
>> reaches a point where it finally uses 100% CPU and 5Gb of ram.
>> 
>> Every now and then the lookup table is udpated. As soon as the
>> application
>> does this however the performance goes back to a crawl and slowly builds
>> up
>> again as described in the previous paragraph. The memory usage stays at
>> 5Gb.
>> All that I can think of is that the update invalidates the cache. 
> 
> Probably right.
> 
>> The update
>> is not very big, say 20 rows in a table that has about 45 million
>> rows.
>> 
>> What exactly is happening here?
> 
> Are you using 3.3.17? And is it an external process (or at least a
> different connection) doing the update?
> 
> If so, the update is modifying the pager change-counter, invalidating
> the pager cache held by the lookup application. The lookup app has
> to start loading pages from the disk again, instead of just reading
> its cache.
> 
> The only way around this performance hit is to do the UPDATE through
> the lookup app, using the same database connection.
> 
> Dan.
> 
>> Regards.
> 
> 
> -
> To unsubscribe, send email to [EMAIL PROTECTED]
> -
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944908.html#a11192121
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Cache invalidation after insert statements.

2007-06-19 Thread pompomJuice

Hello there.

I need some insight into how SQLite's caching works. I have a database that
is quite large (5Gb) sitting on a production server whose IO is severely
taxed. This causes my SQLite db to perform very poorly. Most of the time my
application just sits there and uses about 10% of a CPU, where it would use
100% on test systems with idle IO. Effectively, the application constantly
does lookups as fast as it can.

To counteract this I increased the page size to 8192 (Unix server with advfs
having 8K block sizes) and increased the MAX_PAGES value in sqliteInt.h to
512000. This worked. My application starts at low memory usage and gradually
gains more memory. As it gains more memory it uses more CPU, until it finally
uses 100% CPU and 5Gb of RAM.
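
For reference, both of these settings can usually be changed at run time with
PRAGMAs instead of recompiling (although, as reported elsewhere in this
thread, the pragma alone did not take effect here until the compile-time
default was also raised). A minimal sketch; the file name and values are
illustrative.

    /* Set an 8 KB page size (only effective on a new or empty database, or
     * after a VACUUM) and a large per-connection page cache. Sketch only. */
    #include <sqlite3.h>
    #include <stdio.h>

    int main(void)
    {
        sqlite3 *db;
        if (sqlite3_open("lookup.db", &db) != SQLITE_OK) {
            fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(db));
            return 1;
        }
        sqlite3_exec(db, "PRAGMA page_size=8192;", NULL, NULL, NULL);
        /* Up to 512000 pages, roughly 4 GB of cache at 8 KB per page. */
        sqlite3_exec(db, "PRAGMA cache_size=512000;", NULL, NULL, NULL);

        /* ... lookups ... */

        sqlite3_close(db);
        return 0;
    }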

Every now and then the lookup table is updated. As soon as the application
does this, however, performance drops back to a crawl and slowly builds up
again as described in the previous paragraph. The memory usage stays at 5Gb.
All that I can think of is that the update invalidates the cache. The update
is not very big, say 20 rows in a table that has about 45 million rows.

What exactly is happening here?

Regards.
-- 
View this message in context: 
http://www.nabble.com/Cache-invalidation-after-insert-statements.-tf3944881.html#a11190285
Sent from the SQLite mailing list archive at Nabble.com.


-
To unsubscribe, send email to [EMAIL PROTECTED]
-


