Re: [sqlite] Sqlite Insert Speed Optimization
On Wed, Jul 14, 2010 at 12:34:05PM -0500, Black, Michael (IS) scratched on the wall:

> Was that a facetious remark???
>
> Rather than "here's a function/pragma that allows you to put the
> journal file where you want -- but BE CAREFUL BECAUSE..."
>
> Writing your own VFS is not for the casual user...
>
> I was trying just to find where the journal filename was created but
> there's no "db-journal" string in either the .h or .c file.
>
> It would be trivial to add a function to set it via the C interface.

This is a polite way of saying, "This is a *really* Bad Idea. There is a big gun over there. Have fun shooting yourself in the foot. Can we take pictures?"

If you know enough about the whole system to understand the full implications of moving the journal file, chances are you know enough about the whole system to make the VFS modifications without serious thought. If you don't, you should likely think twice about hacking up a fundamental transaction behavior and dismissing years and years of practical knowledge and experience.

Adding a function would be trivial... which means there are really good reasons why it isn't there.

There is a really big gun right here (3.6.23.1):

    $ grep -n \"-journal\" sqlite3.c
    35406:    memcpy(&zJournal[nPathname], "-journal", 8);

You're not finding "db-journal" because the "db" comes from your application.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
but showing it to the wrong people has the tendency to make them
feel uncomfortable." -- Angela Johnson

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Sqlite Insert Speed Optimization
Jolly good... just the noose we need :-)

I also was thinking of memory mode for the journal. That's probably the best solution for this kind of case. Though I think the original poster was wanting a recovery mechanism.

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Richard Hipp
Sent: Wed 7/14/2010 1:12 PM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

On Wed, Jul 14, 2010 at 1:51 PM, Richard Hipp wrote:

> On Wed, Jul 14, 2010 at 1:34 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:
>
>> [...]
>
> Putting the rollback journal in any directory other than the same
> directory as the database file is a dangerous thing to do [...]

Well, I'm wrong. Turns out we are going to help you hang yourself after all. I forgot about these features:

    PRAGMA journal_mode=MEMORY;
    PRAGMA journal_mode=OFF;

Set one of those and you get no disk I/O from reading or writing the journal file. And you will corrupt your database on a crash. On your own head be it.

--
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Sqlite Insert Speed Optimization
On Wed, Jul 14, 2010 at 1:51 PM, Richard Hipp wrote:

> On Wed, Jul 14, 2010 at 1:34 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:
>
>> Was that a facetious remark???
>>
>> Rather than "here's a function/pragma that allows you to put the journal
>> file where you want -- but BE CAREFUL BECAUSE..."
>>
>> Writing your own VFS is not for the casual user...
>>
>> I was trying just to find where the journal filename was created but
>> there's no "db-journal" string in either the .h or .c file.
>>
>> It would be trivial to add a function to set it via the C interface.
>
> Putting the rollback journal in any directory other than the same
> directory as the database file is a dangerous thing to do, because it
> risks being unable to locate the rollback journal after a crash,
> resulting in database corruption. Hence, we have no intention of
> supporting such a feature. If you really need it badly enough, you can
> write your own VFS to make it happen. Yes, writing your own VFS is hard
> to do. But this serves to discourage people from doing it, which is
> what we want.

Well, I'm wrong. Turns out we are going to help you hang yourself after all. I forgot about these features:

    PRAGMA journal_mode=MEMORY;
    PRAGMA journal_mode=OFF;

Set one of those and you get no disk I/O from reading or writing the journal file. And you will corrupt your database on a crash. On your own head be it.

--
D. Richard Hipp
d...@sqlite.org
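[Editor's note: a minimal sketch of the two pragmas quoted above, using Python's bundled sqlite3 module; the file path and table are throwaway illustrations, not anything from the thread.]

```python
import os
import sqlite3
import tempfile

# Throwaway database in a temp directory; any path works.
path = os.path.join(tempfile.mkdtemp(), "speed.db")
conn = sqlite3.connect(path)

# PRAGMA journal_mode returns the mode now in effect.  MEMORY keeps the
# rollback journal in RAM; OFF disables it entirely.  Either removes the
# journal's disk I/O -- and, as stated above, either will corrupt the
# database if the process or machine crashes mid-transaction.
mode = conn.execute("PRAGMA journal_mode=MEMORY").fetchone()[0]

conn.execute("CREATE TABLE t(x)")
with conn:  # one transaction; its rollback journal now lives in RAM
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
n = conn.execute("SELECT count(*) FROM t").fetchone()[0]
conn.close()
```

The pragma's return value confirms the switch took effect (`mode` comes back as the string `memory`), which is a cheap sanity check before a long bulk load.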
Re: [sqlite] Sqlite Insert Speed Optimization
It sounds like you've been drinking Microsoft's kool-aid -- don't let the user do what they need to do... it's too dangerous. Just try to print a directory list from your file browser in Windows, for example (not that it's dangerous... but it's pretty obvious someone might want to do it).

Software should ALWAYS allow for an expert mode, where we can do dangerous things when we need to (for example, keeping your journal on local file storage when your database is on network storage).

Just like it's stated that it's dangerous to use network storage, but you allow it nonetheless. That is MUCH more likely to happen than someone trying to store the journal elsewhere. So how do you justify allowing that to occur? Seems you're being a bit non-orthogonal.

If you really needed to recover such a database you would just have to move the journal file to the database directory.

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Richard Hipp
Sent: Wed 7/14/2010 12:51 PM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

On Wed, Jul 14, 2010 at 1:34 PM, Black, Michael (IS) wrote:

> [...]

Putting the rollback journal in any directory other than the same directory as the database file is a dangerous thing to do, because it risks being unable to locate the rollback journal after a crash, resulting in database corruption. Hence, we have no intention of supporting such a feature. If you really need it badly enough, you can write your own VFS to make it happen. Yes, writing your own VFS is hard to do. But this serves to discourage people from doing it, which is what we want.

--
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Sqlite Insert Speed Optimization
On 14 Jul 2010, at 5:02pm, Black, Michael (IS) wrote:

> Does anybody know how to make the journal file go to a different location
> than the database? Apparently it's not treated as a "temporary" file.
> Perhaps it should be??

It's essential not to treat the journal file as a temporary file (e.g. put it in /tmp for a Unix environment). Temporary directories are wiped out at boot time, whereas a SQLite journal file is used after a crash and reboot to recover your database to a sane and usable state.

And I agree with other people in this thread: it is pointless for you to mess about trying to optimize the operation of SQLite at this stage, because your bottleneck to performance is the speed of your network link. You might somehow achieve a 5% improvement in speed by endless messing with SQLite, whereas getting a faster path to your remote storage might get you a 50% improvement. Take a look at your network topology, the priority settings on your switches, etc.

Simon.
Re: [sqlite] Sqlite Insert Speed Optimization
On Wed, Jul 14, 2010 at 1:34 PM, Black, Michael (IS) wrote:

> Was that a facetious remark???
>
> Rather than "here's a function/pragma that allows you to put the journal
> file where you want -- but BE CAREFUL BECAUSE..."
>
> Writing your own VFS is not for the casual user...
>
> I was trying just to find where the journal filename was created but
> there's no "db-journal" string in either the .h or .c file.
>
> It would be trivial to add a function to set it via the C interface.

Putting the rollback journal in any directory other than the same directory as the database file is a dangerous thing to do, because it risks being unable to locate the rollback journal after a crash, resulting in database corruption. Hence, we have no intention of supporting such a feature. If you really need it badly enough, you can write your own VFS to make it happen.

Yes, writing your own VFS is hard to do. But this serves to discourage people from doing it, which is what we want.

--
D. Richard Hipp
d...@sqlite.org
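[Editor's note: the crash-recovery constraint described above can be observed directly. In the default rollback-journal (DELETE) mode, SQLite creates `<database>-journal` in the same directory as the database and removes it on commit; after a crash, recovery depends on finding it there. A small sketch with Python's sqlite3 module; file names are hypothetical.]

```python
import os
import sqlite3
import tempfile

d = tempfile.mkdtemp()
db = os.path.join(d, "mydata.db")

# isolation_level=None gives manual transaction control.
conn = sqlite3.connect(db, isolation_level=None)
conn.execute("CREATE TABLE t(x)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
# While the write transaction is open, the hot journal sits beside the db;
# a crash at this point is recovered by replaying exactly this file.
during = os.path.exists(db + "-journal")

conn.execute("COMMIT")  # default DELETE mode removes the journal on commit
after = os.path.exists(db + "-journal")
conn.close()
```

`during` is True and `after` is False, which is also why the journal name is always derived from the database path rather than stored anywhere configurable.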
Re: [sqlite] Sqlite Insert Speed Optimization
Was that a facetious remark???

Rather than "here's a function/pragma that allows you to put the journal file where you want -- but BE CAREFUL BECAUSE..."

Writing your own VFS is not for the casual user...

I was trying just to find where the journal filename was created, but there's no "db-journal" string in either the .h or .c file.

It would be trivial to add a function to set it via the C interface.

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Richard Hipp
Sent: Wed 7/14/2010 12:12 PM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

On Wed, Jul 14, 2010 at 12:31 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:

> If you could set the journal location BEFORE you open the database, that
> wouldn't be such a bad thing. Giving us the ability to do this would allow
> for the flexibility when needed, with appropriate warnings about how to
> recover.

You can write your own VFS that places the journal file wherever you want.

--
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Sqlite Insert Speed Optimization
On Wed, Jul 14, 2010 at 12:31 PM, Black, Michael (IS) <michael.bla...@ngc.com> wrote:

> If you could set the journal location BEFORE you open the database, that
> wouldn't be such a bad thing. Giving us the ability to do this would allow
> for the flexibility when needed, with appropriate warnings about how to
> recover.

You can write your own VFS that places the journal file wherever you want.

--
D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Sqlite Insert Speed Optimization
Yuck... do you know what bandwidth you have? Looks kind of like a T1 line to me. That should allow 100Meg to come across in about 13 minutes if it's not being used for anything else.

Unless you're monitoring time usage in your application, how do you know where your time is being spent? If the T1 line is busy that will slow you down more, and if you're not measuring both Oracle and Sqlite time independently, how do you know??

I would optimize your local write first to see what the optimum transaction size is and what other options you can set. Then use those options on your network write.

And have you considered trying 3.7.0 and using WAL? Seems like your application might benefit quite a bit from that.

You could also split your app in 2 (this would work with CLAR too). Have your Oracle app write to a named pipe and your CLAR/SQlite app read the data from the named pipe. Then your Oracle app can trot off to retrieve another record while your DB app writes to your database. This is a pretty simple way to do multitasking without writing code and would get rid of some of your latency.

I assume you've got a test data file with all your SQL statements in it that you can test with? That's the best way to optimize your local database writing options.

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Werner Smit
Sent: Wed 7/14/2010 11:43 AM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

> If you don't know why it's slow you'll be shooting in the dark. And doing
> compression on a local network link isn't likely to buy you much. Might
> even hurt.
>
> In other words, is it latency or bandwidth? Given 8K/sec I'm guessing it's
> latency unless you're running a 64KBit line. Are you THAT slow??? What's
> your ping time latency look like?

Ping latency:

    Pinging allpay [196.11.183.3] with 32 bytes of data:
    Reply from 196.11.183.3: bytes=32 time=115ms TTL=250
    Reply from 196.11.183.3: bytes=32 time=175ms TTL=250
    ..
    Reply from 196.11.183.3: bytes=32 time=228ms TTL=250
    Reply from 196.11.183.3: bytes=32 time=264ms TTL=250

    Ping statistics for 196.11.183.3:
        Packets: Sent = 8, Received = 8, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 74ms, Maximum = 264ms, Average = 178ms

DISCLAIMER: Everything in this email and its attachments relating to the official business of Face Technologies is proprietary to Face Technologies. It is confidential, legally privileged and protected by law. The person addressed in the email is the sole authorised recipient. Any unauthorized dissemination or copying of this e-mail (or any attachment to this e-mail) or the wrongful disclosure of the information herein contained is prohibited.
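[Editor's note: the "optimize the transaction size locally first" advice above can be sketched as below, using Python's sqlite3 module; row counts and batch sizes are illustrative only. Each COMMIT in rollback-journal mode pays journal syncs, so fewer, larger transactions usually win on slow storage.]

```python
import os
import sqlite3
import tempfile

def load(path, rows, batch):
    """Insert rows, committing once per `batch` rows; return the final count."""
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE t(x)")
    for start in range(0, len(rows), batch):
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO t VALUES (?)", rows[start:start + batch])
        conn.execute("COMMIT")
    n = conn.execute("SELECT count(*) FROM t").fetchone()[0]
    conn.close()
    return n

d = tempfile.mkdtemp()
rows = [(i,) for i in range(10_000)]
small = load(os.path.join(d, "small.db"), rows, 500)     # 20 commits
large = load(os.path.join(d, "large.db"), rows, 10_000)  # 1 commit
```

Timing the two loads (e.g. with time.perf_counter()) against the actual network mount is the apples-to-apples comparison suggested elsewhere in the thread: vary only the batch size, keep everything else constant.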
Re: [sqlite] Sqlite Insert Speed Optimization
Werner Smit wrote:

> After taking out count(*) and adding a few pragmas and saving 6000
> records rather than 500 at a time, I've got it down to 34 minutes.
> If I build it on a local drive it takes 28 minutes (with chunks of 500).

Why not do an apples-to-apples test and commit the same number of records per batch in each test? The idea was to vary only one thing (the mount point) and keep all other variables constant.

250ms ping times, wow. SQLite write speeds will be better if the NFS server is on the same planet as the client.

--
Eric A. Smith

    Software is like entropy. It is difficult to grasp, weighs nothing,
    and obeys the Second Law of Thermodynamics, i.e., it always increases.
Re: [sqlite] Sqlite Insert Speed Optimization
> If you don't know why it's slow you'll be shooting in the dark. And doing
> compression on a local network link isn't likely to buy you much. Might
> even hurt.
>
> In other words, is it latency or bandwidth? Given 8K/sec I'm guessing it's
> latency unless you're running a 64KBit line. Are you THAT slow??? What's
> your ping time latency look like?

Ping latency:

    Pinging allpay [196.11.183.3] with 32 bytes of data:
    Reply from 196.11.183.3: bytes=32 time=115ms TTL=250
    Reply from 196.11.183.3: bytes=32 time=175ms TTL=250
    ..
    Reply from 196.11.183.3: bytes=32 time=228ms TTL=250
    Reply from 196.11.183.3: bytes=32 time=264ms TTL=250

    Ping statistics for 196.11.183.3:
        Packets: Sent = 8, Received = 8, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 74ms, Maximum = 264ms, Average = 178ms
Re: [sqlite] Sqlite Insert Speed Optimization
If you could set the journal location BEFORE you open the database, that wouldn't be such a bad thing. Giving us the ability to do this would allow for the flexibility when needed, with appropriate warnings about how to recover.

In particular, if you only access the database through your own application there shouldn't be a problem. Only if you try to access the database with some other "unaware" application.

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Eric Smith
Sent: Wed 7/14/2010 11:24 AM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

Black, Michael (IS) wrote:

> Does anybody know how to make the journal file go to a different
> location than the database? Apparently it's not treated as a "temporary"
> file. Perhaps it should be??

Seems like you'd have to communicate the journal location to other processes, meaning you'd have to write the name of the journal file into the main db, in the header or something. I think sqlite doesn't do that at the moment, which means you'd have to change the file format, which sqlite devs are loath to do.

--
Eric A. Smith
Re: [sqlite] Sqlite Insert Speed Optimization
If you don't know why it's slow you'll be shooting in the dark. And doing compression on a local network link isn't likely to buy you much. Might even hurt.

In other words, is it latency or bandwidth? Given 8K/sec I'm guessing it's latency unless you're running a 64KBit line. Are you THAT slow??? What's your ping time latency look like?

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Werner Smit
Sent: Wed 7/14/2010 11:23 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] EXTERNAL: Re: Sqlite Insert Speed Optimization

> On 14 July 2010 17:00, Werner Smit wrote:
>
>> It DOES sound terrible, since 90%? of the time is spent retrieving
>> data from a remote oracle server over a slow line.
>
> I think you're trying to optimise the wrong thing :)
>
> Assuming you can't upgrade that slow line, how about running a compressed
> ssh tunnel between the oracle server and the client, port-forwarding
> 1521/tcp over it and changing your tnsnames to point at the client machine?
>
> Paul.

Wow! That sounds fascinating! But since my client is in the banking environment they have strict procedures about network control. I will research this option, but the target date for implementation would possibly be mid-2020.

I'd however like to test this with a local server - is there a FAQ somewhere about compressed ssh tunneling on oracle ports? Is it open source? Have you done it yourself? What was the speed improvement?

Werner
Re: [sqlite] Sqlite Insert Speed Optimization
Black, Michael (IS) wrote:

> Does anybody know how to make the journal file go to a different
> location than the database? Apparently it's not treated as a "temporary"
> file. Perhaps it should be??

Seems like you'd have to communicate the journal location to other processes, meaning you'd have to write the name of the journal file into the main db, in the header or something. I think sqlite doesn't do that at the moment, which means you'd have to change the file format, which sqlite devs are loath to do.

--
Eric A. Smith

    Worthless.
        -- Sir George Bidell Airy, KCB, MA, LLD, DCL, FRS, FRAS
           (Astronomer Royal of Great Britain), estimating for the
           Chancellor of the Exchequer the potential value of the
           "analytical engine" invented by Charles Babbage,
           September 15, 1842.
Re: [sqlite] Sqlite Insert Speed Optimization
Now I'm confused... how can you be 50% faster if 90% of the time is in retrieving from Oracle?

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Werner Smit
Sent: Wed 7/14/2010 11:11 AM
To: General Discussion of SQLite Database
Subject: EXTERNAL: Re: [sqlite] Sqlite Insert Speed Optimization

> [...]
Re: [sqlite] Sqlite Insert Speed Optimization
>> According to my math your final database size should be on the order of 100Meg?
>>
>> That means at 200 minutes and 1,000,000 records:
>> 83 inserts per second
>> 8333 bytes per second
>> Both of these values are terrible.
>>
>> #1 What kind of network connection do you have? 100BaseT?
>> #2 What kind of server are you writing to?
>> #3 How fast does this run if you write to your local machine?
>>
>> Michael D. Black

> It DOES sound terrible, since 90%? of the time is spent retrieving
> data from a remote oracle server over a slow line. The problem is that
> the time spent saving to sqlite is still more than I used to spend on
> saving to my previous file system.

> Can you answer #3 though?

It's nearly 50% faster.

> Does anybody know how to make the journal file go to a different location
> than the database? Apparently it's not treated as a "temporary" file.
> Perhaps it should be??
Re: [sqlite] Sqlite Insert Speed Optimization
Can you answer #3 though?

Does anybody know how to make the journal file go to a different location than the database? Apparently it's not treated as a "temporary" file. Perhaps it should be??

Michael D. Black
Senior Scientist
Northrop Grumman Mission Systems

From: sqlite-users-boun...@sqlite.org on behalf of Werner Smit
Sent: Wed 7/14/2010 10:59 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] EXTERNAL: Re: Sqlite Insert Speed Optimization

> According to my math your final database size should be on the order of 100Meg?
>
> That means at 200 minutes and 1,000,000 records:
> 83 inserts per second
> 8333 bytes per second
> Both of these values are terrible.
>
> #1 What kind of network connection do you have? 100BaseT?
> #2 What kind of server are you writing to?
> #3 How fast does this run if you write to your local machine?
>
> Michael D. Black

It DOES sound terrible, since 90%? of the time is spent retrieving data from a remote oracle server over a slow line. The problem is that the time spent saving to sqlite is still more than I used to spend on saving to my previous file system.
Re: [sqlite] Sqlite Insert Speed Optimization
>> On 14 Jul 2010, at 11:55am, Werner Smit wrote:
>>
>>> ps. When I started with sqlite it took 500 minutes to save the 1 million
>>> records. I've got it down to just less than 200 minutes with current
>>> settings. Clarion does it in between 100 and 200 minutes.
>>
>> Do you have any indexes defined ? It can be considerably faster to DROP
>> the indexes before importing all that data, then remake them afterwards.
>> Of course you may already be doing this.
>>
>> Simon.

Nope, no index, only a primary key. The table I'm using for test purposes looks like this:

    CREATE TABLE Due (
        Pin          Integer,
        IDNumber     Char(13),
        Name         VarChar(35),
        PayPeriod    Integer,
        PaypointCode VarChar(6),
        RegionCode   VarChar(6),
        ProxyPin     Integer,
        PRIMARY KEY (Pin)
    );
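[Editor's note: a hedged sketch of bulk-inserting into the schema above with Python's sqlite3 module; the sample rows are synthetic, not the poster's data, and the in-memory database stands in for the real file.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real file
conn.execute("""
    CREATE TABLE Due (
        Pin          Integer,
        IDNumber     Char(13),
        Name         VarChar(35),
        PayPeriod    Integer,
        PaypointCode VarChar(6),
        RegionCode   VarChar(6),
        ProxyPin     Integer,
        PRIMARY KEY (Pin)
    )""")

# Synthetic rows purely for shape.  Inserting in ascending Pin order keeps
# the primary-key b-tree appending to its rightmost page, which is cheap.
rows = [(i, "0" * 13, "Name %d" % i, 201007, "PP%04d" % (i % 50),
         "RG%03d" % (i % 9), 0) for i in range(1000)]
with conn:  # one transaction for the whole batch
    conn.executemany("INSERT INTO Due VALUES (?,?,?,?,?,?,?)", rows)

count = conn.execute("SELECT count(*) FROM Due").fetchone()[0]
```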
Re: [sqlite] Sqlite Insert Speed Optimization
On 14 Jul 2010, at 11:55am, Werner Smit wrote:

> ps. When I started with sqlite it took 500 minutes to save the 1 million
> records. I've got it down to just less than 200 minutes with current
> settings. Clarion does it in between 100 and 200 minutes.

Do you have any indexes defined? It can be considerably faster to DROP the indexes before importing all that data, then remake them afterwards. Of course you may already be doing this.

Simon.
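[Editor's note: a sketch of the drop-then-remake pattern described above, in Python's sqlite3 module. The table and index names are illustrative; the poster's actual table turns out to have no secondary index, so this is purely the general shape.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE due (pin INTEGER PRIMARY KEY, region TEXT)")
conn.execute("CREATE INDEX idx_region ON due(region)")

# 1. Drop the secondary index so the bulk load maintains only the table.
conn.execute("DROP INDEX idx_region")

# 2. Import everything inside one transaction.
with conn:
    conn.executemany("INSERT INTO due VALUES (?, ?)",
                     [(i, "RG%04d" % (i % 10)) for i in range(5000)])

# 3. Rebuild the index once, in a single pass over the loaded data.
conn.execute("CREATE INDEX idx_region ON due(region)")

names = [row[1] for row in conn.execute("PRAGMA index_list('due')")]
count = conn.execute("SELECT count(*) FROM due").fetchone()[0]
```

The win comes from step 3 sorting the keys once, instead of step 2 doing a random b-tree insertion into the index for every row.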
Re: [sqlite] Sqlite Insert Speed Optimization
-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Paul Corke
Sent: 14 July 2010 01:03 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Sqlite Insert Speed Optimization

> On 14 July 2010 11:56, Werner Smit wrote:
>
>> 3. I'm saving to a network drive.
>
> Is this a one-off data import into a new clean sqlite db? If so, have
> you considered writing to a db file on a local drive and then copying
> the whole file to the network drive afterwards?
>
> Paul.

I have considered that! I do allow the users to do a clean build or an update. A clean build is usually once a week - I could build that on a local drive, BUT the problem is that if the copy fails and leaves them with an incomplete file, I'm in trouble. To manage this and make sure all is well is not an easy task. I did check, and the sqlite speed on a local drive seems to be much faster.

Would it be possible to tell sqlite to save the journal file to a local drive and merge it with the server file on transaction completion?

Another option - with much more work - would be to create a db on a local drive, fill it with X records, and start a separate thread to merge this with the network drive while my program is busy fetching the next X-record batch. But if at all possible I'd like to not make too many changes to the current program. I like the KISS scenario: "keep it short and simple".
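[Editor's note: one way to address the "failed copy leaves an incomplete file" worry above is to copy to the destination under a temporary name and only rename over the live database once the copy finished. os.replace() is atomic on POSIX local filesystems, though NOT guaranteed atomic on network shares, which is exactly the caveat here. All paths below are illustrative; the temp directories stand in for the local and network drives.]

```python
import os
import shutil
import sqlite3
import tempfile

local = os.path.join(tempfile.mkdtemp(), "build.db")
share = tempfile.mkdtemp()  # stands in for the network drive

# Build the database fast on the local drive.
conn = sqlite3.connect(local)
with conn:
    conn.execute("CREATE TABLE t(x)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
conn.close()

# Copy under a temporary name; a failure here leaves any existing due.db
# on the share untouched.  Only a completed copy gets renamed into place.
tmp = os.path.join(share, "due.db.partial")
shutil.copyfile(local, tmp)
os.replace(tmp, os.path.join(share, "due.db"))

check = sqlite3.connect(os.path.join(share, "due.db"))
n = check.execute("SELECT count(*) FROM t").fetchone()[0]
check.close()
```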
Re: [sqlite] Sqlite Insert Speed Optimization
On 14 July 2010 11:56, Werner Smit wrote:

> 3. I'm saving to a network drive.

Is this a one-off data import into a new clean sqlite db? If so, have you considered writing to a db file on a local drive and then copying the whole file to the network drive afterwards?

Paul.
Re: [sqlite] Sqlite Insert Speed Optimization
> Thanks for all the feedback!
>
> It helped a lot.
> 1. I'm going to try and see what happens if I leave the "end transaction"
> until 5 inserts are done.

This is what I meant also when I said 500 was too small.

John
Re: [sqlite] Sqlite Insert Speed Optimization
Thanks for all the feedback! It helped a lot.

1. I'm going to try and see what happens if I leave the "end transaction" until 5 inserts are done.
2. I'm going to increase cache_size from 8192 to 16384.

Extra info:
1. This program saved data to a Clarion file before, and in sqlite it's about 20-30% slower.
2. I'm reading data from a remote Oracle server.
3. I'm saving to a network drive.
4. If my program seems to hang for longer than about 3-5 minutes, the user tends to kill it.

ps. When I started with sqlite it took 500 minutes to save the 1 million records. I've got it down to just less than 200 minutes with the current settings. Clarion does it in between 100 and 200 minutes.
Re: [sqlite] Sqlite Insert Speed Optimization
-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Drescher
Sent: Tuesday, July 13, 2010 12:37 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Sqlite Insert Speed Optimization

> I also wrap my statements (about 500 inserts at a time) with a
> begin/end transaction.
> After these 500 I take a few seconds to read more data so sqlite
> should have time to do any housekeeping it might need.

Wrap more into a transaction. 500 is too small of a percentage of a million.

John

John,

I was wondering if that's really so. Wouldn't the marginal speed improvement be quite small? Is the percentage of the final rowcount really a criterion?

I did find that the speed improvement was quite small, and to the user it appears as if the machine is unresponsive.
Re: [sqlite] Sqlite Insert Speed Optimization
Griggs, Donald wrote:
> I guess I was wondering if the fastest records-per-transaction value
> would depend on the page cache and be more or less independent of the
> total records to be imported.

I think the page cache is one of a great many variables.

> So, the records-per-transaction for import to a 20 million row table
> should be twenty times the size for a 1 million row table?

I'm no sqlite or sql guru myself, so take this with a grain of salt: if you have no reason to commit in the middle of a batch, then don't do it. I think inserting all the rows in a single go will give you the best insert performance in most use cases.

The idea is that there is some fixed overhead (call it O) that SQLite has to pay every time it commits a transaction. The overhead is 'fixed' because it is independent of the number of rows you inserted. If you insert 1m rows and commit every 500, the total commit overhead is 2000*O. If you commit just once, the total commit overhead is just O.

This argument is likely a small or big lie for a number of reasons, but it is at least a push in the right direction.

Eric

--
Eric A. Smith

The problem with engineers is that they tend to cheat in order to get results.
The problem with mathematicians is that they tend to work on toy problems in order to get results.
The problem with program verifiers is that they tend to cheat at toy problems in order to get results.
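[Editor's note: Eric's 2000*O arithmetic can be made concrete. The sketch below uses Python's stdlib sqlite3 module (not the poster's Clarion/C setup) and simply counts how many times the fixed per-commit overhead O is paid for a given batch size; `bulk_insert` is an illustrative name.]

```python
import sqlite3

def bulk_insert(rows, batch):
    """Insert `rows` one-column records, committing every `batch` rows.
    Returns how many commits were performed -- each one pays the fixed
    overhead O from the argument above."""
    con = sqlite3.connect(":memory:", isolation_level=None)  # manual txns
    con.execute("CREATE TABLE t(x)")
    commits = 0
    for start in range(0, rows, batch):
        con.execute("BEGIN")
        con.executemany("INSERT INTO t VALUES (?)",
                        ((i,) for i in range(start, min(start + batch, rows))))
        con.execute("COMMIT")
        commits += 1
    con.close()
    return commits

# 1m rows committed every 500 pays O 2000 times; one big transaction pays it once.
```

Timing `bulk_insert(1_000_000, 500)` against `bulk_insert(1_000_000, 1_000_000)` on an on-disk database (rather than `:memory:`) is where the difference shows up, since O is dominated by disk syncs.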
Re: [sqlite] Sqlite Insert Speed Optimization
Griggs, Donald wrote:
> Is the percentage of the final rowcount really a criterion?

The answer to that, according to my brief exploration, is somewhere between "yes" and "very much yes", depending on various factors.

Thanks, Eric.

I guess I was wondering if the fastest records-per-transaction value would depend on the page cache and be more or less independent of the total records to be imported. (Indices omitted.)

So, the records-per-transaction for import to a 20 million row table should be twenty times the size for a 1 million row table? I confess I've got a lot to learn.
Re: [sqlite] Sqlite Insert Speed Optimization
Werner Smit wrote:
> My question(s)
> If I want to use the "insert or replace" to populate my database of
> around 1 million records.
> And I want to do it as fast as possible.
> What are all the tricks I can use?

Obey the first rule of optimization: don't do it unless you're sure you need to. If you're sure you need to, here are a few hints from my own experience. They all come with trade-offs.

Buy faster hardware with more memory. I am serious. Sometimes it is much cheaper to wait 6 months for CPU speeds and memory sizes to increase than to spend time optimizing your code.

Don't define triggers, indices, unique constraints, check constraints, or primary keys. Make sure foreign key checking is off (which it is by default).

Give sqlite a large page cache. If you are on a 32-bit platform then sqlite can't use more than 4Gb of memory, so keep that in mind. If you give sqlite more memory than exists on your machine, you might go to swap hell, so don't do that.

If you are CPU bound, and if you can split your problem into orthogonal chunks, and if you have multiple CPUs, consider farming out the work to worker processes and incorporating their results into the main database.

Depending on your use case, you can consider telling your operating system to favor the disk cache over processes' memory when you are running low on RAM. In Linux this is accomplished by setting 'swappiness' high; I'm not sure about other OSs.

> I had a count(*) to check how many inserts was actually done (4 progress
> bar) - and this slowed me down very much.

That's because count(*) doesn't run in constant time. I'm not sure, but it may be linear in the table size. That would imply that your algo as a whole becomes quadratic instead of its original (likely linear) asymptotic behavior.

> Took it out, and want to use "select total_changes() " to keep track of
> inserts. Any problem with that?

You sound like you are writing a multi-threaded program. Are you sure that total_changes() is only counting changes due to your insertions?

Consider keeping a loop execution counter and using that for your status bar.

Good luck,
Eric

--
Eric A. Smith

Electricity is actually made up of extremely tiny particles called electrons, that you cannot see with the naked eye unless you have been drinking.
    -- Dave Barry
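[Editor's note: Eric's loop-counter suggestion, sketched with Python's stdlib sqlite3 module for illustration. The counter is free to read on every row, and it can be cross-checked against the connection's total-changes count - which, as Eric warns, only matches if nothing else writes through the same connection.]

```python
import sqlite3

con = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, v)")

inserted = 0                      # the loop counter Eric suggests
con.execute("BEGIN")
for i in range(1000):
    con.execute("INSERT OR REPLACE INTO t VALUES (?, ?)", (i, i * 2))
    inserted += 1                 # drive the progress bar from this, not count(*)
con.execute("COMMIT")

# total_changes counts every insert/update/delete made through this
# connection, so it matches the counter only if nothing else writes on it.
```

Unlike `SELECT count(*)`, reading the counter costs nothing per row, so the progress bar no longer competes with the import.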
Re: [sqlite] Sqlite Insert Speed Optimization
Griggs, Donald wrote:
> Is the percentage of the final rowcount really a criterion?

The answer to that, according to my brief exploration, is somewhere between "yes" and "very much yes", depending on various factors.

--
Eric A. Smith

The number of UNIX installations has grown to 10, with more expected.
    -- The Unix Programmer's Manual, 2nd Edition, June 1972
Re: [sqlite] Sqlite Insert Speed Optimization
On 13 Jul 2010, at 5:33pm, Werner Smit wrote:

> I currently use the following pragmas (for speed)
> temp_store = 2
> page_size=4096
> cache_size=8192
> synchronous=off
>
> Any others I could try?

Don't get too involved in the PRAGMAs until you have a good reason to. The default values are pretty good. How much slower is it than you want it to be? If you need, say, a 5% improvement we might suggest some things, but if you need a 50% improvement we might suggest more radical (and harder to program) solutions. If you're optimizing just for the sake of it, find something better to do.

If you're doing a huge amount of database loading, it's faster to do it before you create any INDEXes, and then create your INDEXes once your TABLEs are populated. Once your database is initialised, do you expect to do more reads or more writes? Which one you do more of suggests how many INDEXes you should define.

> I also wrap my statements (about 500 inserts at a time) with a begin/end
> transaction.

As JD wrote, at 500 writes in a transaction you're wasting a lot of time in overheads. Try 50,000.

> After these 500 I take a few seconds to read more data so sqlite should
> have time to do any housekeeping it might need.

Unlike, for example, MySQL, SQLite does nothing in the background. The only functions it runs are the ones you call directly: it has no server process and no daemon. However, your hardware may be caching writes or something, so your hardware may be taking advantage of the pauses.

> I had a count(*) to check how many inserts was actually done (4 progress
> bar) - and this slowed me down very much.
> Took it out, and want to use "select total_changes() " to keep track of
> inserts.

Much better. As an alternative (and I'm not saying it's better than what you already have) take a look at http://www.sqlite.org/c3ref/last_insert_rowid.html

Simon.
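[Editor's note: Simon's "create your INDEXes once your TABLEs are populated" advice, sketched with Python's stdlib sqlite3 module. `load` and the `t_a` index name are illustrative; both orders produce an identical database, so the choice is purely about load speed.]

```python
import sqlite3

def load(index_first, n=20000):
    """Bulk-load n rows, building the index either before or after the load.
    Same final database either way; indexing after the load is the usual
    advice because it is one sorted pass instead of n incremental updates."""
    con = sqlite3.connect(":memory:", isolation_level=None)
    con.execute("CREATE TABLE t(a, b)")
    if index_first:
        con.execute("CREATE INDEX t_a ON t(a)")   # every insert now updates it
    con.execute("BEGIN")
    con.executemany("INSERT INTO t VALUES (?, ?)", ((i, -i) for i in range(n)))
    con.execute("COMMIT")
    if not index_first:
        con.execute("CREATE INDEX t_a ON t(a)")   # one pass over loaded data
    return con
```

Timing the two calls on an on-disk database with a realistic row count is the quickest way to see whether the reordering is worth it for a given workload.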
Re: [sqlite] Sqlite Insert Speed Optimization
On Tue, Jul 13, 2010 at 12:48 PM, John Drescher wrote:

>> I was wondering if that's really so. Wouldn't the marginal speed
>> improvement be quite small? Is the percentage of the final rowcount
>> really a criterion?
>
> Each transaction costs at least 1 disk seek. With thousands of seeks
> the result would be very slow.

I guess synchronous=off eliminates this flushing/seeking, though.

John
Re: [sqlite] Sqlite Insert Speed Optimization
> I was wondering if that's really so. Wouldn't the marginal speed improvement
> be quite small? Is the percentage of the final rowcount really a criterion?

Each transaction costs at least 1 disk seek. With thousands of seeks the result would be very slow.

John
Re: [sqlite] Sqlite Insert Speed Optimization
-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of John Drescher
Sent: Tuesday, July 13, 2010 12:37 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Sqlite Insert Speed Optimization

> I also wrap my statements (about 500 inserts at a time) with a
> begin/end transaction.
> After these 500 I take a few seconds to read more data so sqlite
> should have time to do any housekeeping it might need.

Wrap more into a transaction. 500 is too small of a percentage of a million.

John

John,

I was wondering if that's really so. Wouldn't the marginal speed improvement be quite small? Is the percentage of the final rowcount really a criterion?
Re: [sqlite] Sqlite Insert Speed Optimization
> After these 500 I take a few seconds to read more data so sqlite should
> have time to do any housekeeping it might need.

SQLite is not a database server. It has no background threads, so it can't do any "housekeeping" until you call some sqlite3_* function.

Pavel

On Tue, Jul 13, 2010 at 12:33 PM, Werner Smit wrote:
> Hi there.
>
> I've been playing around with sqlite. Very impressed so far. Using 3.5.6
> in Windows, developing with Clarion.
>
> My question(s):
> If I want to use "insert or replace" to populate my database of around
> 1 million records, and I want to do it as fast as possible, what are all
> the tricks I can use?
>
> I've read http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
> and looked at some of the various pragmas.
>
> But would PRAGMA locking_mode = EXCLUSIVE, for instance, give me an extra
> speed increase? Do we have a page on www.sqlite.org dedicated to speed
> optimization?
>
> I currently use the following pragmas (for speed):
> temp_store = 2
> page_size=4096
> cache_size=8192
> synchronous=off
>
> Any others I could try?
>
> I also wrap my statements (about 500 inserts at a time) with a begin/end
> transaction. After these 500 I take a few seconds to read more data so
> sqlite should have time to do any housekeeping it might need.
>
> I had a count(*) to check how many inserts was actually done (4 progress
> bar) - and this slowed me down very much. Took it out, and want to use
> "select total_changes()" to keep track of inserts. Any problem with that?
>
> This is my 1st post here and I have a feeling I'm asking too many
> questions at once? :-)
>
> Cheers,
> Werner
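[Editor's note: Werner's setup - his pragma list plus batched "insert or replace" in explicit transactions - can be sketched end to end. The sketch uses Python's stdlib sqlite3 module for illustration (Werner's program is Clarion); the `rec` table and `save_batch` helper are invented names, and the pragma values are his, including `synchronous=off`, which trades crash safety for speed.]

```python
import sqlite3

con = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
# Werner's pragma list, verbatim; synchronous=off means a crash or power
# loss mid-run can corrupt an on-disk database.
for pragma in ("temp_store = 2", "page_size = 4096",
               "cache_size = 8192", "synchronous = off"):
    con.execute("PRAGMA " + pragma)

con.execute("CREATE TABLE rec(id INTEGER PRIMARY KEY, payload)")

def save_batch(rows):
    """One explicit transaction around each batch of INSERT OR REPLACE."""
    con.execute("BEGIN")
    con.executemany("INSERT OR REPLACE INTO rec VALUES (?, ?)", rows)
    con.execute("COMMIT")

save_batch([(i, "v%d" % i) for i in range(500)])
save_batch([(i, "w%d" % i) for i in range(250)])  # repeated ids replace rows
```

Note that INSERT OR REPLACE needs a unique constraint (here the INTEGER PRIMARY KEY) to decide what "replace" means, which is in tension with Eric's advice elsewhere in the thread to drop constraints during bulk loads.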
Re: [sqlite] Sqlite Insert Speed Optimization
> I also wrap my statements (about 500 inserts at a time) with a begin/end
> transaction.
> After these 500 I take a few seconds to read more data so sqlite should
> have time to do any housekeeping it might need.

Wrap more into a transaction. 500 is too small of a percentage of a million.

John