Re: [sqlite] Process memory space exhausted in 3.7.0
A couple of seconds before the failure occurs, the test.db-wal file is 5,502,389 KB and test.db-shm is 10,688 KB. Private bytes (probably the best measure of how much memory a Windows application is using) is perhaps a few megs above 130 MB.

The change to have it commit every 25 records, by changing the source code from this:

    rc = sqlite3_step( statement );
    if ( rc != SQLITE_DONE )
    {
        std::string errorMessage( sqlite3_errmsg( connection ) );
        std::runtime_error ex( errorMessage );
        sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
        sqlite3_finalize( statement );
        sqlite3_close( connection );
        statement = NULL;
        connection = NULL;
        std::cerr << errorMessage;
        throw ex;
    }
    sqlite3_reset( statement );

to this:

    rc = sqlite3_step( statement );
    if ( rc != SQLITE_DONE )
    {
        std::string errorMessage( sqlite3_errmsg( connection ) );
        std::runtime_error ex( errorMessage );
        sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
        sqlite3_finalize( statement );
        sqlite3_close( connection );
        statement = NULL;
        connection = NULL;
        std::cerr << errorMessage;
        throw ex;
    }
    sqlite3_reset( statement );
    if ( recIdx % 25 == 0 )
    {
        sqlite3_exec( connection, "COMMIT TRANSACTION;BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
    }

is currently running, but it looks like it will take quite a bit of time to finish, so I'll have to update you tomorrow.

Victor

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian Filter detected spam

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge
> insert. The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line,
> with this source file and the sqlite3 3.7.0 amalgamation files in the
> same folder, is:
>
>     cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
>     fail.exe 8000
>
> The source for the executable is listed below. If you're wondering why
> the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite, because I
> had not realized that the failure could be reproduced by simply
> inserting. Beware that there is no handling of incorrect command line
> arguments.
>
> If you monitor this executable run with perfmon and look at its virtual
> bytes, you'll see them hit 2 GB, and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm" files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped into the process's address space) grows to about 24 MB. The test.db-wal file grows to about 12 GB. This is as expected - the *-shm file requires 8 bytes of space for each page written into the *-wal log, so with your 4 KB pages the *-wal file should be roughly 512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed. In the checkpoint code there is a big heap memory allocation - say 2.5 bytes for each page in the WAL, so in this case maybe 6-7 MB. It's not ideal to be making allocations this big, but it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130 MB. That's the 85 MB of configured cache space (20,000 * 4 KB pages), plus the mapping of the test.db-shm file, plus the big allocation made during the checkpoint. Plus something else, I suppose.
> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian Filter detected spam
I'll run the test again tonight to give you the size of the -wal and -shm files at the point of failure.

On WinXP (32-bit) the memory used by the application is not a problem, as it also stays in the low hundreds of MB, even though the virtual bytes (explained by perfmon as: "Virtual Bytes is the current size, in bytes, of the virtual address space the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. Virtual space is finite, and by using too much, the process can limit its ability to load libraries.") hit the 2 GB max.

Committing without closing the connection did not help. I'll write the code for the corresponding tests and I'll reply again later today.

Later!

Victor

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian Filter detected spam

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge
> insert. The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line,
> with this source file and the sqlite3 3.7.0 amalgamation files in the
> same folder, is:
>
>     cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
>     fail.exe 8000
>
> The source for the executable is listed below.
> If you're wondering why the numbers being inserted are more
> complicated than need be, it's because I just wanted the table and
> indices to look as much as possible like the actual data that our
> application stores in sqlite, because I had not realized that the
> failure could be reproduced by simply inserting. Beware that there is
> no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its virtual
> bytes, you'll see them hit 2 GB, and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm" files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped into the process's address space) grows to about 24 MB. The test.db-wal file grows to about 12 GB. This is as expected - the *-shm file requires 8 bytes of space for each page written into the *-wal log, so with your 4 KB pages the *-wal file should be roughly 512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed. In the checkpoint code there is a big heap memory allocation - say 2.5 bytes for each page in the WAL, so in this case maybe 6-7 MB. It's not ideal to be making allocations this big, but it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130 MB. That's the 85 MB of configured cache space (20,000 * 4 KB pages), plus the mapping of the test.db-shm file, plus the big allocation made during the checkpoint. Plus something else, I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.
Re: [sqlite] Process memory space exhausted in 3.7.0
On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge
> insert. The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line,
> with this source file and the sqlite3 3.7.0 amalgamation files in the
> same folder, is:
>
>     cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
>     fail.exe 8000
>
> The source for the executable is listed below. If you're wondering why
> the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as possible
> like the actual data that our application stores in sqlite, because I
> had not realized that the failure could be reproduced by simply
> inserting. Beware that there is no handling of incorrect command line
> arguments.
>
> If you monitor this executable run with perfmon and look at its virtual
> bytes, you'll see them hit 2 GB, and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm" files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory-mapped into the process's address space) grows to about 24 MB. The test.db-wal file grows to about 12 GB. This is as expected - the *-shm file requires 8 bytes of space for each page written into the *-wal log, so with your 4 KB pages the *-wal file should be roughly 512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed. In the checkpoint code there is a big heap memory allocation - say 2.5 bytes for each page in the WAL, so in this case maybe 6-7 MB. It's not ideal to be making allocations this big, but it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130 MB. That's the 85 MB of configured cache space (20,000 * 4 KB pages), plus the mapping of the test.db-shm file, plus the big allocation made during the checkpoint. Plus something else, I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN") every 10,000 inserts the program still does not run to completion?

Dan.
Re: [sqlite] Process memory space exhausted in 3.7.0
    std::map< int, std::pair< int, int > > userId2ContextIdCount;
    const char* sourceStr = "test_failure";
    const char* types[] =
    {
        "type_01_1234567890_1234567890",
        "type_02_1234567890_1234567890",
        "type_03_1234567890_1234567890",
        "type_04_1234567890_1234567890",
        "type_05_1234567890_1234567890",
        "type_06_1234567890_1234567890",
        "type_07_1234567890_1234567890",
        "type_08_1234567890_1234567890",
        "type_09_1234567890_1234567890",
        "type_10_1234567890_1234567890"
    };
    std::cout << "Inserting into db." << std::endl;
    sqlite3_exec( connection, "BEGIN IMMEDIATE TRANSACTION;", NULL, NULL, NULL );
    for ( int recIdx = 0; recIdx < nRecords; ++recIdx )
    {
        int userId = std::rand() % 25000;
        int contextId = 0;
        std::map< int, std::pair< int, int > >::iterator ui2ciIter =
            userId2ContextIdCount.find( userId );
        if ( ui2ciIter != userId2ContextIdCount.end() )
        {
            if ( ui2ciIter->second.second > ( 50 + ( std::rand() % 1000 ) ) )
            {
                contextId = ui2ciIter->second.first + 1;
                ui2ciIter->second = std::make_pair< int, int >( contextId, 1 );
            }
            else
            {
                ui2ciIter->second.second += 1;
            }
        }
        else
        {
            contextId = 1;
            userId2ContextIdCount[ userId ] = std::make_pair< int, int >( contextId, 1 );
        }
        __int64 tstamp = std::time( NULL );
        tstamp *= 1000;
        tstamp += ( std::rand() % 1000 );
        tstamp += ( ( std::rand() % 90 ) - 45 );
        int typeIdx = std::rand() % 10;
        int serverId = ( recIdx + std::rand() % 1000 ) % 1000;
        int nSig = std::rand() % 10;
        int nNoi = nSig;
        sqlite3_bind_int( statement, 1, userId );
        sqlite3_bind_int( statement, 2, contextId );
        sqlite3_bind_int64( statement, 3, tstamp );
        /* length -1: bind the whole nul-terminated string (a length of 0
           would bind a zero-length string) */
        sqlite3_bind_text( statement, 4, sourceStr, -1, NULL );
        sqlite3_bind_text( statement, 5, types[ typeIdx ], -1, NULL );
        sqlite3_bind_int( statement, 6, serverId );
        sqlite3_bind_int( statement, 7, nSig );
        sqlite3_bind_int( statement, 8, nNoi );
        rc = sqlite3_step( statement );
        if ( rc != SQLITE_DONE )
        {
            std::string errorMessage( sqlite3_errmsg( connection ) );
            std::runtime_error ex( errorMessage );
            sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
            sqlite3_finalize( statement );
            sqlite3_close( connection );
            statement = NULL;
            connection = NULL;
            std::cerr << errorMessage;
            throw ex;
        }
        sqlite3_reset( statement );
    }
    sqlite3_exec( connection, "COMMIT TRANSACTION;", NULL, NULL, NULL );
    sqlite3_finalize( statement );
    sqlite3_close( connection );
    return 0;
}
--end main.cpp--

-----Original Message-----
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Victor Morales-Duarte
Sent: Wednesday, August 04, 2010 2:19 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] Process memory space exhausted in 3.7.0

Hello,

The Windows desktop application that I maintain uses sqlite for some of its storage. The data volume that we must handle has increased dramatically over the past 6 months, and as was to be expected, update performance has degraded accordingly. Because of that, I was very quick to jump onto 3.7 when I read that WAL could be selected with it. The speed improvements when updating data are indeed very noticeable when running the application on my laptop's drive (3x faster), although not so much when running on a fast SSD connected to it via eSATA (only about 20% extra speed); I guess the different ratios of improvement were to be expected given the access characteristics of each. Overall, I believe that WAL was a great addition.

Unfortunately, I've encountered what could potentially be considered a big problem. When I run a very large update, the process space for the application seems to be exhausted. The way it manifested itself at first was a disc I/O error, but because I test the application while running perfmon.exe on Win XP SP3 to monitor IO read bytes/sec, IO write bytes/sec, processor time, and virtual bytes, I noticed that the virtual bytes were at the 2 GB max process space limit when the disc I/O error occurred.

In order to rule out the possibility that I was doing something wrong, I decided to test a similar update using the sqlite3.exe CLI.
[sqlite] Process memory space exhausted in 3.7.0
Hello,

The Windows desktop application that I maintain uses sqlite for some of its storage. The data volume that we must handle has increased dramatically over the past 6 months, and as was to be expected, update performance has degraded accordingly. Because of that, I was very quick to jump onto 3.7 when I read that WAL could be selected with it. The speed improvements when updating data are indeed very noticeable when running the application on my laptop's drive (3x faster), although not so much when running on a fast SSD connected to it via eSATA (only about 20% extra speed); I guess the different ratios of improvement were to be expected given the access characteristics of each. Overall, I believe that WAL was a great addition.

Unfortunately, I've encountered what could potentially be considered a big problem. When I run a very large update, the process space for the application seems to be exhausted. The way it manifested itself at first was a disc I/O error, but because I test the application while running perfmon.exe on Win XP SP3 to monitor IO read bytes/sec, IO write bytes/sec, processor time, and virtual bytes, I noticed that the virtual bytes were at the 2 GB max process space limit when the disc I/O error occurred.

In order to rule out the possibility that I was doing something wrong, I decided to test a similar update using the sqlite3.exe CLI. During the update, what the application does is iterate over all the records in a table in a specific order, assigning a pair of integers to two columns (both initially null) of each record, based on domain-specific rules; accordingly, the test with the CLI is the opposite operation: I take a db file that is about 1.5 GB in size, with over 3.7 million records in the table that needs to be updated, and then I proceed to assign null to one of the columns for all records.

After some time of working, the virtual bytes (as reported by perfmon) hit the max process space and the disk I/O error is reported. At that time, the wal file is over 5.5 GB in size and the shm file is over 10 MB in size. My initial guess is that there is a problem memory-mapping files. I wish I could make the db available for testing, but the data contained in it cannot be disclosed due to an NDA, and the schema is proprietary information of my employer. First I need to finish a workaround for this (it seems that by closing and reopening the db connection, the situation improves somewhat), and then I will write a small piece of code that will create a dummy database large enough that the error can be reproduced in it, so that I can post it in a reply to this email.

Thank you!!!

Victor