Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
Zachary Turnerwrote: > I still don't understand the transactions. For example, I issue a > single BEGIN at the start of my application and then insert about > 500MB of data through many small inserts (about 4KB each). During > this whole time I never issue a commit. But the main db file grows > very large, and the journal file remains small. Is sqlite manually > forcing commits for me at some fixed threshold? SQLite uses an undo journal: changes are written to the database file, and the journal keeps the original pages. Committing a transaction simply means discarding the journal; rolling back means copying saved pages from journal back to database file. Your massive insert operation doesn't modify many pages in the database file - it mostly creates new ones by extending the file. For this case, the only thing one needs to store in the journal is the original size of the database, so that the file could be truncated on rollback. That's why the journal file remains the same size even as the main file grows. Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
On Wed, Jul 22, 2009 at 1:42 PM, Pavel Ivanovwrote: > > Again, try to call sqlite3_memory_used() several times during work of > your application. What does it say to you? It says it's using about 3MB. Which suggests maybe it's my application. I believe I've identified the problem in my code but it takes some work to fix so I can't report with 100% certainty how much of a difference it will make. > If you don't do commits (and "begins") of transactions by yourself > then SQLite does that automatically after each executed statement. So > when sqlite3_step returns you can be sure that everything is committed > and everything is on disk. SQLite doesn't do any write-through > caching. And as your transactions are small in volume then journal > size is also always small. I still don't understand the transactions. For example, I issue a single BEGIN at the start of my application and then insert about 500MB of data through many small inserts (about 4KB each). During this whole time I never issue a commit. But the main db file grows very large, and the journal file remains small. Is sqlite manually forcing commits for me at some fixed threshold? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
> Assuming I don't do any manual commits, what does sqlite do with the > data that has yet to be committed? If you don't do commits (and "begins") of transactions by yourself then SQLite does that automatically after each executed statement. So when sqlite3_step returns you can be sure that everything is committed and everything is on disk. SQLite doesn't do any write-through caching. And as your transactions are small in volume then journal size is also always small. > Unfortunately I don't have access > to memory leak detection tools, otherwise that would obviously be > ideal. Again, try to call sqlite3_memory_used() several times during work of your application. What does it say to you? Pavel On Wed, Jul 22, 2009 at 2:30 PM, Zachary Turnerwrote: > On Wed, Jul 22, 2009 at 10:47 AM, Pavel Ivanov wrote: >> SQLite synchronizes with disk during every commit (either issued by >> yourself or automatic) at least 2 times (I don't know exact number). >> So it's quite natural that it spends most of the time in winSync(). >> But I still didn't understand from your explanation how exactly your >> application works and whether it's SQLite uses memory or your >> application does. >> BTW, how do you measure memory usage and how do you see leakage? What >> does sqlite3_memory_used() returns for you? >> >> Pavel > > I was measuring memory usage by just looking at windows task manager. > If I watch it for about 20 seconds, it goes up indefinitely until I > stop reading more data from the file (and thus stop issuing insert > statements), at which point it steadly declines for a while. > > Assuming I don't do any manual commits, what does sqlite do with the > data that has yet to be committed? I thought it would store it in the > journal file, but the journal file always remains consistently very > small (around 8K max), and data gets written to the actual database > file even when I'm not doing commits. > > I have some ideas about the memory consumption problem that turns out > to be related to my own code (I agree it's amazingly complicated, but > it has to be for reasons outside of what we're doing with sqlite). I > will investigate that further and post back if I am able to pinpoint > the issue to sqlite more closely. Unfortunately I don't have access > to memory leak detection tools, otherwise that would obviously be > ideal. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
On Wed, Jul 22, 2009 at 10:47 AM, Pavel Ivanovwrote: > SQLite synchronizes with disk during every commit (either issued by > yourself or automatic) at least 2 times (I don't know exact number). > So it's quite natural that it spends most of the time in winSync(). > But I still didn't understand from your explanation how exactly your > application works and whether it's SQLite uses memory or your > application does. > BTW, how do you measure memory usage and how do you see leakage? What > does sqlite3_memory_used() returns for you? > > Pavel I was measuring memory usage by just looking at windows task manager. If I watch it for about 20 seconds, it goes up indefinitely until I stop reading more data from the file (and thus stop issuing insert statements), at which point it steadly declines for a while. Assuming I don't do any manual commits, what does sqlite do with the data that has yet to be committed? I thought it would store it in the journal file, but the journal file always remains consistently very small (around 8K max), and data gets written to the actual database file even when I'm not doing commits. I have some ideas about the memory consumption problem that turns out to be related to my own code (I agree it's amazingly complicated, but it has to be for reasons outside of what we're doing with sqlite). I will investigate that further and post back if I am able to pinpoint the issue to sqlite more closely. Unfortunately I don't have access to memory leak detection tools, otherwise that would obviously be ideal. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
just a point: In your pseudocode you precreate statements for the begin and commit and trying to reuse them later on. If your real code is like this I would suggest to take a look at this. I think it is not possible to "re-step" a begin or commit, you may try to recreate these statements completely. Maybe that is a reason you are seeing a lot of file-syncs because the begin/commit mechanism doesn't actually work as intended by you ? Marcus Zachary Turner wrote: > On Tue, Jul 21, 2009 at 6:45 PM, Pavel Ivanovwrote: >>> If I remove or comment out the >>> three lines with //* in the code above, I get no memory leaks. >> So basically you're saying that if you don't insert any data into your >> database and thus effectively don't do with your database anything and >> thus SQLite don't have to cache anything from database then you >> observe no memory leaks, right? Is it by any chance a growing database >> cache is what you see? What if you execute at the very beginning of >> your application this: >> pragma cache_size = 1; >> >> Pavel > > Thanks for your suggestion. I didn't have too much luck with it > unfortunately, but I did learn a few more things about the problem. > Just to make sure I understood you though, after creating the database > with sqlite3_open_v2 I ran the command > > sqlite3_exec(db_, "pragma cache_size=1;"); > > The return value was successful so I assume it worked. > > This doesn't appear to change the allocation behavior. I feel like it > might be related to me issuing INSERT queries too fast for sqlite to > deal with. This is asynchronous (not using the experimental Sqlite > asynchronous vfs but a completely different async model I wrote > myself) and the thread that issues inserts to sqlite is different than > the thread that reads data from the file. > > I changed the code a little so that it permanently stops reading more > data from the file after some fixed amount (I chose about 500MB) but > continues writing until all data has been committed to the database. > when I do this, I notice that after I stop my reads, it takes a *long > time* before all the writes are finished in sqlite. 99% of this time > is spent in the winSync() function in the sqlite source code > constantly calling FlushFileBuffers(). > > Upon reflection it makes sense that this would happen in an > asynchronous environment, but I thought that after a certain threshold > running an insert statement on a non-in-memory table would generate a > sync and simply block my sqlite background thread until there was no > more buffered data. > > In my async model I post messages between threads, so for example the > main thread posts a message to the sqlite thread saying "write this > data", which runs the pseudocode I posted in the OP, and after that's > done it posts a message to the main thread saying "I'm done writing > that data". the main thread only issues new read requests as a result > of getting that message, so if sqlite were syncing every so often then > I could guarantee that I wouldn't be in this situation because the > main thread wouldn't be able to issue any more reads until the sync > was done. > > > Am I misunderstanding sqlite's filesystem sync'ing algorithm? > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
SQLite synchronizes with disk during every commit (either issued by yourself or automatic) at least 2 times (I don't know exact number). So it's quite natural that it spends most of the time in winSync(). But I still didn't understand from your explanation how exactly your application works and whether it's SQLite uses memory or your application does. BTW, how do you measure memory usage and how do you see leakage? What does sqlite3_memory_used() returns for you? Pavel On Wed, Jul 22, 2009 at 11:26 AM, Zachary Turnerwrote: > On Tue, Jul 21, 2009 at 6:45 PM, Pavel Ivanov wrote: >>> If I remove or comment out the >>> three lines with //* in the code above, I get no memory leaks. >> >> So basically you're saying that if you don't insert any data into your >> database and thus effectively don't do with your database anything and >> thus SQLite don't have to cache anything from database then you >> observe no memory leaks, right? Is it by any chance a growing database >> cache is what you see? What if you execute at the very beginning of >> your application this: >> pragma cache_size = 1; >> >> Pavel > > Thanks for your suggestion. I didn't have too much luck with it > unfortunately, but I did learn a few more things about the problem. > Just to make sure I understood you though, after creating the database > with sqlite3_open_v2 I ran the command > > sqlite3_exec(db_, "pragma cache_size=1;"); > > The return value was successful so I assume it worked. > > This doesn't appear to change the allocation behavior. I feel like it > might be related to me issuing INSERT queries too fast for sqlite to > deal with. This is asynchronous (not using the experimental Sqlite > asynchronous vfs but a completely different async model I wrote > myself) and the thread that issues inserts to sqlite is different than > the thread that reads data from the file. > > I changed the code a little so that it permanently stops reading more > data from the file after some fixed amount (I chose about 500MB) but > continues writing until all data has been committed to the database. > when I do this, I notice that after I stop my reads, it takes a *long > time* before all the writes are finished in sqlite. 99% of this time > is spent in the winSync() function in the sqlite source code > constantly calling FlushFileBuffers(). > > Upon reflection it makes sense that this would happen in an > asynchronous environment, but I thought that after a certain threshold > running an insert statement on a non-in-memory table would generate a > sync and simply block my sqlite background thread until there was no > more buffered data. > > In my async model I post messages between threads, so for example the > main thread posts a message to the sqlite thread saying "write this > data", which runs the pseudocode I posted in the OP, and after that's > done it posts a message to the main thread saying "I'm done writing > that data". the main thread only issues new read requests as a result > of getting that message, so if sqlite were syncing every so often then > I could guarantee that I wouldn't be in this situation because the > main thread wouldn't be able to issue any more reads until the sync > was done. > > > Am I misunderstanding sqlite's filesystem sync'ing algorithm? > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
On Tue, Jul 21, 2009 at 6:45 PM, Pavel Ivanovwrote: >> If I remove or comment out the >> three lines with //* in the code above, I get no memory leaks. > > So basically you're saying that if you don't insert any data into your > database and thus effectively don't do with your database anything and > thus SQLite don't have to cache anything from database then you > observe no memory leaks, right? Is it by any chance a growing database > cache is what you see? What if you execute at the very beginning of > your application this: > pragma cache_size = 1; > > Pavel Thanks for your suggestion. I didn't have too much luck with it unfortunately, but I did learn a few more things about the problem. Just to make sure I understood you though, after creating the database with sqlite3_open_v2 I ran the command sqlite3_exec(db_, "pragma cache_size=1;"); The return value was successful so I assume it worked. This doesn't appear to change the allocation behavior. I feel like it might be related to me issuing INSERT queries too fast for sqlite to deal with. This is asynchronous (not using the experimental Sqlite asynchronous vfs but a completely different async model I wrote myself) and the thread that issues inserts to sqlite is different than the thread that reads data from the file. I changed the code a little so that it permanently stops reading more data from the file after some fixed amount (I chose about 500MB) but continues writing until all data has been committed to the database. when I do this, I notice that after I stop my reads, it takes a *long time* before all the writes are finished in sqlite. 99% of this time is spent in the winSync() function in the sqlite source code constantly calling FlushFileBuffers(). Upon reflection it makes sense that this would happen in an asynchronous environment, but I thought that after a certain threshold running an insert statement on a non-in-memory table would generate a sync and simply block my sqlite background thread until there was no more buffered data. In my async model I post messages between threads, so for example the main thread posts a message to the sqlite thread saying "write this data", which runs the pseudocode I posted in the OP, and after that's done it posts a message to the main thread saying "I'm done writing that data". the main thread only issues new read requests as a result of getting that message, so if sqlite were syncing every so often then I could guarantee that I wouldn't be in this situation because the main thread wouldn't be able to issue any more reads until the sync was done. Am I misunderstanding sqlite's filesystem sync'ing algorithm? ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Memory leak on inserting from sqlite c/c++ interfaces
> If I remove or comment out the > three lines with //* in the code above, I get no memory leaks. So basically you're saying that if you don't insert any data into your database and thus effectively don't do with your database anything and thus SQLite don't have to cache anything from database then you observe no memory leaks, right? Is it by any chance a growing database cache is what you see? What if you execute at the very beginning of your application this: pragma cache_size = 1; Pavel On Tue, Jul 21, 2009 at 7:01 PM, Zachary Turnerwrote: > Hello, I'm a bit new to sqlite, I wonder if someone can advise me here. > > I'm using the Sqlite C/C++ interfaces, and I'm trying to do some very > basic things. Currently I'm just creating a database with 1 table, > and this table has 1 column of type blob. I then read some data out > of a file and insert it into the database, where each row will contain > some number N of bytes from the file, where N may not necessarily be > the same in each row. > > I'm doing this using the following pseudocode: > > ---Initialization--- > 1) db = sqlite3_open_v2("C:\\foo.db") > 2) sqlite3_exec(db, "CREATE TABLE DummyTable (DummyColumn BLOB)") > 3) insert_query = sqlite3_prepare_v2(db, "INSERT INTO DummyTable > (DummyColumn) VALUES (?1)") > 4) commit_query = sqlite3_prepare_v2(db, "commit") > 5) begin_query = sqlite3_prepare_v2(db, "begin") > > > ---When I want to write a chunk of the file into the database--- > if (!active_transaction_) > { > //begin a new transaction > > sqlite3_step(begin_query) > active_transaction_ = true; > } > > //bind the data to the query and execute the query > sqlite3_bind_blob(insert_query, data, length) > sqlite3_step(insert_query) // * > sqlite3_clear_bindings(insert_query) // * > sqlite3_reset(insert_query) // * > > //128 is a made up number, just for the sake of illustration > if (++count >= 128) > { > //commit the transaction > > sqlite3_step(commit_query) > active_transaction_ = false; > } > > > When I run this code for a while my memory usage grows extremely > quickly, and I don't understand why. If I remove or comment out the > three lines with //* in the code above, I get no memory leaks. If it > makes a difference when I call sqlite3_bind_blob I'm using > SQLITE_TRANSIENT for the final parameter, but my understanding is that > this is supposed to automatically free the memory when it's no longer > needed. Furthermore, the bind itself isn't what's causing the > problem, because if I leave the bind in and only comment out the > insert, I don't get the leak anymore. > > > Am I using the interfaces incorrectly or is perhaps something else > going on that I need to be aware of? > > Thanks > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Memory leak on inserting from sqlite c/c++ interfaces
Hello, I'm a bit new to sqlite, I wonder if someone can advise me here. I'm using the Sqlite C/C++ interfaces, and I'm trying to do some very basic things. Currently I'm just creating a database with 1 table, and this table has 1 column of type blob. I then read some data out of a file and insert it into the database, where each row will contain some number N of bytes from the file, where N may not necessarily be the same in each row. I'm doing this using the following pseudocode: ---Initialization--- 1) db = sqlite3_open_v2("C:\\foo.db") 2) sqlite3_exec(db, "CREATE TABLE DummyTable (DummyColumn BLOB)") 3) insert_query = sqlite3_prepare_v2(db, "INSERT INTO DummyTable (DummyColumn) VALUES (?1)") 4) commit_query = sqlite3_prepare_v2(db, "commit") 5) begin_query = sqlite3_prepare_v2(db, "begin") ---When I want to write a chunk of the file into the database--- if (!active_transaction_) { //begin a new transaction sqlite3_step(begin_query) active_transaction_ = true; } //bind the data to the query and execute the query sqlite3_bind_blob(insert_query, data, length) sqlite3_step(insert_query) // * sqlite3_clear_bindings(insert_query)// * sqlite3_reset(insert_query) // * //128 is a made up number, just for the sake of illustration if (++count >= 128) { //commit the transaction sqlite3_step(commit_query) active_transaction_ = false; } When I run this code for a while my memory usage grows extremely quickly, and I don't understand why. If I remove or comment out the three lines with //* in the code above, I get no memory leaks. If it makes a difference when I call sqlite3_bind_blob I'm using SQLITE_TRANSIENT for the final parameter, but my understanding is that this is supposed to automatically free the memory when it's no longer needed. Furthermore, the bind itself isn't what's causing the problem, because if I leave the bind in and only comment out the insert, I don't get the leak anymore. Am I using the interfaces incorrectly or is perhaps something else going on that I need to be aware of? Thanks ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users