Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-12 Thread Victor Morales-Duarte
A couple of seconds before the failure occurs, the test.db-wal file is
5,502,389KB and test.db-shm is 10,688KB.
The private bytes (probably the best measure of how much memory a
windows application is using) is perhaps a few megs above 130MB.
Making the change to have it commit every 1 records by changing the
source code from this:

  rc = sqlite3_step( statement );
  if ( rc != SQLITE_DONE )
  {
 std::string errorMessage( sqlite3_errmsg( connection ) );
 std::runtime_error ex( errorMessage );
 sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
 sqlite3_finalize( statement );
 sqlite3_close( connection );
 statement = NULL;
 connection = NULL;
 std::cerr << errorMessage;
 throw ex;
  }
  sqlite3_reset( statement );

To this:

  rc = sqlite3_step( statement );
  if ( rc != SQLITE_DONE )
  {
 std::string errorMessage( sqlite3_errmsg( connection ) );
 std::runtime_error ex( errorMessage );
 sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
 sqlite3_finalize( statement );
 sqlite3_close( connection );
 statement = NULL;
 connection = NULL;
 std::cerr << errorMessage;
 throw ex;
  }
  sqlite3_reset( statement );
  if ( recIdx % 25 == 0 )
  {
 sqlite3_exec( connection, "COMMIT TRANSACTION;BEGIN IMMEDIATE
TRANSACTION;", NULL, NULL, NULL );
  }

Is currently running, but it looks like it will take quite a bit of time
for it to finish, so I'll have to update you tomorrow.

Victor

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian
Filter detected spam


On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge  
> insert.
> The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line
> having this source file and the sqlite3 3.7.0 amalgamation files in  
> the
> same folder is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering  
> about
> why the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as  
> possible
> like the actual data that our application stores in sqlite because I  
> had
> not realized that the failure could be reproduced with simply  
> inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its  
> virtual
> bytes, you'll see them hit 2GB and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm"
files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory mapped
into the processes address space) grows to about 24 MB here. The  
test.db-wal file
grows to about 12 GB. This is as expected - the *-shm file
requires 8 bytes of space for each page written into the *-wal
log. So with your 4KB pages, the *-wal file should be roughly
512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed.
In the checkpoint code there is a big heap memory allocation -
Say 2.5 bytes for each page in the WAL. So in this case maybe
6-7MB. It's not ideal to be making allocations this big, but
it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130MB. That's the 85MB of
configured cache space (20,000 * 4KB pages), plus the mapping
of the test.db-shm file plus the big allocation made during
the checkpoint. Plus something else I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not  
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN")
every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian Filter detected spam

2010-08-12 Thread Victor Morales-Duarte
I'll run the test again tonight to give you the size of the -wal and
-shm file at the point of failure.

On WinXP (32 bits) the memory used by the application is not a problem
as it stays also in the low hundreds of MB even though the virtual bytes
(which are explained like this by perfmon: "Virtual Bytes is the current
size, in bytes, of the virtual address space the process is using. Use
of virtual address space does not necessarily imply corresponding use of
either disk or main memory pages. Virtual space is finite, and the
process can limit its ability to load libraries.") hits the 2GB max.

Commiting without closing the connection did not help. I'll write the
code for the corresponding tests and I'll reply again later today.

Later!

Victor


-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Dan Kennedy
Sent: Tuesday, August 10, 2010 6:45 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Process memory space exhausted in 3.7.0 - Bayesian
Filter detected spam


On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge  
> insert.
> The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line
> having this source file and the sqlite3 3.7.0 amalgamation files in  
> the
> same folder is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering  
> about
> why the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as  
> possible
> like the actual data that our application stores in sqlite because I  
> had
> not realized that the failure could be reproduced with simply  
> inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its  
> virtual
> bytes, you'll see them hit 2GB and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm"
files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory mapped
into the processes address space) grows to about 24 MB here. The  
test.db-wal file
grows to about 12 GB. This is as expected - the *-shm file
requires 8 bytes of space for each page written into the *-wal
log. So with your 4KB pages, the *-wal file should be roughly
512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed.
In the checkpoint code there is a big heap memory allocation -
Say 2.5 bytes for each page in the WAL. So in this case maybe
6-7MB. It's not ideal to be making allocations this big, but
it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130MB. That's the 85MB of
configured cache space (20,000 * 4KB pages), plus the mapping
of the test.db-shm file plus the big allocation made during
the checkpoint. Plus something else I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not  
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN")
every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-10 Thread Dan Kennedy

On Aug 10, 2010, at 11:55 AM, Victor Morales-Duarte wrote:

> As it turns out, I can reproduce the failure using a single huge  
> insert.
> The code that I'm including below compiles under bcc32 from
> Embarcadero's C++ Builder 2007 and cl from vs2005. Since it's more
> likely that people have MS compilers available, the compilation line
> having this source file and the sqlite3 3.7.0 amalgamation files in  
> the
> same folder is:
>
> cl -EHsc -Fefail.exe main.cpp sqlite3.c
>
> You can then invoke fail.exe with a single command line argument of
> 8000 like this:
>
> Fail.exe 8000
>
> The source for the executable is listed below. If you're wondering  
> about
> why the numbers being inserted are more complicated than need be, it's
> because I just wanted the table and indices to look as much as  
> possible
> like the actual data that our application stores in sqlite because I  
> had
> not realized that the failure could be reproduced with simply  
> inserting.
> Beware that there is no handling of incorrect command line arguments.
>
> If you monitor this executable run with perfmon and look at its  
> virtual
> bytes, you'll see them hit 2GB and then the next time the insert
> statement is stepped, it fails with an I/O disc error.

When this happens, how large are the "test.db-wal" and "test.db-shm"
files on disk?

Here, using g++/Linux, the test.db-shm file (the one memory mapped
into the processes address space) grows to about 24 MB here. The  
test.db-wal file
grows to about 12 GB. This is as expected - the *-shm file
requires 8 bytes of space for each page written into the *-wal
log. So with your 4KB pages, the *-wal file should be roughly
512 times the size of the *-shm.

When the transaction is committed, the database is checkpointed.
In the checkpoint code there is a big heap memory allocation -
Say 2.5 bytes for each page in the WAL. So in this case maybe
6-7MB. It's not ideal to be making allocations this big, but
it shouldn't cause any trouble for a desktop PC.

Memory usage here peaks at around 130MB. That's the 85MB of
configured cache space (20,000 * 4KB pages), plus the mapping
of the test.db-shm file plus the big allocation made during
the checkpoint. Plus something else I suppose.

> If you break up the insert into chunks
> _and_close_the_connection_between_chunks_ then the error does not  
> occur.

Does this imply that if you add an sqlite3_exec("COMMIT;BEGIN")
every 10,000 inserts the program still does not run to completion?

Dan.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Process memory space exhausted in 3.7.0

2010-08-09 Thread Victor Morales-Duarte
t; int, std::pair< int, int > > userId2ContextIdCount;
   const char* sourceStr = "test_failure";
   const char* types[] = { "type_01_1234567890_1234567890", 
   "type_02_1234567890_1234567890", 
   "type_03_1234567890_1234567890",
   "type_04_1234567890_1234567890",
   "type_05_1234567890_1234567890",
   "type_06_1234567890_1234567890",
   "type_07_1234567890_1234567890",
   "type_08_1234567890_1234567890",
   "type_09_1234567890_1234567890",
   "type_10_1234567890_1234567890" };
   
   std::cout << "Inserting into db." << std::endl;
   sqlite3_exec( connection, "BEGIN IMMEDIATE TRANSACTION;", NULL, NULL,
NULL );
   for ( int recIdx = 0; recIdx <nRecords; ++recIdx )
   {
  int userId = std::rand() % 25000;
  int contextId = 0;
  std::map< int, std::pair< int, int > >::iterator
 ui2ciIter = userId2ContextIdCount.find( userId );
  if ( ui2ciIter != userId2ContextIdCount.end() )
  {
 if ( ui2ciIter->second.second > ( 50 + ( std::rand() % 1000 ) )
)
 {
contextId = ui2ciIter->second.first + 1;
ui2ciIter->second = std::make_pair< int, int >( contextId, 1
);
 }
 else
 {
ui2ciIter->second.second += 1;
 }
  }
  else
  {
 contextId = 1;
 userId2ContextIdCount[ userId ] = std::make_pair< int, int >(
contextId, 1 );
  }
  
  __int64 tstamp = std::time( NULL );
  tstamp *= 1000;
  tstamp += ( std::rand() % 1000 );
  tstamp += ( ( std::rand() % 90 ) - 45 );
  
  int typeIdx = std::rand() % 10;
  int serverId = ( recIdx + std::rand() % 1000 ) % 1000;
  int nSig = std::rand() % 10;
  int nNoi = nSig;
  
  sqlite3_bind_int( statement, 1, userId); 
  sqlite3_bind_int( statement, 2, contextId ); 

  sqlite3_bind_int64( statement, 3, tstamp ); 
  
  sqlite3_bind_text(  statement, 4, sourceStr,0, NULL ); 
  sqlite3_bind_text(  statement, 5, types[ typeIdx ], 0, NULL ) ; 

  sqlite3_bind_int( statement, 6, serverId  ); 
  sqlite3_bind_int( statement, 7, nSig  ); 
  sqlite3_bind_int( statement, 8, nNoi  ); 

  rc = sqlite3_step( statement );
  if ( rc != SQLITE_DONE )
  {
 std::string errorMessage( sqlite3_errmsg( connection ) );
 std::runtime_error ex( errorMessage );
 sqlite3_exec( connection, "ROLLBACK;", NULL, NULL, NULL );
 sqlite3_finalize( statement );
 sqlite3_close( connection );
 statement = NULL;
 connection = NULL;
 std::cerr << errorMessage;
 throw ex;
  }
  sqlite3_reset( statement );
   }
   sqlite3_exec( connection, "COMMIT TRANSACTION;", NULL, NULL, NULL );
   
   sqlite3_finalize( statement );
   sqlite3_close( connection );
   return 0;
}
--end main.cpp--

-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Victor
Morales-Duarte
Sent: Wednesday, August 04, 2010 2:19 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] Process memory space exhausted in 3.7.0

Hello,

 

The windows desktop application that I maintain uses sqlite for some of
its storage. The data volume that we must handle has increased
dramatically over the past 6 months and as it was to be expected update
performance has degraded accordingly. Because of that, I was very quick
to jump onto 3.7 when I read that WAL could be selected with it. The
speed improvements when updating data are indeed very noticeable when
running the application on my laptop's drive (3x faster) although not so
much when running on a fast SSD connected to it via ESATA (only about
20% extra speed);  I guess that the different ratio of improvement was
to be expected given the access characteristics of each. Overall, I have
to say that I believe that WAL was a great addition.

 

Unfortunately, I've encountered what could potentially be considered a
big problem. When I run a very large update the process space for the
application seems to be exhausted. The way that it manifested itself at
first was that there would be a disc I/O error, but because I test the
application while running perfmon.exe on win xp sp3 to monitor the IO
read bytes/sec, IO write bytes/sec, processor time and virtual bytes I
noticed that the virtual bytes were at the 2GB max process space limit
when the disc I/O error occurred.

 

In order to rule out the possibility that I was doing something wrong, I
decided to test a similar update using the sqlite3.ex

[sqlite] Process memory space exhausted in 3.7.0

2010-08-05 Thread Victor Morales-Duarte
Hello,

 

The windows desktop application that I maintain uses sqlite for some of
its storage. The data volume that we must handle has increased
dramatically over the past 6 months and as it was to be expected update
performance has degraded accordingly. Because of that, I was very quick
to jump onto 3.7 when I read that WAL could be selected with it. The
speed improvements when updating data are indeed very noticeable when
running the application on my laptop's drive (3x faster) although not so
much when running on a fast SSD connected to it via ESATA (only about
20% extra speed);  I guess that the different ratio of improvement was
to be expected given the access characteristics of each. Overall, I have
to say that I believe that WAL was a great addition.

 

Unfortunately, I've encountered what could potentially be considered a
big problem. When I run a very large update the process space for the
application seems to be exhausted. The way that it manifested itself at
first was that there would be a disc I/O error, but because I test the
application while running perfmon.exe on win xp sp3 to monitor the IO
read bytes/sec, IO write bytes/sec, processor time and virtual bytes I
noticed that the virtual bytes were at the 2GB max process space limit
when the disc I/O error occurred.

 

In order to rule out the possibility that I was doing something wrong, I
decided to test a similar update using the sqlite3.exe CLI. During the
update, what the application will do is it will iterate over all the
records in a table in a specific order assigning a pair of integers to
two columns (both initially null) of each record, based on domain
specific rules; accordingly, the test with the CLI is the opposite
operation; I take a db file that is about 1.5 GB in size, with over 3.7
million records in the table that needs to be updated and then I proceed
to assign null to one of the columns for all records. After some time of
working, the virtual bytes (as reported by perfmon) hit the max process
space and the disk I/O error is  reported. At that time, the wal file is
over 5.5 GB in size and the shm file is over 10MB in size.

 

My initial guess is that there is a problem memory mapping files.

 

I wish that I could make the db available for testing but the data
contained in it cannot be disclosed due to an NDA and the schema is
proprietary information of my employer. First I need to finish a
workaround for this (it seems that by closing and reopening the db
connection, the situation improves somewhat) and then I will write a
small piece of code that will create a dummy database large enough that
the error can be reproduced in it so that I can post it in a reply to
this email. 

 

Thank you!!!

 

Victor

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users