To answer your question: 
 - Disk space is not a problem - there is over 500gb free space on that disk
(with database size 80gb). 
 - Memory - 16gb with at least 8 free at all times during testing
 - Platform -  I moved the platform to Windows Server 2012

I did more testing. I have a large csv file (100gb/500m lines)  Originally I
have imported full file into database  (using sqlite3 .import)  - and then
ran the select query which resulted in out-of-memory error;

Over the last 24 hours it tried the following
 - split the csv file into smaller pieces (10m lines each)
 - import  smaller pieces one-by-one  (either in one sqlite3 session or 1
session per piece - same result)
-  after each import run the query to check whether we have out-of-memory
condition

Out of memory condition:
1. does not affect import - I can still load the rows and count shows that
they are loaded
2. once it occurs all subsequent select...group by queries show out of
memory - adding new rows does not change that
3. is _random_ I got it after three piece import  yesterday, after two piece
import  today (on the same machine, creating new db each time).
 
So this is _not_ deterministic but over larger timescale (like few hours -
not minutes or a couple of hours)
 - It does not depend on the database size - 20m lines database is only 3gb.

 - In in the full database I have tables with 20 and 30gb sizes  - different
schemas - many more columns but many less rows
  - It is not caused by malformatted data - the data is from a bcp out (SQL
Server) (I will test in in the next 24h)

Additional observations:
        When looking at task manager and sqlite3 process
        - during import the memory growth up to 800 mb (the cache size)
        - during select count(*) it goes up to 1.5gb
        - when starting select ... group by the memory it goes up to 1.6 gb
for successful imports, up to 2gb for out of memory condition

My current suspicion: this is not just the size, but (mostly?) the query. 





-----Original Message-----
From: Michael Black [mailto:mdblac...@yahoo.com] 
Sent: Friday, January 25, 2013 5:41 AM
To: ammon_lympha...@hotmail.com; 'General Discussion of SQLite Database'
Subject: RE: [sqlite] bug report: out of memory error for most operations on
a large database

How much free disk space do you have?  Your temp tables might be exceeding
capacity.

-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of ammon_lymphater
Sent: Thursday, January 24, 2013 1:41 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] bug report: out of memory error for most operations on a
large database

Summary: except for  select count(*) , all operations on a large table (500m
rows/80gb) result in out-of-memory error

 

Details:

a.       Using SQLite3 ver 3.7.15.2, Windows 7  64 bit AMD(but the error
also in 3.6.xx and on other Windows platforms)

b.      Created a table (schema attached), imported data from a csv file
using sqlite3 (no problems)

c.  "select Name5, count(*) from LargeData group by name5 order by name5
results" in Error: out of memory (the cardinality of Name5 is 12)

d.  "sqlite> select count(*) from StorageHourlyBySIDCL2;" gives 587608348,
as it should

e.      The above independent of cache_size (from 0.3 to 1.5gb); happens for
page_size 32kb and 64kb (others not tested)

 

(personally not urgent for me - just testing the limits -  but may be useful
for others)

 

====   table schema & sqlite3_analyzer output

CREATE TABLE largedata (

      name1 smalldatetime

     , Name2 uniqueidentifier

     , Name3 varchar (16)

     , Name4 varchar (8)

     , Name5 varchar (80)

     , Name6 real

     , Name7 real

     , Name8 real

     , Name9 real

     , Name10 real

     , Name11 real

     , Name12 real

     , Name13 real

     , Name14 smallint

     , Name15 tinyint

     , Name16 tinyint

     , Name17 smalldatetime

     , Name18 real

     , Name19 tinyint

);

 

-- SQLITE3_ANALYZER output

/** Disk-Space Utilization Report For h:\temp\convert\import2.db

 

Page size in bytes.................... 32768

Pages in the whole file (measured).... 2578119

Pages in the whole file (calculated).. 2578118

Pages that store data................. 2578118    100.000%

Pages on the freelist (per header).... 0            0.0%

Pages on the freelist (calculated).... 1            0.0%

Pages of auto-vacuum overhead......... 0            0.0%

Number of tables in the database...... 11

Number of indices..................... 0

Number of named indices............... 0

Automatically generated indices....... 0

Size of the file in bytes............. 84479803392

Bytes of user payload stored.......... 79293861071  93.9%

 

*** Page counts for all tables with their indices ********************

 

LargeData................. 2578108    100.000%

smalldata............................ 1            0.0%

(zero-length tables omitted)

 

*** All tables *******************************************************

 

Percentage of total database.......... 100.000%

Number of entries..................... 587608358

Bytes of storage consumed............. 84479770624

Bytes of payload...................... 79293871126  93.9%

Average payload per entry............. 134.94

Average unused bytes per entry........ 0.32

Average fanout........................ 2716.00

Fragmentation.........................   0.11%

Maximum payload per entry............. 1933

Entries that use overflow............. 0            0.0%

Index pages used...................... 949

Primary pages used.................... 2577169

Overflow pages used................... 0

Total pages used...................... 2578118

Unused bytes on index pages........... 3904523     12.6%

Unused bytes on primary pages......... 184122910    0.22%

Unused bytes on overflow pages........ 0

Unused bytes on all pages............. 188027433    0.22%

 

--$ removing unused tables

*** Table STORAGEHOURLYBYSIDCL2 **************************************

 

Percentage of total database.......... 100.000%

Number of entries..................... 587608348

Bytes of storage consumed............. 84479442944

Bytes of payload...................... 79293861071  93.9%

Average payload per entry............. 134.94

Average unused bytes per entry........ 0.32

Average fanout........................ 2716.00

Fragmentation.........................   0.11%

Maximum payload per entry............. 183

Entries that use overflow............. 0            0.0%

Index pages used...................... 949

Primary pages used.................... 2577159

Overflow pages used................... 0

Total pages used...................... 2578108

Unused bytes on index pages........... 3904523     12.6%

Unused bytes on primary pages......... 183805515    0.22%

Unused bytes on overflow pages........ 0

Unused bytes on all pages............. 187710038    0.22%

 

 

*** Table SQLITE_MASTER **********************************************

 

Percentage of total database..........   0.0%

Number of entries..................... 10

Bytes of storage consumed............. 32768

Bytes of payload...................... 10055       30.7%

Average payload per entry............. 1005.50

Average unused bytes per entry........ 2255.50

Maximum payload per entry............. 1933

Entries that use overflow............. 0            0.0%

Primary pages used.................... 1

Overflow pages used................... 0

Total pages used...................... 1

Unused bytes on primary pages......... 22555       68.8%

Unused bytes on overflow pages........ 0

Unused bytes on all pages............. 22555       68.8%

 

*** Definitions ******************************************************

--$ removed definitions

**********************************************************************

The entire text of this report can be sourced into any SQL database

engine for further analysis.  All of the text above is an SQL comment.

The data used to generate this report follows:

*/

BEGIN;

CREATE TABLE space_used(

   name clob,        -- Name of a table or index in the database file

   tblname clob,     -- Name of associated table

   is_index boolean, -- TRUE if it is an index, false for a table

   nentry int,       -- Number of entries in the BTree

   leaf_entries int, -- Number of leaf entries

   payload int,      -- Total amount of data stored in this table or index

   ovfl_payload int, -- Total amount of data stored on overflow pages

   ovfl_cnt int,     -- Number of entries that use overflow

   mx_payload int,   -- Maximum payload size

   int_pages int,    -- Number of interior pages used

   leaf_pages int,   -- Number of leaf pages used

   ovfl_pages int,   -- Number of overflow pages used

   int_unused int,   -- Number of unused bytes on interior pages

   leaf_unused int,  -- Number of unused bytes on primary pages

   ovfl_unused int,  -- Number of unused bytes on overflow pages

   gap_cnt int,      -- Number of gaps in the page layout

   compressed_size int  -- Total bytes stored on disk

);

INSERT INTO space_used
VALUES('sqlite_master','sqlite_master',0,10,10,10055,0,0,1933,0,1,0,0,22555,
0,0,32768);

--$ removed unused tables

INSERT INTO space_used
VALUES('LargeData','LargeData',0,590185506,587608348,79293861071,0,0,183,949
,2577159,0,3904523,183805515,0,2844,84479442944);

COMMIT;

 

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to