If you're doing one gigantic insert to populate the table, then it's more 
efficient to delay creating an index until the end. If you've already got a 
bunch of data in there and are inserting more, you get diminishing returns 
from that approach. In this case, though, you're not creating a separate 
useful index; you're creating a duplicate that adds no value even once it's 
there and complete, so it does nothing but slow things down. When you declare 
a primary key, SQLite implements it by creating a unique index on those 
fields to keep track of it (named sqlite_autoindex_map_1 below). When you 
explicitly declare another index on the same columns, SQLite shrugs and does 
what you ask, making a second index of the same things.

Depending on your confidence level, you could drop the "primary key" line 
and add the explicit unique index only at the end of the initial population. 
I'd advise keeping any primary key in place from the start, though, and 
delaying only the creation of non-unique/helper indexes. That's just my own 
personal preference.
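
A rough sketch of that pattern (the map table as declared below, but without 
the redundant map_index; the map_tile_id helper index is a made-up example, 
not something from your schema):

-- create the table up front with its PRIMARY KEY and no extra indexes
BEGIN;
-- ...bulk INSERTs here, via prepared statements...
COMMIT;
-- only now build any non-unique/helper indexes, e.g.:
CREATE INDEX map_tile_id ON map(tile_id);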


CREATE TABLE map(
  zoom_level INTEGER,
  tile_column INTEGER,
  tile_row INTEGER,
  tile_id VARCHAR(256) NOT NULL,
  PRIMARY KEY(zoom_level, tile_column, tile_row)
);
CREATE UNIQUE INDEX map_index ON map(
  zoom_level ASC,
  tile_column ASC,
  tile_row ASC
);

sqlite> select * from sqlite_master where type = 'index' and tbl_name = 'map';
type|name|tbl_name|rootpage|sql
index|sqlite_autoindex_map_1|map|5|
index|map_index|map|7|CREATE UNIQUE INDEX map_index ON map (zoom_level ASC, 
tile_column ASC, tile_row ASC)

sqlite> pragma index_list(map);
seq|name|unique|origin|partial
0|map_index|1|c|0
1|sqlite_autoindex_map_1|1|pk|0

sqlite> pragma index_info(sqlite_autoindex_map_1);
seqno|cid|name
0|0|zoom_level
1|1|tile_column
2|2|tile_row

sqlite> pragma index_info(map_index);
seqno|cid|name
0|0|zoom_level
1|1|tile_column
2|2|tile_row
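
If you do decide to get rid of the duplicate, it's a one-liner; the implicit 
primary-key index stays behind and keeps enforcing uniqueness:

DROP INDEX map_index;

After that, pragma index_list(map); should list only sqlite_autoindex_map_1.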


-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Fiona
Sent: Monday, October 16, 2017 10:36 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Sqlite3.6 Command-line delete/update not working with 
large db file(>280GB)

>>The setting for synchronous is basically a matter of what level of safety
net you want if it dies in the middle of something. Setting it to off
shouldn't cause any corruption if things go well; it should only come into
play if you saw errors or didn't close things down correctly, etc. 

You're right; my Python code was sometimes ended manually because it was
taking too much time to finish the INSERT/UPDATE operation, or because the
image data I got was wrong.

>>The unique index you declared is redundant, by the way; declaring those
three fields as the primary key already creates a unique index to keep track
of that. Did you intend to create that on the retry table? 

The redundant map_index was an attempt to improve insert speed, as I'd
learned that dropping an index before a big insert is a better way to go. But
now that my data has grown so huge, dropping and rebuilding the index also
takes quite a long time, so I no longer drop and recreate it; I just leave it
there. Does it still affect my operations, and should I just drop it? 

And if I want to speed up the insert operation even further, what measures
should I consider? I've already set synchronous and journal_mode to off and
use transactions and prepared statements, but when inserting data there's
still a 3x speed difference between my code and the *attach-insert* method. 



--
Sent from: http://sqlite.1065341.n5.nabble.com/
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
