Re: [sqlite] sqlite-users Digest, Vol 80, Issue 19

2014-08-19 Thread JohnG

On 08/19/2014 11:00 AM, sqlite-users-requ...@sqlite.org wrote:

   10. Re: Long time to drop tables. (Jonathan Moules)
   12. Re: Long time to drop tables. (Simon Slavin)


--


Long time for me too!

This was run on an 11.7 GB database where the dropped table is about 50%
of the total size.  The CPU is an AMD AM3+ X4 at 4.2 GHz.


sqlite> select count(*) from fvall;
count(*)
--
6524250
CPU Time: user 0.248015 sys 13.084817
sqlite>  pragma auto_vacuum;
auto_vacuum
---
0
CPU Time: user 0.00 sys 0.00
sqlite> drop table fvall;

While running...

top - 20:51:16 up 11 days, 10:58,  2 users,  load average: 1.97, 1.10, 0.62
Tasks: 178 total,   2 running, 175 sleeping,   0 stopped,   1 zombie
Cpu0  :  0.7%us,  5.3%sy,  0.0%ni, 90.1%id,  3.8%wa,  0.0%hi,  0.1%si,  0.0%st
Cpu1  :  0.8%us,  2.1%sy,  0.0%ni, 93.8%id,  3.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.8%us,  3.2%sy,  0.0%ni, 90.9%id,  4.9%wa,  0.0%hi,  0.2%si,  0.0%st
Cpu3  :  2.2%us,  8.7%sy,  0.0%ni, 50.8%id, 36.7%wa,  0.0%hi,  1.6%si,  0.0%st

Mem:   4031760k total,  3911436k used,   120324k free,   160704k buffers
Swap:  3664892k total,    57932k used,  3606960k free,  2657068k cached

  PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10400 gelmjw  20   0 41064 6580 1876 D   22  0.2   0:34.91 sqlite3
10430 root    20   0     0    0    0 S    1  0.0   0:00.26 kworker/0:0

Sure looks I/O-bound.  Why so much I/O?

CPU Time: user 13.744859 sys 79.016938

What is the above in terms of real (wall-clock) time?

sqlite>


voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 582 MB in  3.01 seconds = 193.48 MB/sec
voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 584 MB in  3.00 seconds = 194.49 MB/sec
voyager finviz # hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads: 582 MB in  3.01 seconds = 193.53 MB/sec
voyager finviz #

Is my disk slow?

John Gelm






Re: [sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread Nelson, Erik - 2
Joe Fisher wrote on Tuesday, August 19, 2014 5:11 PM
> I use the temp table because every CSV file has a header with the
> column names.

Can you just import the files, header rows and all, into your destination table
and delete the 2000 header rows at the end?



Re: [sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread Petite Abeille

On Aug 19, 2014, at 11:11 PM, joe.fis...@tanguaylab.com wrote:

> Is there something better I can do to improve this process?


PRAGMA journal_mode  = off;

http://www.sqlite.org/pragma.html#pragma_journal_mode


> Perhaps one transaction? Perhaps turn something off? It took about 1.5 hours 
> to run. I use the temp table because every CSV file has a header with the 
> column names. I have to drop the table each time because of the header issue.

Preprocess your files by dropping the first line beforehand, e.g.: 

sed -i '1d'

That will save half of your time already.
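For illustration, here is a minimal sketch of how the whole run could then be
driven from one script fed to the sqlite3 shell with .read.  This is a sketch
only: it assumes .mode csv has already been set, that temp_table was created
once up front with the CSV's column layout (since the header that would
auto-create it is gone), and that losing rollback by turning the journal off
is acceptable; only the first file name comes from the original post.

  PRAGMA journal_mode = OFF;
  BEGIN;
  -- repeat the next three statements for each CSV file
  .import 'TP962-A1-P1_TP962-A1-P2_01.CSV' temp_table
  INSERT INTO external_lpr_assay_raw
    SELECT *, 'TP962-A1-P1_TP962-A1-P2_01.CSV' AS filename FROM temp_table;
  DELETE FROM temp_table;
  COMMIT;

Emptying the staging table with DELETE instead of dropping and re-creating it
avoids repeated schema changes, and the single transaction avoids a commit
per file.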



Re: [sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread Simon Slavin

On 19 Aug 2014, at 10:13pm, John Drescher  wrote:

>> I'm running the following script on more than 1000 2MB CSV files and I'd
>> like to speed it up if possible. I noticed that a 'WAL' is running. Is there
>> something better I can do to improve this process?
> 
> Use an SSD.

Agree.  There's nothing wrong with your script: it's well designed to do what 
you need with the maximum of elegance.  You might get a faster result by 
writing your own C program to read the CSV file and do the minimum of SQLite 
commands, but probably not much.

1.5 hours for 2000 CSV files is 2.7 seconds per file.  About 0.2 seconds of that 
is file handling.  The rest indicates that SQLite is running extremely quickly 
and efficiently for you.

If you want anything faster, use faster hardware.

Simon.


Re: [sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread Stadin, Benjamin
Directly after opening your db:

PRAGMA synchronous = OFF;
PRAGMA journal_mode = WAL;

It should fly then, at the cost of risking db corruption in case of a
crash. 

Ben(jamin Stadin)



On 19.08.14 at 23:11, "joe.fis...@tanguaylab.com" wrote:

>I'm running the following script on more than 1000 2MB CSV files and I'd
>like to speed it up if possible. I noticed that a 'WAL' is running. Is
>there something better I can do to improve this process? Perhaps one
>transaction? Perhaps turn something off? It took about 1.5 hours to run.
>I use the temp table because every CSV file has a header with the
>column names. I have to drop the table each time because of the header
>issue. I'm using the latest version of SQLite on a fast notebook.
>
>.import 'TP962-A1-P1_TP962-A1-P2_01.CSV' temp_table
>delete from temp_table where an = 1; -- using .read del_rec.sql
>insert into external_lpr_assay_raw
>select *, 'TP962-A1-P1_TP962-A1-P2_01.CSV' as filename from temp_table;
>drop table if exists temp_table; -- using .read drop_table.sql
>
>Joe Fisher
>Oregon State University
>
>
>


Re: [sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread John Drescher
> I'm running the following script on more than 1000 2MB CSV files and I'd
> like to speed it up if possible. I noticed that a 'WAL' is running. Is there
> something better I can do to improve this process?

Use an SSD.

John


[sqlite] Importing ~1000 CSV files faster

2014-08-19 Thread joe.fis...@tanguaylab.com
I'm running the following script on more than 1000 2MB CSV files and I'd 
like to speed it up if possible. I noticed that a 'WAL' is running. Is 
there something better I can do to improve this process? Perhaps one 
transaction? Perhaps turn something off? It took about 1.5 hours to run. 
I use the temp table because every CSV file has a header with the 
column names. I have to drop the table each time because of the header 
issue. I'm using the latest version of SQLite on a fast notebook.


.import 'TP962-A1-P1_TP962-A1-P2_01.CSV' temp_table
delete from temp_table where an = 1; -- using .read del_rec.sql
insert into external_lpr_assay_raw
select *, 'TP962-A1-P1_TP962-A1-P2_01.CSV' as filename from temp_table;
drop table if exists temp_table; -- using .read drop_table.sql

Joe Fisher
Oregon State University





Re: [sqlite] SQlite on flash filesystem

2014-08-19 Thread Richard Hipp
On Tue, Aug 19, 2014 at 10:38 AM, Richard Hipp  wrote:

>
>
>
> On Tue, Aug 19, 2014 at 10:34 AM, Levente Kovacs wrote:
>
>> I started to worry about this issue because I am in the middle of
>> application development; yesterday it started to work, and even a few
>> SELECTs cause noticeable disk access. I'm still on a magnetic HDD,
>> but the application will be running on an SSD or Flash drive.
>>
>> Let me start again with a stupid question: if I do a SELECT, does this
>> perform any disk write operation?
>>
>
> No.  SQLite doesn't.  But your filesystem might decide to update the mtime
> on an inode.
>

Correction:  If you do a sufficiently complex SELECT, SQLite might need to
manifest a view or do a large sort that won't fit in RAM.  In those cases
it will write to disk.  You can disable that using PRAGMA
temp_store=MEMORY.  http://www.sqlite.org/pragma.html#pragma_temp_store

Additional information:  http://www.sqlite.org/tempfiles.html
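For example (the table and column names below are invented, not from the
thread), setting the pragma before a query that needs a big sort keeps the
transient space in RAM:

  PRAGMA temp_store = MEMORY;   -- transient sort/materialization space stays in RAM
  SELECT category, COUNT(*) AS n
    FROM big_table
   GROUP BY category
   ORDER BY n DESC;             -- a large sort that might otherwise spill to a temp file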



>
>
>>
>> I know that sqlite can store the database in RAM. Is there any way to
>> (periodically) write the database to a regular sqlite file?
>>
>
> http://www.sqlite.org/backup.html
>
>
>
>>
>> Does SQLite call 'sync()' after an UPDATE or INSERT?
>>
>
> http://www.sqlite.org/atomiccommit.html
>
> --
> D. Richard Hipp
> d...@sqlite.org
>



-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] SQlite on flash filesystem

2014-08-19 Thread Richard Hipp
On Tue, Aug 19, 2014 at 10:34 AM, Levente Kovacs wrote:

> I started to worry about this issue because I am in the middle of
> application development; yesterday it started to work, and even a few
> SELECTs cause noticeable disk access. I'm still on a magnetic HDD,
> but the application will be running on an SSD or Flash drive.
>
> Let me start again with a stupid question: if I do a SELECT, does this
> perform any disk write operation?
>

No.  SQLite doesn't.  But your filesystem might decide to update the mtime
on an inode.


>
> I know that sqlite can store the database in RAM. Is there any way to
> (periodically) write the database to a regular sqlite file?
>

http://www.sqlite.org/backup.html
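For example, from the sqlite3 shell an in-memory database can be copied out to
a file at any point.  A minimal sketch, assuming a shell recent enough to have
.open; 'snapshot.db' is just an illustrative name:

  .open :memory:
  -- ... create tables and load data here ...
  -- copy the whole in-memory database out to an ordinary database file
  .backup main snapshot.db

From application code the same thing is done with the online backup API
described on that page.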



>
> Does SQLite call 'sync()' after an UPDATE or INSERT?
>

http://www.sqlite.org/atomiccommit.html

-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] SQlite on flash filesystem

2014-08-19 Thread Levente Kovacs
I started to worry about this issue because I am in the middle of
application development; yesterday it started to work, and even a few
SELECTs cause noticeable disk access. I'm still on a magnetic HDD,
but the application will be running on an SSD or Flash drive.

Let me start again with a stupid question: if I do a SELECT, does this
perform any disk write operation?

I know that sqlite can store the database in RAM. Is there any way to
(periodically) write the database to a regular sqlite file?

Does SQLite call 'sync()' after an UPDATE or INSERT?

Thanks,
Levente

On Sat, 16 Aug 2014 22:22:56 +0100
Simon Slavin  wrote:

> A good question.  By which I mean it's one people should worry about,
> not one which we're not sure about.
> 
> No.  Physical writes to Flash drives are optimized for best use of
> hardware (minimum writes, promoting maximum lifetime for the device
> and fastest execution of programs).  Worse still, there is no
> guarantee that any writes are done to the device at all until you
> have issued 'unmount' to it (shown to the user as 'Eject' or 'Make
> safe for removal' on popular operating systems). Until then,
> as far as you're concerned, all the changes may be made only to a
> cache held by the device driver.
> 
> So if you pull your Flash drive without telling the OS to unmount it
> you can get all sorts of problems, including no files at all if they
> were created during that session.  It's rare to see this actually
> happen (few systems create a cache that reflects the whole drive then
> never bother to flush it) but the specs of the hardware involved do
> allow it.

-- 
Levente Kovacs
CTO, CSO
http://levente.logonex.eu





Re: [sqlite] Long time to drop tables.

2014-08-19 Thread Simon Slavin

On 19 Aug 2014, at 11:41am, Jonathan Moules wrote:

> There are some foreign keys.

Put all your DROP TABLE commands into one big transaction.

Make sure when you DROP your TABLES that you are dropping them in an order 
which won't trigger any foreign key checks.  Normally this just means you have 
to drop the offspring tables first.

Alternatively do

PRAGMA foreign_keys = OFF;
BEGIN TRANSACTION;
DROP TABLE a;
DROP TABLE b;
...
END TRANSACTION;
PRAGMA foreign_keys = ON;

Simon.


Re: [sqlite] Improve query performance using a join

2014-08-19 Thread Simon Slavin

On 19 Aug 2014, at 8:25am, Paul Dillon  wrote:

> I was
> using count(first_field) instead of selecting all the fields, can't imagine
> that could be the problem.

There's an optimization in SQLite which means you can do COUNT(*) and it will 
fetch no data at all.  It's faster than doing any COUNT(specific_field).
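For example (the table name is invented; first_field is the column named
earlier in the thread):

  SELECT COUNT(first_field) FROM big_table;  -- must read first_field in every row (NULLs are not counted)
  SELECT COUNT(*) FROM big_table;            -- can use the count optimization, no column data fetched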

Simon.


Re: [sqlite] Long time to drop tables.

2014-08-19 Thread Jonathan Moules
Hi,
Thanks for the replies.

The entire thing is happening on a single desktop machine. The database is
on a standard HDD. I'm using SQLiteStudio; SQLite version 3.7.16.1.

There are some foreign keys.

Autovacuum is off, I think (I don't know how to check, but the database file
never shrinks of its own accord).
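For reference, it can be checked with a one-liner; it returns 0 for NONE (off),
1 for FULL and 2 for INCREMENTAL:

  PRAGMA auto_vacuum;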

You can download the database here (compressed down to ~30 MB).
https://drive.google.com/file/d/0B22cYd3gwE-6MTdwUzczUlptdk0/edit?usp=sharing

I was actually trying to delete all of these tables (this version of the
database doesn't have the extra tables I created; the extra ones had no
foreign keys at all).

Cheers,
Jonathan



On 16 August 2014 20:04, Richard Hipp  wrote:

> On Sat, Aug 16, 2014 at 2:41 PM, Jonathan Moules <
> jonathanmou...@warwickshire.gov.uk> wrote:
>
> > Hi List,
> > More of a curiosity.
> > I'm doing some general data munging and set off a query that consists
> > entirely of 37 DROP TABLEs. The database it's running against is
> > a bit less than 1GB made of about 5 million rows, and the tables being
> > dropped constitute about 99% of the content.
> >
> > My question is: why does it take so long? The total time
> required
> > to create this dataset (most of which was processing on the Python
> > side) was about 11 minutes.
> >
> > The total time required to perform these drops is ... well I
> > cancelled it at 20mins - it had deleted 20 of the 37. For that entire
> > period SQLite has been reading at a rate of 170MB/s - by my maths it
> > had read about 200GB!
> >
> > The tables don't have indexes, the settings are all whatever the
> > defaults are.
> >
> > Any suggestions what's going on? Is this normal behavior?
> >
>
>
> I made a copy of a 2.3GB database that contained two tables that comprised
> 90% of the database space.  I did a DROP TABLE on each.  Each DROP TABLE
> took about 100 milliseconds, real-time.
>
> DROP TABLE bmdTileTable;
> Run Time: real 0.109 user 0.052133 sys 0.043098
> DROP TABLE bmdTileTable_with_out;
> Run Time: real 0.102 user 0.052688 sys 0.029648
>
> Maybe you have autovacuum turned on in your database?  Autovacuum makes
> doing things like DROP TABLE much slower because it has to rearrange
> content in order to move it all to the front of the file then truncate the
> file.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
>



Re: [sqlite] Problem with Update Statement

2014-08-19 Thread Clemens Ladisch
Wycliff Mabwai wrote:
> syntax near "Where".
>
> SQLITEcmd2.CommandText = "UPDATE RecordGrid SET
> LineNumber='" & reade20.GetInt32(11) & "',self_empty_info_gender_PRect= '"
> & IIf(reade20.IsDBNull(22), String.Empty, reade20.GetString(22)) & "," &
> IIf(reade20.IsDBNull(23), String.Empty, reade20.GetString(23)) & "," &
> IIf(reade20.IsDBNull(24), String.Empty, reade20.GetString(24)) & "," &
> IIf(reade20.IsDBNull(25), String.Empty, reade20.GetString(25)) & "' WHERE
> RecordGridID='"chombo"'"

This is unreadable.  Show the value of CommandText.


Regards,
Clemens


Re: [sqlite] Improve query performance using a join

2014-08-19 Thread Paul Dillon
Jim Callahan jim.callahan.orlando at gmail.com wrote:

> 1. My condolences; with those dimensions you are heading for "big
> data"/hadoop land.

Heheh thanks, I am so determined not to go there!  SQLite has been
such a nice simple database to use, I will do anything to avoid all
the complexity of those map-reduce solutions.

> 2. Worry about the number of rows; that's what feeds into the big-oh: O(n).
> Assuming your 150 columns translate into a 1.5k to 2k record length that
> means your 300 gigabyte file must have 150 (300 gig/2k) to 200 (300
> gig/1.5k) million records. That's a lot of n for O(n).

I've got about 200 million rows of 1,100 bytes each.  At least it's
linear.  I would love to partition the rows into 4 databases and query
them in parallel.  But that's just a fantasy, I can live with it for
now.

> So, I would recommend double checking the specs of your SSD and not
> necessarily making the reasonable, but not yet true, assumption that
> solid-state has to be faster than mechanical.

I use an SSD for other reasons.  However, I've tested sustained
sequential read and I am getting 550MB/s.  I'm doing full table scans
so it's sequential.  This is on a laptop, and I am fairly sure there
are no mechanical 2.5" HDDs that can transfer 200GB at 550MB/s.

> One strategy that might work is to have an entirely separate (not joined) 8
> column table; develop queries (targets) on that database and then write out
> the primary key of the rows you are interested in to a separate table
> (CREATE TABLE AS SELECT primarykey FROM 8columntable WHERE yourquery;) and
> then JOIN the row reduced table to the main table. If your rowreduced table
> has millions of fewer records (primary keys) that's millions of rows (in
> the main 150 column table) where the precompiled SQL query doesn't have to
> be executed.
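A minimal sketch of that two-step approach (all table and column names are
invented, not from the thread):

  -- step 1: evaluate the filter against the narrow 8-column table, keeping only the keys
  CREATE TABLE matching_keys AS
    SELECT id FROM narrow_table WHERE status = 'active';

  -- step 2: join the reduced key set back to the wide 150-column table
  SELECT w.*
    FROM wide_table AS w
    JOIN matching_keys AS k ON k.id = w.id;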

Cheers for that, I'll do some testing and see how I go!

Thanks,

Paul


Re: [sqlite] Improve query performance using a join

2014-08-19 Thread Paul Dillon
Thanks to all that replied to my post and sorry for the delayed response.
 I had trouble joining the list and had not realised that my post went
through until I tried to join the list again.

Simon Slavin slavins at bigfraud.org wrote:

> What you might find increases your speed is to make sure that those 8 columns
> are the first 8 columns listed in your table definition (after the primary
> key, if any).  Each time you have SQLite read the columns for a row it has to
> read the whole row up to the last column it needs.  So if all the unwanted
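A hedged illustration of the column-ordering idea being quoted (not the
poster's actual schema; all names are invented):

  -- columns used by most queries go right after the primary key, so reading
  -- them never requires decoding the wide, rarely-used columns at the end
  CREATE TABLE readings (
    id         INTEGER PRIMARY KEY,
    hot_a      INTEGER,
    hot_b      REAL,
    cold_notes TEXT,
    cold_blob  BLOB
  );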

Thanks, intriguing idea.  Unfortunately it made no difference in my
situation.  I made sure the table was too large to be cached, and I was
using count(first_field) instead of selecting all the fields, so I can't
imagine that could be the problem.

> Lastly, do not fall into the trap of premature optimization.  You should not
> be looking for your program to run "as fast as possible".  You should be
> looking for it to run "acceptably fast".  You can spend 20 hours of
> programming to improve runtime by 1 minute -- a minute that your users
> wouldn't care about because they always go make a cup of coffee during a
> run anyway.

Yeah, I am wary of that.  The queries take about 20 minutes, which really
disrupts my workflow.  If I spend a few hours on this and it works,
it'll save me hundreds of hours in the long run.  I like to collect
anecdotal performance advice, and implement it when the cost of doing
so is not much more than not doing it.  I have some evidence that the
approach has worked for me in the past.

Thanks,

Paul