Re: [sqlite] Large Database Windows vs Linux

2013-06-29 Thread James K. Lowden
On Fri, 28 Jun 2013 08:54:07 -0400
"Christopher W. Steenwyk"  wrote:

> I have a rather large database (11 GB) that has two tables (one with
> approximately 500,000 rows and another with approximately 50,000,000
> rows). In this database I am performing a query that joins these two
> tables to produce approximately 4.4 billion rows and then uses a
> GROUP BY and COUNT to return some values to me.
> 
> I struggled with the slowness of this query for a while in linux and
> through covering indexes, ANALYZE, and a SSD I was able to get the
> query to run in about 15 minutes which I am OK with. 

Could I ask how much RAM is in the machine?  I would like to believe
that ~16 GB of RAM would make the I/O speed irrelevant because the OS
would cache the database in RAM.  I also suspect that Windows is less
likely than Windows to devote RAM to I/O cache.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-29 Thread Bradley Giesbrecht
On Jun 28, 2013, at 5:54 AM, Christopher W. Steenwyk wrote:

> I have a rather large database (11 GB) that has two tables (one with
> approximately 500,000 rows and another with approximately 50,000,000 rows).
> In this database I am performing a query that joins these two tables to
> produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
> to return some values to me.

If your data is somewhat static use aggregate tables which can updated on data 
change.


Regards,
Bradley Giesbrecht (pixilla)

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-29 Thread Olaf Schmidt

Am 28.06.2013 14:54, schrieb Christopher W. Steenwyk:


I have a rather large database (11 GB) that has two tables (one with
approximately 500,000 rows and another with approximately 50,000,000 rows).
  In this database I am performing a query that joins these two tables to
produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
to return some values to me.

I struggled with the slowness of this query for a while in linux and
through covering indexes, ANALYZE, and a SSD I was able to get the query to
run in about 15 minutes which I am OK with.


... speaking from a pure technical (or better, hardware-)
perspective - the problem should not take that long.

It is IO-Bound apparently (especially on Windows) - but even on
Linux your 15 minutes (which is roughly 1000 seconds for ease of
calculation) seem way too much.
The sustained Disk-Read-Throughput of a modern SSD is between 300
and 550MB per second. Let's take 300MB/sec - with that rate, your
11GB should be "spooled-through" (passing your own "Software-Filter"
along the way) in roughly 1GB any 3 seconds - or say: about 30-40
seconds for the whole 11GB.

Most (or at least "many") of the problems, which cause such huge
amounts of data, are pretty "narrow" - and if that ist the case,
then one doesn't always need to tackle them with a "generic, all-
purpose SQL-engine".

If you end up, having only 2 or 3 "typical Query- or Filter-criteria"
to perform against those 11GB - then why not "hand-code" your filters,
and store your data in dedicated binary files?

E.g. your smaller set of 500,000 Rows looks like (at least for the
Join-Operation) it could be stored separately from the other data -
and before you perform the Grouping-Join, could be loaded into memory
completely.
Done so, you could perform your grouping-filter then, shoveling only
your larger set of data (in binary-format, directly from Disk with
about 300MB/sec) into memory - and then compare it there against
your already cached, smaller "Join-and-Group-criteria-set".


Olaf


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-29 Thread Jeremy Nicoll - ml sqlite users
Roger Binns  wrote:

> Here is the list of monitored extensions:
>
>
http://msdn.microsoft.com/en-us/library/windows/desktop/aa378870(v=vs.85).aspx

That's a frighteningly long list - 574 separate extensions...  And they're
not all obscure ones either.  For example it includes

   .DATA
   .SRC- quite likely to be used for 'a source file' of some sort
   .TAG- surely someone will be tagging their own data?

and I could easily see my choosing many of the others for arbitrary data in
certain projects.

-- 
Jeremy C B Nicoll - my opinions are my own.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Simon Slavin

On 29 Jun 2013, at 1:22am, Howard Chu  wrote:

> David de Regt wrote:
>> It's the kind of useful help like this that makes me love the FOSS movement.
> 
> All based in facts, of course. http://blog.zorinaq.com/?e=74

Nevertheless, the remark was not helpful and is therefore best forgotten.  If 
an aspect of Windows is the answer to someone’s problem, then it should be 
discussed.  Unhelpful beating of horses belongs elsewhere.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Howard Chu

David de Regt wrote:

It's the kind of useful help like this that makes me love the FOSS movement.


All based in facts, of course. http://blog.zorinaq.com/?e=74


-David

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Walter Hurry
Sent: Friday, June 28, 2013 5:09 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Large Database Windows vs Linux

On Fri, 28 Jun 2013 15:22:57 -0600, Keith Medcalf wrote:


That would explain why the best thing to be done with System Destroyer
(System Restore) is the same as the best way to handle the Hardware
Destroyer (Power Management) in Windows.  Disable it completely.


The best thing to do with Windows is format the drive and install Unix or 
FreeBSD or Linux.



--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread David de Regt
It's the kind of useful help like this that makes me love the FOSS movement.

-David

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Walter Hurry
Sent: Friday, June 28, 2013 5:09 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Large Database Windows vs Linux

On Fri, 28 Jun 2013 15:22:57 -0600, Keith Medcalf wrote:

> That would explain why the best thing to be done with System Destroyer 
> (System Restore) is the same as the best way to handle the Hardware 
> Destroyer (Power Management) in Windows.  Disable it completely.
> 
The best thing to do with Windows is format the drive and install Unix or 
FreeBSD or Linux.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Walter Hurry
On Fri, 28 Jun 2013 15:22:57 -0600, Keith Medcalf wrote:

> That would explain why the best thing to be done with System Destroyer
> (System Restore) is the same as the best way to handle the Hardware
> Destroyer (Power Management) in Windows.  Disable it completely.
> 
The best thing to do with Windows is format the drive and install Unix or 
FreeBSD or Linux.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread David de Regt
FWIW, with our test and prod implementations, we find between a 3 and 10x 
(300-1000%) increase in almost all of our query times on Windows NTFS over OSX 
and iOS systems, depending on the query type.  We've done a bunch of testing 
and can verify it every time.

I started a thread on this ~7 months ago, and everyone else was able to verify 
it too with a simple implementation.  No explanations were provided outside of 
"Windows sucks", "Lol n00b", and "try linux". :)

And no, system restore isn't backing up our database files...

-David

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Keith Medcalf
Sent: Friday, June 28, 2013 2:23 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Large Database Windows vs Linux


That would explain why the best thing to be done with System Destroyer (System 
Restore) is the same as the best way to handle the Hardware Destroyer (Power 
Management) in Windows.  Disable it completely.


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- 
> boun...@sqlite.org] On Behalf Of Roger Binns
> Sent: Friday, 28 June, 2013 15:07
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Large Database Windows vs Linux
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 28/06/13 13:17, RSmith wrote:
> > Best guess is some other system is trying to also look into that
> file,
> > making the Windows file manager stutter quite possibly the Win7 
> > Preview pane, a 3rd party file indexer service, an anti-virus
> system or
> > some other
> 
> Those are often called tag alongs since they tag along file i/o 
> activity.
> 
> Also beware that if your database (or any other file) as a particular 
> extension then Windows' System Restore will keep making backup copies 
> whenever it changes.  Here is the list of monitored extensions:
> 
> 
> http://msdn.microsoft.com/en-
> us/library/windows/desktop/aa378870(v=vs.85).aspx
> 
> Roger
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.12 (GNU/Linux)
> 
> iEYEARECAAYFAlHN+ugACgkQmOOfHg372QTTqgCeN3hNpHGON/CaoEx95y9605Qz
> GMIAnAtmuJzgo3wwLrZdGOIbA2yWwuRP
> =5SeP
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Keith Medcalf

That would explain why the best thing to be done with System Destroyer (System 
Restore) is the same as the best way to handle the Hardware Destroyer (Power 
Management) in Windows.  Disable it completely.


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Roger Binns
> Sent: Friday, 28 June, 2013 15:07
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Large Database Windows vs Linux
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 28/06/13 13:17, RSmith wrote:
> > Best guess is some other system is trying to also look into that
> file,
> > making the Windows file manager stutter quite possibly the Win7
> > Preview pane, a 3rd party file indexer service, an anti-virus
> system or
> > some other
> 
> Those are often called tag alongs since they tag along file i/o
> activity.
> 
> Also beware that if your database (or any other file) as a
> particular
> extension then Windows' System Restore will keep making backup
> copies
> whenever it changes.  Here is the list of monitored extensions:
> 
> 
> http://msdn.microsoft.com/en-
> us/library/windows/desktop/aa378870(v=vs.85).aspx
> 
> Roger
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.12 (GNU/Linux)
> 
> iEYEARECAAYFAlHN+ugACgkQmOOfHg372QTTqgCeN3hNpHGON/CaoEx95y9605Qz
> GMIAnAtmuJzgo3wwLrZdGOIbA2yWwuRP
> =5SeP
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 28/06/13 13:17, RSmith wrote:
> Best guess is some other system is trying to also look into that file, 
> making the Windows file manager stutter quite possibly the Win7
> Preview pane, a 3rd party file indexer service, an anti-virus system or
> some other

Those are often called tag alongs since they tag along file i/o activity.

Also beware that if your database (or any other file) as a particular
extension then Windows' System Restore will keep making backup copies
whenever it changes.  Here is the list of monitored extensions:


http://msdn.microsoft.com/en-us/library/windows/desktop/aa378870(v=vs.85).aspx

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlHN+ugACgkQmOOfHg372QTTqgCeN3hNpHGON/CaoEx95y9605Qz
GMIAnAtmuJzgo3wwLrZdGOIbA2yWwuRP
=5SeP
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread RSmith
I think your assumption about the file system is correct - It is hard for the code to produce widely differing times under different 
systems as the basic algorithms do not change between systems, only dependencies on file-system or VFS specific api's etc.


The NTFS file system in Windows (hoping this is what you are on) is actually impressively fast in most modern implementations 
(though not as fast as some more process-specific linux flavours), but the point being it should not be taking more than 10 to 15% 
over the same query run for the same SQLite release in any other system, unless something is wrong. 1 hour+ would set off red lights 
in your case, nvm 16hrs+.


Best guess is some other system is trying to also look into that file, making the Windows file manager stutter quite possibly the 
Win7 Preview pane, a 3rd party file indexer service, an anti-virus system or some other real-time PC protection/enhancing kind of 
software or even memory juggling at low-diskspace situations. Usually any of the above you should have installed explicitly so you 
should know which of them are on your system.


If none of the above options are found to be the culprit, I would suggest 
looking at some testing of the core system.

a simple test operation would be:

create file #1 for input/output
create file #2 for output
L = 20 * 1024 * 1024 * 1024  // 20 Gig chosen arbitrarily
wInterval = L mod 4; / 4 additional read-write tests along the way
for n = 0 to L-1 do -
   write in file#1 byte(randomvalue)
   if ((n mod winterval)=0) read new data From #1 into #2
end for;
close #1 & #2
delete

The total time for the above on any moderately modern Windows PC using internal drives should bench under 10 minutes, certainly not 
more than 30 mins.
Take care to not use memory buffers for any of this, use file api calls directly and only - and ensure you have 40 GB available on 
your HDD.

You could follow the File-System buffering efforts by looking at the 
Task-manager Performance graphs while it runs.

if it takes too long, break it and remove the data transfer statement - if it doesn't help much (<40% improvement), you have a 3rd 
party problem. If it improves performance by around ~50%, it works normally, if it improves performance to over 80%, you have a file 
system or kernel which gets bogged down between read/writes and needs checking. (this last situation usually spells impending disk 
failure or such if the FS is healthy, time to backup).


If you don't feel adventurous and coding the above - You could also just download and use such a utility I've made for windows 
already from:

www.rifin.co.za/software/utils/FBench.exe
(Disclaimer - just one of my personal non-professional tools - Accuracy < Utility, but still pretty close on higher filesizes, use 
at own risk, etc. etc.).



On 2013/06/28 14:54, Christopher W. Steenwyk wrote:

Hi,

I have been struggling with a problem and was hoping I could get some
insight.

I have a rather large database (11 GB) that has two tables (one with
approximately 500,000 rows and another with approximately 50,000,000 rows).
  In this database I am performing a query that joins these two tables to
produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
to return some values to me.

I struggled with the slowness of this query for a while in linux and
through covering indexes, ANALYZE, and a SSD I was able to get the query to
run in about 15 minutes which I am OK with. Unfortunately the same query in
Windows has been running for 16 hours and still hasn't returned.

In my investigation I found that the pre-compiled exe was faster than the
C++ wrapper I was using, so my tool actually spawns a process using the
pre-compiled binary. In linux I am using v 3.7.13 and in Windows I am using
v 3.7.17.

Does anybody have a suggestion on how to either improve my overall
performance or increase my performance on Windows? My general thought is
that this is caused by the file manager layer in Windows, but I don't know.

Any help would be appreciated!

Thanks,
Chris
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Paul Corke
On 28 June 2013 18:08, Stephan Beal wrote:

> i've seen sqlite3 tests of mine in a 32-bit VM running on
> 64-bit hardware run twice as fast as that same code on the
> 64-bit hardware (outside the VM)

One of our customers uses our product on a VM, and it appears
that the hypervisor lies about having done an fsync.

Performace is the same (as in transactions per second) regardless
of what PRAGMA synchronous is set to.  On real hardware, there's
a big difference.

Paul.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Stephan Beal
On Fri, Jun 28, 2013 at 7:18 PM, Simon Slavin  wrote:

> Probably that your entire VM is in cache memory on your computer, but the
> program running on your hardware gets to write to a physical disk drive.
>

That's certainly the most likely hypothesis i've heard so far. i hadn't
considered the effects of the virtual disk and caching.

 Or you’re running anti-virus software on the hardware, but not inside the
> VM.
>

i haven't needed anti-virus software since last millennium ;).

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Simon Slavin

On 28 Jun 2013, at 6:08pm, Stephan Beal  wrote:

> On Fri, Jun 28, 2013 at 6:55 PM, Christopher W. Steenwyk <
> csteen...@gmail.com> wrote:
> 
>> I also did try version 3.7.13 and that did run faster. So for whatever
>> reason my shell 3.7.17 (32 or 64 bit) is significantly slower on windows
>> than my 3.7.13 32-bit.
> 
> Vaguely related to those observations: i've seen sqlite3 tests of mine in a
> 32-bit VM running on 64-bit hardware run twice as fast as that same code on
> the 64-bit hardware (outside the VM). My assumption is that it has nothing
> directly to do with sqlite3 but with architecture-level details (about
> which i know next to nothing).

Probably that your entire VM is in cache memory on your computer, but the 
program running on your hardware gets to write to a physical disk drive.  Or 
you’re running anti-virus software on the hardware, but not inside the VM.  But 
yes, whatever it is, you’re probably right.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Stephan Beal
On Fri, Jun 28, 2013 at 6:55 PM, Christopher W. Steenwyk <
csteen...@gmail.com> wrote:

> I also did try version 3.7.13 and that did run faster. So for whatever
> reason my shell 3.7.17 (32 or 64 bit) is significantly slower on windows
> than my 3.7.13 32-bit.
>

Vaguely related to those observations: i've seen sqlite3 tests of mine in a
32-bit VM running on 64-bit hardware run twice as fast as that same code on
the 64-bit hardware (outside the VM). My assumption is that it has nothing
directly to do with sqlite3 but with architecture-level details (about
which i know next to nothing).

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread Christopher W. Steenwyk
Thanks for the input.

I did recompile in 64-bit mode with no difference.

I also did try version 3.7.13 and that did run faster. So for whatever
reason my shell 3.7.17 (32 or 64 bit) is significantly slower on windows
than my 3.7.13 32-bit.


On Fri, Jun 28, 2013 at 10:30 AM, og  wrote:

> I had a similar problem and it was the antivirus (win 7 prof)... My table
> has about 63 million
> rows and a description very similar to yours... but now I use the same data
> for FTS4, etc...
> on Debian wheeze (simple workstation... and currently all ok...) example:
>
> sqlite> select count(*) from parte;
> 62468241
> CPU Time: user 33.214076 sys 2.160135
>
> sqlite> select count(*) from parte where nome match 'castelo';
> 3834
> CPU Time: user 0.004000 sys 0.00
>
> good luck... :)
>
> t+
>
>
>
> On Fri, Jun 28, 2013 at 9:54 AM, Christopher W. Steenwyk <
> csteen...@gmail.com> wrote:
>
> > Hi,
> >
> > I have been struggling with a problem and was hoping I could get some
> > insight.
> >
> > I have a rather large database (11 GB) that has two tables (one with
> > approximately 500,000 rows and another with approximately 50,000,000
> rows).
> >  In this database I am performing a query that joins these two tables to
> > produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
> > to return some values to me.
> >
> > I struggled with the slowness of this query for a while in linux and
> > through covering indexes, ANALYZE, and a SSD I was able to get the query
> to
> > run in about 15 minutes which I am OK with. Unfortunately the same query
> in
> > Windows has been running for 16 hours and still hasn't returned.
> >
> > In my investigation I found that the pre-compiled exe was faster than the
> > C++ wrapper I was using, so my tool actually spawns a process using the
> > pre-compiled binary. In linux I am using v 3.7.13 and in Windows I am
> using
> > v 3.7.17.
> >
> > Does anybody have a suggestion on how to either improve my overall
> > performance or increase my performance on Windows? My general thought is
> > that this is caused by the file manager layer in Windows, but I don't
> know.
> >
> > Any help would be appreciated!
> >
> > Thanks,
> > Chris
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Large Database Windows vs Linux

2013-06-28 Thread og
I had a similar problem and it was the antivirus (win 7 prof)... My table
has about 63 million
rows and a description very similar to yours... but now I use the same data
for FTS4, etc...
on Debian wheeze (simple workstation... and currently all ok...) example:

sqlite> select count(*) from parte;
62468241
CPU Time: user 33.214076 sys 2.160135

sqlite> select count(*) from parte where nome match 'castelo';
3834
CPU Time: user 0.004000 sys 0.00

good luck... :)

t+



On Fri, Jun 28, 2013 at 9:54 AM, Christopher W. Steenwyk <
csteen...@gmail.com> wrote:

> Hi,
>
> I have been struggling with a problem and was hoping I could get some
> insight.
>
> I have a rather large database (11 GB) that has two tables (one with
> approximately 500,000 rows and another with approximately 50,000,000 rows).
>  In this database I am performing a query that joins these two tables to
> produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
> to return some values to me.
>
> I struggled with the slowness of this query for a while in linux and
> through covering indexes, ANALYZE, and a SSD I was able to get the query to
> run in about 15 minutes which I am OK with. Unfortunately the same query in
> Windows has been running for 16 hours and still hasn't returned.
>
> In my investigation I found that the pre-compiled exe was faster than the
> C++ wrapper I was using, so my tool actually spawns a process using the
> pre-compiled binary. In linux I am using v 3.7.13 and in Windows I am using
> v 3.7.17.
>
> Does anybody have a suggestion on how to either improve my overall
> performance or increase my performance on Windows? My general thought is
> that this is caused by the file manager layer in Windows, but I don't know.
>
> Any help would be appreciated!
>
> Thanks,
> Chris
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Large Database Windows vs Linux

2013-06-28 Thread Christopher W. Steenwyk
Hi,

I have been struggling with a problem and was hoping I could get some
insight.

I have a rather large database (11 GB) that has two tables (one with
approximately 500,000 rows and another with approximately 50,000,000 rows).
 In this database I am performing a query that joins these two tables to
produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
to return some values to me.

I struggled with the slowness of this query for a while in linux and
through covering indexes, ANALYZE, and a SSD I was able to get the query to
run in about 15 minutes which I am OK with. Unfortunately the same query in
Windows has been running for 16 hours and still hasn't returned.

In my investigation I found that the pre-compiled exe was faster than the
C++ wrapper I was using, so my tool actually spawns a process using the
pre-compiled binary. In linux I am using v 3.7.13 and in Windows I am using
v 3.7.17.

Does anybody have a suggestion on how to either improve my overall
performance or increase my performance on Windows? My general thought is
that this is caused by the file manager layer in Windows, but I don't know.

Any help would be appreciated!

Thanks,
Chris
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users