Re: [sqlite] Counting rows
On Thu, 11 Dec 2014 15:19:26 + Simon Slavinwrote: > In my table which had about 300 million (sic.) rows I did this > > SELECT count(*) FROM myTable; > > to count the number of rows. After half an hour it was still > processing and I had to kill it. > > I know that the internal structure of a table means that this number > isn't simple to produce. But is there really no faster way ? This > table is going to have about six times that amount soon. I really > can't count the rows in less than a few hours ? Try SELECT count(myTable.indexed_column) FROM myTable; indexed_column is a column of your table that is the leftmost NOTNULL column in a index, for example the rowid. count(*) will look at every row in the table pages. The count(myTable.indexed_column) with the restrictions I said, should use the index, which (again) should use a lot less pages, minimizing the I/O. > Simon. --- --- Eduardo Morras ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
>From previous reading (years ago on this list) I normally do select count(1) from tableName ; to count the rows in a table. as an alternate, select count(primary_key_or_SomeIndexName) from tableName when trying to get an actual count. beware: select count(someField) from table; will not count rows where someField is null select count(1) from table; will. Adam On Thu, Dec 11, 2014 at 11:39 AM, Dominique Deviennewrote: > On Thu, Dec 11, 2014 at 4:19 PM, Simon Slavin > wrote: > > > In my table which had about 300 million (sic.) rows I did this > > SELECT count(*) FROM myTable; > > to count the number of rows. After half an hour it was still processing > > and I had to kill it. > > > > I have a little utility that connects to Oracle, and does a big UNION ALL > query to get the counts of all my tables (82 currently): > > TOTAL: 1,900,343 rows in 20 tables (out of 82) > 0.129u 0.051s 0:00.66 25.7% 0+0k 32+32io 0pf+0w (COLD) > 0.128u 0.045s 0:00.34 47.0% 0+0k 0+32io 0pf+0w (HOT) > > Granted, it's not 300M rows, just 1.9M, but that's 660ms (340ms when in > cache), and that's counting the startup and connect time (~ 170ms). > > The plan is INDEX (FAST FULL SCAN) of the PK (a raw(16) for a GUID). FWIW, > for context/comparison. --DD > > PS: I was actually surprised it was that cheap. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- -- VerifEye Technologies Inc. 151 Whitehall Dr. Unit 2 Markham, ON L3R 9T1 ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
On 11 Dec 2014, at 4:39pm, Dominique Deviennewrote: > I have a little utility that connects to Oracle, and does a big UNION ALL > query to get the counts of all my tables (82 currently): Yeah, it's easy in Oracle. The problem is that SQLite3 uses a tree to store lists, and it does not store the total number of entries separately. So to count the number of rows in a table SQLite has to walk the entire tree: go up and down all the branches to find which rows exist, whether any have been deleted, etc.. SQLite4 uses a different file format and I understand it does not have this problem. Which doesn't help me at all right now. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
On Thu, Dec 11, 2014 at 4:19 PM, Simon Slavinwrote: > In my table which had about 300 million (sic.) rows I did this > SELECT count(*) FROM myTable; > to count the number of rows. After half an hour it was still processing > and I had to kill it. > I have a little utility that connects to Oracle, and does a big UNION ALL query to get the counts of all my tables (82 currently): TOTAL: 1,900,343 rows in 20 tables (out of 82) 0.129u 0.051s 0:00.66 25.7% 0+0k 32+32io 0pf+0w (COLD) 0.128u 0.045s 0:00.34 47.0% 0+0k 0+32io 0pf+0w (HOT) Granted, it's not 300M rows, just 1.9M, but that's 660ms (340ms when in cache), and that's counting the startup and connect time (~ 170ms). The plan is INDEX (FAST FULL SCAN) of the PK (a raw(16) for a GUID). FWIW, for context/comparison. --DD PS: I was actually surprised it was that cheap. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
On 11 Dec 2014, at 3:58pm, Paul Sandersonwrote: > would count _rowid_ from mytable be quicker Hmm. Given that these tables have the normal use of rowid, and that rows in this table are only inserted, never deleted, I wonder whether SELECT max(rowid) FROM myTable would have given the right result, almost instantly. Can't check it now, but thanks for the idea, Paul. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
Simon Slavin wrote on Thursday, December 11, 2014 10:19 AM > I know that the internal structure of a table means that this number > isn't simple to produce. But is there really no faster way ? This > table is going to have about six times that amount soon. I really > can't count the rows in less than a few hours ? > Might it be possible to write a trigger that keeps track of insertions/deletions and updates a counter in another table? As another data point, I tested an 8 GB database on a SAN that has about 14 million rows, count(*) takes about 7 seconds. Erik -- This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
On 2014/12/11 17:19, Simon Slavin wrote: In my table which had about 300 million (sic.) rows I did this SELECT count(*) FROM myTable; to count the number of rows. After half an hour it was still processing and I had to kill it. I know that the internal structure of a table means that this number isn't simple to produce. But is there really no faster way ? This table is going to have about six times that amount soon. I really can't count the rows in less than a few hours ? Not so strange I think... The highest number of rows I have tried to do maintenance of that sort on was only about 50 million though DB size was around 150GB, and the row-count on that took some time to establish, but in the order of minutes, not hours and certainly not days. I have here and now only a 10-million row DB to run some quick tests on a machine without SSD or anything good - seems to take around 1 min 20s on the first attempt to do a count() and around 33s on the next attempts (I'm assuming caching doing its bit here). Scaling that up - it becomes apparent that a ~300-mil row DB with 30x the rows than what I am testing should take around 30x the time, which is 1m20 x 30 which is around 40 minutes assuming similar hardware. You probably stopped it just shy of the goal. Either way, 30 minutes and 45 minutes are /exactly/ the same amounts of time when measured in impatience base units. I have no idea how to make it faster or in any way how to improve the speed on a query that simple. Knowing you (a bit) I already know you have thought about all of this a lot and you won't be asking if solutions were easy to come by, so my usual advice probably won't help much, other than to ask - do you really need to know the row-count? Is knowing it is around 300mil not enough? Any other query you might envision on this data-set will probably take in the order of hours for simple ones and days for anything joined. Best is to have a lone-standing machine churn through it over time and getting some results from time to time as a long-term project. (I know you are already well-aware of this). Cheers and best of luck! Ryan (PS: I know the above isn't really helpful or qualified as an "answer", sorry about that - you are dealing with a special beast indeed). ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows
would count _rowid_ from mytable be quicker Paul www.sandersonforensics.com skype: r3scue193 twitter: @sandersonforens Tel +44 (0)1326 572786 http://sandersonforensics.com/forum/content.php?195-SQLite-Forensic-Toolkit -Forensic Toolkit for SQLite http://sandersonforensics.com/forum/content.php?168-Reconnoitre - VSC processing made easy On 11 December 2014 at 15:19, Simon Slavinwrote: > In my table which had about 300 million (sic.) rows I did this > > SELECT count(*) FROM myTable; > > to count the number of rows. After half an hour it was still processing and > I had to kill it. > > I know that the internal structure of a table means that this number isn't > simple to produce. But is there really no faster way ? This table is going > to have about six times that amount soon. I really can't count the rows in > less than a few hours ? > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Counting rows
In my table which had about 300 million (sic.) rows I did this SELECT count(*) FROM myTable; to count the number of rows. After half an hour it was still processing and I had to kill it. I know that the internal structure of a table means that this number isn't simple to produce. But is there really no faster way ? This table is going to have about six times that amount soon. I really can't count the rows in less than a few hours ? Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows in one table, and comparing COUNT with column in other table?
"Gilles Ganault" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > One more thing: I'd like to only retrieve rows where code.number and > COUNT(companies.code) don't match SELECT code.number, COUNT(companies.code) FROM code JOIN companies ON (code.id=companies.code) GROUP BY (code.id) HAVING code.number != COUNT(companies.code); Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows in one table, and comparing COUNT with column in other table?
On Tue, 11 Nov 2008 07:30:21 -0500, "Igor Tandetnik" <[EMAIL PROTECTED]> wrote: >> SELECT code.number,COUNT(companies.code) FROM code,companies WHERE >> code.id=companies.code >> > >Add "GROUP BY code.id" One more thing: I'd like to only retrieve rows where code.number and COUNT(companies.code) don't match (this happens if there were a network problem or a bug on the server from which I'm downloading those data). SQLite doesn't seem to allow this: = SELECT code.number,COUNT(companies.code) FROM code,companies WHERE code.id=companies.code AND code.number <> COUNT(companies.code) GROUP BY code.id SQL error: misuse of aggregate: COUNT(companies.code) = Should I use a sub-SELECT to perform the second COUNT? Thank you. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows in one table, and comparing COUNT with column in other table?
On Tue, 11 Nov 2008 07:30:21 -0500, "Igor Tandetnik" <[EMAIL PROTECTED]> wrote: >Add "GROUP BY code.id" Way to do. Thank you. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows in one table, and comparing COUNT with column in other table?
"Gilles Ganault" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Here's an example of data in the CODE and COMPANIES tables: > > > CODE table > id=1 > number=26 > > COMPANIES table > id=x > code=1 > > > This doesn't: > > SELECT code.number,COUNT(companies.code) FROM code,companies WHERE > code.id=companies.code > Add "GROUP BY code.id" Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Counting rows in one table, and comparing COUNT with column in other table?
Hello This is a newbie question. I could successfully retrieve the results by sending two queries and extracting data with Python, but I was wondering if I could perform the same action in one SQLite query. Here's an example of data in the CODE and COMPANIES tables: CODE table id=1 number=26 COMPANIES table id=x code=1 I'd like to check if there are actually as many rows in COMPANIES that match whatever number of rows is saved CODE.number. This works: sql = 'SELECT id FROM code' rows=list(cursor.execute(sql)) for id in rows: sql = 'SELECT code.number,COUNT(companies.code) FROM code, companies WHERE code.id="%s" AND companies.code="%s"' % (id[0],id[0]) result = list(cursor.execute(sql)) print "Code=%s, number=%s" % (id[0],result[0][0]) This doesn't: SELECT code.number,COUNT(companies.code) FROM code,companies WHERE code.id=companies.code Can SQLite do what I'd like to do above, or should I just forget use the two-step above? Thank you for any tip. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Counting rows in multiple tables and joining on an index
create table t1 (rpsIndex INTEGER); create table t2 (rpsIndex INTEGER); create table t3 (rpsIndex INTEGER); insert into t1 values (1); insert into t1 values (1); insert into t1 values (2); insert into t2 values (1); insert into t2 values (2); insert into t2 values (3); insert into t3 values (3); insert into t3 values (3); insert into t3 values (3); select rpsIndex, count(1) frequency from ( select rpsIndex from t1 union all select rpsIndex from t2 union all select rpsIndex from t3 ) group by rpsIndex; - Trey jrpfinch wrote: > I have three tables, each of which has the column rpsIndex. This column is > not a unique index/primary key. > > I would like to count the number of times each rpsIndex appears in all three > tables. E.g. > > Table1 > rpsIndex=1 > rpsIndex=1 > rpsIndex=2 > Table2 > rpsIndex=1 > rpsIndex=2 > rpsIndex=3 > Table3 > rpsIndex=3 > rpsIndex=3 > rpsIndex=3 > > Query would return: > rpsIndex Frequency > 13 > 22 > 34 > > It is possible to do this in pure SQL in SQLite? > > Many thanks > > jon > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Counting rows in multiple tables and joining on an index
I have three tables, each of which has the column rpsIndex. This column is not a unique index/primary key. I would like to count the number of times each rpsIndex appears in all three tables. E.g. Table1 rpsIndex=1 rpsIndex=1 rpsIndex=2 Table2 rpsIndex=1 rpsIndex=2 rpsIndex=3 Table3 rpsIndex=3 rpsIndex=3 rpsIndex=3 Query would return: rpsIndex Frequency 13 22 34 It is possible to do this in pure SQL in SQLite? Many thanks jon -- View this message in context: http://www.nabble.com/Counting-rows-in-multiple-tables-and-joining-on-an-index-tp15547081p15547081.html Sent from the SQLite mailing list archive at Nabble.com. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users