Hi Simon,

Am 10.12.2014 12:39, schrieb Simon Slavin:
Dear folks,

A little SQL question for you.  The database file concerned is purely for data 
manipulation at the moment.  I can do anything I like to it, even at the schema 
level, without inconveniencing anyone.

I have a TABLE with about 300 million (sic.) entries in it, as follows:

CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)

There are numerous cases where two or more rows (up to a few thousand in some 
cases) have the same values for a and b.  I would like to merge those rows into 
one row with a 'theCount' which is the total of all the merged rows.  
Presumably I do something like

CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)

INSERT INTO s2merged SELECT DISTINCT ... FROM s2
insert into s2merged (a, b, theCount) select a, b, sum(theCount) from s2 group by a, b;

and there'll be a TOTAL() in there somewhere.  Or is it GROUP BY ?  I can't 
seem to get the right phrasing.

Also, given that this is the last operation I'll be doing on table s2, will it 
speed things up to create an index on s2 (a,b), or will the SELECT just spend 
the same time making its own temporary index ?
Creating the index and select with index will probably be slower than select without index

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to