Re: [sqlite] replace many rows with one
On 10 Dec 2014, at 3:40pm, RSmith wrote:

> INSERT INTO s2merged SELECT a, b, sum(theCount) FROM s2 GROUP BY a,b;

Thanks to Martin, Hick and R for this solution. It was just what I was looking for.

> Not sure if your theCount field already contains totals or if it just has
> 1's... how did duplication happen?

The existing rows contain totals. Or maybe I should call them subtotals. The data is being massaged from one format to another. I did a bunch of stuff when it was text files, then imported it into SQLite and did a bunch more on it as rows and columns. Eventually it'll end up in SQLite.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
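Because the existing rows hold subtotals, one way to sanity-check the merge is to confirm that the grand total of theCount is unchanged and that no (a, b) pair appears twice in s2merged. The following is only a sketch using Python's standard sqlite3 module; the helper name merge_is_consistent and the sample rows are made up for illustration and are not from the thread.

```python
import sqlite3

def merge_is_consistent(conn):
    """Check that s2merged keeps the grand total and has unique (a, b) pairs."""
    total_before = conn.execute("SELECT sum(theCount) FROM s2").fetchone()[0]
    total_after = conn.execute("SELECT sum(theCount) FROM s2merged").fetchone()[0]
    # Count (a, b) groups that still occur more than once after merging.
    leftover_dupes = conn.execute(
        "SELECT count(*) FROM "
        "(SELECT 1 FROM s2merged GROUP BY a, b HAVING count(*) > 1)"
    ).fetchone()[0]
    return total_before == total_after and leftover_dupes == 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)")
conn.execute("CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)")

# Hypothetical subtotals: ('k1', 'v1') is split across two rows.
conn.executemany(
    "INSERT INTO s2 VALUES (?, ?, ?)",
    [("k1", "v1", 40), ("k1", "v1", 2), ("k2", "v1", 7)],
)

# Merge with sum(), as suggested in the thread.
conn.execute("INSERT INTO s2merged SELECT a, b, sum(theCount) FROM s2 GROUP BY a, b")
ok_after_sum = merge_is_consistent(conn)
print(ok_after_sum)  # True: 40 + 2 + 7 == 42 + 7

# Merging with count(*) instead would discard the subtotals.
conn.execute("DELETE FROM s2merged")
conn.execute("INSERT INTO s2merged SELECT a, b, count(*) FROM s2 GROUP BY a, b")
ok_after_count = merge_is_consistent(conn)
print(ok_after_count)  # False: the grand total drops from 49 to 3
```

On the real 300-million-row table the two sum() queries each need a full scan, so a check like this is probably worth running once rather than routinely.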
Re: [sqlite] replace many rows with one
On 2014/12/10 13:39, Simon Slavin wrote:

> Dear folks,
>
> A little SQL question for you. The database file concerned is purely for data manipulation at the moment. I can do anything I like to it, even at the schema level, without inconveniencing anyone.
>
> I have a TABLE with about 300 million (sic.) entries in it, as follows:
>
> CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)
>
> There are numerous cases where two or more rows (up to a few thousand in some cases) have the same values for a and b. I would like to merge those rows into one row with a 'theCount' which is the total of all the merged rows. Presumably I do something like
>
> CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)
>
> INSERT INTO s2merged SELECT DISTINCT ... FROM s2

I think the one you are looking for is:

INSERT INTO s2merged SELECT a, b, sum(theCount) FROM s2 GROUP BY a,b;

Not sure if your theCount field already contains totals or if it just has 1's... how did duplication happen?

If it just has 1's, you might also be able to use simply:

INSERT INTO s2merged SELECT a, b, count() FROM s2 GROUP BY a,b;

Either way, the last query will obviously show the duplication counts (if needed as an exercise).

For 300 million rows this will be rather quick if it's going to be a once-off thing and not something running often. I'd say it will take under an hour, depending on hardware and how much duplication happened in s2. Making an index will take a lot longer; you are better off just running the merge as above, unless of course the eventual use of s2merged includes being a look-up attached DB or such, in which case making an index from the start will be worthwhile.
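To make the difference between the two queries concrete, here is a small sketch using Python's standard sqlite3 module on an in-memory database. The sample rows and the extra table s2counted are invented for illustration; only s2, s2merged and the two SELECT statements come from the thread (count(*) is used here as the more common spelling of count()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)")
conn.execute("CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)")
# Hypothetical second target table, used only to show the count(*) variant.
conn.execute("CREATE TABLE s2counted (a TEXT, b TEXT, theCount INTEGER)")

# Made-up sample data: ('x', 'y') appears three times, ('x', 'z') once.
conn.executemany(
    "INSERT INTO s2 VALUES (?, ?, ?)",
    [("x", "y", 5), ("x", "y", 2), ("x", "z", 1), ("x", "y", 10)],
)

# sum(theCount) adds up the stored values within each (a, b) group.
conn.execute("INSERT INTO s2merged SELECT a, b, sum(theCount) FROM s2 GROUP BY a, b")

# count(*) reports how many source rows fell into each (a, b) group.
conn.execute("INSERT INTO s2counted SELECT a, b, count(*) FROM s2 GROUP BY a, b")

merged = conn.execute("SELECT a, b, theCount FROM s2merged ORDER BY a, b").fetchall()
counted = conn.execute("SELECT a, b, theCount FROM s2counted ORDER BY a, b").fetchall()

print(merged)   # [('x', 'y', 17), ('x', 'z', 1)]
print(counted)  # [('x', 'y', 3), ('x', 'z', 1)]
```

The two results agree only when every theCount in s2 is 1, which is why the question about what theCount holds matters.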
Re: [sqlite] replace many rows with one
Both, I guess

Insert into ... select a,b,sum(theCount) group by a,b;

-----Original Message-----
From: Simon Slavin [mailto:slav...@bigfraud.org]
Sent: Wednesday, 10 December 2014 12:39
To: General Discussion of SQLite Database
Subject: [sqlite] replace many rows with one

Dear folks,

A little SQL question for you. The database file concerned is purely for data manipulation at the moment. I can do anything I like to it, even at the schema level, without inconveniencing anyone.

I have a TABLE with about 300 million (sic.) entries in it, as follows:

CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)

There are numerous cases where two or more rows (up to a few thousand in some cases) have the same values for a and b. I would like to merge those rows into one row with a 'theCount' which is the total of all the merged rows. Presumably I do something like

CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)

INSERT INTO s2merged SELECT DISTINCT ... FROM s2

and there'll be a TOTAL() in there somewhere. Or is it GROUP BY ? I can't seem to get the right phrasing.

Also, given that this is the last operation I'll be doing on table s2, will it speed things up to create an index on s2 (a,b), or will the SELECT just spend the same time making its own temporary index ?

Simon.

___
Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at
Re: [sqlite] replace many rows with one
Hi Simon,

On 10.12.2014 12:39, Simon Slavin wrote:

> Dear folks,
>
> A little SQL question for you. The database file concerned is purely for data manipulation at the moment. I can do anything I like to it, even at the schema level, without inconveniencing anyone.
>
> I have a TABLE with about 300 million (sic.) entries in it, as follows:
>
> CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)
>
> There are numerous cases where two or more rows (up to a few thousand in some cases) have the same values for a and b. I would like to merge those rows into one row with a 'theCount' which is the total of all the merged rows. Presumably I do something like
>
> CREATE TABLE s2merged (a TEXT, b TEXT, theCount INTEGER)
>
> INSERT INTO s2merged SELECT DISTINCT ... FROM s2

insert into s2merged (a, b, theCount) select a, b, sum(theCount) from s2 group by a, b;

> and there'll be a TOTAL() in there somewhere. Or is it GROUP BY ? I can't seem to get the right phrasing.
>
> Also, given that this is the last operation I'll be doing on table s2, will it speed things up to create an index on s2 (a,b), or will the SELECT just spend the same time making its own temporary index ?

Creating the index and then selecting with it will probably be slower than selecting without an index.

> Simon.
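One way to see what SQLite intends to do for the grouping, with and without an index on (a, b), is to ask for the query plan. This is a sketch only, using Python's standard sqlite3 module; the index name s2_ab and the sample rows are invented, and the wording of the plan text differs between SQLite versions. It shows the chosen strategy, not the running time, so it does not by itself settle which approach is faster on 300 million rows.

```python
import sqlite3

QUERY = "SELECT a, b, sum(theCount) FROM s2 GROUP BY a, b"

def query_plan(conn):
    """Return the human-readable detail text of each EXPLAIN QUERY PLAN row."""
    # The detail string is the last column of every plan row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s2 (a TEXT, b TEXT, theCount INTEGER)")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO s2 VALUES (?, ?, ?)",
    [("x", "y", 5), ("x", "y", 2), ("x", "z", 1)],
)

# Plan and result with no index: SQLite has to organise the grouping itself.
plan_without = query_plan(conn)
result_without = sorted(conn.execute(QUERY).fetchall())

# Plan and result after adding an index on (a, b); s2_ab is a hypothetical name.
conn.execute("CREATE INDEX s2_ab ON s2 (a, b)")
plan_with = query_plan(conn)
result_with = sorted(conn.execute(QUERY).fetchall())

print(plan_without)
print(plan_with)
print(result_with == result_without)  # True: the index changes the plan, not the answer
```

If the un-indexed plan mentions a temporary b-tree for the GROUP BY, that is the "temporary index" the question refers to; building a permanent index first does comparable sorting work and additionally has to write the index into the database file.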