On 2017/01/02 5:28 PM, Ron Aaron wrote:
Hello, all -

I'm trying to select from a table which contains pairs of "id", "name",
where the "name" may appear in multiple "ids".  For example, the table
might have

"10", "abc"
"20", "abc"
"10", "def"
"10", "ghi"
"20", "ghi"
etc.    That is, not every id will have every name in it.

What I would like to accomplish, is to select distinct names, along with
an id, such that the id is as uniformly represented as possible.

My current solution is to create a temporary table with a unique key for
the name, and select into it like:

insert or ignore into tbl
select id,name
from originaltbl
where name in (...)
order by random() ;

This sort of works, in that I get each name just once, but the
distribution of ids is not as uniform as I would like, and I'm
scratching my head as to how to accomplish this.

The purpose here is to distribute jobs over processing units,
distributing the load as evenly as possible.

Might I suggest:
CREATE TABLE tbl (UKey TEXT COLLATE NOCASE PRIMARY KEY);

INSERT OR IGNORE INTO tbl
SELECT DISTINCT id||name FROM originaltbl WHERE...

Perhaps use only as a temporary table, or keep it up to date with a trigger.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to