On Sat, 16 Apr 2016 01:20:55 +0200
Ketil Froyn <ketil at froyn.name> wrote:
> I have two tables and a join table, in principle like this:
>
> CREATE TABLE records (id INTEGER PRIMARY KEY, data TEXT);
> CREATE TABLE features (id INTEGER PRIMARY KEY, data TEXT UNIQUE);
> CREATE TABLE records_features (id_r INTEGER, id_f INTEGER, ord
> INTEGER);
(As a style suggestion, consider not reduplicating plurals. It's a
one-man band, a 3-star general, and a 5-finger discount. What you have
is a set of record_features.)
> Later, if I come across a new record that has exactly features
> "feature1", "feature2" and "feature3" (IDs 20,21 and 22), I don't want
> to add a new record for this. So given a new set of feature IDs, how
> can I best check if there's another record that has these exact
> features before I insert it?
>
> To be clear, a record with features "feature1", "feature2", "feature4"
> would be ok. So would "feature1", "feature2". Subsets are ok, but not
> exact duplicates.
This is an application of relational division: you want to know if two
sets are equal. Well, almost. You really want to know if the "new" set
is a subset of an existing one.
You're also a little stuck for lack of syntax. There are two sets in
question: the extant one and the one to be inserted. You can insert a
set in SQLite,
insert into T values (a), (b), (c);
but because that set of rows (a, b, and c) doesn't have a name, you
can't refer to them again in the same query. So you have to dance a
little jig, something like:
begin transation
insert into record features ...
delete from record features
id = new_id
where exists (
select 1 from record features as rf
where rf.id_r = record_features.id_r
and rf.id_f = record_features.id_f
group by id
having count(*) = (
select count(*)
from record_features where id_r = new_id
)
);
commit transaction
That puts the rows into the database -- where they can be examined --
and deletes them if they turn out to be a duplicate. You could also
apply the same logic in an insert & update trigger, and probably should
if the rule applies to the *data*, and not just the application's
expectations.
--jkl