On Sat, Jul 25, 2009 at 09:26:16AM -0700, Jim Showalter scratched on the wall:
> You can have the tags in a separate table that has a foreign-key to 
> the table with the rows in in you want to tag,

  That's essentially what the OP is doing, except they've built a
  many-to-many relationship using a bridge table, rather than a
  one-to-many using a simple foreign key.  This allows any tag to be
  associated with any row without duplicating any of the tags, allowing
  for better data normalization.

> If you need to search for 
> multiple tags at a time, each requires a join: 

  Right, and that's the problem.  If you've got an application that
  allows you to search for tags with an arbitrary number of tags, you
  need a different query for each situation (one tag, two tags, etc.).
  Also, adding two more JOINs (since you need to join through the
  bridge table) for each additional search target means your performance
  degrades very quickly when you've got four or five search tags.

  This is why I suggested looking at Relational division.  Even if SQL
  doesn't support it directly, there are other ways of doing it to make
  the query more generic.  With division, you can do something like (in the
  terms of the OP's post):

  (foo JOIN foo_tag JOIN tag) / (search_tags) = foo with search_tags

  This basically builds a table of all foo:tag combinations and then
  divides out our search tags, resulting in a list of foo.f_id values
  that have an association with all of the tags found in search_tags.
  You can then JOIN the results of this expression back to the tags.
  The OP asked to have the search tags removed from the results, which
  is also a bit of a trick to do in a generic way.

  Anyways... The relational division makes any query search -- from
  1 to 1000 tags -- have the same structure.  As long as the division
  performance is good, the query performance shouldn't horribly
  degrade as you add more search tags.

> select t
> from tagged as t 
> join t.tags as tag1
> join t.tags as tag2
> where tag1 = some value
>   and tag2 = some value.

  This points in the direction of a one-to-many structure, only the
  syntax is not making sense to me.  JOINs work on tables, not columns,
  and WHERE expressions work on columns, not tables.
  
  You're also only doing the first half of the problem, since you need
  to join this back to the tags to get the result the OP was looking for
  (including pulling back out the search targets).  It's a trickier
  problem than it first looks, especially if you want to do it in one
  SQL statement with a minimal number of sub-SELECTs.

   -j

> ----- Original Message ----- 
> From: "Jay A. Kreibich" <j...@kreibi.ch>
> To: <punk...@eidesis.org>; "General Discussion of SQLite Database" 
> <sqlite-users@sqlite.org>
> Sent: Friday, July 24, 2009 9:21 PM
> Subject: Re: [sqlite] a system for arbitrarily tagging rows in a table
> 
> 
> > On Fri, Jul 24, 2009 at 09:20:29PM -0500, P Kishor scratched on the 
> > wall:
> >> I am trying to develop a "tagging" system, whereby each row in a 
> >> table
> >> can be tagged with arbitrary number of tags.
> >
> >  This smells of a Relational division problem.  If you're dealing 
> > with
> >  tags you might want to have a look at that (Celko has a few good
> >  articles on it).  Since SQL lacks a native Relational division
> >  operator, chances are a solution in that direction is going to be
> >  more complex -- at least for this problem.  But any time I've done
> >  tags or attributes, sooner or later I find myself needing to do a
> >  division.  They come in handy any time you say "my data is vertical
> >  but I need it horizontal."  You might want to read up on them just 
> > to
> >  have that knowledge available.
> >
> >
> >> TABLE foo (f_id INTEGER PRIMARY KEY, f_name TEXT);
> >> TABLE tag (t_id INTEGER PRIMARY KEY, t_name TEXT);
> >> TABLE foo_tag (f_id INTEGER, t_id INTEGER);
> >
> >
> >> I have the following solution. Could I do better or differently?
> >
> >  I'm not sure about "better", but here's different:
> >
> > sqlite> SELECT foo.f_name, tag.t_name
> >   ...> FROM tag AS target
> >   ...>   NATURAL JOIN foo_tag AS target_ft
> >   ...>   NATURAL JOIN foo
> >   ...>   NATURAL JOIN foo_tag
> >   ...>   NATURAL JOIN tag
> >   ...> WHERE target.t_name = 'bar'
> >   ...>   AND tag.t_id != target.t_id
> >   ...> ORDER BY foo.f_name, tag.t_name;
> >
> >  This basically folds your IN sub-select back into the main query.
> >  We join "foo" to the tag table in two directions... one to find
> >  the search target tag id and the other to produce the output.
> >
> >  "tag AS target" with the first WHERE clause should return one row.
> >  We join that through "foo_tag AS target_ft" to get a list of foo 
> > ids
> >  that have the search target tag.  We then build the normal output
> >  list by joining that back through the foo_tag bridge table to the
> >  tags, and throw out any rows with an output tag id that matches the
> >  search target tag id.
> >
> >  Simple!
> >
> >  Best of all, the target only appears in the query once, and your 
> > name
> >  convention means we can use NATURAL JOINs to keep things clean.  It
> >  also totally falls apart if you need to search on more than one 
> > tag.
> >  That's where Relational division comes in.
> >
> >   -j (who had to draw a picture to get it right)

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Our opponent is an alien starship packed with atomic bombs.  We have
 a protractor."   "I'll go home and see if I can scrounge up a ruler
 and a piece of string."  --from Anathem by Neal Stephenson
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to