[SQL] join and sort on 'best match'

2006-12-13 Thread Dirk Griffioen

 Hi Everybody,

I have been breaking my head on the following problem: how to join 2 
tables and sort the results on the best match.


explanation:

- there are 3 tables, items, tags and items_tags. The items_tags table 
links items to tags.
- I have one item which has certain tags, and I want to look up all the 
other items that have those tags as well
- results should be sorted and presented by 'best match': first all the 
items that have 3 tags in common, then 2 and last 1


example:

Item 1 : news, nature, greenpeace
Item 2 : news, nature
Item 3 : news, nature, greenpeace, whale

Item 1 and Item 3 are the best match.

So far, the SQL I came up wiht looks like:

SELECT id, COUNT(items_tags.item_id) AS quantity
FROM items JOIN items_tags ON items_tags.item_id = items.id
WHERE id in (select item_id from items_tags where tag_id in (select 
tag_id from items_tags where item_id=?))

GROUP BY items_tags.item_id,id
ORDER BY quantity DESC

note: the '?' in the query represents the dynamic part: I have 1 item 
and I want to look up matching items.


To me, this query means the following:

- get all items that have tags, the 'JOIN', and count the tags, but only 
those that match on the same tags, the 'WHERE'

- then show them

I thought I had found the solution (my test cases worked), but I now 
find cases that should be found by the query but are not.


Can anyone please help me?

Dirk




Re: [SQL] join and sort on 'best match'

2006-12-13 Thread Ragnar
On miư, 2006-12-13 at 10:26 +0100, Dirk Griffioen wrote:

> I have been breaking my head on the following problem: how to join 2
> tables and sort the results on the best match.
  ^
> - there are 3 tables, items, tags and items_tags. The items_tags table
> links items to tags.
> - I have one item which has certain tags, and I want to look up all
> the other items that have those tags as well

looks to me like you want to join:
items->item_tags->tags->item_tags->items

so the basic select is:
SELECT *
  FROM  items AS i1
   JOIN items_tags AS it1 ON (it1.item_id = i1.id)
   JOIN tags AS t ON (t.tag_id = it1.tag_id)
   JOIN items_tags AS it2 ON (it2.tag_id = t.tag_id)
   JOIN items AS i2 ON (i2.id = it2.item_id)
  WHERE i1.id=?

> - results should be sorted and presented by 'best match': first all
> the items that have 3 tags in common, then 2 and last 1

this would be: 

SELECT i1.id,i2.id,COUNT(*) as quantity
  FROM  items AS i1
   JOIN items_tags AS it1 ON (it1.item_id = i1.id)
   JOIN tags AS t ON (t.tag_id = it1.tag_id)
   JOIN items_tags AS it2 ON (it2.tag_id = t.tag_id)
   JOIN items AS i2 ON (i2.id = it2.item_id)
  WHERE i1.id=?
  GROUP by i1.id,i2.id
  ORDER BY quantity DESC

> I thought I had found the solution (my test cases worked), but I now
> find cases that should be found by the query but are not.

if this does not work, please provide us with a counter example.

gnari



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[SQL] Rule for multiple entries

2006-12-13 Thread William Scott Jordan

Hi all!

I have a rule in place that is supposed to adjust a value in one 
table based on how many rows are added or deleted to another table, 
but I'm not getting the results that I hoped for.  If a single sql 
statement adds or deletes multiple entries, the rule still only 
adjusts the value by one, when I want it to adjust the value by the 
number of rows that were added or deleted.


To recreate this problem, first set up the schema:

CREATE TABLE counter (
counter_key int,
counter_value int
) ;

INSERT INTO counter (counter_key, counter_value) VALUES (1, 0) ;

CREATE TABLE entries (
entry_key int
) ;

CREATE RULE entries_insert AS ON INSERT TO entries DO UPDATE counter 
SET counter_value = counter_value + 1 WHERE counter_key = 1 ;
CREATE RULE entries_delete AS ON DELETE TO entries DO UPDATE counter 
SET counter_value = counter_value - 1 WHERE counter_key = 1 ;


Then enter some values:

INSERT INTO entries (entry_key) VALUES (1) ;
INSERT INTO entries (entry_key) VALUES (2) ;
INSERT INTO entries (entry_key) VALUES (3) ;
INSERT INTO entries (entry_key) VALUES (4) ;
INSERT INTO entries (entry_key) VALUES (5) ;

At this point the counter table should show a counter_value of 5, 
because there are 5 entries in the entries table.  Now, if you delete 
3 of those entries, the counter table should show a value of 2, but 
instead it only counts it as a single transaction.


DELETE FROM entries WHERE entry_key > 2 ;

Why doesn't the entries_delete rule hit for each row that's being 
adjusted.  Or, more importantly, how can I adjust the rule so that it 
will count each row that's be added/deleted?


Any suggestions would be appreciated.

-Scott


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [SQL] Rule for multiple entries

2006-12-13 Thread Jim Buttafuoco
Use a trigger instead, the rule is only run once per insert/update/delete
while the trigger is run for each row.


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of William Scott Jordan
Sent: Wednesday, December 13, 2006 9:05 PM
To: [email protected]
Subject: [SQL] Rule for multiple entries

Hi all!

I have a rule in place that is supposed to adjust a value in one 
table based on how many rows are added or deleted to another table, 
but I'm not getting the results that I hoped for.  If a single sql 
statement adds or deletes multiple entries, the rule still only 
adjusts the value by one, when I want it to adjust the value by the 
number of rows that were added or deleted.

To recreate this problem, first set up the schema:

CREATE TABLE counter (
counter_key int,
counter_value int
) ;

INSERT INTO counter (counter_key, counter_value) VALUES (1, 0) ;

CREATE TABLE entries (
entry_key int
) ;

CREATE RULE entries_insert AS ON INSERT TO entries DO UPDATE counter 
SET counter_value = counter_value + 1 WHERE counter_key = 1 ;
CREATE RULE entries_delete AS ON DELETE TO entries DO UPDATE counter 
SET counter_value = counter_value - 1 WHERE counter_key = 1 ;

Then enter some values:

INSERT INTO entries (entry_key) VALUES (1) ;
INSERT INTO entries (entry_key) VALUES (2) ;
INSERT INTO entries (entry_key) VALUES (3) ;
INSERT INTO entries (entry_key) VALUES (4) ;
INSERT INTO entries (entry_key) VALUES (5) ;

At this point the counter table should show a counter_value of 5, 
because there are 5 entries in the entries table.  Now, if you delete 
3 of those entries, the counter table should show a value of 2, but 
instead it only counts it as a single transaction.

DELETE FROM entries WHERE entry_key > 2 ;

Why doesn't the entries_delete rule hit for each row that's being 
adjusted.  Or, more importantly, how can I adjust the rule so that it 
will count each row that's be added/deleted?

Any suggestions would be appreciated.

-Scott


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings