Marcus Engene wrote:
> Hi List,
>
> I have the might_like table that contains products a user might like if
> he likes the present one (item).
>
> CREATE TABLE might_like
> (
> item INTEGER NOT NULL
> ,created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
> ,child INTEGER NOT NULL
> )
> WITHOUT OIDS;
>
> CREATE INDEX might_like_x1 ON might_like(item);
>
> Since there are (will be) houndreds of thousands of items, and 20+ might
> like items, i thought it would be nice to reduce the set to 1/20th by
> using a vector.
>
> CREATE TABLE might_like_vector
> (
> item INTEGER NOT NULL
> ,created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
> ,child_arr INTEGER[]
> )
> WITHOUT OIDS;
You haven't reduced the set at all, you've just turned part of it
sideways. You might gain something on your search, but I'm guessing
you've not tested it.
Hmm - the attached script generates 100,000 items and 10 liked ones for
each (well for the first 99,990 it says you like the next 10 items).
They're all given different timestamps at day intervals which means
you'll end up with 6 or seven matches for you sample query.
> But then this don't work:
>
> select
> ...
> from
> item pic
> where
> pic.objectid in (
> select mlv.child_arr
> from might_like_vector mlv
> where mlv.item = 125 AND
> mlv.created_at > now() - interval '1 week'
> )
> limit 16
Without messing around with arrays you get this query (which seems
readable enough to me)
SELECT
objectid, objname
FROM
items i
JOIN might_like m ON (i.objectid = m.child)
WHERE
m.created_at > (now() - '1 week'::interval)
AND m.item = 125
ORDER BY
objectid
LIMIT
16
;
I'm getting times less than a millisecond for this - are you sure it's
worth fiddling with arrays?
--
Richard Huxton
Archonet Ltd
BEGIN;
CREATE SCHEMA mightlike;
SET search_path = mightlike;
CREATE TABLE items (
objectid integer NOT NULL,
objname text NOT NULL
);
CREATE TABLE might_like (
item integer NOT NULL,
created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
child integer NOT NULL
);
INSERT INTO items SELECT i, 'item number ' || i
FROM generate_series(1, 100000) i;
INSERT INTO might_like SELECT i, (now() - j * '1 day'::interval), i+j
FROM generate_series(1, 99990) i, generate_series(1, 10) j;
ALTER TABLE items ADD PRIMARY KEY (objectid);
ALTER TABLE might_like ADD PRIMARY KEY (item, child);
ALTER TABLE might_like ADD CONSTRAINT valid_child FOREIGN KEY (child) REFERENCES items;
CREATE INDEX might_like_idx1 ON might_like (item, created_at);
-- EXPLAIN ANALYSE
SELECT
objectid, objname
FROM
items i
JOIN might_like m ON (i.objectid = m.child)
WHERE
m.created_at > (now() - '1 week'::interval)
AND m.item = 125
ORDER BY
objectid
LIMIT
16
;
ROLLBACK;
--
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general