[SQL] RTREE on points

2001-04-15 Thread Julian Scarfe

Am I missing the point (no pun intended ;-) of RTREE indices?

I was expecting a "point_ops" opclass or similar...

[7.1 on RedHat 6.2]

SELECT am.amname AS acc_name,
opc.opcname AS ops_name,
COUNT(*)
 FROM pg_am am, pg_amop amop,
  pg_opclass opc
 WHERE amop.amopid = am.oid AND
   amop.amopclaid = opc.oid AND
   am.amname = 'rtree'
 GROUP BY am.amname, opc.opcname
 ORDER BY acc_name, ops_name;

 acc_name |  ops_name  | count
----------+------------+-------
 rtree    | bigbox_ops |     8
 rtree    | box_ops    |     8
 rtree    | poly_ops   |     8
(3 rows)

Surely the most natural application of an RTREE is to index points, as well
as boxes and polygons. E.g.


CREATE TABLE "nodes" (
"node" point,
"node_name" character varying(30)
);
CREATE
INSERT INTO nodes VALUES ('(1,1)', 'a');
INSERT 207372 1
INSERT INTO nodes VALUES ('(1,2)', 'b');
INSERT 207373 1
INSERT INTO nodes VALUES ('(3,2)', 'c');
INSERT 207374 1
INSERT INTO nodes VALUES ('(5,4)', 'd');
INSERT 207375 1
INSERT INTO nodes VALUES ('(7,8)', 'e');
INSERT 207376 1
INSERT INTO nodes VALUES ('(11,10)', 'f');
INSERT 207377 1
INSERT INTO nodes VALUES ('(101,11)', 'g');
INSERT 207378 1

explain select * from nodes where node @ '((1,1),(3,3))'::box;
NOTICE:  QUERY PLAN:
Seq Scan on nodes  (cost=0.00..22.50 rows=500 width=28)

So create an RTREE index to help...but predictably:

CREATE INDEX test_rtree ON nodes USING RTREE (node);
ERROR:  DefineIndex: type point has no default operator class

I can do something like:

CREATE INDEX test_rtree ON nodes USING RTREE (box(node,node));
CREATE

but then:

explain select * from nodes where node @ '((1,1),(3,3))'::box;
NOTICE:  QUERY PLAN:
Seq Scan on nodes  (cost=0.00..1.09 rows=4 width=28)

and even:

explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box;
NOTICE:  QUERY PLAN:
Seq Scan on nodes  (cost=0.00..1.10 rows=1 width=28)

Thanks for any help

Julian Scarfe






Re: [SQL] RTREE on points

2001-04-16 Thread Julian Scarfe

Julian Scarfe wrote:
> >
> > explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box;
> > NOTICE:  QUERY PLAN:
> > Seq Scan on nodes  (cost=0.00..1.10 rows=1 width=28)

From: "Jeff Hoffmann" <[EMAIL PROTECTED]>

> this should work, assuming you have enough points to make a difference
> (in the optimizer's mind, at least).  the optimizer still doesn't do a
> great job of knowing when it's best to use an index, although, in your
> sample, there's no way it would ever be cheaper to use an index.
> there's simply not enough data there.  you can test to see if an index
> can be used by a query by shutting off the sequential scans (set
> enable_seqscan=off) and retrying the query.  essentially, this forces it
> to use an index scan if at all possible.

And indeed it does, thank you, Jeff:

# set enable_seqscan=off;
SET VARIABLE
# explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box;
NOTICE:  QUERY PLAN:
Index Scan using test_rtree on nodes  (cost=0.00..2.02 rows=1 width=28)

It hadn't occurred to me that the index would simply not be used, and I'm
grateful for the pointer to the appropriate variable.

Nevertheless, wouldn't...

CREATE INDEX test_rtree ON nodes USING RTREE (node);
(which fails)

...be a lot simpler than...

CREATE INDEX test_rtree ON nodes USING RTREE (box(node,node));
(which succeeds, as above)

?

The latter feels contorted and possibly inefficient.  After all, I don't
do...:

 CREATE TABLE "nodes" (
"node" point,
"node_name" character varying(30)
);

INSERT INTO nodes VALUES ('(1,1)', 'a');
INSERT INTO nodes VALUES ('(1,2)', 'b');
INSERT INTO nodes VALUES ('(3,2)', 'c');
INSERT INTO nodes VALUES ('(5,4)', 'd');
INSERT INTO nodes VALUES ('(7,8)', 'e');
INSERT INTO nodes VALUES ('(11,10)', 'f');
INSERT INTO nodes VALUES ('(101,11)', 'g');

CREATE INDEX test_btree ON nodes USING BTREE (textcat(node_name,node_name));

...if I want to index by name? (even though in principle it would work)
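
Rather, I'd simply write something like:

CREATE INDEX test_name_btree ON nodes USING BTREE (node_name);

and I'd hoped to be able to do the same sort of thing directly for the point
column.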

Thanks for any guidance.

Julian Scarfe






[SQL] Indexes with LIKE

2002-07-13 Thread Julian Scarfe

I've noticed a difference in behaviour between a server running version 6.x
and one running 7.2.1 in their use of indexes with LIKE.

With an index on foo,

WHERE foo LIKE 'bar%'

uses the index (as I would expect) on the 6.x box, but a sequential scan
(really slow) on the 7.2.1 box.  It's possible that I've set (or failed to
set) a run-time parameter that controls this behaviour, but I don't know
what that might be.  ANALYZEing the table doesn't change the behaviour.

I can't find any mention of this in the documentation -- any pointers would
be appreciated.

If the above doesn't ring any bells, I'll put together an example.
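
In the meantime, the sort of test case I have in mind is simply this (names
made up, and the real table is much larger):

CREATE TABLE liketest (foo text);
-- ... load a realistic amount of data ...
CREATE INDEX liketest_foo_idx ON liketest (foo);
VACUUM ANALYZE liketest;

EXPLAIN SELECT * FROM liketest WHERE foo LIKE 'bar%';
-- 6.x box:   index scan using liketest_foo_idx
-- 7.2.1 box: sequential scan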

Many thanks

Julian Scarfe






Re: [SQL] Indexes with LIKE

2002-07-15 Thread Julian Scarfe

From: "Stephan Szabo" <[EMAIL PROTECTED]>

> You need to have made the database in C locale in order to get index scans
> from LIKE.  I think that's mentioned in the Localization section of the
> admin guide, but I could be remembering that wrong.

Thanks very much Stephan.  Indeed it's in Admin Guide 5.1.2. I was using
en_GB, not C.

There is of course no excuse for failing to read every bit of small print in
the admin guide before installation :-) but for those lazy unfortunates like
me who don't think to look there to find out what's wrong when indexes
aren't used with LIKE, I've added a DocNote under Pattern Matching.
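
For anyone else who finds this thread: a quick way to tell "the planner is
choosing not to use the index" from "the planner cannot use the index for
LIKE at all" seems to be (table name made up):

SET enable_seqscan = off;
EXPLAIN SELECT * FROM mytable WHERE foo LIKE 'bar%';
SET enable_seqscan = on;

If the plan is still a sequential scan even with seqscans disabled, the
planner has no index strategy for the LIKE clause (the locale issue); if it
switches to an index scan, the seq scan was just a cost-based choice.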

Julian Scarfe






[SQL] Ordering with GROUPs

2002-08-18 Thread Julian Scarfe

I'm struggling to find an appropriate efficient query for an aggregate-type
problem and I'd appreciate suggestions.

I have messages associated with a set of locations (zero or more messages
per location).  A cut down test is as follows:

CREATE TABLE location (ident char(4), node point);
INSERT INTO location values ('AAAA', '(1,1)');
INSERT INTO location values ('BBBB', '(1,2)');
INSERT INTO location values ('CCCC', '(2,1)');
INSERT INTO location values ('DDDD', '(2,2)');

CREATE TABLE message (ident char(4), message_text text);
INSERT INTO message values ('AAAA', 'foo');
INSERT INTO message values ('AAAA', 'bar');
INSERT INTO message values ('AAAA', 'baz');
INSERT INTO message values ('BBBB', 'abel');
INSERT INTO message values ('BBBB', 'baker');
INSERT INTO message values ('CCCC', 'charlie');

so each message is associated with a location as follows -- here's the
natural join

SELECT location.ident, node, message_text from location, message
WHERE location.ident = message.ident;
 ident | node  | message_text
-------+-------+--------------
 AAAA  | (1,1) | foo
 AAAA  | (1,1) | bar
 AAAA  | (1,1) | baz
 BBBB  | (1,2) | abel
 BBBB  | (1,2) | baker
 CCCC  | (2,1) | charlie
(6 rows)

What I want is to know how many messages are available for each location.
It's easy if I just want the ident:

SELECT location.ident, count(*) from location, message
WHERE location.ident = message.ident
GROUP BY location.ident;
 ident | count
-------+-------
 AAAA  |     3
 BBBB  |     2
 CCCC  |     1
(3 rows)

But I'd like to return the "node" in the same query.  If I try:

SELECT location.ident, node, count(*) from location, message
WHERE location.ident = message.ident
GROUP BY location.ident;
ERROR:  Attribute location.node must be GROUPed or used in an aggregate
function

it obviously fails.  If node were an integer I could just use an aggregate
like max() or similar, but it's not, and there's no suitable aggregate for
point.  I can create a trivial one as a workaround, but I hope I don't have
to.
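
For what it's worth, the trivial aggregate I have in mind would be something
like this (untested, and the names are mine):

CREATE FUNCTION point_keep_first (point, point) RETURNS point AS
'select case when $1 is null then $2 else $1 end'
LANGUAGE 'sql';

CREATE AGGREGATE first_point ( BASETYPE = point, SFUNC = point_keep_first,
                               STYPE = point );

i.e. it just hangs on to the first node it sees in each group.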

But if I do it properly, it requires an ordering operator:

SELECT location.ident, node, count(*) from location, message
WHERE location.ident = message.ident
GROUP BY location.ident, location.node;
ERROR:  Unable to identify an ordering operator '<' for type 'point'
Use an explicit ordering operator or modify the query

So how do I specify the operator for GROUP BY?  If I compare with ORDER BY
(>> operator is "is right of" for type point):

SELECT location.ident, node from location, message
WHERE location.ident = message.ident
ORDER BY location.node USING >>, location.ident;
 ident | node
-------+-------
 CCCC  | (2,1)
 AAAA  | (1,1)
 AAAA  | (1,1)
 AAAA  | (1,1)
 BBBB  | (1,2)
 BBBB  | (1,2)
(6 rows)

...and use the same construction, I get a parser error...

SELECT location.ident, node from location, message
WHERE location.ident = message.ident
GROUP BY location.node USING >>, location.ident;
ERROR:  parser: parse error at or near ","

I can't find anything in the doc.

One alternative is to use a nested query:

SELECT location.ident, node, (
   select count(*) from message
   WHERE location.ident = message.ident
) as count
FROM location;
 ident | node  | count
-------+-------+-------
 AAAA  | (1,1) |     3
 BBBB  | (1,2) |     2
 CCCC  | (2,1) |     1
 DDDD  | (2,2) |     0
(4 rows)

For the test data that works fine, but for my real-life situation the nested
query seems to be very inefficient, taking vastly longer than the first
query illustrated above.  Since the information required is clearly
contained in the result of that query, it seems unfortunate, to say the
least, that I can't work the GROUP BY mechanism on it to give me what I want.

Any tips please?

Thanks

Julian Scarfe







Re: [SQL] Ordering with GROUPs

2002-08-19 Thread Julian Scarfe

From: "Tom Lane" <[EMAIL PROTECTED]>

> ISTM the problem here is the lack of any ordering operator for POINT,
> which defeats GROUP BY, *plus* the lack of any aggregate you might use
> for an aggregate-based solution.  This is not really a language failing
> but a problem with an impoverished datatype.

Yes, I agree completely.  If I were doing this again from scratch I'd be
using PostGIS, but I've got a lot of data that depends on POINT.

> So, if you don't like
> Bruno's subselect-based workaround, the dummy aggregate seems the way
> to go.

I've actually implemented the dummy aggregate now, and it works fine.  So
does Bruno's subselect (thank you Bruno), and the efficiency seems to be
similar in each case.
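
For the archives, with that aggregate in place (a pick-the-first-value
aggregate over point, first_point in the sketch in my earlier message) the
query simply becomes:

SELECT location.ident, first_point(node) AS node, count(*)
FROM location, message
WHERE location.ident = message.ident
GROUP BY location.ident;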

> SQL99 contains a whole bunch of verbiage whose intent seems to be that
> if you GROUP BY a unique or primary-key column, you can reference the
> other columns of that table without aggregation (essentially, the
> system treats them as implicitly GROUP BY'd).  Sooner or later we'll
> probably get around to implementing that, and that would solve your
> problem as long as you declare location.ident properly.

That makes a lot of sense, though I imagine there are higher priorities.

Thanks for your help.

Julian Scarfe






Re: [SQL] sum(time) problem

2003-01-17 Thread Julian Scarfe
On 17/1/03 13:03, "Oliver Vecernik" <[EMAIL PROTECTED]> wrote:

> sport=# \d polar
>               Table "polar"
>  Column |           Type           | Modifiers
> --------+--------------------------+-----------
>  ts     | timestamp with time zone | not null
>  time   | time without time zone   |
>  sport  | integer                  | default 1
>  kcal   | integer                  |
> Primary key: polar_pkey
> 
> sport=# select * from polar limit 3;
>            ts           |   time   | sport | kcal
> ------------------------+----------+-------+------
>  2001-05-17 19:47:31+02 | 00:28:25 |     1 |  388
>  2001-05-18 17:08:11+02 | 00:42:36 |     1 |  595
>  2001-05-19 13:41:43+02 | 00:51:39 |     1 |  737
> (3 rows)
> 
> 
> I'd like to have the total amount of time:
> 
> sport=# select sum(time) from polar where extract(year from ts)=2001;
> ERROR:  Unable to select an aggregate function sum(time without time zone)
> 
> It seems I've chosen the wrong type. Or is there another solution?

Correct diagnosis.  You need the "interval" type, not the "time" type, for
your second field.  An interval is the difference between two timestamps,
for example the time between the start and the finish of a race.

If you check out the available aggregates with \da you'll find that you can
sum an interval, but not a time.
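
As a quick fix you can also convert on the fly -- something like this sketch
(from memory, so check that the cast is available in your version):

-- "time" quoted only to keep it distinct from the type name
SELECT sum("time"::interval)
FROM polar
WHERE extract(year from ts) = 2001;

but redefining the column as interval is the cleaner long-term answer.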

Julian Scarfe





Re: [SQL] TIME vs. TIMESTAMP data type

2003-02-06 Thread Julian Scarfe
On 6/2/03 11:04, "Ludwig Lim" <[EMAIL PROTECTED]> wrote:

>   Are there cases when a TIME data type is a better
> choice over the TIMESTAMP data type?

Surely this depends on the nature of the data that you want to represent?

If you're researching into sleep patterns and want to represent the times
each day that subjects say they tend to wake and/or fall asleep, you may
want the TIME type, as the important aspect is the time, not the date.  The
inclusion of a date would be nonsensical.

If you want to record *when* an event occurred, you usually want date and
time, so TIMESTAMP is more appropriate.
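
In schema terms, the two cases look something like this (made-up examples):

-- the time of day is the datum of interest:
CREATE TABLE sleep_pattern (subject_id integer, usual_wake_time time);

-- the moment an event occurred is the datum of interest:
CREATE TABLE event_log (event_id integer, occurred_at timestamp with time zone);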

Since event timing is a much more frequent requirement than a time-of-day,
it's not surprising that the facilities may be better developed for dealing
with that type.

Julian Scarfe





[SQL] Ordering in an aggregate -- points to paths

2003-06-15 Thread Julian Scarfe
OK, I know relying on ordering in an aggregate is sinful, but I don't know
if it's mortal or venial.

Long explanation, please bear with me, here's the background:

---
CREATE TABLE "foo" (
"a" point,
"b" int
);
INSERT INTO ...

SELECT * FROM foo;
    a    | b
---------+---
 (1,1)   | 1
 (1,2)   | 3
 (1.5,3) | 5
 (4,4)   | 2
 (-1,-2) | 4
(5 rows)

---

So I want to create paths from the points in field a ordered by b.  The
first step is the rather laborious construction of an append_point function
(am I missing something, BTW? -- seems like an obvious function to have but
I couldn't find a built-in).

---
CREATE FUNCTION "path" (point) RETURNS path AS 'select
path_add_pt(''[(0,0)]''::path,$1)' LANGUAGE 'sql';


CREATE FUNCTION "append_point" (path,point) RETURNS path AS 'select case
WHEN $1 is null THEN path($2)
WHEN $2 is null THEN $1
ELSE path_add($1,path_add_pt(''[(0,0)]''::path,$2))
END' LANGUAGE 'sql';

---

and then the aggregate itself follows in the obvious way

---
CREATE AGGREGATE create_path ( BASETYPE = point, SFUNC = append_point,
                               STYPE = path );

SELECT create_path(a) FROM foo;
             create_path
-------------------------------------
 [(1,1),(1,2),(1.5,3),(4,4),(-1,-2)]
(1 row)

---

and moreover, with subselect for ordering following examples from this
mailing list

---
SELECT create_path(c.a) FROM (SELECT a FROM foo ORDER BY b) c;
             create_path
-------------------------------------
 [(1,1),(4,4),(1,2),(-1,-2),(1.5,3)]
(1 row)

---

So far so good. Now for the real data.  The points are an ordered (by
seq_no) set of latitude, longitude pairs defining a "fir" (flight
information region), the boundary of a region on the surface of the earth.
(fir_ident, fir_indicator, seq_no) is unique.

---
CREATE TABLE "fir_coords" (
"node" point,
...
"fir_ident" character(4),
"fir_indicator" character(4),
"seq_no" character(4),
);

SELECT c.fir_ident, c.fir_indicator, create_path (c.node) AS fir_edge
   INTO fir_e
   FROM
   (SELECT fir_ident, fir_indicator, node
   FROM fir_coords
   ORDER BY fir_ident, fir_indicator, seq_no) c
GROUP BY fir_ident, fir_indicator;
---

The fir_e table should contain the paths for the firs, and for simple shapes
(a few dozen points) it works fine.

But the problem is that e.g. Austria's fir is defined by 1577 points, and
the path that I construct appears to be in the wrong order.

foo=# SELECT fir_ident, fir_indicator, seq_no, node  FROM fir_coords WHERE
fir_ident = 'LOVV'  LIMIT 10;
 fir_ident | fir_indicator | seq_no |                 node
-----------+---------------+--------+---------------------------------------
 LOVV      | B             | 0005   | (0.241534175928771,0.851255253839368)
 LOVV      | B             | 0010   | (0.241844456684681,0.851240709428934)
 LOVV      | B             | 0015   | (0.242135344893347,0.851167987376768)
 LOVV      | B             | 0020   | (0.242368055460279,0.851022543272435)
 LOVV      | B             | 0025   | (0.242571677206345,0.850935276809835)
 LOVV      | B             | 0030   | (0.242862565415011,0.850717110653336)
 LOVV      | B             | 0035   | (0.24312436480281,0.850528033317703)
 LOVV      | B             | 0040   | (0.243327986548876,0.850368044802937)
 LOVV      | B             | 0045   | (0.243560697115809,0.850324411571637)
 LOVV      | B             | 0050   | (0.243633419167975,0.850208056288171)

whereas my path fir_edge looks like:


((0.268140750748062,0.854920445268556),(0.195244165656432,0.819810238482603)
,(0.238688319620658,0.812014434490362),(0.286597607587902,0.82373722929959),
(0.184975811890532,0.817861287484543),(0.192917059987107,0.816959534037679),
(0.244753338771338,0.849015414632642),(0.298204047113664,0.838528894710242),
(0.277478262246232,0.852418806674031),...

The ordering has gone awry. And since I'm going to draw the fir by 'joining
the dots' that's a Bad Thing.

Is this a problem with my functions, or is there something going on in the
internals that makes it dangerous to rely on the
ordered-subselect-with-aggregate construction above?

Thanks

Julian Scarfe





Re: [SQL] Ordering in an aggregate -- points to paths

2003-06-16 Thread Julian Scarfe
From: "Tom Lane" <[EMAIL PROTECTED]>

> Yeah, this is a fairly obvious thing to want to do with a user-written
> aggregate.  It does not work in released versions, because the planner
> does not notice that the inner SELECT's output ordering matches what
> the GROUP BY needs, and so it inserts an additional Sort plan step
> above the sub-select (you can see this if you look at EXPLAIN output).
> Unfortunately, on most platforms qsort() isn't stable and will not
> preserve the ordering of its input for equal keys.  So you lose the
> minor ordering by seq_no in the re-sort.

Most grateful for the rapid response, Tom.  Knowing that, I can work around
it by iterating through the firs at the application level.
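
That is, something like this per fir, with the application then appending the
points into a path itself:

SELECT node
FROM fir_coords
WHERE fir_ident = 'LOVV' AND fir_indicator = 'B'
ORDER BY seq_no;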

Regards

Julian Scarfe

PS: you shouldn't be working on a Sunday, it's bad for you ;-)






[SQL] Negative lookbehind assertions in regexs

2005-08-29 Thread Julian Scarfe

I'd like a regex that matches 'CD' but not 'ABCD', wherever it occurs in the
string being matched.

In Perl I'd use a negative lookbehind assertion, (?<!AB)CD, but that doesn't
appear to be supported in PostgreSQL's regular expressions.  Is there an
equivalent?

Julian






Re: [SQL] Negative lookbehind assertions in regexs

2005-09-03 Thread Julian Scarfe
I'd like a regex that matches 'CD' but not 'ABCD', wherever it occurs in the
string being matched.


From: "Bruno Wolff III" <[EMAIL PROTECTED]>


Something like:
(^.?CD)|([^B]CD)|([^A]BCD)


Thanks to Bruno, and to Dawid who replied offline.  The above does the job 
nicely.
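
For the archives, it can be sanity-checked with something like (sample
strings made up):

SELECT s, s ~ '(^.?CD)|([^B]CD)|([^A]BCD)' AS matches
FROM (SELECT 'xxCDxx'::text AS s
      UNION ALL SELECT 'xxABCDxx'
      UNION ALL SELECT 'CDxx') t;
-- 'xxCDxx' and 'CDxx' should match; 'xxABCDxx' should not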


Any plans for a Perl Compatible Regular Expression operator? 
http://www.pcre.org/

Or are two regex operators enough?

Julian






[SQL] NULL becomes default

2006-08-17 Thread Julian Scarfe

A surrogate key has been introduced on a table with modifiers:

 Column |  Type   |                         Modifiers
--------+---------+------------------------------------------------------------
 new_id | integer | not null default nextval(('some_id_seq'::text)::regclass)


However, some legacy software inserts into the table by first getting column 
names and then composing an INSERT for all the columns, filling in any 
undefined values with NULLs.
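
In effect, the legacy code ends up issuing something like this (the table and
column names other than new_id are made up):

INSERT INTO the_legacy_table (new_id, payload) VALUES (NULL, 'some value');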


Thus when it attempts to insert a row I get a:

ERROR:  null value in column "new_id" violates not-null constraint

So I would like NULLs in such INSERTs to be treated as DEFAULT rather than
violating the not-null constraint.  Is there an easy way to do that at the
database level?


Thanks

Julian





Re: [SQL] NULL becomes default

2006-08-20 Thread Julian Scarfe

Thus I would like NULLs in such INSERTs to be treated as DEFAULT rather
than violating the constraint.  Is there an easy way to do that at the
database level?


From: "Markus Schaber" <[EMAIL PROTECTED]>


Did you try a "before insert" trigger that checks new_id for null
values, and replaces it with nextval()?


So, thinking about this a little more, it requires me to be able to set a
column value to DEFAULT in the trigger (unless I implement the default from
scratch there, which feels messy).  I don't have much experience of
triggers: can I do that?
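
For concreteness, the "from scratch" version I'm trying to avoid would
presumably look something like this (untested sketch; the table name is made
up, and it assumes plpgsql is installed):

CREATE OR REPLACE FUNCTION fill_new_id() RETURNS trigger AS $$
BEGIN
    -- re-implement the column default by hand when the legacy app sends NULL
    IF NEW.new_id IS NULL THEN
        NEW.new_id := nextval('some_id_seq');
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER fill_new_id_trg
    BEFORE INSERT ON the_legacy_table
    FOR EACH ROW EXECUTE PROCEDURE fill_new_id();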


Thanks

Julian 



