[SQL] Utility of recursive queries?

2004-04-09 Thread James Robinson
Would recursive queries be the trick to doing things like unwinding a 
linked-list to either the head or tail, with:

create table list (
id int primary key,
parent int references list(id)
);
insert into list values (1, null); -- head of chain
insert into list values (2, 1);    -- first child
insert into list values (3, 2);    -- second child
Given a reference to id=3, would a recursive query be the trick to 
unrolling the list to discover id=1 as the head using a SQL one-liner? 
Is discovery possible in straight SQL w/o resorting to stored 
procedures (or modifying the table schema to point directly)? And, 
finally, would any potential recursive query implementation be 
noticeably more efficient than a straightforward implementation in 
plpgsql, such as:

create or replace function find_head(int) returns int as '
DECLARE
    cur_par INT;
    prev_par INT;
BEGIN
    prev_par := $1;
    SELECT parent INTO cur_par FROM list WHERE id = $1;
    WHILE cur_par IS NOT NULL LOOP
        prev_par := cur_par;
        SELECT parent INTO cur_par FROM list WHERE id = prev_par;
    END LOOP;
    RETURN prev_par;
END;
' language 'plpgsql';
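[ For the record: this is exactly the case WITH RECURSIVE handles; it 
arrived in PostgreSQL 8.4, well after this post. A sketch against the 
list table above: ]

```sql
-- Walk parent pointers from id = 3 back to the row whose parent is null.
WITH RECURSIVE chain AS (
    SELECT id, parent FROM list WHERE id = 3
    UNION ALL
    SELECT l.id, l.parent
    FROM list l
    JOIN chain c ON l.id = c.parent
)
SELECT id FROM chain WHERE parent IS NULL;  -- the head, id = 1
```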

James Robinson
Socialserve.com
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


[SQL] Prepared Statements and large where-id-in constant blocks?

2004-04-19 Thread James Robinson
Howdy:

	Java middlewares like JBossCMP issue many queries in the general form 
of:

		SELECT t1.att1, t1.att2 ... t1.attN FROM t1 WHERE (t1.id = X) or 
(t1.id = Y) 

where there may be anywhere between 1 and thousands of "(id = N)" 
blocks ORed together. These may be transformed to the "WHERE t1.id IN 
(X, Y, ...)" form for possibly a little performance gain (possibly -- 
I've not yet checked to see if this plans better than the other, but I 
could imagine this form being parsed into the hashjoin form as opposed 
to a huge index filter form).

	Real performance gains, however, could be attained through being able 
to ultimately use either the v2 PREPARE / EXECUTE statements or the 
better v3 protocol's bind / execute commands, but only if the SQL-level 
form of the query could better represent the fact there are not really 
N params, but, rather, one single param of type Set (or, heck, Array?). 
This would let all such queries map onto one single backend prepared 
statement, regardless of the size of the id set being passed in.
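[ One SQL-level shape that collapses the whole family onto a single 
prepared statement is an array parameter tested with = ANY; a sketch, 
statement name made up -- whether the planner does anything clever 
with it is another question (see the follow-up post): ]

```sql
-- One prepared statement serves any cardinality of ids.
PREPARE fetch_by_ids(int[]) AS
    SELECT t1.att1, t1.att2 FROM t1 WHERE t1.id = ANY ($1);

EXECUTE fetch_by_ids(ARRAY[1, 2, 3]);
```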

	I guess that separate preparation for each different cardinality would 
be okay performance-wise, but if there were some way to get all such 
queries factored-down into one single planned statement, it could:

1) Make far better use of JBoss's connection-wide LRU cache of 
PreparedStatements, since only one entry (with much higher cache hit 
rate) could exist for the entire family of queries.

2) Make better use of backend memory, since it only needed to prepare 
one such (generic) form, as opposed to one for each cardinality.

Problems in implementation:

	1) JBoss query engine would need to be educated about the potential to 
use this form as opposed to the "OR (t1.id=X)" form. Likewise, JBoss 
could / should well be educated about being able to use the "WHERE 
t1.id IN (X, Y, ...)" form for databases which support "WHERE .. IN ( 
.. )", probably an easier sell since this is most likely supported by 
more DBs than just PG.

	2) Does the JDBC spec allow any hooks for passing in a set of ids as 
one single param? We'd need the SQL-template to be prepared to look 
something like:

	SELECT t1.attr1 FROM t1 where t1.id in ( ? )

From memory, perhaps setArray() could be hacked for the job. I know 
JBossCMP uses the setObject() call, so perhaps the JDBC driver could 
be taught to handle a java.util.Collection, an arguably cleaner way to 
do it -- no backward-compat issues, since it would be all-new 
functionality. The JDBC driver could just iterate through the 
collection contents, calling setObject accordingly. The harder part 
would be educating JBoss to do this. The hardest part would be 
convincing someone to commit it into JBoss.

	3) Can a cardinality-free plan even be made? I bet I'm assuming a 
little too much in asserting all such plans are equal, but I suspect 
that Tom is going to tell me that the query for just one id value would 
and should be planned differently from the 800-value form, since the 
800-value form might well prefer a full sequential scan, since the 
table might only have 900 live rows in it.

Anyone have any insight or opinions?

[ crossposted to pgsql-sql for anyone's insight into the pure SQL / 
planning matters. Apologies in advance ].


James Robinson
Socialserve.com


Re: [SQL] [JDBC] Prepared Statements and large where-id-in constant blocks?

2004-04-20 Thread James Robinson
On Apr 19, 2004, at 10:57 PM, Oliver Jowett wrote:

Unfortunately a bit of experimentation indicates that the planner 
doesn't do anything clever with ANY + constant array values (at least 
in 7.4.1 which is what I have to hand):
Not only that, but it seems to get planned only as a sequential scan. 
Preparing the statement using "SELECT ... WHERE id = ANY (?)" plus a 
call to a really-hacked-up version of setObject() would solve the 
issue of getting better use out of fewer cached prepared statements, 
but the sequential-scan-only planning would be a downer.

And since "OR (id=N)" plans exactly like "id IN (N)", there seems to 
be nothing really worth doing there. I examined plans comparing a 
6-way join used in our production code with 4 ids in the tail 
"OR (id=N)" nodes, then with the full 723 ids, and the plans were 
markedly different, preferring sequential scans for many of the 
intermediate table joins in the 723-id case. The runtimes for the 
sequential scans were faster than forcing index scans, so the 
planner's voodoo definitely benefits from full knowledge of how many 
rows should be expected (which appears crystal clear in hindsight). 
So, I doubt that any single server-side preparation using a single 
parameter representing an entire collection of ids could perform as 
well as it does currently with full information.

Oliver, I tested your proposal of providing 
more-id-params-than-necessary, passing in a dummy value (-1) which 
will never be found as a pk in that table, and the planner handled it 
efficiently. The repeated instances of "or u.id=-1::int8" were, when 
not pre-planned using PREPARE, nicely crunched down to a single index 
condition clause of "OR (id = -1::BIGINT)". But when transforming this 
to PREPARE and EXECUTE pairs, the planner cannot crunch the plan down, 
since it has no idea that, say, 500 of the 700 params will all be the 
same, so it cannot factor them out at planning time (hence, I guess, 
the warning about constant values in the notes section of the manual 
page for PREPARE).

All roads seem to lead to not attempting to change a thing -- there is 
no easy or even medium-difficulty way to solve this better.

In writing this, I went so far as to think about shunting the list of 
ids into a temporary table to join against. Doing this at the SQL 
level would be far too costly in round-trip times, but could the 
"WHERE id in (A, B, C, ...)" form somehow be transformed into a 
hashjoin operation on the backend when the size of the static set is 
'high'? Could this not perform (theoretically) better than what 
appears to be an O(N) index condition evaluation? I am asking only for 
personal edification -- I have no live sample query where the index 
condition solution performs too slowly. In Java-land, if we suppose 
the size of a set could potentially be 'large', we quickly defer to 
holding the values in a HashSet if we're going to test for membership, 
as opposed to performing linear searches on a list. Probably a dead 
horse beaten elsewhere.
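[ One way to get exactly that hash-style behavior at the SQL level, 
without the temporary-table round-trip, is to join against a VALUES 
list -- allowed in the FROM clause as of 8.2 -- which the planner can 
hash and probe; a sketch: ]

```sql
-- The VALUES list can be hashed and probed, rather than evaluated
-- as an O(N) chain of index conditions.
SELECT t1.*
FROM t1
JOIN (VALUES (1), (2), (3)) AS v(id) ON t1.id = v.id;
```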

Many Thanks.


James Robinson
Socialserve.com


Re: [SQL] Design question: Scalability and tens of thousands of tables?

2005-11-04 Thread James Robinson


On Nov 4, 2005, at 9:47 AM, [EMAIL PROTECTED] wrote:

The problem is I am very concerned about scalability with having a 
different table created for each custom object. I want to design the 
site to handle tens of thousands of users. If each user has 3-5 custom 
objects the database would have to handle tens of thousands of tables.

So it would appear that this is a broken solution and will not scale. 
Has anyone designed a similar system or have ideas to share?


Well, before you discount it, did you try out your design? You could 
do initial segregation of users' tables into separate schemas (say, 
schema 'a' -> 'z' according to username, or some better hashing 
routine like brute-force round-robin assignment at user creation 
time). Assignment of objects -> schema would be one additional column 
in your centralized user directory table(s).
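A minimal sketch of the bucketing idea (all names hypothetical):

```sql
-- One schema per hash bucket, chosen at user-creation time.
CREATE SCHEMA bucket_a;
CREATE TABLE bucket_a.custom_object_1234 (
    attr  text,
    value text
);

-- The central directory records where each user's objects live.
ALTER TABLE users ADD COLUMN schema_name text;  -- e.g. 'bucket_a'
```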



James Robinson
Socialserve.com




[SQL] 8.4 versus 8.2 against nonexistent column "name" ...

2010-03-17 Thread James Robinson
Can anyone shine a light on why 8.4 behaves surprisingly when queried 
for a bogus column "name", but only when using qualified table/column 
references?


Here's a sample script:

-
create table foo
(
id int,
x int
);

insert into foo(id, x)
values
(1, 23),
(2, 43),
(4, 45);

-- Fails on both -- no column named 'name'
select name from foo;

-- Fails on 8.2 -- still no 'name' column -- but 8.4 succeeds,
-- returning whole rows.
select f.name from foo f;


On 8.2.11, both selects fail:

CREATE TABLE
INSERT 0 3
ERROR:  column "name" does not exist
LINE 1: select name from foo;
  ^
ERROR:  column f.name does not exist
LINE 1: select f.name from foo f;
  ^


On 8.4.2, the first select fails, but the second succeeds, returning  
whole rows wrapped up like tuples:


CREATE TABLE
INSERT 0 3
ERROR:  column "name" does not exist
LINE 1: select name from foo;
  ^
  name
--------
 (1,23)
 (2,43)
 (4,45)
(3 rows)

A quick skim through the 8.3 and 8.4 release notes found nothing 
interesting related to 'name'.
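[ My best reading of the 8.4 behavior, offered tentatively: when 
attribute notation fails to find a column, it falls back to functional 
notation, so f.name is taken as name(f), and 8.4's new I/O-conversion 
casts allow a composite row to be cast to type "name" -- hence the 
whole rows rendered as text. These should all be equivalent there: ]

```sql
select f.name from foo f;   -- resolved as name(f)
select name(f) from foo f;  -- i.e. cast the whole row...
select f::name from foo f;  -- ...to type "name", giving '(1,23)' etc.
```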


Thanks!

James Robinson
Socialserve.com




[SQL] 9.0+ way of determining if a LISTEN channel has a backend listening?

2011-06-01 Thread James Robinson
Is there a way in post-9.0 Postgres for one session to determine 
whether a LISTEN channel currently has any backend registered to 
receive notifications? Prior to 9.0, one could select from the system 
catalog table pg_listener.


The function pg_listening_channels() returns the list of channel names 
*this* session is listening on; I'd like to find out whether a given 
listen channel has *any* (i.e. other sessions') listeners.
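[ One workaround sketch, not a built-in facility: have each listener 
also take a session-level shared advisory lock keyed by convention to 
the channel, and let other sessions probe pg_locks. The key values 
here are made up: ]

```sql
-- In the listening session, alongside LISTEN:
SELECT pg_try_advisory_lock_shared(42, 0);  -- (42, 0): our convention for this channel
LISTEN my_channel;

-- In any other session, check whether anyone holds that lock:
SELECT EXISTS (
    SELECT 1 FROM pg_locks
    WHERE locktype = 'advisory'
      AND classid = 42 AND objid = 0 AND granted
);
```

The session-level lock is released automatically when the listening 
backend exits, so the probe tracks liveness.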


Thanks!

James Robinson
Socialserve.com




[SQL] SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an outer join

2006-10-06 Thread James Robinson

Given something like:

create table foo (id int primary key not null);
create table bar (id int primary key not null, a_id int references foo(id));
select a.id, b.id from foo a left outer join bar b on (b.a_id = a.id) for update;


PG 8.1.4 balks, saying:

SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an  
outer join


Is this behavior spec-mandated, or could just the non-null rows on the 
B side be locked?



James Robinson
Socialserve.com




Re: [SQL] SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an outer join

2006-10-06 Thread James Robinson

Oh, that's sweet, and all I needed anyway. Thanks.

On Oct 6, 2006, at 12:25 PM, Tom Lane wrote:

Note that you can select "FOR UPDATE OF a" in this situation, it's  
just

the B side that is problematic.
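[ Applied to the query above, Tom's suggestion would look something 
like this sketch against the earlier foo/bar schema: ]

```sql
select a.id, b.id
from foo a
left outer join bar b on (b.a_id = a.id)
for update of a;  -- lock only the foo rows; bar's nullable side stays unlocked
```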



James Robinson
Socialserve.com




[SQL] Subselects in CHECK clause ...

2006-11-29 Thread James Robinson
I see that subselects are not directly supported in check clauses,  
but one can work around that by writing a stored function which  
returns boolean and performs the subselect. Are there any known  
gotchas with doing this?


The CREATE TABLE docs regarding CHECK clauses states:

	"Currently, CHECK expressions cannot contain subqueries nor refer to  
variables other than columns of the current row."


Is this due to no one having yet written the necessary code to support 
subqueries there, or is there a more fundamental reason?
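[ The workaround in question looks something like this (table and 
function names hypothetical); the known gotcha is that nothing 
re-checks existing rows when the *other* table changes: ]

```sql
-- Wrap the subselect in a function so CHECK will accept it.
CREATE OR REPLACE FUNCTION color_exists(int) RETURNS boolean AS '
    SELECT EXISTS (SELECT 1 FROM colors WHERE id = $1);
' LANGUAGE sql STABLE;

ALTER TABLE items
    ADD CONSTRAINT items_color_check CHECK (color_exists(color_id));
```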


----
James Robinson
Socialserve.com




Re: [SQL] Subselects in CHECK clause ...

2006-11-29 Thread James Robinson
Gotcha. This is indeed to ensure a specialized ref-integrity bit, 
against a column which "ought not ever change". Once some more old 
code goes away, we can fully normalize this area, making this 
check-subselect bit go away, replaced by a regular FK.

Hmm -- we probably could even now make it a FK against a column which 
is not the primary key of the foreign table, and it'd work better and 
would ensure consistency against changes on the foreign-table side, eh?


Thanks!

On Nov 29, 2006, at 12:53 PM, Stephan Szabo wrote:

To completely get the constraint, you have to also apply  
constraints on
the tables referenced in the function that prevent modifications on  
those

tables from causing the constraint to be violated. For example, if you
were to do an exists test on another table for a row that matches  
up with

this row in some fashion (for a specialized referential integrity
constraint) modifications on that other table could also cause the
constraint to be violated, but that isn't caught by the CHECK
function(...) case and you'll probably need triggers or other  
constraints

on that table.



James Robinson
Socialserve.com




[SQL] Index to enforce non-overlapping ranges?

2008-05-08 Thread James Robinson

Academic question here:

Given a table with a pair of any sort of line-segment-esque range 
delimiter columns, is it possible to build a unique index to enforce 
non-overlapping ranges? Such as:


create table test
(
id int not null primary key,
low_value int not null,
high_value int not null
);

Can one build an index to enforce a rule such that no (low_value, 
high_value) range is identical to or overlaps another (low_value, 
high_value) range described by the table? And, more interestingly, 
what about ranges of dates / timestamps as opposed to simple integers?


I can see how a trigger on insert or update could enforce such a 
constraint [ probe the table for an existing overlapping row, and 
raise an exception if one exists ], but can such an activity be 
performed with fewer lines using some sort of r-tree index?
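[ The trigger version, for reference -- a sketch, and it still needs 
locking care under concurrent inserts, since two uncommitted rows 
cannot see each other. (In later releases, 9.0's exclusion constraints 
over GiST make this declarative.) ]

```sql
CREATE OR REPLACE FUNCTION reject_overlap() RETURNS trigger AS '
BEGIN
    -- Half-open overlap test: [low, high) ranges collide when each
    -- starts before the other ends.
    IF EXISTS (SELECT 1 FROM test t
               WHERE t.id <> NEW.id
                 AND t.low_value < NEW.high_value
                 AND NEW.low_value < t.high_value) THEN
        RAISE EXCEPTION ''range [%, %) overlaps an existing row'',
            NEW.low_value, NEW.high_value;
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER test_no_overlap
    BEFORE INSERT OR UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE reject_overlap();
```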



James Robinson
Socialserve.com

