Re: [HACKERS] 2PC transaction id
branch id: Branch Identifier. Every RM involved in the global transaction is given a *different* branch id.

Hm, I am confused then -- the XA spec definitely talks about enlisting multiple RMs in a single transaction branch. Can you explain?

I oversimplified a bit. The TM *can* enlist multiple threads of control (= connection in JTA) to the same transaction branch. That's called tightly-coupled threads, and they should then be treated as one local transaction in the RM. The calls will look like this:

    conn1.start(xid1, TMNOFLAGS);
    ...
    conn2.start(xid1, TMJOIN);
    ...
    conn1.end(xid1, TMSUCCESS);
    ...
    conn2.end(xid1, TMSUCCESS);
    connX.prepare(xid1);
    connX.commit(xid1, false);

conn1 and conn2 must share locks and see each other's changes, and they mustn't deadlock each other. The JDBC driver can implement this in a very straightforward way by using the same physical connection for both conn1 and conn2. Note that there's only one prepare, and it can be issued using any connection.

In your example above, couldn't conn1 and conn2 be running in two different JVMs? If so, your statement that 'the JDBC driver can implement this in a very straight-forward way by using the same physical connection' would not be true. I can't see a way for two JVMs (possibly even on different client machines) to share the same physical connection.

--Barry
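For readers approaching this from the JTA side, here is a minimal sketch of the tightly-coupled pattern described above, written against the standard javax.transaction.xa API. The two XAResource objects stand in for the resources behind conn1 and conn2; how they are obtained (e.g. from two XAConnections) is left out, and the Xid is assumed to come from the transaction manager.

    import javax.transaction.xa.XAException;
    import javax.transaction.xa.XAResource;
    import javax.transaction.xa.Xid;

    public class TightlyCoupledBranch {
        // Two XAResources enlisted in the SAME transaction branch (same Xid).
        static void runBranch(XAResource xa1, XAResource xa2, Xid xid1) throws XAException {
            xa1.start(xid1, XAResource.TMNOFLAGS);   // first association creates the branch
            // ... work through connection 1 ...
            xa2.start(xid1, XAResource.TMJOIN);      // second association joins the same branch
            // ... work through connection 2; must see conn1's changes and share its locks ...
            xa1.end(xid1, XAResource.TMSUCCESS);
            xa2.end(xid1, XAResource.TMSUCCESS);

            // Exactly one prepare/commit for the branch, issued through either resource.
            if (xa1.prepare(xid1) == XAResource.XA_OK) {
                xa1.commit(xid1, false);             // false = two-phase commit
            }
        }
    }

The point is the same one made in the thread: both associations use the same Xid, and only one prepare/commit pair is issued for the branch.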
Re: [HACKERS] We are not following the spec for HAVING without GROUP BY
On Oracle 9.2 you get 0, 0, 0, and 2 rows.

--Barry

SQL> create table tab (col integer);
Table created.
SQL> select 1 from tab having 1=0;
no rows selected
SQL> select 1 from tab having 1=1;
no rows selected
SQL> insert into tab values (1);
1 row created.
SQL> insert into tab values (2);
1 row created.
SQL> select 1 from tab having 1=0;
no rows selected
SQL> select 1 from tab having 1=1;

         1
----------
         1
         1

SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production
JServer Release 9.2.0.1.0 - Production

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Thursday, March 10, 2005 9:45 AM
To: pgsql-hackers@postgresql.org; pgsql-bugs@postgresql.org
Subject: Re: [HACKERS] We are not following the spec for HAVING without GROUP BY

I wrote: This is quite clear that the output of a HAVING clause is a grouped table no matter whether the query uses GROUP BY or aggregates or not.

What that means is that neither the HAVING clause nor the targetlist can use any ungrouped columns except within aggregate calls; that is,

    select col from tab having 2>1

is in fact illegal per SQL spec, because col isn't a grouping column (there are no grouping columns in this query). Actually, it's even more than that: a query with HAVING and no GROUP BY should always return 1 row (if the HAVING succeeds) or 0 rows (if not). If there are no aggregates, the entire FROM/WHERE clause can be thrown away, because it can have no impact on the result! Would those of you with access to other DBMSes try this:

    create table tab (col integer);
    select 1 from tab having 1=0;
    select 1 from tab having 1=1;
    insert into tab values(1);
    insert into tab values(2);
    select 1 from tab having 1=0;
    select 1 from tab having 1=1;

I claim that a SQL-conformant database will return 0, 1, 0, and 1 rows from the 4 selects --- that is, the contents of tab make no difference at all. (MySQL returns 0, 0, 0, and 2 rows, so they are definitely copying our mistake...)

regards, tom lane
Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
Tom, Your patch works for my test cases. Thanks to both you and Oliver for getting this fixed. --Barry -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Sunday, November 28, 2004 2:23 PM To: Oliver Jowett Cc: Barry Lind; [EMAIL PROTECTED] Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver Oliver Jowett [EMAIL PROTECTED] writes: Perhaps PerformCursorOpen should copy the query tree before planning, or plan in a different memory context? Patch attached. It moves query planning inside the new portal's memory context. With this applied I can run Barry's testcase without errors, and valgrind seems OK with it too. I think the better solution is the first way (copy the querytree first). The problem with the way you did it is that all the temporary structures built by the planner will be left behind in the cursor's memory context, and can't be reclaimed until the cursor is destroyed. In the case of a complex query that could represent a pretty serious memory leak. It seems better to eat the cost of copying the querytree an extra time, especially since this way forms a patch that's easy to reverse whenever we fix the planner to be less cavalier about scribbling on its input. I've applied the attached patch instead (and analogously in 7.4 branch). Would you confirm it fixes the problem you see? regards, tom lane *** src/backend/commands/portalcmds.c.orig Thu Sep 16 12:58:28 2004 --- src/backend/commands/portalcmds.c Sun Nov 28 17:02:22 2004 *** *** 62,73 RequireTransactionChain((void *) stmt, DECLARE CURSOR); /* * The query has been through parse analysis, but not rewriting or * planning as yet. Note that the grammar ensured we have a SELECT * query, so we are not expecting rule rewriting to do anything * strange. */ ! rewritten = QueryRewrite((Query *) stmt-query); if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query)) elog(ERROR, unexpected rewrite result); query = (Query *) linitial(rewritten); --- 62,82 RequireTransactionChain((void *) stmt, DECLARE CURSOR); /* +* Because the planner is not cool about not scribbling on its input, +* we make a preliminary copy of the source querytree. This prevents +* problems in the case that the DECLARE CURSOR is in a portal and is +* executed repeatedly. XXX the planner really shouldn't modify its +* input ... FIXME someday. +*/ + query = copyObject(stmt-query); + + /* * The query has been through parse analysis, but not rewriting or * planning as yet. Note that the grammar ensured we have a SELECT * query, so we are not expecting rule rewriting to do anything * strange. */ ! rewritten = QueryRewrite(query); if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query)) elog(ERROR, unexpected rewrite result); query = (Query *) linitial(rewritten); ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
Oliver, The patch works for me. Thanks. Things look good now against an 8.0 server. (I still have a lot more testing to do though). However I still have problems against a 7.4 server with the 8.0 jdbc driver. (ERROR: no value found for parameter 1). You mentioned that you had found this bug and fixed it in 8.0 of the server. Any chance of getting a backport? Or is my only option to run with protocolVersion=2 on the jdbc connection. Thanks, --Barry -Original Message- From: Oliver Jowett [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 24, 2004 1:38 AM To: Barry Lind Cc: Tom Lane; [EMAIL PROTECTED] Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver Oliver Jowett wrote: Perhaps PerformCursorOpen should copy the query tree before planning, or plan in a different memory context? Patch attached. It moves query planning inside the new portal's memory context. With this applied I can run Barry's testcase without errors, and valgrind seems OK with it too. -O ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
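For anyone needing the same workaround against a 7.4 server, here is a hedged sketch of the option Barry mentions: passing protocolVersion=2 as a driver connection parameter. The host, database and credentials below are placeholders taken from the trace elsewhere in this thread.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class V2Fallback {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "blind");          // placeholder credentials
            props.setProperty("password", "secret");
            props.setProperty("protocolVersion", "2");   // force the V2 wire protocol
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/fileswfs43", props);
            // ... use the connection as usual; V3-only features are unavailable ...
            conn.close();
        }
    }

The same setting should also be accepted as a URL parameter, e.g. jdbc:postgresql://host/db?protocolVersion=2.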
Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
Tom, Here is what you requested. (Thanks to Oliver for the good logging in the jdbc driver). I also have the test case (in java) down to the bare minimum that generated the following output (that test case is attached). (Note that if the FETCH in the test case is not executed then the backend crashes; with the FETCH you get an error: ERROR: unrecognized node type: 0) Thanks, --Barry PostgreSQL 8.0devel JDBC3 with SSL (build 308) Trying to establish a protocol version 3 connection to localhost:5432 FE= StartupPacket(user=blind, database=fileswfs43, client_encoding=UNICODE, DateStyle=ISO) =BE AuthenticationOk =BE ParameterStatus(client_encoding = UNICODE) =BE ParameterStatus(DateStyle = ISO, MDY) =BE ParameterStatus(integer_datetimes = off) =BE ParameterStatus(is_superuser = on) =BE ParameterStatus(server_encoding = UNICODE) =BE ParameterStatus(server_version = 8.0.0beta4) =BE ParameterStatus(session_authorization = blind) =BE BackendKeyData(pid=3348,ckey=914259969) =BE ReadyForQuery(I) compatible = 8.0 loglevel = 2 prepare threshold = 1 getConnection returning driver[className=org.postgresql.Driver,[EMAIL PROTECTED] simple execute, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl [EMAIL PROTECTED], maxRows=0, fetchSize=0, flags=0 FE= Parse(stmt=S_1,query=BEGIN,oids={}) FE= Bind(stmt=S_1,portal=null) FE= Execute(portal=null,limit=0) FE= Parse(stmt=S_2,query=DECLARE CUR CURSOR FOR SELECT 1,oids={}) FE= Bind(stmt=S_2,portal=null) FE= Describe(portal=null) FE= Execute(portal=null,limit=0) FE= Sync =BE ParseComplete [S_1] =BE BindComplete [null] =BE CommandStatus(BEGIN) =BE ParseComplete [S_2] =BE BindComplete [null] =BE NoData =BE CommandStatus(DECLARE CURSOR) =BE ReadyForQuery(T) simple execute, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl [EMAIL PROTECTED], maxRows=0, fetchSize=0, flags=0 FE= Parse(stmt=S_3,query=FETCH FORWARD 10 FROM CUR,oids={}) FE= Bind(stmt=S_3,portal=null) FE= Describe(portal=null) FE= Execute(portal=null,limit=0) FE= Sync =BE ParseComplete [S_3] =BE BindComplete [null] =BE RowDescription(1) =BE DataRow =BE CommandStatus(FETCH) =BE ReadyForQuery(T) simple execute, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl [EMAIL PROTECTED], maxRows=0, fetchSize=0, flags=0 FE= Parse(stmt=S_4,query=CLOSE CUR,oids={}) FE= Bind(stmt=S_4,portal=null) FE= Describe(portal=null) FE= Execute(portal=null,limit=0) FE= Sync =BE ParseComplete [S_4] =BE BindComplete [null] =BE NoData =BE CommandStatus(CLOSE CURSOR) =BE ReadyForQuery(T) simple execute, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl [EMAIL PROTECTED], maxRows=0, fetchSize=0, flags=0 FE= Bind(stmt=S_2,portal=null) FE= Describe(portal=null) FE= Execute(portal=null,limit=0) FE= Sync =BE BindComplete [null] =BE NoData =BE ErrorMessage(ERROR: unrecognized node type: 0 Location: File: clauses.c, Routine: expression_tree_mutator, Line: 3220 Server SQLState: XX000) java.sql.SQLException: ERROR: unrecognized node type: 0 Location: File: clauses.c, Routine: expression_tree_mutator, Line: 3220 Server SQLState: XX000 at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecu torImpl.java:1356) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImp l.java:1151) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java: 166) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme nt.java:363) at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdb 
c2Statement.java:308) at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme nt.java:299) at test80.main(test80.java:31) SQLException: SQLState(XX000) =BE ReadyForQuery(E) java.sql.SQLException: ERROR: unrecognized node type: 0 Location: File: clauses.c, Routine: expression_tree_mutator, Line: 3220 Server SQLState: XX000 -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Tuesday, November 23, 2004 7:10 AM To: Barry Lind Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver Barry Lind [EMAIL PROTECTED] writes: OK, getting closer. The error happens if in jdbc I reuse PreparedStatement objects to reexecute the same set of queries multiple times. The TRAP you showed looked like it might have something to do with trying to execute code outside any transaction. But I dunno how it got there. If you still need a test case, let me know, and I will continue to package up what I have been working on. What I'd actually rather have is a list of the exact sequence of messages sent to the server. regards, tom lane test80.java Description: test80.java ---(end of broadcast)--- TIP 3: if posting
FW: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
I have been unable to come up with a simple test case for this problem (yet). But here is some additional information. Today I setup a 7.4.6 and an 8.0.0beta5 on linux (RH9) and could also reproduce the problem. However there were some new twists. I now sometimes get the following error on 8.0: ERROR: cache lookup failed for function 18005 I did as Tom suggested and rebuilt with --enable-cassert and strangely that made the problem more difficult to reproduce. Once I finally was able to get the server to start having errors, I got the following interesting message in the log file: TRAP: FailedAssertion(!(serializable ? !((MyProc-xmin) != ((TransactionId) 0)) : ((MyProc-xmin) != ((TransactionId) 0))), File: sinval.c, Line: 767) I am going to try to continue to see if I can come up with a test case, but I wanted to pass this information on in case it might mean anything to anyone. Thanks, --Barry -Original Message- From: Barry Lind Sent: Friday, November 19, 2004 5:40 PM To: Kris Jurka Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver Kris, Environment #1: WinXP 8.0beta4 server, 8.0jdbc client I get random failures with the following errors: $ grep ERROR postgresql-2004-11-19_091524.log 2004-11-19 12:19:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:20:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:21:06 ERROR: unrecognized node type: 0 2004-11-19 12:22:06 ERROR: unrecognized node type: 0 2004-11-19 12:23:06 ERROR: unrecognized node type: 0 2004-11-19 12:24:06 ERROR: unrecognized node type: 0 2004-11-19 12:25:06 ERROR: unrecognized node type: 0 2004-11-19 12:26:06 ERROR: unrecognized node type: 0 2004-11-19 12:27:06 ERROR: unrecognized node type: 0 2004-11-19 12:28:06 ERROR: unrecognized node type: 653 2004-11-19 12:29:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:30 ERROR: unrecognized node type: 26 2004-11-19 12:31:06 ERROR: unrecognized node type: 0 2004-11-19 12:32:06 ERROR: unrecognized node type: 0 2004-11-19 12:33:06 ERROR: unrecognized node type: 0 2004-11-19 12:34:06 ERROR: unrecognized node type: 0 2004-11-19 12:35:06 ERROR: unrecognized node type: 0 2004-11-19 12:36:06 ERROR: unrecognized node type: 0 2004-11-19 12:37:06 ERROR: unrecognized node type: 0 Environment #2: Sun Solaris 7.4.3 server, 8.0jdbc client I get random failures with the following errors: ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 I am seeing errors in the two different environments where I am testing the 8.0 driver (which fully uses the V3 protocol), I don't have errors with the 7.4 driver (which only used basic V3 protocol features). I will work to try to come up with a smaller reproducable test case and repro in a linux environment where I can do more. Thanks, --Barry -Original Message- From: Kris Jurka [mailto:[EMAIL PROTECTED] Sent: Friday, November 19, 2004 3:57 PM To: Barry Lind Cc: [EMAIL PROTECTED] Subject: Re: [JDBC] Strange server error with current 8.0beta driver On Fri, 19 Nov 2004, Barry Lind wrote: During my testing with the 8.0 driver, I am occasionally getting failures. The strange thing is that a test will only fail 1 out of 10 times. 
The error I am getting from the server is: ERROR: unrecognized node type: 25344832 This type of error points to a bug in the server. It means a query plan is created that it can't process. A higher level node finds a lower level node that it doesn't expect. The variability of the test failure could be due to different plans being generated (note that V2 vs V3 can generate different plans because of string substitution vs a prepared query). The fact that the node number varies is a little suspicious, also because it's a very large value. Does it vary between a couple values or is it different every time. If it varies wildly then that could point to a memory overwrite instead of a bad plan being created, but in any case this problem is on the server side. Kris Jurka ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
OK, getting closer. The error happens if in jdbc I reuse PreparedStatement objects to reexecute the same set of queries multiple times. Specifically if I do the following set of queries: Declare cursor Close cursor Declare cursor Close cursor Declare cursor Close cursor Declare cursor Close cursor Declare cursor Close cursor After the 5th close the server will gpf on windows (if I add fetches between the declare and close then I don't get a gfp, but instead get ERROR: unrecognized node type: 0 I believe the current jdbc driver begins to use a named portal for reuse after five executions of the same jdbc PreparedStatement (until then it uses an unnamed portal), which would seem to jive with the fact that it errors after the fifth execution. (Oliver please correct me if I am wrong here). If you still need a test case, let me know, and I will continue to package up what I have been working on. Thanks, --Barry -Original Message- From: Barry Lind Sent: Monday, November 22, 2004 7:48 PM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: FW: [HACKERS] [JDBC] Strange server error with current 8.0beta driver I have been unable to come up with a simple test case for this problem (yet). But here is some additional information. Today I setup a 7.4.6 and an 8.0.0beta5 on linux (RH9) and could also reproduce the problem. However there were some new twists. I now sometimes get the following error on 8.0: ERROR: cache lookup failed for function 18005 I did as Tom suggested and rebuilt with --enable-cassert and strangely that made the problem more difficult to reproduce. Once I finally was able to get the server to start having errors, I got the following interesting message in the log file: TRAP: FailedAssertion(!(serializable ? !((MyProc-xmin) != ((TransactionId) 0)) : ((MyProc-xmin) != ((TransactionId) 0))), File: sinval.c, Line: 767) I am going to try to continue to see if I can come up with a test case, but I wanted to pass this information on in case it might mean anything to anyone. 
Thanks, --Barry -Original Message- From: Barry Lind Sent: Friday, November 19, 2004 5:40 PM To: Kris Jurka Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver Kris, Environment #1: WinXP 8.0beta4 server, 8.0jdbc client I get random failures with the following errors: $ grep ERROR postgresql-2004-11-19_091524.log 2004-11-19 12:19:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:20:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:21:06 ERROR: unrecognized node type: 0 2004-11-19 12:22:06 ERROR: unrecognized node type: 0 2004-11-19 12:23:06 ERROR: unrecognized node type: 0 2004-11-19 12:24:06 ERROR: unrecognized node type: 0 2004-11-19 12:25:06 ERROR: unrecognized node type: 0 2004-11-19 12:26:06 ERROR: unrecognized node type: 0 2004-11-19 12:27:06 ERROR: unrecognized node type: 0 2004-11-19 12:28:06 ERROR: unrecognized node type: 653 2004-11-19 12:29:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:30 ERROR: unrecognized node type: 26 2004-11-19 12:31:06 ERROR: unrecognized node type: 0 2004-11-19 12:32:06 ERROR: unrecognized node type: 0 2004-11-19 12:33:06 ERROR: unrecognized node type: 0 2004-11-19 12:34:06 ERROR: unrecognized node type: 0 2004-11-19 12:35:06 ERROR: unrecognized node type: 0 2004-11-19 12:36:06 ERROR: unrecognized node type: 0 2004-11-19 12:37:06 ERROR: unrecognized node type: 0 Environment #2: Sun Solaris 7.4.3 server, 8.0jdbc client I get random failures with the following errors: ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 I am seeing errors in the two different environments where I am testing the 8.0 driver (which fully uses the V3 protocol), I don't have errors with the 7.4 driver (which only used basic V3 protocol features). I will work to try to come up with a smaller reproducable test case and repro in a linux environment where I can do more. Thanks, --Barry -Original Message- From: Kris Jurka [mailto:[EMAIL PROTECTED] Sent: Friday, November 19, 2004 3:57 PM To: Barry Lind Cc: [EMAIL PROTECTED] Subject: Re: [JDBC] Strange server error with current 8.0beta driver On Fri, 19 Nov 2004, Barry Lind wrote: During my testing with the 8.0 driver, I am occasionally getting failures. The strange thing is that a test will only fail 1 out of 10 times. The error I am getting from the server is: ERROR: unrecognized node type: 25344832 This type of error points to a bug in the server. It means a query plan is created that it can't process. A higher level node finds a lower level node that it doesn't expect. The variability of the test failure could be due to different
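A minimal sketch of the access pattern Barry describes above: re-executing the same PreparedStatement objects so that the driver switches from the unnamed statement/portal to named ones around the fifth execution. The connection details are placeholders, and the statements mirror the DECLARE/FETCH/CLOSE sequence from the report.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class ReuseRepro {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/test", "user", "pass"); // placeholder
            conn.setAutoCommit(false);  // DECLARE CURSOR needs an open transaction
            PreparedStatement declare = conn.prepareStatement("DECLARE cur CURSOR FOR SELECT 1");
            PreparedStatement fetch   = conn.prepareStatement("FETCH FORWARD 10 FROM cur");
            PreparedStatement close   = conn.prepareStatement("CLOSE cur");
            // Re-execute the same PreparedStatement objects; around the fifth
            // execution the driver starts using named server-side statements.
            for (int i = 0; i < 10; i++) {
                declare.execute();
                fetch.execute();   // without the FETCH the backend reportedly crashed
                close.execute();
            }
            conn.commit();
            conn.close();
        }
    }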
Re: [HACKERS] [JDBC] Strange server error with current 8.0beta driver
Kris, Environment #1: WinXP 8.0beta4 server, 8.0jdbc client I get random failures with the following errors: $ grep ERROR postgresql-2004-11-19_091524.log 2004-11-19 12:19:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:20:06 ERROR: unrecognized node type: 25344832 2004-11-19 12:21:06 ERROR: unrecognized node type: 0 2004-11-19 12:22:06 ERROR: unrecognized node type: 0 2004-11-19 12:23:06 ERROR: unrecognized node type: 0 2004-11-19 12:24:06 ERROR: unrecognized node type: 0 2004-11-19 12:25:06 ERROR: unrecognized node type: 0 2004-11-19 12:26:06 ERROR: unrecognized node type: 0 2004-11-19 12:27:06 ERROR: unrecognized node type: 0 2004-11-19 12:28:06 ERROR: unrecognized node type: 653 2004-11-19 12:29:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:06 ERROR: unrecognized node type: 0 2004-11-19 12:30:30 ERROR: unrecognized node type: 26 2004-11-19 12:31:06 ERROR: unrecognized node type: 0 2004-11-19 12:32:06 ERROR: unrecognized node type: 0 2004-11-19 12:33:06 ERROR: unrecognized node type: 0 2004-11-19 12:34:06 ERROR: unrecognized node type: 0 2004-11-19 12:35:06 ERROR: unrecognized node type: 0 2004-11-19 12:36:06 ERROR: unrecognized node type: 0 2004-11-19 12:37:06 ERROR: unrecognized node type: 0 Environment #2: Sun Solaris 7.4.3 server, 8.0jdbc client I get random failures with the following errors: ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 ERROR: no value found for parameter 1 I am seeing errors in the two different environments where I am testing the 8.0 driver (which fully uses the V3 protocol), I don't have errors with the 7.4 driver (which only used basic V3 protocol features). I will work to try to come up with a smaller reproducable test case and repro in a linux environment where I can do more. Thanks, --Barry -Original Message- From: Kris Jurka [mailto:[EMAIL PROTECTED] Sent: Friday, November 19, 2004 3:57 PM To: Barry Lind Cc: [EMAIL PROTECTED] Subject: Re: [JDBC] Strange server error with current 8.0beta driver On Fri, 19 Nov 2004, Barry Lind wrote: During my testing with the 8.0 driver, I am occasionally getting failures. The strange thing is that a test will only fail 1 out of 10 times. The error I am getting from the server is: ERROR: unrecognized node type: 25344832 This type of error points to a bug in the server. It means a query plan is created that it can't process. A higher level node finds a lower level node that it doesn't expect. The variability of the test failure could be due to different plans being generated (note that V2 vs V3 can generate different plans because of string substitution vs a prepared query). The fact that the node number varies is a little suspicious, also because it's a very large value. Does it vary between a couple values or is it different every time. If it varies wildly then that could point to a memory overwrite instead of a bad plan being created, but in any case this problem is on the server side. Kris Jurka ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Nested Transactions, Abort All
Alvaro, My proposal would be: 1. Begin main transaction: BEGIN { TRANSACTION | WORK } 2. Commit main (all) transaction: COMMIT { TRANSACTION | WORK } 3. Rollback main (all) transaction: ROLLBACK { TRANSACTION } 4. Begin inner transaction: BEGIN NESTED { TRANSACTION | WORK } 5. Commit inner transaction: COMMIT NESTED { TRANSACTION | WORK } 6. Rollback inner transaction: ROLLBACK NESTED { TRANSACTION } I agree with your 1,2 and 3, for the reasons you specify. I don't like your proposal for 5, because using the keyword COMMIT implies something that really isn't true IMHO. This is due to the fact as you point out subtransactions aren't really transactions. So when you 'commit' a subtransaction you are not making the changes permanent like a regular transaction. Instead you are saying these changes are OK and the real transaction gets to decide if these changes should be committed (or not). It is only the real transaction that ever does a COMMIT (i.e. makes the changes permanent for others to see). IMHO it is for these reasons that the standard SAVEPOINT syntax doesn't have a concept of committing a savepoint, only of rolling back to a savepoint. thanks, --Barry Alvaro Herrera wrote: On Thu, Jul 08, 2004 at 10:40:36AM -0700, Josh Berkus wrote: This means that we CANNOT maintain compatibility with other databases without supporting SAVEPOINT syntax, which we are not yet ready to do. As a result, I would propose the following syntax: Begin main transaction: BEGIN { TRANSACTION | WORK } Begin inner transaction: BEGIN { TRANSACTION | WORK } Commit inner transaction: COMMIT { TRANSACTION | WORK } Commit all transactions: COMMIT ALL Rollback inner transaction: ROLLBACK { TRANSACTION } Rollback all transanctions: ROLLBACK ALL We can _not_ do this. The reason is that COMMIT and ROLLBACK are defined per spec to end the transaction. So they have to end the transaction. Keep in mind that a nested transaction _is not_ a transaction. You cannot commit it; it doesn't behave atomically w.r.t. other concurrent transactions. It is not a transaction in the SQL meaning of a transaction. So, when I say it has to end the transaction it cannot just end the current nested transaction. It has to end the _real_ transaction. My proposal would be: 1. Begin main transaction: BEGIN { TRANSACTION | WORK } 2. Commit main (all) transaction: COMMIT { TRANSACTION | WORK } 3. Rollback main (all) transaction: ROLLBACK { TRANSACTION } 4. Begin inner transaction: BEGIN NESTED { TRANSACTION | WORK } 5. Commit inner transaction: COMMIT NESTED { TRANSACTION | WORK } 6. Rollback inner transaction: ROLLBACK NESTED { TRANSACTION } 1, 2 and 3 are not negotiable. 4, 5 and 6 are. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
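For comparison with the proposals above, a small sketch of how the standard SAVEPOINT model is exposed through JDBC 3.0: there is no "commit" of the inner level, only release or rollback-to, and only the outer COMMIT makes anything permanent. The connection details and table are placeholders, and at the time of this thread the PostgreSQL driver did not yet implement these methods.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Savepoint;
    import java.sql.Statement;

    public class SavepointExample {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/test", "user", "pass"); // placeholder
            conn.setAutoCommit(false);
            Statement st = conn.createStatement();
            st.executeUpdate("INSERT INTO tab VALUES (1)");

            Savepoint sp = conn.setSavepoint("inner");   // "begin nested"
            try {
                st.executeUpdate("INSERT INTO tab VALUES (2)");
                conn.releaseSavepoint(sp);               // accept the inner work...
            } catch (Exception e) {
                conn.rollback(sp);                       // ...or undo only the inner work
            }

            conn.commit();   // only the outer COMMIT makes anything permanent
            conn.close();
        }
    }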
Re: [HACKERS] Nested xacts: looking for testers and review
Am I the only one who has a hard time understanding why COMMIT in the case of an error is allowed? Since nothing is actually committed, but instead everything was actually rolled back. Isn't it misleading to allow a commit under these circumstances? Then to further extend the commit syntax with COMMIT WITHOUT ABORT makes even less since, IMHO. If we are going to extend the syntax shouldn't we be extending ROLLBACK or END, something other than COMMIT so that we don't imply that anything was actually committed. Perhaps I am being too literal here in reading the keyword COMMIT as meaning that something was actually committed, instead of COMMIT simply being end-of-transaction that may or may not have committed the changes in that transaction. I have always looked at COMMIT and ROLLBACK as a symmetric pair of commands - ROLLBACK - the changes in the transaction are not committed, COMMIT - the changes in the transaction are committed. That symmetry doesn't exist in reality since COMMIT only means that the changes might have been committed. --Barry Alvaro Herrera wrote: On Fri, May 28, 2004 at 04:05:40PM -0400, Bruce Momjian wrote: Bruce, One interesting idea would be for COMMIT to affect the outer transaction, and END not affect the outer transaction. Of course that kills the logic that COMMIT and END are the same, but it is an interesting idea, and doesn't affect backward compatibility because END/COMMIT behave the same in non-nested transactions. I implemented this behavior by using parameters to COMMIT/END. I didn't want to add new keywords to the grammar so I just picked up COMMIT WITHOUT ABORT. (Originally I had thought COMMIT IGNORE ERRORS but those would be two new keywords and I don't want to mess around with the grammar. If there are different opinions, tweaking the grammar is easy). So the behavior I originally implemented is still there: alvherre=# begin; BEGIN alvherre=# begin; BEGIN alvherre=# select foo; ERROR: no existe la columna foo alvherre=# commit; COMMIT alvherre=# select 1; ERROR: transacción abortada, las consultas serán ignoradas hasta el fin de bloque de transacción alvherre=# commit; COMMIT However if one wants to use in script the behavior you propose, use the following: alvherre=# begin; BEGIN alvherre=# begin; BEGIN alvherre=# select foo; ERROR: no existe la columna foo alvherre=# commit without abort; COMMIT alvherre=# select 1; ?column? -- 1 (1 fila) alvherre=# commit; COMMIT The patch is attached. It applies only after the previous patch, obviously. diff -Ncr --exclude-from=diff-ignore 10bgwriter/src/backend/access/transam/xact.c 13commitOpt/src/backend/access/transam/xact.c *** 10bgwriter/src/backend/access/transam/xact.c 2004-06-08 17:34:49.0 -0400 --- 13commitOpt/src/backend/access/transam/xact.c 2004-06-09 12:00:49.0 -0400 *** *** 2125,2131 * EndTransactionBlock */ void ! EndTransactionBlock(void) { TransactionState s = CurrentTransactionState; --- 2125,2131 * EndTransactionBlock */ void ! EndTransactionBlock(bool ignore) { TransactionState s = CurrentTransactionState; *** *** 2163,2172 /* * here we are in an aborted subtransaction. Signal * CommitTransactionCommand() to clean up and return to the ! * parent transaction. */ case TBLOCK_SUBABORT: ! s-blockState = TBLOCK_SUBENDABORT_ERROR; break; case TBLOCK_STARTED: --- 2163,2177 /* * here we are in an aborted subtransaction. Signal * CommitTransactionCommand() to clean up and return to the ! * parent transaction. If we are asked to ignore the errors ! * in the subtransaction, the parent can continue; else, ! 
* it has to be put in aborted state too. */ case TBLOCK_SUBABORT: ! if (ignore) ! s-blockState = TBLOCK_SUBENDABORT_OK; ! else ! s-blockState = TBLOCK_SUBENDABORT_ERROR; break; case TBLOCK_STARTED: diff -Ncr --exclude-from=diff-ignore 10bgwriter/src/backend/parser/gram.y 13commitOpt/src/backend/parser/gram.y *** 10bgwriter/src/backend/parser/gram.y 2004-06-03 20:46:48.0 -0400 --- 13commitOpt/src/backend/parser/gram.y 2004-06-09 11:51:04.0 -0400 *** *** 225,232 target_list update_target_list insert_column_list insert_target_list def_list opt_indirection group_clause TriggerFuncArgs select_limit ! opt_select_limit opclass_item_list transaction_mode_list ! transaction_mode_list_or_empty TableFuncElementList prep_type_clause prep_type_list execute_param_clause --- 225,232 target_list update_target_list insert_column_list insert_target_list def_list opt_indirection group_clause TriggerFuncArgs select_limit ! opt_select_limit opclass_item_list transaction_commit_opts ! transaction_mode_list
Re: [HACKERS] contrib vs. gborg/pgfoundry for replication solutions
Kris,

Thank you. I objected to having the jdbc code moved out of the base product cvs tree for some of the reasons being discussed in this thread: how are people going to find the jdbc driver, how will they get documentation for it, etc. I think the core problem is that some people view postgres as just the database server proper. But most people (IMHO) view postgres (or any database) as a set of things (core server, admin utils, drivers, etc). I think the solution lies in improving www.postgresql.org. At the end of the day it doesn't matter where source code lives, what matters is can people find what they are expecting. Given we know what people are looking for, that should be front and center on the web site and the ftp sites.

--Barry

Kris Jurka wrote:

On Wed, 21 Apr 2004, Marc G. Fournier wrote:

On Wed, 21 Apr 2004, Rod Taylor wrote: We have the current issue of people not knowing that projects like pgadmin exist or where to find the jdbc drivers.

Now, out of all of the PostgreSQL users, what % are using JDBC? What % are using ODBC? What percentage of those using JDBC are also using ODBC? What % of those using PgAdmin are also using ODBC? For that matter, how many ppl using JDBC only want to download the .jar file itself, and not the source code? % of Binary-Only PgAdmin users? ODBC driver?

See the poll run on postgresql.org: http://www.postgresql.org/survey.php?View=1SurveyID=24
It took several minutes to load for me so I'll include the results here. Currently the results of our "What PostgreSQL API do you use the most?" survey are:

    Answer      Responses   Percentage
    libpq          1752      13.116%
    libpq++         526       3.938%
    libpqxx         176       1.318%
    psqlODBC       2495      18.678%
    JDBC           7607      56.947%
    Npgsql          294       2.201%
    ECPG            154       1.153%
    pgtcl           354       2.650%

    Total number of responses: 13358

You can certainly fault the choices (leaving perl, python, and php off the list), but 7500 java users is something that can't be ignored. The point is that any legitimate database will provide JDBC and ODBC drivers. When new users can't immediately find them from the postgresql home page they get frustrated and badmouth postgresql. Telling them to go to gborg isn't really helpful. gborg is currently full of dead projects, projects that have never had any code committed, and projects that are of questionable utility. gborg is supposed to be the dumping ground for these ala sourceforge, so it's not the dead projects I object to, so much as the fact that serious and critical projects are grouped together with them. We need better packaging/promotion of secondary projects and the main project can't be ignorant of this fact and cop out with "we just provide the server".

Kris Jurka
Re: [HACKERS] Bad timestamp external representation
Denis, This is more appropriate for the jdbc mail list. --Barry Denis Khabas wrote: Hi everyone: I am using Postgresql 7.3.4 and found a problem inserting Timestamp objects through JDBC Prepared Statements when the time zone is set to Canada/Newfoundland (3 hours and 30 minutes from MGT). I am trying to insert new Timestamp(0L) into one of the fields. The database replies with an error message: Bad timestamp external representation '1969-12-31 20:30:00.00-030-30' Most likely, the database doesn't understand the last part of the timestamp, which is '-30' (30 minutes). It works properly only with time zones that don't have that additional half hour difference. I could not find any useful information regarding that issue. Any help would be appreciated. Thanx ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
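A minimal reproduction sketch of Denis's report, assuming a table ts_test with a timestamp column and placeholder connection details; the key ingredient is a default time zone with a half-hour offset from UTC.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;
    import java.util.TimeZone;

    public class NewfoundlandTimestamp {
        public static void main(String[] args) throws Exception {
            // Half-hour-offset zone (UTC-3:30) that triggers the reported error.
            TimeZone.setDefault(TimeZone.getTimeZone("Canada/Newfoundland"));
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/test", "user", "pass"); // placeholder
            PreparedStatement ps =
                    conn.prepareStatement("INSERT INTO ts_test (ts) VALUES (?)"); // placeholder table
            ps.setTimestamp(1, new Timestamp(0L));  // epoch = 1969-12-31 20:30:00 local time
            ps.executeUpdate();  // reportedly fails on 7.3.4: "Bad timestamp external representation"
            conn.close();
        }
    }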
Re: [HACKERS] Tablespaces
Oliver Elphick wrote: On Wed, 2004-03-03 at 04:59, Tom Lane wrote: What might make sense is some sort of marker file in a tablespace directory that links back to the owning $PGDATA directory. CREATE TABLESPACE should create this, or reject if it already exists. It will not be enough for the marker to list the path of the parent $PGDATA, since that path might get changed by system administration action. The marker should contain some sort of unique string which would match the same string somewhere in $PGDATA. Then, if either tablespace or $PGDATA were moved, it would be possible to tie the two back together. It wouldn't be an issue on most normal systems, but might be of crucial importance for an ISP running numerous separate clusters. Taking this one step further would be to do something like Oracle does. Every datafile in Oracle (because the Oracle storage manager stores multiple objects inside datafiles, one could say there is some similarity between Oracle datafiles and the proposed pg tablespaces), has meta info that tells it which database instance it belongs to and the last checkpoint that occured (It might actually be more granular than checkpoint, such that on a clean shutdown you can tell that all datafiles are consistent with each other and form a consistent database instance). So Oracle on every checkpoint updates all datafiles with an identifier. Now you might ask why is this useful. Well in normal day to day operation it isn't, but it can be usefull in disaster recovery. If you loose a disk and need to restore the entire database from backups it can be difficult to make sure you have done it all correctly (do I have all the necessary files/directories? did I get the right ones from the right tapes?) Especially if you have directories spread across various different disks that might be backed up to different tapes. So by having additional information stored in each datafile Oracle can provide additional checks that the set of files that are being used when the database starts up are consistent and all belong together. Oracle also ensures that all the datafiles that are suposed to exist actually do as well. So what might this mean for postgres and tablespaces? It could mean that on startup the database checks to verify that all the tablespaces that are registered actually exist. And that the data in each tablespace is consistent with the current WAL status. (i.e. someone didn't restore a tablespace from backup while the database was down that is old and needs recovery. A lot of what I am talking about here become PITR issues. But since PITR and tablespaces are both potential features for 7.5, how they interact probably should be thought about in the designs for each. thanks, --Barry ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Tablespaces
Gavin, After creating a tablespace what (if any) changes can be done to it. Can you DROP a tablespace, or once created will it always exist? Can you RENAME a tablespace? Can you change the location of a tablespace (i.e you did a disk reorg and move the contents to a different location and now want to point to the new location)? What are the permissions necessary to create a tablespace (can any use connected to the database create a tablespace, or only superuser, or ...)? Overall this will be a great addition to postgres. I am looking forward to this feature. thanks, --Barry Gavin Sherry wrote: Hi all, I've been looking at implementing table spaces for 7.5. Some notes and implementation details follow. -- Type of table space: There are many different table space implementations in relational database management systems. In my implementation, a table space in PostgreSQL will be the location of a directory on the file system in which files backing database objects can be stored. Global tables and non 'data' objects (WAL, CLOG, config files) will all remain in $PGDATA. $PGDATA/base will be the default table space. A given table space will be identified by a unique table space name. I haven't decided if 'unique' should mean database-wide unique or cross-database unique. It seems to me that we might run into problems with CREATE DATABASE ... TEMPLATE = database with table spaces if the uniqueness of table spaces is limited to the database level. A table space parameter will be added to DDL commands which create physical database objects (CREATE DATABASE/INDEX/TABLE/SEQUENCE) and to CREATE SCHEMA. The associated routines, as well as the corresponding DROP commands will need to be updated. Adding the ability to ALTER object TABLESPACE name seems a little painful. Would people use it? Comments? When an object is created the system will resolve the table space the object is stored in as follows: if the table space paramater is passed to the DDL command, then the object is stored in that table space (given validation of the table space, etc). If it is not passed, the object inherits its parent's table space where the parent/child hierarchy is as follows: database schema table [index|sequence]. So, if you issued: create table foo.bar (...); We would first not that there is no TABLESPACE name, then cascade to the table space for the schema 'foo' (and possibly cascade to the table space for the database). A database which wasn't created with an explicit table space will be created under the default table space. This ensures backward compatibility. Creating a table space: A table space is a directory structure. The directory structure is as follows: [EMAIL PROTECTED] /path/to/tblspc]$ ls OID1/ OID2/ OID1 and OID2 are the OIDs of databases which have created a table space against this file system location. In this respect, a table space resembles $PGDATA/base. I thought it useful to keep this kind of namespace mechanism in place so that administrators do not need to create hierarchies of names on different partitions if they want multiple databases to use the same partition. The actual creation of the table space will be done with: CREATE TABLE SPACE name LOCATION /path/to/tblspc; Before creating the table space we must: 1) Check if the directory exists. If it does, create a sub directory as the OID of the current database. 2) Alternatively, if the directory doesn't exist, attempt to create it, then the sub directory. 
I wonder if a file, such as PG_TBLSPC, should be added to the table space directory so that, in the case of an existing non-empty directory, we can attempt to test if the directory is being used for something else and error out. Seems like: CREATE TABLESPACE tbl1 LOCATION '/var/' which will result in something like '/var/123443' is a bad idea. Then again, the user should know better. Comments? If everything goes well, we add an entry to pg_tablespace with the table space location and name (and and OID). Tying it all together: The catalogs pg_database, pg_namespace, and pg_class will have a 'tblspc' field. This will be the OID of the table space the object resides in, or 0 (default table space). Since we can then resolve relid/relname, schema and database to a tablespace, there aren't too many cases when extra logic needs to be added to the IO framework. In fact, most of it is taken care of because of the abstraction of relpath(). The creation of table spaces will need to be recorded in xlog in the same way that files are in heap_create() with the corresponding delete logic incase of ABORT. Postmaster startup: Ideally, the postmaster at startup should go into each tblspc/databaseoid directory and check for a postmaster.pid file to see if some other instance is touching the files we're interested in. This will require a control file listing tblspc/databaseoid paths and it will need to plug into WAL in case we die during CREATE TABLESPACE. Comments? Creating a database I
Re: [HACKERS] PL/Java issues
Jan, In Oracle a call from sql into java (be it trigger, stored procedure or function), is required to be a call to a static method. Thus in Oracle all the work is left for the programmer to manage object instances and operate on the correct ones. While I don't like this limitation in Oracle, I can't see a better way of implementing things. Therefore if you want to operate on object instances you (in Oracle) need to code up object caches that can hold the instances across function calls so that two or more functions can operate on the same instance as necessary. What this implies to the implementation is that in order to be possible the multiple function calls need to run inside the same jvm (so you can access the static caches across the different calls). If every call created a new jvm instance in Oracle you couldn't do very much. The Oracle jvm essentially gives you one jvm per connection (although technically it is somewhere between one jvm for the whole server and one per connection - i.e. it has the memory and process footprint of a single jvm for the entire server, but appears to the user as a jvm per connection). Having one jvm per connection is important to limit multiple connections ability to stomp on each others data. Something similar could probably a done for postgres by having one jvm running, by having each postgres connection having a unique thread in that jvm and having each connection thread run with its own class loader instance so that separate classes (and thus static members) are loaded for each connection. thanks, --Barry Jan Wieck wrote: I have included the JDBC mailing list since I guess most Java developers are around here, but not necessarily on Hackers. Dave Cramer and I where discussing a few issues about the PL/Java implementation last night and would like to get more input and suggestions on the matter. The basic question is the definition of the lifetime of an object and it's identificaition when doing nested calls in this context. In the OO world, ideally a real world object is translated into one instance of a class. And complex structures are trees of instances, possibly of different classes. As an example, a sales order consists of the order header and a variable number of order lines. Therefore, per order we have one OH instance and several OL's. So far so good. Naturally, one Java object instance would correspond to one row in a database. If we now implement a stored procedure in PL/Java, that means that a pg_proc entry corresponds to a specific method of a specific class (its signature). But there is no obvious relationship between functions and tables or other objects. Because of that it is not implicitly clear if an incoming call to a method is meant for an existing instance or if a new one should be created. As an example, if a PL/Java trigger on the order header executes an SPI query on the order lines, a trigger on the order line (also in PL/Java) might now want to call a method on it's parent object (the order header that is waiting for the SPI result set). This should NOT result in another OH instance being created for the same logical OH. Probably it is not possible to map these things automatically while keeping the system flexible enough to be usefull. But is it feasable to require the programmer to provide glue code for every procedure that does all these things? How does Oracle attack this problem? Jan ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
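A rough sketch of the Oracle-style pattern Barry describes: static entry points plus a static cache, so that a nested call (say, from an order-line trigger) can find the order-header instance created earlier in the same session. All names here are illustrative and none of this is an actual PL/Java API; the per-connection classloader mentioned above is what would keep the static map private to one session.

    import java.util.HashMap;
    import java.util.Map;

    public class OrderHeaderCache {
        // One map per classloader; with a classloader per connection this is
        // effectively per-session state.
        private static final Map<Integer, OrderHeader> cache = new HashMap<Integer, OrderHeader>();

        // Static entry point called from a trigger on the order header.
        public static void onOrderHeaderUpdate(int orderId) {
            OrderHeader oh = cache.get(orderId);
            if (oh == null) {
                oh = new OrderHeader(orderId);   // construct the instance once
                cache.put(orderId, oh);
            }
            oh.recalcTotals();
        }

        // Static entry point called from a trigger on an order line; it looks up
        // the parent instance instead of creating a second one.
        public static void onOrderLineUpdate(int orderId, int lineNo) {
            OrderHeader oh = cache.get(orderId);
            if (oh != null) {
                oh.lineChanged(lineNo);
            }
        }

        static class OrderHeader {
            final int id;
            OrderHeader(int id) { this.id = id; }
            void recalcTotals() { /* ... */ }
            void lineChanged(int lineNo) { /* ... */ }
        }
    }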
Re: [HACKERS] Information Schema and constraint names not unique
Tom Lane wrote: Using tableoid instead of tablename avoids renaming problems, but makes the names horribly opaque IMNSHO.

Agreed. I think using the OIDs would be a horrible choice. As a point of reference, Oracle uses a naming convention of 'C<n>', where <n> is a sequence-generated unique value, so in Oracle system-generated names are very opaque. I never saw this as a problem, since if you wanted a non-opaque name you could always assign one yourself.

--Barry
Re: [HACKERS] 7.4 LOG: invalid message length
Gmane, I just checked in a fix to the jdbc driver for this. The problem was that the connection termination message was being passed the wrong length, which really didn't have any other adverse side effect than this message in the log, since the connection was no longer being used. thanks, --Barry Gmane Smith wrote: Using Mac OS X 10.2.6 Pgsql 7.4 (postgresql-7.4beta1) from postgresql.org devpgjdbc2.jar and WebObjects 5.2 I get LOG: invalid message length when EOModeler creates my database. When I built PostgreSQL I specified --with-pam --without-readline Since I don't have ant the --with-java failed so I left it off. Should I go back to a more stable JDBC driver? Or should I press on? Thanks. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 8: explain analyze is your friend
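For context, a sketch of the V3 connection-termination (Terminate) message the fix concerns: a single 'X' byte followed by an Int32 length of 4, where the length counts the length field itself. Sending any other length is what produces the "invalid message length" entry in the server log. The method below is illustrative, not driver code.

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public class TerminateMessage {
        // Write a well-formed Terminate message on the raw connection stream.
        static void sendTerminate(OutputStream socketStream) throws IOException {
            DataOutputStream out = new DataOutputStream(socketStream);
            out.writeByte('X');  // message type: Terminate
            out.writeInt(4);     // length, counting the length field itself
            out.flush();
        }
    }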
Re: [HACKERS] encoding question
Chris, SQL_ASCII means that the data could be anything. It could be Latin1, UTF-8, Latin9, whatever the code inserting data sends to the server. In general the server accepts anything as SQL_ASCII. In general this doesn't cause any problems as long as all the clients have a common understanding on what the real encoding of the data is. However if you set CLIENT_ENCODING then the server does assume that the data is really 7bit ascii. In the jdbc driver we only support US-ASCII data if the character set is SQL_ASCII since we use the CLIENT_ENCODING setting of UTF8 to have the server perform the necessary conversion for us since java needs unicode strings. And if you store anything other than US-ASCII data in a SQL_ASCII database the server will return invalid UTF-8 data to the client. thanks, --Barry Christopher Kings-Lynne wrote: Hi, In phpPgAdmin, we automatically set the HTML page encoding to and encoding that allows us to properly display the encoding of the current postgresql database. I have a small problem with SQL_ASCII. Theoretically (and what we currently do), we should set page encoding to US-ASCII. However, Postgres seems to allow unlauts and all sorts of extra 8 bit data in ASCII databases, so what encoding should I use. Is ISO-8859-1 a better choice? Is SQL_ASCII basically equivalent to the LATIN1 encoding? My other question is we play around with bytea fields to escape nulls and chars 32 and stuff so that when someone browses the table, they get '\000unknown\000...', etc. However, are the other field types for which we have to do this? Can you put nulls and stuff in text/varchar/char fields? What about other fields? Thanks, Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Wrong version of jdbc in 7.3.3 rpms
Does anyone know why the 7.3beta1 version of the jdbc drivers is apparently what is included in the 7.3.3 rpms?

--Barry

Original Message
Subject: Re: [JDBC] Official JDBC driver release ?
Date: Thu, 05 Jun 2003 08:14:40 +0200
From: Thomas Kellerer [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED]

Barry Lind wrote:

I'm a bit puzzled about the versions of the JDBC driver floating around. I initially downloaded the release for 7.3 from jdbc.postgresql.org. Now I have seen that the JDBC driver which is included e.g. in the RPMs for 7.3.3 is a bit older and has a different name (pg73b1jdbc3.jar vs. pg73jdbc3.jar).

The pg73b1jdbc3.jar file is very old (it is the 7.3 beta 1 version). What RPMs are you using? You should contact whoever produced those RPMs to get them built with the current version. The 'official' version is the source code that is tagged with the 7.3.3 freeze label (which is the version that is currently posted on the jdbc.postgresql.org web site).

--Barry

Barry, thanks for the answer. The pg73b1jdbc3.jar file is contained in all the 7.3.3 rpms available on the ftp mirrors... (ok, not necessarily all, but I checked about 3 or 4 different mirrors). I don't know who builds the rpms on the mirror sites available from www.postgresql.org.

Cheers
Thomas
Re: [HACKERS] Wrong version of jdbc in 7.3.3 rpms
Lamar, I can understand you not wanting to install the components necessary to build the various jdbc versions. So I would just request that you pull the latest prebuilt version from jdbc.postgresql.org when doing a new RPM. I will try to answer some of your other questions below. Lamar Owen wrote: On Thursday 05 June 2003 11:39, Barry Lind wrote: Does anyone know why apparently the 7.3beta1 version of the jdbc drivers are what is included in the 7.3.3 rpms? The pg73b1jdbc3.jar file is very old (it is the 7.3 beta 1 version). What RPMs are you using? You should contact whoever produced those RPMs to get them built with the current version. The 'official' version is the source code that is tagged with the 7.3.3 freeze label (which is the version that is currently posted on the jdbc.postgresql.org web site) I am whoever. :-) I have not synced up with the version on jdbc.postgresql.org (primarily because I didn't know about there being a newer version). I understand. In the future always just grab the latest prebuilt version from jdbc.postgresql.org. I do not have a JDK installed here, so I don't build the JDBC driver from source. So, I'll blame my own bit rot. I understand. Since the postgresql-jdbc RPM has no dependencies and is a distribution-independent RPM, I'll roll a new one. This won't require a rebuild on all the distributions supported, since we're talking distribution independent jars. However, I wouldn't mind pulling the JDBC subpackage out of the main set entirely, and having a separate RPM distribution for that. You could maintain that yourself easily enough. I can even give you a starting spec file, and the JDBC driver could have a separate release schedule, which might be appropriate anyway. The topic of having jdbc on a separate release cycle has come up in the past multiple times. And in general I have been against it (and also the move of the jdbc code to a separate project). In general jdbc needs to release as often as the server. Because jdbc heavily depends on all the pg_* tables and they tend to change each release, there needs to be a corresponding release of jdbc for each server release. Now jdbc could release on a more frequent schedule than the server, but there currently just aren't enough developers working on it for that to be a reality, so the current server schedule works out just right. There are a number of reasons that IMHO moving the source out of the core project does not make sense for jdbc: 1) Documentation infrastructure - the server has a nice setup for producing doc. I don't have time or want to reinvent all that for the jdbc doc if it were in a separate project. 2) Core developer access. It is a great benefit when Tom, Bruce or some other core hacker makes some sort of global change to the backend tables, and they can change all the source affected including the jdbc source. 3) Release infrastructure - RPMs and packageing work is already done (and it usually works :-) 4) Beta program - When postgres does a beta, it is great to be a part of that automatically. Instead of needing to try and get the word out and do it on a different cycle. Going the one obvious next step forward, is there a compelling reason to include the JDBC client as part of the main tarball, rather than a separate project (ODBC is separate, as is the C++ and Perl clients) (and the same thing can be said for the Python client)? Does the JDBC client need backend source code, or is it happy being built with just the necessary fe protocol headers? 
(I'm thinking out loud here -- I can see a reason for the JDBC driver to have a separate release schedule from the main tarball, but I'm not saying 'JDBC is a problem child! Let's yank it because I don't want to deal with it!'). Barry, what would be your preference? What would best serve JDBC users? (other than me installing all the software necessary to build the JDBC from source -- this requires non-vanilla software in the form of the JDK, as well as the build environment that the makefiles want, and with me not being a Java developer at this time, I wouldn't necessarily be up on what is considered the 'canonical' development or runtime environments. With the other portions of PostgreSQL, nothing beyond the stock distribution is required for build.) I think it would best serve the users for an active JDBC developer to make that distribution. Please advise how you would like to handle this. I think I answered all of the questions put forth here. If I haven't please let me know. thanks, --Barry PS. Thanks for all the work you do on the RPMs. You provide a valuable service to the postgres community. PPS. Perhaps someday you will get the beter 'upgrade' you have been asking for. I think you have been asking for it for as long as I have been a part of the postgres community. ---(end of broadcast)--- TIP 2: you can get off all lists
Re: [HACKERS] Charset encoding and accents
Davide Romanini wrote: Barry Lind ha scritto: The charSet= option will no longer work with the 7.3 driver talking to a 7.3 server, since character set translation is now performed by the server (for performance reasons) in that scenario. The correct solution here is to convert the database to the proper character set for the data it is storing. SQL_ASCII is not a proper character set for storing 8bit data. Probably I'm not enough clear about the problem. I *cannot* change charset type. SQL_ASCII really *is* the proper character set for my purposes, because I actually work using psql and ODBC driver without any problem. You were clear, however we disagree. SQL_ASCII is *not* the proper character set for your purposes. The characters you are having problems with do not exist in the SQL_ASCII character set. The fact that psql and ODBC work under this misconfiguration doesn't mean that the configuration is correct. Java deals with all characters internally in unicode, thus forcing a character set conversion. So the code is converting from SQL_ASCII to UTF8. When it finds characters that are not part of the SQL_ASCII character set it doesn't know what to do with them (are they LATIN1, LATIN5, LATIN? characters). You state that you *cannot* change the character set. Can you explain why this is the case? I repeat: psql and ODBC retrieve all data (with the accents) in the correct manner. Also, if I change org.postgresql.core.Encoding.java making the decodeUTF8 method return simply a new String(data), JDBC retrieves the data from my SQL_ASCII database correctly! So my question is: why does JDBC call the decodeUTF8 method even when the string is surely *not* a UTF-8 string? If you were only storing SQL_ASCII characters it would be a UTF8 string, since SQL_ASCII is a subset of UTF8. But since you are storing invalid SQL_ASCII characters this is no longer true. The logic is as follows: The driver sets the CLIENT_ENCODING parameter to UNICODE, which instructs the server to convert from the character set of the database to UTF8. The server then sends all data to the client encoded in UTF8. The jdbc driver reads the UTF8 data and converts it to java's internal unicode representation. The problem in all of this is that the server has decided, as an optimization, that if the database character set is SQL_ASCII then no conversion to UTF8 is necessary, since SQL_ASCII is a proper subset of UTF8. However when characters that are not SQL_ASCII are stored in the database (i.e. 8bit characters), this optimization simply sends them on to the client as if they were valid UTF8 characters (which they are not). So the client then tries to read what are supposed to be UTF8 characters and fails, because it is receiving non-UTF8 data even though it asked the server to only send it UTF8 data. If jdbc could recognize that the string is *not* a UTF-8 string, then it would simply return a new String, which is the right thing to do. It's obvious that if JDBC receives from the postgresql server a byte array representing a non-UTF8 string, and it calls a method that wants as a parameter a byte array representing a UTF8 string, then it is a *bug*, because for non-UTF8 strings it must return a new String. As stated above the driver tells the server to send all data as UTF8, but because of the optimization and the non-SQL_ASCII characters you are storing, that optimization results in non-UTF8 data being sent to the client. I hope to be enough clear this time. As I said earlier, you were clear the first time. 
I hope I have been more clear in my response to explain the issues in greater detail. Sincerely, I'm getting a bit frustrated from the problem, because I've projects to do and it prevents me to do that projects :-( I understand that you are frustrated, but frankly I am frustrated too, because I keep telling you what the solution to your problem is and you keep ignoring it :-) thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
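To make the failure mode described above concrete, the following is a minimal standalone sketch (not the driver's actual code); the word "perché" and the charset names are just assumed example values. It shows that 8-bit LATIN1 bytes, passed through untranslated by a SQL_ASCII database, cannot be decoded as UTF-8 on the Java side, while decoding with the charset the data was really written in round-trips fine:

    public class Latin1VersusUtf8 {
        public static void main(String[] args) throws Exception {
            // "perché" encoded as ISO-8859-1 (LATIN1); the final byte 0xE9 is not
            // a valid single-byte sequence in UTF-8.
            byte[] latin1Bytes = "perch\u00e9".getBytes("ISO-8859-1");

            // Roughly what happens when the server forwards LATIN1 bytes untouched
            // (SQL_ASCII database) and the client decodes them as UTF-8: the accented
            // character comes out as the U+FFFD replacement character.
            String decodedAsUtf8 = new String(latin1Bytes, "UTF-8");

            // Decoding with the encoding the data was actually written in works.
            String decodedAsLatin1 = new String(latin1Bytes, "ISO-8859-1");

            System.out.println("as UTF-8:  " + decodedAsUtf8);
            System.out.println("as LATIN1: " + decodedAsLatin1);
        }
    }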
Re: [HACKERS] Changing behavior of BEGIN...sleep...do something...COMMIT
Andreas, From the JDBC side it really doesn't make that much difference. The JDBC code needs to support both ways of doing it (explicit begin/commits for 7.2 and earlier servers, and set autocommit for 7.3 servers), so however it ends up for 7.4 it shouldn't be too much work to adopt. As Tom has mentioned elsewhere the key change is having the FE/BE protocol report the current transaction state. thanks, --Barry Zeugswetter Andreas SB SD wrote: Also, per other discussions, we are removing backend autocommit support in 7.4. It was the wrong way to do it. Somehow I did not see that conclusion made. I thought, at least for JDBC, it is already successfully used ? I think the backend autocommit is useful. Maybe only the installation/database/user wide GUC setting should be depricated/ disabled, so it is only used by a session SET ? Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] A bad behavior under autocommit off mode
Tom, From the jdbc driver perspective I prefer the GUC variable approach, but either can be used. Each has limitations. In 7.2 and earlier jdbc code the driver handled the transaction semantics by adding begin/commit/rollback in appropriate places. And that code is still in the 7.3 driver to support older servers. In 7.3 the driver uses the GUC variable to control the transaction state. In general this is easier since it is a set-once-and-forget-about-it operation. As I mentioned earlier each method has limitations. Let me list them. The problem with managing the state on the client is that in order for this to work the client needs to intercept all transaction-ending events in order to start the next transaction when running in non-autocommit mode. Thus each 'commit' becomes 'commit; begin;'. Since the jdbc API has a commit() and rollback() method there is an obvious place to insert this logic. However if the user directly issues a commit or rollback sql call (instead of using the jdbc api) then the driver isn't in a position to start the new transaction, unless the driver starts parsing all SQL looking for commits or rollbacks, which I am reluctant to do. However the proposed FE/BE protocol change to tell the client the transaction state would allow the driver to detect this. The problem with using the GUC approach is that if the user changed the GUC value in SQL the driver would have no way to know about the state change. And thus the driver would think it was operating in one mode (the mode *it* set), but actually be running in a different mode (the mode the *user* set through SQL). Of these two limitations the first is more significant, since users do issue 'commit' statements directly sometimes, whereas users would likely never change the GUC parameter in their SQL. I like the simplicity of the GUC parameter and that is the reason I converted the jdbc driver in 7.3 to use this new method. thanks, --Barry Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I think our SET functionality is easy to understand and use. I don't see pushing it into the client as greatly improving things, and could make things worse. If we can't get it right in the backend, how many clients are going to do it wrong? This argument overlooks the fact that most of the client libraries already have notions of autocommit on/off semantics that they need to adhere to. libpq is too simple to have heard of the concept, but I believe that JDBC, ODBC, and DBI/DBD all need to deal with it anyway. I doubt that managing a server-side facility makes their lives any easier ... especially not if its semantics don't quite match what they need to do, which seems very possible. But it'd be interesting to hear what the JDBC and ODBC maintainers think about it. Perhaps autocommit as a GUC variable is just what they want. Please recall that GUC-autocommit in its current form was my idea, and I rushed it in there because I wanted us to be able to run the NIST compliance tests easily. In hindsight I am thinking it was a bad move. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
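As a rough illustration of the client-managed approach described above (every class and method name here is invented for the sketch; this is not the actual driver source), the driver-side emulation boils down to turning each commit or rollback into 'commit; begin' while autocommit is off:

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Hypothetical sketch of client-side autocommit emulation for pre-7.3 servers.
    class ClientSideAutoCommit {
        private final Connection conn;
        private boolean autoCommit = true;

        ClientSideAutoCommit(Connection conn) { this.conn = conn; }

        void setAutoCommit(boolean on) throws SQLException {
            Statement s = conn.createStatement();
            if (!on && autoCommit) s.execute("begin");    // open the first transaction
            if (on && !autoCommit) s.execute("commit");   // close the pending transaction
            s.close();
            autoCommit = on;
        }

        void commit() throws SQLException {
            Statement s = conn.createStatement();
            // Each 'commit' becomes 'commit; begin' so the next statement is
            // already inside a transaction.
            s.execute(autoCommit ? "commit" : "commit; begin");
            s.close();
        }

        void rollback() throws SQLException {
            Statement s = conn.createStatement();
            s.execute(autoCommit ? "rollback" : "rollback; begin");
            s.close();
        }
    }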
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: One addition I would personally like to see (it comes up in my apps code) is the ability to detect wheather the server is big endian or little endian. When using binary cursors this is necessary in order to read int data. Actually, my hope is to eliminate that business entirely by standardizing the on-the-wire representation for binary data; note the reference to send/receive routines in the original message. For integer data this is simple enough: network byte order will be it. I'm not sure yet what to do about float data. Great. 2) Better support for domains. Currently the jdbc driver is broken with regards to domains (although no one has reported this yet). The driver will treat a datatype that is a domain as an unknown/unsupported datatype. It would be great if the T response included the 'base' datatype for a domain attribute so that the driver would know what parsing routines to call to convert to/from the text representation the backend expects. I'm unconvinced that we need do this in the protocol, as opposed to letting the client figure it out with metadata inquiries. If we should, I'd be inclined to just replace the typeid field with the base typeid, and not mention the domain to the frontend at all. Comments? I don't have a strong opinion on this one. I can live with current functionality. It isn't too much work to look up the base type. So I would request the ability of the client to set a max rows parameter for query results. If a query were to return more than the max number of rows, the client would be given a handle (essentially a cursor name) that it could use to fetch additional sets of rows. How about simply erroring out if the query returns more than X rows? This shouldn't be an error condition. I want to fetch all of the rows, I just don't want to have to buffer them all in memory. Consider the following example. Select statement #1 is 'select id from foo', statement #2 is 'update bar set x = y where foo_id = ?'. The program logic issues statement #1 and then starts iterating through the results and the issues statement #2 for some of those results. If statement #1 returns a large number of rows the program can run out of memory if all the rows from #1 need to be buffered in memory. What would be nice is if the protocol allowed getting some rows from #1 but not all so that the connection could be used to issue some #2 statements. 4) Protocol level support of PREPARE. In jdbc and most other interfaces, there is support for parameterized SQL. If you want to take advantage of the performance benefits of reusing parsed plans you have to use the PREPARE SQL statement. This argument seems self-contradictory to me. There is no such benefit unless you're going to re-use the statement many times. Nor do I see how pushing PREPARE down to the protocol level will create any improvement in its performance. There is a benefit if you do reuse the statement multiple times. The performance problem is the two round trips minimum to the server that are required. A protocol solution to this would be to allow the client to send multiple requests at one time to the server. But as I type that I realize that can already be done, by having multiple semi-colon separated SQL commands sent at once. So I probably have everything I need for this already. I can just cue up the 'deallocate' calls and piggyback them on to the next real call to the server. So what I would like to see is the ability for the client to set a MAX VALUE size parameter. 
The server would send up to this amount of data for any column. If the value was longer than MAX VALUE, the server would respond with a handle that the client could use to get the rest of the value (in chunks of MAX VALUE) if it wanted to. I don't think I want to embed this in the protocol, either; especially not when we don't have even the beginnings of backend support for it. I think such a feature should be implemented and proven as callable functions first, and then we could think about pushing it down into the protocol. That is fine. 6) Better over the wire support for bytea. The current encoding of binary data \000 results in a significant expansion in the size of data transmitted. It would be nice if bytea data didn't result in 2 or 3 times data expansion. AFAICS the only context where this could make sense is binary transmission of parameters for a previously-prepared statement. We do have all the pieces for that on the roadmap. Actually it is the select of binary data that I was referring to. Are you suggesting that the over the wire format for bytea in a query result will be binary (instead of the ascii encoded text format as it currently exists)? regards, tom lane I am looking forward to all of the protocol changes. thanks, --Barry ---(end of broadcast
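Until something like a max-rows facility exists in the protocol, roughly the only way to get the incremental behavior described above is for the application (or the driver on its behalf) to fall back on explicit cursors. A hedged sketch, assuming a table foo with an id column and a driver that passes the DECLARE/FETCH statements straight through:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    class ChunkedScan {
        static void scan(Connection conn) throws SQLException {
            conn.setAutoCommit(false);            // cursors only live inside a transaction
            Statement st = conn.createStatement();
            st.execute("DECLARE c CURSOR FOR SELECT id FROM foo");
            while (true) {
                ResultSet rs = st.executeQuery("FETCH FORWARD 100 FROM c");
                int rows = 0;
                while (rs.next()) {
                    rows++;
                    // statement #2 (the update) could be issued here for selected ids
                }
                rs.close();
                if (rows == 0) break;             // cursor exhausted
            }
            st.execute("CLOSE c");
            conn.commit();
        }
    }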
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: AFAICS the only context where this could make sense is binary transmission of parameters for a previously-prepared statement. We do have all the pieces for that on the roadmap. Actually it is the select of binary data that I was refering to. Are you suggesting that the over the wire format for bytea in a query result will be binary (instead of the ascii encoded text format as it currently exists)? See binary cursors ... Generally that is not an option. It either requires users to code to postgresql specific sql syntax, or requires the driver to do it magically for them. The later runs into all the issues that I raised on cursor support. In general the jdbc driver is expected to execute arbitrary sql statements any application might want to send it. The driver is handicaped because it doesn't know really anything about that sql statement (other than it is a select vs an update or delete). Specifically it doesn't know what tables or columns that SQL will access or how many rows a select will return. All of this knowledge is in the backend, and short of implementing a full sql parser in java this knowledge will never exist in the front end. Many of the things I put on my wish list for the protocol stem from this. Where there are two ways to do something (use cursors or not, use prepared statements or not, use binary cursors or not) the driver either needs to a) choose one way and always use it, b) infer from the sql statement which way will be better, or c) require the user to tell us. The problem with a) is that it may not always be the correct choice. The problem with b) is that generally this isn't possible and the problem with c) is it requires that the user write code that isn't portable across different databases. I would like to simply do a) in all cases. But that means that one of the two options should always (or almost always) be the best choice. So in the case of use cursors or not, it would be nice if using cursors added little or no overhead such that it could always be used. In the case of use prepared statements vs not, it would be nice if prepared statements added little or no overhead so that they could always be used. And finally in the case of use binary or regular cursors it would be nice if binary cursors could always be used. The Oracle SQLNet protocol supports most of this. Though it has been a few years since I worked with it, the oracle protocol has many of the features I am looking for (and perhaps the reason I am looking for them, is that I have seen them used there before). Essentially the Oracle protocol lets you do the following operations: open, parse, describe, bind, execute, fetch, close. A request from the client to the server specifies what operations it wants to perform on a sql statement. So a client could request to do all seven operations (which is essentially what the current postgres protocol does today). Or it could issue an open,parse call which essentially is that same thing as the PREPARE sql statement, followed by a describe,bind,execute,fetch which is similar to an EXECUTE and FETCH sql statement and finally a close which is similar to a CLOSE and DEALLOCATE sql. The describe request is generally only done once even though you may do multiple fetchs (unlike todays protocol which includes the describe information on every fetch, even if you are fetching one row at a time). 
The oracle approach gives the client complete flexibility to do a lot, without requiring that the client start parsing sql statements and doing things like appending on DECLARE CURSOR or FETCH in order to reformat the application's sql statement into the postgresql way of doing this. --Barry ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
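Purely to make the shape of that phased flow visible, here is an imagined client-side interface (every name below is hypothetical; it is neither Oracle's OCI nor anything in the PostgreSQL driver):

    // Imagined interface sketching the open/parse/describe/bind/execute/fetch/close
    // phases described above. A client runs only the phases it needs: parse and
    // describe once, then bind/execute/fetch repeatedly, and close when finished.
    interface PhasedStatement {
        void parse(String sql);                // ship the SQL once; server keeps the plan
        ColumnDescription[] describe();        // row metadata, requested a single time
        void bind(int position, Object value); // supply parameter values
        void execute();
        Object[][] fetch(int maxRows);         // repeatable; no re-describe on each fetch
        void close();                          // release the server-side statement
    }

    class ColumnDescription {
        String name;
        int typeOid;
    }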
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: Tom Lane wrote: See binary cursors ... Generally that is not an option. It either requires users to code to postgresql specific sql syntax, or requires the driver to do it magically for them. Fair enough. I don't see anything much wrong with a GUC option that says send SELECT output in binary format. This is not really a protocol issue since the ASCII and BINARY choices both exist at the protocol level --- there is nothing in the protocol saying binary data can only be returned by FETCH and not by SELECT. The main problem with it in present releases is that binary data is architecture-dependent and so encouraging its general use seems like a really bad idea. But if we manage to get send/receive conversion routines in there, most of that issue would go away. That would be great. The describe request is generally only done once even though you may do multiple fetchs (unlike todays protocol which includes the describe information on every fetch, even if you are fetching one row at a time). I'm less than excited about changing that, because it breaks clients that don't want to remember past RowDescriptions (libpq being the front-line victim), and it guarantees loss-of-synchronization failures anytime the client misassociates rowdescription with query. In exchange for that, we get what exactly? Fetching one row at a time is *guaranteed* to be inefficient. The correct response if that bothers you is to fetch multiple rows at a time, not to make a less robust protocol. I don't feel strongly either way on this one, but IIRC the SQL standard for cursors only specifies fetching one record at a time (at least that is how MSSQL and DB2 implement it). Thus portable code is likely to only fetch one record at a time. The current row description isn't too big, but with the changes being suggested it might become so. thanks, --Barry ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Dave Page wrote: I don't know about JDBC, but ODBC could use it, and it would save a heck of a lot of pain in apps like pgAdmin that need to figure out if a column in an arbitrary resultset might be updateable. At the moment there is some nasty code in pgAdmin II that attempts to parse the SQL statement to figure out if the resultset is updateable by trying to figure out the number of relations in the query, whether any of them is a view or sequence, whether there are any function calls or expressions in the attribute list and so on. It then has to try to figure out if there is a complete pkey in the resultset that can be used for the update, or whether it should attempt an update based on all existing values. That code is just plain nasty in VB. In pgAdmin III we've already mentioned stealing bits of the PostgreSQL parser. I will just add a 'me too' here. This would be very useful for JDBC as well. We go through the same hoops to support the jdbc spec that Dave does. The jdbc spec has two features that require this level of information: 1) For every result set you can ask for a ResultSetMetaData object. This object provides you with the following methods:

getColumnCount()
isAutoIncrement(int column)
isCaseSensitive(int column)
isSearchable(int column)
isNullable(int column)
getColumnDisplaySize(int column)
getColumnLabel(int column)
getColumnName(int column)
getSchemaName(int column)
getPrecision(int column)
getScale(int column)
getTableName(int column)
getColumnTypeName(int column)
isReadOnly(int column)
isWritable(int column)
isDefinitelyWritable(int column)

Now one can state that the spec is broken and it doesn't make sense to ask this type of information about a query (and frankly I would agree with you), but that doesn't mean that I don't need to support it anyway. So anything that the server can do to make this easier is greatly appreciated. And I believe ODBC has almost the exact same issue, since in general the JDBC spec was copied from the ODBC spec. 2) Updateable result sets. The jdbc spec allows the user to declare any select statement to be updateable. This means that as you scroll through the result (the ResultSet object) you can modify the data and expect the jdbc driver to reflect that change back to the base tables. The following is from the JDBC API doc:

* A set of updater methods were added to this interface
* in the JDBC 2.0 API (JavaTM 2 SDK, Standard Edition, version 1.2).
* The comments regarding parameters to the getter methods also apply to
* parameters to the updater methods.
*
* The updater methods may be used in two ways:
*
* to update a column value in the current row. In a scrollable
* ResultSet object, the cursor can be moved backwards and forwards,
* to an absolute position, or to a position relative to the current row.
* The following code fragment updates the NAME column in the fifth row
* of the ResultSet object rs and then uses the method updateRow
* to update the data source table from which rs was derived.
*
*   rs.absolute(5); // moves the cursor to the fifth row of rs
*   rs.updateString("NAME", "AINSWORTH"); // updates the
*      // NAME column of row 5 to be AINSWORTH
*   rs.updateRow(); // updates the row in the data source
*
* to insert column values into the insert row. An updatable
* ResultSet object has a special row associated with it that serves
* as a staging area for building a row to be inserted.
* The following code fragment moves the cursor to the insert row, builds
* a three-column row, and inserts it into rs and into
* the data source table using the method insertRow.
*
*   rs.moveToInsertRow(); // moves cursor to the insert row
*   rs.updateString(1, "AINSWORTH"); // updates the
*      // first column of the insert row to be AINSWORTH
*   rs.updateInt(2, 35); // updates the second column to be 35
*   rs.updateBoolean(3, true); // updates the third row to true
*   rs.insertRow();
*   rs.moveToCurrentRow();

Now application developers love this functionality. It allows them to implement fairly complex apps with very little sql knowledge. They only need to know how to do a simple select statement and that is it. The jdbc driver handles the rest for them automatically (updates, inserts, deletes). As a jdbc maintainer I personally hate this functionality, as it is a real pain to implement and can't work in any but the most simple select statements. But it is part of the spec and needs to be supported in the best manner possible. thanks, --Barry ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
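For a sense of what the driver has to synthesize behind updateRow(), here is a hedged sketch; the table and column names ("emp", "id", "name") are invented, and a real implementation first has to discover the underlying table and its primary key from metadata queries:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    class UpdateRowSketch {
        // Roughly the statement a driver must generate when updateRow() is called on
        // an updatable result set: an UPDATE keyed on the primary key of the single
        // underlying table. Only works when the select maps 1:1 onto one table.
        static void pushRowChange(Connection conn, int pkValue, String newName)
                throws SQLException {
            PreparedStatement ps = conn.prepareStatement(
                "UPDATE emp SET name = ? WHERE id = ?");
            ps.setString(1, newName);
            ps.setInt(2, pkValue);
            ps.executeUpdate();
            ps.close();
        }
    }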
Re: [JDBC] [HACKERS] Wrong charset mappings
I don't see any jdbc specific requirements here, other than the fact that jdbc assumes that the following conversions are done correctly: dbcharset - utf8 - java/utf16 where the dbcharset to/from utf8 conversion is done by the backend and the utf8 to/from java/utf16 is done in the jdbc driver. Prior to 7.3 the jdbc driver did the entire conversion itself. However versions of the jdk prior to 1.4 do a terrible job when it comes to the performance of the conversion. So for a significant speed up in 7.3 we moved most of the work to the backend. thanks, --Barry Thomas O'Dowd wrote: Hi Ishii-san, Thanks for the reply. Why was the particular change made between 7.2 and 7.3? It seems to have moved away from the standard. I found the following file... src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl Which generates the mappings. I found it references 3 files from unicode organisation, namely: http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/JIS0201.TXT http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/JIS0208.TXT http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/JIS0212.TXT The JIS0208.TXT has the line... 0x8160 0x2141 0x301C # WAVE DASH 1st col is sjis, 2nd is EUC - 0x8080, 3rd is utf16. Incidently those mapping files are marked obsolete but I guess the old mappings still hold. I guess if I run the perl script it will generate a mapping file different to what postgresql is currently using. It might be interesting to pull out the diffs and see what's right/wrong. I guess its not run anymore? I can't see how the change will affect the JDBC driver. It should only improve the situation. Right now its not possible to go from sjis - database (utf8) - java (jdbc/utf16) - sjis for the WAVE DASH character because the mapping is wrong in postgresql. I'll cc the JDBC list and maybe we'll find out if its a real problem to change the mapping. Changing the mapping I think is the correct thing to do from what I can see all around me in different tools like iconv, java 1.4.1, utf-8 terminal and any unicode reference on the web. What do you think? Tom. On Wed, 2003-02-12 at 22:30, Tatsuo Ishii wrote: I think the problem you see is due to the the mapping table changes between 7.2 and 7.3. It seems there are more changes other than u301c. Moreover according to the recent discussion in Japanese local mailing list, 7.3's JDBC driver now relies on the encoding conversion performed by the backend. ie. The driver issues set client_encoding = 'UNICODE'. This problem is very complex and I need time to find good solution. I don't think simply backout the changes to the mapping table solves the problem. Hi all, One Japanese character has been causing my head to swim lately. I've finally tracked down the problem to both Java 1.3 and Postgresql. The problem character is namely: utf-16: 0x301C utf-8: 0xE3809C SJIS: 0x8160 EUC_JP: 0xA1C1 Otherwise known as the WAVE DASH character. The confusion stems from a very similar character 0xFF5E (utf-16) or 0xEFBD9E (utf-8) the FULLWIDTH TILDE. Java has just lately (1.4.1) finally fixed their mappings so that 0x301C maps correctly to both the correct SJIS and EUC-JP character. Previously (at least in 1.3.1) they mapped SJIS to 0xFF5E and EUC to 0x301C, causing all sorts of trouble. Postgresql at least picked one of the two characters namely 0xFF5E, so conversions in and out of the database to/from sjis/euc seemed to be working. 
Problem is when you try to view utf-8 from the database or if you read the data into java (utf-16) and try converting to euc or sjis from there. Anyway, I think postgresql needs to be fixed for this character. In my opinion what needs to be done is to change the mappings...

euc-jp -> utf-8    -> euc-jp
============================
0xA1C1 -> 0xE3809C -> 0xA1C1

sjis   -> utf-8    -> sjis
============================
0x8160 -> 0xE3809C -> 0x8160

As to what to do with the current mapping of 0xEFBD9E (utf-8)? It probably should be removed. Maybe you could keep the mapping back to the sjis/euc characters to help backward compatibility though. I'm not sure what is the correct approach there. If anyone can tell me how to edit the mappings under: src/backend/utils/mb/Unicode/ and rebuild postgres to use them, then I can test this out locally. Just edit src/backend/utils/mb/Unicode/*.map and rebuild PostgreSQL. Probably you might want to modify utf8_to_euc_jp.map and euc_jp_to_utf8.map. -- Tatsuo Ishii ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
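Since the observed behavior depends on which charset tables the JDK ships (1.3.x and 1.4.1 differ, as discussed above), a small probe that just prints the bytes each mapping produces is more useful than asserting specific results; the charset names below assume they are available in the running JDK:

    public class WaveDashProbe {
        public static void main(String[] args) throws Exception {
            char[] candidates = { '\u301C' /* WAVE DASH */, '\uFF5E' /* FULLWIDTH TILDE */ };
            String[] charsets = { "UTF-8", "EUC-JP", "Shift_JIS" };
            for (int i = 0; i < candidates.length; i++) {
                for (int j = 0; j < charsets.length; j++) {
                    // Show how this JDK maps the character into each encoding.
                    byte[] b = String.valueOf(candidates[i]).getBytes(charsets[j]);
                    StringBuffer hex = new StringBuffer();
                    for (int k = 0; k < b.length; k++) {
                        hex.append(" 0x").append(Integer.toHexString(b[k] & 0xFF).toUpperCase());
                    }
                    System.out.println("U+" + Integer.toHexString(candidates[i]).toUpperCase()
                        + " as " + charsets[j] + " ->" + hex);
                }
            }
        }
    }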
[HACKERS] Bug: Re: [JDBC] Warning on transaction commit
Jeremy, This appears to be a bug in the database. I have been able to reproduce it. It appears that the new 'autocommit' functionality in 7.3 has a problem. The jdbc driver is essentially issuing the following sql in your example:

set autocommit = off;   -- result of the setAutoCommit(false) call
delete ...
insert ...
commit;
select 1; commit; set autocommit = on;   -- result of the setAutoCommit(true) call

Note that the last line is one call to the server issuing three sql statements together. (The reason for the 'select 1' and the commit is to ensure a transaction is in progress before committing it and then turning on autocommit.) If you do the exact same calls in psql it works. But the interesting thing is that psql will take that last line and break it into three calls to the server. So if you issue them as one call there is different behavior than if you issue them in three calls. So the bug is: if a statement that starts a transaction is in the same set of statements as a commit or rollback, the commit or rollback will not work correctly. In the example of the problem in the jdbc driver the warning can be ignored, however consider the following example which would be more of a problem:

set autocommit = off; insert ...; commit; rollback;

In this case, even though the client application issued a commit, the commit would be ignored and the insert would never be committed. thanks, --Barry Jeremy Buchmann wrote: Hi all, I'm getting a warning message in my logs that looks like this: WARNING: COMMIT: no transaction in progress when I run this code:

dbh.setAutoCommit(false);
Statement doEvents = dbh.createStatement();
doEvents.executeUpdate("DELETE FROM ...");
doEvents.executeUpdate("INSERT INTO ...");
dbh.commit();
dbh.setAutoCommit(true);

The statements do what they're supposed to do, but I don't understand why I'm getting that warning message. Anyone know why? I'm running PostgreSQL 7.3.1 and using the PostgreSQL 7.3 JDBC 2 driver that I just downloaded a couple days ago from jdbc.postgresql.org. Thanks, --Jeremy ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] JDBC access is broken in 7.3 beta
Mats, Patch applied. (I also fixed the 'length' problem you reported as well.) thanks, --Barry Mats Lofkvist wrote: (I posted this on the bugs and jdbc newsgroups last week but have seen no response. Imho, this really needs to be fixed since the bug makes it impossible to use the driver in a multithreaded environment, so I'm reposting to hackers and patches.) _ Mats Lofkvist [EMAIL PROTECTED] The optimization added in src/interfaces/jdbc/org/postgresql/core/Encoding.java version 1.7 breaks JDBC since it is not thread safe. The new method decodeUTF8() uses a static (i.e. class member) buffer but is synchronized on the instance, so it won't work with multiple instances used in parallel by multiple threads. (Quick and dirty patch below.) (The method also isn't using the 'length' parameter correctly, but since offset always seems to be zero, this bug doesn't show up.) _ Mats Lofkvist [EMAIL PROTECTED]

*** org/postgresql/core/Encoding.java~  Sun Oct 20 04:55:50 2002
--- org/postgresql/core/Encoding.java   Fri Nov 8 16:13:20 2002
***************
*** 233,239 ****
     */
    private static final int pow2_6 = 64;     // 2^6
    private static final int pow2_12 = 4096;  // 2^12
!   private static char[] cdata = new char[50];

    private synchronized String decodeUTF8(byte data[], int offset, int length)
    {
        char[] l_cdata = cdata;
--- 233,239 ----
     */
    private static final int pow2_6 = 64;     // 2^6
    private static final int pow2_12 = 4096;  // 2^12
!   private char[] cdata = new char[50];

    private synchronized String decodeUTF8(byte data[], int offset, int length)
    {
        char[] l_cdata = cdata;

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
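The bug pattern in the pre-patch code is worth spelling out, since it is easy to reintroduce: a buffer shared by every instance, guarded only by a per-instance lock. A condensed illustration (not the actual Encoding.java source; it assumes inputs no longer than the buffer):

    // Broken pattern: 'buf' is shared by ALL instances, but 'synchronized' only locks
    // 'this', so two threads using two different instances can write into 'buf'
    // at the same time and return each other's characters.
    class SharedBufferDecoder {
        private static char[] buf = new char[256];

        synchronized String decode(byte[] data) {
            for (int i = 0; i < data.length; i++)
                buf[i] = (char) (data[i] & 0x7F);
            return new String(buf, 0, data.length);
        }
    }

    // Fixed pattern (what the patch does): make the buffer per-instance, so the
    // per-instance lock is enough.
    class PerInstanceBufferDecoder {
        private char[] buf = new char[256];

        synchronized String decode(byte[] data) {
            for (int i = 0; i < data.length; i++)
                buf[i] = (char) (data[i] & 0x7F);
            return new String(buf, 0, data.length);
        }
    }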
[Fwd: Re: [HACKERS] PG functions in Java: maybe use gcj?]
Forwarding to hackers a discussion that has been happening off list. --Barry Original Message Subject: Re: [HACKERS] PG functions in Java: maybe use gcj? Date: 01 Nov 2002 19:13:39 + From: Oliver Elphick [EMAIL PROTECTED] To: Barry Lind [EMAIL PROTECTED] References: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] 1036153748.24598.70.camel@linda [EMAIL PROTECTED] On Fri, 2002-11-01 at 18:19, Barry Lind wrote: Oliver, Thanks for the explaination. This makes sense. However I think the proposal Tom was suggesting would require linking in more than just the runtime. Since create function would need to call the compiler itself to compile the function into java bytecode or directly to a .so. If we had to supply gcj along with PostgreSQL in order for PostgreSQL to work, I guess that would mean gcj was incorporated in PostgreSQL - that would mean PostgreSQL would become subject to GPL protection. If CREATE FUNCTION called a java compiler (which might or might not be gcj) that PostgreSQL did not supply, that compiler would not be incorporated. If there were no java compiler, java functions would not work, but otherwise PostgreSQL would be unaffected. It would be the same as any of the tools that get called in our shell scripts, many of which may be GNU utilities, including, of course, the shell itself. As for the virtual machine, I think that is what libgcj supplies, and it is specifically excluded from having GPL effects by its licence. (You didn't post to the list, so I haven't; but I have no objection to your forwarding this there.) -- Oliver Elphick[EMAIL PROTECTED] Isle of Wight, UK http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C But they that wait upon the LORD shall renew their strength; they shall mount up with wings as eagles; they shall run, and not be weary; and they shall walk, and not faint.Isaiah 40:31 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] PG functions in Java: maybe use gcj?
Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: In either case I am concerned about licensing issues. gcj is not under a BSD style license. Depending on what you need you are either dealing with regular GPL, LGPL, or LGPL with a special java exception. I believe (without giving it too much thought) that doing either 1 or 2 above would end up linking GPL code into postgres. This can be worked around by requiring that the necessary gcj libraries be installed separately and detected at configure time (like is done elsewhere). But it does (I think) present a problem for commercial products that would like to redistribute postgres with pljava. Good point, but unless you want to build a BSD-license Java implementation, there will never be a pljava that doesn't have different licensing restrictions than PG itself does. gcj is at least more free than either Sun's or IBM's JVM ... It depends on what you mean by more free. An architecture that interacts with an external jvm would let you use any jvm (free ones as well as others). From a licensing standpoint it is generally easy to redistribute a jvm or expect the user to have one installed (most java based products out there today do this). However in the proposal here we are talking about requiring a specific jvm (gcj) and actually linking parts of it into postgres. To the extent that GPL code is linked in, the GPL extends to the entire code base. As I said previously there are ways to work around this, but it becomes tricky. Especially when a commercial product wants to bundle postgres and pljava. That resulting bundle is probably entirely under the GPL, and then any changes to it are also GPL. So it could be the case that this company would be prevented from submitting improvements they made back to the core product, because their improvements are GPLed as a result of pljava. Now having said all that, I have been monitoring the progress of gcj for some time because I think there are very interesting possibilities. And I am all for anyone who wants to look into it further and investigate the possibilities. I just want to raise the licensing issue because it can cause problems and it is better to think about them up front than after the fact. Another challenge here is that the java code is going to want to use the jdbc api when communicating with the database. Yes. I think we'd need a new implementation of jdbc that sits atop SPI (invoked via jni I guess) rather than a FE/BE connection. How well layered is our jdbc code --- would this mean a large rewrite, or just rolling in a new bottom layer? It isn't as well layered as it could be, but it isn't too bad. Overall it shouldn't be too much work, but not a little project either. One area that isn't well layered is the assumption that the raw data from the server is in text format, since that is what the FE/BE protocol provides. So all the conversion functions that convert to/from java datatypes do so in this format. This assumption runs deep into the code. As a first pass it would be easiest to get raw data from SPI, convert it to text and then convert to java datatypes, instead of going directly from the internal SPI format to java datatypes. This could be improved upon later. regards, tom lane thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] PG functions in Java: maybe use gcj?
I am not sure I follow. Are you suggesting: 1) create function takes java source and then calls gcj to compile it to native and build a .so from it that would get called at runtime? or 2) create function takes java source and just compiles it to java .class files and the runtime invokes the gcj java interpreter. Or I guess you could do both at the same time. In either case I am concerned about licensing issues. gcj is not under a BSD style license. Depending on what you need you are either dealing with regular GPL, LGPL, or LGPL with a special java exception. I believe (without giving it too much thought) that doing either 1 or 2 above would end up linking GPL code into postgres. This can be worked around by requiring that the necessary gcj libraries be installed separately and detected at configure time (like is done elsewhere). But it does (I think) present a problem for commercial products that would like to redistribute postgres with pljava. Another challenge here is that the java code is going to want to use the jdbc api when communicating with the database. One difficulty here is getting jdbc to be part of the same transaction as the calling java function, such that if the java stored procedure selects or updates data it is doing it in the same transaction as the caller of the function. Today the jdbc driver only knows how to communicate via the FE/BE protocol, which will end up creating a new process and transaction. The jdbc driver would need to not use the FE/BE protocol but instead probably use jni calls. thanks, --Barry Tom Lane wrote: I had an interesting conversation today with Tom Tromey and Andrew Haley of Red Hat about how to implement pljava for Postgres. Rather than futzing with an external JVM, their thought is to use gcj (gcc compiling Java). It sounds like this approach would mostly just work, modulo needing to use a small amount of C++ code to call the defined APIs for gcj. This would not be a perfect solution: gcj isn't yet ported everywhere, and it would probably not play nice on machines where the standard C library isn't thread-safe. But it seems a lot more within reach than the approaches we've discussed in the past. I'm not volunteering to try to do this, but I wanted to toss the idea up in the air and see if anyone wants to try it. Tom and Andrew indicated they'd be willing to help out with advice etc for anyone who wants to take on the project. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
[HACKERS] interesting side effect of autocommit = off
After turning autocommit off on my test database, my cron scripts that vacuum the database are now failing. This can be easily reproduced, turn autocommit off in your postgresql.conf, then launch psql and run a vacuum. [blind@blind databases]$ psql files Welcome to psql 7.3b2, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit files=# vacuum; ERROR: VACUUM cannot run inside a BEGIN/END block files=# It turns out that you need to commit/rollback first before you can issue the vacuum command. While I understand why this is happening (psql is issuing some selects on startup which automatically starts a transaction) it certainly isn't intuitive. Does this mean that I need to change my cron scripts to do rollback; vacuum;? thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] experiences with autocommit functionality in 7.3
I was spending some time investigating how to fix the jdbc driver to deal with the autocommit functionality in 7.3. I am trying to come up with a way of using 'set autocommit = on/off' as a way of implementing the jdbc semantics for autocommit. The current code just inserts a 'begin' after every commit or rollback when autocommit is turned off in jdbc. I can continue to use the old way and just issue a 'set autocommit = on' at connection initialization, but I wanted to investigate if using 'set autocommit = off' would be a better implementation. The problem I am having is finding a way to turn autocommit on or off without generating warning messages, or without having the change accidentally rolled back later. Below is the current behavior (based on a fresh pull from cvs this morning):

Key:
  ACon  = autocommit on
  ACoff = autocommit off
  NIT   = not in transaction
  IT    = in transaction
  IT*   = in transaction where a rollback will change autocommit state

Current State     Action      End State
ACon and NIT      set ACon    ACon and NIT
ACon and NIT      set ACoff   ACoff and IT*
ACon and IT       set ACon    ACon and IT
ACon and IT       set ACoff   ACoff and IT*
ACon and IT*      set ACon    ACon and IT*
ACon and IT*      set ACoff   ACoff and IT
ACoff and NIT     set ACon    ACon and NIT
ACoff and NIT     set ACoff   ACoff and IT
ACoff and IT      set ACon    ACon and IT*
ACoff and IT      set ACoff   ACoff and IT
ACoff and IT*     set ACon    ACon and IT
ACoff and IT*     set ACoff   ACoff and IT*

There are two conclusions I have drawn from this: 1) Without the ability for the client to know the current transaction state it isn't feasible to use set autocommit = on/off in the client. There will either end up being spurious warning messages about transaction already in process or no transaction in process, or situations where a subsequent rollback can undo the change. So I will stay with the current functionality in the jdbc driver until the FE/BE protocol provides access to the transaction status. 2) In one place the current functionality doesn't make sense (at least to me):

ACon and NIT      set ACoff   ACoff and IT*

If I am running in autocommit mode and I issue a command I expect that command to be committed. But that is not the case here. I would have expected the result to be: ACoff and NIT thanks, --Barry ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Difference between 7.2 and 7.3, possible bug?
create table test (col_a bigint);
update test set col_a = nullif('200', -1);

The above works fine on 7.2 but the update fails on 7.3b2 with the following error:

ERROR: column col_a is of type bigint but expression is of type text
You will need to rewrite or cast the expression

Is this change in behavior intentional or is it a bug? This situation is occurring because of two changes. The first being the difference in how the server is handling the above update in 7.2 vs. 7.3. The second is a change in the jdbc driver in 7.3. The actual update in jdbc looks like: update test set col_a = nullif(?, -1); and a setLong(1, 200) call is being done. In 7.2 the jdbc driver bound the long/bigint value as a plain number, but in 7.3 it binds it with quotes, making it type text and exposing the change in server behavior. This change was made in the jdbc driver to work around the fact that indexes are not used for int2 or int8 columns unless the value is enclosed in quotes (or an explicit cast is used). I am not sure if the recent changes for implicit casts fix this index usage problem in the server or not. So I have three options here: 1) if this is a server bug, wait for a fix for 7.3 2) revert the jdbc driver back to not quoting int2 and int8 values - If the server now handles using indexes on int2/int8 columns then this should be done anyway - If the server still has problems with using indexes without the quotes then this removes an often requested bugfix/workaround for the index usage problem 3) Just have people rework their sql to avoid the change in behavior Any suggestions? thanks, --Barry ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
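For reference, the explicit-cast alternative mentioned above (the other way, besides quoting, to get an int8 index used) looks roughly like this at the application level. This is a hedged sketch using the table from the example; whether the cast is still needed depends on the driver and server version:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    class Int8CastSketch {
        static ResultSet lookup(Connection conn, long key) throws SQLException {
            // The explicit ::int8 cast lets the planner match an int8 index without
            // relying on the driver quoting the bound value as text.
            PreparedStatement ps = conn.prepareStatement(
                "SELECT col_a FROM test WHERE col_a = ?::int8");
            ps.setLong(1, key);
            return ps.executeQuery();
        }
    }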
Re: [HACKERS] Difference between 7.2 and 7.3, possible bug?
Tom Lane wrote: I would say that that is a very bad decision in the JDBC driver and should be reverted ... especially if the driver is not bright enough to notice the context in which the parameter is being used. Consider for example ... You are trying to mask a server problem in the driver. This is not a good idea. The server problem is short-term (yes, we've finally agreed how to fix it, and it will happen in 7.4), but a client-library hack to mask it will cause problems indefinitely. regards, tom lane Tom, Thanks for the quick reply. I will back out the jdbc change. It was one of those changes I did reluctantly. I have been pushing back for a couple of releases now, saying that this is a server bug and needs to be fixed there. But it didn't seem like that was ever going to happen, so I finally gave in. For some users this bug of not using indexes for int2/int8 makes it impossible for them to use postgres. This happens for users who are using a database abstraction layer that doesn't allow the user to actually touch the sql being sent to the server. Therefore they have no opportunity to work around the underlying bug and can't use postgres as their database because of the performance problems. I am glad to hear this is finally getting resolved for 7.4. thanks, --Barry ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [JDBC] [HACKERS] problem with new autocommit config parameter
I am waiting for this thread to conclude before deciding exactly what to do for the jdbc driver for 7.3. While using the 'set autocommit true' syntax is nice when talking to a 7.3 server, the jdbc driver also needs to be backwardly compatible with 7.2 and 7.1 servers. So it may just be easier to continue with the current way of doing things, even in the 7.3 case. thanks, --Barry Curt Sampson wrote: On Mon, 9 Sep 2002, Tom Lane wrote: If autocommit=off really seriously breaks JDBC then I don't think a simple SET command at the start of a session is going to do that much to improve robustness. What if the user issues another SET to turn it on? You mean, to turn it off again? The driver should catch this, in theory. In practice we could probably live with saying, Don't use SET AUTOCOMMIT; use the methods on the Connection class instead. Probably the driver should be changed for 7.3 just to use the server's SET AUTOCOMMIT functionality cjs ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [JDBC] [HACKERS] problem with new autocommit config parameter
Daryl, The problem is an incompatibility between a new server autocommit feature and the existing jdbc autocommit feature. The problem manifests itself when you turn autocommit off on the server (which is new functionality in 7.3). If you leave autocommit turned on on the server (which is the way the server has always worked until 7.3) the jdbc driver correctly handles issuing the correct begin/commit/rollback commands to support autocommit functionality in the jdbc driver. Autocommit will work with jdbc in 7.3 (and it does now, as long as you leave autocommit set on in the postgresql.conf file). We just need to decide what to do in this one corner case. thanks, --Barry Daryl Beattie wrote: Dear PostgreSQL people, Sorry for jumping into this conversation in the middle. Autocommit is very important, as appservers may turn it on or off at will in order to support EJB transactions (being able to set them up, roll them back, commit them, etc. by using the JDBC API). If it is broken, then all EJB apps using PostgreSQL may be broken also. ...This frightens me a little. Could somebody please explain? Sincerely, Daryl. -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED]] Sent: Monday, September 09, 2002 2:54 PM To: Bruce Momjian Cc: Barry Lind; [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: [JDBC] [HACKERS] problem with new autocommit config parameter and jdbc Bruce Momjian [EMAIL PROTECTED] writes: Barry Lind wrote: How should client interfaces handle this new autocommit feature? Is it best to just issue a set at the beginning of the connection to ensure that it is always on? Yes, I thought that was the best fix for apps that can't deal with autocommit being off. If autocommit=off really seriously breaks JDBC then I don't think a simple SET command at the start of a session is going to do that much to improve robustness. What if the user issues another SET to turn it on? I'd suggest just documenting that it is broken and you can't use it, until such time as you can get it fixed. Band-aids that only partially cover the problem don't seem worth the effort to me. In general I think that autocommit=off is probably going to be very poorly supported in the 7.3 release. We can document it as being work in progress, use at your own risk. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [JDBC] [HACKERS] problem with new autocommit config parameter
Yes it is possible, but according to the jdbc spec, a new connection in jdbc is always initialized to autocommit=true. So jdbc needs to ignore whatever the current server setting is and reset to autocommit=true. --Barry snpe wrote: On Saturday 07 September 2002 02:55 am, Bruce Momjian wrote: Barry Lind wrote: Haris, You can't use jdbc (and probably most other postgres clients) with autocommit in postgresql.conf turned off. Hackers, How should client interfaces handle this new autocommit feature? Is it best to just issue a set at the beginning of the connection to ensure that it is always on? Yes, I thought that was the best fix for apps that can't deal with autocommit being off. Can client get information from backend for autocommit (on or off) and that work like psql ? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
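A minimal sketch of that connection-initialization reset (the version check and helper name are assumptions for illustration, not the driver's actual logic):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    class ConnectionInit {
        // Force the JDBC-mandated initial state on servers that have the GUC,
        // regardless of what postgresql.conf says.
        static void resetAutoCommit(Connection conn, int serverMajor, int serverMinor)
                throws SQLException {
            if (serverMajor > 7 || (serverMajor == 7 && serverMinor >= 3)) {
                Statement s = conn.createStatement();
                s.execute("set autocommit = on");
                s.close();
            }
        }
    }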
[HACKERS] Interesting results using new prepared statements
In testing the new 7.3 prepared statement functionality I have come across some findings that I cannot explain. I was testing using PREPARE for a fairly complex sql statement that gets used frequently in my application. I used the timing information from:

show_parser_stats = true
show_planner_stats = true
show_executor_stats = true

The timing information showed that 60% of the time was in the parse and planning, and 40% was in the execute for the original statement. This indicated that this statement was a good candidate for using the new PREPARE functionality. Now for the strange part. When looking at the execute timings as shown by 'show_executor_stats' under three different scenarios I see:

regular execute      = 787ms  (regular sql execution, not using prepare at all)
prepare execute      = 737ms  (execution of a prepared statement via EXECUTE with no bind variables, all values are hardcoded into the prepared sql statement)
prepare/bind execute = 693ms  (same as above, but using bind variables)

These results were consistent across multiple runs. I don't understand why the timings for prepared statements would be less than for a regular statement, and especially why using bind variables would be better than without. I am concerned that prepared statements may be choosing a different execution plan than non-prepared statements. But I am not sure how to find out what the execution plan is for a prepared statement, since EXPLAIN doesn't work for a prepared statement (i.e. EXPLAIN EXECUTE preparedStatementName doesn't work). I like the fact that the timings are better in this particular case (up to 12% better), but since I don't understand why that is, I am concerned that under different circumstances they may be worse. Can anyone shed some light on this? thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
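The shape of the comparison being described, expressed through plain SQL statements issued from JDBC (the query here is a stand-in for the real application statement, and exactly how EXECUTE behaves through a driver depends on the driver version):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    class PrepareTimingSketch {
        static void run(Connection conn) throws SQLException {
            Statement st = conn.createStatement();

            // Regular execution: parsed and planned on every call.
            st.executeQuery("SELECT count(*) FROM t WHERE x = 42").close();

            // Prepared execution: parsed and planned once, executed many times.
            st.execute("PREPARE q (int) AS SELECT count(*) FROM t WHERE x = $1");
            st.executeQuery("EXECUTE q (42)").close();   // the bind-variable variant
            st.execute("DEALLOCATE q");

            st.close();
        }
    }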
[HACKERS] problem with new autocommit config parameter and jdbc
Haris, You can't use jdbc (and probably most other postgres clients) with autocommit in postgresql.conf turned off. Hackers, How should client interfaces handle this new autocommit feature? Is it best to just issue a set at the beginning of the connection to ensure that it is always on? thanks, --Barry snpe wrote: Hi Dave, That is same. Program work with and without quote but row don't deleted. Postgresql is 7.3 beta (from cvs) and parameter autocommit in postgresql.conf is off (no auto commit). I am tried with db.autocommit(true) after getConnection, but no success. I think that is bug in JDBC. PGSql 7.3 beta have new features autocommit on/off and JDBC driver don't work with autocommit off. Thanks P.S. I am play with Oracle JDeveloper 9i and Postgresql and I get error in prepared statement like this error : (oracle.jbo.SQLStmtException) JBO-27123: SQL error during call statement preparation. Statement: DELETE FROM org_ban WHERE id=? and pgsqlerror is : (org.postgresql.util.PSQLException) Malformed stmt [DELETE FROM org_ban WHERE id=?] usage : {[? =] call some_function ([? [,?]*]) } I think that JDeveloper call CallableStatement for insert or delete (select and update work fine), but I don't know how. On Friday 06 September 2002 04:35 pm, Dave Cramer wrote: Remove the quotes around id, and let me know what happens Dave On Fri, 2002-09-06 at 10:52, snpe wrote: Hello Dave, There isn't any error. Program write 'Rows deleted 1', but row hasn't been deleted Thanks Haris Peco On Friday 06 September 2002 04:05 pm, Dave Cramer wrote: Harris, What error do you get? Also you don't need the quotes around id Dave On Fri, 2002-09-06 at 10:06, snpe wrote: Hello, I have simple table with column ID and values '4' in this. I use 7.3 beta1 (from cvs 05.09.2002) and autocommit off in postgresql.conf. Next program don't work. I am tried with compiled postgresql.jar from CVS and with pg73b1jdbc3.jar from 05.09.2002 on jdbc.postgresql.org What is wrong ? regards Haris Peco

import java.io.*;
import java.sql.*;
import java.text.*;

public class PrepStatTest {
    Connection db;
    String stat = "DELETE FROM org_ban WHERE \"id\" = ?";
    String delid = "4";

    public PrepStatTest() throws ClassNotFoundException, FileNotFoundException, IOException, SQLException {
        Class.forName("org.postgresql.Driver");
        db = DriverManager.getConnection("jdbc:postgresql://spnew/snpe", "snpe", "snpe");
        PreparedStatement st = db.prepareStatement(stat);
        st.setString(1, delid);
        int rowsDeleted = st.executeUpdate();
        System.out.println("Rows deleted " + rowsDeleted);
        db.commit();
        st.close();
        db.close();
    }

    public static void main(String args[]) {
        try {
            PrepStatTest test = new PrepStatTest();
        } catch (Exception ex) {
            System.err.println("Exception caught.\n" + ex);
            ex.printStackTrace();
        }
    }
}

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] possible vacuum improvement?
Wouldn't it make sense to implement autovacuum information in a structure like the FSM, a Dirty Space Map (DSM)? As blocks are dirtied by transactions they can be added to the DSM. Then vacuum can give priority processing to those blocks only. The reason I suggest this is that in many usage scenarios it will be more efficient to vacuum only part of a table than the entire table. Given a large table that grows over time, it tends to be the case that older data in the table becomes more static as it ages. A lot of financial data is like this: when it is initially created it may get a lot of updates early in its life and may even be deleted, but once the data gets older (for example a year old), it is unlikely to change. This would imply that over time the first blocks in a table will change less and most activity will occur towards the end of the table. If you have a multi-gigabyte table where most of the activity occurs near the end, a lot of CPU cycles can be wasted going over the mostly static beginning of the table. thanks, --Barry Tom Lane wrote: Shridhar Daithankar [EMAIL PROTECTED] writes: 1)Is this sounds like a workable solution? Adding a trigger to every tuple update won't do at all. Storing the counts in a table won't do either, as the updates on that table will generate a huge amount of wasted space themselves (not to mention enough contention to destroy concurrent performance). 4)Is use of threads sounds portable enough? Threads are completely out of the question, at least if you have any hope of seeing this code get accepted into the core distro. For vacuum's purposes all that we really care to know about is the number of obsoleted tuples in each table: committed deletes and updates, and aborted inserts and updates all count. Furthermore, we do not need or want a 100% reliable solution; approximate counts would be plenty good enough. What I had in the back of my mind was: each backend counts attempted insertions and deletions in its relcache entries (an update adds to both counts). At transaction commit or abort, we know which of these two counts represents the number of dead tuples added to each relation, so while we scan the relcache for post-xact cleanup (which we will be doing anyway) we can transfer the correct count into the shared FSM entry for the relation. This gives us a reasonably accurate count in shared memory of all the tuple obsoletions since bootup, at least for heavily-used tables. (The FSM might choose to forget about lightly-used tables.) The auto vacuumer could look at the FSM numbers to decide which tables are highest priority to vacuum. This scheme would lose the count info on a database restart, but that doesn't bother me. In typical scenarios the same tables will soon get enough new counts to be highly ranked for vacuuming. In any case the auto vacuumer must be designed so that it vacuums every table every so often anyhow, so the possibility of forgetting that there were some dead tuples in a given table isn't catastrophic. I do not think we need or want a control table for this; certainly I see no need for per-table manual control over this process. There should probably be a few knobs in the form of GUC parameters so that the admin can control how much overall work the auto-vacuumer does. For instance you'd probably like to turn it off when under peak interactive load. 
regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Bug/Change in behavior for 7.3 vs 7.2.1
Then shouldn't this appear on the Open 7.3 issues list that has been circulating around? This seems like an open issue to me that needs to be addressed before 7.3 ships. --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: You can no longer insert large values into a bigint column without a cast. This seems to be fallout from the move to tighten up implicit coercions (cf http://archives.postgresql.org/pgsql-hackers/2002-04/msg00528.php as well as lots of earlier discussions). I said right along that this topic needed more debate, but we haven't gotten back to looking at it. We have a number of other nasty behaviors in current sources that trace back to altering the set of available coercions. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] CVS Messages
It is certainly possible. We have added that type of functionality to our in-house CVS system. Below is an example. We include at the bottom of the checkin mail a link to the webcvs diff page so you can quickly see what changed for a particular checkin. --Barry

wfs checkin by barry 02/08/16 12:10:39
Modified: com/.../sql/postgres SessionManagerBaseSql.java
          com/.../sql/mssql SessionManagerBaseSql.java
          com/.../sql/oracle SessionManagerBaseSql.java
Log: port fix for bug 1605 from 3.3 to 4.0
https://foo.bar.com/cgi-bin/viewcvs.cgi/wfs/com/.../sql/postgres/SessionManagerBaseSql.java.diff?r1=40.1&r2=40.2&diff_format=h
https://foo.bar.com/cgi-bin/viewcvs.cgi/wfs/com/.../sql/mssql/SessionManagerBaseSql.java.diff?r1=40.0&r2=40.1&diff_format=h
https://foo.bar.com/cgi-bin/viewcvs.cgi/wfs/com/.../sql/oracle/SessionManagerBaseSql.java.diff?r1=40.0&r2=40.1&diff_format=h

Rod Taylor wrote: Is it possible for the cvs emails to include a URL to the appropriate entries in cvs web? The below is current: Modified files: src/backend/utils/adt: ruleutils.c Is this possible? Modified files: http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/utils/adt/ruleutils.c Or perhaps have a LINKS section at the very bottom below the current messages? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] Bug/Change in behavior for 7.3 vs 7.2.1
I was just testing my product running on a 7.3 snapshot from a few days ago. And I ran into the following change in behavior that I consider a bug. You can no longer insert large values into a bigint column without a cast. Small values (in the int range) work fine though. On 7.3 I get:
files=# create table test (cola bigint);
CREATE
files=# insert into test values (99);
ERROR: column cola is of type 'bigint' but expression is of type 'double precision' You will need to rewrite or cast the expression
On 7.2.1 this works correctly:
files=# create table test (cola bigint);
CREATE
files=# insert into test values (99);
INSERT 108683 1
thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
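Until the coercion question is settled, the usual workarounds are to quote the literal (so it is read directly as the target type) or to cast it explicitly. A quick sketch against the same test table, with the small value standing in for the larger out-of-int4-range value that actually triggers the error:

insert into test values ('99');        -- quoted literal is taken directly as bigint
insert into test values (99::int8);    -- an explicit cast also works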
Re: [HACKERS] VACUUM's No one parent tuple was found, redux
Tom Lane wrote: Also, for Mario and Barry: does this test case look anything like what your real applications do? In particular, do you ever do a SELECT FOR UPDATE in a transaction that commits some changes, but does not update or delete the locked-for-update row? If not, it's possible there are yet more bugs lurking in this. This certainly seems plausible for my application. --Barry ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [PATCHES] prepareable statements
Neil Conway wrote: On Sat, Jul 20, 2002 at 10:00:01PM -0400, Tom Lane wrote: AFAICT, the syntax we are setting up with actual SQL following the PREPARE keyword is *not* valid SQL92 nor SQL99. It would be a good idea to look and see whether any other DBMSes implement syntax that is directly comparable to the feature we want. (Oracle manuals handy, anyone?) I couldn't find anything on the subject in the Oracle docs -- they have PREPARE for use in embedded SQL, but I couldn't see a reference to PREPARE for usage in regular SQL. Does anyone else know of an Oracle equivalent? Oracle doesn't have this functionality exposed at the SQL level. In Oracle the implementation is at the protocol level (i.e. sqlnet). Therefore the SQL syntax is the same when using prepared statements or when not using them. The client implementation of the sqlnet protocol decides to use prepared statements or not. As of Oracle 8, I think pretty much all of the Oracle clients use prepared statements for all the sql statements. The sqlnet protocol exposes 'open', 'prepare', 'describe', 'bind', 'fetch' and 'close'. None of these are exposed out into the SQL syntax. thanks, --Barry ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] error during vacuum full
When trying to perform a full vacuum I am getting the following error: ERROR: No one parent tuple was found Plain vacuum works fine. Thinking it might be a problem with the indexes I have rebuilt them but still get the error. What does this error indicate and what are my options to solve the problem? I am running: PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.96 on a RedHat 7.3 system. thanks, --Barry ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] error during vacuum full
Tom, It was not compiled with debug. I will do that now and see if this happens again in the future. If and when it happens again what would you like me to do? I am willing to provide you access if you need it. thanks, --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: When trying to perform a full vacuum I am getting the following error: ERROR: No one parent tuple was found Want to dig into it with gdb, or let someone else into your system to do so? I've suspected for awhile that there's still a lurking bug or three in the VACUUM FULL tuple-chain-moving code, but without an example to study it's damn hard to make any progress. Note that restarting the postmaster will probably make the problem go away, as tuple chains cannot exist unless there are old open transactions. So unless your installation is already compiled --enable-debug, there's probably not much we can learn :=( regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] error during vacuum full
Tom, I am not sure exactly what you mean by 'open client transactions' but I just killed off all client processes. The only postgres processes running are:
[root@cvs root]# ps -ef | grep post
postgres  1004     1  0 Jul03 ?  00:00:00 /usr/local/pgsql/bin/postmaster
postgres  1069  1004  0 Jul03 ?  00:00:00 postgres: stats buffer process
postgres  1070  1069  0 Jul03 ?  00:00:00 postgres: stats collector proces
I then reconnected via psql and reran the vacuum full getting the same error. --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: It was not compiled with debug. I will do that now and see if this happens again in the future. If and when it happens again what would you like me to do? I am willing provide you access if you need it. Well, first off, please confirm that killing off open client transactions (you shouldn't even need to do a full postmaster restart) makes the problem go away. Beyond that, I have no advice except to be prepared to apply a debugger next time. I believe we could fix the problem if we could examine the situation VACUUM is seeing --- but it's so far not been possible to do that, because the triggering conditions are so transient. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] error during vacuum full
Tom, No. Restarting the postmaster does not resolve the problem. I am going to put the debug build in place and see if I can still reproduce. --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: The only postgres processes running are: [root@cvs root]# ps -ef | grep post postgres 1004 1 0 Jul03 ?00:00:00 /usr/local/pgsql/bin/postmaster postgres 1069 1004 0 Jul03 ?00:00:00 postgres: stats buffer process postgres 1070 1069 0 Jul03 ?00:00:00 postgres: stats collector proces I then reconnected via psql and reran the vacuum full getting the same error. Really!? Well, does restarting the postmaster fix it? regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] error during vacuum full
Tom, Good. I am dead tired and am about to go to bed, but if you can reproduce with a debuggable build then I would definitely like to crawl into it tomorrow. Good night. We can pick this up tomorrow. At this point I do not have the faintest idea what might cause the problem to go away, so if possible I'd urge you to do the minimum possible work on the system overnight. Alternatively, if you can spare the disk space, make a tarball copy of the whole $PGDATA tree while you have the postmaster shut down. Then we can study the problem offline without worrying about your live application. I need the app up and running, but I did shut it down and created a backup of the entire directory as you suggested. thanks, --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: No. Restarting the postmaster does not resolve the problem. Now you've got my attention ;-) ... this is completely at variance with my theories about the cause of the problem. Destruction of a theory is usually a sign of impending advance. I am going to put the debug build in place and see if I can still reproduce. Good. I am dead tired and am about to go to bed, but if you can reproduce with a debuggable build then I would definitely like to crawl into it tomorrow. At this point I do not have the faintest idea what might cause the problem to go away, so if possible I'd urge you to do the minimum possible work on the system overnight. Alternatively, if you can spare the disk space, make a tarball copy of the whole $PGDATA tree while you have the postmaster shut down. Then we can study the problem offline without worrying about your live application. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Issues Outstanding for Point In Time Recovery (PITR)
I know that in Oracle there are 'alter database begin backup' and 'alter database end backup' commands that allow you to script your hot backups through a cron job by calling the begin backup command first, then using disk backup method of choice and then finally call the end backup command. --Barry Patrick Macdonald wrote: Zeugswetter Andreas SB SD wrote: As noted, one of the main problems is knowing where to begin in the log. This can be handled by having backup processing update the control file with the first lsn and log file required. At the time of the backup, this information is or can be made available. The control file can be the last file added to the tar and can contain information spanning the entire backup process. lsn and logfile number (of latest checkpoints) is already in the control file, thus you need control file at start of backup. (To reduce the number of logs needed for restore of an online backup you could force a checkpoint before starting file backup) Maybe I should have been more clear. The control file snapshot must be taken at backup start (as you mention) but can be stored in cache. The fields can then be modified as we see fit. At the end of backup, we can write this to a temp file and add it to the tar. Therefore, as mentioned, the snapshot spans the entire backup process. You will also need lsn and logfile number after file backup, to know how much log needs to at least be replayed to regain a consistent state. This is a nicety but not a necessity. If you have a backup end log record, you just have to enforce that the PIT recovery encounters that particular log record on forward recovery. Once encountered, you know that you at passed the point of back up end. Cheers, Patrick ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] [SQL] Efficient DELETE Strategies
Hannu Krosing wrote:
DELETE relation_expr FROM relation_expr [ , table_ref [ , ... ] ] [ WHERE bool_expr ]
This is in some ways similar to Oracle, where the FROM is optional in a DELETE (i.e. DELETE foo WHERE ...). By omitting the first FROM, the syntax ends up mirroring the UPDATE case:
DELETE foo FROM bar WHERE ...
UPDATE foo FROM bar WHERE ...
However I think the syntax should also support the first FROM as being optional (even though it looks confusing):
DELETE FROM foo FROM bar WHERE ...
thanks, --Barry ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
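To make the parallel concrete, here is how the two statements would line up under that proposal. The tables foo and bar are hypothetical, and the DELETE form is only the proposed syntax, not something the current grammar accepts:

UPDATE foo SET val = bar.val FROM bar WHERE foo.id = bar.id;
DELETE foo                   FROM bar WHERE foo.id = bar.id;   -- proposed form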
Re: [HACKERS] Trouble with pg_encoding_to_char
It means you are running a jdbc driver from 7.2 (perhaps 7.1, but I think 7.2) against a 6.5 database. While we try to make the jdbc driver backward compatible, we don't go back that far. You really should consider upgrading your database to something remotely current. thanks, --Barry [EMAIL PROTECTED] wrote: Hi, I've been developing a program with the postgres jdbc 2 driver, jdk-1.3.0 and postgres 6.5. When I start my program up it bombs like so:
Something unusual has occured to cause the driver to fail. Please report this exception:
Exception: java.sql.SQLException: ERROR: No such function 'pg_encoding_to_char' with the specified attributes
Stack Trace: java.sql.SQLException: ERROR: No such function 'pg_encoding_to_char' with the specified attributes
    at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:94)
    at org.postgresql.Connection.ExecSQL(Connection.java:398)
    at org.postgresql.Connection.ExecSQL(Connection.java:381)
    at org.postgresql.Connection.openConnection(Connection.java:314)
    at org.postgresql.Driver.connect(Driver.java:149)
Does anyone know what any of this means...? Regards, Youenn Université de Bretagne sud http://www.univ-ubs.fr/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] bug? in current cvs with bigint datatype
I just did a fresh build from current cvs and found the following regression from 7.2: create table test (cola bigint); update test set cola = 100; In 7.3 the update results in the following error: ERROR: column cola is of type 'bigint' but expression is of type 'double precision' You will need to rewrite or cast the expression In 7.2 the update worked. (updated 0 rows in this case) It is interesting to note that if I use 'cola = 100' in a where clause instead of as an assignment (i.e. select * from test where cola = 100) this works in both 7.3 and 7.2. thanks, --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Large object security
The problem with this is that the existing functionality of LOs allows you to share a single LO across multiple tables. There may not be a single source, but multiple. Since LOs just use an OID as a FK to the LO, you can store that OID in multiple different tables. --Barry Mario Weilguni wrote: wouldn't it be much better to expand pg_largeobject to have another column src_oid (or similar), containing the OID of the referencing table from pg_class, and when accessing large objects take the privileges from the referencing class? -----Original Message----- From: Damon Cokenias [mailto:[EMAIL PROTECTED]] Sent: Friday, 19 April 2002 11:04 To: pgsql-hackers Subject: [HACKERS] Large object security Hi all, I see there's a TODO item for large object security, it's a feature I'd really like to see. I'm willing to put in the time to write a patch, but know far too little about postgres internals and history to just dive in. Has there been any discussion on this list about what this feature should be or how it might be implemented? I saw a passing reference to LOB LOCATORs in the list archives, but that was all. What's a LOB LOCATOR ? What about giving each large object its own permission flags? ex: GRANT SELECT ON LARGE OBJECT 10291 TO USER webapp; GRANT SELECT, DELETE, UPDATE ON LARGE OBJECT 10291 TO USER admin; Default permission flags (and INSERT permissions) would be set at the table level. All objects without specific permissions would use the table rules. This allows for backward compatibility and convenience. I think per-object security is important. A user shouldn't be able to get at another user's data just by guessing the right OID. Ideally, users without permission would not know there were objects in the database they were not allowed to see. I can also imagine a security scheme that uses rule/trigger syntax to give the user a hook to provide her own security functions. I haven't thought that through, though. Any thoughts? -Damon ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
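A small illustration of why there is no single "referencing class" to take privileges from. The tables are made up, and 10291 stands in for any existing large object OID (it is the example OID used in the quoted message):

CREATE TABLE invoices (id int, scan oid);
CREATE TABLE archive  (id int, backup oid);
-- the same large object can be referenced from both tables,
-- so neither table can be said to own it
INSERT INTO invoices VALUES (1, 10291);
INSERT INTO archive  VALUES (1, 10291);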
Re: [HACKERS] bug with current sources? Re: cost of parse/plan/execute
Tom, OK here is a test case:
create table test1 (t1a int);
create table test2 (t2a int);
create table test3 (t3a int);
SELECT x2.t2a
FROM ((test1 t1 LEFT JOIN test2 t2 ON (t1.t1a = t2.t2a)) AS x1
      LEFT OUTER JOIN test3 t3 ON (x1.t2a = t3.t3a)) AS x2
WHERE x2.t2a = 1;
The select works under 7.2, but gives the following error in 7.3: ERROR: JOIN/ON clause refers to x1, which is not part of JOIN thanks, --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: In testing Neil's PREPARE/EXECUTE patch on my test query, I found the parser complains that this query is not valid when using current sources. The error I get is: psql:testorig.sql:1: ERROR: JOIN/ON clause refers to xf2, which is not part of JOIN Hmm. I have an open bug with sub-SELECTs inside a JOIN, but this example doesn't look like it would trigger that. I think the sql is valid (at least it has worked in 7.1 and 7.2). Is this a bug? Dunno. Give me a test case (and no, I am *not* going to try to reverse-engineer table schemas from that SELECT). regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 7.3 schedule
Curt Sampson wrote: On Thu, 11 Apr 2002, Barry Lind wrote: I'm not sure that JDBC would use this feature directly. When a PreparedStatement is created in JDBC there is nothing that indicates how many times this statement is going to be used. Many (most IMHO) will be used only once. Well, the particular PreparedStatement instance may be used only once, yes. But it's quite likely that other, identical PreparedStatement objects would be used time and time again, so it's still good if you don't need to do much work on the second and subsequent preparations of that statement. But since the syntax for prepare is: PREPARE name AS statement you can't easily reuse sql prepared by other PreparedStatement objects, since you don't know whether the sql you are about to execute has been prepared yet or what name was used in that prepare. Thus you will always need to do a new prepare. (This is only true if the driver is trying to automatically use PREPARE/EXECUTE, which was the scenario I was talking about.) If it only is used once, it will actually perform worse than without the feature (since you need to issue two sql statements to the backend to accomplish what you were doing in one before). I'm not sure that it would be much worse unless you need to wait for an acknowledgement from the back-end for the first statement. If you had a back-end command along the lines of prepare this statement and execute it with these parameters, it would have pretty much the same performance as giving the statement directly with the parameters already substituted in, right? I didn't say it would be much worse, but it won't be faster than not using PREPARE. Thus if someone wanted to use this functionality from jdbc they would need to do it manually, i.e. issue the prepare and execute statements manually instead of the jdbc driver doing it automatically for them. I'd say that this is awfully frequent, anyway. I use PreparedStatements for pretty much any non-constant input, because it's just not safe or portable to try to escape parameters yourself. I agree this is useful, and you can write user code to take advantage of the functionality. I am just pointing out that I don't think the driver can behind the scenes use this capability automatically. --Barry ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] cost of parse/plan/execute for one sample query
In benchmarks that I have done in the past comparing performance of Oracle and Postgres in our web application, I found that I got ~140 requests/sec on Oracle and ~50 requests/sec on postgres. The code path in my benchmark only issues one sql statement. Since I know that Oracle caches query plans, I wanted to see the cost under postgres of the parse/plan/execute to see if the parsing and planing of the sql statement would account for the difference in performance between Oracle and postgres. In a recent mail note to hackers, Tom mentioned the existence of the show_parser_stats, show_planner_stats, and show_executor_stats parameters in the postgresql.conf file. So I turned them on ran my query a few times and here are the results: average of 10 runs: parsing = .003537 sec (19.3%)* planning = .009793 sec (53.5%) execute = .004967 sec (27.2%) If Oracle is only incurring the execute cost for each query then this would explain the difference in performance between Oracle and Postgres. This would lead me to conclude that the current proposed PREPARE/EXECUTE patch will be very useful to me. (now I just need to find the time to test it). thanks, --Barry * show_parser_stats prints out three separate timings: parser statistics, parse analysis statistics, rewriter statistics, the number .003537 is the sum of those three (.001086 + .002350 + .000101) SELECT XF.FILE_TYPE_CODE, XF.FULL_PATH, XF.FILE_ID, XF.PARENT_ID, XF.MIME_TYPE, XF.OWNER_PRINCIPAL_ID, XF.REVISIONABLE_FLAG, XF.OWNER_DELETE_FLAG, XF.OWNER_WRITE_FLAG, XF1_CREATION_DATE, XF1_CREATED_BY, XF1_LAST_UPDATE_DATE, XF1_LAST_UPDATED_BY, XF.FILE_SIZE, CASE WHEN XF.QUOTA IS NULL THEN -1 ELSE XF.QUOTA END AS QUOTA, XF.LOGGING_FLAG, XF.HAS_LOCKS, XF.HAS_DEAD_PROPERTIES, XF.LATEST_VERSION, XF.QUOTA_LOCKED, XF.TRASHCAN_PATH, XF.PRE_MOVE_NAME, XF.VIRTUAL_SERVER, MAX(XEA.READ_FLAG) || MAX(XEA.WRITE_FLAG) || MAX(XEA.DELETE_FLAG) || MAX(XEA.PERMISSION_FLAG) ||CASE WHEN MAX(XCP.DIGEST) IS NULL THEN 'OO' ELSE MAX(XCP.DIGEST) END AS PERMISSIONS,XFV_FILE_VERSION_ID, XFV_CREATION_DATE, XFV_CREATED_BY,XF.VERSION, XF.BLOB_ID, XF.BLOB_SIZE, XF.DATA, XF.STORAGE_STATE, XF.STORAGE_DATE, XF.STORAGE_LOCATION_ID, XF.STORAGE_FILENAME, XBU.PERIOD_START, XBU.BYTES_THIS_PERIOD, XBU.BYTES_TOTAL, XF.DIGEST, XF.HIGHEST_VERSION, XF.VERSIONING_FLAGS, XF.CONTENT_LANGUAGE, XF.OWNER_INHERIT_DELETE_FLAG, XF.OWNER_INHERIT_WRITE_FLAG, XF.VER_COMMENT, XF.CHILD_INHERIT_ON_CREATE FROM (((XYF_URLS XU LEFT JOIN XYF_FILES XF1 (XF1_CREATION_DATE, XF1_CREATED_BY, XF1_LAST_UPDATE_DATE, XF1_LAST_UPDATED_BY, XF1_FILE_ID) ON (XU.FILE_ID = XF1_FILE_ID)) XF2 LEFT OUTER JOIN XYF_FILE_VERSIONS XFV (XFV_FILE_ID, XFV_FILE_VERSION_ID, XFV_CREATION_DATE, XFV_CREATED_BY) ON (XF2.FILE_ID = XFV.XFV_FILE_ID AND XFV.VERSION = CASE WHEN -1 = -1 THEN XF2.LATEST_VERSION ELSE -1 END)) AS XF3 LEFT OUTER JOIN XYF_BLOBS XB (XB_BLOB_ID) ON (XF3.BLOB_ID = XB.XB_BLOB_ID)) AS XF, XYF_BANDWIDTH_USAGE XBU, XYF_ENTRY_ACLS XEA, XYF_CACHED_PRINCIPALS XCP WHERE XBU.ROOT_DIRECTORY = '/testuser2' AND XF.VIRTUAL_SERVER = XBU.ROOT_DIRECTORY_VIRTUAL_SERVERAND XEA.PRINCIPAL_ID = XCP.ALT_PRINCIPAL_ID AND (XCP.PRINCIPAL_ID = 1000 OR XCP.PRINCIPAL_ID = 1) AND XF.FILE_ID = XEA.FILE_ID AND XF.VIRTUAL_SERVER = 1 AND (XF.FULL_PATH = '/testuser2/bugs.txt') GROUP BY XF.FILE_ID, XF.FULL_PATH, XF.FILE_TYPE_CODE, XF.PARENT_ID, XF.MIME_TYPE, XF.REVISIONABLE_FLAG, XF.OWNER_DELETE_FLAG, XF.OWNER_WRITE_FLAG, XF.OWNER_INHERIT_DELETE_FLAG, XF.OWNER_INHERIT_WRITE_FLAG, XF1_CREATION_DATE, XF1_CREATED_BY, XF1_LAST_UPDATE_DATE, XF1_LAST_UPDATED_BY, 
XF.FILE_SIZE, XF.QUOTA, XF.LOGGING_FLAG, XF.HAS_LOCKS, XF.HAS_DEAD_PROPERTIES, XF.LATEST_VERSION, XF.OWNER_PRINCIPAL_ID, XF.QUOTA_LOCKED, XF.TRASHCAN_PATH, XF.PRE_MOVE_NAME, XF.VIRTUAL_SERVER, XFV_FILE_VERSION_ID, XFV_CREATION_DATE, XFV_CREATED_BY, XF.VERSION, XF.BLOB_ID, XF.BLOB_SIZE, XF.DATA, XF.STORAGE_STATE, XF.STORAGE_DATE, XF.STORAGE_LOCATION_ID, XF.STORAGE_FILENAME, XBU.PERIOD_START, XBU.BYTES_THIS_PERIOD, XBU.BYTES_TOTAL, XF.DIGEST, XF.HIGHEST_VERSION, XF.VERSIONING_FLAGS, XF.CONTENT_LANGUAGE, XF.VER_COMMENT, XF.CHILD_INHERIT_ON_CREATE; ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] bug with current sources? Re: cost of parse/plan/execute for one sample query
In testing Neil's PREPARE/EXECUTE patch on my test query, I found the parser complains that this query is not valid when using current sources. The error I get is: psql:testorig.sql:1: ERROR: JOIN/ON clause refers to xf2, which is not part of JOIN I think the sql is valid (at least it has worked in 7.1 and 7.2). Is this a bug? thanks, --Barry PS. I forgot to mention that the below performance numbers were done on 7.2 (not current sources). Barry Lind wrote: In benchmarks that I have done in the past comparing performance of Oracle and Postgres in our web application, I found that I got ~140 requests/sec on Oracle and ~50 requests/sec on postgres. The code path in my benchmark only issues one sql statement. Since I know that Oracle caches query plans, I wanted to see the cost under postgres of the parse/plan/execute to see if the parsing and planing of the sql statement would account for the difference in performance between Oracle and postgres. In a recent mail note to hackers, Tom mentioned the existence of the show_parser_stats, show_planner_stats, and show_executor_stats parameters in the postgresql.conf file. So I turned them on ran my query a few times and here are the results: average of 10 runs: parsing = .003537 sec (19.3%)* planning = .009793 sec (53.5%) execute = .004967 sec (27.2%) If Oracle is only incurring the execute cost for each query then this would explain the difference in performance between Oracle and Postgres. This would lead me to conclude that the current proposed PREPARE/EXECUTE patch will be very useful to me. (now I just need to find the time to test it). thanks, --Barry * show_parser_stats prints out three separate timings: parser statistics, parse analysis statistics, rewriter statistics, the number .003537 is the sum of those three (.001086 + .002350 + .000101) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] 7.3 schedule
Neil Conway wrote: I would suggest using it any time you're executing the same query plan a large number of times. In my experience, this is very common. There are already hooks for this in many client interfaces: e.g. PrepareableStatement in JDBC and $dbh-prepare() in Perl DBI. I'm not sure that JDBC would use this feature directly. When a PreparableStatement is created in JDBC there is nothing that indicates how many times this statement is going to be used. Many (most IMHO) will be used only once. As I stated previously, this feature is only useful if you are going to end up using the PreparedStatement multiple times. If it only is used once, it will actually perform worse than without the feature (since you need to issue two sql statements to the backend to accomplish what you were doing in one before). Thus if someone wanted to use this functionality from jdbc they would need to do it manually, i.e. issue the prepare and execute statements manually instead of the jdbc driver doing it automatically for them. thanks, --Barry PS. I actually do believe that the proposed functionality is good and should be added (even though it may sound from the tone of my emails in this thread that that isn't the case :-) I just want to make sure that everyone understands that this doesn't solve the whole problem. And that more work needs to be done either in 7.3 or some future release. My fear is that everyone will view this work as being good enough such that the rest of the issues won't be addressed anytime soon. I only wish I was able to work on some of this myself, but I don't have the skills to hack on the backend too much. (However if someone really wanted a new feature in the jdbc driver in exchange, I'd be more than happy to help) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 7.3 schedule
Tom Lane wrote: Yes, that is the part that was my sticking point last time around. (1) Because shared memory cannot be extended on-the-fly, I think it is a very bad idea to put data structures in there without some well thought out way of predicting/limiting their size. (2) How the heck do you get rid of obsoleted cached plans, if the things stick around in shared memory even after you start a new backend? (3) A shared cache requires locking; contention among multiple backends to access that shared resource could negate whatever performance benefit you might hope to realize from it. A per-backend cache kept in local memory avoids all of these problems, and I have seen no numbers to make me think that a shared plan cache would achieve significantly more performance benefit than a local one. Oracle's implementation is a shared cache for all plans. This was introduced in Oracle 6 or 7 (I don't remember which anymore). The net effect was that in general there was a significant performance improvement with the shared cache. However poorly written apps can now bring the Oracle database to its knees because of the locking issues associated with the shared cache. For example if the most frequently run sql statements are coded poorly (i.e. they don't use bind variables, eg. 'select bar from foo where foobar = $1' vs. 'select bar from foo where foobar = || somevalue' (where somevalue is likely to be different on every call)) the shared cache doesn't help and its overhead becomes significant. thanks, --Barry ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
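Using the statements from the message itself, the contrast looks roughly like this in the Oracle-style shared-cache model being described; the first form maps to one cache entry reused by every session, the second produces a new statement text (and a new single-use plan) on every call:

-- bind-variable form: one entry in the shared plan cache, reused on each execution
select bar from foo where foobar = $1
-- literal form: a distinct statement text every time, so the cache fills with
-- single-use plans and the lookup/locking overhead is pure cost
select bar from foo where foobar = 'a different value every time'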
Re: [HACKERS] 7.3 schedule
Tom Lane wrote: It would be interesting to see some stats for the large-BLOB scenarios being debated here. You could get more support for the position that something should be done if you had numbers to back it up. Below are some stats you did a few months ago when I was asking a related question. Your summary was: Bottom line: feeding huge strings through the lexer is slow. --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: In looking at some performance issues (I was trying to look at the overhead of toast) I found that large insert statements were very slow. ... ... I got around to reproducing this today, and what I find is that the majority of the backend time is going into simple scanning of the input statement:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
31.24     11.90     11.90                              _mcount
19.51     19.33      7.43     10097     0.74     1.06  base_yylex
 7.48     22.18      2.85  21953666     0.00     0.00  appendStringInfoChar
 5.88     24.42      2.24       776     2.89     2.89  pglz_compress
 4.36     26.08      1.66  21954441     0.00     0.00  pq_getbyte
 3.57     27.44      1.36   7852141     0.00     0.00  addlit
 3.26     28.68      1.24      1552     0.80     0.81  scanstr
 2.84     29.76      1.08       779     1.39     7.18  pq_getstring
 2.31     30.64      0.88     10171     0.09     0.09  _doprnt
 2.26     31.50      0.86       776     1.11     1.11  byteain
 2.07     32.29      0.79                              msquadloop
 1.60     32.90      0.61   7931430     0.00     0.00  memcpy
 1.18     33.35      0.45                              chunks
 1.08     33.76      0.41     46160     0.01     0.01  strlen
 1.08     34.17      0.41                              encore
 1.05     34.57      0.40      8541     0.05     0.05  XLogInsert
 0.89     34.91      0.34                              appendStringInfo
60% of the call graph time is accounted for by these two areas:
index  % time    self  children      called          name
                 7.43      3.32     10097/10097         yylex [14]
[13]     41.0    7.43      3.32     10097             base_yylex [13]
                 1.36      0.61   7852141/7852141       addlit [28]
                 1.24      0.01      1552/1552          scanstr [30]
                 0.02      0.03      3108/3108          ScanKeywordLookup [99]
                 0.00      0.02      2335/2335          yy_get_next_buffer [144]
                 0.02      0.00       776/781           strtol [155]
                 0.00      0.01       777/3920          MemoryContextStrdup [108]
                 0.00      0.00         1/1             base_yy_create_buffer [560]
                 0.00      0.00      4675/17091         isupper [617]
                 0.00      0.00      1556/1556          yy_get_previous_state [671]
                 0.00      0.00       779/779           yywrap [706]
                 0.00      0.00         1/2337          base_yy_load_buffer_state [654]
-----------------------------------------------
                 1.08      4.51       779/779           pq_getstr [17]
[18]     21.4    1.08      4.51       779             pq_getstring [18]
                 2.85      0.00  21953662/21953666      appendStringInfoChar [20]
                 1.66      0.00  21954441/21954441      pq_getbyte [29]
While we could probably do a little bit to speed up pg_getstring and its children, it's not clear that we can do anything about yylex, which is flex output code not handmade code, and is probably well-tuned already. Bottom line: feeding huge strings through the lexer is slow. regards, tom lane It would be interesting to see some stats for the large-BLOB scenarios being debated here. You could get more support for the position that something should be done if you had numbers to back it up. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] 7.3 schedule
Neil Conway wrote: On Thu, 11 Apr 2002 16:25:24 +1000 Ashley Cambrell [EMAIL PROTECTED] wrote: What are the chances that the BE/FE will be altered to take advantage of prepare / execute? Or is it something that will never happen? Is there a need for this? The current patch I'm working on just does everything using SQL statements, which I don't think is too bad (the typical client programmer won't actually need to see them, their interface should wrap the PREPARE/EXECUTE stuff for them). Yes there is a need. If you break up the query into roughly three stages of execution: parse, plan, and execute, each of these can be the performance bottleneck. The parse can be the performance bottleneck when passing large values as data to the parser (eg. inserting one row containing a 100K value will result in a 100K+ sized statement that needs to be parsed, parsing will take a long time, but the planning and execution should be relatively short). The planning stage can be a bottleneck for complex queries. And of course the execution stage can be a bottleneck for all sorts of reasons (eg. bad plans, missing indexes, bad statistics, poorly written sql, etc.). So if you look at the three stages (parse, plan, execute) we have a lot of tools, tips, and techniques for making the execute faster. We have some tools (at least on the server side via SPI, and plpgsql) to help minimize the planning costs by reusing plans. But there doesn't exist much to help with the parsing cost of large values (actually the fastpath API does help in this regard, but everytime I mention it Tom responds that the fastpath API should be avoided). So when I look at the proposal for the prepare/execute stuff: PREPARE plan AS query; EXECUTE plan USING parameters; DEALLOCATE plan; Executing a sql statement today is the following: insert into table values (stuff); which does one parse, one plan, one execute under the new functionality: prepare plan as insert into table values (stuff); execute plan using stuff; which does two parses, one plan, one execute which obviously isn't a win unless you end up reusing the plan many times. So lets look at the case of reusing the plan multiple times: prepare plan as insert into table values (stuff); execute plan using stuff; execute plan using stuff; ... which does n+1 parses, one plan, n executes so this is a win if the cost of the planing stage is significant compared to the costs of the parse and execute stages. If the cost of the plan is not significant there is little if any benefit in doing this. I realize that there are situations where this functionality will be a big win. But I question how the typical user of postgres will know when they should use this functionality and when they shouldn't. Since we don't currently provide any information to the user on the relative cost of the parse, plan and execute phases, the end user is going to be guessing IMHO. What I think would be a clear win would be if we could get the above senario of multiple inserts down to one parse, one plan, n executes, and n binds (where binding is simply the operation of plugging values into the statement without having to pipe the values through the parser). This would be a win in most if not all circumstances where the same statement is executed many times. I think it would also be nice if the new explain anaylze showed times for the parsing and planning stages in addition to the execution stage which it currently shows so there is more information for the end user on what approach they should take. 
thanks, --Barry On the other hand, there are already a few reasons to make some changes to the FE/BE protocol (NOTIFY messages, transaction state, and now possibly PREPARE/EXECUTE -- anything else?). IMHO, each of these isn't worth changing the protocol by itself, but perhaps if we can get all 3 in one swell foop it might be a good idea... Cheers, Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
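Spelled out with the proposed syntax from the message above (the table and values are placeholders, and the syntax itself is still only the proposal under discussion), the break-even argument looks like this:

prepare ins_plan as insert into orders values (stuff);   -- one parse + one plan
execute ins_plan using stuff;                            -- small parse + execute
execute ins_plan using stuff;                            -- ...repeated n times
-- total: n+1 parses, 1 plan, n executes. This is a win only when planning is a
-- significant share of the original parse/plan/execute cost.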
Re: [HACKERS] Implicit coercions need to be reined in
Tom, My feeling is that this change as currently scoped will break a lot of existing apps. Especially the case where people are using where clauses of the form: bigintcolumn = '999' to get a query to use the index on a column of type bigint. thanks, --Barry Tom Lane wrote: Awhile back I suggested adding a boolean column to pg_proc to control which type coercion functions could be invoked implicitly, and which would need an explicit cast: http://archives.postgresql.org/pgsql-hackers/2001-11/msg00803.php There is a relevant bug report #484 showing the dangers of too many implicit coercion paths: http://archives.postgresql.org/pgsql-bugs/2001-10/msg00108.php I have added such a column as part of the pg_proc changes I'm currently doing to migrate aggregates into pg_proc. So it's now time to debate the nitty-gritty: exactly which coercion functions should not be implicitly invokable anymore? My first-cut attempt at this is shown by the two printouts below. The first cut does not allow any implicit coercions to text from types that are not in the text category, which seems a necessary rule to me --- the above-cited bug report shows why free coercions to text are dangerous. However, it turns out that several of the regression tests fail with this rule; see the regression diffs below. Should I consider these regression tests wrong, and correct them? If not, how can we limit implicit coercions to text enough to avoid the problems illustrated by bug #484? Another interesting point is that I allowed implicit coercions from float8 to numeric; this is necessary to avoid breaking cases like insert into foo(numeric_col) values(12.34); since the constant will be initially typed as float8. However, because I didn't allow the reverse coercion implicitly, this makes numeric more preferred than float8. Thus, for example, select '12.34'::numeric + 12.34; which draws a can't-resolve-operator error in 7.2, is resolved as numeric addition with these changes. Is this a good thing, or not? We could preserve the can't-resolve behavior by marking numeric-float8 as an allowed implicit coercion, but that seems ugly. I'm not sure we can do a whole lot better without some more wide-ranging revisions of the way we handle untyped numeric literals (as in past proposals to invent an UNKNOWNNUMERIC pseudo-type). Also, does anyone have any other nits to pick with this classification of which coercions are implicitly okay? I've started with a fairly tough approach of disallowing most implicit coercions, but perhaps this goes too far. 
regards, tom lane Coercions allowed implicitly: oid | result|input|prosrc --+-+-+--- 860 | bpchar | char| char_bpchar 408 | bpchar | name| name_bpchar 861 | char| bpchar | bpchar_char 944 | char| text| text_char 312 | float4 | float8 | dtof 236 | float4 | int2| i2tof 318 | float4 | int4| i4tof 311 | float8 | float4 | ftod 235 | float8 | int2| i2tod 316 | float8 | int4| i4tod 482 | float8 | int8| i8tod 314 | int2| int4| i4toi2 714 | int2| int8| int82 313 | int4| int2| i2toi4 480 | int4| int8| int84 754 | int8| int2| int28 481 | int8| int4| int48 1177 | interval| reltime | reltime_interval 1370 | interval| time| time_interval 409 | name| bpchar | bpchar_name 407 | name| text| text_name 1400 | name| varchar | text_name 1742 | numeric | float4 | float4_numeric 1743 | numeric | float8 | float8_numeric 1782 | numeric | int2| int2_numeric 1740 | numeric | int4| int4_numeric 1781 | numeric | int8| int8_numeric 946 | text| char| char_text 406 | text| name| name_text 2046 | time| timetz | timetz_time 2023 | timestamp | abstime | abstime_timestamp 2024 | timestamp | date| date_timestamp 2027 | timestamp | timestamptz | timestamptz_timestamp 1173 | timestamptz | abstime | abstime_timestamptz 1174 | timestamptz | date| date_timestamptz 2028 | timestamptz | timestamp | timestamp_timestamptz 2047 | timetz | time| time_timetz 1401 | varchar | name| name_text (38 rows) Coercions that will require explicit CAST, ::type, or typename(x) syntax (NB: in 7.2 all of these would have been allowed implicitly): oid | result|input|prosrc --+-+-+-- 2030 | abstime | timestamp |
Re: [HACKERS] help with bison
Neil, Will this allow you to pass bytea data as binary data in the parameters section (ability to bind values to parameters) or will this still require that the data be passed as a text string that the parser needs to parse. When passing bytea data that is on the order of Megs in size (thus the insert/update statement is multiple Megs in size) it takes a lot of CPU cycles for the parser to chug through sql statements that long. (In fact a posting to the jdbc mail list in the last couple of days shows that postgres is 22 times slower than oracle when handling a 1Meg value in a bytea column). thanks, --Barry Neil Conway wrote: On Thu, 11 Apr 2002 10:54:14 +0800 Christopher Kings-Lynne [EMAIL PROTECTED] wrote: Out of interest, since the FE/BE protocol apprently doesn't support prepared statements (bound variables), what does this patch actually _do_? It implements preparable statements, by adding 3 new SQL statements: PREPARE plan AS query; EXECUTE plan USING parameters; DEALLOCATE plan; I didn't write the original patch -- that was done by Karel Zak. But since that was several years ago, I'm working on cleaning it up, getting it to apply to current sources (which has taken a while), and fixing the remaining issues with it. Karel describes his work here: http://groups.google.com/groups?q=query+cache+planhl=enselm=8l4jua%242fo0%241%40FreeBSD.csie.NCTU.edu.twrnum=1 (If that's messed up due to newlines, search for query cache plan on Google Groups, it's the first result) Cheers, Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
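For a sense of what the parser is being asked to chew on, this is roughly what a single large bytea insert looks like when the value has to travel as an escaped text literal (the table name is made up and the literal is shown truncated); each byte can expand to as many as five characters of escape text, all of which the lexer scans as one token:

INSERT INTO images (data) VALUES ('\\377\\330\\000\\020...');   -- several MB of escaped text for a ~1MB value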
Re: [HACKERS] Implicit coercions need to be reined in
OK. My mistake. In looking at the regression failures in your post, I thought I saw errors being reported of this type. My bad. --Barry Tom Lane wrote: Barry Lind [EMAIL PROTECTED] writes: My feeling is that this change as currently scoped will break a lot of existing apps. Especially the case where people are using where clauses of the form: bigintcolumn = '999' to get a query to use the index on a column of type bigint. Eh? That case will not change behavior in the slightest, because there's no type conversion --- the literal is interpreted as the target type to start with. regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timeout implementation issues
Tom Lane wrote: Note: I am now pretty well convinced that we *must* fix SET to roll back to start-of-transaction settings on transaction abort. If we do that, at least some of the difficulty disappears for JDBC to handle one-shot timeouts by issuing SETs before and after the target query against a query_timeout variable that otherwise acts like a good-til-canceled setting. Can we all compromise on that? This plan should work well for JDBC. (It actually makes the code on the jdbc side pretty easy). thanks, --Barry ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
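Under that compromise, the driver-side pattern for a one-shot timeout would look something like the sketch below. Here query_timeout is just the variable name used in this discussion (not an existing parameter), the unit is whatever gets decided, and bigtable is a placeholder:

BEGIN;
SET query_timeout = 5000;         -- applies until it is reset or the transaction aborts
SELECT count(*) FROM bigtable;    -- the one statement the timeout should cover
SET query_timeout = 0;            -- back to the good-til-cancelled default
COMMIT;
-- if the transaction aborts instead, the SET rolls back to its start-of-transaction
-- value automatically, which is what makes this scheme workable for JDBC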
Re: [HACKERS] Suggestion for optimization
As far as I know Oracle doesn't have any short cut (along the lines of what is being discussed in this thread) for this operation. However Oracle is more efficient in providing the answer than postgres currently is. While postgres needs to perform a full scan on the table, Oracle will only need to perform a full index scan on the primary key if one exists. Since the index will likely have much less data than the full table this will result in fewer IOs and be faster than what postgres does, but it still takes a while for large tables even in Oracle. thanks, --Barry Mike Mascari wrote: Dann Corbit wrote: I guess that this model can be viewed as everything is a snapshot. It seems plain that the repercussions for a data warehouse and for reporting have not been thought out very well. This is definitely very, very bad in that arena. I suppose that reporting could still be accomplished, but it would require pumping the data into a new copy of the database that does not allow writes at all. Yuck. At any rate, there is clearly a concept of cardinality in any case. Perhaps the information would have to be kept as part of the connection. If (after all) you cannot even compute cardinality for a single connection then the database truly is useless. In fact, under a scenario where cardinality has no meaning, neither does select count(*) since that is what it measures. Might as well remove it from the language. I have read a couple books on Postgresql and somehow missed the whole MVCC idea. Maybe after I understand it better the clammy beads of sweat on my forehead will dry up a little. Oracle is also an MVCC database. So this notion that MVCC somehow makes it inappropriate for data warehousing would imply that Oracle is also inappropriate. However, in your defense, Oracle did apparently find enough customer demand for an MVCC-compatible hack of COUNT(*) to implement a short-cut route to calculate its value... Mike Mascari [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timeout implementation issues
Since both the JDBC and ODBC specs have essentially the same semantics for this, I would hope this is done in the backend instead of both interfaces. --Barry Jessica Perry Hekman wrote: On Mon, 1 Apr 2002, Tom Lane wrote: On the other hand, we do not have anything in the backend now that applies to just one statement and then automatically resets afterwards; and I'm not eager to add a parameter with that behavior just for JDBC's convenience. It seems like it'd be a big wart. Does that leave us with implementing query timeouts in JDBC (timer in the driver; then the driver sends a cancel request to the backend)? j ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] timeout implementation issues
Jessica, My reading of the JDBC spec would indicate that this is a statement level property (aka query level) since the method to enable this is on the Statement object and is named setQueryTimeout(). There is nothing I can find that would indicate that this would apply to the transaction in my reading of the jdbc spec. thanks, --Barry Jessica Perry Hekman wrote: On Mon, 1 Apr 2002, Bruce Momjian wrote: I don't know which people want, and maybe this is why we need both GUC and BEGIN WORK timeouts. I don't remember this distinction in previous discussions but it may be significant. Of course, the GUC could behave at a transaction level as well. It will be tricky to manage multiple alarms in a single process, but it can be done by creating an alarm queue. I think we should do just BEGIN WORK (transaction-level) timeouts; that is all that the JDBC spec asks for. Does that sound good to people? So the work that would need to be done is asking the driver to request the timeout via BEGIN WORK TIMEOUT 5; getting the backend to parse that request and set the alarm on each query in that transaction; getting the backend to send a cancel request if the alarm goes off. I am right now in the process of finding the place where BEGIN-level queries are parsed. Any pointers to the right files to read would be appreciated. j ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] timeout implementation issues
The spec isn't clear on that point, but my interpretation is that it would apply to all types of statements, not just queries. --Barry Peter Eisentraut wrote: Barry Lind writes: My reading of the JDBC spec would indicate that this is a statement level property (aka query level) since the method to enable this is on the Statement object and is named setQueryTimeout(). There is nothing I can find that would indicate that this would apply to the transaction in my reading of the jdbc spec. Does it time out only queries or any kind of statement? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] storing binary data
Jason, BLOBs, as you have correctly inferred, do not get automatically deleted. You can add triggers to your tables to delete them automatically if you so desire. However 'bytea' is the datatype that is most appropriate for your needs. It has been around for a long time, but is not well documented. I have been using it in my code since Postgres 7.0 and it works fine. In fact many of the internal postgres tables use it. The problem with bytea is that many of the client interfaces don't support it well or at all. So depending on how you intend to access the data, you may not be able to use the bytea datatype. The situation is much improved in 7.2, with bytea documented and better support for it in the client interfaces (jdbc especially). Encoding the data into a text format will certainly work, if you can't work around the current limitations of the above two options. And I believe there is some contrib code to help in this area. thanks, --Barry Jason Orendorff wrote: Reply-To: sender Hi. I was surprised to discover today that postgres's character types don't support zero bytes. That is, Postgres isn't 8-bit clean. Why is that? More to the point, I need to store about 1k bytes per row of varying-length 8-bit binary data. I have a few options: + BLOBs. PostgreSQL BLOBs make me nervous. I worry about the BLOB not being deleted when the corresponding row in the table is deleted. The documentation is vague. + What I really need is a binary *short* object type. I have heard rumors of a legendary bytea type that might help me, but it doesn't appear to be documented anywhere, so I hesitate to use it. + I can base64-encode the data and store it in a text field. But postgres is a great big data-storage system; surely it can store binary data without resorting to this kind of hack. What should I do? Please help. Thanks! ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
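A sketch of what storing ~1k of binary data in a bytea column can look like from JDBC, assuming a driver with the improved 7.2-era bytea support mentioned above. The table and column names are made up for the example.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ByteaInsert {
    // Store a row of varying-length 8-bit binary data in a bytea column.
    public static void store(Connection con, int id, byte[] data) throws SQLException {
        PreparedStatement ps =
            con.prepareStatement("INSERT INTO blobs (id, payload) VALUES (?, ?)");
        ps.setInt(1, id);
        ps.setBytes(2, data);      // the driver escapes the bytes for the bytea column
        ps.executeUpdate();
        ps.close();
    }
}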
Re: [HACKERS] Deadlock? idle in transaction
Also note that a select statement in an uncommitted transaction will hold a lock on the table and prevent vacuum from running. It isn't just inserts/updates that will lock and cause vacuum to block, but selects as well. This got me in the past. (Of course this is all fixed in 7.2 with the new vacuum functionality that doesn't require exclusive locks on the tables.) thanks, --Barry Michael Meskes wrote: On Thu, Oct 11, 2001 at 08:26:48PM -0400, Tom Lane wrote: You evidently have some client applications holding open transactions Okay, I know where to look for that. Thanks. that have locks on some tables. That's not a deadlock --- at least, It is no deadlock if the transaction holding the lock remains idle and does nothing. But I cannot imagine how this could happen. What happens if there is a real deadlock, i.e. the transaction holding the lock tries to lock a table vacuum already locked? Ah, I just checked and rendered my last mail useless. It appears the backend does correctly detect the deadlock and kill one transaction. it's not Postgres' fault. The VACUUM is waiting to get exclusive access to some table that's held by one of these clients, and the COPY is probably queued up behind the VACUUM. So the reason is that the transaction does hold a lock but does not advance any further? Michael ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] TOAST and bytea JAVA
Chris, Current sources for the jdbc driver do support the bytea type. However the driver for 7.1 does not. thanks, --Barry Chris Bitmead wrote: Use bytea, it's for 0-255 binary data. When your client library does not support it, then base64 it on the client side and later decode() into place. Thanks, bytea sounds like what I need. Why no documentation on this important data type? Does the Java client library support setting this type using setBytes or setBinaryStream? ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
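To make the setBytes/setBinaryStream question concrete, here is a rough round-trip sketch against a bytea column, again assuming a driver built from current (7.2-era) sources. The table and column names are invented for the example.

import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ByteaRoundTrip {
    public static byte[] writeAndReadBack(Connection con, byte[] data) throws SQLException {
        // Write the value with setBinaryStream()...
        PreparedStatement ps =
            con.prepareStatement("UPDATE images SET img = ? WHERE id = 1");
        ps.setBinaryStream(1, new ByteArrayInputStream(data), data.length);
        ps.executeUpdate();
        ps.close();

        // ...and read it back with getBytes().
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT img FROM images WHERE id = 1");
        byte[] back = null;
        if (rs.next()) {
            back = rs.getBytes(1);
        }
        rs.close();
        stmt.close();
        return back;
    }
}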
Re: [HACKERS] Timestamp, fractional seconds problem
Thomas, Can you explain in more detail how this functionality has changed? I know that in the JDBC driver fractional seconds are assumed to be two decimal places. If this is no longer true, I need to understand the new semantics so that the JDBC parsing routines can be changed. Other interfaces may have similar issues. thanks, --Barry Thomas Lockhart wrote: Problem: the external representation of time and timestamp are less precise than the internal representation. Fixed (as of yesterday) in the upcoming release. - Thomas ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
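A rough illustration of the parsing change at issue: instead of assuming exactly two fractional digits, a parser can accept however many digits follow the decimal point. This is only a sketch under the assumption that the timezone offset has already been stripped from the string; it is not the actual driver code.

import java.sql.Timestamp;
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class FractionalSeconds {
    // Parse "yyyy-MM-dd HH:mm:ss[.f...]" with any number of fractional digits.
    public static Timestamp parse(String value) throws ParseException {
        int dot = value.indexOf('.');
        String whole = (dot < 0) ? value : value.substring(0, dot);
        String frac  = (dot < 0) ? ""    : value.substring(dot + 1);

        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        Timestamp ts = new Timestamp(fmt.parse(whole).getTime());

        if (frac.length() > 0) {
            // Right-pad to nine digits so ".12" becomes 120000000 nanoseconds.
            StringBuffer nanos = new StringBuffer(frac);
            while (nanos.length() < 9) {
                nanos.append('0');
            }
            ts.setNanos(Integer.parseInt(nanos.toString().substring(0, 9)));
        }
        return ts;
    }
}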
Re: [HACKERS] Abort transaction on duplicate key error
Haller, The way I have handled this in the past is to attempt the following insert, followed by an update if the insert doesn't insert any rows: insert into foo (fooPK, foo2) select 'valuePK', 'value2' where not exists (select 'x' from foo where fooPK = 'valuePK') if number of rows inserted = 0, then the row already exists so do an update update foo set foo2 = 'value2' where fooPK = 'valuePK' Since I don't know what client interface you are using (java, perl, C), I can't give you exact code for this, but the above should be easily implemented in any language. thanks, --Barry Haller Christoph wrote: Hi all, Sorry for bothering you with my stuff for the second time but I haven't got any answer within two days and the problem appears fundamental, at least to me. I have a C application running to deal with meteorological data like temperature, precipitation, wind speed, wind direction, ... And I mean loads of data like several thousand sets within every ten minutes. From time to time it happens the transmitters have delivered wrong data, so they send the sets again to be taken as correction. The idea is to create a unique index on the timestamp, the location id and the measurement id, then when receiving a duplicate key error move on to an update command on that specific row. But, within PostgreSQL this strategy does not work any longer within a chained transaction, because the duplicate key error leads to 'abort the whole transaction'. What I can do is change from chained transaction to unchained transaction, but what I have read in the mailing list so far, the commit operation requires loads of cpu time, and I do not have time for this when processing thousands of sets. I am wondering now whether there is a fundamental design error in my strategy. Any ideas, suggestions highly appreciated and thanks for reading so far. Regards, Christoph My first message: In a C application I want to run several insert commands within a chained transaction (for faster execution). From time to time there will be an insert command causing an ERROR: Cannot insert a duplicate key into a unique index As a result, the whole transaction is aborted and all the previous inserts are lost. Is there any way to preserve the data except working with autocommit ? What I have in mind particularly is something like Do not abort on duplicate key error. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
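The same insert-then-update pattern, written here as a JDBC sketch rather than C (which the original poster is using), purely to make the flow concrete; the table and column names follow the example in the message above.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class InsertOrUpdate {
    public static void upsert(Connection con, String pk, String value) throws SQLException {
        // Try the conditional insert first.
        PreparedStatement ins = con.prepareStatement(
            "INSERT INTO foo (fooPK, foo2) SELECT ?, ? " +
            "WHERE NOT EXISTS (SELECT 'x' FROM foo WHERE fooPK = ?)");
        ins.setString(1, pk);
        ins.setString(2, value);
        ins.setString(3, pk);
        int inserted = ins.executeUpdate();
        ins.close();

        // If no row was inserted, the key already exists, so update it instead.
        if (inserted == 0) {
            PreparedStatement upd = con.prepareStatement(
                "UPDATE foo SET foo2 = ? WHERE fooPK = ?");
            upd.setString(1, value);
            upd.setString(2, pk);
            upd.executeUpdate();
            upd.close();
        }
    }
}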
Re: [HACKERS] slow UNIONing
Kovacs, A 'union all' will be much faster than 'union'. 'union all' returns all results from both queries, whereas 'union' will return all distinct records. The 'union' requires a sort and a merge to remove the duplicate values. Below are explain output for a union query and a union all query. files=# explain files-# select dummy from test files-# union all files-# select dummy from test; NOTICE: QUERY PLAN: Append (cost=0.00..40.00 rows=2000 width=12) - Subquery Scan *SELECT* 1 (cost=0.00..20.00 rows=1000 width=12) - Seq Scan on test (cost=0.00..20.00 rows=1000 width=12) - Subquery Scan *SELECT* 2 (cost=0.00..20.00 rows=1000 width=12) - Seq Scan on test (cost=0.00..20.00 rows=1000 width=12) EXPLAIN files=# explain files-# select dummy from test files-# union files-# select dummy from test; NOTICE: QUERY PLAN: Unique (cost=149.66..154.66 rows=200 width=12) - Sort (cost=149.66..149.66 rows=2000 width=12) - Append (cost=0.00..40.00 rows=2000 width=12) - Subquery Scan *SELECT* 1 (cost=0.00..20.00 rows=1000 width=12) - Seq Scan on test (cost=0.00..20.00 rows=1000 width=12) - Subquery Scan *SELECT* 2 (cost=0.00..20.00 rows=1000 width=12) - Seq Scan on test (cost=0.00..20.00 rows=1000 width=12) EXPLAIN files=# thanks, --Barry Kovacs Zoltan wrote: I experienced that UNIONs in 7.1.1 are rather slow: tir=# explain (select nev from cikk) union (select tevekenyseg from log); NOTICE: QUERY PLAN: Unique (cost=667.63..687.18 rows=782 width=12) - Sort (cost=667.63..667.63 rows=7817 width=12) - Append (cost=0.00..162.17 rows=7817 width=12) - Subquery Scan *SELECT* 1 (cost=0.00..28.16 rows=1316 width=12) - Seq Scan on cikk (cost=0.00..28.16 rows=1316 width=12) - Subquery Scan *SELECT* 2 (cost=0.00..134.01 rows=6501 width=12) - Seq Scan on log (cost=0.00..134.01 rows=6501 width=12) Of course a simple SELECT is fast: tir=# explain select nev from cikk; NOTICE: QUERY PLAN: Seq Scan on cikk (cost=0.00..28.16 rows=1316 width=12) For me it seems to be slow due to the sorting. Is this right? Is this normal at all? Is it possible to make it faster? TIA, Zoltan ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Timezones and time/timestamp values in FE/BE protocol
Rene, Since the FE/BE protocol deals only with string representations of values, the protocol doesn't have too much to do with it directly. It is what happens on the client and server sides that is important here. Under the covers the server stores all timestamp values as GMT. When a select statement queries one, the value is converted to the session timezone and formatted to a string that includes the timezone offset used (i.e. 2001-09-09 14:24:35.12-08 which the database had stored as 2001-09-09 22:24:35.12 GMT). The client then needs to handle this accordingly and convert to a different timezone if desired. On an insert or update the client and server are essentially doing the opposite. The client converts the timestamp value to a string and then the server converts that string to GMT for storage. If the client does not pass the timezone offset (i.e. 2001-09-09 14:24:35.12 instead of 2001-09-09 14:24:35.12-08) then the server needs to guess the timezone and will use the session timezone. Now when it comes to the JDBC code this is what happens. (Since you didn't state what specific problem you were having I will give a general overview). When the JDBC driver connects to the server it does one timestamp-related thing. It does a 'set datestyle to ISO' so that the client and the server both know how the strings are formatted. I don't know what the session timezone defaults to, but it really shouldn't matter since the server always sends the timezone offset as part of the string representation of the timestamp value. Therefore the JDBC client can always figure out how to convert the string to a Java Timestamp object. On the insert/update operation the JDBC client converts the Timestamp object to GMT (see the logic in setTimestamp() of PreparedStatement) and then builds the string to send to the server as the formatted date/time plus the timezone offset used (GMT in this case). Thus it does something that looks like: 2001-09-09 14:24:35.12+00. When the server gets this string it has all the information it needs to convert to GMT for storage (it actually doesn't need to do anything since the value is clearly already in GMT). I hope this helps to answer your questions. If you could post a bit more about the issue you are having I might be able to be more specific. thanks, --Barry Rene Pijlman wrote: I'm working on a problem in the JDBC driver that's related to timezones. How does PostgreSQL handle timezones in the FE/BE protocol exactly? When a client sends a time or timestamp value to the server via the FE/BE protocol, should that be: 1) a value in the client's timezone? 2) a value in the server's timezone? 3) a value in a common frame of reference (GMT/UTC)? 4) any value with an explicit timezone? And how should a time or timestamp value returned by the server be interpreted in the client interface? And how does this all depend on the timezone setting of the server? Regards, René Pijlman [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
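A sketch of the client-side half of that description: render the Timestamp in GMT and append an explicit offset so the server never has to guess the session timezone. This mirrors what the message says setTimestamp() does; the real driver code also handles fractional seconds, which are omitted here for brevity, and the class name is invented.

import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class TimestampLiteral {
    // Format a Timestamp as an ISO-style string in GMT with an explicit offset.
    public static String toLiteral(Timestamp ts) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(ts) + "+00";   // e.g. 2001-09-09 22:24:35+00
    }
}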
Re: [HACKERS] [JDBC] Troubles using German Umlauts with JDBC
Rene, I would like to add one additional comment. In current sources the jdbc driver detects (through a hack) that the server doesn't have multibyte enabled and then ignores the SQL_ASCII return value and defaults to the JVM's character set instead of using SQL_ASCII. The problem boils down to the fact that without multibyte enabled, the server has no way of specifying which 8-bit character set is being used for a particular database. Thus a client like JDBC doesn't know what character set to use when converting to UNICODE. Thus the best we can do in JDBC is use our best guess (the JVM character set is probably the best default), and allow the user to explicitly specify something else if necessary. thanks, --Barry Rene Pijlman wrote: [forwarding to pgsql-hackers and Bruce as Todo list maintainer, see comment below] [insert with JDBC converts Latin-1 umlaut to ?] On 04 Sep 2001 09:54:27 -0400, Dave Cramer wrote: You have to set the encoding when you make the connection. Properties props = new Properties(); props.put("user", user); props.put("password", password); props.put("charSet", encoding); Connection con = DriverManager.getConnection(url, props); where encoding is the proper encoding for your database For completeness, I quote the answer Barry Lind gave yesterday. [the driver] asks the server what character set is being used for the database. Unfortunately the server only knows about character sets if multibyte support is compiled in. If the server is compiled without multibyte, then it always reports to the client that the character set is SQL_ASCII (where SQL_ASCII is 7-bit ASCII). Thus if you don't have multibyte enabled on the server you can't support 8-bit characters through the jdbc driver, unless you specifically tell the connection what character set to use (i.e. override the default obtained from the server). This really is confusing and I think PostgreSQL should be able to support single byte encoding conversions without enabling multi-byte. At the very least there should be a --enable-encoding-conversion or something similar, even if it just enables the current multibyte support. Bruce, can this be put on the TODO list one way or the other? This problem has appeared 4 times in two months or so on the JDBC list. Regards, René Pijlman [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Escaping strings for inclusion into SQL queries
I agree with Hannu, that: * make SQL changes to allow PREPARE/EXECUTE in main session, not only in SPI is an important feature to expose out to the client. My primary reason is a performance one. Allowing the client to parse a SQL statement once and then supplying bind values for arguments and executing it multiple times can save a significant amount of server CPU, since the parsing and planning of the statement is only done once, even though multiple executions occur. This functionality is available in the backend (through SPI) and plpgsql uses it, but there isn't any way to take advantage of this SPI functionality on the client (i.e. jdbc, odbc, etc.). I could see this implemented in different ways. One, by adding new SQL commands to bind or execute an already open statement, or two, by changing the FE/BE protocol to allow the client to open, parse, describe, bind, execute and close a statement as separate actions that can be sent to the server in one or more requests. (The latter is how Oracle does it.) I also would like to see this added to the TODO list. thanks, --Barry Hannu Krosing wrote: Bruce Momjian wrote: Your patch has been added to the PostgreSQL unapplied patches list at: http://candle.pha.pa.us/cgi-bin/pgpatches I will try to apply it within the next 48 hours. It has come to our attention that many applications which use libpq are vulnerable to code insertion attacks in strings and identifiers passed to these applications. We have collected some evidence which suggests that this is related to the fact that libpq does not provide a function to escape strings and identifiers properly. (Both the Oracle and MySQL client libraries include such a function, and the vast majority of applications we examined are not vulnerable to code insertion attacks because they use this function.) I think the real difference is what I complained about in another mail to this list - in postgresql you can't do PREPARE / EXECUTE which could _automatically_ detect where string escaping is needed or just eliminate the need for escaping. In postgreSQL you have to construct all queries yourself by inserting your parameters inside your query strings in the right places and escaping them when needed. That is unless you use an interface like ODBC/JDBC that fakes the PREPARE/EXECUTE on the client side and thus does the auto-escaping for you. I think that this should be added to TODO * make portable BINARY representation for frontend-backend protocol by using typsend/typreceive functions for binary and typinput/typoutput for ASCII (as currently typinput==typreceive and typoutput==typsend, I suspect the usage to be inconsistent). * make SQL changes to allow PREPARE/EXECUTE in main session, not only in SPI * make changes to client libraries to support marshalling arguments to EXECUTE using BINARY wire protocol or correctly escaped ASCII. The binary protocol would be very helpful for BYTEA and other big binary types. We therefore suggest that a string escaping function is included in a future version of PostgreSQL and libpq. A sample implementation is provided below, along with documentation. While you are at it you could also supply a standard query delimiter function as this is also a thing that seems to vary from db to db. -- Hannu ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
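For illustration, this is the client access pattern that a server-side PREPARE/EXECUTE (or a parse/bind/execute protocol) would speed up: one parse, many executions with only the bind values changing. With the current driver this is faked client-side, so the backend still parses and plans every execution. The table name is made up for the example.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PrepareOnceExecuteMany {
    public static void insertAll(Connection con, String[] names) throws SQLException {
        // Ideally this statement would be parsed and planned once by the backend...
        PreparedStatement ps =
            con.prepareStatement("INSERT INTO people (name) VALUES (?)");
        for (int i = 0; i < names.length; i++) {
            ps.setString(1, names[i]);   // ...with only the bind value changing here
            ps.executeUpdate();
        }
        ps.close();
    }
}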
Re: [HACKERS] Question about todo item
Can this be added to the TODO list? (actually put back on the TODO list) Along with this email thread? I feel that it is very important to have BLOB support in postgres that is similar to what the commercial databases provide. This could either mean fixing the current implementation or adding additional capabilities to toasted columns. The major problem with the current LargeObject implementation is that when the row containing the LargeObject is deleted the LargeObject isn't. This can be a useful feature under some circumstances, but it isn't how other databases handle BLOBs. Thus porting code from other databases is a challenge. While it is true that this can be worked around through triggers, I don't like the manual nature of the workarounds. thanks, --Barry Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Offhand this seems like it would be doable for a column-value that was actually moved out-of-line by TOAST, since the open_toast_object function could see and return the TOAST pointer, and then the read/ write operations just hack on rows in pg_largeobject. The hard part I am confused how pg_largeobject is involved? s/pg_largeobject/toast_table_for_relation/ ... sorry about that ... Don't forget compression of TOAST columns. How do you fseek/read/write in there? Well, you can *do* it, just don't expect it to be fast. The implementation would have to read or write most of the value, not just the segment you wanted. A person who actually expected to use this stuff would likely want to disable compression on a column he wanted random access within. Hmm ... that provides an idea. We could easily add some additional 'attstorage' settings that say *all* values of a column must be forced out-of-line (with or without allowing compression), regardless of size. Then, open_toast_object would work reliably on such a column. One possible user API to such an infrastructure is to invent BLOB and CLOB datatypes, which are just like bytea and text except that they force the appropriate attstorage value. Ugly as sin, ain't it ... but I bet it could be made to work. Okay, there's your idea. Now, who can do better? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Question about todo item
I was going through the Todo list looking at the items that are planned for 7.2 (i.e. those starting with a '-'). I was doing this to see if any might impact the jdbc driver. The only one that I thought might have an impact on the jdbc code is the item: * -Make binary/file in/out interface for TOAST columns (base64) I looked through the 7.2 docs and I couldn't find any reference to this new functionality, so I am assuming that it isn't completed yet. If this is going to be done for 7.2, I would like to get a better understanding of what functionality is going to be provided. That way I can decide how best to expose that functionality through the jdbc interface. thanks, --Barry ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[GENERAL] Problems with outer joins in 7.1beta5
My problem is that my two outer-joined tables have columns that have the same names. Therefore when my select list tries to reference the columns, they are ambiguously defined. Looking at the doc I see the way to deal with this is by using the following syntax: table as alias (column1alias, column2alias,...) So we can alias the conflicting column names to resolve the problem. However the problem with this is that the column aliases are positional per the table structure. Thus column1alias applies to the first column in the table. Code that relies on the order of columns in a table is very brittle. As adding a column always places it at the end of the table, it is very easy to have a newly installed site have one order (the order the create table command creates them in) and a site upgrading from an older version (where the upgrade simply adds the new columns) have a different column order. My feeling is that postgres has misinterpreted the SQL92 spec in this regard. But I am having problems finding an online copy of the SQL92 spec so that I can verify. What I would expect the syntax to be is: table as alias (columna as aliasa, columnb as aliasb,...) This would allow the query to work regardless of what the table column order is. Generally the SQL spec has tried not to tie query behaviour to the table column order. I will fix my code so that it works given how postgres currently supports the column aliases. Can anyone point me to a copy of the SQL92 spec so that I can research this more? thanks, --Barry ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Re: [GENERAL] Trouble porting postgreSQL to WinNT
Peter, Yes I had the same problem, but for me the reason was that I forgot to start the ipc-daemon before running initdb. Last night I had no problems installing beta4 on WinNT 4.0. thanks, --Barry Peter T Mount wrote: Quoting Tom Lane [EMAIL PROTECTED]: This doesn't make any sense, since genbki.sh has nothing to do with creating the fmgr.h file. I think your rebuild probably cleaned up something else ... hard to tell what though. On a similar vein, is anyone seeing initdb hanging under NT? So far everything compiles, but it just hangs (CPU isn't going anything either, so it's not looping etc). Peter -- Peter Mount [EMAIL PROTECTED] PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/ RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/
Re: [HACKERS] WAL documentation
Not knowing much about WAL, but understanding a good deal about Oracle's logs, I read the WAL documentation below. While it is good, after reading it I am still left with a couple of questions and therefore believe the doc could be improved a bit. The two questions I am left with after reading the WAL doc are: 1) In the 'WAL Parameters' section, paragraph 3 there is the following sentence: "After a checkpoint has been made, any log segments written before the redo record may be removed/archived..." What does the 'may' refer to? Does the database administrator need to go into the directory and remove the no longer necessary log files? What does archiving have to do with this? If I archived all log files, could I roll forward a backup made previously? That is the only reason I can think of that you would archive log files (at least that is why you archive log files in Oracle). 2) The doc doesn't seem to explain how on database recovery the database knows which log file to start with. I think walking through an example of how, after a database crash, the log file is used for recovery would be useful. At least it would make me as a user of postgres feel better if I understood how crashes are recovered from. thanks, --Barry Oliver Elphick wrote: Here is documentation for WAL, as text for immediate review and as SGML source, generated from Vadim's original text with my editing. Please review for correctness. === WAL chapter == Write-Ahead Logging (WAL) in Postgres Author: Written by Vadim Mikheev and Oliver Elphick. General description Write Ahead Logging (WAL) is a standard approach to transaction logging. Its detailed description may be found in most (if not all) books about transaction processing. Briefly, WAL's central concept is that changes to data files (where tables and indices reside) must be written only after those changes have been logged - that is, when log records have been flushed to permanent storage. When we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any changes that have not been applied to the data pages will first be redone from the log records (this is roll-forward recovery, also known as REDO) and then changes made by uncommitted transactions will be removed from the data pages (roll-backward recovery - UNDO). Immediate benefits of WAL The first obvious benefit of using WAL is a significantly reduced number of disk writes, since only the log file needs to be flushed to disk at the time of transaction commit; in multi-user environments, commits of many transactions may be accomplished with a single fsync() of the log file. Furthermore, the log file is written sequentially, and so the cost of syncing the log is much less than the cost of syncing the data pages. The next benefit is consistency of the data pages. The truth is that, before WAL, PostgreSQL was never able to guarantee consistency in the case of a crash. Before WAL, any crash during writing could result in: 1. index tuples pointing to non-existent table rows; 2. index tuples lost in split operations; 3. totally corrupted table or index page content, because of partially written data pages. (Actually, the first two cases could even be caused by use of the "pg_ctl -m {fast | immediate} stop" command.) 
Problems with indices (problems 1 and 2) might have been capable of being fixed by additional fsync() calls, but it is not obvious how to handle the last case without WAL; WAL saves the entire data page content in the log if that is required to ensure page consistency for after-crash recovery. Future benefits In this first release of WAL, UNDO operation is not implemented, because of lack of time. This means that changes made by aborted transactions will still occupy disk space and that we still need a permanent pg_log file to hold the status of transactions, since we are not able to re-use transaction identifiers. Once UNDO is implemented, pg_log will no longer be required to be permanent; it will be possible to remove pg_log at shutdown, split it into segments and remove old segments. With UNDO, it will also be possible to implement SAVEPOINTs to allow partial rollback of invalid transaction operations (parser errors caused by mistyping commands, insertion of duplicate primary/unique keys and so on) with the ability to continue or commit valid operations made by the transaction before the error. At present, any error will invalidate the whole transaction and require a transaction abort. WAL offers the opportunity for a new method for database on-line backup and restore (BAR). To use this method, one would have to make periodic saves of data files to another disk, a tape or another host and also archive the WAL log
Re: [HACKERS] 7.0.3 reproduceable serious select error
I meant to ask this the last time this came up on the list, but now is a good time. Given what Tom describes below as the behavior in 7.1 (initdb stores the locale info), how do you determine what locale a database is running in in 7.1 after initdb? Is there some file to look at? Is there some SQL statement that can be used to select the setting from the DB? thanks, --Barry Tom Lane wrote: Rob van Nieuwkerk [EMAIL PROTECTED] writes: Checking with ps and looking in /proc revealed that postmaster indeed had LANG set to "en_US" in its environment. I disabled the system script that makes this setting, restarted postgres/postmaster and reran my tests. The problem query returns the *right* answer now! Turning LANG=en_US back on gives the old buggy behaviour. Caution: you can't just change the locale willy-nilly, because doing so invalidates the sort ordering of btree indexes. An index built under one sort order is effectively corrupt under another. I recommend that you dumpall, then initdb under the desired LANG setting, then reload, and be careful always to start the postmaster under that same setting henceforth. (BTW, 7.1 prevents this type of index screwup by locking down the database's locale at initdb time --- the ONLY way to change sort order in 7.1 is to initdb with the right locale environment variables. But in 7.0 you gotta be careful about keeping the locale consistent.) I know very little about this LANG, LOCALE etc. stuff. But for our application it is very important to support "weird" characters like "éõåÊ ..." etc. for names. Basically we need all letter symbols in ISO-8859-1 (Latin 1). As long as you are not expecting things to sort in any particular order, it really doesn't matter what locale you run Postgres in. If you do care about sort order of characters that aren't bog-standard USASCII, then you may have a problem. But you can store 'em in any case. regards, tom lane
[HACKERS] Bug in index scans with Locale support enabled
In researching a problem I have uncovered the following bug in index scans when Locale support is enabled. Given a 7.0.3 postgres installation built with Locale support enabled and a default US RedHat 7.0 Linux installation (meaning that the LANG environment variable is set to en_US to enable the US English locale), and given the following table and index structure with the following data: create table test (test_col text); create index test_index on test (test_col); insert into test values ('abc.xyz'); insert into test values ('abcxyz'); insert into test values ('abc/xyz'); If you run the query: select * from test where test_col >= 'abc.'; One would normally expect to only get one record returned, but instead all records are returned. The reason for this is that in the en_US locale all non-alphanumeric characters are ignored when doing string comparisons. So the data above gets treated as: abc.xyz = abcxyz = abc/xyz (as the non-alphanumeric characters '.' and '/' are ignored). This implies that the above query will then return all rows, as the constant 'abc.' is the same as 'abc' for comparison purposes and all rows are >= 'abc'. Note that if you use a different locale, for example en_UK, you will get different results, as this locale does not ignore the . and / in the comparison. Now the real problem comes in when either the like or regex operators are used in a SQL statement. Consider the following SQL: select * from test where test_col like 'abc/%'; This query should return one row, the row for 'abc/xyz'. However if the above query is executed via an index scan it will return the wrong number of rows (0 in this case). Why is this? Well the query plan created for the above like expression looks like the following: select * from test where test_col >= 'abc/' and test_col < 'abc0'; In order to use the index the like has been changed into a '>=' and a '<' for the constant prefix ('abc/') and the constant prefix with the last character incremented by one ('abc0') ('0' is the next character after '/' in ASCII). Given what was shown above about how the en_US locale does comparisons we know that the non-alphanumeric characters are ignored. So the query essentially becomes: select * from test where test_col >= 'abc' and test_col < 'abc0'; and the data it is comparing against is 'abcxyz' in all cases (once the .'s and /'s are removed). Therefore since 'abcxyz' > 'abc0', no rows are returned. Over the last couple of months that I have been on the postgres mail lists there have been a few people who reported that queries of the form "like '/aaa/bbb/%'" don't work. From the above information I have determined that such queries don't work if: a) the database is built with Locale support enabled (--enable-locale) b) the database is running with locale en_US c) the column the like is being performed on is indexed d) the query execution plan uses the above index (Discovering the exact set of circumstances for how to reproduce this has driven me crazy for a while now). The current implementation for converting the like into an index scan doesn't work with Locale support enabled and the en_US locale as shown above. thanks, --Barry PS. 
my test case: drop table test; create table test (test_col text); create index test_index on test (test_col); insert into test values ('abc.xyz'); insert into test values ('abcxyz'); insert into test values ('abc/xyz'); explain select * from test where test_col like 'abc/%'; select * from test where test_col like 'abc/%'; when run against postgres 7.0.3 with locale support enabled (used the standard RPMs on postgresql.org for RedHat) with LANG=en_US: barry=# drop table test; DROP barry=# create table test (test_col text); CREATE barry=# create index test_index on test (test_col); CREATE barry=# insert into test values ('abc.xyz'); INSERT 227611 1 barry=# insert into test values ('abcxyz'); INSERT 227612 1 barry=# insert into test values ('abc/xyz'); INSERT 227613 1 barry=# explain select * from test where test_col like 'abc/%'; NOTICE: QUERY PLAN: Index Scan using test_index on test (cost=0.00..8.14 rows=10 width=12) EXPLAIN barry=# select * from test where test_col like 'abc/%'; test_col -- (0 rows) barry=# when run against postgres 7.0.3 with locale support enabled (used the standard RPMs on postgresql.org) with LANG=en_UK: barry=# drop table test; DROP barry=# create table test (test_col text); CREATE barry=# create index test_index on test (test_col); CREATE barry=# insert into test values ('abc.xyz'); INSERT 227628 1 barry=# insert into test values ('abcxyz'); INSERT 227629 1 barry=# insert into test values ('abc/xyz'); INSERT 227630 1 barry=# explain select * from test where test_col like 'abc/%'; NOTICE: QUERY PLAN: Index Scan using test_index on test (cost=0.00..8.14 rows=10 width=12) EXPLAIN barry=# select * from test where test_col like 'abc/%'; test_col -- abc/xyz (1 row) barry=# Note the second query (under en_UK) returns