Re: [HACKERS] options in conninfo
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> Using libpq's PQconnect function, what is the syntax for the 'options'
> entry in the conninfo?

I think it's stuck straight into the backend command line, so whatever
you read in the 'postgres' reference page applies.

> I think the docs should be updated to give an example.

Send a patch...

			regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [HACKERS] int64/double for time/timestamp
Thomas Hallgren [EMAIL PROTECTED] writes:
> When making PL/Java dynamically adapt to the setting of
> integer_datetimes, I too was bitten by this bug. Is it safe to assume
> that the fix for this will arrive in 8.1.0?

I believe Teodor already committed the change in CVS HEAD.

			regards, tom lane
Re: [HACKERS] options in conninfo
>> Using libpq's PQconnect function, what is the syntax for the 'options'
>> entry in the conninfo?
>
> I think it's stuck straight into the backend command line, so whatever
> you read in the 'postgres' reference page applies.

Oh, I thought it was for setting GUCs at connect time. Is that
possible? I thought it was in the new protocol, but I just cannot find
the docs on it...

Chris
Re: [HACKERS] options in conninfo
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>>> Using libpq's PQconnect function, what is the syntax for the
>>> 'options' entry in the conninfo?
>>
>> I think it's stuck straight into the backend command line, so
>> whatever you read in the 'postgres' reference page applies.
>
> Oh, I thought it was for setting GUCs at connect time. Is that
> possible? I thought it was in the new protocol, but I just cannot find
> the docs on it...

No, the 'options' thing is exceedingly ancient. Try something like

	PGOPTIONS="-fm" psql mydb

or equivalently

	PGOPTIONS="-c enable_mergejoin=false" psql mydb

The GUC-at-connect-time feature is new in the V3 protocol. Offhand I do
not think libpq exposes that to its users; it does use it internally
IIRC.

			regards, tom lane
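[Aside: the `-c name=value` form above generalizes to any number of GUCs.
The helper below is purely a hypothetical convenience function for
building such a PGOPTIONS value, not anything provided by libpq.]

```python
def make_pgoptions(gucs):
    """Build a PGOPTIONS value that sets each GUC via the backend's
    -c flag. Illustrative helper only; not part of libpq."""
    return " ".join("-c %s=%s" % (name, value)
                    for name, value in sorted(gucs.items()))

opts = make_pgoptions({"enable_mergejoin": "false"})
# The value is then passed through the environment, e.g.:
#   PGOPTIONS="-c enable_mergejoin=false" psql mydb
```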
Re: [HACKERS] [BUGS] We are not following the spec for HAVING without GROUP
Bruno Wolff III [EMAIL PROTECTED] writes:
> The case I was thinking of was datatypes without a defined ordering,
> where max and min wouldn't be usable. But if GROUP BY were going to be
> changed to allow any columns when the primary key is used in the GROUP
> BY clause, I can't see any use for those functions. Well, any other
> case will basically be another spelling for DISTINCT ON.

Except DISTINCT ON only handles a limited range of cases. Basically
DISTINCT ON is a special case of GROUP BY where the _only_ aggregate
function you're allowed is first(). Handling the same cases using GROUP
BY would let you mix in other aggregate functions. So where you have:

	select distinct on (x) x, y, z order by x, y, z

you can do the equivalent:

	select x, first(y), first(z) group by x order by x, y, z

But you can also handle the more general case, like:

	select x, first(y), first(z), avg(a), sum(s) group by x order by x, y, z

I don't really care one way or the other about the first() function per
se. But it seems odd to have a feature to handle a special case of an
existing, much more general feature separately. It seems it would be
more worthwhile to handle the general case of aggregate functions that
don't need all the records to generate an answer, including first(),
last(), min(), and max(). That would better handle the cases DISTINCT
ON handles, but also solve many other problems.

-- 
greg
Re: [HACKERS] We are not following the spec for HAVING without GROUP
On Fri, 11 Mar 2005 10:37:13 +1300, Mark Kirkwood [EMAIL PROTECTED]
wrote:
> Firebird 1.5.1 FreeBSD 5.3 [correct results]

Interbase 6.0:

	SQL> create table tab (col integer);
	SQL> select 1 from tab having 1=0;
	SQL> select 1 from tab having 1=1;
	(0 rows)   :-)
	SQL> insert into tab values(1);
	SQL> insert into tab values(2);
	SQL> select 1 from tab having 1=0;
	SQL> select 1 from tab having 1=1;
	        1
	SQL>

Servus
 Manfred
Re: [HACKERS] signed short fd
>> Maybe we make the assumption that all OS will implement fd as an
>> array index.
>
> The POSIX spec requires open() to assign fd's consecutively from zero.
> http://www.opengroup.org/onlinepubs/007908799/xsh/open.html

With all due respect, PostgreSQL now runs natively on Win32. Having a
POSIX-only mentality, especially with something so trivial, is a
mistake. I would say int is the best way to handle it. You just *never*
know.
Re: [HACKERS] [BUGS] We are not following the spec for HAVING without
On 14 Mar 2005, Greg Stark wrote:
>	select distinct on (x) x, y, z order by x, y, z
>
> You can do the equivalent:
>
>	select x, first(y), first(z) group by x order by x, y, z
>
> But you can also handle the more general case, like:
>
>	select x, first(y), first(z), avg(a), sum(s) group by x order by x, y, z
>
> I don't really care one way or the other about the first() function
> per se.

The standard (SQL2003) has what are called windows, where one can do
these things and much more. A window is like a GROUP BY, but you keep
all rows in the result. This can be used, for example, to enumerate the
rows within a window partition using ROW_NUMBER(). That can later be
used in a WHERE clause to select the top 3 rows in each window, or
something like that.

Here is an example that calculates the avg and sum for each window. It
returns all the rows (x values) in the window together with a row
number (within the window) and the 2 aggregate results. In this case
the aggregates will be the same for all rows in the partition, but one
can also get it to do a kind of sliding-window aggregate (for example
the average of the row before and the row after the current row):

	SELECT ROW_NUMBER() OVER bar AS num,
	       x,
	       avg(a) OVER bar,
	       sum(a) OVER bar
	  FROM foo
	WINDOW bar AS (PARTITION BY x ORDER BY x, y, z);

and then one can put that whole thing in a subselect and just select
the rows with num = 1.

This doesn't mean that we don't want functions like first() and
last(); they are also useful. I just wanted to point out that with
SQL2003 one can write queries with the same effect as the above (but
much more complicated, of course :-).

ps. All I know about the window functions is from what I've read in the
draft of the SQL2003 standard. That's not the perfect way to learn
about new features, so I won't bet my life on the above example working
as is. If someone knows better, I'd like to hear about it.

ps2. I'd love to read a book that discusses SQL2003 (or even SQL99),
explains the features, gives examples, and so on. But I guess the
market doesn't want books that explain things that no database has
implemented yet (Oracle has window functions, but I've never used
them).

-- 
/Dennis Björklund
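[The semantics of that windowed query can be sketched procedurally:
every input row survives, annotated with its ROW_NUMBER() and the
partition-wide avg/sum. A rough illustration, not how an executor would
actually implement it:]

```python
from itertools import groupby

# (x, a) pairs, already ordered by x (the PARTITION BY column)
rows = [(1, 5.0), (1, 7.0), (2, 3.0)]

result = []
for x, part in groupby(rows, key=lambda r: r[0]):
    part = list(part)
    total = sum(a for _, a in part)
    avg = total / len(part)
    for num, (_, a) in enumerate(part, start=1):
        # every row survives, unlike GROUP BY; num is ROW_NUMBER()
        result.append((num, x, avg, total))

assert result == [(1, 1, 6.0, 12.0), (2, 1, 6.0, 12.0), (1, 2, 3.0, 3.0)]
```

Filtering on `num == 1` afterwards then reproduces the DISTINCT ON
behavior, which is the "subselect" trick mentioned above.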
Re: [HACKERS] Raw size
On Thursday, 10 March 2005, at 20:07+0200, Ioannis Theoharis wrote:
> Hi, I have a table:
>
>	create table triples(
>		att0 varchar(1000),
>		att1 int4,
>		att2 varchar(20),
>		att3 varchar(1000)
>	)
>
> My table has 990 rows. The (possibly wrong) way in which I compute the
> size of the table is:

Varchar fields (actually most *char and text fields) use only
actual-length bytes plus some overhead for the tuple header and page
header, so unless you fill all the varchar(1000) fields with exactly
1000-byte strings, you should use less than that.

> Is there any compression or what?

Compression is not used for tuples under 2k, so there _may_ be
compression depending on your exact data and TOAST settings.

-- 
Hannu Krosing [EMAIL PROTECTED]
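[A back-of-the-envelope per-row estimate for the table above. The
constants are assumptions in the spirit of 8.0-era layouts (a 4-byte
varlena length word, roughly 28 bytes of heap tuple header) and
alignment padding is ignored, so treat the numbers as rough:]

```python
# Assumed constants, not exact on-disk values:
VARLENA_HDR = 4   # length word preceding each varchar value
TUPLE_HDR = 28    # approximate heap tuple header

def row_size(att0_len, att2_len, att3_len):
    return (TUPLE_HDR
            + VARLENA_HDR + att0_len   # att0 varchar(1000): actual length only
            + 4                        # att1 int4
            + VARLENA_HDR + att2_len   # att2 varchar(20)
            + VARLENA_HDR + att3_len)  # att3 varchar(1000)

# 10-byte strings in every varchar column: nowhere near 1000 bytes apiece.
assert row_size(10, 10, 10) == 74
```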
Re: [HACKERS] [PERFORM] How to read query plan
Miroslav Šulc [EMAIL PROTECTED] writes:
> As there are a lot of varchar(1) columns in the AdDevicesSites table,
> wouldn't it be helpful to change them to char(1)? Would it solve the
> variable-width problem at least for some fields and speed the query
> up?

No, because char(1) isn't physically fixed-width (consider multibyte
encodings). There's really no advantage to char(N) in Postgres.

I don't know what you're doing with those fields, but if they are
effectively booleans or small codes, you might be able to convert them
to bool or int fields. There is also the "char" datatype (not to be
confused with char(1)), which can hold single ASCII characters but is
nonstandard and a bit impoverished as to functionality. However, I
doubt this is worth pursuing.

One of the things I tested yesterday was a quick hack to organize the
storage of intermediate join tuples with fixed-width fields first and
non-fixed ones later. It really didn't help much at all :-(. I think
the trouble with your example is that in the existing code, the really
fast path applies only when the tuple contains no nulls --- and since
you're doing all that left joining, there's frequently at least one
null lurking.

			regards, tom lane
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] writes:
>> The POSIX spec requires open() to assign fd's consecutively from
>> zero.
>> http://www.opengroup.org/onlinepubs/007908799/xsh/open.html
>
> With all due respect, PostgreSQL now runs natively on Win32.

... using the POSIX APIs that Microsoft so kindly provides. fd.c will
certainly not work at all on a platform that doesn't provide a
POSIX-like file access API, and in the absence of any evidence to the
contrary, I don't see why we shouldn't assume that the platform adheres
to that part of the spec too.

			regards, tom lane
Re: [HACKERS] [PERFORM] How to read query plan
Miroslav Šulc [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> Actually, we already had a pending patch (from Atsushi Ogawa) that
>> eliminates that particular O(N^2) behavior in another way. After
>> applying it, I get about a factor-of-4 reduction in the runtime for
>> Miroslav's example.
>
> Is there a chance we will see this patch in the 8.0.2 release?

No. We are not in the habit of making non-bug-fix changes in stable
branches. Ogawa's patch is in CVS for 8.1.

			regards, tom lane
Re: [HACKERS] [BUGS] We are not following the spec for HAVING without GROUP
Dennis Bjorklund [EMAIL PROTECTED] writes:
> The standard (SQL2003) has what are called windows, where one can do
> these things and much more.

OLAP functions would be very nice. But they're not the same thing. In
fact, that's precisely *why* they would be really nice: they allow you
to do things that are not feasible to do with just plain old GROUP BY.

But I stared at this spec for a while and it seemed like implementing
it would be pretty hard. Like, I don't see any way to implement your
query below without sorting every record in the group repeatedly for
every record. That would be a lot of sorts.

>	SELECT ROW_NUMBER() OVER bar AS num,
>	       x,
>	       avg(a) OVER bar,
>	       sum(a) OVER bar
>	  FROM foo
>	WINDOW bar AS (PARTITION BY x ORDER BY x, y, z);

Note that, as you said, this returns just as many records as are in the
original table. The OLAP functions here are just regular functions, not
aggregate functions -- albeit functions that use data from records
other than the one being output. DISTINCT ON is just another spelling
for GROUP BY; it always outputs only one record per group.

> ps2. I'd love to read a book that discusses SQL2003 (or even SQL99),
> explains the features, gives examples, and so on. But I guess the
> market doesn't want books that explain things that no database has
> implemented yet (Oracle has window functions, but I've never used
> them).

DB2 has them too. Check out this DB2 documentation; it might be more
helpful than the specs:

http://publib.boulder.ibm.com/infocenter/rb63help/index.jsp?topic=/com.ibm.redbrick.doc6.3/sqlrg/sqlrg35.htm

-- 
greg
Re: [HACKERS] [BUGS] We are not following the spec for HAVING without
On 14 Mar 2005, Greg Stark wrote:
>> SELECT ROW_NUMBER() OVER bar AS num,
>>        x,
>>        avg(a) OVER bar,
>>        sum(a) OVER bar
>>   FROM foo
>> WINDOW bar AS (PARTITION BY x ORDER BY x, y, z);
>
> Note that, as you said, this returns just as many records as are in
> the original table. The OLAP functions here are just regular
> functions, not aggregate functions

They are aggregate functions; the avg() is a "window aggregate
function" according to the standard. It runs over all values in the
same partition.

> -- albeit functions that use data from records other than the one
> being output.

Yes, and not just one other record, but a number of them. Isn't that
what aggregate functions are?

Anyway, I just wanted to point to this area in the standard to tell
others what is possible using standard constructs. It doesn't really
help anything in this specific case. Pg will not have any of this
implemented in the near future (I guess). There is always a chance that
someone sees the mail, gets interested, learns about it, and then
implements it :-)

-- 
/Dennis Björklund
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] writes:
>>> The POSIX spec requires open() to assign fd's consecutively from
>>> zero.
>>> http://www.opengroup.org/onlinepubs/007908799/xsh/open.html
>>
>> With all due respect, PostgreSQL now runs natively on Win32.
>
> ... using the POSIX APIs that Microsoft so kindly provides. fd.c will
> certainly not work at all on a platform that doesn't provide a
> POSIX-like file access API, and in the absence of any evidence to the
> contrary, I don't see why we shouldn't assume that the platform
> adheres to that part of the spec too.

I'm a "better safe than sorry" sort of guy. I would rather code
defensively against a poorly implemented API. However:

	Upon successful completion, the function will open the file and
	return a non-negative integer representing the lowest numbered
	unused file descriptor. Otherwise, -1 is returned and errno is
	set to indicate the error. No files will be created or modified
	if the function returns -1.

That is hardly anything that I would feel comfortable with. Let's break
this down into all the areas that are ambiguous:

"Unused file descriptor": define "unused." Is it unused ever, or
currently unused? Could an API developer simply just increment file
opens? What about just allocating a structure on each open, and
returning its pointer cast to an int?

Also notice that no mention of process separation exists; it could very
well be that a file descriptor may be usable system-wide, with the
exceptions of stdin, stdout, and stderr.

Nowhere does it say how the file descriptors are numbered. 1, 2, 3, 4,
sure, that's what you expect, but it isn't an explicitly documented
behavior. What is documented, however, is that it is a machine int,
that the number will be non-negative, and that it will be the lowest
unused descriptor (depending on the definition of "unused").

This is the sort of thing that makes software brittle and likely to
crash. Sure, you may be right in saying a short int is enough. Some
developer creating a POSIX clib may think he is right doing something
his way. What happens is that there is a potentially serious bug that
will only show up at seemingly random times.

The fact is that it is PostgreSQL that would be wrong; the API is
documented as returning an int, and PostgreSQL casts it to a short.
Whatever you read into the implementation of the API is wrong. The API
is an abstraction, and you should assume you don't know anything about
it.
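[The hazard being argued about, in miniature: squeezing a value that
the API defines as an int into a signed 16-bit short silently corrupts
anything that doesn't fit. A quick illustration of the truncation:]

```python
import ctypes

def as_short(fd):
    # Mimic storing an int file descriptor in a signed short:
    # ctypes wraps out-of-range values exactly like a C narrowing cast.
    return ctypes.c_short(fd).value

assert as_short(100) == 100        # small fds survive the cast
assert as_short(40000) == -25536   # a large fd wraps to a negative number
```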
Re: [HACKERS] [BUGS] We are not following the spec for HAVING without GROUP
Dennis Bjorklund [EMAIL PROTECTED] writes:
>> The OLAP functions here are just regular functions, not aggregate
>> functions -- albeit functions that use data from records other than
>> the one being output.
>
> They are aggregate functions; the avg() is a "window aggregate
> function" according to the standard. It runs over all values in the
> same partition. Yes, and not just one other record, but a number of
> them. Isn't that what aggregate functions are?

Er, yeah. They're aggregate functions in that they consider many
records as input. However, they're not aggregate functions in that they
produce an output for every record, as opposed to outputting only one
value for a whole group.

-- 
greg
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] writes:
> That is hardly anything that I would feel comfortable with. Let's
> break this down into all the areas that are ambiguous:

There isn't anything ambiguous about this, nor is it credible that
there are implementations that don't follow the intent of the spec.
Consider the standard paradigm for replacing stdout: you close(1) and
then open() the target file. If the open() doesn't pick 1 as the fd,
you're screwed. Every shell in the world would break atop such an
implementation.

It may well be the case that saving 4 bytes per VFD is useless
micro-optimization. But the code isn't broken as it stands.

			regards, tom lane
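[The paradigm Tom describes can be demonstrated directly: POSIX open()
must return the lowest-numbered unused descriptor, so closing an fd and
opening again reuses exactly that slot. A small sketch (assumes a
POSIX-like platform):]

```python
import os

a = os.open(os.devnull, os.O_RDONLY)
b = os.open(os.devnull, os.O_RDONLY)
os.close(a)                      # free the lower-numbered slot
c = os.open(os.devnull, os.O_RDONLY)
assert c == a                    # the reopened file lands in the freed slot
os.close(b)
os.close(c)
```

The shell trick of redirecting stdout is the same mechanism with fd 1:
close(1), then open() the target file, which necessarily becomes fd 1.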
Re: [HACKERS] invalidating cached plans
On Mon, Mar 14, 2005 at 02:53:36AM -0500, Tom Lane wrote:
> Probably the first thing to do is look around at the plausible users
> of this thing and see what they'd find most convenient.

This may be totally irrelevant: our current load distributors, like
pgpool, have no way of knowing the side effects of backend functions.
It would be interesting if the client could send each potential query
to the master saying, "execute this query if there are side effects;
otherwise do no operation and let me execute this read-only query on a
replicated copy."

-Mike
[HACKERS] Avoiding tuple construction/deconstruction during joining
Miroslav Šulc [EMAIL PROTECTED] writes:
[ concerning a deeply nested LEFT JOIN to get data from a star schema ]
> So I have some results. I have tested the query on both PostgreSQL
> 8.0.1 and MySQL 4.1.8 with LIMIT set to 30 and OFFSET set to 6000. The
> PostgreSQL result is 11,667.916 ms, the MySQL result is 448.4 ms.

That's a fairly impressive discrepancy :-(, and even the slot_getattr()
patch that Atsushi Ogawa provided isn't going to close the gap. (I got
about a 4x speedup on Miroslav's example in my testing, which leaves us
still maybe 6x slower than MySQL.)

Looking at the post-patch profile for the test case, there are still
quite a lot of cycles going into tuple assembly and disassembly:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
24.47      4.49     4.49                             _mcount
 8.01      5.96     1.47  9143692     0.00     0.00  ExecEvalVar
 6.92      7.23     1.27  6614373     0.00     0.00  slot_deformtuple
 6.54      8.43     1.20  9143692     0.00     0.00  slot_getattr
 6.21      9.57     1.14   103737     0.01     0.03  ExecTargetList
 5.56     10.59     1.02   103775     0.01     0.01  DataFill
 3.22     11.18     0.59   103775     0.01     0.01  ComputeDataSize
 2.83     11.70     0.52                             ExecEvalVar
 2.72     12.20     0.50  9094122     0.00     0.00  memcpy
 2.51     12.66     0.46                             encore
 2.40     13.10     0.44   427448     0.00     0.00  nocachegetattr
 2.13     13.49     0.39   103775     0.00     0.02  heap_formtuple
 2.07     13.87     0.38                             noshlibs
 1.20     14.09     0.22   225329     0.00     0.00  _doprnt
 1.20     14.31     0.22                             msquadloop
 1.14     14.52     0.21                             chunks
 0.98     14.70     0.18   871885     0.00     0.00  AllocSetAlloc
 0.98     14.88     0.18                             $$dyncall
 0.76     15.02     0.14   594242     0.00     0.00  FunctionCall3
 0.71     15.15     0.13   213312     0.00     0.00  comparetup_heap
 0.65     15.27     0.12     6364     0.02     0.13  printtup
 0.60     15.38     0.11   790702     0.00     0.00  pfree

(_mcount is profiling overhead; ignore it.) It looks to me like just
about everything in the top dozen functions is there as a result of the
fact that join steps form new tuples that are the merge of their input
tuples. Even our favorite villains, palloc and pfree, are down in the
sub-percent range.

I am guessing that the reason MySQL wins on this is that they avoid
doing any data copying during a join step. I wonder whether we could
accomplish the same by taking Ogawa's patch to the next level: allow a
TupleTableSlot to contain either a materialized tuple, as now, or a
"virtual" tuple that is simply an array of Datums and null flags. (It's
virtual in the sense that any pass-by-reference Datums would have to be
pointing to data at the next level down.) This would essentially turn
the formtuple and deformtuple operations into no-ops, and get rid of a
lot of the associated overhead such as ComputeDataSize and DataFill.
The only operations that would have to forcibly materialize a tuple
would be ones that need to keep the tuple till after they fetch their
next input tuple --- hashing and sorting are examples, but very many
plan node types don't ever need to do that.

I haven't worked out the details, but it seems likely that this could
be a relatively nonintrusive patch. The main thing that would be an
issue would be that direct reference to slot->val would become verboten
(since you could no longer be sure there was a materialized tuple
there). I think this would possibly affect some contrib stuff, which is
a strong hint that it'd break some existing user-written code out
there.

Thoughts?

			regards, tom lane
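[The "virtual tuple" idea can be sketched as a slot that keeps bare
per-column values and only builds a physical tuple on demand. All names
here (VirtualSlot, materialize) are illustrative, not PostgreSQL's:]

```python
class VirtualSlot:
    def __init__(self, values, nulls):
        self.values = values          # per-column Datums
        self.nulls = nulls            # per-column null flags
        self._tuple = None            # materialized form, built lazily

    def getattr(self, i):
        # join steps can fetch columns without ever forming a tuple
        return None if self.nulls[i] else self.values[i]

    def materialize(self):
        # only nodes that must retain the row (sort, hash) pay this cost
        if self._tuple is None:
            self._tuple = tuple(None if n else v
                                for v, n in zip(self.values, self.nulls))
        return self._tuple

slot = VirtualSlot([1, "abc", 2.5], [False, False, True])
assert slot.getattr(1) == "abc"       # no materialization needed
assert slot._tuple is None            # formtuple was skipped entirely
assert slot.materialize() == (1, "abc", None)
```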
Re: [HACKERS] invalidating cached plans
Michael Adler [EMAIL PROTECTED] writes:
> Our current load distributors, like pgpool, have no way of knowing the
> side effects of backend functions. It would be interesting if the
> client could send each potential query to the master saying, "execute
> this query if there are side effects; otherwise do no operation and
> let me execute this read-only query on a replicated copy."

Wouldn't you want to handle that the other way around? I mean, there's
not much point in distributing the load if it still requires passing
everything through a single point of contention anyways.

So I think the feature you really want is a kind of read-only mode:
"execute this, but if it tries to have any side effects, abort and give
me an error."

That seems like a reasonably useful thing for other circumstances as
well: DBAs sanity-checking a database who don't want to make any
modifications, low-privilege users like cron jobs that aren't supposed
to be making modifications, etc. In an ideal world it would combine
well with having tablespaces on read-only media.

I had the impression Postgres wants to make modifications to data even
for purely read-only operations, though. It might be hard to detect
side effects that the user would care about as distinct from invisible
internal operations.

-- 
greg
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests
Nicolai Tufar wrote:
> On Thu, 10 Mar 2005 19:21:41 -0500 (EST), Bruce Momjian
> pgman@candle.pha.pa.us wrote:
>>> The CVS-tip implementation is fundamentally broken and won't work
>>> even for our internal uses. I've not wasted time complaining about
>>> it because I thought we were going to replace it. If we can't find a
>>> usable replacement then we're going to have to put a lot of effort
>>> into fixing what's there. On the whole I think the effort would be
>>> better spent importing someone else's solution.
>>
>> Oh, so our existing implementation doesn't even meet our needs. OK.
>> (Your new patch is in the queue.)

I have been thinking about our current snprintf() implementation. As I
remember, we use our own snprintf mostly on platforms whose native
snprintf doesn't support long long, and now on those that don't support
%n$. I am wondering if we should just process long long args and %n$
args, and pass everything else to the native snprintf. In fact, one
trick would be to substitute for long long and %n$ in the printf format
string, and then pass that to the native libc printf, with adjustments
for the printf format arguments. That might be simpler than emulating
all of snprintf.

FYI, now that we are using pg_snprintf macros, the native snprintf is
available to us.

Anyway, I am sure there are some platforms that don't have vsnprintf or
snprintf, but could we just say we don't support them, or emulate one
if we only have the other?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  + If your life is a hard drive,  |  13 Roberts Road
  + Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
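[The substitution trick Bruce describes, sketched for the %n$ case:
rewrite a format string using positional specifiers into a plain
sequential format plus a reordered argument list, so a native printf
lacking %n$ support can still handle it. A toy model, not the actual
port/snprintf.c code:]

```python
import re

def rewrite_positional(fmt, args):
    """Turn '%2$s=%1$d' plus args into ('%s=%d', reordered args)."""
    order = []
    def repl(m):
        order.append(int(m.group(1)) - 1)   # remember which arg goes here
        return "%" + m.group(2)             # drop the n$ part
    new_fmt = re.sub(r"%(\d+)\$(\w)", repl, fmt)
    return new_fmt, [args[i] for i in order]

fmt, args = rewrite_positional("%2$s=%1$d", (42, "answer"))
assert fmt == "%s=%d"
assert fmt % tuple(args) == "answer=42"
```

The hard part Tom raises next is exactly what this sketch glosses over:
in C you cannot rebuild the variadic argument list to hand to the
native snprintf.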
Re: [HACKERS] Avoiding tuple construction/deconstruction during joining
Miroslav Šulc [EMAIL PROTECTED] writes:
> Seriously, I am far below this level of knowledge. But I can
> contribute a test that (maybe) can help. I have rewritten the query so
> it JOINs the varchar() fields (in fact all fields except the IDPK) at
> the last INNER JOIN. Though there is one more JOIN, the query is more
> than 5 times faster (1975.312 ms) :-)

That confirms my thought that passing the data up through multiple
levels of join is what's killing us. I'll work on a solution. This will
of course be even less back-patchable to 8.0.* than Ogawa's work, but
hopefully it will fix the issue for 8.1.

			regards, tom lane
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests to fail
Bruce Momjian pgman@candle.pha.pa.us writes:
> I am wondering if we should just process long long args and %n$ args,
> and pass everything else to the native snprintf.

AFAICS this is a non-starter --- how will you construct the call to
snprintf? Or even vsnprintf? C doesn't provide the tools you need to
make it happen.

			regards, tom lane
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests
Tom Lane wrote:
> Bruce Momjian pgman@candle.pha.pa.us writes:
>> I am wondering if we should just process long long args and %n$
>> args, and pass everything else to the native snprintf.
>
> AFAICS this is a non-starter --- how will you construct the call to
> snprintf? Or even vsnprintf? C doesn't provide the tools you need to
> make it happen.

Couldn't you spin through the varargs and reconstruct a new one? Is
there no way to create a va_list structure in C?

-- 
  Bruce Momjian
Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression tests to fail
Bruce Momjian pgman@candle.pha.pa.us writes:
> Is there no way to create a va_list structure in C?

Exactly. The standard lets you *read out* from such a structure, but
there's no provision for creating one on the fly.

			regards, tom lane
Re: [HACKERS] signed short fd
>> That is hardly anything that I would feel comfortable with. Let's
>> break this down into all the areas that are ambiguous:
>
> There isn't anything ambiguous about this, nor is it credible that
> there are implementations that don't follow the intent of the spec.

How do you know the intent of the spec? I have seen no meta-discussion
about the behavior of the file descriptor integer returned from open().
The Stevens book makes no such assumptions, and the Stevens book
(Advanced Programming in the UNIX Environment) is what people
reference.

> Consider the standard paradigm for replacing stdout: you close(1) and
> then open() the target file. If the open() doesn't pick 1 as the fd,
> you're screwed. Every shell in the world would break atop such an
> implementation.

I said that stdin, stdout, and stderr would be treated differently, as
they are on all platforms.

> It may well be the case that saving 4 bytes per VFD is useless
> micro-optimization. But the code isn't broken as it stands.

It most likely is not broken as it is, but it would be interesting to
put an assert(fd < 32768) in the code and see if it ever breaks.
Nevertheless, the spec DOES call for file fds to be a machine int. All
acceptable coding practices would demand that since the API spec calls
for an int, the application should use an int. This is the sort of
thing that is caught and fixed in any standard code review.

Why is this an argument? What am I missing that you are defending?
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] wrote:
> How do you know the intent of the spec? I have seen no meta-discussion
> about the behavior of the file descriptor integer returned from
> open(). The Stevens book makes no such assumptions, and the Stevens
> book (Advanced Programming in the UNIX Environment) is what people
> reference.

My copy of APUE says on page 49:

	The file descriptor returned by open is the lowest numbered
	unused descriptor. This is used by some applications to open a
	new file on standard input, standard output, or standard error.

Unless someone can show there's an actual problem, this discussion
seems quite pointless.

cheers

andrew
Re: [HACKERS] signed short fd
> My copy of APUE says on page 49:
>
>	The file descriptor returned by open is the lowest numbered
>	unused descriptor. This is used by some applications to open a
>	new file on standard input, standard output, or standard error.

Yes. I'll restate my questions: What is meant by "unused"? Is it to be
read to mean that a higher-numbered fd is *never* returned if there is
a lower number that has been used and is now available? Is that
something we can 100% absolutely depend on, on all current and future
platforms?

It is a stupid idea to truncate the upper bytes of an integer without
good reason. I can see LOTS of reasons why this will break something in
the future. The upper bits may be used to identify storage media or
characteristics. My point is that the spec calls for an int, so
PostgreSQL should use an int.

> Unless someone can show there's an actual problem, this discussion
> seems quite pointless.

The point is that this *is* silly, but I am at a loss to understand why
it isn't a no-brainer to change. Why is there a fight over a trivial
change which will ensure that PostgreSQL aligns to the documented
behavior of open()?
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] wrote: The point is that this *is* silly, but I am at a loss to understand why it isn't a no-brainer to change. Why is there a fight over a trivial change which will ensure that PostgreSQL aligns to the documented behavior of open() (Why characterise this as a fight, rather than a discussion? Perhaps it is because of the same combative, adversarial attitude you seem to bring to every discussion you're involved in on -hackers...) Anyway, I agree, there's no point keeping it a short; I highly doubt this would actually be a problem, but we may as well change it to an int. -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] invalidating cached plans
In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes: One possible approach is to do the invalidation on a sufficiently coarse grain that we don't care. For example, I would be inclined to make any change in a table's schema invalidate all plans that use that table at all; that would then subsume the constraint problem for instance. This doesn't solve the inlined function problem however. How about using an even coarser grain? Whenever something in the database in question changes, blindly throw away all cached plans for this DB. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] invalidating cached plans
Harald Fuchs wrote: In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes: One possible approach is to do the invalidation on a sufficiently coarse grain that we don't care. For example, I would be inclined to make any change in a table's schema invalidate all plans that use that table at all; that would then subsume the constraint problem for instance. This doesn't solve the inlined function problem however. How about using an even coarser grain? Whenever something in the database in question changes, blindly throw away all cached plans for this DB. We could, but the creation of a single temp table would invalidate all caches, and temp table creation might be pretty frequent. One idea would be to record if the function uses non-temp tables, temp tables, or both, and invalidate based on the type of table being invalidated, rather than the table name itself. I can imagine this hurting temp table caching, but at least functions using regular tables would not be affected, and functions using temp tables would work reliably. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] invalidating cached plans
Bruce Momjian wrote: One idea would be to record if the function uses non-temp tables, temp tables, or both, and invalidate based on the type of table being invalidated, rather than the table name itself. I can imagine this hurting temp table caching, but at least functions using regular tables would not be affected, and functions using temp tables would work reliably. It seems to me it's not _that_ difficult to invalidate plans at a more granular level (based on the individual database objects they depend upon). Inlined functions need to be handled, but that is doable -- it just needs some work. The efficiency win of not needlessly throwing away cached plans is worth investing some effort, I think. -Neil ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] invalidating cached plans
Harald Fuchs [EMAIL PROTECTED] writes How about using an even coarser grain? Whenever something in the database in question changes, blindly throw away all cached plans for this DB. If we clearly define what is something in database in question, we have to trace all the objects the query will touch. There are two difficulties: First, even if we can trace all the changes to the objects we will touch, it is still difficult to differentiate which changes invalidate the plan and which do not. For instance, if the random() function changes its behavior in two ways: (1) it changes its returned precision, then there is no problem for our plan; (2) it changes its distribution, then it might be a problem for our plan. A fast solution to this problem is to discard all the plans once the referenced object changes (no matter what the change is). Second (as Tom says), some changes can hardly be traced. For example, we only use function A. But function A cites function B, and function B cites function C. When C changes, how do we know that we should worry about our plan? Maybe we not only need a caller-graph, we also need a callee-graph. But I am afraid this will be a big cost. A fast solution is to forbid some kinds of queries from being cached. Regards, Qingqing ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] invalidating cached plans
This may be totally irrelevant: Our current load distributors, like pgpool, have no way of knowing the side effects of backend functions. It would be interesting if the client could send each potential query to the master saying, execute this query if there are side effects, otherwise do no operation and let me execute this read-only query on a replicated copy. You can go 'SET TRANSACTION READ ONLY;' or something... Chris ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] invalidating cached plans
On Mon, 2005-03-14 at 20:06 -0500, Bruce Momjian wrote: Harald Fuchs wrote: In article [EMAIL PROTECTED], Tom Lane [EMAIL PROTECTED] writes: One possible approach is to do the invalidation on a sufficiently coarse grain that we don't care. For example, I would be inclined to make any change in a table's schema invalidate all plans that use that table at all; that would then subsume the constraint problem for instance. This doesn't solve the inlined function problem however. How about using an even coarser grain? Whenever something in the database in question changes, blindly throw away all cached plans for this DB. We could, but the creation of a single temp table would invalidate all caches, and temp table creation might be pretty frequent. One idea would be to record if the function uses non-temp tables, temp tables, or both, and invalidate based on the type of table being invalidated, rather than the table name itself. I can imagine this hurting temp table caching, but at least functions using regular tables would not be affected, and functions using temp tables would work reliably. Too coarse I think, especially with schemas being considered user workspaces where they are free to add or modify their structures as they like (for maintenance, reports, temporary storage, etc.) but there are a significant number of prepared statements in the controlled segments of the database. This would cause the system to hiccup fairly regularly still when a couple hundred connections are forced to replan their queries. -- ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] invalidating cached plans
Qingqing Zhou wrote: Second (as Tom says), some changes can hardly be traced. For example, we only use function A. But function A cites function B, function B cites function C. When C changes, how do we know that we should worry about our plan? I don't see that this is a major problem. If a plan A invokes a function B, then changes to B will need to invalidate A; that should be pretty easy to arrange. If B is a PL/PgSQL function that invokes a function C, it will probably cache a plan involving C. But when C changes, we need only flush B's cached plan, _not_ A -- as far as A is concerned, the operation of B is a black box. The only exception is when B is a SQL function that is inlined, but we can handle that separately. Regarding performance, the important point is that a DDL command pushes changes out to backends to invalidate cached plans -- a plan doesn't need to poll to see if there have been any changes to objects it depends upon. And on a production system, DDL should usually be infrequent (the primary exception is temp table creation/destruction, but we can potentially optimize for that since it is backend-local). Or am I missing your point? -Neil ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] TODO item: support triggers on columns
Tom Lane wrote: http://developer.postgresql.org/docs/postgres/catalog-pg-trigger.html says currently unused and a desultory search through the sources confirms that. I imagine it was put there with the intent of adding column info later on. However, being a fixed-width array, I don't actually see that it would be useful for anything much ... you'd at least want to change it to a normal array. I grepped the sources and saw the same: I just wanted to make sure that nobody had any plans on it. If so, speak now, for I am planning on using it. As for the type, I presume you mean something like grolist in pg_group would be more applicable than indkey in pg_index. BTW, if you don't have a convenient way of grepping the entire PG source tree and quickly viewing all the hits on a particular symbol, I *highly* recommend setting up something that can do that. I use emacs + glimpse but there are probably newer tools out there. I actually use emacs and glimpse, but not together. I won't bother to ask if emacs has a glimpse-mode: it's emacs after all, so of course it does. :) I'll have to give it a shot. Thanks, -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200503140712 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] signed short fd
[EMAIL PROTECTED] wrote: The point is that this *is* silly, but I am at a loss to understand why it isn't a no-brainer to change. Why is there a fight over a trivial change which will ensure that PostgreSQL aligns to the documented behavior of open()? (Why characterise this as a fight, rather than a discussion? Perhaps it is because of the same combative, adversarial attitude you seem to bring to every discussion you're involved in on -hackers...) I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but oftentimes I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get into right now. Anyway, I agree, there's no point keeping it a short; I highly doubt this would actually be a problem, but we may as well change it to an int. And this is my point. There are things that are no-brainers, and a few times I have been completely dumbfounded as to the source of resistance. Silently truncating the upper 2 bytes of a data type declared as an int is a bug. I can't believe anyone would defend it, but here it happens. Maybe it is me. I know I'm stubborn and confrontational; personally I've wished I could be different, but I'm 42 so I guess I'm not going to change any time soon. Regardless of the source, if you want code to be portable, you have to take APIs at face value. Any assumptions you think you can make are by definition wrong. Allow the API authors the space to handle what they need to handle. Assuming a specific behavior is dangerous. Is it currently a problem? Most likely not, but since there is no downside, why leave it lurking to bite us? ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] signed short fd
I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but oftentimes I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get into right now. Perhaps it's your lack of a real name and complete anonymity (hence invulnerability) that gets to people... ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] signed short fd
At 2005-03-14 16:25:22 -0500, [EMAIL PROTECTED] wrote: The file descriptor returned by open is the lowest numbered unused descriptor. [...] What is meant by unused? Perhaps you should actually look at the standard. The open( ) function shall return a file descriptor for the named file that is the lowest file descriptor not currently open for that process. The close( ) function shall deallocate the file descriptor indicated by fildes. To deallocate means to make the file descriptor available for return by subsequent calls to open( ) or other functions that allocate file descriptors. Is it read to mean that a higher number file is *never* returned if there is a lower number that has been used and is now available? Yes. -- ams ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] signed short fd
Christopher Kings-Lynne wrote: I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but oftentimes I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get into right now. Perhaps it's your lack of a real name and complete anonymity (hence invulnerability) that gets to people... I actually met him _briefly_ at Linuxworld in Boston. He just said hi, then disappeared. :-) -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] signed short fd
Christopher Kings-Lynne wrote: I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but oftentimes I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get into right now. Perhaps it's your lack of a real name and complete anonymity (hence invulnerability) that gets to people... Is it fixed? I actually met him _briefly_ at Linuxworld in Boston. He just said hi, then disappeared. :-) Bruce, I did want to meet you to a greater extent, but you were surrounded by people and looked quite busy. ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] signed short fd
Perhaps it's your lack of a real name and complete anonymity (hence invulnerability) that gets to people... Is it fixed? Yeah, hi Mark :) Chris ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] signed short fd
Mark Woodward wrote: Christopher Kings-Lynne wrote: I really don't intend to do that, and it does seem to happen a lot. I am the first to admit I lack tact, but oftentimes I view the decisions made as rather arbitrary and lacking a larger perspective, but that is a rant I don't want to get into right now. Perhaps it's your lack of a real name and complete anonymity (hence invulnerability) that gets to people... Is it fixed? Wow, he comes out of the shadows. :-) I actually met him _briefly_ at Linuxworld in Boston. He just said hi, then disappeared. :-) Bruce, I did want to meet you to a greater extent, but you were surrounded by people and looked quite busy. Yea, I was just teasing. It was a very busy conference. I remember at night just wanting to turn myself off. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] signed short fd
On Mon, Mar 14, 2005 at 10:45:51PM -0500, Bruce Momjian wrote: Mark Woodward wrote: Bruce, I did want to meet you to a greater extent, but you were surrounded by people and looked quite busy. Yea, I was just teasing. It was a very busy conference. I remember at night just wanting to turn myself off. Were you able to? That'd make a very cool trick. -- Alvaro Herrera ([EMAIL PROTECTED]) That sort of implies that there are Emacs keystrokes which aren't obscure. I've been using it daily for 2 years now and have yet to discover any key sequence which makes any sense. (Paul Thomas) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [pgsql-hackers-win32] UNICODE/UTF-8 on win32
Peter Eisentraut wrote: o Disallow encodings like UTF8 which PostgreSQL supports but the operating system does not (already disallowed by pginstaller) I think the warning that initdb shouts out is already enough for this. I don't think we want to disallow this for people who know what they are doing. I have updated the Win32 TODO item: o Add support for Unicode To fix this, the data needs to be converted to/from UTF16/UTF8 so the Win32 wcscoll() can be used, and perhaps other functions like towupper(). However, UTF8 already works with normal locales but provides no ordering or character set classes. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] signed short fd
Alvaro Herrera wrote: On Mon, Mar 14, 2005 at 10:45:51PM -0500, Bruce Momjian wrote: Mark Woodward wrote: Bruce, I did want to meet you to a greater extent, but you we surrounded by people and looked quite busy. Yea, I was just teasing. It was a very busy conference. I remember at night just wanting to turn myself off. Were you able to? That'd make a very cool trick. No, but I have wished to have that switch on my children sometimes. :-) -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] invalidating cached plans
Neil Conway wrote: - it is the responsibility of the call site managing the prepared plan to check whether a previously prepared plan is invalid or not -- and to take the necessary steps to replan it when needed. Does this mean that clients that use PREPARE/Parse need to handle plan invalidated as a possible response to EXECUTE/Bind, or will the backend keep the query string / parse tree around and replan on next execution? -O ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] invalidating cached plans
Oliver Jowett wrote: Does this mean that clients that use PREPARE/Parse need to handle plan invalidated as a possible response to EXECUTE/Bind, or will the backend keep the query string / parse tree around and replan on next execution? The latter -- the client won't be aware that replanning took place. (If your prepared queries take minutes of planning time, perhaps this is something you *would* like to be made aware of, however...) -Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] invalidating cached plans
Neil Conway [EMAIL PROTECTED] writes I don't see that this is a major problem. If a plan A invokes a function B, then changes to B will need to invalidate A; that should be pretty easy to arrange. If B is a PL/PgSQL function that invokes a function C, it will probably cache a plan involving C. But when C changes, we need only flush B's cached plan, _not_ A -- as far as A is concerned, the operation of B is a black box. This is the key point (say this is point_1) - we must make sure how deep we have to go to check validity. So if the plan of A does not rely on any result information of B, say the returned/affected row count of B, then it is ok. The only exception is when B is a SQL function that is inlined, but we can handle that separately. I don't quite understand the difference between a SQL function and a PL/PgSQL function here - since there is overlapping functionality that we could implement by SQL function or by PL/PgSQL function. Regarding performance, the important point is that a DDL command pushes changes out to backends to invalidate cached plans -- a plan doesn't need to poll to see if there have been any changes to objects it depends upon. And on a production system, DDL should usually be infrequent (the primary exception is temp table creation/destruction, but we can potentially optimize for that since it is backend-local). Yes, it is the DDL's responsibility to do invalidation, and the query should never worry about the cached plan it will use. So when a DDL command comes, it has to know all the objects it affects directly (no need to go deeper, based on point_1); then for each plan in the cache we check if it is directly (based on point_1) related to these changed objects. Regards, Qingqing ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] idea for concurrent seqscans
Added to TODO list: * Allow sequential scans to take advantage of other concurrent sequential scans, also called Synchronised Scanning --- Jeff Davis wrote: I had an idea that might improve parallel seqscans on the same relation. If you have lots of concurrent seqscans going on a large relation, the cache hit ratio is very low. But, if the seqscans are concurrent on the same relation, there may be something to gain by starting a seqscan near the page being accessed by an already-in-progress seqscan, and wrapping back around to that start location. That would make some use of the shared buffers, which would otherwise just be cache pollution. I made a proof-of-concept implementation, which is entirely in heapam.c, except for one addition to the HeapScanDesc struct in relscan.h. It is not at all up to production quality; there are things I know that need to be addressed. Basically, I just modified heapam.c to be able to start at any page in the relation. Then, every time it reads a new page, I have it mark the relation's oid and the page number in a shared mem segment. Every time a new scan is started, it reads the shared mem segment, and if the relation's oid matches, it starts the scan at the page number it found in the shared memory. Otherwise, it starts the scan at 0. There are a couple of obvious issues; one is that my whole implementation doesn't account for reverse scans at all (since initscan doesn't know what direction the scan will move in), but that shouldn't be a major problem since at worst it will be the current behavior (aside: can someone tell me how to force reverse scans so I can test that better?). Another is that there's a race condition with the shared mem, and that's out of pure laziness on my part. This method is really only effective at all if there is a significant amount of disk i/o. If it's pulling the data from O/S buffers the various scans will diverge too much and not be using each other's shared buffers. 
I tested with shared_buffers=500 and all stats on. I used 60 threads performing 30 seqscans each in my script ssf.rb (I refer to my modification as sequential scan follower or ssf). Here are some results with my modifications:

$ time ./ssf.rb # my script
real    4m22.476s
user    0m0.389s
sys     0m0.186s

test=# select relpages from pg_class where relname='test_ssf';
 relpages
----------
     1667
(1 row)

test=# select count(*) from test_ssf;
 count
-------
    20
(1 row)

test=# select pg_stat_get_blocks_hit(17232) as hit, pg_stat_get_blocks_fetched(17232) as total;
  hit   |  total
--------+---------
 971503 | 3353963
(1 row)

Or, approx. 29% cache hit. Here are the results without my modifications:

test=# select relpages from pg_class where relname='test_ssf';
 relpages
----------
     1667
(1 row)

test=# select count(*) from test_ssf;
 count
-------
    20
(1 row)

test=# select pg_stat_get_blocks_hit(17231) as hit, pg_stat_get_blocks_fetched(17231) as total;
 hit |  total
-----+---------
  19 | 3353963
(1 row)

Or, approx. 6% cache hit. Note: the oid is different, because I have two separately initdb'd data directories, one for the modified version, one for the unmodified 8.0.0. This is the first time I've really modified the PG source code to do anything that looked promising, so this is more of a question than anything else. Is it promising? Is this a potentially good approach? I'm happy to post more test data and more documentation, and I'd also be happy to bring the code to production quality. However, before I spend too much more time on that, I'd like to get a general response from a 3rd party to let me know if I'm off base. Regards, Jeff Davis [ Attachment, skipping... ] [ Attachment, skipping... ] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? 
http://archives.postgresql.org
Re: [HACKERS] invalidating cached plans
Tom Lane wrote: I hadn't really gotten as far as working out a reasonable API for the module. The $64 question seems to be what is the input: a textual query string, a raw parse analysis tree, or what? It should be easy enough to accept either, and then convert from the query string into a raw parse tree. The storage of the parse tree should probably be owned by the cache module, so that might introduce some slight complications -- like exposing a MemoryContext for callers to allocate inside, but it should be doable anyway. And what sort of key does the caller want to use to re-find a previously cached plan? Do we want to share plans between call sites? If so, an API like this comes to mind:

    struct CachedPlan
    {
        List *query_list;
        List *plan_list;
        char *query_str;
        int   nargs;
        Oid  *argtypes;
        int   refcnt;
        /* various other info -- perhaps memory context? */
    };

    struct CachedPlan *cache_get_plan(const char *query_str, int nargs, Oid *argtypes);
    void cache_destroy_plan(struct CachedPlan *plan);

Where cache_get_plan() would look up the query string in a hash table (mapping strings to CachedPlans). If found, it would check if the plan had been invalidated, and would replan it if necessary, then bump its reference count and return it. If not found, it would create a new CachedPlan, parse, rewrite and plan the query string, and return it. This would mean that within a backend we could share planning for queries that happened to be byte-for-byte identical. - it would be nice to do the hash lookup on the result of raw_parser() rather than the query string itself, since we would be able to share more plans that way. Not sure if that's worth doing, though. - how do we manage storage? The reference counting above is off-the-cuff. Perhaps there's a better way to do this... 
(Of course, if a plan has refcnt 0, we can still remove it from memory if needed, since any call site should provide sufficient information to reconstruct it.) This would also make it somewhat more plausible to share the query cache among backends, but I'm not interested in pursuing that right now. (BTW, another thing to consider is how the rewriter will affect a plan's dependencies: I think we should probably invalidate a plan when a modification is made to a view or rule that affected the plan. This should also be doable, though: we could either modify the rewriter to report these dependencies, or trawl the system catalogs looking for rules that apply to any of the relations in the query. The latter method would result in spurious invalidations, in the case of rules with a WHERE clause.) -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] invalidating cached plans
Neil Conway [EMAIL PROTECTED] writes: (BTW, another thing to consider is how the rewriter will affect a plan's dependencies: I think we should probably invalidate a plan when a modification is made to a view or rule that affected the plan. This issue goes away as long as you follow the rule that any change to a table's schema invalidates all plans that mention the table. Views and tables that have rules will still be mentioned in the rangetable of the resulting plan, even if they aren't part of the active plan. (We use that for access rights checking.) Too tired to consider the other details at the moment... regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] invalidating cached plans
Neil Conway [EMAIL PROTECTED] writes: Oliver Jowett wrote: Does this mean that clients that use PREPARE/Parse need to handle plan invalidated as a possible response to EXECUTE/Bind, or will the backend keep the query string / parse tree around and replan on next execution? The latter -- the client won't be aware that replanning took place. It seems possible that replanning would fail for some reason, in which case the EXECUTE would get an error of a kind you maybe weren't expecting during EXECUTE. Other than that it seems it should be transparent. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] invalidating cached plans
Qingqing Zhou wrote: I don't quite understand the difference between a SQL function and a PL/PgSQL function here - since there is overlapping functionality that we could implement by SQL function or by PL/PgSQL function. The difference is between an inlined function (which is integrated directly into the plan of the query that invokes it) and a function that is not inlined. Only SQL functions can be inlined, and only some SQL functions at that. With an out-of-line function, we just invoke the function via the fmgr infrastructure -- if it chooses to create any plans (e.g. via SPI), that is its own business, and they would be treated as distinct plans by the cache module. -Neil ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] invalidating cached plans
Qingqing Zhou [EMAIL PROTECTED] writes: This is the key point (say this is point_1) - we must make sure how deep we have to go to check validity. The recorded plan has to somehow mention all the inlined functions that were expanded out of existence. There might be several levels of such things, but as long as we remember them all for invalidation purposes, I don't see a problem. A related example: an operator might point to an inline-able function. Neither the operator nor the function will appear explicitly in the finished plan tree, but they'd better both be listed in the side list of invalidation dependencies. I don't quite understand the difference between a SQL function and a PL/PgSQL function here - The planner doesn't know anything about inlining plpgsql functions. So while the function might have its own invalidation issues to deal with internally, a plan that calls it cannot need invalidation because of that. Obviously these issues depend a lot on the internal behavior of the planner, so we are going to have to fix the planner to record the identity of every object that it looks at without explicitly mentioning it in the final plan. No other part of the system can be expected to track all that. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] PQexecParams
Will PQexecParams automatically escape bytea data as it goes in, or must one run it through PQescapeBytea first? Chris ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] PHP stuff
I'm currently adding support for the v3 protocol in PHP pgsql extension. I'm wondering if anyone minds if I lift documentation wholesale from the PostgreSQL docs for the PHP docs for these functions. For instance, the fieldcodes allowed for PQresultErrorField, docs on PQtransactionStatus, etc. I don't really see the point of me 'rephrasing' it all :) Cheers, Chris ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq