Re: [HACKERS] Different length lines in COPY CSV
Tom Lane said: Bruce Momjian pgman@candle.pha.pa.us writes: Christopher Kings-Lynne wrote: Is there any way to force COPY to accept that there will be lines of different length in a data file? I suppose we could have a TRAILINGNULL flag to COPY but because few ask for this feature, it hardly seems worth it. There is no chance that we'll ever be able to cope with every insane file format that some benighted program claims is CSV. The harder we try, the more we will lose the ability to detect data errors at all; not to mention the likely negative consequences for the readability and performance of the COPY code. I think "fix it with a perl script" is a very reasonable answer for cases like this one. I agree. The COPY code is probably on the edge of maintainability now. Our CSV routines accept a wide variety of import formats, but a fixed number of columns is required. Maybe we need a pgfoundry project with some general perl CSV munging utilities - this issue comes up often enough. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
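The "fix it with a perl script" approach amounts to padding short rows out to the table's column count before handing the file to COPY. A minimal sketch of the same idea in Python (the column count and input data here are made up for illustration; empty strings are COPY's default NULL in CSV mode):

```python
import csv
import io

NCOLS = 5  # assumed column count of the target table

def pad_rows(rows, ncols):
    # Pad each short row with empty fields, which COPY ... WITH CSV
    # reads as NULL by default - the "TRAILINGNULL" behavior.
    for row in rows:
        yield row + [""] * (ncols - len(row))

ragged = "a,b\n1,2,3,4,5\n"  # rows of differing length
reader = csv.reader(io.StringIO(ragged))
out = io.StringIO()
csv.writer(out).writerows(pad_rows(reader, NCOLS))
```

A real script would read stdin and write stdout so it can sit in a pipe in front of psql.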
Re: [HACKERS] psql patch: new host/port
On Dec 9, 2005, at 18:10 , David Fetter wrote: Please find enclosed a patch that lets you use \c to connect (optionally) to a new host and port without exiting psql. I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? Michael Glaesemann grzm myrealbox com
[HACKERS] pg_relation_size locking
Until recently, pg_relation_size used SearchSysCache to locate the relation to examine, and calculated the file location from that information. Starting with dbsize.c V1.5 (committed after Beta2), relation_open(.., AccessShareLock) is used. This is very unfortunate because it no longer allows observing a table growing while it is populated, e.g. with a lengthy COPY; pg_relation_size will be blocked. After reverting to 1.4, everything was fine again. Can we have this reverted/fixed? Regards, Andreas
Re: [HACKERS] psql patch: new host/port
Michael Glaesemann wrote: I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? You don't need to be familiar with the psql code to know that it would be pretty stupid if client programs could override the server authentication setup. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] psql patch: new host/port
On Dec 12, 2005, at 20:33 , Peter Eisentraut wrote: Michael Glaesemann wrote: I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? You don't need to be familiar with the psql code to know that it would be pretty stupid if client programs could override the server authentication setup. I'm sorry if I wasn't clear. My point was I'm not familiar enough with the code to see if this implementation is secure. I do indeed realize that clients bypassing server authentication is a Bad Thing. Michael Glaesemann grzm myrealbox com
Re: [HACKERS] psql patch: new host/port
Michael Glaesemann said: On Dec 12, 2005, at 20:33 , Peter Eisentraut wrote: Michael Glaesemann wrote: I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? You don't need to be familiar with the psql code to know that it would be pretty stupid if client programs could override the server authentication setup. I'm sorry if I wasn't clear. My point was I'm not familiar enough with the code to see if this implementation is secure. I do indeed realize that clients bypassing server authentication is a Bad Thing. The patch is to the client only, not even to libpq, so of course no auth bypass is involved. cheers andrew
Re: [HACKERS] psql patch: new host/port
On Dec 12, 2005, at 21:32 , Andrew Dunstan wrote: The patch is to the client only, not even to libpq, so of course no auth bypass is involved. Cool. Thanks for the explanation, Andrew. Michael Glaesemann grzm myrealbox com
Re: [HACKERS] Please Help: PostgreSQL Query Optimizer
Default values of various parameters in PostgreSQL:
#random_page_cost = 4 # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01 # (same)
#cpu_index_tuple_cost = 0.001 # (same)
#cpu_operator_cost = 0.0025 # (same)
#effective_cache_size = 1000 # typically 8KB each
Since sequential access is not significantly faster than random access in an MMDB, random_page_cost will be approximately the same as the sequential page fetch cost. If we set both the sequential page fetch cost and random_page_cost to 1, then we need to increase the various cpu_* parameters by multiplying the default values by an appropriate scaling factor. Now, we need to determine this scaling factor. Through googling, I found that a normal disk has an external data transfer rate of around 40MBps, whereas main memory has a data transfer rate ranging from 1.6GBps to 2.8GBps. As we can see, the ratio between disk and main memory data transfer rates is around 50. Then, if we multiply all cpu_* parameters by 50, the resulting values will be: random_page_cost = 1; cpu_tuple_cost = 0.5; cpu_index_tuple_cost = 0.05; cpu_operator_cost = 0.0125; Would it be a suitable approach? We request all of you to give comments/suggestions on these calculations. Thank you. On Sun, 11 Dec 2005, Tom Lane wrote: [ trimming cc list to something sane ] Anjan Kumar. A. [EMAIL PROTECTED] writes: In a Main Memory DataBase (MMDB) the entire database on disk is loaded into main memory during initial startup of the system. Thereafter all references are made to the database in main memory. When the system is going to shut down, we write the database in main memory back to disk. Here, for the sake of recovery, we write log records to disk during transaction execution. Don't you get 99.9% of this for free with Postgres' normal behavior? Just increase shared_buffers.
Can anyone tell me the modifications that need to be incorporated into PostgreSQL, so that it considers only processing costs during optimization of the query. Assuming that a page fetch costs zero is wrong even in an all-in-memory environment. So I don't see any reason you can't maintain the convention that a page fetch costs 1.0 unit, and just adjust the other cost parameters in the light of a different idea about what that actually means. Will it be sufficient, if we change the default values of the above parameters in src/include/optimizer/cost.h and src/backend/utils/misc/postgresql.conf.sample as follows: random_page_cost = 4; cpu_tuple_cost = 2; cpu_index_tuple_cost = 0.2; cpu_operator_cost = 0.05; You'd want random_page_cost = 1 since there is presumably no penalty for random access in this context. Also, I think you'd want cpu_operator_cost a lot higher than that (maybe you dropped a decimal place? You scaled the others up by 200 but this one only by 20). It's entirely possible that the ratios of the cpu_xxx_cost values aren't very good and will need work. In the past we've never had occasion to study them very carefully, since they were only marginal contributions anyway. regards, tom lane -- Regards. Anjan Kumar A. MTech2, Comp Sci., www.cse.iitb.ac.in/~anjankumar __ A woman physician has made the statement that smoking is neither physically defective nor morally degrading, and that nicotine, even when indulged to in excess, is less harmful than excessive petting. -- Purdue Exponent, Jan 16, 1925
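As a quick sanity check on the scaling arithmetic discussed above (the factor of 50, and Tom's point about dropped decimal places), a short sketch; the default values are the stock postgresql.conf settings, the uniform scale factor is the thread's assumption:

```python
# Default planner cost parameters, in units of one sequential page fetch
defaults = {
    "cpu_tuple_cost": 0.01,
    "cpu_index_tuple_cost": 0.001,
    "cpu_operator_cost": 0.0025,
}

SCALE = 50  # assumed disk-vs-memory transfer-rate ratio from the thread

# With random_page_cost pinned at 1, scale every cpu_* cost uniformly.
scaled = {name: value * SCALE for name, value in defaults.items()}

# Note: a uniform factor of 50 gives cpu_operator_cost = 0.125, so the
# 0.0125 proposed earlier in the thread looks like a dropped decimal place.
```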
Re: [HACKERS] Log of CREATE USER statement
Dear friends, I thank you for all the replies. If you permit, I'd like to present my modest view of the problem. I agree with Tom when he says: ... if the user wishes the password to be secure, he needs to encrypt it on the client side. Anything else is just the illusion of security. and with Bruce: ... I see no way to secure this really since the administrator typically has control over the database installation. There is no 100% secure system. So, I'm working on a framework to audit all operations over the database. Traceability is the only tool to identify the actions of an untrustworthy DBA. In this context, the identity of the user must be protected, and it's obvious that protecting the user password is extremely important to prevent someone from logging in as another user. From there came the concern with the recording of the password in plaintext in the archives and log files. I had not thought about the history and the activity display. That's yet another vulnerability... I cannot see another solution that does not add overhead in the logging code. The idea of providing a backslash command in psql is very good. But what about pgAdmin, phpPgAdmin and other management tools? I think that these tools, through their ease of use, are important in spreading the use of PostgreSQL. I know that I did not contribute new facts to the discussion. I would like only to underline its importance and, one more time, to thank everyone for their attention. Best regards, Ricardo Vaz
Re: [HACKERS] Log of CREATE USER statement
On 12/9/05, Peter Eisentraut [EMAIL PROTECTED] wrote: Tom Lane wrote: As I said already, if the user wishes the password to be secure, he needs to encrypt it on the client side. Maybe we should provide a backslash command in psql for secure password entry, say, \password [username]. This would then ask for the password through a somewhat secure, unlogged channel, encrypt it, and send an ALTER ROLE command to the server. Letting createuser.c hash the password would be the biggest win. -- marko
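The client-side encryption under discussion is PostgreSQL's MD5 scheme: the stored value is the literal string "md5" followed by md5(password concatenated with the role name). A sketch of what a \password-style command would compute before sending ALTER ROLE (the username and password here are made up):

```python
import hashlib

def pg_md5_encrypt(password, username):
    # PostgreSQL's md5 password format: "md5" + md5(password || rolename)
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest

# The client would then issue, over an unlogged channel, something like:
#   ALTER ROLE alice PASSWORD 'md5<digest>';
stmt = "ALTER ROLE %s PASSWORD '%s';" % ("alice", pg_md5_encrypt("s3cret", "alice"))
```

Since only the hash crosses the wire and reaches the statement log, the plaintext never appears in server logs or psql history.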
[HACKERS] default resource limits
The discussion was a bit inconclusive last time, so I would like to renew my proposal for more generous default resource limits. Nearly everyone seems to agree that the default for max_fsm_pages is woefully low, so I would like to have the default for this set unconditionally to 200,000 rather than 20,000. The cost would be just over 1MB of shared memory, if the docs are correct. Alternatively, we could put this into the mix that is calculated by initdb, scaling it linearly with shared_buffers (but with the default still at 200,000). I would also like to propose a more modest increase in max_connections and shared_buffers, by a factor of 3. This number is picked with some caution, and is intended to ensure that a) we don't eat all the resources on a modestly resourced machine, and b) we can accommodate 1 connection per apache child in a typical (unix) apache configuration. (If we were slightly less conservative, we'd at least take default max_connections to 400, which is the maximum number of apache threads on Windows in the default configuration.) Since the time when these defaults were set, the price of memory has plummeted, and the typical memory population of an entry-level machine has roughly quadrupled, from my observation, so it seems to me that by revising these limits we are just adjusting to the new reality. cheers andrew
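The "just over 1MB" figure checks out if each FSM page slot costs about six bytes of shared memory; the per-slot size here is an assumption based on the documentation's rule of thumb, not a number from this thread:

```python
BYTES_PER_FSM_PAGE = 6  # assumed shared-memory cost per max_fsm_pages slot

def fsm_shared_mem_mb(max_fsm_pages):
    # Shared memory consumed by the free space map, in megabytes
    return max_fsm_pages * BYTES_PER_FSM_PAGE / (1024.0 * 1024.0)

old_cost = fsm_shared_mem_mb(20000)   # current default: ~0.11 MB
new_cost = fsm_shared_mem_mb(200000)  # proposed default: ~1.14 MB
```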
Re: [HACKERS] Backslashes in string literals
On Sat, Dec 10, 2005 at 8:01 pm, in message [EMAIL PROTECTED], Bruce Momjian pgman@candle.pha.pa.us wrote: Kevin Grittner wrote: Since the non-standard behavior is in the lexer, I couldn't see any reasonable way to base it on a runtime switch. I'm curious what is intended here. Can anyone give a one-paragraph explanation of how this configuration option will work? Have you read our documentation? http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-SYNTAX-CONSTANTS http://www.postgresql.org/docs/8.1/static/runtime-config-compatible.html#RUNTIME-CONFIG-COMPATIBLE-VERSION Yes. Between those and the release notes, I don't know what additional information you want. In the future you will set standard_conforming_strings to on and backslashes will be treated literally. Perhaps my language was ambiguous. I'm not curious about the intended behavior from a user perspective, but what I might have missed in the source code which would have allowed me to write my patch to better comply with the documentation you cited. Since the problem is in the lexer, the only way I could see to implement it as a run-time configuration option, rather than a compile-time option, would be to duplicate the lexer and maintain two sets of rules in parallel. I generally try to avoid maintaining two parallel copies of code. I'm curious whether I missed some other programming approach. -Kevin
Re: [HACKERS] psql patch: new host/port
On Mon, Dec 12, 2005 at 09:20:57PM +0900, Michael Glaesemann wrote: On Dec 12, 2005, at 21:32 , Andrew Dunstan wrote: The patch is to the client only, not even to libpq, so of course no auth bypass is involved. Cool. Thanks for the explanation, Andrew. Is the patch suitable for a re-send to -patches? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote!
Re: [HACKERS] Different length lines in COPY CSV
Andrew Dunstan [EMAIL PROTECTED] writes: The COPY code is probably on the edge of maintainability now. Our CSV routines accept a wide variety of import formats, but a fixed number of columns is required. Maybe we need a pgfoundry project with some general perl CSV munging utilities - this issue comes up often enough. What's been suggested in the past is some sort of standalone file-format-conversion utility, which could deal with this sort of stuff without having to also deal with all the backend-internal considerations that COPY must handle. So (at least in theory) it'd be simpler and more maintainable. That still seems like a good idea to me --- in fact, given my druthers I would rather have seen CSV support done in such an external program. regards, tom lane
Re: [HACKERS] default resource limits
Andrew Dunstan [EMAIL PROTECTED] writes: Nearly everyone seems to agree that the default for max_fsm_pages is woefully low, so I would like to have the default for this set unconditionally to 200,000 rather than 20,000. The cost would be just over 1MB of shared memory, if the docs are correct. Alternatively, we could put this into the mix that is calculated by initdb, scaling it linearly with shared_buffers (but with the default still at 200,000). I would also like to propose a more modest increase in max_connections and shared_buffers by a factor of 3. I don't mind having initdb try larger values to see if they work, but if you are suggesting that we try to force adoption of larger settings I'll resist it. Factor of three seems mighty weird. The existing numbers (100 and 1000) at least have the defensibility of being round. regards, tom lane
Re: [HACKERS] Backslashes in string literals
Kevin Grittner wrote: Between those and the release notes, I don't know what additional information you want. In the future you will set standard_conforming_strings to on and backslashes will be treated literally. Perhaps my language was ambiguous. I'm not curious about the intended behavior from a user perspective, but what I might have missed in the source code which would have allowed me to write my patch to better comply with the documentation you cited. Since the problem is in the lexer, the only way I could see to implement it as a run-time configuration option, rather than a compile-time option, would be to duplicate the lexer and maintain two sets of rules in parallel. I generally try to avoid maintaining two parallel copies of code. I'm curious whether I missed some other programming approach. Oh, that question. :-) We haven't looked yet at what it will require to do this in the lexer, but I think what we will eventually do is to add a more generalized filter to the lexer, and have the actions behave differently based on the boolean of whether we are using sql-standard strings. If you keep your eye on hackers or the committers messages you will see when we commit the change for 8.2. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Different length lines in COPY CSV
Tom Lane wrote: What's been suggested in the past is some sort of standalone file-format-conversion utility, which could deal with this sort of stuff without having to also deal with all the backend-internal considerations that COPY must handle. So (at least in theory) it'd be simpler and more maintainable. That still seems like a good idea to me --- in fact, given my druthers I would rather have seen CSV support done in such an external program. Why not add hooks into COPY to call the user's massage functions? That way you don't have to read and write the data, then read it again to load it into the database. Mike Pollard SUPRA Server SQL Engineering and Support Cincom Systems, Inc.
Re: [HACKERS] psql patch: new host/port
Michael Glaesemann [EMAIL PROTECTED] writes: On Dec 9, 2005, at 18:10 , David Fetter wrote: Please find enclosed a patch that lets you use \c to connect (optionally) to a new host and port without exiting psql. I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? No, security is the server's problem. What's not clear to me about this patch is what's the point. It's certainly not a feature we've heard any requests for. regards, tom lane
Re: [HACKERS] pg_relation_size locking
Andreas Pflug [EMAIL PROTECTED] writes: Until recently, pg_relation_size used SearchSysCache to locate the relation to examine, and calculated the file location from that information. Starting with dbsize.c V1.5 (committed after Beta2), relation_open(.., AccessShareLock) is used. This is very unfortunate because it no longer allows observing a table growing while it is populated, e.g. with a lengthy COPY; pg_relation_size will be blocked. Nonsense. After reverting to 1.4, everything was fine again. Can we have this reverted/fixed? Can we have the actual problem explained? regards, tom lane
Re: [HACKERS] pg_relation_size locking
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Until recently, pg_relation_size used SearchSysCache to locate the relation to examine, and calculated the file location from that information. Starting with dbsize.c V1.5 (committed after Beta2), relation_open(.., AccessShareLock) is used. This is very unfortunate because it no longer allows observing a table growing while it is populated, e.g. with a lengthy COPY; pg_relation_size will be blocked. Nonsense. Ahem. I'm running Slony against a big replication set. While slon runs COPY foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). pg_locks will show that the AccessShareLock on foo is not granted. The problem is gone with dbsize.c reverted. Regards, Andreas
Re: [HACKERS] pg_relation_size locking
Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: Nonsense. Ahem. I'm running Slony against a big replication set. While slon runs COPY foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). pg_locks will show that the AccessShareLock on foo is not granted. That's only possible if Slony is taking AccessExclusive lock; if so, your gripe is properly directed to the Slony folks, not to pg_relation_size which is acting as a good database citizen should. Certainly a plain COPY command does not take AccessExclusive. regards, tom lane
Re: [HACKERS] pg_relation_size locking
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: Nonsense. Ahem. I'm running Slony against a big replication set. While slon runs COPY foo(colnamelist) FROM STDIN, I can't execute pg_relation_size(foo_oid). pg_locks will show that the AccessShareLock on foo is not granted. That's only possible if Slony is taking AccessExclusive lock; if so, your gripe is properly directed to the Slony folks, not to pg_relation_size which is acting as a good database citizen should. More precisely, it executes TRUNCATE;COPY at the same time; there might be additional locks to prevent using the table. Still, I see no reason why pg_relation_size shouldn't continue to use SearchSysCache as it did for years now. There's no sense in using locking mechanisms on table foo while reading file system data; pg_class is sufficient to locate the table's files. Regards, Andreas
Re: [HACKERS] Different length lines in COPY CSV
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: The COPY code is probably on the edge of maintainability now. Our CSV routines accept a wide variety of import formats, but a fixed number of columns is required. Maybe we need a pgfoundry project with some general perl CSV munging utilities - this issue comes up often enough. What's been suggested in the past is some sort of standalone file-format-conversion utility, which could deal with this sort of stuff without having to also deal with all the backend-internal considerations that COPY must handle. So (at least in theory) it'd be simpler and more maintainable. That still seems like a good idea to me --- in fact, given my druthers I would rather have seen CSV support done in such an external program. We debated the reasons at the time, and I am not convinced we were wrong - huge bulk loads are a lot simpler if you don't have to call some external program to munge the data first. From time to time people thank me for things I have contributed to in PostgreSQL. The two that get the most thanks by far are CSV support and dollar quoting. Anyway, that's history now. Where would you want this file conversion utility? bin? contrib? pgfoundry? cheers andrew
Re: [HACKERS] psql patch: new host/port
On Mon, Dec 12, 2005 at 10:19:00AM -0500, Tom Lane wrote: Michael Glaesemann [EMAIL PROTECTED] writes: On Dec 9, 2005, at 18:10 , David Fetter wrote: Please find enclosed a patch that lets you use \c to connect (optionally) to a new host and port without exiting psql. I'm not familiar enough with the psql code to be able to tell, but is this secure? The pg_hba.conf on the new server is enforced, I assume? No, security is the server's problem. What's not clear to me about this patch is what's the point. It's certainly not a feature we've heard any requests for. Informally, I've heard some carping about how you can change DBs on one server, but you have to exit the program if you want to change servers. The change is low-impact because \c continues to work exactly as before when supplied with 3 arguments :) Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 Remember to vote!
Re: [HACKERS] pg_relation_size locking
Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: That's only possible if Slony is taking AccessExclusive lock; if so, your gripe is properly directed to the Slony folks, not to pg_relation_size which is acting as a good database citizen should. More precisely, it executes TRUNCATE;COPY at the same time; there might be additional locks to prevent using the table. Still, I see no reason why pg_relation_size shouldn't continue to use SearchSysCache as it did for years now. There's no sense in using locking mechanisms on table foo while reading file system data; pg_class is sufficient to locate the table's files. The fact that the contrib version did things incorrectly for years is no justification for not fixing it at the time it's taken into the core. You have to have a lock to ensure that the table even exists, let alone that you are looking at the right set of disk files. In the above example, the contrib code would have not done the right thing at all --- if I'm not mistaken, it would have kept handing back the size of the original, pre-TRUNCATE file, since the new pg_class row with the new relfilenode isn't committed yet. So it wouldn't have done what you wish anyway. regards, tom lane
Re: [HACKERS] pg_relation_size locking
Tom Lane wrote: In the above example, the contrib code would have not done the right thing at all --- if I'm not mistaken, it would have kept handing back the size of the original, pre-TRUNCATE file, since the new pg_class row with the new relfilenode isn't committed yet. So it wouldn't have done what you wish anyway. It wouldn't have worked anyway because it used the Oid to search the file, not the relfilenode. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Different length lines in COPY CSV
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: What's been suggested in the past is some sort of standalone file-format-conversion utility, Anyway, that's history now. Where would you want this file conversion utility? bin? contrib? pgfoundry? I'd say pgfoundry for starters --- there's no reason to tie it down to server release cycles. Maybe when the thing is fairly mature and doesn't need frequent releases, we could think about whether it ought to be brought into the core distro. However, it likely won't ever be a candidate to become part of core unless it's written in C, and offhand I would judge C to not be the best choice of implementation language for such a thing. This is surely going to be mostly a string-pushing type of problem, so something like perl might be a better bet. regards, tom lane
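The core of such a string-pushing conversion tool is small; a minimal shape sketched here in Python rather than perl, reading ragged CSV and emitting rows of a fixed width (in a real filter the column count would come from a command-line flag, and src/dst would be sys.stdin/sys.stdout):

```python
import csv
import io

def normalize(src, dst, ncols):
    # Truncate long rows and pad short ones with empty fields,
    # so every output row has exactly ncols columns for COPY.
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow((row + [""] * ncols)[:ncols])

# Exercising the filter on an in-memory ragged file:
out = io.StringIO()
normalize(io.StringIO("a,b,c,d\n1,2\n"), out, 3)
```

Keeping it as a standalone stdin/stdout filter is what makes it independent of backend internals and server release cycles.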
Re: [HACKERS] default resource limits
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Nearly everyone seems to agree that the default for max_fsm_pages is woefully low, so I would like to have the default for this set unconditionally to 200,000 rather than 20,000. The cost would be just over 1MB of shared memory, if the docs are correct. Alternatively, we could put this into the mix that is calculated by initdb, scaling it linearly with shared_buffers (but with the default still at 200,000). I would also like to propose a more modest increase in max_connections and shared_buffers by a factor of 3. I don't mind having initdb try larger values to see if they work, but if you are suggesting that we try to force adoption of larger settings I'll resist it. OK, works for me. The only thing I suggested might be set in stone was max_fsm_pages; I always envisioned the others being tested as now by initdb. Factor of three seems mighty weird. The existing numbers (100 and 1000) at least have the defensibility of being round. What numbers would you like? If what I suggested seems odd, how about targets of 400 connections, 4000 shared_buffers and 200,000 max_fsm_pages? cheers andrew
Re: [HACKERS] pg_relation_size locking
Alvaro Herrera wrote: The problem with the original coding was that it used the table Oid to look up the file name, which is wrong. (Test it with a table that has been clustered or an index that has been reindexed.) Um, can't test at the moment. The old code used pg_class.relfilenode, which delivers the "Name of the on-disk file of this relation" according to the docs. What's wrong with that? regards, Andreas
[HACKERS] Which qsort is used
Seems we don't link against port/qsort.c - is there any reason for that? My tests indicate our qsort is much, much faster than libc's. Regards, Qingqing
Re: [HACKERS] Which qsort is used
Qingqing Zhou wrote: Seems we don't link against port/qsort.c - is there any reason for that? My tests indicate our qsort is much, much faster than libc's. We haven't been able to determine if the OS's qsort or pgport's is faster. Right now we only force pgport qsort on Solaris (from configure.in):
# Solaris has a very slow qsort in certain cases, so we replace it.
if test $PORTNAME = solaris; then
AC_LIBOBJ(qsort)
fi
Are you willing to say that we should always prefer pgport over glibc's qsort()? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Which qsort is used
Bruce Momjian pgman@candle.pha.pa.us wrote Are you willing to say that we should always prefer pgport over glibc's qsort()? At least for Linux and Windows. My test is performed on datasets ranging from 10 to 1500 elements. Each element contains a 64-byte garbage character area and an integer key, which is uniformly distributed from 1 to RANGE. RANGE takes values from 2 to 2^31. In all cases, our qsort absolutely wins. Maybe skewed distributions should be tested? Another interesting thing is that qsort on RANGE=2 or other small numbers on Windows is terribly slow - our version does not have this problem. The test code can be found here (Note: it is mixed with some other experiments I am doing, but it might be a good starting point for constructing your own tests): http://www.cs.toronto.edu/~zhouqq/sort.c Regards, Qingqing
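The test setup described above is easy to reproduce in any language; a sketch of the data-generation scheme (payload size and key distribution as described in the message, the rest assumed; Python's built-in sort stands in for the qsort under test):

```python
import random

def make_dataset(n, key_range, payload_bytes=64):
    # Each element: an integer key uniform in [1, key_range], plus a
    # 64-byte garbage payload, mirroring the test described above.
    return [(random.randint(1, key_range), b"x" * payload_bytes)
            for _ in range(n)]

# Sorting on the key with key_range=2 exercises the duplicate-heavy
# case the message reports as pathological for the Windows qsort.
data = make_dataset(1500, 2)
data.sort(key=lambda item: item[0])
```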
Re: [HACKERS] pg_relation_size locking
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: That's only possible if Slony is taking AccessExclusive lock; if so, your gripe is properly directed to the Slony folks, not to pg_relation_size which is acting as a good database citizen should. More precisely, it executes TRUNCATE;COPY at the same time; there might be additional locks to prevent using the table. Still, I see no reason why pg_relation_size shouldn't continue to use SearchSysCache as it did for years now. There's no sense in using locking mechanisms on table foo while reading file system data; pg_class is sufficient to locate the table's files. The fact that the contrib version did things incorrectly for years is no justification for not fixing it at the time it's taken into the core. You have to have a lock to ensure that the table even exists, let alone that you are looking at the right set of disk files. This would require a lock on pg_class, not table foo, no? In the above example, the contrib code would have not done the right thing at all --- if I'm not mistaken, it would have kept handing back the size of the original, pre-TRUNCATE file, since the new pg_class row with the new relfilenode isn't committed yet. Hm, I see the issue. Interestingly enough, I *do* see the size growing. OTOH, when running BEGIN;TRUNCATE against a test table, retrieving pg_relation_size returns the previous relfilenode and size as expected. Regards, Andreas ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Which qsort is used
On Mon, 2005-12-12 at 11:50 -0500, Bruce Momjian wrote: Are you willing to say that we should always prefer pgport over glibc's qsort()? glibc's qsort is actually implemented via merge sort. I'm not sure why the glibc folks chose to do that, but as a result, it's not surprising that BSD qsort beats it for typical inputs. Whether we should go to the trouble of second-guessing glibc is a separate question, though: it would be good to see some performance figures for real-world queries. BTW, Luke Lonergan recently posted some performance results for a fairly efficient public domain implementation of qsort to the bizgres list: http://lists.pgfoundry.org/pipermail/bizgres-general/2005-December/000294.html -Neil ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Which qsort is used
On Mon, 12 Dec 2005, Neil Conway wrote: Whether we should go to the trouble of second-guessing glibc is a separate question, though: it would be good to see some performance figures for real-world queries. For qsort, due to its simple usage, I think simulation tests should be enough. But we have to consider many situations like cardinality, data distribution etc. It may not be easy to find real-world queries providing so many variations. BTW, Luke Lonergan recently posted some performance results for a fairly efficient public domain implementation of qsort to the bizgres list: http://lists.pgfoundry.org/pipermail/bizgres-general/2005-December/000294.html Ooops, more interesting than the thread itself ;-) Regards, Qingqing ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] number of loaded/unloaded COPY rows
I prepared a patch for the "Have COPY return the number of rows loaded/unloaded?" TODO item. (Sorry for disturbing the list with such a simple topic, but per a warning from Bruce Momjian, I am sending my proposal to -hackers first.) I used the same method of appending the related information to the commandTag that is used for the INSERT/UPDATE/DELETE/FETCH commands. Furthermore, I edited libpq to make PQcmdTuples() interpret the affected row count from the cmdStatus value for the COPY command. (The changes don't cause any API compatibility problems and seem to work with triggers too.) One of the problems with this approach is that the number of processed rows is kept in a uint32 variable. This imposes an internal limit on the COPY count, when we think COPY can process billions of rows. I couldn't find a solution for this. (Maybe two uint32s could be used to store the row count.) But the other processed-row counters (like INSERT/UPDATE) use uint32 too. What are your suggestions and comments? Regards.

Index: src/backend/commands/copy.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.255
diff -u -r1.255 copy.c
--- src/backend/commands/copy.c 22 Nov 2005 18:17:08 - 1.255
+++ src/backend/commands/copy.c 12 Dec 2005 17:18:44 -
@@ -102,6 +102,7 @@
 	int		client_encoding;	/* remote side's character encoding */
 	bool	need_transcoding;	/* client encoding diff from server? */
 	bool	client_only_encoding;	/* encoding not valid on server? */
+	uint32	processed;		/* # of tuples processed */
 
 	/* parameters from the COPY command */
 	Relation	rel;		/* relation to copy to or from */
@@ -646,7 +647,7 @@
  * Do not allow the copy if user doesn't have proper permission to access
  * the table.
  */
-void
+uint32
 DoCopy(const CopyStmt *stmt)
 {
 	CopyState	cstate;
@@ -660,6 +661,7 @@
 	AclMode		required_access = (is_from ? ACL_INSERT : ACL_SELECT);
 	AclResult	aclresult;
 	ListCell   *option;
+	uint32		processed;
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyStateData *) palloc0(sizeof(CopyStateData));
@@ -935,7 +937,7 @@
 	initStringInfo(&cstate->line_buf);
 	cstate->line_buf_converted = false;
 	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-	cstate->raw_buf_index = cstate->raw_buf_len = 0;
+	cstate->raw_buf_index = cstate->raw_buf_len = cstate->processed = 0;
 
 	/* Set up encoding conversion info */
 	cstate->client_encoding = pg_get_client_encoding();
@@ -1080,7 +1082,10 @@
 	pfree(cstate->attribute_buf.data);
 	pfree(cstate->line_buf.data);
 	pfree(cstate->raw_buf);
+
+	processed = cstate->processed;
 	pfree(cstate);
+	return processed;
 }
@@ -1310,6 +1315,8 @@
 					VARSIZE(outputbytes) - VARHDRSZ);
 			}
 		}
+
+		cstate->processed++;
 	}
 	CopySendEndOfRow(cstate);
@@ -1916,6 +1923,8 @@
 		/* AFTER ROW INSERT Triggers */
 		ExecARInsertTriggers(estate, resultRelInfo, tuple);
+
+		cstate->processed++;
 	}
 }
Index: src/backend/tcop/utility.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/utility.c,v
retrieving revision 1.250
diff -u -r1.250 utility.c
--- src/backend/tcop/utility.c 29 Nov 2005 01:25:49 - 1.250
+++ src/backend/tcop/utility.c 12 Dec 2005 17:18:45 -
@@ -640,7 +640,12 @@
 			break;
 
 		case T_CopyStmt:
-			DoCopy((CopyStmt *) parsetree);
+			{
+				uint32	processed = DoCopy((CopyStmt *) parsetree);
+
+				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
+						 "COPY %u", processed);
+			}
 			break;
 
 		case T_PrepareStmt:
Index: src/include/commands/copy.h
===
RCS file: /projects/cvsroot/pgsql/src/include/commands/copy.h,v
retrieving revision 1.25
diff -u -r1.25 copy.h
--- src/include/commands/copy.h 31 Dec 2004 22:03:28 - 1.25
+++ src/include/commands/copy.h 12 Dec 2005 17:19:07 -
@@ -17,6 +17,6 @@
 
 #include "nodes/parsenodes.h"
 
-extern void DoCopy(const CopyStmt *stmt);
+extern uint32 DoCopy(const CopyStmt *stmt);
 
 #endif   /* COPY_H */
Index: src/interfaces/libpq/fe-exec.c
===
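The uint32 wraparound concern raised above is easy to demonstrate in isolation. The helpers below are mine, not part of the patch; they just contrast a 32-bit counter with a 64-bit one. With the patch as posted, a COPY of 2^32 + 5 rows would report a count of 5.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical illustration of the counter-width issue: DoCopy in the
 * patch increments a uint32 once per tuple, which wraps modulo 2^32. */
static uint32_t
count_rows_32(uint64_t rows_processed)
{
    return (uint32_t) rows_processed;   /* truncates: 2^32 + 5 -> 5 */
}

static uint64_t
count_rows_64(uint64_t rows_processed)
{
    return rows_processed;              /* room for ~1.8e19 rows */
}
```

A pair of uint32s, as suggested above, is equivalent to the 64-bit counter; the question is only whether the extra width is worth it when INSERT/UPDATE share the same limit.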
[HACKERS] space for optimalization: DISTINCT without index
Hello, I did some tests and I can see that DISTINCT works well on indexed columns, but is slow on derived tables without indexes. If I use GROUP BY instead of DISTINCT, I get much better times:

SELECT DISTINCT a, b FROM tab;
SELECT a, b FROM tab GROUP BY a, b;

Can you explain it? Thank you, Pavel Stehule ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
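The usual explanation is that SELECT DISTINCT is planned only as sort-plus-unique, while GROUP BY can also be planned as a hash aggregate, which skips the sort entirely when the number of groups is small. A rough, self-contained sketch of the two strategies over plain ints (this is not PostgreSQL code; the hash function and linear probing are arbitrary choices of mine):

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* Sort-based dedup: what a sort+unique DISTINCT plan does.  O(n log n). */
static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;

    return (x > y) - (x < y);
}

static size_t
distinct_by_sort(const int *in, size_t n)
{
    if (n == 0)
        return 0;
    int *v = malloc(n * sizeof(int));
    memcpy(v, in, n * sizeof(int));
    qsort(v, n, sizeof(int), cmp_int);

    size_t count = 1;
    for (size_t i = 1; i < n; i++)
        if (v[i] != v[i - 1])
            count++;
    free(v);
    return count;
}

/* Hash-based dedup: what a HashAggregate GROUP BY plan does.  O(n)
 * expected; no sort, so it wins when there are few distinct groups. */
static size_t
distinct_by_hash(const int *in, size_t n)
{
    size_t nbuckets = 1;
    while (nbuckets < 2 * n + 1)
        nbuckets <<= 1;

    int    *slot = malloc(nbuckets * sizeof(int));
    char   *used = calloc(nbuckets, 1);
    size_t  count = 0;

    for (size_t i = 0; i < n; i++)
    {
        size_t h = ((unsigned) in[i] * 2654435761u) & (nbuckets - 1);

        while (used[h] && slot[h] != in[i])
            h = (h + 1) & (nbuckets - 1);   /* linear probing */
        if (!used[h])
        {
            used[h] = 1;
            slot[h] = in[i];
            count++;
        }
    }
    free(slot);
    free(used);
    return count;
}
```

Both produce the same distinct count; only the work done differs, which is why the GROUP BY form can be much faster on an unindexed derived table.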
Re: [HACKERS] Foreign key trigger timing bug?
On 12/9/2005 8:27 PM, Stephan Szabo wrote: On Fri, 9 Dec 2005, Jan Wieck wrote: On 12/8/2005 8:53 PM, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Yeah. I really don't understand it, but it appears to me to be explicitly different in the spec for on delete cascade even compared to the rest of the referential actions. One problem I see is, what do we do if the BEFORE trigger then returns NULL (to skip the delete). The cascaded operations are already done. Do we have to execute the cascaded deletes in a subtransaction or do we disallow the skip in this case? I think we'd have to disallow skipping. Especially since skipping would probably end up with a violated constraint. That seems to me to be a sufficient reason to not follow the spec in this respect. A BEFORE trigger should be run BEFORE anything happens, full stop. I can't think of any good reason why the spec's semantics are better. (It's not like our triggers are exactly spec-compatible anyway.) It doesn't lead to a violated constraint. bar references foo on delete cascade, now delete from foo will first delete from bar, then the before trigger on foo skips the delete. That's not the right case I think. Pseudo example:

create table a
create table b references a on delete cascade
create before trigger on b that sometimes skips a delete to b
insert into a and b.

delete from a -- AFAICS, you can end up with a row in b that no longer has its associated row in a since the a row will be deleted but there's no guarantee its referencing rows in b will have successfully been deleted. Yes, you can deliberately break referential integrity with that. But you know what? I think the overall waste of performance and developer time, required to fix this rather exotic (and idiotic) problem, is too high to seriously consider it. Jan And besides, as the other post (Trigger preventing delete causes circumvention of FK) in GENERAL shows, triggers can break RI anyway.
Yeah, although fixing the cases where the trigger interacted badly with before triggers was the point of the posts that started this. The original problem was with a case where it acted differently, but it's fairly related. ---(end of broadcast)--- TIP 6: explain analyze is your friend -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_relation_size locking
Andreas Pflug wrote: Until recently, pg_relation_size used SearchSysCache to locate the relation to examine, and calculated the file location from that information. Starting with dbsize.c V1.5 (committed after Beta2), relation_open(.., AccessShareLock) is used. This is very unfortunate because it will not allow to observe a table growing while it is populated, e.g. with a lengthy COPY; pg_relation_size will be blocked. After reverting to 1.4, everything was fine again. The diff: http://projects.commandprompt.com/projects/public/pgsql/changeset/23120 The problem with the original coding was that it used the table Oid to look up the file name, which is wrong. (Test it with a table that has been clustered or an index that has been reindexed.) We could use a SysCache on filenode, if there was one. Unfortunately I don't think we have it. Can we have this reverted/fixed? If you can see a way without reintroducing the old bugs, let me know. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing relation locking overhead
Hannu Krosing wrote: On one fine day, Sat, 2005-12-10 at 21:07, Tom Lane wrote: In any case the design idea here seems to be we don't care how long REINDEX takes as long as it's not blocking anyone. Yes, that's the general idea. Within reason of course; for example, making a seqscan over the index for each and every tuple inserted during building the first index would probably still be too slow :) You don't need to seqscan the _index_. You need to scan the table. Those tuples that do not satisfy the snapshot, or where you are in doubt, you examine the index to see whether they are there. The bulk of it you just skip. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_relation_size locking
Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: You have to have a lock to ensure that the table even exists, let alone that you are looking at the right set of disk files. This would require a lock on pg_class, not table foo, no? No, the convention is that you take a lock on the relation you're interested in. The fact that some of the information you care about is in pg_class is incidental. There is actually stuff going on behind the scenes to make sure that you get up-to-date info when you do LockRelation; looking at the pg_class row does *not* by itself guarantee that. That is, when you SearchSysCache you might get a row that was good at the start of your transaction but no longer is; relation_open with a lock guarantees that you get a relation descriptor that is currently correct. Hm, I see the issue. Interesting enough, I *do* see the size growing. OTOH, when running BEGIN;TRUNCATE against a test table and retrieving pg_relation_size returns the previous relfilenode and size as expected. That's a bit curious. If they just did TRUNCATE then COPY, the commit of the TRUNCATE should have released the lock. If the TRUNCATE wasn't committed yet, then how are you able to pick up the correct relfilenode to look at? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Foreign key trigger timing bug?
On 12/9/2005 8:27 PM, Stephan Szabo wrote: On Fri, 9 Dec 2005, Jan Wieck wrote: On 12/8/2005 8:53 PM, Tom Lane wrote: Stephan Szabo [EMAIL PROTECTED] writes: Yeah. I really don't understand it, but it appears to me to be explicitly different in the spec for on delete cascade even compared to the rest of the referential actions. One problem I see is, what do we do if the BEFORE trigger then returns NULL (to skip the delete). The cascaded operations are already done. Do we have to execute the cascaded deletes in a subtransaction or do we disallow the skip in this case? I think we'd have to disallow skipping. Especially since skipping would probably end up with a violated constraint. That seems to me to be a sufficient reason to not follow the spec in this respect. A BEFORE trigger should be run BEFORE anything happens, full stop. I can't think of any good reason why the spec's semantics are better. (It's not like our triggers are exactly spec-compatible anyway.) It doesn't lead to a violated constraint. bar references foo on delete cascade, now delete from foo will first delete from bar, then the before trigger on foo skips the delete. That's not the right case I think. Pseudo example:

create table a
create table b references a on delete cascade
create before trigger on b that sometimes skips a delete to b
insert into a and b.

delete from a -- AFAICS, you can end up with a row in b that no longer has its associated row in a since the a row will be deleted but there's no guarantee its referencing rows in b will have successfully been deleted. Yes, you can deliberately break referential integrity with that. But you know what? I think the overall waste of performance and developer time, required to fix this rather exotic (and idiotic) problem, is too high to seriously consider it. Well, the case that brought up the original question was one where the before trigger updated rows that were going to be affected by the cascaded delete.
Before this worked by accident, now it gives an error (even though the key wasn't changed due to some other possibilities of violation forcing the check). The problem is that if we're not consistent about what violation cases are acceptable, it's hard to diagnose if something is an actual bug or merely an acceptable side effect. :) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Different length lines in COPY CSV
On Mon, Dec 12, 2005 at 10:15:03AM -0500, Pollard, Mike wrote: Tom Lane wrote: What's been suggested in the past is some sort of standalone file-format-conversion utility, which could deal with this sort of stuff without having to also deal with all the backend-internal considerations that COPY must handle. So (at least in theory) it'd be simpler and more maintainable. That still seems like a good idea to me --- in fact, given my druthers I would rather have seen CSV support done in such an external program. Why not add hooks into COPY to call the user's massage functions? That way you don't have to read and write the data, then read it again to load it into the database. Well, it does make you wonder about supporting something like (perl style): \copy foo FROM 'myfilter dodgy-data.csv |' or maybe \copy foo from pipe 'myfilter dodgy-data.csv' or possibly \pipe foo from dodgy-data.csv using autodetecting-format-filter.pl Which would cause psql to fork/exec the filter and pass the output data to the server. We could then provide all sorts of parsers for format-of-the-week. This would probably include the converse: \pipe foo to table.xls using make-excel-spreadsheet.pl Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] Different length lines in COPY CSV
I too have to deal with this issue daily since I deal with Windows daily. It sounds like the csv files were originally Excel files. I have never been able to figure out how Excel decides to quit putting null values in empty columns after X rows and resume again later on. If the file has fewer than 65000 rows I would suggest using OpenOffice 2.0 instead of Excel. OpenOffice does not stop filling the empty columns, and with 2.0 it now supports the same maximum number of rows that Excel does. I use Perl constantly to reformat files and import them as csv using the COPY command. I think the original poster would prefer a PHP solution though... While it is not a problem for me, I do have other less technical users who don't know Perl, and this makes Postgres much more difficult for them to use. Most of them come from an M$ Access background, which can handle importing Excel files directly, and thus they don't have to deal with this issue. A file conversion utility would be very helpful for supporting Postgres on Windows, especially if it could handle Excel files in their native format. Mike On Mon, Dec 12, 2005 at 07:58:52PM +0100, Martijn van Oosterhout wrote: On Mon, Dec 12, 2005 at 10:15:03AM -0500, Pollard, Mike wrote: Tom Lane wrote: What's been suggested in the past is some sort of standalone file-format-conversion utility, which could deal with this sort of stuff without having to also deal with all the backend-internal considerations that COPY must handle. So (at least in theory) it'd be simpler and more maintainable. That still seems like a good idea to me --- in fact, given my druthers I would rather have seen CSV support done in such an external program. Why not add hooks into COPY to call the user's massage functions? That way you don't have to read and write the data, then read it again to load it into the database.
Well, it does make you wonder about supporting something like (perl style): \copy foo FROM 'myfilter dodgy-data.csv |' or maybe \copy foo from pipe 'myfilter dodgy-data.csv' or possibly \pipe foo from dodgy-data.csv using autodetecting-format-filter.pl Which would cause psql to fork/exec the filter and pass the output data to the server. We could then provide all sorts of parsers for format-of-the-week. This would probably include the converse: \pipe foo to table.xls using make-excel-spreadsheet.pl Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] PgInstaller error on upgrade
I am running XP Pro with SP2. Postgres 8.1 was installed with the MSI, but later on I upgraded the ODBC driver. Shortly after selecting upgrade.bat I received an error: "This error may indicate a problem with the installation package. Number 2803" After clicking OK the installer continued anyway and seems to have worked. I executed the command using runas and did remember to extract the files before proceeding. Mike ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Something I don't understand with the use of schemas
On Saturday, December 10, 2005 at 17:43, you wrote: Guillaume LELARGE [EMAIL PROTECTED] writes: Apparently, I can rename all schemas, even system schemas! metier=# alter schema pg_catalog rename to foobar; ALTER SCHEMA If you are superuser, you can do anything you want, up to and including breaking the system irretrievably. Compare rm -rf / on Unix. We won't be putting training wheels on superuser status for the same reasons that no one finds it a good idea to restrict root's abilities. Seems pretty fair. I've made more tests on schemas. I'm able to drop the information_schema and public schemas, but I can't drop pg_catalog and pg_toast. It makes me think that only pg_* are system schemas and that public and information_schema are public schemas. Am I right on this one? -- Guillaume. http://abs.traduc.org/ http://lfs.traduc.org/ http://traduc.postgresqlfr.org/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Something I don't understand with the use of schemas
On Sat, Dec 10, 2005 at 09:18:32AM -0800, Joshua D. Drake wrote: However there is an effort to get rid of root in some Unix lands, separating its responsibilities with more granularity. Maybe there could be an effort, not to hand-hold the true superusers, but to delegate some of its responsibilities to other users. Like sudo? I think it would be a huge benefit to have something equivalent to sudo for psql (though maybe it could be generalized more). Having to change to a different connection/authorization anytime you need to do something requiring superuser is a real pita, and encourages things like making yourself a superuser. In the case of shell commands like createdb, sudo is indeed a pretty viable alternative; you just need to do something like sudo -u postgres command. But there's no equivalent for psql; if you sudo -u postgres psql it's the equivalent of su - root (though at least you wouldn't need to know the password to the postgres account). I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Different length lines in COPY CSV
Tom Lane wrote: Where would you want this file conversion utility? bin? contrib? pgfoundry? I'd say pgfoundry for starters --- there's no reason to tie it down to server release cycles. Maybe when the thing is fairly mature and doesn't need frequent releases, we could think about whether it ought to be brought into the core distro. However, it likely won't ever be a candidate to become part of core unless it's written in C, and offhand I would judge C to not be the best choice of implementation language for such a thing. This is surely going to be mostly a string-pushing type of problem, so something like perl might be a better bet. You are probably right. The biggest wrinkle will be dealing with various encodings, I suspect. That at least is one thing that doing CSV within the backend bought us fairly painlessly. Perl's Text::CSV_XS module for example simply handles this by declaring that only [\x09\x20-\x7f] are valid in its non-binary mode, and in either mode appears to be MBCS unaware. We should try to do better than that. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
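To give a taste of the string-pushing involved, here is a tiny quoted-field splitter (a from-scratch sketch of mine, not the backend's CSV code). It handles embedded commas and doubled quotes, and because it is purely byte-oriented it is fine for ASCII-supersets like UTF-8 but would misparse encodings such as SJIS, where an ASCII byte value like backslash can be the trailing byte of a multibyte character -- exactly the MBCS pitfall noted above.

```c
#include <string.h>
#include <assert.h>

/* Split one CSV record in place into at most maxf fields, handling
 * quoted fields with embedded commas and doubled quotes ("" -> ").
 * Returns the number of fields found.  No multibyte awareness: each
 * byte is examined independently. */
static int
csv_split(char *line, char **fields, int maxf)
{
    int   nf = 0;
    char *out = line;           /* write cursor; never ahead of line */

    while (nf < maxf)
    {
        fields[nf++] = out;
        if (*line == '"')       /* quoted field */
        {
            line++;
            for (;;)
            {
                if (*line == '\0')
                    break;
                if (*line == '"' && line[1] == '"')
                {
                    *out++ = '"';       /* doubled quote -> one quote */
                    line += 2;
                }
                else if (*line == '"')
                {
                    line++;             /* closing quote */
                    break;
                }
                else
                    *out++ = *line++;
            }
        }
        while (*line != '\0' && *line != ',')
            *out++ = *line++;           /* unquoted remainder */
        if (*line == '\0')
        {
            *out = '\0';
            break;
        }
        line++;                         /* skip the comma */
        *out++ = '\0';
    }
    return nf;
}
```

A record like a,"b,""c"",d",e comes back as three fields: a, b,"c",d, and e. Real encoding support would need to decode character boundaries first, which is the part the backend currently gets for free.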
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, 12 Dec 2005 14:05:03 -0600 Jim C. Nasby [EMAIL PROTECTED] wrote: I think it would be a huge benefit to have something equivalent to sudo for psql (though maybe it could be generalized more). Having to change to a different connection/authorization anytime you need to do something requiring superuser is a real pita, and encourages things like making yourself a superuser. Me too. I think this would be a great feature. - Frank Wiles [EMAIL PROTECTED] http://www.wiles.org - ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Different length lines in COPY CSV
On Monday, 12.12.2005, at 15:08 -0500, Andrew Dunstan wrote: Tom Lane wrote: ... You are probably right. The biggest wrinkle will be dealing with various encodings, I suspect. That at least is one thing that doing CSV within the backend bought us fairly painlessly. Perl's Text::CSV_XS module for example simply handles this by declaring that only [\x09\x20-\x7f] are valid in its non-binary mode, and in either mode appears to be MBCS unaware. We should try to do better than that. Are there any test datafiles available in a repository? I could give it a shot, I think. If not, maybe we could set up something like that. Regards Tino ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Please Help: PostgreSQL Query Optimizer
On Mon, Dec 12, 2005 at 06:39:42PM +0530, Anjan Kumar. A. wrote: Through googling, I found that a normal disk has an external data transfer rate of around 40MBps, whereas main memory has a data transfer rate ranging from 1.6GBps to 2.8GBps. I think 40MB/s is a burst speed. You should do some testing to verify. In any case, PostgreSQL doesn't come close to the theoretical maximum disk bandwidth even on a sequential scan. There's been discussion about this on various lists in the past. For a single drive, expect something more in the range of 4-6MB/s (depending on the drive). More important than throughput, though, is latency. Because the latency of memory is much closer to 0 (it's not truly 0 due to L1/L2 caching), you can serve concurrent requests a lot faster. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
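The gap between those figures is easiest to feel as scan time for a fixed table size. A trivial back-of-envelope helper (the 5 MB/s and 2 GB/s inputs in the test are just the numbers quoted above, not measurements):

```c
#include <assert.h>

/* Seconds to stream table_mb megabytes at mb_per_sec sustained
 * throughput -- ignores latency, which as noted matters even more
 * for concurrent random access. */
static double
scan_seconds(double table_mb, double mb_per_sec)
{
    return table_mb / mb_per_sec;
}
```

A 1 GB sequential scan works out to roughly 205 seconds at an effective 5 MB/s from a single disk versus about half a second from memory at ~2 GB/s.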
Re: [HACKERS] Different length lines in COPY CSV
On Mon, Dec 12, 2005 at 09:30:12PM +0100, Tino Wildenhain wrote: On Monday, 12.12.2005, at 15:08 -0500, Andrew Dunstan wrote: You are probably right. The biggest wrinkle will be dealing with various encodings, I suspect. That at least is one thing that doing CSV within the backend bought us fairly painlessly. Perl's Text::CSV_XS module for example simply handles this by declaring that only [\x09\x20-\x7f] are valid in its non-binary mode, and in either mode appears to be MBCS unaware. We should try to do better than that. Are there any test datafiles available in a repository? I could give it a shot, I think. If not, maybe we could set up something like that. Note, recent versions of Perl allow you to specify the file encoding when you open the file and will convert things to UTF-8 as appropriate. So in theory it should be fairly simple to make a script that could handle various encodings. The hardest part is always determining which encoding a file is in in the first place... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
[HACKERS] [Bug] Server Crash, possible security exploit, where to send security report?
Hi all, while playing with Npgsql I faced a strange behavior of the PostgreSQL server. I have all the details of it, and I thought it could be a severe security exploit, so I am not sending it in the clear to this mailing list directly since, I think, anybody with this information could DoS PostgreSQL servers. Please tell me where/to whom I should send the details so this can be fixed as soon as possible. This is the log I get when the problem occurs. I think that since the server is killing all processes, any client that can do this can kill all client connections to that server. That's why I think it is very dangerous. DEBUG: server process (PID 2874) was terminated by signal 11 LOG: server process (PID 2874) was terminated by signal 11 LOG: terminating any other active server processes DEBUG: sending SIGQUIT to process 2111 DEBUG: sending SIGQUIT to process 2112 LOG: all server processes terminated; reinitializing LOG: database system was interrupted at 2005-12-12 17:54:12 BRST LOG: checkpoint record is at 0/38E290 LOG: redo record is at 0/38E290; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 619; next OID: 24576 LOG: next MultiXactId: 1; next MultiXactOffset: 0 LOG: database system was not properly shut down; automatic recovery in progress LOG: record with zero length at 0/38E2D4 LOG: redo is not required LOG: database system is ready LOG: transaction ID wrap limit is 2147484148, limited by database postgres -- Regards, Francisco Figueiredo Jr. Npgsql Lead Developer http://www.pgfoundry.org/projects/npgsql MonoBrasil Project Founder Member http://monobrasil.softwarelivre.org - Science without religion is lame; religion without science is blind.
~ Albert Einstein ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Anyone for adding -fwrapv to our standard CFLAGS?
It seems that gcc is up to some creative reinterpretation of basic C semantics again; specifically, you can no longer trust that the traditional C semantics of integer overflow hold: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=175462 While I don't think we are anywhere using exactly the same trick that the referenced mysql code is using, it certainly seems likely to me that a compiler that is willing to replace "x < 0 && -x < 0" with "false" might be able to break some of the integer overflow checks we do use. I think we need to add -fwrapv to CFLAGS anytime the compiler will take it, same as we recently started doing with -fno-strict-aliasing. Comments? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] replan needed?
I have a few SPs written in C for PostgreSQL 8.1; they mainly run SPI queries such as SELECT field1, field2 FROM table1 and INSERT INTO table2 VALUES (val1, val2), and to minimize execution time I do something like this:

static void *splan_insertstuff = NULL;

PG_FUNCTION_INFO_V1(insertstuff);

Datum
insertstuff(PG_FUNCTION_ARGS)
{
    void   *pplan;
    Datum   args[2];
    Oid     argt[2];

    SPI_connect();

    args[0] = PG_GETARG_DATUM(0);
    args[1] = PG_GETARG_DATUM(1);

    if (splan_insertstuff == NULL)
    {
        /* argt[] must be filled with the argument type OIDs
         * (e.g. TEXTOID) before calling SPI_prepare */
        pplan = SPI_prepare("INSERT INTO table1 (field1, field2) VALUES ($1, $2)",
                            2, argt);
        if (pplan == NULL)
        {
            SPI_finish();
            PG_RETURN_NULL();
        }
        splan_insertstuff = SPI_saveplan(pplan);
    }

    pplan = splan_insertstuff;
    SPI_execute_plan(pplan, args, NULL, false, 1);
    SPI_finish();
    PG_RETURN_NULL();
}

Well, table1 is plain easy: just two text fields with an index on field1. I have a daemon running to empty table1 and pass the data to another table. The main question is here: when I delete several tuples from table1, do I need to regenerate the plan? I mean, I store the saved plan in a static global variable, and the connection stays alive for at least the next 5 hours (I empty the table every 15 min). I have autovacuum running and my table1 grows around 1 tuples per minute. I see a performance drop over the next 2 or 3 hours of running the database. Is there something related to my SPs here, or do I need to search in another place? Could somebody help me with this? Thanks a lot! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] space for optimalization: DISTINCT without index
On Mon, Dec 12, 2005 at 18:35:07 +0100, Pavel Stehule [EMAIL PROTECTED] wrote: Hello, I did some tests and I can see that DISTINCT works well on indexed columns, but is slow on derived tables without indexes. If I use GROUP BY instead of DISTINCT I get much better times: SELECT DISTINCT a, b FROM tab vs. SELECT a, b FROM tab GROUP BY a, b. Can you explain it? DISTINCT will require a sort step to remove duplicates. GROUP BY can use either a sort or hash aggregate plan. If there are few distinct values, the hash aggregate plan can be much faster. ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] buildfarm off the air pro tem
Due to a piece of monumental carelessness on my part, the buildfarm web app is off the air. I will advise when I have managed to recover. It's a good thing we are not coming up to a release :-) cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Something I don't understand with the use of schemas
Jim C. Nasby [EMAIL PROTECTED] writes: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. You can already do most of this with SET/RESET ROLE:

regression=# create user tgl;
CREATE ROLE
regression=# create user admin createrole;
CREATE ROLE
regression=# grant admin to tgl;
GRANT ROLE
regression=# \c - tgl
You are now connected as new user tgl.
regression=> create user boo;
ERROR: permission denied to create role
regression=> set role admin;
SET
regression=> create user boo;
CREATE ROLE
regression=> reset role;
RESET
regression=> create user bar;
ERROR: permission denied to create role
regression=>

regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [Bug] Server Crash, possible security exploit, where to send security report?
On Mon, Dec 12, 2005 at 06:26:25PM -0200, Francisco Figueiredo Jr. wrote: Hi all, while playing with Npgsql I faced a strange behavior of the PostgreSQL server. I have all the details of it and I thought it could be a severe security exploit, so I don't send it in clear to this mailing list directly as, I think, anybody with this information could DoS PostgreSQL servers. Well, you're not giving any details, but if you can cause the server to dump core in a standard installation, we're interested. You didn't specify your version, BTW. This page has instructions, including for security-related stuff: http://www.postgresql.org/docs/current/static/bug-reporting.html Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, Dec 12, 2005 at 05:00:45PM -0500, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. You can already do most of this with SET/RESET ROLE: Very cool, I didn't realize that. It would still be nice if there were a way to do it on a per-command basis (since often you just need to run one command as admin/dba/what-have-you), but I suspect adding that to the grammar would be a real PITA. Perhaps it could be added to psql though... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Something I don't understand with the use of schemas
Jim C. Nasby wrote: On Mon, Dec 12, 2005 at 05:00:45PM -0500, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. You can already do most of this with SET/RESET ROLE: Very cool, I didn't realize that. It would still be nice if there was a way to do it on a per-command basis (since often you just need to run one command as admin/dba/what-have-you), but I suspect adding that to the grammar would be a real PITA. Perhapse it could be added to psql though... If it's one command can't you wrap it in a security definer function? cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, Dec 12, 2005 at 05:27:33PM -0500, Andrew Dunstan wrote: On Mon, Dec 12, 2005 at 05:00:45PM -0500, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. You can already do most of this with SET/RESET ROLE: Very cool, I didn't realize that. It would still be nice if there was a way to do it on a per-command basis (since often you just need to run one command as admin/dba/what-have-you), but I suspect adding that to the grammar would be a real PITA. Perhapse it could be added to psql though... If it's one command can't you wrap it in a security definer function? Sure, if it's a command you'll be using over and over. Which I guess some are, but it's still a pain. Maybe what I'm asking for will only make sense to people who use sudo... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, 2005-12-12 at 16:35 -0600, Jim C. Nasby wrote: On Mon, Dec 12, 2005 at 05:27:33PM -0500, Andrew Dunstan wrote: On Mon, Dec 12, 2005 at 05:00:45PM -0500, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. You can already do most of this with SET/RESET ROLE: Very cool, I didn't realize that. It would still be nice if there was a way to do it on a per-command basis (since often you just need to run one command as admin/dba/what-have-you), but I suspect adding that to the grammar would be a real PITA. Perhapse it could be added to psql though... If it's one command can't you wrap it in a security definer function? Sure, if it's a command you'll be using over and over. Which I guess some are, but it's still a pain. Maybe what I'm asking for will only make sense to people who use sudo... Having a set of fine-grained permissions that you could grant to roles could be useful. A sudo equivalent would be a version of psql that always connected to the database using super-user and allowed command execution based on a regular expression. Bit of a hack to say the least. -- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Which qsort is used
Neil Conway [EMAIL PROTECTED] writes: BTW, Luke Lonergan recently posted some performance results for a fairly efficient public domain implementation of qsort to the bizgres list: http://lists.pgfoundry.org/pipermail/bizgres-general/2005-December/000294.html As those results suggest, there can be huge differences in sort performance depending on whether the input is random, nearly sorted, nearly reverse sorted, possesses many equal keys, etc. It would be very dangerous to say implementation A is better than implementation B without having tested all those scenarios. IIRC, the reason we reject Solaris' qsort is not that it is so bad in the typical case, but that it has some horrible corner-case behaviors. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, Dec 12, 2005 at 05:43:47PM -0500, Rod Taylor wrote: I'd love to see something like SUDO ALTER USER ... SUDO REINDEX ... etc. That would make it easy to do 'normal' work with a non-superuser account. A sudo equivalent would be a version of psql that always connected to the database using super-user and allowed command execution based on a regular expression. Bit of a hack to say the least. How is that at all what you're describing? sudo gives you the ability to run a command as root, plain and simple. IE: sudo ls -la blah sudo /usr/local/etc/rc.d/010.pgsql.sh stop etc Some SQL examples would be... sudo CREATE USER ... sudo UPDATE table SET ... I have no idea what you're envisioning, but based on your description it certainly doesn't sound like what I'm envisioning... -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] space for optimalization: DISTINCT without index
Bruno Wolff III [EMAIL PROTECTED] writes: DISTINCT will require a sort step to remove duplicates. GROUP BY can use either a sort or hash aggregate plan. If there are few distinct values, the hash aggregate plan can be much faster. The DISTINCT code hasn't been revisited in a long time. One obstacle to improving it is that it's very tightly intertwined with ORDER BY. While fixing that might be just a Small Matter Of Programming, it's not clear how to decouple them without breaking DISTINCT ON. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_relation_size locking
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: You have to have a lock to ensure that the table even exists, let alone that you are looking at the right set of disk files. This would require a lock on pg_class, not table foo, no? No, the convention is that you take a lock on the relation you're interested in. So pgAdmin violates the convention, because it doesn't hold a lock on a table when reengineering its attributes. Since pg_relation_size is a pure metadata function, I don't think the convention applies here. Hm, I see the issue. Interestingly enough, I *do* see the size growing. OTOH, when running BEGIN; TRUNCATE against a test table, retrieving pg_relation_size returns the previous relfilenode and size as expected. That's a bit curious. If they just did TRUNCATE then COPY, the commit of the TRUNCATE should have released the lock. If the TRUNCATE wasn't committed yet, then how are you able to pick up the correct relfilenode to look at? The truncate is buried in a function; I suspect that actually no truncate happened on an empty table. Regards, Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] buildfarm off the air pro tem
Andrew Dunstan wrote: due to a piece of monumental carelessness by me, buildfarm web app is off the air. I will advise when I have managed to recover. It's a good thing we are not coming up to release :-) We are back again, thanks to Joshua Drake and CommandPrompt's excellent service. We have lost about 12 hours of data. Maybe I need to look at a replication system ... cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] buildfarm off the air pro tem
On Mon, 2005-12-12 at 18:00 -0500, Andrew Dunstan wrote: Andrew Dunstan wrote: due to a piece of monumental carelessness by me, buildfarm web app is off the air. I will advise when I have managed to recover. It's a good thing we are not coming up to release :-) We are back again, thanks to Joshua Drake and CommandPrompt's excellent service. We have lost about 12 hours of data. Maybe I need to look at a replication system ... I happen to know of one ;) cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: PLphp, PLperl, ODBCng - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Something I don't understand with the use of schemas
Guillaume LELARGE [EMAIL PROTECTED] writes: I've made more tests on schemas. I'm able to drop information_schema and public schemas but I can't drop pg_catalog and pg_toast. It makes me think that only pg_* are system schemas and that public and information_schema are public schemas. Am I right on this one ? Yeah, both of the latter are intentionally droppable because nothing in the core server depends on them. The C code is explicitly aware of both pg_catalog and pg_toast, so those are pinned. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] 7.3 failure on platypus
Platypus (http://lnk.nu/pgbuildfarm.org/6yt.pl) started failing about 12 days ago with the following:

ccache gcc -O3 -pipe -pipe -fno-strict-aliasing -g -Wall -Wmissing-prototypes -Wmissing-declarations -fpic -DPIC -I. -I../../../src/include -I/usr/local/include -DFRONTEND -DSYSCONFDIR='/home/buildfarm/buildfarm/REL7_3_STABLE/inst/etc/postgresql' -c -o fe-misc.o fe-misc.c
fe-auth.c: In function `pg_local_sendauth':
fe-auth.c:466: error: conflicting types for 'cmsgmem'
fe-auth.c:459: error: previous declaration of 'cmsgmem' was here
fe-auth.c:469: error: redefinition of 'cmsg'
fe-auth.c:462: error: previous definition of 'cmsg' was here
fe-auth.c:462: warning: unused variable `cmsg'
gmake[3]: *** [fe-auth.o] Error 1
gmake[3]: *** Waiting for unfinished jobs
gmake[3]: Leaving directory `/home/buildfarm/buildfarm/REL7_3_STABLE/pgsql.62497/src/interfaces/libpq'
gmake[2]: *** [all] Error 2
gmake[2]: Leaving directory `/home/buildfarm/buildfarm/REL7_3_STABLE/pgsql.62497/src/interfaces'
gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory `/home/buildfarm/buildfarm/REL7_3_STABLE/pgsql.62497/src'
gmake: *** [all] Error 2

The machine was having some issues during that time, and we did recently upgrade to FreeBSD 6.0, but all the other branches are clean. Any ideas? -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Something I don't understand with the use of schemas
Jim C. Nasby [EMAIL PROTECTED] writes: On Mon, Dec 12, 2005 at 05:43:47PM -0500, Rod Taylor wrote: A sudo equivalent would be a version of psql that always connected to the database using super-user and allowed command execution based on a regular expression. Bit of a hack to say the least. How is that at all what you're describing? sudo gives you the ability to run a command as root, plain and simple. Perhaps you should read the sudo documentation sometime ;-). sudo allows fairly fine-grained control over who can become which userid and what commands they can issue. (At least the Linux version does.) I'm not sure that a psql sudo would have to have all that, since to some extent it'd duplicate the existing SQL permissions machinery, but at the very least it needs to allow specification of the target userid. There isn't any universal equivalent to root that we could sensibly default to in Postgres. So you're really talking about sudo postgres create user joe ... ; versus set role postgres; create user joe ... ; reset role; which is not *that* amazing a savings in typing, and becomes very rapidly less so when you need to execute multiple commands as the more-privileged user. Implementing sudo on the psql side would be a bit of a PITA, because of the problem of how do you reset role if the called command fails (and thereby aborts your open transaction)? On the backend side I think it could use the same reset mechanism that already exists for security-definer functions... regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Something I don't understand with the use of schemas
Le Mardi 13 Décembre 2005 00:13, Tom Lane a écrit : Guillaume LELARGE [EMAIL PROTECTED] writes: I've made more tests on schemas. I'm able to drop information_schema and public schemas but I can't drop pg_catalog and pg_toast. It makes me think that only pg_* are system schemas and that public and information_schema are public schemas. Am I right on this one ? Yeah, both of the latter are intentionally droppable because nothing in the core server depends on them. The C code is explicitly aware of both pg_catalog and pg_toast, so those are pinned. Thanks for your answer. -- Guillaume. !-- http://abs.traduc.org/ http://lfs.traduc.org/ http://traduc.postgresqlfr.org/ -- ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Something I don't understand with the use of schemas
On Mon, Dec 12, 2005 at 06:37:03PM -0500, Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Mon, Dec 12, 2005 at 05:43:47PM -0500, Rod Taylor wrote: A sudo equivalent would be a version of psql that always connected to the database using super-user and allowed command execution based on a regular expression. Bit of a hack to say the least. How is that at all what you're describing? sudo gives you the ability to run a command as root, plain and simple. Perhaps you should read the sudo documentation sometime ;-). sudo allows fairly fine-grained control over who can become which userid and what commands they can issue. (At least the Linux version does.) I'm not sure that a psql sudo would have to have all that, since to some extent it'd duplicate the existing SQL permissions machinery, Yeah, that's taking my analogy farther than I intended. :) but at the very least it needs to allow specification of the target userid. There isn't any universal equivalent to root that we could sensibly default to in Postgres. So you're really talking about Database owner? Implementing sudo on the psql side would be a bit of a PITA, because of the problem of how do you reset role if the called command fails (and thereby aborts your open transaction)? On the backend side I think it could use the same reset mechanism that already exists for security-definer functions... Heh, I figured adding this to the grammar would be a nightmare compared to anything else; shows what (little) I know. :) -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 7.3 failure on platypus
Jim C. Nasby [EMAIL PROTECTED] writes: Platypus (http://lnk.nu/pgbuildfarm.org/6yt.pl) started failing about 12 days ago with the following: fe-auth.c: In function `pg_local_sendauth': fe-auth.c:466: error: conflicting types for 'cmsgmem' fe-auth.c:459: error: previous declaration of 'cmsgmem' was here Yeah, all of the freebsd 6 buildfarm machines say that :-( ... the fbsd6 system headers contain some incompatible changes, evidently. It looks like we dealt with this in 7.4 but didn't risk back-patching. The PG 7.3 branch is definitely showing its age. I'm not sure how interesting it is to keep updating it for newer platforms; is anyone very likely to run 7.3 on a new machine, rather than some later PG? (Note: while I have a personal need to keep supporting 7.3 on RHEL3, this requirement does not extend to any later platforms; so I don't personally care very much about back-porting fixes like this. Obviously my opinion is biased by what Red Hat cares about.) regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 7.3 failure on platypus
On Mon, Dec 12, 2005 at 06:44:23PM -0500, Tom Lane wrote: The PG 7.3 branch is definitely showing its age. I'm not sure how interesting it is to keep updating it for newer platforms; is anyone very likely to run 7.3 on a new machine, rather than some later PG? Probably no one... I'll just gank it from playtypus's config. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Which qsort is used
Tom, On 12/12/05 2:47 PM, Tom Lane [EMAIL PROTECTED] wrote: As those results suggest, there can be huge differences in sort performance depending on whether the input is random, nearly sorted, nearly reverse sorted, possesses many equal keys, etc. It would be very dangerous to say implementation A is better than implementation B without having tested all those scenarios. Yes. The Linux glibc qsort is proven terrible in the general case by these examples, though. Bruce's point on that thread was that we shouldn't be replacing the OS routine in the general case. On the other hand, there is the precedent of replacing Solaris' routine with the NetBSD version. Based on the current testing, I think it would be a good idea to expose a --with-qsort option in configure to allow for its selection, as suggested by other posters. IIRC, the reason we reject Solaris' qsort is not that it is so bad in the typical case, but that it has some horrible corner-case behaviors. Do you have a test suite you can recommend with those edge cases? I built the one in the bizgres-general thread based on edge cases for Solaris that I found on a Sun mailing list. The edge case referred to there was the all-zero one, which does seem to give a significant advantage to the NetBSD version. - Luke ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Anyone for adding -fwrapv to our standard CFLAGS?
On Mon, 2005-12-12 at 16:19 -0500, Tom Lane wrote: It seems that gcc is up to some creative reinterpretation of basic C semantics again; specifically, you can no longer trust that traditional C semantics of integer overflow hold: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=175462 While I don't think we are anywhere using exactly the same trick that the referenced mysql code is using, it certainly seems likely to me that a compiler that is willing to replace "x < 0 && -x < 0" with "false" might be able to break some of the integer overflow checks we do use. IMHO code that makes assumptions about overflow behavior beyond what is defined by the standard is asking for trouble, whether those assumptions are traditional C semantics or not. Given that -fwrapv apparently hurts performance *and* you've presented no evidence that we actually need the flag in the first place, I'm not sold on this idea... -Neil ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 7.3 failure on platypus
Tom Lane said: Jim C. Nasby [EMAIL PROTECTED] writes: Platypus (http://lnk.nu/pgbuildfarm.org/6yt.pl) started failing about 12 days ago with the following: fe-auth.c: In function `pg_local_sendauth': fe-auth.c:466: error: conflicting types for 'cmsgmem' fe-auth.c:459: error: previous declaration of 'cmsgmem' was here Yeah, all of the freebsd 6 buildfarm machines say that :-( ... the fbsd6 system headers contain some incompatible changes, evidently. It looks like we dealt with this in 7.4 but didn't risk back-patching. The PG 7.3 branch is definitely showing its age. I'm not sure how interesting it is to keep updating it for newer platforms; is anyone very likely to run 7.3 on a new machine, rather than some later PG? (Note: while I have a personal need to keep supporting 7.3 on RHEL3, this requirement does not extend to any later platforms; so I don't personally care very much about back-porting fixes like this. Obviously my opinion is biased by what Red Hat cares about.) I don't care that much if 7.3 fails to build on fbsd 6. The flipside is that the fix for this particular problem appears to be very simple and very low risk, unless I'm going blind. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Which qsort is used
On Mon, 12 Dec 2005, Luke Lonergan wrote: Do you have a test suite you can recommend with those edge cases? I have at least these factors in mind:

+ f1: number of elements - in {10^3, 10^4, 10^5, 10^6, 10^7}
+ f2: key comparators measured by cpu cost - in {1, 10, 100+}
+ f3: data distribution - in {uniform, Gaussian, 95% sorted, 95% reverse sorted}
+ f4: data value range - in {2, 32, 1024, unlimited}: radix sort might be better for small ranges

The element size doesn't matter since the main usage of our qsort is on pointer arrays. Element data type is covered by f2 and f4. This will give us 5*3*4*4 = 240 tests ... Regards, Qingqing ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Anyone for adding -fwrapv to our standard CFLAGS?
Neil Conway [EMAIL PROTECTED] writes: IMHO code that makes assumptions about overflow behavior beyond what is defined by the standard is asking for trouble, whether those assumptions are traditional C semantics or not. Given that -fwrapv apparently hurts performance *and* you've presented no evidence that we actually need the flag in the first place, I'm not sold on this idea... Can you present a reasonably efficient way to test for integer overflow, as is needed in int4pl and friends (not to mention a lot of security checks that you yourself had a hand in adding), without making any assumptions that the new gcc may think are illegal? ISTM that what the spec is doing (according to Jakub's interpretation of it anyway) is denying the existence of overflow for signed values, which is (a) silly and (b) unhelpful. The larger problem here, however, is exactly the same as it was with -fno-strict-aliasing: the compiler provides no useful tools for finding the places where your code may be broken by the new assumptions. I'd be willing to have a go at fixing the problems if I could be certain that we'd found them all, but what gcc is actually offering us is a game of Russian roulette. I have better things to do with my time than to try to find all the places we'd need to fix to have any confidence that our code is safe without -fwrapv. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] GUI Debugging on Windows
Hi, This might be a newbie question but I'm wondering how most of you are debugging PostgreSQL on Windows (XP) ? Are you all using gdb? I want to develop a new data type and I'd like to trace code in a graphical interface. I have been trying very hard with Eclipse without success. Insight doesn't seem to work very well on Windows... I can't believe most of you are happy with gdb... Thanks for any info. Pierre ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] 7.3 failure on platypus
Andrew Dunstan [EMAIL PROTECTED] writes: I don't care that much if 7.3 fails to build on fbsd 6. The flipside is that the fix for this particular problem appears to be very simple and very low risk, unless I'm going blind. Possibly --- if you've gone to the trouble of identifying the relevant patch, we might as well apply it. Send it over. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Cost-based optimizers
A vaguely interesting interview with IBM and MS guys about cost-based optimizers. http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297 Chris ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_relation_size locking
Andreas Pflug wrote: Alvaro Herrera wrote: The problem with the original coding was that it used the table Oid to look up the file name, which is wrong. (Test it with a table that has been clustered or an index that has been reindexed.) Um, can't test at the moment. The old code used pg_class.relfilenode, which delivers the "name of the on-disk file of this relation" according to the docs. What's wrong with that? Hum, nothing that I can see, but I changed that code precisely because somebody complained that it didn't work after truncating. Do you mean old code as in the contrib code, or the code initially integrated in the backend? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Different length lines in COPY CSV
Anyway, that's history now. Where would you want this file conversion utility? bin? contrib? pgfoundry? How about a full SQL*Loader clone? :D ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 7.3 failure on platypus
Tom Lane said: Andrew Dunstan [EMAIL PROTECTED] writes: I don't care that much if 7.3 fails to build on fbsd 6. The flipside is that the fix for this particular problem appears to be very simple and very low risk, unless I'm going blind. Possibly --- if you've gone to the trouble of identifying the relevant patch, we might as well apply it. Send it over. I'm away from my workstation, but it looks to me like the HAVE_STRUCT_SOCKCRED branch is completely redundant in that function, since the structures in question are only referred to in a HAVE_CMSGCRED branch. I think I'd just delete lines 464-470 in http://developer.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/libpq/fe-auth.c?annotate=1.71 Jim, can you test that? (I don't have an fbsd box either). cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] missing something obvious about intervals?
On Mon, Dec 12, 2005 at 12:34:45PM -0600, Bruno Wolff III wrote: You are better off doing a multiply. Something like: area= select '2005-12-01'::date + 456.5 * '1 second'::interval; ?column? 2005-12-01 00:07:36.50 (1 row) This comes up often enough that it's probably worth adding a built-in function, especially if it's faster to do the multiply (though presumably a built-in function could outperform both the multiply and the more common (4.5::float || ' seconds')::interval form). -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 7.3 failure on platypus
Andrew Dunstan [EMAIL PROTECTED] writes: I think I'd just delete lines 464-470 in http://developer.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/libpq/fe-auth.c?annotate=1.71 I confirmed that this is the patch appearing in 7.4 (fe-auth.c rev 1.84), so went ahead and applied it to 7.3. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [GENERAL] missing something obvious about intervals?
Jim C. Nasby [EMAIL PROTECTED] writes: On Mon, Dec 12, 2005 at 12:34:45PM -0600, Bruno Wolff III wrote: You are better off doing a multiply. Something like: area= select '2005-12-01'::date + 456.5 * '1 second'::interval; ?column? 2005-12-01 00:07:36.50 (1 row) This comes up often enough that it's probably worth adding a built-in function, The only case I've noticed seeing on a regular basis is Unix timestamp conversion (ie, the special case where the date is 1970-01-01); and we did add a built-in function for that in 8.1. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
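[The multiply-by-interval trick above is ordinary scalar arithmetic on a time quantity; a minimal Python sketch of the same computation, for readers who want to sanity-check the SQL result:]

```python
from datetime import date, datetime, timedelta

# Mirrors: SELECT '2005-12-01'::date + 456.5 * '1 second'::interval;
# Adding a fractional number of seconds to a date promotes it to a timestamp,
# just as in the SQL example above.
start = datetime.combine(date(2005, 12, 1), datetime.min.time())
result = start + 456.5 * timedelta(seconds=1)
print(result)  # 2005-12-01 00:07:36.500000
```

456.5 seconds is 7 minutes 36.5 seconds, matching the "2005-12-01 00:07:36.50" row the SQL session shows.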
Re: [HACKERS] Cost-based optimizers
http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=297 I saw it in print; the only thing that seemed interesting about it was the recommendation that query optimization be biased towards the notion of stable plans, query plans that may not be the most aggressively fast, but which don't fall apart into hideous performance if the estimates are a little bit off. -- output = (cbbrowne @ ntlug.org) http://linuxdatabases.info/info/lsf.html Rules of the Evil Overlord #114. I will never accept a challenge from the hero. http://www.eviloverlord.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Which qsort is used
Qingqing, On 12/12/05 5:08 PM, Qingqing Zhou [EMAIL PROTECTED] wrote: This will gives us a 5*3*4*4 = 240 tests ... Looks good - I'm not going to be able to implement this matrix of tests quickly, but each dimension seems right. Might you have time to implement these within the testing framework I published previously? It has both the NetBSD and qsortG included along with a timing routine, etc. BTW - the edge case reported to the Sun mailing list was here: http://forum.sun.com/thread.jspa?forumID=4&threadID=7231 - Luke ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
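[The 5*3*4*4 = 240 figure is just the cartesian product of the proposed test dimensions. A sketch of how such a matrix could be enumerated; the dimension names and values here are assumptions for illustration, only the counts (5, 3, 4, 4) come from the thread:]

```python
from itertools import product

# Hypothetical reconstruction of the proposed qsort test matrix.
# Only the dimension counts are from the discussion; the concrete
# sizes, element widths, distributions, and orderings are made up.
sizes = [1_000, 10_000, 100_000, 1_000_000, 10_000_000]                  # 5
widths = ["int", "pointer", "wide-tuple"]                                # 3
distributions = ["uniform", "few-distinct", "organ-pipe", "all-equal"]   # 4
orderings = ["random", "sorted", "reverse", "mostly-sorted"]             # 4

matrix = list(product(sizes, widths, distributions, orderings))
print(len(matrix))  # 240
```

Enumerating the cases this way makes it easy to hand each tuple to a timing harness like the one mentioned above.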
Re: [HACKERS] Cost-based optimizers
I saw it in print; the only thing that seemed interesting about it was the recommendation that query optimization be biased towards the notion of stable plans, query plans that may not be the most aggressively fast, but which don't fall apart into hideous performance if the estimates are a little bit off. And the answer is interesting as well: I think we have to approach it in two ways. One is that you have to be able to execute good plans, and during the execution of a plan you want to notice when the actual data is deviating dramatically from what you expected. If you expected five rows and you’ve got a million, chances are your plan is not going to do well because you chose it based on the assumption of five. Thus, being able to correct mid-course is an area of enhancement for query optimizers that IBM is pursuing. Hmmm dynamic re-planning! Chris ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_relation_size locking
Alvaro Herrera wrote: Andreas Pflug wrote: Alvaro Herrera wrote: The problem with the original coding was that it used the table Oid to look up the file name, which is wrong. (Test it with a table that has been clustered or an index that has been reindexed.) Um, can't test at the moment. The old code used pg_class->relfilenode, which delivers "Name of the on-disk file of this relation" according to the docs. What's wrong with that? Hum, nothing that I can see, but I changed that code precisely because somebody complained that it didn't work after truncating. Do you mean old code as the contrib code, or the code initially integrated in the backend? Both, esp. backend/utils/adt/dbsize.c V1.4 and contrib/dbsize/dbsize.c from 8.0.5. You might have been irritated by the naming: relnodeOid = pg_class->relfilenode; (..) PG_RETURN_INT64(calculate_relation_size(tblspcOid, relnodeOid)); Regards, Andreas ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
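[The crux of the thread above is that the on-disk file name comes from pg_class.relfilenode, not from the table's OID: CLUSTER, REINDEX, and TRUNCATE assign a fresh relfilenode while the OID stays put. A rough Python sketch of the path construction that calculate_relation_size performs, assuming the pre-8.4 $PGDATA layout; all OIDs below are made up:]

```python
# Sketch of how the old dbsize code located a relation's file.
# 1663 is pg_default's tablespace OID; other tablespaces live under
# pg_tblspc/.  The numbers in the example call are hypothetical.
def relation_path(tblspc_oid, db_oid, relfilenode):
    if tblspc_oid in (0, 1663):  # default tablespace
        return "/".join(["base", str(db_oid), str(relfilenode)])
    return "/".join(["pg_tblspc", str(tblspc_oid), str(db_oid), str(relfilenode)])

print(relation_path(1663, 16384, 16385))  # base/16384/16385
```

Looking this path up by table OID instead of relfilenode would point at a stale file as soon as the relation is rewritten, which is exactly the complaint about truncation.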
Re: [HACKERS] Cost-based optimizers
Chris, On 12/12/05 8:44 PM, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: assumption of five. Thus, being able to correct mid-course is an area of enhancement for query optimizers that IBM is pursuing. Hmmm dynamic re-planning! I recently interviewed someone who is in the research group working on this at IBM. From what he said - it seems this is pretty far from making its way into the production codebase. Apparently there is too much history in DB2's optimizer and the perception is that the risk / payoff is too high. - Luke ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Please Help: PostgreSQL Query Optimizer
Anjan, But, in PostgreSQL all costs are scaled relative to a page fetch. If we make both sequential_page_fetch_cost and random_page_cost to 1, then we need to increase the various cpu_* parameters by multiplying the default values with appropriate Scaling Factor. Now, we need to determine this Scaling Factor. I see, so you're saying that because the real cost of a page fetch has decreased, the CPU_* costs should increase proportionally because relative to the real costs of a page fetch they should be higher? That makes a sort of sense. The problem that you're going to run into is that currently we have no particular reason to believe that the various cpu_* costs are more than very approximately correct as rules of thumb. So I think you'd be a lot better off trying to come up with some means of computing the real cpu costs of each operation, rather than trying to calculate a multiple of numbers which may be wrong in the first place. I know that someone on this list was working on a tool to digest EXPLAIN ANALYZE results and run statistics on them. Can't remember who, though. Also, I'm still curious on how you're handling shared_mem, work_mem and maintenance_mem. You didn't answer last time. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
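[The rescaling Anjan proposes is simple arithmetic: if the page-fetch cost drops from 4 to 1, every cpu_* parameter must grow by that same factor to keep its cost relative to a page fetch unchanged. A sketch, assuming the stock planner defaults of the era (cpu_tuple_cost = 0.01, cpu_index_tuple_cost = 0.001, cpu_operator_cost = 0.0025, random_page_cost = 4):]

```python
# Rescale cpu_* costs when page fetches are redefined to cost 1.
# The default values below are assumptions taken from the stock
# postgresql.conf of this era.
defaults = {
    "cpu_tuple_cost": 0.01,
    "cpu_index_tuple_cost": 0.001,
    "cpu_operator_cost": 0.0025,
}
old_random_page_cost = 4.0
new_random_page_cost = 1.0
scaling_factor = old_random_page_cost / new_random_page_cost  # 4.0

# Each cpu_* cost grows by the same factor the page cost shrank by,
# so the *ratio* of CPU work to page fetches is preserved.
rescaled = {name: cost * scaling_factor for name, cost in defaults.items()}
print(rescaled["cpu_tuple_cost"])  # 0.04
```

As Josh notes, this only preserves ratios that were rough rules of thumb to begin with; measuring real per-operation CPU cost would be sounder.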
Re: [HACKERS] Which qsort is used
Tom, IIRC, the reason we reject Solaris' qsort is not that it is so bad in the typical case, but that it has some horrible corner-case behaviors. Sun claims to have fixed these. Hopefully they'll do some testing which will prove it. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 7.3 failure on platypus
On Mon, Dec 12, 2005 at 10:39:47PM -0500, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I think I'd just delete lines 464-470 in http://developer.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/libpq/fe-auth.c?annotate=1.71 I confirmed that this is the patch appearing in 7.4 (fe-auth.c rev 1.84), so went ahead and applied it to 7.3. Seems to have worked... but now there's another problem...

creating template1 database in data/base/1... IpcSemaphoreCreate: semget(key=3, num=17, 03600) failed: No space left on device

The error talks about SEMMNI and SEMMNS, but both look ok...

kern.ipc.semmns: 100
kern.ipc.semmni: 10

[EMAIL PROTECTED]:51]~/buildfarm/source:141 ipcs
Message Queues:
T   ID      KEY       MODE      OWNER   GROUP

Shared Memory:
T   ID      KEY       MODE      OWNER   GROUP
m   65536   5432001   --rw---   pgsql   pgsql

Semaphores:
T   ID      KEY       MODE      OWNER   GROUP
s   65536   5432001   --rw---   pgsql   pgsql
s   65537   5432002   --rw---   pgsql   pgsql
s   65538   5432003   --rw---   pgsql   pgsql

All the other branches are working fine as well, so I'm wondering if this is some other issue... This first appeared 26 days ago: http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=platypus&dt=2005-11-16%2007:35:00 But the only file changed there is documentation... that's also before we upgraded the machine, so it doesn't appear to be related to that either. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 7.3 failure on platypus
Jim C. Nasby [EMAIL PROTECTED] writes: The error talks about SEMMNI and SEMMNS, but both look ok... kern.ipc.semmns: 100 That's not enough to run two postmasters ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
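[Tom's diagnosis follows from how PostgreSQL allocates semaphores: it grabs them in sets of 17 (16 per group of backend slots plus one bookkeeping semaphore), one set per 16 allowed connections, which also matches the semget(num=17) in the error and the three sets visible in the ipcs output above. A sketch of the arithmetic; max_connections = 40 is an assumption chosen to match the three allocated sets:]

```python
import math

# Why kern.ipc.semmns = 100 cannot back two postmasters.
# PostgreSQL allocates one 17-semaphore set per 16 allowed connections.
SEMAS_PER_SET = 17
BACKENDS_PER_SET = 16

def semas_needed(max_connections):
    return math.ceil(max_connections / BACKENDS_PER_SET) * SEMAS_PER_SET

one_postmaster = semas_needed(40)    # 3 sets -> 51 semaphores
two_postmasters = 2 * one_postmaster
print(one_postmaster, two_postmasters)  # 51 102 -- over the 100-semaphore limit
```

One running postmaster fits (51 of 100), but a second one pushes the total to 102, so its semget fails with "No space left on device"; raising kern.ipc.semmns (and semmni) is the fix.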