Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
Joshua D. Drake [EMAIL PROTECTED] writes: Christopher Browne wrote: That won't help; that would introduce the embarrassment of having a known default password. No it wouldn't unless the packagers set it up to do that. My point is that when a packager (or source) runs initdb, it would prompt for the postgres user password. Practically every existing packaging of PG tries to run initdb as a hidden, behind-the-scenes, definitely not-interactive procedure. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
That won't help; that would introduce the embarrassment of having a known default password. No it wouldn't unless the packagers set it up to do that. My point is that when a packager (or source) runs initdb, it would prompt for the postgres user password. Practically every existing packaging of PG tries to run initdb as a hidden, behind-the-scenes, definitely not-interactive procedure. afaik, practically every existing packaging of pg has *already* solved the problem and does not set trust as default anyway. ident sameuser I think is the most common. One thing I've thought about doing is to remove the default in initdb completely and *force* the user to choose auth type. Packagers can then just use that to set ident or whatever. and interactive users can pick trust if they really need it, but it will be a known choice. /Magnus ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Tuple alignment
Why do we require that t_hoff is MAXALIGNed? ISTM that if the first field in a tuple doesn't require alignment, it could be stored immediately after the null bitmap, without padding. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Load Distributed Checkpoints test results
On Sun, 2007-06-17 at 01:36 -0400, Greg Smith wrote: The last project I was working on, any checkpoint that caused a transaction to slip for more than 5 seconds would cause a data loss. One of the defenses against that happening is that you have a wicked fast transaction rate to clear the buffer out when thing are going well, but by no means is that rate the important thing--never having the response time halt for so long that transactions get lost is. You would want longer checkpoints in that case. You're saying you don't want long checkpoints because they cause an effective outage. The current situation is that checkpoints are so severe that they cause an effective halt to processing, even though checkpoints allow processing to continue. Checkpoints don't hold any locks that prevent normal work from occurring but they did cause an unthrottled burst of work to occur that raised expected service times dramatically on an already busy server. There were a number of effects contributing to the high impact of checkpointing. Heikki's recent changes reduce the impact of checkpoints so that they do *not* halt other processing. Longer checkpoints do *not* mean longer halts in processing, they actually reduce the halt in processing. Smoother checkpoints mean smaller resource queues when a burst coincides with a checkpoint, so anybody with throughput-maximised or bursty apps should want longer, smooth checkpoints. You're right to ask for a minimum write rate, since this allows very small checkpoints to complete in reduced times. There's no gain from having long checkpoints per se, just the reduction in peak write rate they typically cause. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] EXPLAIN omits schema?
On Wed, 2007-06-13 at 14:01 -0400, Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Arguably this is a bug if it's causing pg_admin difficulties in parsing the output. Even for a user in an environment where, for example, he has several identical schemas and may be accidentally getting a different table than he's expecting the current output is ambiguous. Attached is a small patch which adds this conditionally on a guc that pg_admin or other GUI tools could set, leaving it unchanged for users. That makes things *worse* not better, since now tools would have to deal with both possibilities. The context here was a new feature in PgAdmin, which needs to be able to parse the SQL *and* find out the schema of a table. The idea was to have a tool that would issue an EXPLAIN *and* collect all the other relevant details required to submit an optimizer question to the lists. The tool would then be able to check for simple things like not having run ANALYZE. That feature would be very useful in identifying optimizer issues, as well as filtering out many requests that arrive, only to be easily explainable. Greg's small patch will allow this useful utility to be available for use with the 8.3 release package, so I hope you'll reconsider. Of course, XML output can be done for the next release. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
Hi, On Mon, 2007-06-18 at 01:58 -0400, Tom Lane wrote: Practically every existing packaging of PG tries to run initdb as a hidden, behind-the-scenes, definitely not-interactive procedure. Also, from RPM perspective: RPMs are *not* interactive, and will *never* be. So we cannot ask user a password during initdb, or such. Regards, -- Devrim GÜNDÜZ PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/ signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
* Jeremy Drake ([EMAIL PROTECTED]) wrote: The crux of this seems to be two-fold: 1. If dblink is installed, an untrusted user could use it to gain privileges, either using trust/ident auth (you have a superuser named after the account the postmaster is runing as), or can be scripted to brute force passwords. The dblink w/ ident case is at least somewhat interesting since, iirc anyway, if you install dblink it comes with permissions for anyone to run it. That's pretty ugly if your PG superuser is the same user PostgreSQL runs as and you're using ident (which is quite common, esp. over unix sockets). The answer here being, don't allow just anyone to run dblink. 2. If you are a superuser, you can gain access to the external system, ie, by creating C language functions. Which, as an issue, is pretty much resolved in 8.2 anyway... You'd have to be able to compile and/or upload new libraries to the system w/ 8.2 since the PG_MODULE_MAGIC is required now. Neither of these are news to me, but maybe some new postgres admin will read it and figure out to disable trust auth and not to let untrusted users call dblink (either not install it or REVOKE the rights to call it). I'm strongly tempted to say this should be set up as the default for dblink, if it's not too hard to implement (I'd expect there's already a .sql which does the in-db create function and whatnot, just revoke all from it after it's created and tell people to create views using it instead as superuser). Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Load Distributed Checkpoints test results
On Mon, 18 Jun 2007, Simon Riggs wrote: Smoother checkpoints mean smaller resource queues when a burst coincides with a checkpoint, so anybody with throughput-maximised or bursty apps should want longer, smooth checkpoints. True as long as two conditions hold: 1) Buffers needed to fill allocation requests are still being written fast enough. The buffer allocation code starts burning a lot of CPU+lock resources when many clients are all searching the pool looking for a buffers and there aren't many clean ones to be found. The way the current checkpoint code starts at the LRU point and writes everything dirty in the order new buffers will be allocating in as fast as possible means it's doing the optimal procedure to keep this from happening. It's being presumed that making the LRU writer active will mitigate this issue, my experience suggests that may not be as effective as hoped--unless it gets changed so that it's allowed to decrement usage_count. To pick one example of a direction I'm a little concerned about related to this, Itagaki's sorted writes results look very interesting. But as his test system is such that the actual pgbench TPS numbers are 1/10 of the ones I was seeing when I started having ugly buffer allocation issues, I'm real sure the particular test he's running isn't sensitive to issues in this area at all; there's just not enough buffer cache churn if you're only doing a couple of hundred TPS for this to happen. 2) The checkpoint still finishes in time. The thing you can't forget about when dealing with an overloaded system is that there's no such thing as lowering the load of the checkpoint such that it doesn't have a bad impact. Assume new transactions are being generated by an upstream source such that the database itself is the bottleneck, and you're always filling 100% of I/O capacity. All I'm trying to get everyone to consider is that if you have a large pool of dirty buffers to deal with in this situation, it's possible (albeit difficult) to get into a situation where if the checkpoint doesn't write out the dirty buffers fast enough, the client backends will evacuate them instead in a way that makes the whole process less efficient than the current behavior. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] GUC time unit spelling a bit inconsistent
It seems that time-based GUC variables can be spelled like 1h but not 1hr 1minbut not 1m 1s but not 1sec This is inconsistent and confusing. I don't object to the ones on the left as being the standard spellings for printout, but if we're not going to have a simple uniform rule like shortest possible abbreviation then we ought to accept plausible alternatives on input. I got burnt by this just now because I looked at the autovacuum_naptime setting in postgresql.conf, which is shown as '1min', and figured I could change it to '5sec'. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server
Am Dienstag, 22. Mai 2007 05:58 schrieb Tom Lane: Okay, I spent some time googling this question, and I can't find any suggestion that any ARM variant uses non-IEEE-compliant float format. Some news I'm picking up at DebConf is that the existing Debian arm port will be replaced by a new armel port with a new ABI that has, among other things, the effect that the double format ceases to be completely different from everyone else. So the problem under discussion here will probably go away soon. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] Reducing NUMERIC size for 8.3
We've changed the on-disk database format in 8.3, so we have an opportunity to change other things also. There is a patch thats been on the patch queue for some time called numeric508, submitted Dec 2005; I've updated this patch now for 8.3 to remove bit rot (an hour's work). This is posted to pgsql-patches now and it works. The benefit of the patch is that it reduces each NUMERIC value by 2 bytes, so will speed up things considerably. This is now especially important if we are looking to reduce the speed of numeric division by a factor of 4 (recent WIP patch). The objections to applying this patch originally were: 1. it changes on-disk format (we've done this, so argument is void) 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. The current patch passes all regression tests, but currently fails numeric_big.sql since this explicitly checks for support of numeric(1000,800). We could: a) accept the patch as-is and restrict NUMERIC to 508 digits b) refine the patch somewhat to allow 1000 digits (b) is possible in a couple of ways, both fairly quick: - extend the patch so that one of the spare bits from the second digit is used to represent dscale 508-1000. - extend the patch so that if weight 127 or dscale 127 we would use the first byte in the digits as an extra indicator byte holding the high bits of both fields. Neither change makes any difference to numbers below 1,000,000,000,000,000(127 zeroes in total)...000 which probably covers the vast majority of people's usage. Objections: True, we are passed feature freeze, but this patch has been on the queue for 14 months prior to freeze and has been waiting on disk format changes to make patch application acceptable. We definitely want to reduce the size of Numeric by 2 bytes at some point. The question in my mind is: When is the best time to make this change? If we put this off until 8.4, then it will get rejected again because we won't want to change the disk format again. So the best time to do this is now, otherwise we'll put it off forever. Can I get somebody other than Tom to agree to review the patch? Clearly waiting for Tom to review this is just going to delay release, which I don't want to do. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Tuple alignment
Heikki Linnakangas [EMAIL PROTECTED] writes: Why do we require that t_hoff is MAXALIGNed? ISTM that if the first field in a tuple doesn't require alignment, it could be stored immediately after the null bitmap, without padding. Then the intra-tuple alignment would be unpredictable. The OID trick doesn't work very well either. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs [EMAIL PROTECTED] writes: We've changed the on-disk database format in 8.3, so we have an opportunity to change other things also. There is a patch thats been on the patch queue for some time called numeric508, submitted Dec 2005; I thought that idea had been rejected long since as being an unreasonable reduction in the flexibility of numeric arithmetic. The benefit of the patch is that it reduces each NUMERIC value by 2 bytes, so will speed up things considerably. This is now especially important if we are looking to reduce the speed of numeric division by a factor of 4 (recent WIP patch). The only way that patch would make division faster is by making the slowest cases (longest input numbers) impossible. Which is hardly a solution. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. Regards, Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 16:56 +0200, Andreas Pflug wrote: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. Thats the current limit; I agree, but I didn't choose it. IIRC if you don't specify a limit then you can have arbitrarily long numbers. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 10:54 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: We've changed the on-disk database format in 8.3, so we have an opportunity to change other things also. There is a patch thats been on the patch queue for some time called numeric508, submitted Dec 2005; I thought that idea had been rejected long since as being an unreasonable reduction in the flexibility of numeric arithmetic. I've re-read all of the posts from Dec 2005 and I see everyone in favour of committing the patch, subject to review of whether limitation to 508 was acceptable. For me that was never finalised, at least I can't see a clear decision. The important point is that the patch can be easily modified to overcome these restrictions, a line of thought that was never explored. The current patch can be easily modified to allow 1000 digits, yet still saving 2 bytes per value when storing more common values. Or we can use the same techniques to represent arbitrary number of digits, again still saving 2 bytes on common values. We have the ground work done, we just need to agree further changes. The benefit of the patch is that it reduces each NUMERIC value by 2 bytes, so will speed up things considerably. This is now especially important if we are looking to reduce the speed of numeric division by a factor of 4 (recent WIP patch). The only way that patch would make division faster is by making the slowest cases (longest input numbers) impossible. Which is hardly a solution. It isn't directly related to division, but is a speed up nonetheless which can offset the loss. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug wrote: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. This 1000 is just a restriction on the typmod of numeric. You can still use a much higher number of digits, if you use unconstrained numeric: test= create table test (n numeric); CREATE TABLE test= insert into test values (10::numeric ^ 9); INSERT 0 1 test= select length(n) from test; length 100017 (1 row) Best Regards Michael Paesold ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug [EMAIL PROTECTED] writes: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. As per the recent discussion about factorial, the current limit of numeric format is 10^131071 --- there is a whole lot of daylight between that and 10^508. I had a thought though: it's possible to reduce the header overhead for typical-size numbers without giving up the ability to store large ones. This is because the POS/NEG/NAN sign possibilities leave one unused bit pattern. Hence: 1. Switch the positions of the n_sign_dscale and n_weight fields in the long format, so that the sign bits are in the first word. 2. Reserve the fourth sign bit pattern to denote a compressed-header format in which there's just one uint16 header word and the NumericDigits start right after that. The header word could contain: 2 bits: sign distinguishing this from the two-word-header format 1 bit: actual number sign (POS or NEG, disallow NaN) 6 bits: weight, room for -32 .. 31 7 bits: dscale, room for 0 .. 127 3. When packing a NumericVar into a Numeric, use this short format when it's not a NaN and the weight and dscale are in range, else use the long format. Since the weight is in base-1 digits, this bit allocation allows a dynamic range of about +- 10^127 which fits well with the dscale range. But I suspect that most of the use-cases for long numerics involve large integers, so it might be more useful to shave another bit or two from dscale and give 'em to weight. In any case, no capability is lost, unlike the original proposal; and this would be much less invasive than the original patch since there's no need to play tricks with the content of the digit array. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. As per the recent discussion about factorial, the current limit of numeric format is 10^131071 --- there is a whole lot of daylight between that and 10^508. I had a thought though: it's possible to reduce the header overhead for typical-size numbers without giving up the ability to store large ones. This is because the POS/NEG/NAN sign possibilities leave one unused bit pattern. Hence: 1. Switch the positions of the n_sign_dscale and n_weight fields in the long format, so that the sign bits are in the first word. 2. Reserve the fourth sign bit pattern to denote a compressed-header format in which there's just one uint16 header word and the NumericDigits start right after that. The header word could contain: 2 bits: sign distinguishing this from the two-word-header format 1 bit: actual number sign (POS or NEG, disallow NaN) 6 bits: weight, room for -32 .. 31 7 bits: dscale, room for 0 .. 127 3. When packing a NumericVar into a Numeric, use this short format when it's not a NaN and the weight and dscale are in range, else use the long format. Since the weight is in base-1 digits, this bit allocation allows a dynamic range of about +- 10^127 which fits well with the dscale range. But I suspect that most of the use-cases for long numerics involve large integers, so it might be more useful to shave another bit or two from dscale and give 'em to weight. In any case, no capability is lost, unlike the original proposal; and this would be much less invasive than the original patch since there's no need to play tricks with the content of the digit array. I wonder if the currently waiting patch isn't Good Enough for 999. % of use cases, and all others can use numeric instead of numeric(1000,800) or so. Especially since there are many patches waiting that do need further investigation and refining. Regards, Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 11:32 -0400, Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: Simon Riggs wrote: The objections to applying this patch originally were: 2. it would restrict number of digits to 508 and there are allegedly some people that want to store 508 digits. If 508 digits are not enough, are1000 digits be sufficient? Both limits appear quite arbitrary to me. As per the recent discussion about factorial, the current limit of numeric format is 10^131071 --- there is a whole lot of daylight between that and 10^508. I had a thought though: it's possible to reduce the header overhead for typical-size numbers without giving up the ability to store large ones. This is because the POS/NEG/NAN sign possibilities leave one unused bit pattern. Hence: 1. Switch the positions of the n_sign_dscale and n_weight fields in the long format, so that the sign bits are in the first word. 2. Reserve the fourth sign bit pattern to denote a compressed-header format in which there's just one uint16 header word and the NumericDigits start right after that. The header word could contain: 2 bits: sign distinguishing this from the two-word-header format 1 bit: actual number sign (POS or NEG, disallow NaN) 6 bits: weight, room for -32 .. 31 7 bits: dscale, room for 0 .. 127 3. When packing a NumericVar into a Numeric, use this short format when it's not a NaN and the weight and dscale are in range, else use the long format. Since the weight is in base-1 digits, this bit allocation allows a dynamic range of about +- 10^127 which fits well with the dscale range. But I suspect that most of the use-cases for long numerics involve large integers, so it might be more useful to shave another bit or two from dscale and give 'em to weight. In any case, no capability is lost, unlike the original proposal; and this would be much less invasive than the original patch since there's no need to play tricks with the content of the digit array. Sounds good. I thought there'd be a way. Since this is your idea, would you like to do this, or should I? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Tuple alignment
Tom Lane [EMAIL PROTECTED] writes: Heikki Linnakangas [EMAIL PROTECTED] writes: Why do we require that t_hoff is MAXALIGNed? ISTM that if the first field in a tuple doesn't require alignment, it could be stored immediately after the null bitmap, without padding. Then the intra-tuple alignment would be unpredictable. At first I thought that was a killer problem too. But on further thought I could only think of one thing that actually depends on consistent intra-tuple alignment: the cache offset in the tupledesc used by heaptuple.c. It's probably too gross to consider but in fact we could work around this problem. At the limit we could keep 8 cache offsets for the 8 possible alignments of t_hoff. But I think we could be cleverer. If we kept track of the normal offset for the maxaligned t_hoff then any field would only ever be up to 7 bytes offset from that. As you meet fields with larger and larger alignment eventually fields would be back to their normal alignment if you mean a maxaligned field. So the correct place to find any field is cached normal alignment - (t_hoff aligned to largest alignment seen on earlier fields - t_hoff) This depends on an assumption which is true in C but perhaps not for Postgres datatypes: sizeof(datatype) is an integral multiple of alignof(datatype) for all data types. I'm not sure it's worth bothering with this given that the space savings possible is bounded to a fixed alignment per tuple. For tuples with 8 fields or under there's usually going to be no savings at all. I just thought I would get this down and in the mail archives before I forget it. The OID trick doesn't work very well either. expn OID trick? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing NUMERIC size for 8.3
Andreas Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: In any case, no capability is lost, unlike the original proposal; and this would be much less invasive than the original patch since there's no need to play tricks with the content of the digit array. I wonder if the currently waiting patch isn't Good Enough for 999. % of use cases, and all others can use numeric instead of numeric(1000,800) or so. Apparently you misunderstand that patch: it takes capability away from unconstrained numeric too. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 17:49 +0200, Andreas Pflug wrote: I wonder if the currently waiting patch isn't Good Enough for 999. % of use cases, and all others can use numeric instead of numeric(1000,800) or so. Especially since there are many patches waiting that do need further investigation and refining. That still has problems. Another approach would be to restrict the existing datatype NUMERIC to 508 digits, but introduce a new datatype of LONGNUM which allows arbitrary length numerics for those that *do* care. It might be more beneficial in the longer run to separate the use cases so we can further tune them (not sure how just yet...) -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs [EMAIL PROTECTED] writes: Since this is your idea, would you like to do this, or should I? Go for it. I'm not actually convinced this is worth spending time on, as Greg Stark's 1-byte-varlena patch already saved more for typical numerics than this will. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Since this is your idea, would you like to do this, or should I? Go for it. OK I'm not actually convinced this is worth spending time on, as Greg Stark's 1-byte-varlena patch already saved more for typical numerics than this will. Understood, thats why I left this till last. This will save even more on top of those savings. My concern is to make this change now while we can, since we will be aiming for 8.3-8.4 to be a binary upgrade. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Reducing NUMERIC size for 8.3
Tom Lane [EMAIL PROTECTED] writes: I had a thought though: it's possible to reduce the header overhead for typical-size numbers without giving up the ability to store large ones. This is because the POS/NEG/NAN sign possibilities leave one unused bit pattern. Hence: I had a whack and doing something similar to this a while back -- though I was more concerned with optimizing very small numeric variables, especially small integers and typical currency-sized quantities. The bottleneck I ran into was desupporting the macros which access the sign, display scale, etc. Those are currently in numeric.h making them--in theory-- publicly available outside the internal functions. Afaik nothing actively uses them outside of numeric.c though. If we're happy desupporting those macros and not replacing them with anything comparable then there are additional bits of state hidden in the varlena size. Anything shorter than the shortest possible numeric representation can implicitly be interpreted as some alternate compact representation. I already had a patch that stored small integers in a single NumericDigit without any numeric header at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Tuple alignment
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: The OID trick doesn't work very well either. expn OID trick? See htup.h concerning where we stick OID into a tuple that has OID. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Reducing NUMERIC size for 8.3
Gregory Stark [EMAIL PROTECTED] writes: Anything shorter than the shortest possible numeric representation can implicitly be interpreted as some alternate compact representation. I already had a patch that stored small integers in a single NumericDigit without any numeric header at all. That's getting well beyond the realm of reason IMHO. I doubt you can merge it with this proposal anyway --- how will you disambiguate from zero with a positive dscale (0.00)? regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Tom Lane wrote: It seems that time-based GUC variables can be spelled like 1h but not 1hr 1minbut not 1m 1s but not 1sec This is inconsistent and confusing. I don't object to the ones on the left as being the standard spellings for printout, but if we're not going to have a simple uniform rule like shortest possible abbreviation then we ought to accept plausible alternatives on input. I got burnt by this just now because I looked at the autovacuum_naptime setting in postgresql.conf, which is shown as '1min', and figured I could change it to '5sec'. Some random observations: - I was bitten by this too, not long ago, and took me a while to understand why. Should we at least log a HINT or something? - We do allow preffixes in certain cases. For example I can specify a naptime in milliseconds: $ postmaster -c autovacuum_naptime=2000ms and it shows up as 2s in SHOW. However, preffixing with M or K does not work: $ postmaster -c autovacuum_naptime=2Ms FATAL: parameter autovacuum_naptime requires an integer value $ postmaster -c autovacuum_naptime=2Ks FATAL: parameter autovacuum_naptime requires an integer value millihours doesn't seem to work either. - In shared_buffers, these work: 8MB 8 MB These don't work: 8M B 8 M B 8mB 8m 8M I think this means we could safely use m as an abbreviation for minutes, where it is not preffixed by anything else (so mm would not mean milliminutes, nor millimeters). It is not confused with meters because we don't use longitude anywhere in our configuration settings and are not likely to start doing so in the foreseeable future. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Alvaro Herrera [EMAIL PROTECTED] writes: - I was bitten by this too, not long ago, and took me a while to understand why. Should we at least log a HINT or something? Yeah, a HINT listing the allowed spellings of the unit would go a long way here. However, preffixing with M or K does not work: It's case-sensitive. We had that argument already, but I still think this decision was wrong. - In shared_buffers, these work: 8MB 8 MB These don't work: 8M B 8 M B Looking at the code, spaces before the unit are allowed, but not spaces within or after. I agree with disallowing embedded spaces, I think, but not allowing trailing spaces is inconsistent with our practice in other cases (in particular, these very same variables, when written as pure numbers...) regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Am Montag, 18. Juni 2007 16:16 schrieb Tom Lane: It seems that time-based GUC variables can be spelled like 1h but not 1hr 1minbut not 1m 1s but not 1sec The left columns are the standard units. The right columns are just randomly made up AFAICT. If we allow that, what's someone to stop from making up their own set? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
I wrote: Gregory Stark [EMAIL PROTECTED] writes: Anything shorter than the shortest possible numeric representation can implicitly be interpreted as some alternate compact representation. I already had a patch that stored small integers in a single NumericDigit without any numeric header at all. That's getting well beyond the realm of reason IMHO. I doubt you can merge it with this proposal anyway --- how will you disambiguate from zero with a positive dscale (0.00)? Hmm, I take that back: it could be merged with this proposal pretty easily. Again assuming that we move the POS/NEG/NAN/SHORT sign flags to the first header word, we can have: Sign=NAN: it's a NAN, regardless of anything else. We may as well store NAN as just 2 bytes. Sign=SHORT: it's a non-NAN with limited weight and dscale range, as per my proposal. A zero value would still be only 2 bytes, but anything else is longer. (This would be needed only for zero with nonzero dscale, though.) Sign=POS or NEG: if length == 2 bytes then interpret the remaining 14 bits as a single NBASE digit, with assumed weight and dscale 0. This allows storing integers up to +/- in 2 bytes (+ 1 byte varlena header). If length 2 then it is a traditional-format numeric. I'm not entirely convinced this is worth the extra pack/unpack logic, since said integers would be 4 bytes (+1 byte header) in the sign=SHORT representation, which is not that much more. Also, this means at least three different representations of some values, which is going to be a headache for comparison and hashing. But the bit-space is there. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Tsearch vs Snowball, or what's a source file?
Teodor Sigaev [EMAIL PROTECTED] writes: 1) rename FULLTEXT to TEXT SEARCH in SQL command Working on it, I found rather obvious undesired side-effect: if TEXT becomes a keyword then any output of name of text type becomes quoted. Even if TEXT is in unreserved_keyword list. Yeah, I was aware that would happen. What I've been thinking for some time is that we ought to fix quote_ident and ruleutils.c to distinguish unreserved keywords from the other ones, and not quote unreserved ones. The list of unreserved words has gotten longer and more invasive in every release, and I don't see that stopping soon. It's already annoying that perfectly ordinary words like document and key get quoted. The main argument I can think of for not doing this is that if a currently non-reserved keyword becomes reserved in a future release, then having quoted it would prevent problems with restoring dump scripts. I don't find this argument very compelling, though; it seems just as likely that words we don't even have as keywords today will become reserved later. The only thing I see on the horizon that does fit that argument is WITH, which we could special-case. Comments? I'm willing to make this happen if there are no objections. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Peter Eisentraut [EMAIL PROTECTED] writes: Am Montag, 18. Juni 2007 16:16 schrieb Tom Lane: It seems that time-based GUC variables can be spelled like 1h but not 1hr 1minbut not 1m 1s but not 1sec The left columns are the standard units. Standard according to whom? In time-related contexts (eg ISO 8601) I'd expect just h m and s. Since there's no likelihood that anyone would think autovacuum_naptime is measured in meters, I think insisting that it must not be written as 1m is just pedantry. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL
On Jun 18, 2007, at 12:58 AM, Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Christopher Browne wrote: That won't help; that would introduce the embarrassment of having a known default password. No it wouldn't unless the packagers set it up to do that. My point is that when a packager (or source) runs initdb, it would prompt for the postgres user password. Practically every existing packaging of PG tries to run initdb as a hidden, behind-the-scenes, definitely not-interactive procedure. I know there's issues with using ident sameuser via TCP, but what about for filesystem socket connections? As for the interactive/non-interactive, we could just leave that as an option to initdb, and make the default to ask for a password. Packagers would just need to feed the right option to initdb. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Am Montag, 18. Juni 2007 18:16 schrieb Alvaro Herrera: - We do allow preffixes in certain cases. It would certainly be fun to have a general units system, which you could use for configuration and data in general. But that would definitely require that we stay strict on what we allow, or you could do no meaningful things with this in a safe way. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] EXPLAIN omits schema?
Tom Lane wrote: Gregory Stark [EMAIL PROTECTED] writes: Once you have an XML plan what can you do with it? All you can do is parse it into constituent bits and display it. You cant do any sort of comparison between plans, aggregate results, search for plans matching constraints, etc. Sure you can, just not in SQL ;-) Given the amount of trouble we'd have to go to to put the data into a pure SQL format, I don't think that's exactly an ideal answer either. I'm for making the raw EXPLAIN output be in a simple and robust format, which people can then postprocess however they want --- including forcing it into SQL if that's what they want. But just because we're a SQL database doesn't mean we should think SQL is the best answer to every problem. While I'm surely not an XML fanboy, it looks better suited to this problem than a pure relational representation would be. If we are looking into such a format we could even think a bit about including basic plan-influencing information like work_mem, enable_* settings, effective_cache_size,.. there too ... Stefan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Peter Eisentraut [EMAIL PROTECTED] writes: I'm pretty sure a lot of people would initially be confused why anyone would write time in meters, let alone those that might associate it with memory units. In my subjective view (and I acknowledge that we have all been educated in different ways), writing 1m for a time quantity is meaningless and an error. That's an argument for why Postgres maybe shouldn't print times with m for minutes -- though I for one would prefer it. Or why it might not be a particularly good idea for a sysadmin to use m given the choice. But to argue that Postgres should refuse m when presented with it you would have to say there's a substantial chance that the user didn't mean minutes and that there was a risk Postgres would do something bad that outweighs giving users who do want minutes getting what they want. Frankly, I think I see m as an abbreviation for minutes *more* often than min anyways. I see times written as 2h30m quite frequently and then there's precedent like this: $ time echo real0m0.000s user0m0.000s sys 0m0.000s -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Am Montag, 18. Juni 2007 19:03 schrieb Tom Lane: Standard according to whom? ISO 31 a.k.a. SI In time-related contexts (eg ISO 8601) I'd expect just h m and s. ISO 8601 appears to use a slightly different syntax for writing timespans. I would not object if anyone added support for that. Since there's no likelihood that anyone would think autovacuum_naptime is measured in meters, I think insisting that it must not be written as 1m is just pedantry. I'm pretty sure a lot of people would initially be confused why anyone would write time in meters, let alone those that might associate it with memory units. In my subjective view (and I acknowledge that we have all been educated in different ways), writing 1m for a time quantity is meaningless and an error. Standards exist for these things, and we have a fine tradition for choosing standards in favor of randomness. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Reducing NUMERIC size for 8.3
On Mon, 2007-06-18 at 12:44 -0400, Tom Lane wrote: I wrote: Gregory Stark [EMAIL PROTECTED] writes: Anything shorter than the shortest possible numeric representation can implicitly be interpreted as some alternate compact representation. I already had a patch that stored small integers in a single NumericDigit without any numeric header at all. That's getting well beyond the realm of reason IMHO. I doubt you can merge it with this proposal anyway --- how will you disambiguate from zero with a positive dscale (0.00)? Hmm, I take that back: it could be merged with this proposal pretty easily. Again assuming that we move the POS/NEG/NAN/SHORT sign flags to the first header word, we can have: Sign=NAN: it's a NAN, regardless of anything else. We may as well store NAN as just 2 bytes. I'm not entirely convinced this is worth the extra pack/unpack logic, since said integers would be 4 bytes (+1 byte header) in the sign=SHORT representation, which is not that much more. Also, this means at least three different representations of some values, which is going to be a headache for comparison and hashing. But the bit-space is there. I think representing zero as compactly as possible is worth the trouble, so that in itself is enough for me to say Yes. Rest seems good too. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs [EMAIL PROTECTED] writes: I think representing zero as compactly as possible is worth the trouble, Either of these proposals can do that in 2 bytes. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] GUC time unit spelling a bit inconsistent
Tom Lane [EMAIL PROTECTED] writes: It's case-sensitive. We had that argument already, but I still think this decision was wrong. I thought the consensus was that it should change. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing NUMERIC size for 8.3
Simon Riggs wrote: On Mon, 2007-06-18 at 11:55 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Since this is your idea, would you like to do this, or should I? Go for it. OK I'm not actually convinced this is worth spending time on, as Greg Stark's 1-byte-varlena patch already saved more for typical numerics than this will. Understood, thats why I left this till last. This will save even more on top of those savings. My concern is to make this change now while we can, since we will be aiming for 8.3-8.4 to be a binary upgrade. Would someone please explain why we are considering this so far past features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary upgrade was anything more than a pipe dream? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Reducing NUMERIC size for 8.3
Bruce Momjian [EMAIL PROTECTED] writes: Would someone please explain why we are considering this so far past features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary upgrade was anything more than a pipe dream? Well, Greg had left further squeezing of numerics as an open item in his varlena patch, so it's not totally unreasonable to consider a patch for that now --- as long as it's pretty small and simple. I agree that in-place upgrade is a pipe dream until we see someone actually step forward and do the work to build a usable pg_upgrade utility. If nothing else, we should consider swapping the n_sign_dscale and n_weight fields now, since that would enable upward-compatible implementation of these ideas later. Otherwise any such patch would probably get rejected if pg_upgrade did happen to emerge out of nowhere. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing NUMERIC size for 8.3
Bruce Momjian [EMAIL PROTECTED] writes: Would someone please explain why we are considering this so far past features freeze, and who suggtested that the 8.3-8.4 upgrade being a binary upgrade was anything more than a pipe dream? Simon just updated a patch he had originally submitted over a year ago. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] PostgreSQL Developer needed in San Diego
We currently have a need for a PostgreSQL Developer to fill a permanent requirement with our company here in San Diego. Details below. Please contact Lance Perry at 858-550-1658 or [EMAIL PROTECTED] for further info. Title: SQL Server/PostgreSQL Database Developer This position involves creating tables, views, functions and stored procedures in SQL Server and PostgreSQL to support front end OLTP and reporting applications. The ideal developer will have thorough knowledge of Transact-SQL, and extensive experience with complex stored procedures, code optimization, and index tuning. Ideal candidate will have the following qualifications: * Prefer 5 years of database development with SQL Server 7.0/2000/2005 and Enterprise Manager/Query Analyzer * Significant background in creating complex stored procedures and SQL scripts with Transact-SQL * Significant understanding of PostgreSQL * Understanding of database normalization concepts * Some experience in logical and physical database design and implementation * Prior experience working in a project oriented environment and meeting deadlines under tight time constraints * Strong analytical skills * Capable of working independently with minimal supervision We are proud to be an EEO/AA employer M/F/D/V. We maintain a drug-free workplace and perform pre-employment substance abuse testing. Lance Perry Technology Recruiter San Diego, CA 858.550.1658 direct 858.552.9071 fax [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] www.kforce.com http://www.kforce.com/ Great People = Great ResultsSM Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and/or privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
Re: [HACKERS] PostgreSQL Developer needed in San Diego
On 6/18/07, Perry, Lance [EMAIL PROTECTED] wrote: We currently have a need for a PostgreSQL Developer to fill a permanent requirement with our company here in San Diego. In the future, please post to pgsql-jobs. Thank you. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq