Re: [HACKERS] Call for 7.5 feature completion
On Tue, 2004-05-18 at 09:30, Andrew Sullivan wrote:
> On Sun, May 16, 2004 at 02:46:38PM -0400, Jan Wieck wrote:
> > Do you think it is a good sign, for those who have the fact that
> > checkpoints, vacuum runs and pg_dumps bog down their machines to the
> > state where simple queries take several seconds, to care that much
> > for any Win32 port?
>
> Yes. I am one such person, but from the marketing side of things, I
> understand perfectly well what failing to deliver on Win32 again will
> cost. It's not important to _me_, but that doesn't mean it's
> unimportant to the project.

My primary fear about delivering Win32 with all of these other great features is that, IMO, there is a higher level of risk associated with these advanced features. At the same time, this will be the first trial for many Win32 users. Should there be problems in general, or worse, problems specific to Win32 users related to these new features, it could cost some serious Win32/PostgreSQL community points. A troubled release experienced by Win32 users is going to be a battle cry for MySQL.

I've been quietly hoping that these great new features would become available a release before Win32 was completed. That way, a shakedown would occur before the Win32 audience got hold of it. Which, in turn, should make for a great Win32 experience.

I guess my point is, if these other features don't make it into 7.5 and Win32 does, that might still be a good thing for the potential Win32 market. Granted, if I were a developer on one of these big features and it didn't make it, I too would be fairly disappointed. Yet, once we get a release out with Win32, it should help give everyone a feel for the ability to actually support this new audience and platform. If there is a large influx of users compounded by problems, I suspect it's again going to reflect poorly on the PostgreSQL community.
...just some ramblings

-- 
Greg Copeland, Owner [EMAIL PROTECTED]
Copeland Computer Consulting
940.206.8004

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Call for 7.5 feature completion
From the FAQ (http://www.drbd.org/316.html):

> Q: Can XFS be used with DRBD?
> A: XFS uses dynamic block size, thus DRBD 0.7 or later is needed.

Hope we're talking about the same project. ;)

Cheers!

On Tue, 2004-05-18 at 00:16, Mario Weilguni wrote:
> > Well that seems to be part of the problem. ext3 does not scale well
> > at all under load. You should probably upgrade to a better FS (like
> > XFS). I am not saying that your point isn't valid (it is) but
> > upgrading to a better FS will help you.
>
> Thanks for the info, but I've already noticed that. XFS is no option
> since it does not work with drbd, but jfs seems to be quite good.
>
> Regards,
> Mario Weilguni

-- 
Greg Copeland, Owner [EMAIL PROTECTED]
Copeland Computer Consulting
940.206.8004
Re: [HACKERS] Detecting corrupted pages earlier
On Mon, 2003-02-17 at 22:04, Tom Lane wrote:
> Curt Sampson [EMAIL PROTECTED] writes:
> > On Mon, 17 Feb 2003, Tom Lane wrote:
> > > Postgres has a bad habit of becoming very confused if the page
> > > header of a page on disk has become corrupted.
> >
> > What typically causes this corruption?
>
> Well, I'd like to know that too. I have seen some cases that were
> identified as hardware problems (disk wrote data to wrong sector, RAM
> dropped some bits, etc). I'm not convinced that that's the whole
> story, but I have nothing to chew on that could lead to identifying a
> software bug.
>
> > If it's any kind of a serious problem, maybe it would be worth
> > keeping a CRC of the header at the end of the page somewhere.
>
> See past discussions about keeping CRCs of page contents. Ultimately I
> think it's a significant expenditure of CPU for very marginal returns
> --- the layers underneath us are supposed to keep their own CRCs or
> other cross-checks, and a very substantial chunk of the problem seems
> to be bad RAM, against which occasional software CRC checks aren't
> especially useful.

This is exactly why magic numbers or simple algorithmic bit patterns are commonly used. If the magic number or bit pattern doesn't match for its page, you know something is wrong. Storage cost tends to be slight and CPU overhead low. I agree with you that a CRC seems overkill for little return.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
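A minimal sketch of the kind of cheap header sanity check being described, assuming a hypothetical page layout (the field offsets and magic value here are illustrative, not PostgreSQL's actual page format):

```python
import struct

PAGE_SIZE = 8192
PAGE_MAGIC = 0xDBDB  # hypothetical magic value stamped into every page header

def make_page(page_number: int, payload: bytes) -> bytes:
    """Build a page whose header stores a magic value and its own page number."""
    header = struct.pack("<HI", PAGE_MAGIC, page_number)
    return header + payload.ljust(PAGE_SIZE - len(header), b"\x00")

def page_looks_sane(page: bytes, expected_page_number: int) -> bool:
    """Cheap header check: the magic matches and the stored page number is
    the one we expected to read. This catches a write misdirected to the
    wrong sector without the cost of a full-page CRC."""
    magic, stored_no = struct.unpack_from("<HI", page)
    return magic == PAGE_MAGIC and stored_no == expected_page_number

page = make_page(7, b"tuple data")
assert page_looks_sane(page, 7)      # correct page read from the correct slot
assert not page_looks_sane(page, 8)  # page landed at the wrong location
```

Note that, as Tom's point about bad RAM implies, a check like this only detects corruption that disturbs the header fields; it says nothing about bit flips in the payload.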
Re: [HACKERS] Incremental backup
On Fri, 2003-02-14 at 06:52, Bruce Momjian wrote:
> OK, once we have PITR, will anyone want incremental backups?
>
> ---
>
> Martin Marques wrote:
> > On Jue 13 Feb 2003 16:38, Bruce Momjian wrote:
> > > Patrick Macdonald wrote:
> > > > Bruce Momjian wrote:
> > > > > Someone at Red Hat is working on point-in-time recovery, also
> > > > > known as incremental backups.
> > > >
> > > > PITR and incremental backup are different beasts. PITR deals with
> > > > a backup + logs. Incremental backup deals with a full backup + X
> > > > smaller/incremental backups. So... it doesn't look like anyone is
> > > > working on incremental backup at the moment.
> > >
> > > But why would someone want incremental backups compared to PITR?
> > > The backup would be a mixture of INSERTs, UPDATEs, and DELETEs,
> > > right? Seems pretty weird. :-)
> >
> > Good backup systems, such as Informix's (it's the one I used), don't
> > do a query backup, but a page backup. What I mean is that it looks
> > for pages in the system that have changed since the last full backup
> > and backs them up. That's how an incremental backup works. PITR is
> > another thing, which is even more important. :-)

I do imagine for some people it will register high on their list.

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
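The page-level incremental backup Martin describes can be sketched in a few lines: hash each fixed-size page during the full backup, then copy only the pages whose hashes have changed since. All names and the page size here are illustrative, not any product's actual scheme:

```python
import hashlib

PAGE_SIZE = 4096

def pages(data: bytes):
    """Yield (page_number, page_bytes) for each fixed-size page."""
    for off in range(0, len(data), PAGE_SIZE):
        yield off // PAGE_SIZE, data[off:off + PAGE_SIZE]

def full_backup(data: bytes) -> dict:
    """Record a hash per page alongside the full copy."""
    return {no: hashlib.sha1(pg).hexdigest() for no, pg in pages(data)}

def incremental_backup(data: bytes, baseline: dict) -> dict:
    """Copy only the pages whose content differs from the full backup."""
    return {no: pg for no, pg in pages(data)
            if baseline.get(no) != hashlib.sha1(pg).hexdigest()}

store = bytearray(PAGE_SIZE * 4)          # four zeroed pages
baseline = full_backup(bytes(store))
store[PAGE_SIZE * 2] = 0xFF               # dirty a single page
changed = incremental_backup(bytes(store), baseline)
assert list(changed) == [2]               # only page 2 needs to be copied
```

This also makes the contrast with PITR concrete: the incremental copies physical pages as of backup time, while PITR replays a log to reach an arbitrary point between backups.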
Re: [HACKERS] Changing the default configuration
On Wed, 2003-02-12 at 10:36, Robert Treat wrote:
> On Tue, 2003-02-11 at 21:00, Tatsuo Ishii wrote:
> > > while 200 may seem high, 32 definitely seems low. So, what IS a
> > > good compromise? For this and ALL the other settings that should
> > > probably be a bit higher. I'm guessing sort_mem of 4 or 8 meg hits
> > > the knee for most folks, and the max fsm settings Tom has suggested
> > > make sense.
> >
> > 32 is not too low if the number of kernel file descriptors is not
> > increased. Beware that running out of kernel file descriptors is a
> > serious problem for the entire system, not only for PostgreSQL.
>
> Had this happen at a previous employer, and it definitely is bad. I
> believe we had to do a reboot to clear it up. And we saw the problem a
> couple of times since the sysadmin wasn't able to deduce what had
> happened the first time we got it. IIRC the problem hit somewhere
> around 150 connections, so we ran with 128 max. I think this is a safe
> number on most servers these days (running Linux at least), though out
> of the box I might be more inclined to limit it to 64.

If you do hit a file descriptor problem, *you are hosed*. That does seem like a more reasonable upper limit. I would rather see people have to knowingly increase the limit rather than bump into system upper limits and start scratching their heads trying to figure out what the heck is going on.

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
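Before raising a connection limit, the relevant ceilings can at least be inspected. A small sketch, assuming a unix-like host (the procfs path is Linux-specific, and the per-backend descriptor count is an illustrative guess, not a measured figure):

```python
import resource

# Per-process limit on open file descriptors (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"per-process fd limit: soft={soft} hard={hard}")

# System-wide limit, if procfs is available (Linux only).
try:
    with open("/proc/sys/fs/file-max") as f:
        print("system-wide fd limit:", f.read().strip())
except OSError:
    pass  # not Linux, or procfs unavailable

# Rough sanity check before raising max_connections: assume each backend
# holds some number of descriptors open (~64 here, purely illustrative).
FDS_PER_BACKEND = 64
if soft != resource.RLIM_INFINITY:
    print("backends the soft limit could cover:", soft // FDS_PER_BACKEND)
```

The point of the thread stands either way: exhausting the *kernel-wide* table hurts every process on the box, which is why bumping into it is so much worse than a refused connection.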
Re: [HACKERS] PGP signing release
On Tue, 2003-02-11 at 20:17, Bruce Momjian wrote:
> I hate to poo-poo this, but this web of trust sounds more like a web
> of confusion. I liked the idea of mentioning the MD5 in the email
> announcement. It doesn't require much extra work, and doesn't require
> a 'web of %$*' to be set up to check things. Yea, it isn't as secure
> as going through the motions, but if someone breaks into that FTP
> server and changes the tarball and MD5 file, we have much bigger
> problems than someone modifying the tarballs; our CVS is on that
> machine too.
>
> ---
>
> Greg Copeland wrote:
> > On Tue, 2003-02-11 at 18:27, Curt Sampson wrote:
> > > On Wed, 11 Feb 2003, Greg Copeland wrote:
> > > > On Wed, 2003-02-05 at 18:53, Curt Sampson wrote:
> > > > > [Re: everybody sharing a single key]
> > > >
> > > > This issue doesn't change regardless of the mechanism you pick.
> > > > Anyone that is signing a key must take reasonable measures to
> > > > ensure the protection of their key.
> > >
> > > Right. Which is why you really want to use separate keys: you can
> > > determine who compromised a key if it is compromised, and you can
> > > revoke one without having to revoke all of them. Which pretty much
> > > inevitably leads you to just having the developers use their own
> > > personal keys to sign the release.
> > >
> > > > Basically, you are saying: You trust a core developer. You trust
> > > > they can protect their keys. You trust they can properly
> > > > distribute their trust. You don't trust a core developer with a
> > > > key.
> > >
> > > Not at all. I trust core developers with keys, but I see no reason
> > > to weaken the entire system by sharing keys when it's not
> > > necessary. Having each developer sign the release with his own
> > > personal key solves every problem you've brought up.
> > >
> > > cjs
> >
> > You need to keep in mind, I've not been advocating, rather,
> > clarifying. The point being, having a shared key between trusted
> > core developers is hardly an additional risk. After all, either they
> > can be trusted or they can't. At this point, I think we both
> > understand where the other stands. Either we agree or agree to
> > disagree.
> >
> > The next step is for the developers to adopt which path they prefer
> > to enforce and to ensure they have the tools and knowledge at hand
> > to support it. Anyone know if Tom and Bruce know each other well
> > enough to sign each other's keys outright, via phone, or via phone
> > and snail-mail? That would get us off to an excellent start.

Bruce,

Since you just got back in town, I'm not sure if you've been able to follow the thread or not. Just the same, I wanted to remind you that using MD5 is not a security mechanism of any worth. As such, this thread was an effort to add a layer of authenticity. Again, this is not something that MD5 is going to provide for, now or in the future.

If it sounds confusing, it's only because you've never done it. Honestly, once you take the 20 minutes to do it the first time, you'll understand what's going on. Beyond that, you won't have to sign additional keys until you can validate them or as they expire. It only takes minutes once you understand what's going on after that. The time to actually sign packages is more or less the same as creating your hashes.

Lastly, don't forget that your site is mirrored all over the place. As such, you're not the only place open to attack. Just because you have additional software running on this box is no reason to throw your hands in the air and say, "I don't care." The simple fact is, it only takes one site becoming compromised to significantly affect PostgreSQL's reputation. And that site doesn't have to be yours. If it's an official mirror, it reflects (oh... a pun!) accordingly on the project.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
Re: [HACKERS] PGP signing releases
Well said. I'm glad someone else is willing to take a stab at addressing these issues, since I've been down with the flu. Thanks, Greg.

As both Gregs have pointed out, hashes and checksums alone should only be used as an integrity check. They are not a viable security mechanism. A hash does not provide for authentication and, even more importantly, verification of authentication. These concepts are key to creating a secure environment.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting

On Mon, 2003-02-10 at 21:57, [EMAIL PROTECTED] wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> > So you put the MD5 sum into the release announcement email. That is
> > downloaded by many people and also archived in many distributed
> > places that we don't control, so it would be very hard to tamper
> > with. ISTM that this gives you the same result as a PGP signature
> > but with much less administrative overhead.
>
> Not the same results. For one thing, the mailing announcement may be
> archived on Google, but asking people to search Google for an MD5 sum
> as they download the tarball is hardly feasible. Second, it still does
> not prevent someone from breaking into the server and replacing the
> tarball with their own version, and their own MD5 checksum. Or maybe
> just one of the mirrors. Users are not going to know to compare that
> MD5 with versions on the web somewhere. Third, it does not allow a
> positive history to be built up due to signing many releases over
> time. With PGP, someone can be assured that the 9.1 tarball they just
> downloaded was signed by the same key that signed the 7.3 tarball
> they've been using for 2 years. Fourth, only with PGP can you trace
> your key to the one that signed the tarball, an additional level of
> security. MD5 provides an integrity check only. Any security it
> affords (such as storing the MD5 sum elsewhere) is trivial and should
> not be considered when using PGP is standard, easy to implement, and
> has none of MD5's weaknesses.
> - --
> Greg Sabino Mullane [EMAIL PROTECTED]
> PGP Key: 0x14964AC8 200302102250
>
> -BEGIN PGP SIGNATURE-
> Comment: http://www.turnstep.com/pgp.html
>
> iD8DBQE+SA4AvJuQZxSWSsgRAhenAKDu0vlUBC5Eodyt2OxTG6el++BJZACguR2i
> GGLAzhtA7Tt9w4RUYXY4g2U=
> =3ryu
> -END PGP SIGNATURE-
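The integrity-versus-authenticity distinction argued above can be shown in a few lines: whoever can replace the tarball can also recompute and republish its MD5, but cannot forge a tag bound to a secret key. HMAC stands in here for a PGP signature, and all the byte strings are made up; this is an illustrative sketch, not the project's actual release process:

```python
import hashlib
import hmac

release = b"postgresql-7.3.tar.gz contents"
tampered = b"postgresql-7.3.tar.gz contents + backdoor"

# A published MD5 sum is only an integrity check: the attacker who
# replaces the tarball simply publishes a new sum that matches it.
attacker_sum = hashlib.md5(tampered).hexdigest()
assert attacker_sum == hashlib.md5(tampered).hexdigest()  # "verifies" fine

# A tag bound to a secret key is different: without that key, no valid
# tag can be produced for the tampered file.
signing_key = b"held only by the release signer"
good_tag = hmac.new(signing_key, release, hashlib.sha1).hexdigest()
forged = hmac.new(b"attacker guess", tampered, hashlib.sha1).hexdigest()
real_tag_for_tampered = hmac.new(signing_key, tampered, hashlib.sha1).hexdigest()
assert forged != real_tag_for_tampered   # forgery fails verification
assert good_tag != real_tag_for_tampered # and tampering breaks the good tag
```

Public-key signatures add the web-of-trust and revocation properties discussed in the thread on top of this basic keyed-verification idea.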
Re: [HACKERS] PostgreSQL Benchmarks
On Tue, 2003-02-11 at 08:26, Christopher Kings-Lynne wrote:
> Hrm. I just saw that the PHP ADODB guy just published a bunch of
> database benchmarks. It's fairly evident to me that benchmarking
> PostgreSQL on Win32 isn't really fair:
> http://php.weblogs.com/oracle_mysql_performance

*sigh*

How much of the performance difference is from the RDBMS, how much from the middleware, and how much from the quality of the middleware implementation? While I'm not surprised that the cygwin version of PostgreSQL is slow, those results don't tell me anything about the quality of the middleware interface between PHP and PostgreSQL. Does anyone know if we can rule out some of the performance loss by pinning it on a bad middleware implementation for PostgreSQL?

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
Re: [HACKERS] PostgreSQL Benchmarks
On Tue, 2003-02-11 at 08:31, Mario Weilguni wrote:
> > Hrm. I just saw that the PHP ADODB guy just published a bunch of
> > database benchmarks. It's fairly evident to me that benchmarking
> > PostgreSQL on Win32 isn't really fair:
> > http://php.weblogs.com/oracle_mysql_performance
>
> And why is the highly advocated transaction-capable MySQL 4 not
> tested? That's the problem: for every performance test they choose
> ISAM tables, and when transactions are mentioned it's said "MySQL has
> transactions". But why no benchmarks?

Insert Statement, not using bind variables (MySQL and Oracle): $DB->BeginTrans();
Insert Statement, using bind variables: $DB->BeginTrans();
PL/SQL Insert Benchmark: appears to not initiate a transaction.

I'm assuming this is because it's implicitly within a transaction? Oddly enough, I am seeing explicit commits here. It appears that the benchmarks are attempting to use transactions; however, I have no idea if MySQL's HEAP supports them. For all I know, transactions are being silently ignored.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
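For reference, the bind-variable distinction being benchmarked looks like this in generic DB-API terms. sqlite3 is used here purely as a self-contained stand-in for the databases in the benchmark, and the table name is made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bench (id INTEGER, payload TEXT)")

# Without bind variables: the SQL text differs for every row, so the
# server must parse each statement separately.
with conn:  # opens a transaction, commits on success
    for i in range(100):
        conn.execute(f"INSERT INTO bench VALUES ({i}, 'row {i}')")

# With bind variables: one statement, parsed once, executed many times.
with conn:
    conn.executemany(
        "INSERT INTO bench VALUES (?, ?)",
        [(i, f"row {i}") for i in range(100)],
    )

count = conn.execute("SELECT count(*) FROM bench").fetchone()[0]
assert count == 200
```

Whether a given engine honors the surrounding transaction, of course, is exactly the question raised above about MySQL's non-transactional table types, which may silently ignore BEGIN/COMMIT.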
Re: [HACKERS] PGP signing releases
On Wed, 2003-02-05 at 18:53, Curt Sampson wrote:
> On Thu, 5 Feb 2003, Greg Copeland wrote:
> > > Who will actually hold the key? Where will it be physically kept?
> >
> > Good question but can usually be addressed.
>
> It can be addressed, but how well? This is another big issue that I
> don't see any plan for that I'm comfortable with.

The reason I was vague is because it depends on the key route. Obviously, if each person signs, each person must protect their own key. If there is a central project key, it's simply a matter of determining which box is used for signing, etc. While important, it's certainly not difficult to address.

> It seems to me extremely difficult to address. Unless you are
> physically monitoring someone, how do you prevent someone from copying
> the key off of that machine? At which point anybody with the
> passphrase can use it for anything.

This issue doesn't change regardless of the mechanism you pick. Anyone that is signing a key must take reasonable measures to ensure the protection of their key.

> > > How many people will know the passphrase?
> >
> > As few as possible. Ideally only two, maybe three core developers.
>
> Um... I'm not sure that this is a relevant question at all. The
> passphrase is not part of the key; it's just used to encrypt the key
> for storage. If you know the passphrase, you can make unlimited copies
> of the key, and these copies can be protected with any passphrases you
> like, or no passphrase, for that matter.

If you're concerned about this to that extent, clearly those people should not be part of the web of trust, nor should they be receiving the passphrase or a copy of the private key. Remember, trust is a key (pun intended) part of a reliable PKI.

> In that case, I would trust only one person with the key.
> Making copies of the key for others gives no additional protection
> (since it takes only one person out of the group to sign the release)
> while it increases the chance of key compromise (since there are now
> more copies of the key kicking around, and more people who know the
> passphrase).

Which brings us back to backups. Should the one person that has the key be unavailable or dead, who will sign the release? Furthermore, making *limited* copies of the key does provide for additional limited protection in case it's lost for some reason. This helps put off use of the revocation key until it's absolutely required. It also provides for backup (of key and people).

Basically, you are saying: You trust a core developer. You trust they can protect their keys. You trust they can properly distribute their trust. You don't trust a core developer with a key. Hmmm... something smells in your web of trust. So, which is it? Do you trust the core developers to protect the interests of the project and the associated key or not? If not, why trust any digital signature from them in the first place? Can't stress this enough: PKI is an absolute failure without trust. Period.

> Keys cannot be transferred from one person to another since, being
> digital data, there's no way to ascertain that the original holder
> does not still (on purpose or inadvertently) have copies of the key.
> So in the case where we want to transfer trust from one person to
> another, we must also generate a new key and revoke the old one.

No one is talking about transferring keys. In fact, I've previously addressed this topic, from a different angle, a number of times. We are talking about shared trust, not transferred trust. The transferring of trust is done by signing keys, not transferring keys.

> This is now exactly equivalent to having each developer sign postgres
> with a signing key (signed by his main key) for which the other
> developers (or appropriate authority) have a revocation certificate.
> And back to the passphrase issue: once again, can't you see that it's
> completely irrelevant? At some point, someone who knows the passphrase
> is going to have to be in a position to use that to decrypt the key.
> At that point he has the key, period. Changing the passphrase does no
> good, because you can't change the passphrase on the copy of the key
> he may have made.

So you trust the core developer to sign the package but you don't trust him to have the key that's required to sign it? You can't have it both ways.

> A passphrase is like a lock on your barn door. After you've given
> someone the key and he's gone in and taken the cow, changing the lock
> gives you no protection at all.

I can assure you I fully understand the implications and meaning of everything I've said.

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
Re: [HACKERS] Changing the default configuration (was Re: [pgsql-advocacy]
On Tue, 2003-02-11 at 10:20, Tom Lane wrote:
> Merlin Moncure [EMAIL PROTECTED] writes:
> > May I make a suggestion that maybe it is time to start thinking
> > about tuning the default config file? IMHO it's just a little bit
> > too conservative.
>
> It's a lot too conservative. I've been thinking for awhile that we
> should adjust the defaults.
>
> The original motivation for setting shared_buffers = 64 was so that
> Postgres would start out-of-the-box on machines where SHMMAX is 1 meg
> (64 buffers = 1/2 meg, leaving 1/2 meg for our other shared data
> structures). At one time SHMMAX=1M was a pretty common stock kernel
> setting. But our other data structures blew past the 1/2 meg mark some
> time ago; at default settings the shmem request is now close to 1.5
> meg. So people with SHMMAX=1M have already got to twiddle their
> postgresql.conf settings, or preferably learn how to increase SHMMAX.
> That means there is *no* defensible reason anymore for defaulting to
> 64 buffers.
>
> We could retarget to try to stay under SHMMAX=4M, which I think is the
> next boundary that's significant in terms of real-world platforms
> (isn't that the default SHMMAX on some BSDen?). That would allow us
> 350 or so shared_buffers, which is better, but still not really a
> serious choice for production work.
>
> What I would really like to do is set the default shared_buffers to
> 1000. That would be 8 meg worth of shared buffer space. Coupled with
> more-realistic settings for FSM size, we'd probably be talking a
> shared memory request approaching 16 meg. This is not enough RAM to
> bother any modern machine from a performance standpoint, but there are
> probably quite a few platforms out there that would need an increase
> in their stock SHMMAX kernel setting before they'd take it.
>
> So what this comes down to is making it harder for people to get
> Postgres running for the first time, versus making it more likely that
> they'll see decent performance when they do get it running.
>
> It's worth noting that increasing SHMMAX is not nearly as painful as
> it was back when these decisions were taken. Most people have moved to
> platforms where it doesn't even take a kernel rebuild, and we've
> acquired documentation that tells how to do it on all(?) our supported
> platforms. So I think it might be okay to expect people to do it.
>
> The alternative approach is to leave the settings where they are, and
> to try to put more emphasis in the documentation on the fact that the
> factory-default settings produce a toy configuration that you *must*
> adjust upward for decent performance. But we've not had a lot of
> success spreading that word, I think. With SHMMAX too small, you do at
> least get a pretty specific error message telling you so.
>
> Comments?

I'd personally rather have people stumble trying to get PostgreSQL running, up front, than allow the lowest common denominator to more easily run PostgreSQL only to be disappointed with it and move on.

After it's all said and done, I would rather someone simply say, "it's beyond my skill set," and attempt to get help or walk away. That seems better than them being able to run it and say, "it's a dog," spreading word-of-mouth as such after they've left PostgreSQL behind. Worse yet, those that do walk away and claim it performs horribly are probably doing more harm to the PostgreSQL community than expecting someone to be able to install software ever can.

Nutshell: "Easy to install but is horribly slow." or "Took a couple of minutes to configure and it rocks!"

Seems fairly cut-and-dried to me. ;)

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
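Tom's arithmetic can be checked directly. The 8 KB block size is PostgreSQL's default; the overhead figure for "our other shared data structures" is a rough stand-in taken from his ~1.5 meg estimate, not an exact number:

```python
BLCKSZ = 8 * 1024          # default PostgreSQL block size, 8 KB
MB = 1024 * 1024

def shmem_request(shared_buffers: int, overhead_bytes: int) -> int:
    """Rough shared-memory request: buffer pool plus other shared
    structures. overhead_bytes is illustrative, not an exact figure."""
    return shared_buffers * BLCKSZ + overhead_bytes

# Old default: 64 buffers = 1/2 MB of buffer pool.
assert shmem_request(64, 0) == MB // 2

# Proposed default: 1000 buffers = ~8 MB of buffer pool alone.
assert shmem_request(1000, 0) == 1000 * BLCKSZ

# Staying under SHMMAX=4M with ~1.2 MB of other overhead leaves room
# for roughly the "350 or so" buffers mentioned above.
overhead = int(1.2 * MB)
max_buffers = (4 * MB - overhead) // BLCKSZ
assert 330 <= max_buffers <= 370
```

Seen this way, the proposal is less about memory cost (8 to 16 MB is nothing on a modern machine) than about which kernels reject the shmget call at startup.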
Re: [HACKERS] Changing the default configuration (was Re:
On Tue, 2003-02-11 at 11:23, mlw wrote:
> Greg Copeland wrote:
> > I'd personally rather have people stumble trying to get PostgreSQL
> > running, up front, than allow the lowest common denominator to more
> > easily run PostgreSQL only to be disappointed with it and move on.
> > After it's all said and done, I would rather someone simply say,
> > "it's beyond my skill set," and attempt to get help or walk away.
> > That seems better than them being able to run it and say, "it's a
> > dog," spreading word-of-mouth as such after they've left PostgreSQL
> > behind. Worse yet, those that do walk away and claim it performs
> > horribly are probably doing more harm to the PostgreSQL community
> > than expecting someone to be able to install software ever can.
>
> RANT
>
> And that my friends is why PostgreSQL is still relatively obscure.
> This attitude sucks. If you want a product to be used, you must put
> the effort into making it usable.

Ah... okay.

> It is a no-brainer to make the default configuration file suitable for
> the majority of users. It is lunacy to create a default configuration
> which provides poor performance for over 90% of the users, but which
> allows the lowest common denominator to work.

I think you read something into my email which I did not imply. I'm certainly not advocating a default configuration file assuming 512M of shared memory or some such insane value.

Basically, you're arguing that they should keep doing exactly what they are doing. It's currently known to be causing problems and propagating the misconception that PostgreSQL is unable to perform under any circumstance. I'm arguing that who cares if 5% of the potential user base has to learn to properly install software? Either they'll read and learn, ask for assistance, or walk away. All of which are better than a Johnny-come-lately offering up a meaningless benchmark which others are happy to eat with rather large spoons.

> A product must not perform poorly out of the box, period. A good
> product manager would choose one of two possible configurations: (a) a
> high speed fairly optimized system from the get-go, or (b) it does not
> run unless you create the configuration file. Option (c), out of the
> box it works like crap, is not an option.

That's the problem. Option (c) is what we currently have. I'm amazed that you even have a problem with option (a), as that's what I'm suggesting. The problem is, potentially for some minority of users, it may not run out of the box. As such, I'm more than happy with that situation compared to 90% of the user base being stuck with a crappy default configuration. Oddly enough, your option (b) is even worse than what you are ranting at me about. Go figure.

> This is why open source gets such a bad reputation. Outright contempt
> for the user who may not know the product as well as those developing
> it. This attitude really sucks and it turns people off. We want people
> to use PostgreSQL; to do that we must make PostgreSQL usable.
> Usability IS important.
>
> /RANT

There is no contempt here. Clearly you've read your own bias into this thread. If you go back and re-read my posting, I think it's VERY clear that it's entirely about usability.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
Re: [HACKERS] Changing the default configuration (was Re:
On Tue, 2003-02-11 at 12:55, Tom Lane wrote:
> scott.marlowe [EMAIL PROTECTED] writes:
> > Is setting the max connections to something like 200 reasonable, or
> > likely to cause too many problems?
>
> That would likely run into number-of-semaphores limitations (SEMMNI,
> SEMMNS). We do not seem to have as good documentation about changing
> that as we do about changing the SHMMAX setting, so I'm not sure I
> want to buy into the "it's okay to expect people to fix this before
> they can start Postgres the first time" argument here. Also,
> max-connections doesn't silently skew your testing: if you need to
> raise it, you *will* know it.

Besides, I'm not sure that it makes sense to let other product needs dictate the default configuration for this one. It would be one thing if the vast majority of people only used PostgreSQL with Apache. I know I'm using it in environments which in no way relate to the web. I'm thinking I'm not alone.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
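For a sense of why max_connections runs into SEMMNI/SEMMNS: PostgreSQL of this era allocated roughly one System V semaphore per allowed connection, grouped into semaphore sets of a fixed size. The set size of 16 below is an assumption for illustration and should be checked against the platform documentation:

```python
import math

SEMS_PER_SET = 16  # assumed number of semaphores PostgreSQL packs per set

def semaphore_needs(max_connections: int):
    """Estimate semaphore sets (counted against SEMMNI) and total
    semaphores (counted against SEMMNS) for a given max_connections."""
    sets = math.ceil(max_connections / SEMS_PER_SET)
    return sets, sets * SEMS_PER_SET

for conns in (32, 64, 200):
    sets, total = semaphore_needs(conns)
    print(f"max_connections={conns}: ~{sets} sets, ~{total} semaphores")
```

The practical point from Tom's message survives any inaccuracy in the constants: jumping the default from 32 to 200 multiplies the kernel semaphore demand several times over, and a kernel with stock SEMMNI/SEMMNS may simply refuse to start the postmaster.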
Re: [HACKERS] Windows SHMMAX (was: Default configuration)
On Tue, 2003-02-11 at 12:49, Merlin Moncure wrote:
> > Does anyone know whether cygwin has a setting comparable to SHMMAX,
> > and if so what is its default value? How about the upcoming native
> > Windows port --- any issues there?
>
> From a pure win32 point of view, a good approach would be to use the
> VirtualAlloc() memory allocation functions and set up a paged memory
> allocation system. From a very top-down point of view, this is the
> method of choice if portability is not an issue. An abstraction to use
> this technique within pg context is probably complex and requires
> writing lots of win32 API code, which is obviously not desirable.
>
> Another way of looking at it is memory mapped files. This probably
> most closely resembles unix shared memory and is the de facto standard
> way for interprocess memory block sharing. Sadly, performance will
> suffer because you have to rely on the virtual memory system (think:
> writing to files) to do a lot of stupid stuff you don't necessarily
> want or need. The OS has to guarantee that the memory can be swapped
> out to file at any time and therefore mirrors the pagefile to the
> allocated memory blocks.
>
> With the C/C++ malloc/free API, you are supposed to be able to get
> some of the benefits of VirtualAlloc (in particular, setting a process
> memory allocation limit), but personal experience did not bear this
> out. However, this API sits directly over the virtual allocation
> system and is the most portable. The application has to guard against
> fragmentation and things like that in this case. In win32, server
> thrashing is public enemy #1 for database servers, mostly due to the
> virtual allocation system (which is quite fast when used right, btw).

IIRC, there is a mechanism which enables memory mapped files to be backed directly by the pagefile. This is the preferred means of memory mapped files unless you have a specific need which dictates otherwise. Meaning, it allows for many supposed optimizations to be used by the OS, as it is supposed to bypass some of the filesystem overhead.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
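The pagefile-backed mapping being described corresponds to Win32's CreateFileMapping called with INVALID_HANDLE_VALUE instead of a real file handle. Python's mmap module wraps roughly the same idea, which makes for a compact sketch; the tagname is made up, and the non-Windows branch is only a stand-in so the snippet runs anywhere:

```python
import mmap
import sys

SIZE = 1 * 1024 * 1024  # a 1 MB shared region, purely for illustration

if sys.platform == "win32":
    # Backed by the system pagefile and shareable by name across
    # processes (CreateFileMapping with INVALID_HANDLE_VALUE underneath).
    shm = mmap.mmap(-1, SIZE, tagname="pg_shmem_demo")
else:
    # Anonymous mapping as a stand-in on unix-like systems.
    shm = mmap.mmap(-1, SIZE)

shm[:13] = b"shared buffer"
assert shm[:13] == b"shared buffer"
shm.close()
```

Since no named file is involved, such a region never touches the filesystem layer at all; it is paged against the pagefile only when memory pressure demands it, which is the optimization being alluded to above.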
Re: [HACKERS] PGP signing release
On Tue, 2003-02-11 at 18:27, Curt Sampson wrote:
> On Wed, 11 Feb 2003, Greg Copeland wrote:
> > On Wed, 2003-02-05 at 18:53, Curt Sampson wrote:
> > > [Re: everybody sharing a single key]
> >
> > This issue doesn't change regardless of the mechanism you pick.
> > Anyone that is signing a key must take reasonable measures to ensure
> > the protection of their key.
>
> Right. Which is why you really want to use separate keys: you can
> determine who compromised a key if it is compromised, and you can
> revoke one without having to revoke all of them. Which pretty much
> inevitably leads you to just having the developers use their own
> personal keys to sign the release.
>
> > Basically, you are saying: You trust a core developer. You trust
> > they can protect their keys. You trust they can properly distribute
> > their trust. You don't trust a core developer with a key.
>
> Not at all. I trust core developers with keys, but I see no reason to
> weaken the entire system by sharing keys when it's not necessary.
> Having each developer sign the release with his own personal key
> solves every problem you've brought up.
>
> cjs

You need to keep in mind, I've not been advocating, rather, clarifying. The point being, having a shared key between trusted core developers is hardly an additional risk. After all, either they can be trusted or they can't. At this point, I think we both understand where the other stands. Either we agree or agree to disagree.

The next step is for the developers to adopt which path they prefer to enforce and to ensure they have the tools and knowledge at hand to support it. Anyone know if Tom and Bruce know each other well enough to sign each other's keys outright, via phone, or via phone and snail-mail? That would get us off to an excellent start.

Regards,

-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting
Re: [HACKERS] PGP signing releases
On Tue, 2003-02-04 at 18:27, Curt Sampson wrote: On Tue, 2003-02-04 at 16:13, Kurt Roeckx wrote: On Tue, Feb 04, 2003 at 02:04:01PM -0600, Greg Copeland wrote: Even improperly used, digital signatures should never be worse than simple checksums. Having said that, anyone that is trusting checksums as a form of authenticity validation is begging for trouble. Should I point out that a fingerprint is nothing more than a hash? Since someone already mentioned MD5 checksums of tar files versus PGP key fingerprints, perhaps things will become a bit clearer here if I point out that the important point is not that these are both hashes of some data, but that the time and means of acquisition of that hash are entirely different between the two. And that it creates a verifiable chain of entities with direct associations to people and hopefully, email addresses. Meaning, it opens the door for rapid authentication and validation of each entity and associated person involved. Again, something a simple MD5 hash does not do or even allow for. Perhaps even more importantly, it opens the door for rapid detection of corruption in the system thanks to revocation certificates/keys. In turn, allows for rapid repair in the event that the worst is realized. Again, something a simple MD5 does not assist with in the least. Thanks Curt. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PGP signing releases
On Wed, 2003-02-05 at 00:22, Curt Sampson wrote: On Wed, 4 Feb 2003, Greg Copeland wrote: If three people are required to sign a package prior to release, what happens when one of them is unavailable for signing (vacation, hospital, etc). This is one of the reasons why having a single project key which the core developers sign may appear to be easier. I don't see that it makes that much difference. So the release is signed only by, say, only three people instead of four. It's still signed. Note that I said appear to be easier, not that it actually is easier an any meaningful way. Some of the more paranoid will look for consistency from those that sign the package in question. The fact different people sign may be the cause of additional footwork for some. Which probably isn't such a bad thing. Nonetheless, it could be a sign of alarm for a few. One hopes that situations like last week's ousting of one of the core FreeBSD developers are rare but if such a situation were to arise, a shared project key would be Very Bad (tm). If a revocation key has been properly generated (as it should of been), this is not a problem at all. Actually, it is still a problem. Revocations are not reliable in PGP, and there's really no way to make them perfectly reliable in any system, because you've got no way to force the user to check that his cached data (i.e., the key he holds in his keyring) is still valid. This is why we generally expire signing keys and certificates and stuff like that on a regular basis. When a package is released which has a new key signing it, revocation should normally be found fairly quickly. This is especially true if it's included in the package AND from normal PKI routes. Revocation should accompany any packages which later follow until the key in question has expired. This one element alone makes me think that individual signing is a better thing. 
(With individual signing you'd have to compromise several keys before you have to start relying on revocation certificates.) Who will actually hold the key? Where will it be physically kept? Good question but can usually be addressed. It can be addressed, but how well? This is another big issue that I don't see any plan for that I'm comfortable with.. The reason I was vague is because it depends on the key route. Obviously, if each person signs, each person must protect their own key. If there is a central project key, it's simply a matter of determining which box is used for signing, etc...while important, it's certainly not difficult to address. How many people will know the passphrase? As few as possible. Ideally only two, maybe three core developers. Um...I'm not sure that this is a relevant question at all. The passphrase is not part of the key; it's just used to encrypt the key for storage. If you know the passphrase, you can make unlimited copies of the key, and these copies can be protected with any passphrases you like, or no passphrase, for that matter. If you're concerned about this to that extent, clearly those people should not part of the web of trust nor should they be receiving the passphrase nor a copy of the private key. Remember, trust is a key (pun intended) part of a reliable PKI. One could also only allow a single person to hold the passphrase and divide it into parts between two or more. This is commonly done in financial circles. Hm. Splitting the key into parts is a very interesting idea, but I'd be interested to know how you might implement it without requiring everybody to be physically present at signing. cjs I was actually talking about splitting the passphrase, however, splitting the key is certainly possible as well. Having said that, if a private key is shared, it should still be encrypted. As such, a passphrase should still be considered; as will splitting it. 
Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] PostgreSQL, NetBSD and NFS
On Wed, 2003-02-05 at 11:18, Tom Lane wrote: D'Arcy J.M. Cain [EMAIL PROTECTED] writes: On Wednesday 05 February 2003 11:49, Tom Lane wrote: I wonder if it is possible that, every so often, you are losing just the last few bytes of an NFS transfer? Yah, that's kind of what it looked like when I tried this before Christmas too although the actual errors differd. The observed behavior could vary wildly depending on what data happened to get read. Wild thought here: can you reduce the MTU on the LAN linking the NFS server to the NetBSD box? If so, does it help? Tom, I'm curious as to why you think adjusting the MTU may have an effect on this. Lowering the MTU may actually increase fragmentation, lower efficiency, and even exacerbate the situation. Is this purely a diagnostic suggestion? Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PGP signing releases
Comments intermixed below. On Tue, 2003-02-04 at 12:04, Steve Crawford wrote: Having just started working with GPG I shouldn't be considered an expert but it seems to me that each core developer should create a key and should cross-sign each others' keys to form a web of trust to verify the This is a good idea regardless as which key approach is used. Being able to reliably trust a key is only as strong as the associated web of trust. authenticity of those signatures. In any case, I think that if security-related projects like GnuPG and OpenSSH use the individual method then it wouldn't be a bad idea to follow their lead. There are pros and cons associated with each approach. Neither is really better IMO. If three people are required to sign a package prior to release, what happens when one of them is unavailable for signing (vacation, hospital, etc). This is one of the reasons why having a single project key which the core developers sign may appear to be easier. One hopes that situations like last week's ousting of one of the core FreeBSD developers (http://slashdot.org/article.pl?sid=03/02/03/239238mode=threadtid=122tid=156) are rare but if such a situation were to arise, a shared project key would be Very Bad (tm). If a revocation key has been properly generated (as it should of been), this is not a problem at all. The revocation key is quickly shot over to the known key servers and included with the newly generated project key. As people add and confirm the new project key, the old key is automatically revoked. Again, if this is properly handled, it is not much of a problem at all. PKI, by means of revocation keys, specifically addresses this need. If I understand GPG correctly, one can create a detached signature of a document. As such, any or all of the core developers could create and post such a signature and a user could verify against as many signatures as desired to feel secure that the file is good. 
Cheers, Steve IIRC, PGP and GPG both support detached signatures. Who will actually hold the key? Where will it be physically kept? Good question but can usually be addressed. How many people will know the passphrase? As few as possible. Ideally only two, maybe three core developers. One could also only allow a single person to hold the passphrase and divide it into parts between two or more. This is commonly done in financial circles. The exact details will be mostly driven by the key approach that is picked. Who will be responsible for signing the files? Is there a backup person? This is important to make sure that any backup people are properly included in the web of trust from the very beginning. Will it be a signing-only key? What size? Should it expire? Keys should always expire. If you want to allow for one year, two year, or even maybe three years, that's fine. Nonetheless, expiration should always be built in, especially on a project like this where people may be more transient. How is verification of the files before signing accomplished? The person creating the initial package release should also initially sign it. From there, the web of trust for the people signing it can work as designed. Once the initial package has been generated, it should not leave his eyes until it has been signed. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PGP signing releases
On Tue, 2003-02-04 at 12:02, Rod Taylor wrote: On Tue, 2003-02-04 at 12:55, Kurt Roeckx wrote: On Tue, Feb 04, 2003 at 01:35:47PM +0900, Curt Sampson wrote: On Mon, 3 Feb 2003, Kurt Roeckx wrote: I'm not saying md5 is as secure as pgp, not at all, but you can't trust those pgp keys to be the real one either. Sure you can. Just verify that they've been signed by someone you trust. I know how it works, it's just very unlikely I'll ever meet someone so it gives me a good chain. Anyway, I think pgp is good thing to do, just don't assume that it's always better then just md5. Not necessarily better -- but it's always as good as md5. Even improperly used, digital signatures should never be worse than simple checksums. Having said that, anyone that is trusting checksums as a form of authenticity validation is begging for trouble. Checksums are not, in of themselves, a security mechanism. I can't stress this enough. There really isn't any comparison here. Please stop comparing apples and oranges. No matter how hard you try, you can not make orange juice from apples. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PGP signing releases
On Tue, 2003-02-04 at 16:13, Kurt Roeckx wrote: On Tue, Feb 04, 2003 at 02:04:01PM -0600, Greg Copeland wrote: Even improperly used, digital signatures should never be worse than simple checksums. Having said that, anyone that is trusting checksums as a form of authenticity validation is begging for trouble. Should I point out that a fingerprint is nothing more than a hash? You seem to not understand the part where I said, in of themselves. Security is certainly an area of expertise where the devil is in the details. One minor detail can greatly effect the entire picture. You're simply ignoring all the details and looking for obtuse parallels. Continue to do so all you like. It still doesn't effectively and reliably address security in the slightest. Checksums are not, in of themselves, a security mechanism. So a figerprint and all the hash/digest function have no purpose at all? This is just getting silly and bordering on insulting. If you have meaningful comments, please offer them up. Until such time, I have no further comments for you. Obviously, a fingerprint is derivative piece of information which, in of it self, does not validate anything. Thusly, the primary supporting concept is the web of trust, associated process and built in mechanisms to help ensure it all makes sense and maintained in proper context. Something that a simple MD5 checksum does not provide for. Not in the least. A checksum or hash only allows for comparisons between two copies to establish they are the same or different. It, alone, can never reliably be a source of authentication and validation. A checksum or hash, alone, says nothing about who created it, where it came from, how old it is, or whom is available to readily and authoritatively assist in validation of the checksummed (or hashed) entity or the person who created it. I do agree that a checksum (or hash) is better than nothing, however, a serious security solution it is not. Period. 
Feel free to be lulled into complacent comfort. In the mean time, I'll choose a system which actually has a chance at working. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PGP signing releases
On Sun, 2003-02-02 at 20:23, Marc G. Fournier wrote: right, that is why we started to provide md5 checksums ... md5 checksums only validate that the intended package (trojaned or legit) has been properly received. They offer nothing from a security perspective unless the checksums have been signed with a key which can be readily validated from multiple independent sources. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] PGP signing releases
On Mon, 2003-02-03 at 13:55, Kurt Roeckx wrote: On Mon, Feb 03, 2003 at 12:24:14PM -0600, Greg Copeland wrote: On Sun, 2003-02-02 at 20:23, Marc G. Fournier wrote: right, that is why we started to provide md5 checksums ... md5 checksums only validate that the intended package (trojaned or legit) has been properly received. They offer nothing from a security perspective unless the checksums have been signed with a key which can be readily validated from multiple independent sources. If you can get the md5 sum of multiple independent sources, it's about the same thing. It all depends on how much you trust those sources. I'm not saying md5 is as secure as pgp, not at all, but you can't trust those pgp keys to be the real one either. No, that is not the same thing at all. PKI specifically allows for web of trust. Nothing about md5 checksums allows for this. As such, chances are, if a set of md5 checksums have been forged, they will be propagated and presented as being valid even though they are not. I'll say this again. Checksums alone offers zero security protection. It was never intended to address that purpose. As such, it does not address it. If you need security, use a security product. Checksums ONLY purpose is to ensure copy propagation validation. It does not address certification of authenticity in any shape or form. As for trusting the validity of the keys contained within a PKI, that's where the whole concept of web of trust comes into being. You can ignore it and not benefit or you can embrace it, as people are advocating, and leverage it. Validation of keys can be as simple as snail-mail, phone calls, and fingerprint validation. It's that simple. It's why fingerprints exist in the first place. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PGP signing releases
On Mon, 2003-02-03 at 22:35, Curt Sampson wrote: On Mon, 3 Feb 2003, Kurt Roeckx wrote: I'm not saying md5 is as secure as pgp, not at all, but you can't trust those pgp keys to be the real one either. Sure you can. Just verify that they've been signed by someone you trust. For example, next time I happen to run into Bruce Momjian, I hope he'll have his PGP key fingerprint with him. I can a) verify that he's the same guy I who, under the name Bruce Momjian, was giving the seminar I went to last weekend, and b) check his passport ID to see that the U.S. government believes that someone who looks him is indeed Bruce Momjian and a U.S. citizen. That, for me, is enough to trust that he is who he says he is when he gives me the fingerprint. I take that fingerprint back to my computer and verify that the key I downloaded from the MIT keyserver has the same fingerprint. Then I sign that key with my own signature, assigning it an appropriate level of trust. Next time I download a postgres release, I then grab a copy of the postgres release-signing public key, and verify that its private key was used to sign the postgres release, and that it is signed by Bruce's key. Now I have a direct chain of trust that I can evaluate: 1. Do I believe that the person I met was indeed Bruce Momjian? 2. Do I trust him to take care of his own key and be careful signing other keys? 3. Do I trust his opinion that the postgres release-signing key that he signed is indeed valid? 4. Do I trust the holder of the postgres release-signing key to have taken care of the key and have been careful about signing releases with it? Even if you extend this chain by a couple of people, that's trust in a lot fewer people than you're going to need if you want to trust an MD5 signature. cjs And that's the beginning of the web of trust. ;) Worth noting that snail-mail and phone calls can easily play a role in this process as well. 
I think if USPO can play a role in delivering master keys for pin pads used by banks across America and the around the world, surely it's good enough to help propagate key information for signing packages. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] PGP signing releases
On Mon, 2003-02-03 at 22:35, Curt Sampson wrote: 2. Do I trust him to take care of his own key and be careful signing other keys? 3. Do I trust his opinion that the postgres release-signing key that he signed is indeed valid? 4. Do I trust the holder of the postgres release-signing key to have taken care of the key and have been careful about signing releases with it? Sorry to respond again, however, I did want to point out, signing a key does not have to imply an absolute level of trust of the signer. There are several trust levels. For example, if we validated keys via phone and mail, I would absolutely not absolutely trust the key I'm signing. However, if I had four people which mostly trusted the signed key and one or two which absolutely trusted the signed key whom I absolutely trust, then it's a fairly safe bet I too can trust the key. Again, this all comes back to building a healthy web of trust. Surely there are a couple of key developers whom would be willing to sign each other's keys and have previously met before. Surely this would be the basis for phone validation. Then, of course, there is 'ol snail-mail route too. Of course, nothing beats meeting in person having valid ID and fingerprints in hand. ;) Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PGP signing releases
On Sun, 2003-02-02 at 18:39, Neil Conway wrote: Folks, I think we should PGP sign all the official packages that are provided for download from the various mirror sites. IMHO, this is important because: - ensuring that end users can trust PostgreSQL is an important part to getting the product used in mission-critical applications, as I'm sure you all know. Part of that is producing good software; another part is ensuring that users can trust that the software we put out hasn't been tampered with. - people embedding trojan horses in open source software is not unheard of. In fact, it's probably becoming more common: OpenSSH, sendmail, libpcap/tcpdump and bitchx have all been the victim of trojan horse attacks fairly recently. - PGP signing binaries is relatively easy, and doesn't need to be done frequently. Comments? I'd volunteer to do the work myself, except that it's pretty closely intertwined with the release process itself... Cheers, Neil Actually, if you just had everyone sign the official key and submit it back to the party that's signing, that would probably be good enough. Basically, as long as people can verify the package has been signed and can reasonably verify that the signing key is safe and/or can be verified, confidence should be high in the signed package. I certainly have no problem with people signing my key nor with signing others as long as we can verify/authenticate each others keys prior. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Win32 port powerfail testing
On Sat, 2003-02-01 at 00:34, Adam Haberlach wrote: On Sat, Feb 01, 2003 at 12:27:31AM -0600, Greg Copeland wrote: On Fri, 2003-01-31 at 14:36, Dave Page wrote: I intend to run the tests on a Dual PIII 1GHz box, with 1Gb of Non-ECC RAM and a 20Gb (iirc) IDE disk. I will run on Windows 2000 Server with an NTFS filesystem, and again on Slackware Linux 8 with either ext3 or reiserfs (which is preferred?). Please go with XFS or ext3. There are a number of blessed and horror stories which still float around about reiserfs (recent and old; even though I've never lost data with it -- using it now even). Might be worth testing FAT32 on NT as well. Even if we don't advocate it's use, it may not hurt to at least get an understanding of what one might reasonably expect from it. I'm betting there are people just waiting to run with FAT32 in the Win32 world. ;) You'd better go with NTFS. There are a number of blessed and horror stories which still float around about FAT32 (recent and old; even though I've never lost data with it -- using it now even now. Might be worth testing reiserfs on Linux as well. Even if we don't advocate it's use, it may not hurt to at least get an understanding of what one my reasonably expect from it. I'm better there are people just waiting to run with reiserfs in the Linux world. ;) Regards, and tongue firmly in cheek, Touche! :P While I understand and even appreciate the humor value, I do believe the picture is slightly different than your analysis. If we make something that runs on Win32 platforms, might it also run on Win98, WinME, etc.? Let's face the facts that should it also run on these platforms, it's probably only a matter of time before someone has it running on FAT32 (even possible on NT, etc). In other words, I'm fully expecting the lowest common denominator of MySQL user to be looking at PostgreSQL on Win32. Which potentially means lots of FAT32 use. And yes, even for a production environment. Ack! Double-ack! 
Also, Dave was asking for feedback between reiserfs and ext3. I offered XFS and ext3 as candidates. I personally believe that ext3 and XFS are going to be the more common (in that order) of journaled FS for DB Linux users. Besides, aside from any bugs in reiserfs, testing results for ext3 or XFS should probably coincide with reasonable expectations for reiserfs as well. As I consider FAT32 to be much more fragile than ext2 (having had seriously horrendous corruption and repaired/recovered from it on ext2), the results may prove interesting. Which is to say, should testing prove absolutely horrible results, proper disclaimers and warnings should be made readily available to avoid its use. Which is probably not a bad idea to begin with. ;) Nonetheless, it's an unknown right now in my mind. Hopefully some testing my reveal what reasonable expectations we should hold so that we can knowingly advise accordingly. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [mail] Re: [HACKERS] Windows Build System
On Fri, 2003-01-31 at 07:22, Christopher Browne wrote: But it's not /nearly/ that straightforward. If you look at the downloads that MySQL AB provides, they point you to a link that says Windows binaries use the Cygwin library. Which apparently means that this feature is not actually a feature. Unlike PostgreSQL, which is run under the Cygwin emulation, MySQL runs as a native Windows application (with Cygwin emulation). Apparently those are not at all the same thing, even though they are both using Cygwin... I'm confused as to whether you are being sarcastic or truly seem to think there is a distinction here. Simple question, does MySQL require the cygwin dll's (or statically linked to) to run? If the answer is yes, then there is little question that they are as emulated as is the current PostgreSQL/Win32 effort. Care to expand on exactly what you believe the distinction is? ...or did I miss the humor boat? :( Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Linux.conf.au 2003 Report
On Fri, 2003-01-31 at 13:04, Kurt Roeckx wrote: On Thu, Jan 30, 2003 at 08:21:09PM -0600, Greg Copeland wrote: It doesn't help the confusion that many OS's try to confuse programmers by exposing a single socket interface, etc. Simple fact remains, IPv6 is not IPv4. It's a good things that the socket interface can actually work with all protocol! It doesn't only work with AF_INET, but also AF_UNIX, and probably others. It's a good things that things like socket(), bind(), connect() don't need to be replaced by other things. That's actually not what I was talking about. Please see the recent IPv6 support thread as an example. The fact that an OS allows IPv4 connections to be completed even though you explicitly requested IPv6 protocol, only adds to much confusion (one of many such oddities which some OS's allow). Heck, along those lines, they should allow NCP connections to come through too. Or, how about UDP traffic on TCP sockets. If I wanted IPv4 traffic, I'll ask for it. Likewise of IPv6. My point being, too many people are in a hurry to confuse/combine the two when they are very clearly two distinct protocols, each having distinct needs. The faster people treat them as such, the quicker things will become better for everyone. The fact that some OS's attempt to blur the API lines and underlying semantics between the two protocols only further confuses things as it falsely leads people to believe that they are more or less the same protocol. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [mail] Re: [HACKERS] Windows Build System
On Fri, 2003-01-31 at 19:22, Dann Corbit wrote: For MySQL: There is no Cygwin needed. Period. Any idea as to why we seem to be getting such a conflicting story here? By several accounts, it does. Now, your saying it doesn't. What the heck is going on here. Not that I'm doubting you. I'm just trying to figure out which side of the coin is the shinny one. ;) There's a tool that comes with either the resource kit or the VC++ stuff that will tell you information like what ldd does. I don't recall the name of the tool. Can anyone comment if cygwin (or equivalent) is being linked in (statically or dynamically)? -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [mail] Re: [HACKERS] Windows Build System
On Fri, 2003-01-31 at 19:22, Dann Corbit wrote: For MySQL: There is no Cygwin needed. Period. Sorry to followup again, but I did want to point out something. I'm assuming you actually installed it. Please take note that the cygwin dll is normally installed into one of the window's directories (system, windows, etc). My point being, just because you didn't find it in the mysql directory, doesn't mean it wasn't installed system-wide. Not saying it does or doesn't do this. Just offering something else that may need to be looked at. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] plpython fails its regression test
On Thu, 2003-01-30 at 16:39, Tom Lane wrote: In CVS tip, if you run make installcheck in src/pl/plpython, the test fails with a number of diffs between the expected and actual output. I'm not sure if plpython is broken, or if it's just that someone changed the behavior and didn't bother to update the test's expected files (the test files don't seem to have been maintained since they were first installed). Comments? Could this have anything to do with the changes I made to the python stuff to get it to support longs (IIRC)? It's been a while now so I don't recall exactly what got changed. I do remember that I chanced some test code to ensure it tested the newly fixed data type. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [mail] Re: [HACKERS] Windows Build System
On Thu, 2003-01-30 at 13:56, Dave Page wrote: When properly configured, Windows can be reliable, maybe not as much as Solaris or HPUX but certainly some releases of Linux (which I use as well). You don't see Oracle or IBM avoiding Windows 'cos it isn't stable enough. I'm not jumping on one side or the other but I wanted to make clear on something. The fact that IBM or Oracle use windows has absolutely zero to do with reliability or stability. They are there because the market is willing to spend money on their product. Let's face it, the share holders of each respective company would come unglued if the largest software audience in the world were completely ignored. Simple fact is, your example really is pretty far off from supporting any view. Bluntly stated, both are in that market because they want to make money; they're even obligated to do so. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [mail] Re: [HACKERS] Windows Build System
On Thu, 2003-01-30 at 14:27, Dave Page wrote: -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED]] Sent: 30 January 2003 15:56 To: Hannu Krosing Cc: Vince Vielhaber; Dave Page; Ron Mayer; [EMAIL PROTECTED] Subject: Re: [mail] Re: [HACKERS] Windows Build System In the pull-the-plug case you have to worry about what is on disk at any given instant and whether you can make all the bits on disk consistent again. (And also about whether your filesystem can perform the equivalent exercise for its own metadata; which is why we are questioning Windows here. I've never (to my knowledge) lost any data following a powerfail or system crash on a system using NTFS - that has always seemed pretty solid to me. By comparison, I have lost data on ext2 filesystems on a couple of occasions. More info at: http://www.ntfs.com/data-integrity.htm http://www.pcguide.com/ref/hdd/file/ntfs/relRec-c.html Obviously this goes out of the window is the user chooses to run on FAT/FAT32 partitions. I think that it should be made *very* clear in any future documentation that the user is strongly advised to use only NTFS filesystems. I realise this is not proof that it actually works of course... I have lost entire directory trees (and all associated data) on NTFS before. NTFS was kind enough to detect an inconsistency during boot and repaired the file system by simply removing any and all references to the top level damaged directory (on down). Sure, the file system was in a known good state following the repair but the 2-days to recover from it, pretty much stunk! I would also like to point out that this damage/repair occurred on a RAID-5 box (hardware, not software). As the repairs placed the file system back into known good state, the raid hardware was happy to obey. Guess what, it did! :( Make no mistake about it. You can easily lose large amounts of data on NTFS. You also compared NTFS with ext2. That's not exactly fair. Better you should compare NTFS with ext3, XFS, JFS, ReiserFS. 
It's a fairer comparison, as now we're talking about the same category of file system. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] PostgreSQL, NetBSD and NFS
That was going to be my question too. I thought NFS didn't have some of the requisite file system behaviors (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or reliably. Please correct as needed. Regards, Greg On Thu, 2003-01-30 at 13:02, mlw wrote: Forgive my stupidity, are you running PostgreSQL with the data on an NFS share? D'Arcy J.M. Cain wrote: I have posted before about this but I am now posting to both NetBSD and PostgreSQL since it seems to be some sort of interaction between the two. I have a NetAPP filer on which I am putting a PostgreSQL database. I run PostgreSQL on a NetBSD box. I used rsync to get the database onto the filer with no problem whatsoever but as soon as I try to open the database the NFS mount hangs and I can't do any operations on that mounted drive without hanging. Other things continue to run but the minute I do a df or an ls on that drive that terminal is lost. On the NetBSD side I get a server not responding error. On the filer I see no problems at all. A reboot of the filer doesn't correct anything. Since NetBSD works just fine with this until I start PostgreSQL and PostgreSQL, from all reports, works well with the NetApp filer, I assume that there is something out of the ordinary about PostgreSQL's disk access that is triggering some subtle bug in NetBSD. Does the shared memory stuff use disk at all? Perhaps that's the difference between PostgreSQL and other applications. The NetApp people are being very helpful and are willing to follow up any leads people might have and may even suggest fixes if necessary. I have Bcc'd the engineer on this message and will send anything I get to them. ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] postgresql.org
Should it be saying, Temporarily Unavailable? Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Threads
On Thu, 2003-01-23 at 09:12, Steve Wampler wrote: On Sat, 4 Jan 2003, Christopher Kings-Lynne wrote: Also remember that in even well developed OS's like FreeBSD, all a process's threads will execute only on one CPU. I doubt that - it certainly isn't the case on Linux and Solaris. A thread may *start* execution on the same CPU as its parent, but native threads are not likely to be constrained to a specific CPU on an SMP OS. You are correct. When spawning additional threads, should an idle CPU be available, it's very doubtful that the new thread will show any bias toward the original thread's CPU. Most modern OS's do spread the threads within a process across n CPUs. Those that don't are probably attempting to modernize as we speak. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] C++ coding assistance request for a
On Wed, 2003-01-22 at 23:40, Justin Clift wrote: Justin Clift wrote: Greg Copeland wrote: Have you tried IBM's OSS visualization package yet? Sorry, I don't seem to recall the name of the tool off the top of my head (Data Explorer??) but it uses OpenGL (IIRC) and is said to be able to visualize just about anything. Anything is said to include simple data over time to complex medical CT scans. Cool. Just found it... IBM Open Visualization Data Explorer: http://www.research.ibm.com/dx/ That seems to be a very outdated page for it. The new pages for it (in case anyone else is interested) are at: http://www.opendx.org :-) Regards and best wishes, Justin Clift Yep! That's the stuff! Sorry I wasn't more specific. Just been a while since I'd looked at it. I'd love to know how well it works out for you. Especially love to see any pretty pictures you create with it. ;) Regards, Greg -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] C++ coding assistance request for a visualisation
Have you tried IBM's OSS visualization package yet? Sorry, I don't seem to recall the name of the tool off the top of my head (Data Explorer??) but it uses OpenGL (IIRC) and is said to be able to visualize just about anything. Anything is said to include simple data over time to complex medical CT scans. Greg On Wed, 2003-01-22 at 12:19, Justin Clift wrote: Hi guys, Is there anyone here that's good with C++ and has a little bit of time to add PostgreSQL support to a project? There is a 4D visualisation program called Flounder: http://www.enel.ucalgary.ca/~vigmond/flounder/ And it does some pretty nifty stuff. It takes in data sets (x, y, z, time) and displays them graphically, saving them to image files if needed, and also creating the time sequences as animations if needed. Was looking at it from a performance tuning tool point of view. i.e. Testing PostgreSQL performance with a bunch of settings, then stuffing the results into a database, and then using something like Flounder for visualising it. It seems pretty simple, and Flounder seems like it might be the right kind of tool for doing things like this. Was emailing with Edward Vigmond, the author of it, and he seems to think it'd be pretty easy to implement too. Now, I'm not a C++ coder, and as short of time as anyone, so I was wondering if there is anyone here who'd be interested in helping out here. :-) Regards and best wishes, Justin Clift -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] \d type queries - why not views in system catalog?!?
Oh! That's an excellent idea. Seemingly addresses the issue and has value-add. I'm not aware of any gotchas here. Is there something that is being overlooked? Greg On Mon, 2003-01-13 at 14:50, Robert Treat wrote: On Mon, 2003-01-13 at 11:28, [EMAIL PROTECTED] wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 NotDashEscaped: You need GnuPG to verify this message Wouldn't it be easier (and more portable, see 7.3/7.2 system catalogs vs. psql) to have views for that? Do I miss a point here? Putting the \d commands into views has been on the TODO list for a long time: I think it is actually the only psql-related item left, until we change the backend protocol to indicate transaction state. I don't think a view would have helped with the psql 7.2/7.3 change: a lot more changed than simply the underlying SQL. Some of the backslash commands are not amenable to putting inside a view, as they actually comprise multiple SQL calls and some logic in the C code, but a few could probably be made into views. Could whomever added that particular TODO item expand on this? One idea I've always thought would be nice would be to make full-fledged C functions out of the \ commands and ship them with the database. This way the \ commands could just be aliases to select myfunc(). This would help out all of us who write GUI interfaces since we would have standard functions we could call upon, and would also help with backward compatibility since \dv could always call select list_views(), which would already be included with each server. One of the reasons that this was not feasible in the past was that we needed functions that could return multiple rows and columns easily. Now that we have that in 7.3, it might be worth revisiting.
Robert Treat ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
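For illustration, the view flavor of the idea might look something like the following sketch. The view name and column aliases are hypothetical, not an actual catalog object; it just shows that the information psql's \dv digs out of the 7.3 catalogs could be packaged server-side:

```sql
-- Hypothetical sketch of a system view that could back psql's \dv.
-- "list_views" and the column names are illustrative only.
CREATE VIEW list_views AS
    SELECT n.nspname AS schema,
           c.relname AS name,
           pg_get_userbyid(c.relowner) AS owner
    FROM pg_class c
         JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind = 'v'
      AND n.nspname NOT IN ('pg_catalog', 'information_schema');
```

A GUI could then simply SELECT * FROM list_views, which is the portability win being discussed.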
Re: [HACKERS] \d type queries - why not views in system catalog?!?
Views or C-functions, I think the idea is excellent. It's the concept that I really like. Greg On Mon, 2003-01-13 at 15:00, Dave Page wrote: -Original Message- From: Greg Copeland [mailto:[EMAIL PROTECTED]] Sent: 13 January 2003 20:56 To: Robert Treat Cc: [EMAIL PROTECTED]; PostgresSQL Hackers Mailing List Subject: Re: [HACKERS] \d type queries - why not views in system catalog?!? Oh! That's an excellent idea. Seemingly addresses the issue and has value-add. I'm not aware of any gotchas here. Is there something that is being overlooked? Why use functions instead of views? Most UIs will want to format the output as they see fit so a recordset would be the appropriate output. Yes, a function could do this, but surely views would be simpler to implement and maintain. Regards, Dave. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] redo error?
On Tue, 2003-01-07 at 22:58, Tom Lane wrote: It also logged that it was killed with signal 9, although I didn't kill it! Is there something weird going on here? Is this Linux? The Linux kernel seems to think that killing randomly-chosen processes with SIGKILL is an appropriate response to running out of memory. I cannot offhand think of a more brain-dead behavior in any OS living or dead, but that's what it does. Just FYI, I believe the 2.6.x series of kernels will rectify this situation. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Next platform query: Alphaservers under VMS?
IIRC, they too have a POSIX layer available. Greg On Tue, 2003-01-07 at 02:44, Justin Clift wrote: Hi guys, Also received a question through the Advocacy website asking if anyone has ported PostgreSQL to the AlphaServers under VMS. Anyone know if we run on VMS? Last time I touched VMS (about 10 years ago) it wasn't all that Unix-like. :-) Regards and best wishes, Justin Clift -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Threads
On Tue, 2003-01-07 at 02:00, Shridhar Daithankar wrote: On 6 Jan 2003 at 6:48, Greg Copeland wrote: 1) Get I/O time used fruitfully AIO may address this without the need for integrated threading. Arguably, from the long thread that last appeared on the topic of AIO, some hold that AIO doesn't even offer anything beyond the current implementation. As such, it's highly doubtful that integrated threading is going to offer anything beyond what a sound AIO implementation can achieve. Either way, a complete AIO or threading implementation is not available on the major platforms that postgresql runs on. Linux definitely does not have one, last I checked. There are two or three significant AIO implementation efforts currently underway for Linux. One such implementation is available in the Red Hat Server Edition (IIRC) and has been available for some time now. I believe Oracle is using it. SGI also has an effort and I forget where the other one comes from. Nonetheless, I believe it's going to be a hard-fought battle to get AIO implemented, simply because I don't think anyone, yet, can truly argue a case on the gain vs. effort. If postgresql is not using aio or threading, we should start using one of them, is what I feel. What do you say? I did originally say that I'd like to see an AIO implementation. Then again, I don't currently have a position to stand on other than simply saying it *might* perform better. ;) Not exactly a position that's going to win the masses over. I was expecting something that we could actually pull the trigger with. That could be done. I'm sure it can, but that's probably the easiest item to address. o Code isn't very portable. Looked fairly okay for pthread platforms, however, there is new emphasis on the Win32 platform. I think it would be a mistake to introduce something as significant as threading without addressing Win32 from the get-go. If you search for pthread in thread.c, there are not many instances. Same goes for thread.h.
From what I understand of Windows threading, it would be less than a 10-minute job to #ifdef the pthread-related parts of either file. It is just that I have not played with Windows threading, nor am I inclined to...;-) Well, the method above is going to create a semi-ugly mess. I've written thread abstraction layers which cover OS/2, NT, and pthreads. Each has subtle distinctions. What really needs to be done is the creation of another abstraction layer which your current code would sit on top of. That way, everything contained within is clear and easy to read. The big bonus is that as additional threading implementations need to be added, only the low-level abstraction stuff needs to be modified. Done properly, each thread implementation would be its own module requiring little #if clutter. As you can see, that's a fair amount of work and far from where the code currently is. o I would desire a more highly abstracted/portable interface which allows for different threading and synchronization primitives to be used. The current implementation is tightly coupled to pthreads. Furthermore, on platforms such as Solaris, I would hope it would easily allow for plugging in its native threading primitives, which are touted to be much more efficient than pthreads on said platform. Same as above. If there can be two cases separated with #ifdef, there can be more.. But what is important is to have a thread that can be woken up as and when required with any function desired. That is the basic idea. Again, there's a lot of work in creating a well-formed abstraction layer for all of the mechanics that are required. Furthermore, different thread implementations have slightly different semantics, which further complicates things. Worse, some types of primitives are simply not available in some thread implementations. That means those platforms require them to be written from the primitives that are available on the platform. Yet more work.
o Code is fairly trivial and does not address other primitives (semaphores, mutexes, conditions, TSS, etc) portably which would be required for anything but the most trivial of threaded work. This is especially true in such an application where data IS the application. As such, you must reasonably assume that threads need some form of portable serialization primitives, not to mention mechanisms for non-trivial communication. I don't get this. Probably I should post a working example. It is not the thread's responsibility to make a function thread-safe which is changed on the fly. The function has to make sure that it is thread safe. That is an altogether different effort.. You're right, it's not the thread's responsibility, however, it is the threading toolkit's. In this case, you're offering to be the toolkit which functions across two platforms, just for starters. Reasonably, you should expect a third to quickly follow
Re: [HACKERS] Threads
On Tue, 2003-01-07 at 12:21, Greg Stark wrote: Greg Copeland [EMAIL PROTECTED] writes: That's the power of using the process model that is currently in use. Should it do something naughty, we bitch and complain politely, throw our hands in the air and exit. We no longer have to worry about the state and validity of that backend. You missed the point of his post. If one process in your database does something nasty you damn well should worry about the state and validity of the entire database, not just that one backend. I can assure you I did not miss the point. No idea why you're continuing to spell it out. In this case, it appears the quotation is being taken out of context or it was originally stated in an improper context. Are you really sure you caught the problem before it screwed up the data in shared memory? On disk? This whole topic is in need of some serious FUD-dispelling and careful analysis. Here's a more calm explanation of the situation on this particular point. Perhaps I'll follow up with something on IO concurrency later. Hmmm. Not sure what needs to be dispelled since I've not seen any FUD. The point in consideration here is really memory isolation. Threads by default have zero isolation between threads. They can all access each other's memory, even including their stacks. Most of that memory is in fact only needed by a single thread. Again, this has been covered already. Processes by default have complete memory isolation. However, postgres actually weakens that by doing a lot of work in a shared memory pool. That memory gets exactly the same protection as it would get in a threaded model, which is to say none. Again, this has all been covered, more or less. Your comments seem to imply that you did not fully read what has been said on the topic thus far or that you misunderstood something that was said. Of course, it's also possible that I may have said something out of its proper context which may be confusing you.
I think it's safe to say I don't have any further comment unless something new is being brought to the table. Should there be something new to cover, I'm happy to talk about it. At this point, however, it appears that it's been beat to death already. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [Npgsql-general] Get function OID and function
Can someone point me to what exactly is planned for the protocol/networking stuff? Networking/protocols is one of my fortes and I believe that I could actually help here. Regards, Greg On Tue, 2003-01-07 at 09:01, Tom Lane wrote: Dave Page [EMAIL PROTECTED] writes: Sorry, don't know. Can anyone on pgsql-hackers tell us the purpose of the FunctionCall message? It's used to invoke the fast path function call code (src/backend/tcop/fastpath.c). libpq's large-object routines use this, but little else does AFAIK. The current protocol is sufficiently broken (see comments in fastpath.c) that I'd not really encourage people to use it until we can fix it --- hopefully that will happen in 7.4. regards, tom lane PS: what in the world is [EMAIL PROTECTED] ... is that a real mailing list, and if so why? It sounds a bit, um, duplicative. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Threads
On Mon, 2003-01-06 at 05:36, Shridhar Daithankar wrote: On 6 Jan 2003 at 12:22, Ulrich Neumann wrote: Hello all, If someone is interested in the code I can send a zip file to everyone who wants. I suggest you preserve your work. The reasons I suggested threads are mainly twofold: 1) Get I/O time used fruitfully AIO may address this without the need for integrated threading. Arguably, from the long thread that last appeared on the topic of AIO, some hold that AIO doesn't even offer anything beyond the current implementation. As such, it's highly doubtful that integrated threading is going to offer anything beyond what a sound AIO implementation can achieve. 2) Use multiple CPUs better. Multiple processes tend to universally support multiple CPUs better than threading does. On some platforms, the level of threading support is currently only user-mode implementations, which means no additional CPU use. Furthermore, on some platforms where user-mode threads are de facto, they don't even allow for scheduling bias, resulting in less work being accomplished within the same time interval (the work slice must be divided between n threads within the process, all of which run on a single CPU). It will not require as much code cleaning as your efforts might have. However your work will be very useful if somebody decides to use threads in any fashion in core postgresql. I was hoping for a bit more optimistic response given that what I suggested was totally optional at any point of time but very important from a performance point of view. Besides, the change would have been gradual as required.. Speaking for myself, I probably would have been more excited if the offered framework had addressed several issues. The short list is: o Code needs to be more robust. It shouldn't be calling exit directly as, I believe, it should be allowing for PostgreSQL to clean up some. Correct me as needed. I would have also expected the code to have adopted PostgreSQL's semantics and mechanisms as needed (error reporting, etc).
I do understand it was an initial attempt to simply get something in front of some eyes and have something to talk about. Just the same, I was expecting something that we could actually pull the trigger with. o Code isn't very portable. Looked fairly okay for pthread platforms, however, there is new emphasis on the Win32 platform. I think it would be a mistake to introduce something as significant as threading without addressing Win32 from the get-go. o I would desire a more highly abstracted/portable interface which allows for different threading and synchronization primitives to be used. The current implementation is tightly coupled to pthreads. Furthermore, on platforms such as Solaris, I would hope it would easily allow for plugging in its native threading primitives, which are touted to be much more efficient than pthreads on said platform. o Code is not commented. I would hope that adding new code for something as important as threading would be commented. o Code is fairly trivial and does not address other primitives (semaphores, mutexes, conditions, TSS, etc) portably which would be required for anything but the most trivial of threaded work. This is especially true in such an application where data IS the application. As such, you must reasonably assume that threads need some form of portable serialization primitives, not to mention mechanisms for non-trivial communication. o Does not address issues such as thread signaling or status reporting. o Pool interface is rather simplistic. Does not currently support concepts such as wake pool, stop pool, pool status, assigning a pool to work, etc. In fact, it's not altogether obvious what the intended capabilities of the current pool implementation are. o Doesn't seem to address any form of thread communication facilities (mailboxes, queues, etc). There are probably other things that I can find if I spend more than just a couple of minutes looking at the code.
Honestly, I love threads but I can see that the current code offering is not much more than a token in its current form. No offense meant. After it's all said and done, I'd have to see a lot more meat before I'd be convinced that threading is ready for PostgreSQL; from both a social and technological perspective. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] IPv6 patch
On Mon, 2003-01-06 at 15:29, Peter Eisentraut wrote: (2) A socket type is explicitly enabled for the server to use, and if creation fails, server startup fails. It seems that the current code falls back to IPv4 if IPv6 fails. IIRC, it allows it to fall back to IPv4 in case it's compiled for IPv6 support but the kernel isn't compiled to support IPv6. If that is the case, admittedly, you seem to have a point. If someone compiles in v6 support and their system doesn't have v6 support and it's been requested via run-time config, it should fail just like any other. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] IPv6 patch
On Mon, 2003-01-06 at 15:43, Bruce Momjian wrote: Greg Copeland wrote: On Mon, 2003-01-06 at 15:29, Peter Eisentraut wrote: (2) A socket type is explicitly enabled for the server to use, and if creation fails, server startup fails. It seems that the current code falls back to IPv4 if IPv6 fails. IIRC, it allows it to fall back to IPv4 in case it's compiled for IPv6 support but the kernel isn't compiled to support IPv6. If that is the case, admittedly, you seem to have a point. If someone compiles in v6 support and their system doesn't have v6 support and it's been requested via run-time config, it should fail just like any other. Yes, right now, it is kind of a mystery when it falls back to IPv4. It does print a message in the server logs: LOG: server socket failure: getaddrinfo2() using IPv6: hostname nor servname provided, or not known LOG: IPv6 support disabled --- perhaps the kernel does not support IPv6 LOG: IPv4 socket created It appears right at the top because creating the socket is the first thing it does. A good question is once we have a way for the user to control IPv4/6, what do we ship as a default? IPv4-only? Both, and if both, do we fail on a kernel that doesn't have IPv6 enabled? So you're saying that by using the IPv6 address family and binding to an IPv6 address (or even the ANY interface), you still get v4 connections on the same bind/listen/accept sequence? I'm asking because I've never done v6 stuff. Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] IPv6 patch
On Mon, 2003-01-06 at 15:59, Bruce Momjian wrote: Greg Copeland wrote: It appears right at the top because creating the socket is the first thing it does. A good question is once we have a way for the user to control IPv4/6, what do we ship as a default? IPv4-only? Both, and if both, do we fail on a kernel that doesn't have IPv6 enabled? So you're saying that by using the IPv6 address family and binding to an IPv6 address (or even the ANY interface), you still get v4 connections on the same bind/listen/accept sequence? I'm asking because I've never done v6 stuff. Yes, it listens on both. The original author, Nigel, tested it using both IPv4 and IPv6, and the #ipv6 IRC channel and google postings seem to indicate that too. What I am not sure how to do is say _only_ IPv4. Wouldn't you just use an IPv4 address family when creating your socket? -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] IPv6 patch
On Mon, 2003-01-06 at 16:17, Bruce Momjian wrote: Greg Copeland wrote: On Mon, 2003-01-06 at 15:59, Bruce Momjian wrote: Greg Copeland wrote: It appears right at the top because creating the socket is the first thing it does. A good question is once we have a way for the user to control IPv4/6, what do we ship as a default? IPv4-only? Both, and if both, do we fail on a kernel that doesn't have IPv6 enabled? So you're saying that by using the IPv6 address family and binding to an IPv6 address (or even the ANY interface), you still get v4 connections on the same bind/listen/accept sequence? I'm asking because I've never done v6 stuff. Yes, it listens on both. The original author, Nigel, tested it using both IPv4 and IPv6, and the #ipv6 IRC channel and google postings seem to indicate that too. What I am not sure how to do is say _only_ IPv4. Wouldn't you just use an IPv4 address family when creating your socket? Sorry, I meant only IPv6. I found this. It seems to imply that you need different sockets to do what you want to do. You might snag a copy of the latest openldap code and look at slapd to see what it's doing. At any rate, here's what I found that pointed me at it: slapd compiled with --enable-ipv6 only listens for IPv6 connections. I suspect this is tested on an operating system that receives IPv4 connections on the IPv6 socket as well, although this is not the case for OpenBSD (nor FreeBSD or NetBSD, IIRC). slapd needs to listen to both IPv4 and IPv6 on separate sockets. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Upgrading rant.
On Sat, 2003-01-04 at 22:37, Tom Lane wrote: You're missing the point: I don't want to lock out everyone but the super-user, I want to lock out everyone, period. Superusers are just as likely to screw up pg_upgrade as anyone else. BTW: $ postmaster -N 1 -c superuser_reserved_connections=1 postmaster: superuser_reserved_connections must be less than max_connections. $ Well, first, let me say that the above just seems wrong. I can't think of any valid reason why reserved shouldn't be allowed to equal max. I also assumed that pg_upgrade would be attempting to connect as the superuser. Therefore, if you only allow a single connection from the superuser and pg_upgrade is using it, that would seem fairly hard to mess up. On top of that, that's simply the risk of someone being a superuser. They will ALWAYS have the power to hose things. Period. As such, I don't consider that to be a valid argument. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] [HACKERS] v7.3.1 Bundled and Released ...
On Sun, 2003-01-05 at 06:41, Dan Langille wrote: On Sat, 4 Jan 2003, Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: I never considered tag'ng for minor releases as having any importance, since the tarballs themselves provide the 'tag' ... branches give us the ability to back-patch, but tags don't provide us anything ... do they? Well, a tag makes it feasible for someone else to recreate the tarball, given access to the CVS server. Dunno how important that is in the real world --- but I have seen requests before for us to tag release points. FWIW, in the real world, a release doesn't happen if it's not tagged. Agreed! Any tarballs, rpms, etc., should be made from the tagged source. Period. If rpms are made from a tarball that is made from tagged source, that's fine. Nonetheless, any official release (major or minor) should always be made from the resulting tagged source. This does two things. First, it validates that everything has been properly tagged. Second, it ensures that there are not any localized files or changes which might become part of a tarball/release which are not officially part of the repository. I can't stress enough that a release should never happen unless the source has been tagged. Releases should ALWAYS be made from a checkout based on tags. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] [HACKERS] v7.3.1 Bundled and Released ...
On Sat, 2003-01-04 at 04:27, Peter Eisentraut wrote: Greg Copeland writes: Just a reminder, there still doesn't appear to be a 7.3.1 tag. There is a long tradition of systematically failing to tag releases in this project. Don't expect it to improve. Well, I thought I remembered from the release team thread that it was said there was a punch list of things that are done prior to actually releasing. If not, it certainly seems like we need one. If there is one, tagging absolutely needs to be on it. If we have one and this is already on the list, seems we need to be eating our own food. ;) -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Threads
On Sat, 2003-01-04 at 06:59, Kaare Rasmussen wrote: Umm. No. User or system level threads, the statement is true. If a thread keels over, the process goes with it. Furthermore, on Win32 Hm. This is a database system. If one of the backend processes dies unexpectedly, I'm not sure I would trust the consistency and state of the others. Or maybe I'm just being chicken. I'd call that being wise. That's the problem with using threads. Should a thread do something naughty, the state of the entire process is in question. This is true regardless of whether it is a user mode, kernel mode, or hybrid thread implementation. That's the power of using the process model that is currently in use. Should it do something naughty, we bitch and complain politely, throw our hands in the air and exit. We no longer have to worry about the state and validity of that backend. This creates a huge systemic reliability surplus. This is also why the concept of a hybrid thread/process implementation keeps coming to the surface on the list. If you maintain the process model and only use threads for things that ONLY relate to the single process (single session/connection), should a thread cause a problem, you can still throw your hands in the air and exit just as is done now without causing problems for, or questioning the validity of, other backends. The cool thing about such a concept is that it still opens the door for things like parallel sorts and queries as it relates to a single backend. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
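The isolation argument above can be shown in a few lines of C. This is a minimal sketch (not PostgreSQL code): a "backend" child process crashes hard with SIGSEGV, and the parent simply reaps it and carries on, unaffected.

```c
/* Sketch of why the per-connection process model is robust: a child
 * that dies horribly takes nothing else with it. */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent cleanly observed the child's crash. */
int survive_child_crash(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0) {
        raise(SIGSEGV);    /* the "naughty" backend crashes hard */
        _exit(0);          /* not reached */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    /* The damage was confined to the child process. */
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}

int main(void)
{
    assert(survive_child_crash());
    printf("parent unaffected by child crash\n");
    return 0;
}
```

With threads, the equivalent fault inside any one thread would have taken the whole process, and every connection it served, down with it.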
Re: [HACKERS] Upgrading rant.
On Sat, 2003-01-04 at 09:53, Tom Lane wrote: Oliver Elphick [EMAIL PROTECTED] writes: On Sat, 2003-01-04 at 02:17, Tom Lane wrote: There isn't any simple way to lock *everyone* out of the DB and still allow pg_upgrade to connect via the postmaster, and even if there were, the DBA could too easily forget to do it. I tackled this issue in the Debian upgrade scripts. I close the running postmaster and open a new postmaster using a different port, so that normal connection attempts will fail because there is no postmaster running on the normal port. That's a good kluge, but still a kluge: it doesn't completely guarantee that no one else connects while pg_upgrade is trying to do its thing. I am also concerned about the consequences of automatic background activities. Even the periodic auto-CHECKPOINT done by current code is not obviously safe to run behind pg_upgrade's back (it does make WAL entries). And the auto-VACUUM that we are currently thinking of is even less obviously safe. I think that someday, running pg_upgrade standalone will become *necessary*, not just a good safety feature. regards, tom lane I thought there was talk of adding a single user/admin only mode. That is, where only the administrator can connect to the database. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] complie error on windows
If you run gcc at the prompt (preferably the one you're trying to run configure from), do you get something like "gcc: No input files", or do you get "gcc: command not found"? If you get the latter (or something like it), you need to include it in your path, just as it's telling you to do. If you get the former, something else would appear to be off. Greg On Fri, 2003-01-03 at 09:43, Claiborne, Aldemaco Earl (Al) wrote: Hi, I am trying to install postgresql-7.3 on windows and I keep getting the following error despite having downloaded a compiler. Can anyone tell me what I am not doing right? I am a newbie to postgres and development. My ultimate goal is to create a data driven application utilizing the J2EE architecture. $ ./configure checking build system type... i686-pc-cygwin checking host system type... i686-pc-cygwin checking which template to use... win checking whether to build with 64-bit integer date/time support... no checking whether to build with recode support... no checking whether NLS is wanted... no checking for default port number... 5432 checking for default soft limit on number of connections... 32 checking for gcc... no checking for cc... no configure: error: no acceptable C compiler found in $PATH Thanks, Al ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Threads
On Fri, 2003-01-03 at 14:47, mlw wrote: Please no threading threads!!! Ya, I'm very pro threads but I've long since been sold on no threads for PostgreSQL. AIO on the other hand... ;) Your summary so accurately addresses the issue it should be a whole FAQ entry on threads and PostgreSQL. :) Drawbacks to a threaded model: (1) One thread screws up, the whole process dies. In a multiple process application this is not too much of an issue. (2) Heap fragmentation. In a long uptime application, such as a database, heap fragmentation is an important consideration. With multiple processes, each process manages its own heap and whatever fragmentation exists goes away when the connection is closed. A threaded server is far more vulnerable because the heap has to manage many threads and the heap has to stay active and unfragmented in perpetuity. This is why Windows applications usually end up using 2G of memory after 3 months of use. (Well, this AND memory leaks) These are things that can't be stressed enough. IMO, these are some of the many reasons why applications running on MS platforms tend to have much lower application and system up times (that and resource leaks which are inherent to the platform). BTW, if you do much in the way of threaded coding, there is libHorde which is a heap library for heavily threaded, memory hungry applications. It excels in performance, reduces heap lock contention (maintains multiple heaps in a very thread smart manner), and goes a long way toward reducing heap fragmentation which is common for heavily memory based, threaded applications. (3) Stack space. In a threaded application there are more limits to stack usage. I'm not sure, but I bet PostgreSQL would have a problem with a fixed size stack; I know the old ODBC driver did. Most modern thread implementations use a page guard on the stack to determine if it needs to grow or not. 
Generally speaking, for most modern platforms which support threading, stack considerations rarely become an issue. (5) Lastly, why bother? Seriously? Process creation time is an issue, true, but it's an issue with threads as well, just not as bad. Anyone who is looking for performance should be using a connection pooling mechanism as is done in things like PHP. I have done both threaded and process servers. The threaded servers are easier to write. The process-based servers are more robust. From an operational point of view, a select foo from bar where x > y will take the same amount of time. I agree with this, however, using threads does open the door for things like splitting queries and sorts across multiple CPUs. Something the current process model, which was previously agreed on, would not be able to address because of cost. Example: select foo from bar where x > y order by foo ;, could be run on multiple CPUs if the sort were large enough to justify it. After it's all said and done, I do agree that threading just doesn't seem like a good fit for PostgreSQL. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
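The kind of intra-backend parallelism alluded to above (splitting a large sort across CPUs while keeping one process per connection) can be sketched with POSIX threads. This is purely illustrative, not PostgreSQL code: two worker threads each qsort half of an array, then the "backend" merges the two sorted runs.

```c
/* Hypothetical sketch: a hybrid model where threads parallelize work
 * belonging to a single backend only. */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct chunk { int *base; size_t n; };

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static void *sort_chunk(void *arg)
{
    struct chunk *c = arg;
    qsort(c->base, c->n, sizeof(int), cmp_int);
    return NULL;
}

/* Sort a[0..n-1] using two worker threads plus a final merge. */
void parallel_sort(int *a, size_t n)
{
    size_t half = n / 2;
    struct chunk lo = { a, half }, hi = { a + half, n - half };
    pthread_t t1, t2;

    pthread_create(&t1, NULL, sort_chunk, &lo);
    pthread_create(&t2, NULL, sort_chunk, &hi);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Merge the two sorted runs through a scratch buffer. */
    int *tmp = malloc(n * sizeof(int));
    size_t i = 0, j = half, k = 0;
    while (i < half && j < n)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < half) tmp[k++] = a[i++];
    while (j < n)    tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof(int));
    free(tmp);
}
```

Note that both threads touch only this one backend's data; if either misbehaved, the backend could still exit on its own without endangering other connections.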
Re: [HACKERS] Threads
On Fri, 2003-01-03 at 14:52, Dann Corbit wrote: -Original Message- (1) One thread screws up, the whole process dies. In a multiple process application this is not too much of an issue. If you use C++ you can try/catch and nothing bad happens to anything but the naughty thread. That doesn't protect against the type of issues he's talking about. Invalid pointer reference is a very common snafu which really hoses threaded applications. Not to mention resource leaks AND LOCKED resources which are inherently an issue on Win32. Besides, it's doubtful that PostgreSQL is going to be rewritten in C++ so bringing up try/catch is pretty much an invalid argument. (2) Heap fragmentation. In a long uptime application, such as a database, heap fragmentation is an important consideration. With multiple processes, each process manages its own heap and whatever fragmentation exists goes away when the connection is closed. A threaded server is far more vulnerable because the heap has to manage many threads and the heap has to stay active and unfragmented in perpetuity. This is why Windows applications usually end up using 2G of memory after 3 months of use. (Well, this AND memory leaks) Poorly written applications leak memory. Fragmentation is a legitimate concern. And well written applications which attempt to safely handle segfaults, etc., often leak memory and lock resources like crazy. On Win32, depending on the nature of the resources, once this happens, even process termination will not free/unlock the resources. (4) Lock Contention. The various single points of access in a process have to be serialized for multiple threads. Heap allocation, deallocation, etc. all have to be managed. In a multiple process model, these resources would be separated by process contexts. Semaphores are more complicated than critical sections. If anything, a shared memory approach is more problematic and fragile, especially when porting to multiple operating systems. 
And critical sections lead to low performance on SMP systems for Win32 platforms. No task can switch on ANY CPU for the duration of the critical section. They're highly recommended by MS because the majority of Win32 applications expect uniprocessor systems, and there they are VERY fast. As soon as multiple processors come into the mix, critical sections become a HORRIBLE idea if any sort of scalability is desired. Is it a FAQ? If not, it ought to be. I agree. I think mlw's list of reasons should be added to a FAQ. It's terse yet says it all! -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Threads
On Fri, 2003-01-03 at 19:34, Tom Lane wrote: Serguei Mokhov [EMAIL PROTECTED] writes: (1) One thread screws up, the whole process dies. In a multiple process application this is not too much of an issue. (1) is an issue only for user-level threads. Umm. No. User or system level threads, the statement is true. If a thread keels over, the process goes with it. Furthermore, on Win32 platforms, it opens a whole can of worms no matter how you care to address it. Uh, what other kind of thread have you got in mind here? I suppose the lack-of-cross-thread-protection issue would go away if our objective was only to use threads for internal parallelism in each backend instance (ie, you still have one process per connection, but internally it would use multiple threads to process subqueries in parallel). Several have previously spoken about a hybrid approach (ala Apache). IIRC, it was never ruled out but it was simply stated that no one had the energy to put into such a concept. Of course that gives up the hope of faster connection startup that has always been touted as a major reason to want Postgres to be threaded... regards, tom lane Faster startup should never be the primary reason, as there are many ways to address that issue already. Connection pooling and caching are by far the most common ways to address this issue. Not only that, but by definition, it's almost an oxymoron. If you really need high performance, you shouldn't be using transient connections, no matter how fast they are. This, in turn, brings you back to persistent connections or connection pools/caches. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Threads
On Fri, 2003-01-03 at 21:39, mlw wrote: Connection time should *never* be in the critical path. There, I've said it!! People who complain about connection time are barking up the wrong tree. Regardless of the methodology, EVERY OS has issues with thread creation, process creation, the memory allocation, and system manipulation required to manage it. Under load this is ALWAYS slower. I think that if there is ever a choice between "do I make startup time faster?" and "do I make PostgreSQL not need a dump/restore for upgrade?", the upgrade problem has a much higher impact on real PostgreSQL sites. Exactly. Trying to speed up something that shouldn't be in the critical path is exactly what I'm talking about. I completely agree with you! -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] [HACKERS] v7.3.1 Bundled and Released ...
Just a reminder, there still doesn't appear to be a 7.3.1 tag. This is from the HISTORY file. symbolic names: REL7_3_STABLE: 1.182.0.2 REL7_2_3: 1.153.2.8 REL7_2_STABLE: 1.153.0.2 REL7_2: 1.153 Notice 7.3 stable but nothing about 7.3.x! I also see a 7.2.3, etc., just as one would expect but nothing about 7.3 dot releases. I'm still getting, cvs [server aborted]: no such tag REL7_3_1_STABLE. Something overlooked here? Regards, Greg Copeland On Mon, 2002-12-23 at 09:57, Bruce Momjian wrote: Greg Copeland wrote: On Sun, 2002-12-22 at 13:12, Marc G. Fournier wrote: Last night, we packaged up v7.3.1 of PostgreSQL, our latest stable release. Purely meant to be a bug fix release, this one does have one major change, in that the major number of the libpq library was increased, which means that everyone is encouraged to recompile their clients along with this upgrade. This release can be found on all the mirrors, and on the root ftp server, under: ftp://ftp.postgresql.org/pub/source/v7.3.1 Please report all bugs to [EMAIL PROTECTED] Hmm. For some reason I'm not seeing a 7.3.1 tag in CVS. Do you guys do something else for sub-releases? Case in point: cvs [server aborted]: no such tag REL7_3_1_STABLE It's still early here so I may be suffering from early morning brain rot. ;) There should be a 7.3.1 tag, but you can use the 7_3 branch to pull 7.3.1. Of course, that will shift as we patch for 7.3.2. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] v7.3.1 Bundled and Released ...
On Sun, 2002-12-22 at 13:12, Marc G. Fournier wrote: Last night, we packaged up v7.3.1 of PostgreSQL, our latest stable release. Purely meant to be a bug fix release, this one does have one major change, in that the major number of the libpq library was increased, which means that everyone is encouraged to recompile their clients along with this upgrade. This release can be found on all the mirrors, and on the root ftp server, under: ftp://ftp.postgresql.org/pub/source/v7.3.1 Please report all bugs to [EMAIL PROTECTED] Hmm. For some reason I'm not seeing a 7.3.1 tag in CVS. Do you guys do something else for sub-releases? Case in point: cvs [server aborted]: no such tag REL7_3_1_STABLE It's still early here so I may be suffering from early morning brain rot. ;) Regards, Greg -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] v7.3.1 tar ready ... please check it ...
On Wed, 2002-12-18 at 08:53, Dave Page wrote: -Original Message- From: Marc G. Fournier [mailto:[EMAIL PROTECTED]] Sent: 18 December 2002 14:51 To: Robert Treat Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] v7.3.1 tar ready ... please check it ... On Wed, 18 Dec 2002, Robert Treat wrote: Is this going to be announced to a wider press audience? Has anyone gone over the list of things to do when we release to make sure things like the websites getting updated or perhaps getting rpm builds coordinated has been done? No, we don't do that with minor releases ... nothing has changed that needs to be announced, other then a few bugs fixed ... Maybe we should? The more publicity the better etc... Regards, Dave ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] I agree it should be considered. It helps build mind share, which I think we can all agree is somewhat lacking for PostgreSQL compared to MySQL. It helps build the impression that PostgreSQL doesn't just sit idle between major releases. It allows a potential user base to see PostgreSQL more frequently and build interest. It lets people know that PostgreSQL is constantly being improved. Mind share is a powerful thing and as any advertiser will tell you, press releases are one of the best ways to get the word out. Greg -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Password security question
On Tue, 2002-12-17 at 10:49, mlw wrote: Christopher Kings-Lynne wrote: Hi guys, Just a thought - do we explicitly wipe password strings from RAM after using them? I just read an article (by MS in fact) that illustrates a cute problem. Imagine you memset the password to zeros after using it. There is a good chance that the compiler will simply remove the memset from the object code as it will seem like it can be optimised away... Just wondering... Chris Could you post that link? That seems wrong, an explicit memset certainly changes the operation of the code, and thus should not be optimized away. I'd like to see the link too. I can imagine that it would be possible for it to optimize it away if there wasn't an additional read/write access which followed. In other words, why do what is more or less a no-op if it's never accessed again. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
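That is exactly the scenario: a memset whose buffer is never read again is a dead store, which the optimizer may legally delete. The classic portable workaround is to write the zeros through a volatile pointer so the stores stay observable (platform-specific equivalents such as explicit_bzero and Win32's SecureZeroMemory later appeared for this exact reason). A minimal sketch, with a hypothetical function name:

```c
/* Wipe a password buffer in a way the optimizer cannot prove dead.
 * Writing through a volatile-qualified pointer forces every store
 * to actually happen, unlike a plain memset followed by no reads. */
#include <stddef.h>

void wipe_password(char *buf, size_t len)
{
    volatile char *p = buf;
    while (len--)
        *p++ = '\0';
}
```

A plain `memset(buf, 0, len); free(buf);` sequence is precisely the pattern compilers optimize away, since no conforming program can tell the difference.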
Re: [HACKERS] Password security question
On Tue, 2002-12-17 at 11:11, Ken Hirsch wrote: http://msdn.microsoft.com/library/en-us/dncode/html/secure10102002.asp ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] Thanks. Seems I hit the nail on the head. ;) -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Update on replication
On Tue, 2002-12-17 at 20:55, Neil Conway wrote: On Tue, 2002-12-17 at 21:33, Greg Copeland wrote: I do agree, GBorg needs MUCH higher visibility! I'm just curious: why do we need GBorg at all? Does it offer anything that SourceForge, or a similar service does not offer? Especially given that (a) most other OSS projects don't have a site for related projects (unless you count something like CPAN, which is totally different) (b) GBorg is completely unknown to anyone outside the PostgreSQL community and even to many people within it... Part I can answer, part I can not. Since I'm not the one that pushed the projects to that site, I can't answer that part of the equation. Addressing the part of your question that I think I can, I do like the concept of one-stop-shopping for all PostgreSQL needs. All Things PostgreSQL is a pretty neat concept. Of course, it rather defeats the whole purpose if no one, including potential developers, has any idea it exists. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Big 7.4 items
On Fri, 2002-12-13 at 04:53, Hannu Krosing wrote: On Fri, 2002-12-13 at 06:22, Bruce Momjian wrote: I wanted to outline some of the big items we are looking at for 7.4: Point-In-Time Recovery (PITR) J. R. Nield did a PITR patch late in 7.3 development, and Patrick MacDonald from Red Hat is working on merging it into CVS and adding any missing pieces. Patrick, do you have an ETA on that? How hard would it be to extend PITR for master-slave (hot backup) replication, which should then amount to continuously shipping logs to slave and doing nonstop PITR there :) It will never be usable for multi-master replication, but somehow it feels that for master-slave replication simple log replay would be the most simple and robust solution. I'm curious, what would be the recovery strategy for PITR master-slave replication should the master fail (assuming hot fail over/backup)? A simple dump/restore? Are there any facilities in PostgreSQL for PITR archival which prevent PITR logs from being recycled (or perhaps, simply archive them off)? What about PITR streaming to networked and/or removable media? -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Big 7.4 items
I must have miscommunicated here as you're describing PITR replication. I'm asking about a master failing and the slave picking up. Now, some n-time later, how do you recover your master system to be back in sync with the slave? Obviously, I'm assuming some level of manual recovery. I'm wondering what the general approach would be. Consider that on the slave which is now the active server (master dead), it's possible that the slave's PITR's will be recycled before the master can come back up. As such, unless there is (a) an archival process for PITR or (b) a method of streaming PITR's off for archival, the odds of using PITR to recover the master (resync if you will) seem greatly reduced as you will be unable to replay PITR on the master for synchronization. Greg On Mon, 2002-12-16 at 08:02, Shridhar Daithankar wrote: On Monday 16 December 2002 07:26 pm, you wrote: I'm curious, what would be the recovery strategy for PITR master-slave replication should the master fail (assuming hot fail over/backup)? A simple dump/restore? Are there any facilities in PostgreSQL for PITR archival which prevent PITR logs from being recycled (or perhaps, simply archive them off)? What about PITR streaming to networked and/or removable media? In asynchronous replication, WAL log records are fed to another host, which replays those transactions to sync the data. This way it does not matter if the WAL log is recycled as it is already replicated someplace else.. HTH Shridhar ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Big 7.4 items
On Mon, 2002-12-16 at 08:20, Shridhar Daithankar wrote: On Monday 16 December 2002 07:43 pm, you wrote: Consider that on the slave which is now the active server (master dead), it's possible that the slave's PITR's will be recycled before the master can come back up. As such, unless there is (a) an archival process for PITR or (b) a method of streaming PITR's off for archival, the odds of using PITR to recover the master (resync if you will) seem greatly reduced as you will be unable to replay PITR on the master for synchronization. I agree. Since we are talking about features in a future release, I think it should be added to TODO if not already there. I don't know about WAL numbering but AFAIU, it increments and old files are removed once there are enough WAL files as specified in postgresql.conf. IIRC some perl-based replication projects exist already which use this feature. The problem with this is that most people, AFAICT, are going to size WAL based on their performance/sizing requirements and not based on theoretical estimates which someone might make to allow for a window of failure. That is, I don't believe increasing the number of WAL's is going to satisfactorily address the issue. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
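For what it's worth, the archival hook being asked for here (copying each WAL segment off before it can be recycled, independent of how many segments are configured) is essentially the facility that later shipped in PostgreSQL 8.0 as continuous WAL archiving. As a configuration fragment, the idea looks like this (shown purely to illustrate the concept; it did not exist in the 7.x line under discussion):

```
# postgresql.conf -- WAL archiving as later added in PostgreSQL 8.0.
# %p is the path of the segment to archive, %f its file name; the
# destination directory is hypothetical.
archive_command = 'cp %p /mnt/server/archivedir/%f'
```

Because every segment is archived before recycling, the archive rather than the live WAL window becomes the bound on how far behind a failed master can fall and still be resynced by replay.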
Re: [HACKERS] PQnotifies() in 7.3 broken?
But it's something they should have already had to do. We're just paying late for old sins. ;) Greg On Thu, 2002-12-12 at 23:34, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: OK, so what do we do with 7.3.1. Increment major or minor? Major. I thought you did it already? I did only minor, which I knew was safe. Do folks realize this will require recompile of applications by 7.3 users moving to 7.3.1? That seems very drastic, and there have been very few problem reports about the NOTIFY change. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [INTERFACES] Patch for DBD::Pg pg_relcheck problem
Perhaps compression should be added to the list of protocol changes. This way, we can allow for per packet evaluation for compression. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting On Tue, 2002-12-10 at 21:50, Bruce Momjian wrote: Tom Lane wrote: Ian Barwick [EMAIL PROTECTED] writes: Sounds good to me. Is it on the todo-list? (Couldn't see it there). Probably not; Bruce for some reason has resisted listing protocol change desires as an identifiable TODO category. There are a couple of threads in the pghackers archives over the last year or so that discuss the different things we want to do, though. (Improving the error-reporting framework and fixing the COPY protocol are a couple of biggies I can recall offhand.) Listing protocol changes seemed too low-level for the TODO list, but I have kept the email messages. Today I updated the TODO list and added a section for them. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Auto Vacuum Daemon (again...)
On Fri, 2002-11-29 at 07:19, Shridhar Daithankar wrote: On 29 Nov 2002 at 7:59, Matthew T. O'Connor wrote: On Thursday 28 November 2002 23:26, Shridhar Daithankar wrote: On 28 Nov 2002 at 10:45, Tom Lane wrote: This is almost certainly a bad idea. vacuum is not very processor-intensive, but it is disk-intensive. Multiple vacuums running at once will suck more disk bandwidth than is appropriate for a background operation, no matter how sexy your CPU is. I can't see any reason to allow more than one auto-scheduled vacuum at a time. Hmm.. We would need to take care of that as well.. Not sure what you mean by that, but it sounds like the behaviour of my AVD (having it block until the vacuum command completes) is fine, and perhaps preferable. Right.. But I will still keep the option open for parallel vacuum which is most useful for reusing tuples in shared buffers.. And stale updated tuples are what causes performance drop in my experience.. You know.. just enough rope to hang themselves..;-) Right. This is exactly what I was thinking about. If someone shoots their own foot off, that's their problem. The added flexibility seems well worth it. Greg ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Auto Vacuum Daemon (again...)
On Fri, 2002-11-29 at 06:59, Matthew T. O'Connor wrote: On Thursday 28 November 2002 23:26, Shridhar Daithankar wrote: On 28 Nov 2002 at 10:45, Tom Lane wrote: Matthew T. O'Connor [EMAIL PROTECTED] writes: interesting thought. I think this boils down to how many knobs do we need to put on this system. It might make sense to say allow up to X concurrent vacuums; a 4 processor system might handle 4 concurrent vacuums very well. This is almost certainly a bad idea. vacuum is not very processor-intensive, but it is disk-intensive. Multiple vacuums running at once will suck more disk bandwidth than is appropriate for a background operation, no matter how sexy your CPU is. I can't see any reason to allow more than one auto-scheduled vacuum at a time. Hmm.. We would need to take care of that as well.. Not sure what you mean by that, but it sounds like the behaviour of my AVD (having it block until the vacuum command completes) is fine, and perhaps preferable. I can easily imagine larger systems with multiple CPUs and multiple disk and card bundles to support multiple databases. In this case, I have a hard time figuring out why you'd not want to allow multiple concurrent vacuums. I guess I can understand a recommendation of only allowing a single vacuum, however, should it be mandated that AVD will ONLY be able to perform a single vacuum at a time? Greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] 7.4 Wishlist
On Tue, 2002-12-10 at 09:36, Stephen L. wrote: 6. Compression between client/server interface like in MySQL Mammoth is supposed to be donating their compression efforts back to this project, or so I've been told. I'm not exactly sure of their time-line as I've slept since my last conversation with them. The initial feedback that I've gotten back from them on this subject is that the compression has been working wonderfully for them with excellent results. IIRC, in their last official release, they announced their compression implementation. So, I'd think that it would be available in the 7.4 or 7.5 time frame. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [mail] Re: [HACKERS] 7.4 Wishlist
On Tue, 2002-12-10 at 11:25, Al Sutton wrote: Would it be possible to make compression an optional thing, with the default being off? I'm not sure. You'd have to ask Command Prompt (Mammoth) or wait to see what appears. What I originally had envisioned was a per database and user permission model which would better control use. Since compression can be rather costly for some use cases, I also envisioned it being negotiated where only the user/database combo with permission would be able to turn it on. I do recall that compression negotiation is part of the Mammoth implementation but I don't know if it's a simple capability negotiation or part of a larger scheme. The reason I originally imagined a user/database type approach is because I would think only a subset of a typical installation would need compression. As such, this would help prevent users from arbitrarily chewing up database CPU compressing data because: o datasets are uncompressable or poorly compress o environment cpu is at a premium o is in a bandwidth rich environment I'm in a position that many others may be in where the link between my app server and my database server isn't the bottleneck, and thus any time spent by the CPU performing compression and decompression tasks is CPU time that is in effect wasted. Agreed. This is why I'd *guess* that Mammoth's implementation does not force compression. If a database is handling numerous small queries/updates and the request/response packets are compressed individually, then the overhead of compression and decompression may result in slower performance compared to leaving the request/response packets uncompressed. Again, this is where I'm gray on their exact implementation. It's possible they implemented a compressed stream even though I'm hoping they implemented a per-packet compression scheme (because adaptive compression becomes much more capable and powerful, both algorithmically and logistically). 
An example of this would be to avoid any compression on trivially sized result sets. Again, this is another area where I can imagine some tunable parameters. Just to be on the safe side, I'm cc'ing Josh Drake at Command Prompt (Mammoth) to see what they can offer up on it. Hope you guys don't mind. Greg - Original Message - From: Greg Copeland [EMAIL PROTECTED] To: Stephen L. [EMAIL PROTECTED] Cc: PostgreSQL Hackers Mailing List [EMAIL PROTECTED] Sent: Tuesday, December 10, 2002 4:56 PM Subject: [mail] Re: [HACKERS] 7.4 Wishlist On Tue, 2002-12-10 at 09:36, Stephen L. wrote: 6. Compression between client/server interface like in MySQL Mammoth is supposed to be donating their compression efforts back to this project, or so I've been told. I'm not exactly sure of their time-line as I've slept since my last conversation with them. The initial feedback that I've gotten back from them on this subject is that the compression has been working wonderfully for them with excellent results. IIRC, in their last official release, they announced their compression implementation. So, I'd think that it would be available in the 7.4 or 7.5 time frame. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
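Neither Mammoth's code nor a protocol spec was available on-list, so the following is only an illustrative sketch of the per-packet idea discussed above: each wire packet carries a one-byte flag saying whether its payload is compressed, and the sender compresses only when the payload clears a size threshold and the codec actually wins. A trivial run-length encoder stands in for a real codec such as zlib; the threshold value and function names are invented for the example.

```c
#include <string.h>

#define COMPRESS_THRESHOLD 64   /* don't bother with trivially sized payloads */

/* Toy run-length encoder standing in for a real codec (zlib, etc.).
   Returns the encoded length, or -1 if it will not fit in 'cap'. */
static int rle_encode(const unsigned char *src, int len,
                      unsigned char *dst, int cap)
{
    int out = 0, i = 0;
    while (i < len) {
        int run = 1;
        while (i + run < len && src[i + run] == src[i] && run < 255)
            run++;
        if (out + 2 > cap)
            return -1;
        dst[out++] = (unsigned char) run;
        dst[out++] = src[i];
        i += run;
    }
    return out;
}

/* Build a wire packet: one flag byte plus the payload.  The payload
   goes out compressed only when it clears the size threshold AND the
   codec actually made it smaller -- the adaptive per-packet decision.
   'cap' must be at least len + 1 for the uncompressed fallback. */
static int build_packet(const unsigned char *payload, int len,
                        unsigned char *pkt, int cap)
{
    if (len >= COMPRESS_THRESHOLD && cap > 1) {
        int clen = rle_encode(payload, len, pkt + 1, cap - 1);
        if (clen > 0 && clen < len) {
            pkt[0] = 1;                  /* compressed */
            return 1 + clen;
        }
    }
    pkt[0] = 0;                          /* sent as-is */
    memcpy(pkt + 1, payload, len);
    return 1 + len;
}
```

Because the flag travels with each packet, the receiver needs no per-session state beyond the initial capability negotiation, and small result sets pay only one extra byte.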
Re: [mail] Re: [HACKERS] 7.4 Wishlist
On Tue, 2002-12-10 at 13:38, Bruce Momjian wrote: I haven't heard anything about them contributing it. Doesn't mean it will not happen, just that I haven't heard it. This was in non-mailing-list emails that I was told this by Joshua Drake at Command Prompt. Of course, that doesn't have to mean it will be donated for sure but nonetheless, I was told it will be. Here's a quote from one of the emails. I don't think I'll be too far out of line posting this. On August 9, 2002, Joshua Drake said, One, we plan on releasing this code to the developers after 7.3 comes out. We want to be good members of the community but we have to keep a slight commercial edge (wait till you see what we are going to do to vacuum). Obviously, I don't think that was official speak, so I'm not holding them to the fire; nonetheless, that's what was said. Additional follow-ups did seem to imply that they were very serious about this and REALLY want to play nice as good shared-source citizens. I am not excited about per-db/user compression because of the added complexity of setting it up, and even set up, I can see cases where some queries would want it, and others not. I can see using GUC to control this. If you enable it and the client doesn't support it, it is a no-op. We have per-db and per-user settings, so GUC would allow such control if you wish. I never thought beyond the need itself to what form an actual implementation of this aspect would take. The reason for such a concept would be to simply limit the number of users that can be granted compression. If you have a large user base all using compression or even a small user base where very large result sets are common, I can imagine your database server becoming CPU bound. The database/user thinking was an effort to allow the DBA to better manage the CPU effect. Ideally, it would be a tri-valued parameter, that is ON, OFF, or AUTO, meaning it would determine if there was value in the compression and do it only when it would help.
Yes, that makes sense and was something I had originally envisioned. Simply stated, some installations may never want compression while others may want it for every connection. Beyond that, I believe there needs to be something of a happy medium where a DBA can better control who and what is taking his CPU away (e.g. only that one remote location being fed via ISDN). If GUC can fully satisfy, I certainly won't argue against it. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
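Assuming the eventual knob surfaced as an ordinary GUC variable (the name `client_compression` here is purely hypothetical, as is the tri-valued form), the existing per-database and per-user settings machinery Bruce refers to would let a DBA scope it along exactly the lines discussed:

```sql
-- Hypothetical GUC following the ON/OFF/AUTO idea above.
ALTER DATABASE reports SET client_compression = 'auto';  -- compress only when it helps
ALTER USER remote_user SET client_compression = 'on';    -- e.g. the slow ISDN site
-- everyone else inherits the postgresql.conf default of 'off'
```

This would give the "only that one remote location" control without inventing any new permission model.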
Re: [HACKERS] Auto Vacuum Daemon (again...)
On Tue, 2002-12-10 at 13:09, scott.marlowe wrote: On 10 Dec 2002, Rod Taylor wrote: Perhaps a more appropriate rule would be 1 AVD per tablespace? Since PostgreSQL only has a single tablespace at the moment But PostgreSQL can already place different databases on different data stores. I.e. initlocation and all. If someone was using multiple SCSI cards with multiple JBOD or RAID boxes hanging off of a box, they would have the same thing, effectively, that you are talking about. So, someone out there may well be able to use a multiple-process AVD right now. Imagine m databases on n different drive sets for large production databases. That's right. I always forget about that. So, it seems, regardless of the tablespace effort, we shouldn't be limiting the number of concurrent AVDs. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] PQnotifies() in 7.3 broken?
Seems like a mistake was made. Let's (don't ya love how that sounds like I'm actually involved in the fix? ;) fix it sooner rather than later. Just curious, after a release, how come the numbers are not automatically bumped to ensure this type of thing gets caught sooner rather than later? Is it possible to automate this as part of the build process so that they get grabbed from some version information during the build? Greg On Tue, 2002-12-10 at 17:36, Bruce Momjian wrote: OK, seeing that we don't have a third number, do people want me to increment the interface numbers for 7.3.1, or just wait for the increment in 7.4? --- Peter Eisentraut wrote: Tom Lane writes: It is not real clear to me whether we need a major version bump, rather than a minor one. We *do* need to signal binary incompatibility. Who can clarify the rules here? Strictly speaking, it's platform-dependent, but our shared library code plays a bit of abuse with it. What it comes down to is: If you change or remove an interface, increment the major version number. If you add an interface, increment the minor version number. If you did neither but changed the source code at all, increment the third version number, if we had one. To be thoroughly amused, read the libtool source. Grep for 'version_type'. -- Peter Eisentraut [EMAIL PROTECTED] -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
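For contrast with PostgreSQL's own scheme (the SO_MAJOR_VERSION/SO_MINOR_VERSION pair in Makefile.shlib, which is what "bumping" means here), the libtool rules Peter paraphrases boil down to the following; the target name and numbers below are illustrative only:

```make
# libtool's -version-info CURRENT:REVISION:AGE rules (see the libtool manual):
#   source changed at all                -> REVISION++
#   any interface added/removed/changed  -> CURRENT++, REVISION = 0
#   interfaces only added                -> AGE++
#   interfaces removed or changed        -> AGE = 0
libfoo_la_LDFLAGS = -version-info 3:1:0    # hypothetical numbers
```

Either way, the point stands: an interface change with no bump, as happened with PQnotifies(), is exactly what these rules exist to prevent.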
Re: [HACKERS] [mail] Re: 7.4 Wishlist
This has been brought up a couple of times now. Feel free to search the old archives for more information. IIRC, it would have made the implementation more problematic, or so I think it was said. When I originally brought the topic (compression) up, it was not well received. As such, it may have been thought that additional effort on such an implementation would not be worth the return on a feature which most seemingly didn't see any purpose in supporting in the first place. You need to keep in mind that many simply advocated using a compressing ssh tunnel. Seems views may have changed some since then so it may be worth revisiting. Admittedly, I have no idea what would be required to move the toast data all the way through like that. Any idea? Implementing a compression stream (which seems like what was done for Mammoth) or even packet-level compression were both something that I could comfortably put my arms around in a timely manner. Moving toast data around wasn't. Greg On Tue, 2002-12-10 at 18:45, Kyle wrote: Without getting into too many details, why not send toast data to non-local clients? Seems that would be the big win. The data is already compressed, so the server wouldn't pay cpu time to recompress anything. And since toast data is relatively large anyway, it's the stuff you'd want to compress before putting it on the wire anyway. If this is remotely possible let me know, I might be interested in taking a look at it. -Kyle Bruce Momjian wrote: I am not excited about per-db/user compression because of the added complexity of setting it up, and even set up, I can see cases where some queries would want it, and others not. I can see using GUC to control this. If you enable it and the client doesn't support it, it is a no-op. We have per-db and per-user settings, so GUC would allow such control if you wish.
Ideally, it would be a tri-valued parameter, that is ON, OFF, or AUTO, meaning it would determine if there was value in the compression and do it only when it would help. -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] pgAdmin III (Was: Request for supported platforms)
Since you're using wxWindows, I *HIGHLY* recommend obtaining a license to wxDesigner from http://www.roebling.de/. It allows for very rapid GUI design. It also understands various sizers and makes it SOOO much easier to make use of them. Once you understand sizers, you'll love them but they are somewhat hard to use without a tool to assist. Also, for the about box, I also suggest that you make use of a wxHTML dialog. The cool thing about doing this is that you can create links and even place nice images within the about box. Plus, maintaining the about box can be easily done via an HTML resource file versus having to update code and recompile. If you have questions on wxWindows and/or wxPython, please let me know. I've been a long-time user. Also, if you're willing, I also have some constructive criticism on the code. Since this is obviously going off topic as it relates to PostgreSQL, it probably wouldn't be appropriate to follow up on the mailing list. Best Regards, Greg Copeland On Wed, 2002-10-30 at 02:19, Dave Page wrote: -Original Message- From: Greg Copeland [mailto:greg;copelandconsulting.net] Sent: 30 October 2002 01:08 To: Dave Page Subject: Re: [HACKERS] Request for supported platforms C++? Really? What GUI toolkit is being used? Just curious. wxWindows. The CVS is online if anyone is interested in taking a peek - it's at http://cvs.pgadmin.org/. The code is in the pgadmin3 module. Please bear in mind though, that Mark and I have only been using C++ for about a month (though things are coming along very quickly), so there may be some nasties in our code. Any constructive comments from any of the -hackers would be more than welcome. Regards, Dave. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] idle connection timeout ...
On Fri, 2002-10-25 at 00:52, Marc G. Fournier wrote: Ya, I've thought that one through ... I think what I'm more looking at is some way of 'limiting' persistent connections, where a server opens n connections during a spike, which then sit idle indefinitely since it was one of those 'slashdot effect' kinda spikes ... Is there any way of the 'master process' *safely/accurately* knowing, through the shared memory link, the # of connections currently open to a particular database? So that a limit could be set on a per-db basis, say as an additional arg to pg_hba.conf? Well, if your application is smart enough to know it needs to dynamically add connections, it should also be smart enough to tear them down after some idle period. I agree with Tom. I think that sounds like application domain. Greg ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
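A minimal sketch of the application-side policy being advocated here: stamp each pooled connection with its last-use time and periodically reap anything idle past a limit. The actual libpq handles are elided (the commented PQfinish call marks where they would go); the struct and constants are invented for the example.

```c
#include <stdbool.h>
#include <time.h>

#define POOL_MAX   16
#define IDLE_LIMIT 300          /* seconds before an idle connection is torn down */

struct pooled_conn {
    bool   in_use;              /* slot holds a live connection */
    time_t last_used;           /* stamped on every checkout/checkin */
    /* PGconn *conn;  ...elided for the sketch */
};

/* Tear down (here: just mark free) every connection idle longer than
   IDLE_LIMIT as of 'now'.  Returns how many were reaped. */
static int reap_idle(struct pooled_conn *pool, int n, time_t now)
{
    int reaped = 0;
    for (int i = 0; i < n; i++) {
        if (pool[i].in_use && now - pool[i].last_used > IDLE_LIMIT) {
            /* PQfinish(pool[i].conn);  ...elided */
            pool[i].in_use = false;
            reaped++;
        }
    }
    return reaped;
}
```

Running this from a timer or on each checkout keeps a post-spike pool from pinning backends indefinitely, without the server needing to know anything about it.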
Re: [HACKERS] Memory leaks
On Tue, 2002-10-22 at 22:28, Tom Lane wrote: Greg Copeland [EMAIL PROTECTED] writes: So again, I'm not really sure if they are meaningful at this point. psql might well have some internal leaks; the backend memory-context design doesn't apply to it. Okay. Thanks. I'll probably take another look at it a little later and report back if I find anything of value. Does that mean they are not using palloc/pfree stuff? Not everywhere. plpgsql is full of malloc's and I think the other PL modules are too --- and that's not to mention the allocation policies of the perl, tcl, etc, language interpreters. We could use a thorough review of that whole area. Okay. I've started looking at plpython to better understand its memory needs. I'm seeing a mix of mallocs and PLy_malloc. The PLy version is basically malloc which also checks and reports on memory allocation errors. Anyone know if the cases where malloc was used were purposely done so for performance reasons or simply the flavor of the day? I'm thinking for starters, the plpython module could be normalized to use the PLy_malloc stuff across the board. Then again, I still need to spend some more time on it. ;) Well, the thing that really got my attention is that dmalloc is reporting frees on null pointers. AFAIK that would dump core on many platforms (it sure does here...), so I don't think I believe it without seeing chapter and verse. I actually expected it to do that here on my x86-Linux platform but a quick check showed that it was quite happy with it. What platforms are you using -- just curious? But if you can point out where it's really happening, then we must fix it. I'll try to track this down some more this coming week to see if this is really occurring. After thinking about it, I'm not sure why dmalloc would ever report a free on null if it were not actually occurring. After all, any call to free should still be showing some memory address (valid or otherwise).
Off the top of my head, I can't think of an artifact that would cause it to falsely report it. Greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Memory leaks
On Wed, 2002-10-23 at 08:48, Tom Lane wrote: Greg Copeland [EMAIL PROTECTED] writes: Okay. I've started looking at plpython to better understand its memory needs. I'm seeing a mix of mallocs and PLy_malloc. The PLy version is basically malloc which also checks and reports on memory allocation errors. Anyone know if the cases where malloc was used were purposely done so for performance reasons or simply the flavor of the day? Probably either oversight or the result of different people's different coding styles. My local copy has this changed to PLy stuff now. Testing shows it's good...then again, I didn't really expect it to change anything. I'll submit patches later. I'm thinking for starters, the plpython module could be normalized to use the PLy_malloc stuff across the board. Then again, I still need to spend some more time on it. ;) Consistency is good. What I'd wonder about, though, is whether you shouldn't be using palloc ;-). malloc, with or without a PLy_ wrapper, doesn't provide any leverage to help you get rid of stuff when you don't want it anymore. Ya, I'm currently looking to see how the memory is being used and why. I'm trying to better understand its life cycle. You're implying that even the short-term memory should be using the palloc stuff? What about long term? Blanket statement that pretty much all the PLy stuff should really be using palloc? Well, the thing that really got my attention is that dmalloc is reporting frees on null pointers. AFAIK that would dump core on many platforms (it sure does here...), I have to take that back: I was thinking about pfree() not free(). The ANSI C spec says that free(NULL) is a legal no-op, and there are Oh really. I didn't realize that. I've been using the if( ptr ) stuff for so long I didn't realize I didn't need to anymore. Thanks for the update. That was, of course, the cause for alarm. It's probably pointless to try to convince people to change that coding style.
Well at this late time, I think it's safe to say that it's not causing problems for anyone on any of the supported platforms. So I'll not waste time looking for it even though I happen to think it's a poor practice just the same. Thanks, Greg ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
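Tom's point is easy to demonstrate: free(NULL) has been a defined no-op since C89 (§7.10.3.2, carried forward ever since), so the defensive `if (ptr)` guard is purely stylistic. A tiny helper (invented for the example) showing both halves of the rule:

```c
#include <stdlib.h>

/* free(NULL) is a defined no-op in ANSI C, so the common
   'if (ptr) free(ptr)' guard is redundant.  Nulling the pointer
   afterwards guards against double-free instead. */
static void release(char **p)
{
    free(*p);       /* safe even when *p == NULL */
    *p = NULL;
}
```

Note the asymmetry: checking malloc's return is still mandatory; guarding free is not. This is also why dmalloc logging a free-on-NULL is an observation about style, not a crash report.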
Re: [HACKERS] PREPARE / EXECUTE
If you were using them that frequently, couldn't you just keep a persistent connection? If it's not used that often, wouldn't the overhead of preparing the query following a new connection become noise? Greg On Wed, 2002-10-23 at 09:24, Hans-Jürgen Schönig wrote: First of all PREPARE/EXECUTE is a wonderful thing to speed up things significantly. I wonder if there is a way to store a parsed/rewritten/planned query in a table so that it can be loaded again. This might be useful when it comes to VERY complex queries (> 10 tables). In many applications the situation is like that: a. The user connects to the database. b. The user sends various different queries to the server (some might be the same) c. The user disconnects. If there was a way to store execution plans in a table the user could load the execution plans of the most time-consuming stuff into the backend without parsing and optimizing it every time he authenticates. Does it sound useful to anybody? Is it possible to do it or are there some technical problems? Maybe this is worth thinking about. Hans -- *Cybertec Geschwinde u Schoenig* Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/1/913 68 09; +43/664/233 90 75 www.postgresql.at http://www.postgresql.at, cluster.postgresql.at http://cluster.postgresql.at, www.cybertec.at http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
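For reference, the feature under discussion looks like this at the SQL level (table and statement names invented for the example); the prepared statement lives only for the life of the session, which is exactly why a persistent or pooled connection matters:

```sql
PREPARE orders_by_cust (integer) AS
    SELECT o.id, o.total
      FROM orders o
      JOIN customers c ON c.id = o.customer_id
     WHERE c.id = $1;

EXECUTE orders_by_cust(42);    -- planned once, executed many times
EXECUTE orders_by_cust(43);

DEALLOCATE orders_by_cust;     -- or just disconnect; the plan dies with the session
```

The planning cost is paid on the PREPARE, so the amortization only works if the connection outlives many EXECUTEs.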
Re: [HACKERS] PREPARE / EXECUTE
Could you use some form of connection proxy where the proxy is actually keeping persistent connections but your application is making transient connections to the proxy? I believe this would result in the desired performance boost and behavior. Now, the next obvious question...anyone know of any proxy apps available for postgresql? Regards, Greg On Wed, 2002-10-23 at 11:04, Hans-Jürgen Schönig wrote: The idea is not to have it across multiple backends and having it in sync with the tables in the database. This is not the point. My problem is that I have seen many performance-critical applications sending just a few complex queries to the server. The problem is: If you have many queries where the ratio of planner time to executor time is very high (e.g. complex joins with just one value as the result). These applications stay the same for a long time (maybe even years) and so there is no need to worry about new tables and so forth - maybe there is not even a need to worry about new data. In these cases we could speed up the database significantly just by avoiding the use of the planner: An example: I have a join across 10 tables + 2 subselects across 4 tables on the machine I use for testing: planner: 12 seconds executor: 1 second The application will stay the same forever. It could be 10 times faster if there was a way to load the execution plan into the backend. There is no way to use a persistent connection (many clients on different machines, dynamic IPs, etc. ...) There is no way to have an invalid execution plan because there are no changes (new tables etc.) in the database. Also: If people execute a prepared query and it fails they will know why - queries will fail if people drop a table even if these queries are not prepared. A new feature like the one we are discussing might be used rarely but if people use it they will benefit A LOT. If we had a simple ASCII interface to load the stuff into the planner people could save MANY cycles.
When talking about tuning it is nice to gain 10% or even 20% but in many cases it does not solve a problem - if a problem can be reduced by 90% it is a REAL gain. Gaining 10% can be done by tweaking the database a little - gaining 1000% cannot be done so it might be worth thinking about it even if the feature is only used by 20% of those users out there. 20% of all postgres users is most likely more than 15,000 people. Again; it is not supposed to be an every-day solution. It is a solution for applications staying the same for a very long time. Hans Tom Lane wrote: Hans-Jürgen Schönig [EMAIL PROTECTED] writes: I wonder if there is a way to store a parsed/rewritten/planned query in a table so that it can be loaded again. The original version of the PREPARE patch used a shared-across-backends cache for PREPAREd statements. We rejected that for a number of reasons, one being the increased difficulty of keeping such a cache up to date. I think actually storing the plans on disk would have all the same problems, but worse. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED]) -- *Cybertec Geschwinde u Schoenig* Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/1/913 68 09; +43/664/233 90 75 www.postgresql.at http://www.postgresql.at, cluster.postgresql.at http://cluster.postgresql.at, www.cybertec.at http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Memory leaks
I've started playing a little with Postgres to determine if there were memory leaks running around. After some very brief checking, I'm starting[1] to think that the answer is yes. Has anyone already gone through a significant effort to locate and eradicate memory leaks? Is this done periodically? If so, what tools are others using? I'm currently using dmalloc for my curiosity. [1] Not sure yet as I'm really wanting to find cumulative leaks rather than one-shot allocations which are simply never freed prior to process termination. Regards, Greg Copeland ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Memory leaks
On Tue, 2002-10-22 at 17:09, Tom Lane wrote: Greg Copeland [EMAIL PROTECTED] writes: I've started playing a little with Postgres to determine if there were memory leaks running around. After some very brief checking, I'm starting[1] to think that the answer is yes. Has anyone already gone through a significant effort to locate and eradicate memory leaks? Yes, this has been dealt with before. What tools, aside from noggin v1.0, did they use? Do we know? Have you read src/backend/utils/mmgr/README? Yes...but it's been some time since I last opened it. It was because I knew there were some caveats that I wasn't attempting to claim for sure that there were leaks. I then moved on to psql, again, just for fun. Here, I'm thinking that I started to find some other leaks...but again, I've not spent any real time on it. So again, I'm not really sure if they are meaningful at this point. AFAIK the major problems these days are not in the backend as a whole, but in the lesser-used PL language modules (cf recent discussions). Ya, that's what made me wonder about the topic on a larger scale. plpgsql has some issues too, I suspect, but not as bad as pltcl etc. Possibly the best answer is to integrate the memory-context notion into those modules; if they did most of their work in a temp context that could be freed once per PL statement or so, the problems would pretty much go away. Interesting. Having not looked at memory management schemes used in the pl implementations, can you enlighten me by what you mean by integrate the memory-context notion? Does that mean they are not using palloc/pfree stuff? It's fairly difficult to get anywhere with standard leak-tracking tools, since they don't know anything about palloc. What's worse, it is *not* a bug for a routine to palloc space it never pfrees, if it knows that it's palloc'ing in a short-lived memory context.
The fact that a context may be released with much still-allocated memory in it is not a bug but a feature; but try teaching that to any standard leak checker... regards, tom lane Well, the thing that really got my attention is that dmalloc is reporting frees on null pointers. While that may be safe on specific platforms, IIRC, it's actually undefined. Again, this is as reported by dmalloc so I've yet to actually track it down to determine if it's possibly something else going on (magic voodoo of some heap manager, etc). Greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
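For anyone following along without having read `mmgr/README`: the backend's palloc contexts amount to region allocation. A toy version (nothing like the real allocator in the backend, and with names invented for the sketch) shows why "still-allocated memory at context teardown" is a feature rather than a leak -- everything goes away in one sweep:

```c
#include <stdlib.h>
#include <stddef.h>

/* Toy region allocator: individual allocations are never freed one by
   one; destroying the region releases everything at once, the way
   resetting a short-lived palloc context does. */
struct region {
    struct chunk { struct chunk *next; } *chunks;
};

static void *region_alloc(struct region *r, size_t sz)
{
    /* One header per allocation so teardown can walk the list. */
    struct chunk *c = malloc(sizeof(struct chunk) + sz);
    if (c == NULL)
        return NULL;
    c->next = r->chunks;
    r->chunks = c;
    return c + 1;               /* caller's memory starts past the header */
}

static void region_destroy(struct region *r)
{
    while (r->chunks) {
        struct chunk *next = r->chunks->next;
        free(r->chunks);
        r->chunks = next;
    }
}
```

This is exactly what confuses standard leak trackers: at `region_destroy` time, nothing was ever individually freed, yet nothing leaked. It's also the shape of the fix Tom suggests for the PL modules: do per-statement work in a region like this instead of bare malloc.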
Re: [HACKERS] Postgresql and multithreading
On Thu, 2002-10-17 at 22:20, Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Let me add one more thing on this thread. This is one email in a long list of Oh, gee, you aren't using that wizz-bang new sync/thread/aio/raid/raw feature discussion where someone shows up and wants to know why. Does anyone know how to address these, efficiently? Simple: respond to 'em all with a one-line answer: convince us why we should use it. The burden of proof always seems to fall on the wrong end in these discussions. regards, tom lane That may be easier said than done. If you don't know what the objections are, it's hard to argue your case. If you do know and understand the objections, chances are you already know the code very well and/or have followed the mailing lists for a very long time. This basically means you don't want to hear from anyone unless they are one with the code. That seems and sounds very anti-open-source. After it's all said and done, I think you guys are barking up the wrong tree. Open Source is all about sharing ideas. Many times I've seen ideas expressed here that were not exact hits yet helped facilitate discussion, understanding of the topics in general, and in some cases may even spur other ideas or associated code fixes/improvements. When I first started on this list, I was scolded rather harshly for not asking all of my questions on the list. Originally, I was told to ask reasonable questions so that everyone can learn. Now, it seems, that people don't want to answer questions at all as it's bothering the developers. Commonly asked items, such as threading, seem like they are being addressed rather well without core developer participation. Right now, I'm not seeing any down sides to what's currently in place. If the core developers still feel like they are spending more time than they like, then perhaps those following the mailing list can step forward a little more to address general questions and defer when needed.
The topic, such as threading, was previously addressed yet people still followed up on the topic. Perhaps those that don't want to be bothered should allow more time for others to address the topic and leave it alone once it has been addressed. That alone seems like it would be a huge time saver for the developers and a better use of resources. Greg
Re: [HACKERS] Postgresql and multithreading
On Wed, 2002-10-16 at 01:27, Shridhar Daithankar wrote: Well, slow adoption rate is attributed to 'apache 1.3.x is good enough for us' syndrome, as far as I can see from news. Once linux distros start shipping with apache 2.x series *only*, the upgrade cycle will start rolling, I guess. I think that's part of it. I think the other part is that by the time you're getting to huge r/s numbers, typical web site bandwidth is already used up. So, what's the point in adding more breathing room when you don't have the bandwidth to use it anyways. Greg
Re: [HACKERS] Vacuum improvement
On Wed, 2002-10-16 at 02:29, Gavin Sherry wrote: On 16 Oct 2002, Hannu Krosing wrote: On Wed, 2002-10-16 at 05:22, Gavin Sherry wrote: Hi all, I'm thinking that there is an improvement to vacuum which could be made for 7.4. VACUUM FULLing large, heavily updated tables is a pain. There's very little an application can do to minimise dead tuples, particularly if the table is randomly updated. Wouldn't it be beneficial if VACUUM could have a parameter which specified how much of the table is vacuumed. That is, you could specify: VACUUM FULL test 20 percent; What about VACUUM FULL test WORK 5 SLEEP 50; meaning to VACUUM FULL the whole table, but to work in small chunks and release all locks and let others access the tables between these? Great idea. I think this could work as a complement to the idea I had. To answer Tom's question, how would we know what we've vacuumed, we could store the range of tids we've vacuumed in pg_class. Or, we could store the block offset of where we left off vacuuming before and, using stats, run for another X% of the heap. Is this possible? Why couldn't you start your % from the first rotten/dead tuple? Just reading through trying to find the first tuple to start counting from wouldn't hold locks would it? That keeps you from having to track stats and ensures that X% of the tuples will be vacuumed. Greg
Re: [HACKERS] Vacuum improvement
But doesn't the solution I offer present a possible work-around? The table wouldn't need to be locked (I think) until the first dead tuple were located. After that, you would only keep the locks until you've scanned X% of the table and shrunk as needed. The result, I think, is incremental vacuuming with shorter-duration locks being held. It's not ideal (locks are still taken) but it may shorten how long they're held. I'm trying to figure out if the two approaches can't be combined somehow. That is, a percent with maybe even a max lock duration? Greg On Wed, 2002-10-16 at 11:33, David Walker wrote: Vacuum full locks the whole table currently. I was thinking that if you used something similar to a hard drive defragment, only 2 rows would need to be locked at a time. When you're done vacuum/defragmenting you shorten the file to discard the dead tuples that are located after your useful data. There might be a need to lock the table for a little while at the end but it seems like you could reduce that time greatly. I had one table that is heavily updated and it grew to 760 MB even with regular vacuuming. A vacuum full reduced it to 1.1 MB. I am running 7.2.0 (all my vacuuming is done by superuser). On Wednesday 16 October 2002 09:30 am, (Via wrote: On Wed, 2002-10-16 at 02:29, Gavin Sherry wrote: On 16 Oct 2002, Hannu Krosing wrote: On Wed, 2002-10-16 at 05:22, Gavin Sherry wrote: Hi all, I'm thinking that there is an improvement to vacuum which could be made for 7.4. VACUUM FULLing large, heavily updated tables is a pain. There's very little an application can do to minimise dead tuples, particularly if the table is randomly updated. Wouldn't it be beneficial if VACUUM could have a parameter which specified how much of the table is vacuumed.
That is, you could specify: VACUUM FULL test 20 percent; What about VACUUM FULL test WORK 5 SLEEP 50; meaning to VACUUM FULL the whole table, but to work in small chunks and release all locks and let others access the tables between these? Great idea. I think this could work as a complement to the idea I had. To answer Tom's question, how would we know what we've vacuumed, we could store the range of tids we've vacuumed in pg_class. Or, we could store the block offset of where we left off vacuuming before and, using stats, run for another X% of the heap. Is this possible? Why couldn't you start your % from the first rotten/dead tuple? Just reading through trying to find the first tuple to start counting from wouldn't hold locks would it? That keeps you from having to track stats and ensures that X% of the tuples will be vacuumed. Greg ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vacuum improvement
That's a good idea. That way, if your database slows during specific windows in time, you can vacuum larger sizes, etc. Seemingly this would help you better manage your vacuuming against system load.

Greg

On Tue, 2002-10-15 at 19:22, Gavin Sherry wrote: Hi all, I'm thinking that there is an improvement to vacuum which could be made for 7.4. VACUUM FULLing large, heavily updated tables is a pain. There's very little an application can do to minimise dead tuples, particularly if the table is randomly updated. Wouldn't it be beneficial if VACUUM could have a parameter which specified how much of the table is vacuumed? That is, you could specify: VACUUM FULL test 20 percent; Yes, terrible syntax, but regardless: this would mean that we could spread the vacuum out and not, possibly, be backing up queues. ANALYZE could be modified, if necessary. Thoughts? Gavin
Re: [HACKERS] MySQL vs PostgreSQL.
On Fri, 2002-10-11 at 08:20, Antti Haapala wrote: Quoted from one page: Because we couldn't get vacuum() to work reliable with PostgreSQL 7.1.1, I have little respect for the MySQL advocacy guys. They purposely spread misinformation. They always compare their leading-edge alpha software against Postgres' stable versions that are a year or more old. In some cases, I've seen them compare their alpha (4.x) software against 7.0. Very sad that these people can't even attempt to be honest. In the case above, since they are comparing 4.x, they should be comparing it to 7.x at least.

It's also very sad that their testers don't seem to even understand something as simple as cron. If they can't understand something as simple as cron, I fear any conclusions they may arrive at throughout their testing are destined to be incorrect/invalid.

MySQL supports data compression between front and back ends. This could be easily implemented, or is it already supported? Mammoth has such a feature... or at least it's been in development for a while. If I understood them correctly, it will be donated back to core sometime in the 7.5 or 7.7 series. Last I heard, their results were absolutely wonderful.

I think all the other statements were misleading in the sense that they compared their newest product with PostgreSQL 7.1.1. Ya, historically, they go out of their way to ensure unfair comparisons. I have no respect for them. They could be provided one... ;-) In other words, they need a list of features that they can one day hope to add to MySQL.

Upgrading MySQL Server is painless. When you are upgrading MySQL Server, you don't need to dump/restore your data, as you have to do with most PostgreSQL upgrades. Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my linux box. Of course PostgreSQL isn't yet as fast as it could be. ;) I consider this par for the course. This is something I've had to do with Sybase, Oracle and MSSQL.
Greg
Re: [HACKERS] Peer to peer replication of Postgresql databases
Well, not scalable doesn't have to mean not good. That's why I asked. Considering this is one of the problems with mosix clusters (process migration and associated restrictions) and the nature of PostgreSQL's implementation, I'm not sure what other result may have been expected. Because of that, I wasn't sure if something else was being implied.

Greg

On Fri, 2002-10-11 at 08:40, Shridhar Daithankar wrote: On 11 Oct 2002 at 8:30, Greg Copeland wrote: I'd be curious to hear in a little more detail what constitutes not good for postgres on a mosix cluster. On Fri, 2002-10-11 at 06:15, Anuradha Ratnaweera wrote: On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote: Have already tested postgres on a mosix cluster, and as expected results are not good. (although mosix does the correct thing in keeping all the database backend processes on one node). Well, I guess in the kind of replication we are talking about here, performance will be enhanced only if separate instances of PostgreSQL run on separate machines. Now if the mosix kernel applies some AI and puts all of them on the same machine, it isn't going to be any good for the purpose replication is deployed for. I guess that's what she meant.. Bye Shridhar -- User n.: A programmer who will believe anything you tell him.
Re: [HACKERS] Peer to peer replication of Postgresql databases
I'd be curious to hear in a little more detail what constitutes not good for postgres on a mosix cluster.

Greg

On Fri, 2002-10-11 at 06:15, Anuradha Ratnaweera wrote: On Fri, Oct 11, 2002 at 04:29:53PM +0530, Shridhar Daithankar wrote: Well, I don't think adding support for multiple slaves to usogres would be that problematic. Of course, if you want to load balance your application queries, the application has to be aware of that. I will not send requests to a mosix cluster anyway. Have already tested postgres on a mosix cluster, and as expected results are not good. (although mosix does the correct thing in keeping all the database backend processes on one node). Anuradha -- Debian GNU/Linux (kernel 2.4.18-xfs-1.1) Remember: Silly is a state of Mind, Stupid is a way of Life. -- Dave Butler
Re: [HACKERS] Bison 1.50 was released
Can we please hold off until bison 1.50 becomes a de facto standard? It will be a matter of weeks before distros offer it as an upgrade package, let alone months before they offer it as standard. Seems like these changes are ideal for a release after next (7.5/7.6), as enough time will have gone by for it to be much more commonly found. By not jumping on the wagon now, it will also allow more time for bugs in the wild to be caught and fixed before we force it onto the masses.

Greg

On Thu, 2002-10-10 at 02:05, Michael Meskes wrote: Hi, I just learned that bison 1.50 was released on Oct. 5th and it indeed compiles ecpg just nicely on my machine. Could we please install this on our main machine and merge the ecpg.big branch back into main? Michael -- Michael Meskes [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Re: [HACKERS] Bison 1.50 was released
Oh, that's right. I had forgotten that it wasn't for general PostgreSQL use. Since it's an ecpg-only deal, I guess I'll remove my objection.

Greg

On Thu, 2002-10-10 at 09:18, Tom Lane wrote: Greg Copeland [EMAIL PROTECTED] writes: Can we please hold off until bison 1.50 becomes a de facto standard? We don't have a whole lot of choice, unless you prefer releasing a broken or crippled ecpg with 7.3. In practice this only affects people who pull sources from CVS, anyway. If you use a tarball then you'll get prebuilt bison output. regards, tom lane
Re: [HACKERS] Analysis of ganged WAL writes
On Tue, 2002-10-08 at 04:15, Zeugswetter Andreas SB SD wrote: Can the magic be that kaio directly writes from user space memory to the disk? Since in your case all transactions A-E want the same buffer written, the memory (not its content) will also be the same. This would automatically write the latest possible version of our WAL buffer to disk.

*Some* implementations allow for zero-copy aio. That is a savings. On heavily used systems, it can be a large savings.

The problem I can see offhand is how the kaio system can tell which transaction can be safely notified of the write, or whether the programmer is actually responsible for not changing the buffer until notified of completion?

That's correct. The programmer can not change the buffer contents until notification has completed for that outstanding aio operation. To do otherwise results in undefined behavior. Since some systems do allow for zero-copy aio operations, requiring that the buffers not be modified once queued makes a lot of sense. Of course, even on systems that don't support zero-copy, changing the buffered data prior to write completion just seems like a bad idea to me.

Here's a quote from SGI's aio_write man page: If the buffer pointed to by aiocbp->aio_buf or the control block pointed to by aiocbp changes or becomes an illegal address prior to asynchronous I/O completion then the behavior is undefined. Simultaneous synchronous operations using the same aiocbp produce undefined results.

And on SunOS we have: The aiocbp argument points to an aiocb structure. If the buffer pointed to by aiocbp->aio_buf or the control block pointed to by aiocbp becomes an illegal address prior to asynchronous I/O completion, then the behavior is undefined. and For any system action that changes the process memory space while an asynchronous I/O is outstanding to the address range being changed, the result of that action is undefined.

Greg
Re: [HACKERS] Dirty Buffer Writing [was Proposed LogWriter Scheme]
Bruce, Are there remarks along these lines in the performance tuning section of the docs? Based on what's coming out of this, it would seem that stressing the importance of leaving a notable amount of RAM (rule of thumb here?) for general OS/kernel needs is pretty important.

Greg

On Tue, 2002-10-08 at 09:50, Tom Lane wrote: (This is, BTW, one of the reasons for discouraging people from pushing Postgres' shared buffer cache up to a large fraction of total RAM; starving the kernel of disk buffers is just plain not a good idea.)
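As a hedged illustration of Tom's point for a 7.x-era postgresql.conf (the number is made up for the example, not a recommendation): shared_buffers is measured in 8 kB pages there, so keeping it modest leaves the bulk of RAM to the kernel's own disk cache.

```
# Illustrative only: 8192 pages x 8 kB = 64 MB of shared buffers,
# deliberately a small fraction of total RAM so the kernel's disk
# cache is not starved.
shared_buffers = 8192
```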