[GENERAL] Still more pg_clog errors
In reference to the pg_clog errors I'm having, I am still looking for tips or help. Here's the info again: ERROR: could not access status of transaction 143934068 DETAIL: could not open file /usr/local/pgsql/data/pg_clog/0089: No such file or directory Now, despite creating an 8k file of zeros (or a 256k file of zeros to match the others in the directory), when I start the database and try a vacuum, I get the same message, and upon inspection, the file is gone - as if Postgres is removing the file. Any tips or help on what I can do? steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[GENERAL] error opening pg_clog file
On Saturday, our database machine locked up hard on us due to some faulty hardware. Since then, we have been getting messages like this: ERROR: could not access status of transaction 143934068 DETAIL: could not open file /usr/local/pgsql/data/pg_clog/0089: No such file or directory Looking in /usr/local/pgsql/data/pg_clog, the 0089 file isn't there. Some investigation revealed these messages: http://archives.postgresql.org/pgsql-hackers/2004-01/msg00534.php http://www.mail-archive.com/[EMAIL PROTECTED]/msg13874.html So, I did dd if=/dev/zero of=/usr/local/pgsql/data/pg_clog/0089 bs=8k count=1. I did an ls to verify that the file existed. I started the postmaster back up, tried a VACUUM, and got: vacuumdb: vacuuming of database hyperseek failed: ERROR: could not access status of transaction 144565028 DETAIL: could not open file /usr/local/pgsql/data/pg_clog/0089: No such file or directory I looked, and the 0089 file was gone again. Is there anything I can do to save the situation? (The PG version is 7.4.2) steve wolfe ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Opteron scaling with PostgreSQL
I would very much like to see the same test with Fsync on. A test that does not reflect real-world use has less value than one that just shows how fast it can go. For a read-only database, fsync could be turned off. For any other system it would be hair-brained and nobody in their right mind would do it. Then I must not be in my right mind. : ) Before I explain why *I* run with fsync turned off, the main reason the tests were done without fsync was to test the scalability of the Opteron platform, not the scalability of my disk subsystem. = ) I've run with fsync off on my production servers for years. Power never goes off, and RAID 5 protects me from disk failures. Sooner or later, it may bite me in the butt. We make backups sufficiently often that the small amount of data we'll lose will be far offset by the tremendous performance boost that we've enjoyed. In fact, we even have a backup server sitting there doing nothing, which can take over the duties of the main DB server within a VERY short amount of time. steve ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] load testing
I wanted to do some quick load testing on a postgres database. Does anyone have any tips on how to go about doing this? Thanks much. Sure. Test after the manner in which the database is normally used, and with real-world data. I've seen far too many people benchmark a database system by opening a single connection, and issuing a number of queries. However, it's more common for a database server to be handling multiple queries at the same time. Also, I've seen far too many people use contrived test data and contrived queries. However, the nature of queries may be very different from the actual queries you run. Test with what you use! For my own benchmarking, I usually log ~10,000 queries from our production server, and start a Perl script that I whipped up quickly. It will split the queries into chunks, and test with 1 through 10 simultaneous connections, and that's been a very good indicator of how the machine in question will behave once it's put into production. steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Question on Opteron performance
The only time I've seen high cpu and memory bandwidth load with near-zero i/o load like you describe was on Oracle and it turned out to be an sql optimization problem. What caused it was a moderate but not very large table on which a very frequent query was doing a full table scan (= sequential scan). The entire table was easily kept in cache, but it was large enough that merely scanning every block of it in the cache consumed a lot of cpu and memory bandwidth. I don't remember how large, but something on the order of a few thousand records. Every so often, I log all queries that are issued, and on a seperate machine, I EXPLAIN them and store the results in a database, so I can do analysis on them. Each time, we look at what's using the greatest amount of resources, and attack that. Believe me, the low-hanging fruit like using indexes instead of sequential scans were eliminated years ago. : ) Over the past four years, our traffic has increased, on average, about 90% per year. We've also incorporated far more sources of data into our model, and come up with far more ways to use the data. When you're talking about exponential traffic growth combined with exponential data complexity, it doesn't take long before you start hitting limits! Before I shell out the $15k on the 4-way Opteron, I'm going to spend some long, hard time looking for ways to make the system more efficient. However, after all that's already been done, I'm not optimistic that it's going to preclude needing the new server. I'm just surprised that nobody seems to have used PostgreSQL on a quad-Opteron before! steve ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [GENERAL] How to make a REALLY FAST db server?
What I said: SCSI disks cost a lot more than comparable IDE disks. What you said: No, because I found some cheap SCSI disks that don't have comparable IDE models. That's not what I said. If you're going to quote me, get it right. My statement only operates in the domain where there ARE comparable models. You say that SCSI disks are more expensive than comparable IDE disks. I say that it's usually impossible to find an IDE disk that truly *is* comparable to a SCSI disk - and when you do, you're generally comparing the most recent IDE disk to an older SCSI disk. If nothing but peak transfer rate is all that matters to you in a hard drive, then yes, you can find comparable IDE and SCSI disks - but if that's all that matters, you're being pretty short-sighted. And to compare nothing but drive features when talking about your disk I/O subsytem is also very short-sighted. steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Reading the database in RAM
Is there any way of reading the whole database into memory? By changing the number of shared buffers the first query still takes long time and the subsequent similar queries are fast. If there is way of pinning the database in the memory all similar queries will take the same time. My database is around 200MB and could easily be pinned into RAM. If you have enough RAM, your OS will cache the physical files as they are used. I'm not sure how much of a speedup would be gained by PG keeping the data in RAM and doing the caching itself, but my guess is that it wouldn't be that great of a speedup over simply having the data in the disk cache. steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] How to make a REALLY FAST db server?
(As an aside, one person was in a heated argument about how much cheaper IDE was than SCSI. I got on pricewatch, found some prices, and would have been able to put together a very fast SCSI system for the same price as his IDE array.) That's nuts: SCSI disks cost a lot more than comparable IDE disks. But it's true. For cutting-edge SCSI disks, the price is quite high. If you look, though, you can find places trying to get rid of last-year's model for very low prices, and you can sometimes find very good performers like that. I picked up some 10K IBM drives with 4 ms access times for something like $125 or $150 each. If you'd like to find comparable IDE drives for that price, you're out of luck, as IDE doesn't have a 160 MB/sec bus, and I don't think you'll find any IDE drives with that low of access times anywhere. (Not to mention the fact that IDE drives only do well when a single process is accessing them. SCSI, having been designed from the gound up for this sort of thing, does much better when you're hitting the disks from several places at once.) steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to make a REALLY FAST db server?
Tunning is somewhat of a black art to get the right balance. If you have to make a choice, buy fewer processors, faster disks, and as much RAM as the board will handle. Wow. I'd buy more RAM and processors, and maybe skimp a *little* on the disks. The RAID array in my machine is made up of some 9-gig Quantum Atlas 4's that I bought for something like $80 each, far from being state-of the art: But with all of the RAM, the disk lights only blink *occasionally*. The number of drives makes up for the lower transfer rate of each individual unit, and the disk cache (both on the controller and from the kernel) make up for the latency. steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to make a REALLY FAST db server?
I'd probably go for a 3ware RAID instead... IDE disks are so much cheaper nowadays than SCSI, and the premium isn't really justifiable anymore. Having used IDE and SCSI disks, when I'm serious about performance, IDE doesn't even enter my mind. Also, over on the XFS list, there are a few people that have been using 3ware cards, and it sounds like there are still some serious caveats/bugs to them. Myself, I much prefer a good SCSI RAID card that's going to work, and going to work well. (As an aside, one person was in a heated argument about how much cheaper IDE was than SCSI. I got on pricewatch, found some prices, and would have been able to put together a very fast SCSI system for the same price as his IDE array.) steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] How to make a REALLY FAST db server?
I'm not sure how much a 2+ way system will help. Most of the db work will be done in one long serialized processes and those can't be spread out over multiple processors (with current day postgresql). That's assuming that only one query will ever be executed at once. As a new backend is spawned for each connection, extra CPU's are very helpful if the database will see more than occasional use. Also, even if there's only one query, dual-CPU machines are generally much more responsive, especially under load, as one CPU can be handling interrupts, kernel code, and other system processes while the other sits there doing your task. To really eek out as much speed as possible here, you'll want 10k RPM Ultra-160 Fibre Channel SCSI drives with a dedicated hardware raid controller. If have more reads than writes, you may want to use Raid 5 instead. Why 5? 1+0 is far better and faster. I was planning on doing a hardware RAID controller (just need to find the one that FBSD likes the best). If you have enough RAM, disk speed isn't terribly important, so RAID 5 gives you the redundancy without as many disks. Throw in an extra gig of RAM for your disk cache, turn of fsync(), and you're likely to see a lot bigger speed-up than any disk upgrade will give you. There are cases where that isn't the case (such as updating every row in a multi-gigabyte table), but unless you're involved in those specialized cases, it's not as important. So, why did I say that I don't use IDE for high-performance machines? IDE has limitations. For example, say I wanted 10 drives in my array. Finding a 5-channel IDE RAID controller is probably not as easy (and not as cheap) as a dual-channel SCSI RAID controller. Also, SCSI buses are much better about sharing bandwidth than IDE, as IDE doesn't have some of the nifty features that SCSI does. And to go one further, hot-swappable SCA bays are pretty common in server chassis. I simply plugged the RAID controller into the SCA backplanes, and was done. 
Had I gone IDE, there would have been additional cost in obtaining the hot-swap IDE bays. As an aside, if you do go with a 3ware card, you might NOT want to use RAID 5. The processors on the card are not up to the computing demands of RAID 5, you might want to take a look at: http://marc.theaimsgroup.com/?l=linux-xfsm=99970690219042w=2 No reason not to go 2GB. Ram is cheap these days, and you can always increase shared buffers and caches to actually fill the server memory up with as much quick-fetch info as possible. But then why not 4G? I would love some real numbers rather than 'a lot'. With oracle, you can plug in some numbers and a real extimate will be spit out. I've worked with DB servers w/ 14G of ram that were killing that, so get a lot isn't really good enough. We run 1.5 gigs, and that's plenty for us. I increased the shared buffers until it didn't help any more, then doubled it, I believe that it came out to around 128 gigs. I did the same with sort memory, that came out to around 64 megs. The machine right now uses about 860 megs of disk cache, but took a few months to ge that high. It hasn't used swap at all. If it ever hits swap, we'll add more. Luckily, with the 4-way interleaved memory, it'll take up to 16 gigs, and with 16 slots, there's a lot of room to add more. : ) steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Deployment of PostgreSQL Applications
2) More importantly, is it possible to prevent a customer from peeking into said database once it is deployed on their machine? ROT13 it, then threaten them with the DMCA. (Yes, that was a joke.) steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] Re: Red Hat to support PostgreSQL
...This is not the same in my book, since I don't care to run RHL in any kind of production environment... snip What is it about RHL that various people wouldn't recommend running it in a production envornment? I don't have a contrary view, so much as I'd like to know what's specifically wrong with the RH distribution. We're trying to decide on a distribution on which to develop telecom software, utilizing PostgreSQL of course :-) What other distributions would you recommend and why? Here's my take on it, it may or may not reflect reality. : ) RH didn't get where they are by being the best, they got there by being the most sellable. Early on, they grabbed a large market share by making a few very sellable decisions, and now, the fact that they are so large now gives them momentum that keeps them afloat - oddly enough, just like Microsoft. : ) Because they're large, they garner support in terms of drivers and programming, and that, in turn, makes them more attractive to potential users. Their products aren't necessarily bad, at least not all of them. Historically, the .0 releases are buggy and flakey, the .1's are better, and the .2's are decent. Now that, of course, depends on your own definitions of such qualitative terms as buggy and decent, but according to my definitions and experience, that's been about right. For my needs, they're also becoming extremely bloated. I don't need three CD's worth of installation crap to get Apache, SSH, and PostgreSQL running. : ) I also don't like depending on precompiled packages, for a couple of reasons - including the fact that it's hard to choose compile-time settings on pre-compiled binaries. : ) So, I've started working on putting together a sort of mini-distro with only what my servers will need. It's quite a bit easier than I thought it would be, and lets me mix and match the features that I want, such as XFS support and what-not. 
Now, since I've been so negative about them, I'll also be positive - RedHat isn't bad for production use. Stay with .2 releases, and things will likely be just fine for you. There are some policy decisions that (IMHO) aren't as good as they could be, but those can generally be fixed with a few minor modifications to startup scripts and configuration files. As long as you tighten down the security holes in the default installation, most people would likely be just fine using RedHat on their production machines. steve ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Re: Red Hat to support PostgreSQL
Previous to version 7.1, RHL wasn't very secure by default. This is one of the most common complaints I hear. 7.1 can be made quite secure out of the box without any special config -- just leave the firewall config at the default of 'HIGH' -- of course, I've now heard complaints that it is then 'too secure' :-). Myself, I'd prefer that they'd just leave the insecure services off by default, rather than using a firewall as a band-aid. ; ) steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Re: Red Hat to support PostgreSQL
Even though it may appear that your server is doing a lot, it's not facing the load of a highly scaled enterprise level e-commerce site, where RedHat just doesn't cut it. That claim is bogus. Red Hat Linux is the number one linux by far in enterprise deployments. Well, Microsoft has an even greater installed base in enterprise deployments, so NT must be better than Linux Being #1 doesn't make you the best. steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] Re: Red Hat to support PostgreSQL
1) Distribution of Linux to have the largest number of out of the box security holes. Check back and look at the security reports. Count them if you insist. And check for the number of them being Red Hat specific. I consider things like the portmapper being enabled by default Red Hat specific. 3) So much extra crap running to begin with, eating up extra memory, cpu, etc. You're obviously unfamiliar with it. I don't know, I generaly turn off at least half of the services that are enabled by default, which free up quite a bit of memory. steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Re: Red Hat to support PostgreSQL
None of them. Run FreeBSD. It's better. Or, it will be, once the SMP code is improved. : ) steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] More Red Hat information
In my experience DB pricing structures seem to be a mystery whichever platform you look at :) 20k per CPU. See: Word is that the 2006 Microsoft automobiles will have similar pricing structures. You'll pay $8,000 per year for each seat-license that you want, and another $2,000 per year for each 10 mph above 30 that you want to be able to drive... steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Red Hat Database
Also the comment that RHDB would be better for small businesses than large ones makes me think that it's not exactly high-performance or full-featured. : ) steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Red Hat DB announced - why not PostgreSQL?
It looks like Red Hat has announced an open source database called Red Hat Database: (via slashdot:) http://dailynews.yahoo.com/h/cn/20010619/tc/red_hat_to_play_in_oracle_s_ar ena_1.html Why is this not PostgreSQL? Why can't the Red Hat folks sell support to PostgreSQL, rather than developing another open source database and supporting that? Is it a conflict with GreatBridge? This looks like it could have been a great opportunity for PostgreSQL If it's Red Hat Database, then not only does it promote their brand and supposed worth, then people will more readily come to them for a support contract. If it were Postgres, they'd probably just sign up to the mailing list. : ) steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Hardware Config
Will PostgreSQL take advantage of more than 1 CPU? If so, are there any benchmarks showing how it scales with multiple CPUs? We run PG on a quad xeon, and it works wonderfully. From PG's design of forking off a new backend for each connection, it is inherantly scalable, as each backend can run on another CPU. And, of course, having extra CPU's to handle kernel code, I/O, interrupts, and other things is always good. steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Web site gripes
THESE ARE NOT WHINING GRIPES. They are pefectly valid points about how the current website design is flawed. Your responses were pretty much Make your own bookmark, it doesn't work, and you aren't working hard enough to find things. This attitude is unacceptable. I thought that was the main benefit of open source, you didn't have to write good documentation. : ) On a realistic note, did you really have to CC 4 or 5 people on that? I'm sure that they all read the list, and don't really need duplicate copies. steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] Oracle news article
I mean allowing an unlimited amount of connections (as much as the process table will allow) that are served in a round-robin basis by oracle, much like an httpd process. I'm tired of having to up the number of user connections just because we get an unprecedented surge in interest from people viewing all of our sites. How about doing -N 5000? Or a equally-big your-webserver-will-melt-down-before-the-number-is-achieved. Whether that would be advantageous depends on what happens when the max number of backends is reached. If a new incoming connection simply waits for a new backend to become available, then you can do about the same thing as connection-pooling by simply setting a low number of backends - a number somewhere around the number of CPU's that you have. In theory, you keep each CPU full (assuming there aren't other bottlenecks), but avoid the extra memory usage and context-switching cost associated with a larger number of backends. However, it would block incoming requests, meaning that a few large queries could stop a few hundred small, fast queries from running. That means that the guy generating traffic statistics can keep several hundred people from seeing your site until he's done - a bad thing. : ) Overall, it seems (to me, at least) that if new connections (above the max) simply wait, then a good compromise would be limitting the number of backends to a level several times greater than the number of CPU's that you have, so that a single (or a few) large query doesn't completely monopolize the system, yet not so high that you're going to exhaust your memory or run into other problems associated with huge numbers of running processes. steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Compiling to RPM setup/filesystem layout
Now, play some villanous music, and enter RedHat wearing a black cape, with small, beedy eyes. I don't have a cape, but I do have a red hat. And blue eyes, normal size. I was going for the melodrama. : ) They insist that an OS should not touch /usr/local, and they're right about that. However, if you choose to download the Postgres RPM and install it via RPM yourself, they seem to interpret that as the OS touching /usr/local, and it won't happen. For managed distributions, the standard way of doing things is if it's in the package format, put it in /usr. /usr/local is for things not managed by this system (rpm, deb, whatever) (typically this means things installed by loki :), things compiled locally without package systems (configure make install) etc. standard rarely equates to correct or even useful. But I'm going to drop the rest of the message, I had all of my arguments written out, but remembered that this is the Postgres list. The main point of my message was that you should stick to source or RPM, mixing the two isn't that great. I apologize if you took offense at my comments. steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Compiling to RPM setup/filesystem layout
Unfortunately, I can't just compile since I need to be able to replace my current 7.0.3 installation, installed via RPMs. How do I go about this so I don't mess everything up (leftover files and such, in addition to the mandatory pg_dump) ? Install the compiled version somewhere else and run them in parallel. Once you're finished doing the data conversion, remove the RPM version. If you do that, either fully qualify your calls to psql and other utilities, or create aliases for them. Otherwise, the RPM'd versions with binaries in /usr/bin can be called instead of the binaries you really want, and there are circumstances where that will cause problems, even though the developpers have done a wonderful job of making sure that things will work as well as can be expected with mis-matched versions. : ) steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] Compiling to RPM setup/filesystem layout
On Thursday 31 May 2001 16:22, Steve Wolfe wrote: something else fills up /var, PG isn't hosed. And if PG fills up it's partition, other services aren't hosed. Make a partition mounted on /var/lib/pgsql. :-) Touche! Now, play some villanous music, and enter RedHat wearing a black cape, with small, beedy eyes. They insist that an OS should not touch /usr/local, and they're right about that. However, if you choose to Linux Standards Base and the Filesystem Hierarchy Standard sets that policy, not Red Hat. And I happen to think it installs to the right place, IMHO. :-) I'm talking about a user installing software on his own. Does the LFS and FHS still mandate it not go into /usr/local in that case? And Red Hat ain't no villain -- unless you're a BSD partisan who thinks Red Hat is responsible for popularizing Linux beyond its worth (that, incidentally, is a friendly dig at scrappy.) That was purely for dramatic effect, I didn't mean that they were actually villains. I believe that there are some investors that would argue against you, from what I hear of the lawsuit, but I haven't concerned myself with that enough to know the details. If you like Linux, you should absolutely adore Red Hat -- if nothing else, for payrolling Alan Cox and the stable kernels. I wasn't aware that they did - but for that, I do absolutely applaud them. I do have a few things that concern me about their kernels, but I'll leave that for a relevant forum. Running rpm -ql on the RPMset is too much of a hassle, right? Removing all traces of the RPMset is easier than removing all traces of a from-source install. Really? 
Let's compare removing the RPM's: #rpm --erase postgresql-devel-6.5.3-6 #rpm --erase postgresql-jdbc-6.5.3-6 #rpm --erase postgresql-odbc-6.5.3-6 #rpm --erase postgresql-perl-6.5.3-6 #rpm --erase postgresql-python-6.5.3-6 #rpm --erase postgresql-server-6.5.3-6 #rpm --erase postgresql-tcl-6.5.3-6 #rpm --erase postgresql-test-6.5.3-6 #rpm --erase php-pgsql-3.0.15-2 #rpm --erase postgresql-perl-6.5.3-6 #rpm --erase postgresql-python-6.5.3-6 #rpm --erase postgresql-server-6.5.3-6 #rpm --erase postgresql-tcl-6.5.3-6 #rpm --erase postgresql To removing the installation from source: rm -rf ~postgres/* I think that the second is much easier, in my opinion. Although, as I _am_ mentioned as a 'Developer' on the globe, and the RPM puts the files where I mean for them to go... well, you decide the worth of that. And followup to the PORTS list, as this is a ports, not a general, issue. I'm sorry, I don't know how that ended up there. I'll fix it. steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] Compiling to RPM setup/filesystem layout
Let's compare removing the RPM's: rpm -e `rpm -qa |grep postgresql` Nice. I like it. #rpm --erase php-pgsql-3.0.15-2 That's not what you're doing for your manual install... rm -rf ~postgres/* So postgres doesn't install it's binaries in /usr/local/bin, libraries in /usr/local/lib etc? I'm not sure what you mean. When it's installed with the autoconf defaults, the binaries go in /usr/local/pgsql/bin, and the libraries are in /usr/local/pgsql/lib (assuming that you've chosen /usr/local/pgsql for the base directory), and removing all traces is as easy as removing that directory, assuming that you've followed the commonly-used practice of having /usr/local/pgsql/data as your data directory. (Yes, that should technically go under /var.) Since the binaries aren't in /bin, I generally add /usr/local/pgsql/bin to the path in /etc/profile, and /usr/local/pgsql/lib to /etc/ld.so.conf, as described in the docs, so I suppose that to remove *every* trace, you're right, I'd have to remove those lines as well, even though having them around without an installation of PG isn't going to hurt much. steve ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [GENERAL] OID's....
Also, when the OID's reach the limit of an int4, if I recall correctly, they simply wrap around, and things keep working unless you depend on unique OID's. Is that correct? That's correct. cheers, t. p.s. and rumor has it that the universe will start shrinking as soon as this happens ;-) Actually, in our case, it may happen more soon than I had thought. We were in the tens of millions not long ago, and are now over 100 million. At the rate we're going, we may very well be doing 5 million OID's per day in the very near future, which would give us about 6 months to wrap around. I'm not terribly worried about the wrap-around, but I would like to be as informed as possible regarding this situation. And since we do very few inserts relative to our selects (probably a 1:5,000 ratio), we probably don't need to be churning through them quite so fast. It appears to be pre-allocating 30 or 32 OID's per select, which in our case, is far too many, as it's a very rare case indeed where we insert more than one record at a time. Is there a way to change that behavior? steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] OID's....
What PG version are you using? IIRC, in pre-7.1 code, backends allocate OIDs in blocks of 32 (?? more or less anyway); so if a backend uses one OID and then exits, you wasted 31 OIDs. This does not happen anymore with 7.1, though. Another possibility is that you're creating lots of temp tables --- each one will cost you a dozen or so OIDs, depending on the number of columns. Thanks, Tom. We are using 7.0.x, and they do appear to be used in chunks of 32. One more reason for me to upgrade. : ) steve ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] IpcMemoryCreate: shmget failed (Invalid argument)
Perhaps someone should write a platform-specific FAQ for Linux that answers this question, so that the link provided would actually be useful. http://www.ca.postgresql.org/devel-corner/docs/postgres/kernel-resources.html#SYSVIPC Perhaps it should be put in the platform-specific FAQ's, since that's where users are pointed when they get the error. steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] IpcMemoryCreate: shmget failed (Invalid argument)
I get this error when trying to start postgres with more buffers than the default of 64... [postgres@roam pgsql]$ /usr/local/pgsql/bin/postmaster -B 2000 -D /usr/local/pgsql/data IpcMemoryCreate: shmget failed (Invalid argument) key=5432001, size=33652736, permission=600 This type of error is usually caused by an improper shared memory or System V IPC semaphore configuration. For more information, see the FAQ and platform-specific FAQ's in the source directory pgsql/doc or on our web site at http://www.postgresql.org. FATAL 1: ShmemCreate: cannot create region Perhaps someone should write a platform-specific FAQ for Linux that answers this question, so that the link provided would actually be useful. If there are no other takers, I'll volunteer. steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
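For what it's worth, the usual Linux-side fix is to raise the kernel's SHMMAX ceiling, since `-B 2000` makes the postmaster request the ~34 MB segment shown in the `size=` field of the error. A minimal sketch, assuming the standard Linux 2.x /proc paths (the 128 MB figure is just an example):

```shell
# Read the current System V shared memory ceiling, in bytes:
SHMMAX=$(cat /proc/sys/kernel/shmmax)
echo "current SHMMAX: $SHMMAX bytes"

# If it is smaller than what the postmaster asks for, raise it
# (needs root; lasts until the next reboot):
# echo 134217728 > /proc/sys/kernel/shmmax
```

To make the change survive a reboot, the echo line would go in a boot script such as rc.local.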
Re: [GENERAL] Ideal hardware configuration for pgsql/Netra
Yes, they are better web servers than DBMS servers, but if your database is small enough to cache in RAM, then who cares if the Netra uses slow disks? All that talk on this list about Linux vs. BSD is silly. Why bother when you can have Solaris 8 on SPARC hardware? Easy: Cost. (And, there are other reasons, of course, but I won't turn this into a Solaris-bash. : ) ) steve ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [GENERAL] Shared memory failure?
First off, I'd just like to thank everyone for their help with my last problem. It worked, but now it gives me another error: IpcMemoryCreate: shmget failed (Permission denied) key=5432010, size=144, permission=700 This type of error is usually caused by an improper shared memory or System V IPC semaphore configuration. For more information, see the FAQ and platform-specific FAQ's in the source directory pgsql/doc or on our web site at http://www.postgresql.org. IpcMemoryAttach: shmat failed (Permission denied) id=2 FATAL 1: AttachSLockMemory: could not attach segment I looked through the troubleshooting guide, but it wasn't very helpful. Anyone know how to get around this? Thanks again. It's OS-dependent. What OS are you running, and if it's Linux, which kernel? steve ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Re: Re: Fast Inserts and Hardware Questions
Yes, it is RAID-5 on the big box. Unfortunately, I don't have any spare RAID equipped boxes sitting around, so I can't experiment with the different RAID levels. Still, you'd think that even a "slow" RAID-5 configuration would be faster than a $98 IDE drive... Yes, it certainly should be. Right now I have a Mylex 170 in my machine for testing, hooked to 4 IBM 9-gig drives. Three of them are in a RAID 5 array, the last is a hot-spare. Copying data from the IDE drive to the RAID array, the IDE drive reads at full speed, the lights on the RAID array just blink quickly about once per second. The controller has 64 megs of cache on it, but I've copied far larger data sets than that (several gigabytes), and the behavior has been the same. So... yes, RAID 5 is slower than RAID 0 or 1 for writes. But it's still dang fast, especially compared to a single IDE drive. steve ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [GENERAL] Re: Re: Fast Inserts and Hardware Questions
It's not surprising to see software raid outperforming hardware raid (both of modern vintage). That's provided, of course, that you have a good amount of free CPU cycles. On machines that are otherwise heavily CPU-loaded, software raid has been (in my experience) horrible. steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] Re: Re: Fast Inserts and Hardware Questions
On Fri, Mar 16, 2001 at 03:53:22PM -0600, Alex Howansky wrote: levels. Still, you'd think that even a "slow" RAID-5 configuration would be faster than a $98 IDE drive... I wouldn't. You'd be wrong. : ) I've also copied large amounts of data from an IDE drive to an old AMI MegaRAID controller with some old Quantum drives in a RAID 5 configuration. The RAID was still faster at writing than the relatively modern IDE drive could read. Not to mention that in an environment like a database server where you may have many processes accessing the disk at once, a SCSI RAID array makes IDE look like a toy. steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] URL for gmake
Trying to rebuild v6.5.3 and the procedure calls for gmake. A google search is a dog-chasing-tail exercise. Can anybody help with a URL or an alternative? www.freshmeat.net is a good place to look for sources; in this case, searching for "gnu make" will yield: http://www.paulandlesley.org/gmake/ With a link to the FSF's page: http://www.gnu.org/software/make/make.html steve ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://www.postgresql.org/search.mpl
Re: [GENERAL] order of clauses
will give a divide by zero error as A=(y <> 0) and B=(x/y > 1) can be evaluated in any order (A and B = B and A). I obviously would like (y <> 0) to happen first, but I don't see how this can be achieved. Any ideas? I have one idea that would be nifty to implement. In some compilers, you can turn off complete boolean checking. As soon as any part of an expression invalidates the whole expression, it stops evaluating the rest. That can help you avoid division by zero, and keeps you from evaluating parts of the expression that don't matter. It sounds like a good idea, at least to an ignoramus like me. : ) steve
Re: [GENERAL] PostgreSQL over NFS?
It is not performance I would be concerned about, but reliability. NFS has no state for reliability. I have to ask why they are using NFS rather than putting it on a drive local to the machine. If they say they want to share the data between two machines, that is even crazier. They may want to put the data on a file server, so that it gets backed up, and has the advantage of a hardware fault-tolerant RAID array. That has its merits, but I don't think it would be that great for performance. We spent the money to put another RAID array in the database machine, both for reliability and speed. Once per night, the DB is dumped, and the dump is copied to the file server, so that we have a second copy somewhere, and it's included in the nightly tape backup. steve
[GENERAL] Creating indexes on existing tables...
OK, one of my tables was getting duplicate entries, and I thought I'd create a unique index on a particular field. So, I created the index: domains=# create unique index idx_domain_name on domain (domain_name); CREATE Then, tried a vacuum: domains=# vacuum analyze; NOTICE: Index idx_domain_name: NUMBER OF INDEX' TUPLES (305) IS NOT THE SAME AS HEAP' (311). Recreate the index. VACUUM Huh? So, I dropped the index, did a vacuum analyze, recreated the index, tried a vacuum, same thing. So, I did a pg_dump, edited the dump file to create the index with the table, and dropped/restored the database. It worked, but I'm puzzled why the original attempt didn't work. It's Postgres 7.0.2. steve
Re: [GENERAL] recovery mode
I don't think recovery mode actually does much in 7.0.* --- I think it's just a stub (Vadim might know better though). In 7.1 it means the thing is replaying the WAL log after a crash. In any case it shouldn't create a lockup condition like that. The only cases I've ever heard of where a user process couldn't be killed with kill -9 are where it's stuck in a kernel call (and the kill response is being held off till the end of the kernel call). Any such situation is arguably a kernel bug, of course, but that's not a lot of comfort. Exactly which process were you sending kill -9 to, anyway? There should have been a postmaster and one backend running the recovery-mode code. If the postmaster was responding to connection requests with an error message, then I would not say that it was locked up. I believe that it was a backend that I tried -9'ing. I knew it wasn't something that good to do, but I had to get it running again. It's amazing how bold you get when you hear an entire department mumbling about "Why isn't the site working?". : ) Anyway, I think the problem wasn't in postgres. I rebooted the machine, and it worked - for about ten minutes. Then, it froze, with the kernel crapping out. I rebooted it, it lasted about three minutes until the same thing happened. Reboot, it didn't even get through the fsck before it did it again. I looked at the CPU temps, one of the four was warmer than it should be, but still within acceptable limits (40 C). So, I shut it down, reseated the RAM chassis, the DIMM's, the CPU's, and the expansion cards. When it came up, I compiled and put on a newer kernel (I guess there was some good in the crashes), and then it worked fine. Because of the symptoms, I imagine that it was a flakey connection. Odd, considering that everything except the DIMM's (including the CPU's) is literally screwed to the motherboard! steve
Re: [GENERAL] Re: VACUUM and 24/7 database operation
Shouldn't it be possible to build vacuum as an ongoing internal PG process, instead of a separate operation? How does Oracle bypass this? Must be some way that can be implemented. Well, here's what it comes down to: Do you want updates to happen quickly, and vacuum when load is low, or do you want updates to be slow all the time? I suppose that there are some sites that can't find two minutes per day when updates will block (not selects), but I imagine they're very few. steve
Re: [GENERAL] How passwords can be crypted in postgres?
From: [EMAIL PROTECTED] In order to escape from .htaccess, I want to save user passwords in my database. However, passwords appear clear in the database. How passwords can be crypted in the database? I use PHP 3.0.16, Cobalt Raq3i. Hmmm... Linux users that haven't heard of freshmeat.net? : ) Typing in "php md5" at freshmeat gives: http://freshmeat.net/projects/hcemd5/?highlight=php+md5 Which will allow you to encrypt/decrypt as you desire, but I'm sure there are a hundred other ways to do the encryption, depending on just how much strength you need. steve
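A simpler alternative to the encrypt/decrypt library linked above (a swap, not what the original poster asked for): store a one-way MD5 digest instead of the cleartext, and at login hash the submitted password and compare digests. PHP's md5() produces the same hex digest that the md5sum utility does here:

```shell
# One-way hash of a password; only the digest ever reaches the database.
# 'secret' is just a sample value.
printf '%s' 'secret' | md5sum | awk '{print $1}'
```

The trade-off is that you can never recover the original password, only verify it, which is usually all a login check needs.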
Re: [GENERAL] Database cluster?
I am considering splitting the database into tables residing on separate machines, and connect them on one master node. The question I have is: 1) How can I do this using PostgreSQL? You can't. I'll jump in with a bit more info. Splitting tables across multiple machines would do nothing more than make the entire system run at a snail's pace. Yes, it would slow it down immensely, because you just couldn't move data between machines quickly enough. Why? Well, whenever you join two tables that are on different machines, the tables have to go across whatever sort of connection you have between the machines. Even if you use gigabit ethernet, you are still running at a mere fraction of the bandwidth of the computer's internal bus - and at orders of magnitude greater latency. You'd have lots of CPU's sitting around, doing absolutely nothing, waiting for data to come across the wire. There are alternatives, such as IP-over-SCSI. That reduces the latency of ethernet quite a bit, and gives you much more bandwidth (say, up to 160 megabytes/second). However, that's still a pittance compared to the main system bus inside your computer. That's one of the greatest hurdles to distributed computing. That's why the applications that are best adapted to distributed computing are those that don't require much data over the wire - which certainly doesn't apply to databases. : ) steve
Re: [GENERAL] True ACID under linux (no fsync)?
Use the "-F" option. I start PostgreSQL with this line: su -l postgres -c "/usr/bin/pg_ctl -D $PGDATA -p /usr/bin/postmaster -o '-i -B 4096 -o -F' start >/dev/null 2>&1" </dev/null Interesting... trying to start postgres 7.0.2 with the "-F" simply returns a list of options to me, as if it can't understand it. It sure does sound appealing to me, though. With a few hundred megs of disk cache (between the kernel and the RAID controller), I've always wished that PostgreSQL would take advantage of it when writing. : ) steve
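The "returns a list of options" behavior is likely because in 7.0.x -F is a backend flag, not a postmaster flag, so it has to be forwarded through the postmaster's -o option rather than given directly. A minimal sketch (paths are the autoconf defaults used elsewhere in this thread, and this assumes an initialized data directory):

```shell
# Forward -F (no fsync) to the backends via -o; giving -F straight to
# the postmaster is rejected as an unknown option in 7.0.x.
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data -o '-F' &
```

That matches the quoted pg_ctl line above, where the inner "-o -F" inside the postmaster options does the same forwarding.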
Re: [GENERAL] how good is PostgreSQL
Or they could buy a single IBM S/390, run Linux/390 and PostgreSQL on that. Probably would cost less, and be more reliable. And they can always load another Linux/390 VM -- an S/390 can run something like 41,000 virtual machines each running Linux/390 and Apache. Yeah, I'm very optimistic about IBM's new chips that are coming out next year. Each "processor module" will have 4 processors, but each processor will have 2 cores - so in effect, each "processor module" has 8 processors on it. All processors will have copper interconnects, and depending on the source, will debut at anywhere from 1.3 to 2 gigahertz. I think that will certainly help them get a larger share of the high-end market! steve
Re: [GENERAL] how good is PostgreSQL
Even after that, you have a long way to go before you will hit 1000 transactions per second from any SQL database. Since my last post probably wasn't too useful, here's some information that might be a little more help. It's a little long, I know, but hopefully it will be of use to someone. As programmers, we naturally want to throw things into databases for three reasons. First, it's easy to get data in. Second, it's easy to get relevant data out. And third, it's "cool". We don't want to work with flat files, now do we? ; ) However, in some cases, using the database to get data out ends up costing us a lot of time and money. Sometimes we do the same nasty query so often that we end up purchasing bigger hardware to make the system work reasonably. Why? Because it was easier for us to write a program that did: GetDataFromDatabase(); PrepareData(); PrintData(); Each time, the database server does the work. But it doesn't necessarily have to be that way. In our company, we've found two trends that have enabled us to save a LOT of processing power on our machines. (read: Increase the capacity of our servers by 30% or more, with fairly minor changes) The first case is that of rarely-changing data. Some of our datasets probably have around 50,000 to 1,000,000 views (selects) for each update (insert/delete). Having the database repeat the query every time is a waste. So, we began writing our programs such that they will grab the data from the database once, and generate the HTML for every page, and the indexes. Then, when an update is made to the database (via the administrative tools), it simply rewrites *the relevant HTML files*, and changes the indexes pointing to them. (There are also some other very large advantages to this sort of thing, but I'm not allowed to say them. ; ) ) The second case is that of often-repeated queries. One of the offerings on our site is an online directory, which gets a pretty fair amount of traffic. 
Unfortunately, it uses a proprietary program that was purchased by management before they spoke with us. Grr. It was the most utterly inefficient program I've ever seen. It would *not* allow the database to do joins, it would grab entire tables, then try to do the joins itself, in Perl. We rewrote the program to let PostgreSQL do the joins, and that sped it up. Then we realized that a very small number of queries (those for the first one or two levels of pages) accounted for a huge portion of the usage. So, we replaced the front page with a static HTML page (the front page doesn't change...), and saw another terrific drop in our system loads. Overall, by only modifying a couple of our more heavily-used programs, our server loads dropped by about 30%-40%. If we went to the trouble to modify some others, it would drop even more. But we're going to rewrite them completely for other reasons. : ) In any event, there are ways like this to save a LOT of CPU and disk I/O. Most web servers can serve out several hundred static pages with the resources that would otherwise deliver one dynamically-created, database-driven page. It also allows you to cluster the web servers with cheap commodity hardware, instead of using big-iron on the database. And if you have a big-iron machine running the back-end, this can severely lighten the load on it, keeping you from dropping a few hundred grand on the next step up. ; ) (Incidentally, we've toyed around with developing a query-caching system that would sit between PostgreSQL and our DB libraries. However, it seems like it could be done *much* more efficiently in PostgreSQL itself, as it would be much easier to keep track of which tables have changed, etc.. Anybody know if this sort of functionality is planned? It would be terrific to simply give the machine another 256 megs of RAM, and tell it to use it as a DB cache...) steve
[GENERAL] Query caching
(Incidentally, we've toyed around with developing a query-caching system that would sit between PostgreSQL and our DB libraries. Sounds amazing, but requires some research, I guess. However, in many cases one would be more than happy with cached connections. Of course, cached query results can be naturally added to that, but just connections are OK to start with. Security: To me, it doesn't sound like it would be that difficult of a project, at least not for the likes of the PostgreSQL developers. It also doesn't seem like it would really introduce any security problems, not if it were done inside of PostgreSQL. Long ago, I got sidetracked from my endeavors in C, and so I don't feel that I'm qualified to do it. (otherwise, I would have done it already. : ) ) If you wanted it done in Perl or Object Pascal, I could help. : ) Here's a simple design that I was tossing back and forth. Please understand that I'm not saying this is the best way to do it, or even a good way to do it. Just a possible way to do it. I haven't been able to give it as much thought as I would like to. Here goes. Implementation: Upon starting, the PostgreSQL engine could allocate a chunk of memory, sized according to the administrator's desire. That chunk would be used solely for query caching. When a query came in that was not cached (say, the first query), the database engine would process it as normal. It would then return it to the user, and add it to the cache. "Adding it to the cache" would mean that it would enter the query itself, the result set, and a list of which tables the query relied upon. The query that is stored could be either the query coming from the user, or the query after it goes through the optimizer. Each has pros and cons, I would probably favor using the query that comes from the user. When another query comes along, the caching engine would quickly look in the hash table, and see if it already had the cached results of the query. If so, it returns them, and wham. 
You've just avoided all of the work of optimizing, parsing, and executing, not to mention the disk I/O. A hash lookup seems extremely cheap compared to the work of actually processing a query. When an update/delete/insert comes along, the engine would analyze which tables were affected, and clear the cache entries that relied upon those tables. - Cache Clearing Cache clearing would be achieved via an LRU-based algorithm, which would also take into account the amount of RAM used by each query in the cache. - Performance Impact The potential performance differences range from a minuscule decrease to a tremendous increase. And it's a lot cheaper to throw an extra half gig of RAM in a machine than to upgrade processors and disk subsystems! -- Possible Changes One potential drawback is that when a table is modified, the queries that rely upon it would be discarded. Where a table is updated frequently, that could greatly reduce the performance benefit. One possible alternative is to store the query cost with each query in the cache. When a table is updated, those queries are marked as "dirty". If the system load is below a certain amount, or the system has been idle, it could then re-execute those queries and update the cache. Which queries it re-executed would be determined on a factor of query cost and how frequently those cache entries were used. --- The reason I would prefer it done in the PostgreSQL engine (as opposed to in a middleware application) is that the caching engine needs to know (a) which tables a query relies upon, and (b) which tables get changed. It seems that it would significantly reduce overhead to do those inside of PostgreSQL (which is already doing the query parsing and analysis). This could certainly give PostgreSQL a huge advantage over other database systems, too. It could save administrators a very large chunk of cash that they would otherwise have to spend on large systems. And it would just be cool. ; ) steve
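The table-keyed invalidation described above can be sketched in miniature: each cached query keeps a list of the tables it read, and an update to any of those tables evicts it. Everything here (names, file layout) is invented purely for illustration; it is a toy model of the design, not anything PostgreSQL actually does:

```shell
# One file per cached query holding its result, plus a sidecar file
# listing the tables the query relied upon.
CACHE=$(mktemp -d)

cache_put() {   # $1=query-id  $2=result  $3="table1 table2 ..."
  printf '%s' "$2" > "$CACHE/$1"
  printf '%s' "$3" > "$CACHE/$1.tables"
}

invalidate_table() {   # on insert/update/delete, evict every query
  for f in "$CACHE"/*.tables; do    # that read the modified table
    [ -e "$f" ] || continue
    if grep -qw "$1" "$f"; then
      rm -f "$f" "${f%.tables}"
    fi
  done
}
```

The dirty-marking refinement from the post would keep the evicted entries around and re-run them during idle time instead of deleting them outright.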
Re: [GENERAL] how good is PostgreSQL
Even after that, you have a long way to go before you will hit 1000 transactions per second from any SQL database. I guess they could always buy a few Sun E1's on the backend, and a large room of rack-mountable PC's for web/CGI serving. Nothing like plopping down ten or twenty million dollars on hardware. : ) steve
Re: [GENERAL] Postgres 7.0.2-2 on Red Hat 7.0?
OK, here's a situation. One of the programmers at your company runs the disk out of space. You're going to go bonk him on the head, but first, there are more pressing matters. PostgreSQL 6.5 has horked up the tables, and needs to be fixed. 7.0 is released, which has a fix for the problem. Let's be real here. If your system is out of disk space, you can't do a dump to put it into 7.0. You're definitely gonna need 6.5 to work at this point... I know. And I was being real. That is the situation that happened at my company, and it also came up from at least one other person on the list. Yes, I needed 6.5 to get the data out. However, you *couldn't* dump the data, PSQL had horked the tables up too badly. I ended up writing a Perl script to get things from the tables, and put them into a flat file of SQL statements. It was ugly, but the clock was ticking. ; ) (nothing like a junior programmer doing a recursive grep of a large file system, and redirecting the results to a file *in* the filesystem he's grepping) Your problems aren't with RPM's, your problem is the FHS. Distrib packages (RPM *or* DEB) will put stuff in FHS compliant locations, packages by anyone else will put files where they want. If you feel that's incorrect, I respectfully suggest you hit up the LSB/FHS people if you want that to change, NOT Red Hat, PostGreSQL, or anyone else. I think you're missing the subtlety of my point, but that's fine. You're correct that this isn't the list, I'm going to drop the topic. steve
Re: [GENERAL] Postgres 7.0.2-2 on Red Hat 7.0?
The last thing that a system admin needs when upgrading PostgreSQL is "Oh, crap, I forgot to uninstall the RPM of the old one first." Then upgrade the RPM's. It isn't hard. OK, here's a situation. One of the programmers at your company runs the disk out of space. You're going to go bonk him on the head, but first, there are more pressing matters. PostgreSQL 6.5 has horked up the tables, and needs to be fixed. 7.0 is released, which has a fix for the problem. Are you going to sit around waiting for RPM's, while your tables are all horked up, and the programming department is breathing down your neck because they can't get work done? If you're going to install from source on a RedHat machine, it is simply prudent practice, regardless of the package, to make sure the RPM version is not already installed. I agree. And, the fact of the matter is that there are likely far more PostgreSQL installations from RPM than from source. I fail to see the relevance of that argument. Popularity does not make correctness. If I'm just being extremely dense about that sentence, feel free to let me know. steve
Re: [GENERAL] Any risk in increasing BLCKSZ to get larger tuples?
A trick you can use in 7.0.* to squeeze out a little more space is to declare your large text fields as "lztext" --- this invokes inline compression, which might get you a factor of 2 or so on typical mail messages. lztext will go away again in 7.1, since TOAST supersedes it, Uh, why? Does TOAST do automatic compression? If people need to store huge blocks of text (like a DNA sequence) inline compression isn't just a hack to squeeze bigger text into a tuple. I'd guess that it's a speed issue. Decompressing everything in the table for every select sounds like a great waste of CPU power, to me, especially when hard drives and RAM are cheap. Kind of like the idea of "drivespace" on Windows - nice idea, but it slowed things down quite a bit. steve
Re: [GENERAL] Any risk in increasing BLCKSZ to get larger tuples?
In some cases yes, in some no. Simple text should compress/decompress quickly and the cpu time wasted is made up for by less hardware access time and smaller db files. If you have a huge database the smaller db files could be critical. Hmm... that doesn't seem quite right to me. Whether it is compressed or not, the same amount of final data has to move across the system bus to the CPU for processing. It's the difference of (A) moving a large amount of data to the CPU and processing it, or (B) moving a small amount of data to the CPU, use the CPU cycles to turn it into the large set (as large as in (A)), then processing it. I could be wrong, though. steve
Re: [GENERAL] pg_dump's over 2GB
My current backups made with pg_dump are currently 1.3GB. I am wondering what kind of headaches I will have to deal with once they exceed 2GB. What will happen with pg_dump on a Linux 2.2.14 i386 kernel when the output exceeds 2GB? There are some ways around it if your program supports it, I'm not sure if it works with redirects... Currently the dump file is later fed to a 'tar cvfz'. I am thinking that instead I will need to pipe pg_dump's output into gzip thus avoiding the creation of a file of that size. Why not just pump the data right into gzip? Something like: pg_dumpall | gzip --stdout > pgdump.gz (I'm sure that the more efficient shell scripters will know a better way) If your data is anything like ours, you will get at least a 5:1 compression ratio, meaning you can actually dump around 10 gigs of data before you hit the 2 gig file limit. steve
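If even the compressed dump threatens the 2 GB limit, split(1) can cap the size of any single file. A sketch of the pipeline, using a stand-in command so it runs without a live server (in production DUMP_CMD would be pg_dumpall, and OUTDIR a real backup directory):

```shell
# Stream the dump through gzip and split so no single file ever
# crosses the kernel's 2 GB file-size limit.
DUMP_CMD='echo hello'          # stand-in for pg_dumpall in this sketch
OUTDIR=$(mktemp -d)
$DUMP_CMD | gzip | split -b 1900m - "$OUTDIR/pgdump.gz."

# Restore side: reassemble the pieces and feed them back in, e.g.
#   cat "$OUTDIR"/pgdump.gz.* | gunzip | psql template1
cat "$OUTDIR"/pgdump.gz.* | gunzip
```

The 1900m chunk size leaves headroom under the 2 GB mark; split names the pieces pgdump.gz.aa, pgdump.gz.ab, and so on, so the shell glob reassembles them in order.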
Re: [GENERAL] windows UI tool???
can anyone recommend a windows ui tool for viewing data in postgres? essentially i want to use something other than the command line for looking at results. as an example, i'm looking for the equivalent of TOAD for Oracle. is the easiest approach to connect in MS Access via odbc? Well, I'd say "find the data access/modelling tool of your choice, and hook it up with ODBC." I'm used to the SQL Explorer that comes with Delphi, so that's what I use, although I've heard that TOAD is much nicer steve
[GENERAL] PostgreSQL on Alpha
Can any of you running PostgreSQL on Alphas comment on the performance vs. Intel-based servers? steve
Re: [BUGS] Re: [GENERAL] cannot vacuum a database !
here is the vacuum verbose : pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request. We have lost the connection to the backend, so further processing is impossible. Terminating. Well, here's one possibility. If you're still using 6.5.x or before, then if you run out of disk space (even temporarily), Postgres can hork up the table exactly as you describe. You can select from it, insert into it, etc., but can't vacuum it - nor can you describe (\d) it. A pg_dump won't work on it, either. The only way that I was able to rescue the data was to write a Perl script to select all of the data, and format it into a flat file with a bunch of SQL statements - basically, re-creating pg_dump. There may have been a more elegant way of doing it (I think pg_dump has an option to select instead of copy), but I didn't have a lot of time to research it at the time. : ) Upgrading to 7 is a very good thing, even if for nothing else than it has more protection if you run out of disk. And even if you have plenty of disk, it only takes one junior programmer to mess it all up. : ) steve
Re: [GENERAL] Server Overload
I've seen brief posts regarding server loads, distribution for heavy loads, etc, but don't recall if there were any solutions... Anyways...We are running a Postgres DB against multiple frontend webservers. For most of the time, everything runs fine. Then, all of a sudden, everything will start to go all funky. Crashing...Errors...Etc... When I log onto the DB server and try to do anything, I get "Too many open files in system." ulimit is set to unlimited and there is PLENTY of FREE memory. What OS are you running? Despite ulimit being "unlimited", your kernel may have a limit on either the number of open files, or the number of file handles. steve
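If it is Linux, "Too many open files in system" points at the kernel's global file-handle table rather than the per-process ulimit. A sketch of where to look on Linux 2.x (the 65536 figure is just an example):

```shell
# System-wide handle usage and limit (distinct from ulimit -n):
cat /proc/sys/fs/file-nr       # allocated / free / maximum handles
FILE_MAX=$(cat /proc/sys/fs/file-max)
echo "system-wide limit: $FILE_MAX"

# To raise the limit until the next reboot (needs root):
# echo 65536 > /proc/sys/fs/file-max
```

With many webservers each holding a pool of backend connections, the default system-wide limit can fill up long before any single process hits its own ulimit.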
[GENERAL] Tuning PostgreSQL to use more RAM...
Actually, more RAM would permit you to increase both the -B parameters as well as the -S one ... which are both noted for providing performance increases ... -B more on repetitive queries and -S on anything involving ORDER BY or GROUP BY ... For a while now, I've been meaning to investigate how to get PostgreSQL to take advantage of the RAM in our machine. It has 512 megs, and most of the time, about 275-400 megs of it simply go to disk cache buffer, as nothing else wants it. Occasionally, we'll only have 250-300 megs of disk cache. : ) While I don't mind disk cache, I feel that we could get better performance by letting postgres use another hundred megs or so, especially since our entire /usr/local/pgsql/base directory has only 134 megs of data. We're currently starting the postmaster with "-B 2048". The machine has 4 Xeon processors, and 5 drives in the RAID array, so we do have a small bit of CPU power and disk throughput. Any suggestions or pointers are welcome. steve
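For scale: -B counts shared buffers of BLCKSZ bytes (8 kB by default), so the "-B 2048" above is only about 16 MB of buffer space. A sketch of the sizing arithmetic, with 128 MB as an example target:

```shell
# Buffers needed to dedicate TARGET_MB megabytes to the shared buffer
# pool, assuming the default 8 kB block size:
TARGET_MB=128
BUFFERS=$(expr "$TARGET_MB" \* 1024 / 8)
echo "postmaster -B $BUFFERS"
```

Remember that a larger -B also raises the shared memory segment the postmaster requests at startup, so the kernel's SHMMAX may need to go up with it.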
Re: [GENERAL] Great Bridge benchmark results for Postgres, 4 others
1) Using only ODBC drivers. I don't know how much of an impact a driver can make but it would seem that using native drivers would shutdown one source of objections. Using ODBC is guaranteed to slow down the benchmark. I've seen native database drivers beat ODBC by anywhere from a factor of two to an order of magnitude. steve
Re: [GENERAL] PG vs. Oracle for larger databases
How suitable is PG for doing larger databases? The need I am considering would be a financial database that does maybe up to 100k transactions/day. In a day? I think a lot of us do that much in an hour. Obviously, it needs to be very reliable, and have minimal scheduled, and no unscheduled downtime. Should this project be on Oracle or Postgres? The only "unscheduled" downtime we've had with PostgreSQL was when one of our programmers ran the disk out of space, which will give any database server fits. ; ) steve
[GENERAL] Escaping parentheses in regular expressions....
How does one escape parentheses in a regular expression in Postgres? An example is: select * from subcategories where subcategory ~* '401(k)'; Which does not work. I've tried: select * from subcategories where subcategory ~* '401\(k\)'; That still didn't work. Any advice would be much appreciated. BTW, select * from subcategories where subcategory = '401(k)'; does work. ; ) steve
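The likely wrinkle is that the string-literal parser consumes one level of backslashes before the regex engine ever sees the pattern, so in PostgreSQL of this era the query needs doubled backslashes: '401\\(k\\)'. The same two-layer escaping trips people up anywhere a regex travels inside a string; here is the underlying regex behavior shown with POSIX EREs (grep stands in for the ~* operator):

```shell
# In an ERE, '(' and ')' are grouping metacharacters, so the pattern
# '401(k)' actually means the literal string '401k' and does NOT match
# the text '401(k)':
echo '401(k)' | grep -cE '401(k)' || true    # prints 0
# Backslash-escaping the parens makes them match literally:
echo '401(k)' | grep -cE '401\(k\)'          # prints 1
```

In psql, each of those regex-level backslashes must itself be written as \\ inside the single-quoted literal.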
Re: [GENERAL] Performance of Postgres via network connections
Since most RAID servers can't even flood a 100 mbit connection, you're more than safe with that much bandwidth if most of your traffic is going to be database related. You might want to factor in all the other network traffic that will be going over those lines though. For instance, if the app servers are going to be mounting partitions off of a RAID server on the same network, and your Internet connection comes in there too, you might start to run into problems. The database shouldn't even come close though. Thank you, I appreciate the comments. Each app server is actually going to have two 100 mbit cards, one to connect it to the RAID array, and one to connect it to the database machine. Your comments give me even more hope that this will work as well as we hope. ; ) After this is all set up, if anyone would like, I may type up an explanation of how things were done as well as costs, for those going through the same sort of growing pains. It's certainly been a lot of work for us to hammer out all of the details, hopefully that would help someone else avoid the work. steve
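For anyone following along, the bandwidth arithmetic behind that reassurance is straightforward: a 100 Mbit link tops out well below what even modest disk arrays can stream.

```shell
# 100 megabits per second / 8 bits per byte = 12.5 megabytes per second
awk 'BEGIN { printf "%.1f MB/s per 100 Mbit link\n", 100 / 8 }'
# Two cards per app server means each kind of traffic (RAID vs. database)
# gets its own ~12.5 MB/s ceiling instead of sharing one.
```

Typical database traffic (small result sets, not bulk streaming) rarely approaches even that ceiling, which is why the setup described should have plenty of headroom.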
Re: [GENERAL] Backend died while dropping index
1) Postgres 6.5.3 is a known quantity, and there are more people familiar with the common, critical problems, so there are more people who can help solve problems. 2) 7.0.2 undoubtedly has some new bugs and problems that are not apparent yet - bugs that could sink a commercial business that depends on Postgres (like mine) I'll throw in my two cents; your experience hopefully won't be the same as ours. This week, 6.5.3 somehow horked its own data quite thoroughly for us. Vacuuming the table would fail miserably - even trying to *describe* the tables shut down the back end. It would complain about a tuple being too large - yet the largest tuple in the table was around 6K. Pg_dump and pg_dumpall would also promptly shut down the back end as soon as they tried to touch the table. Boy, that makes it fun. The cause (we suspect) is from one day when another programmer here did something really dumb, and filled up the disk. The docs for 7.0.2 say that it is supposed to be more reliable in situations like that, and so we decided that it was simply too dangerous to stay with a version that can mangle its own data beyond repair. So, there are arguments on both sides of the coin. It's critical for your database to be running at all times, so you don't want to bring it down to upgrade. At the same time, the upgrade may buy you a better chance of staying up - or it might break everything. You'll have to make an educated guess, and possibly just flip a coin. And, of course, one of the prime rules of system administration is to make changes reversible. Back up the old version and the data, libraries, interfaces, etc.. If something goes terribly wrong during the upgrade, you can always switch back to the old version and be up and running in just a few minutes. Ah - Here's one more thing, just in case it benefits anyone. A good upgrade procedure is to take a different machine (even if it's pretty low-end) that is similar in OS and hardware, and install the new version on it first. 
Copy the data over. *Then* test the things that are likely to break, so that you don't have any surprises. steve
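The reversible-upgrade advice above can be sketched as a checklist; every path and version number here is illustrative, not taken from the post:

```shell
# Sketch of a reversible PostgreSQL upgrade (hypothetical paths).
# 1. Dump everything while the OLD postmaster is still running:
#      pg_dumpall > /backup/all-databases.sql
# 2. Preserve the old binaries, libraries, and data directory:
#      cp -a /usr/local/pgsql /backup/pgsql-old
# 3. Install the new version, run initdb, then restore:
#      psql -f /backup/all-databases.sql template1
# If anything breaks, move /backup/pgsql-old back into place and
# restart the old postmaster - you're down minutes, not days.
```

The staging-machine step then amounts to running exactly this restore on the spare box first, so any surprises show up there instead of in production.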
[GENERAL] Ah, yet another cause for not binding the right port....
I've run across yet another cause that can make Postgres not be able to bind to the port it desires (or any other port, for that matter). I don't recall it being discussed before, so I'll throw it out. When postgres binds to any port, it likes to write a file in /tmp, along the lines of ".s.PGSQL.5432". If /tmp is not writeable by the postgres process, it will not be able to bind to any port. However, it doesn't give you an intelligent error message like "Can't create file", it simply says that it can't bind the port, and of course, asks if another postgres process is already running. For those interested, I found that out after having a sudden power loss wipe out several filesystems, and I had a late night reformatting and restoring from the backups, which I (thankfully!) had made that morning. When I recreated /tmp, I forgot to set the permissions. Even more interesting was watching the two UPSes that I bought spark, flash, pop, and die. : ) steve
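A quick sanity check for this failure mode: the socket file lives in /tmp by default, so /tmp needs to be world-writable with the sticky bit set, which a one-liner can verify before you go hunting for a second postmaster.

```shell
# A healthy /tmp shows mode drwxrwxrwt (1777):
ls -ld /tmp
# Scriptable check - can the current user actually create files there?
if [ -w /tmp ]; then echo "/tmp is writable"; else echo "/tmp is NOT writable"; fi
```

If that check fails for the user the postmaster runs as, the misleading "is another postmaster already running?" error is explained.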
Re: [GENERAL] Ah, yet another cause for not binding the right port....
I tried this and got what I thought was a fairly reasonable error message: FATAL: StreamServerPort: bind() failed: Permission denied Is another postmaster already running on that port? If not, remove socket node (/tmp/.s.PGSQL.5432) and retry. postmaster: cannot create UNIX stream port "Permission denied" is what the kernel told us, and I don't think it's likely to be a good idea to second-guess why the kernel told us that. But the reference to /tmp/.s.PGSQL.5432 together with that error ought to be sufficient clue I would think... Yes, that is what clued me off... now I'm not exactly a kernel hacker, so be patient with me... : ) Postgres was the only program not able to bind a port - everything else (web, mail, IRC, etc.) was able to run just dandy. From poking around in a few config files, it looks like you get to choose whether you want to use the file in /tmp or not. I imagine that the developers must have had a compelling reason to go with the former - would you care to explain just a little of all of that to me? steve
Re: [GENERAL] RE: One last thing... psql and readline
I've got the same problem with readline and psql using 7.0 RC1. I found that there was something odd when I ran configure. Here's the readline stuff I grepped out of config.cache: ac_cv_header_readline_h=${ac_cv_header_readline_h='no'} ac_cv_header_readline_history_h=${ac_cv_header_readline_history_h='yes'} ac_cv_header_readline_readline_h=${ac_cv_header_readline_readline_h='yes'} ac_cv_lib_readline_main=${ac_cv_lib_readline_main='yes'} ac_cv_lib_readline_using_history=${ac_cv_lib_readline_using_history='yes'} Notice that the first line still says "no". Looks to me like it wants to find readline.h in /usr/include as well as in /usr/include/readline. Could this be the source of the problem? Interesting observation, thanks for following up. In my case, I gave up (since the machine that needed the emergency upgrade was slightly whacked-out), and copied the psql binary from a different machine, a Redhat 6.1 setup, where the readline stuff compiled successfully. Sorry I can't be of more help well, let me see... On the system where it compiled successfully, readline.h is only found in /usr/include/readline/readline.h. steve
[GENERAL] And to make things even better...
/var/lib/pgsql reports that we're running 6.3 - which I don't find on the FTP site. Is that not a valid distribution number, or is the source not available? steve
Re: [GENERAL] well, shoot. Error loading Pg.so
First, I really appreciate the suggestions. On to a reply... Restore the /usr/lib/perl5/site_perl directory from a working backup. You'd think so, wouldn't you? : ) Unfortunately, the last time we put a cleaning tape in the DAT drive, the drive died, and now needs to be replaced. Because we're so vastly over-worked, that won't happen for a while. To make it worse, since we're so vastly over-loaded, we never had the time to make the DAT drive work in the first place. Try doing a ./configure --with-perl in the src directory, then go into interfaces/Perl5 and type make;make install. Maybe the currently installed src dir wasn't correct. The bad thing is that the last admin (who is now gone) installed 6.3 and deleted the source - and I can't find the 6.3 source from postgresql.org. Might also want to check that /usr/local/pgsql/lib is in your /etc/ld.so.conf file and re-run ldconfig. (Probably should do this before trying the rebuild of the perl interface) I'll give that a shot, thanks a million. steve
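For reference, the ld.so.conf fix suggested above usually comes down to one added line plus re-running ldconfig (root required); the prefix below is the default install location, so adjust if the old admin put things elsewhere:

```shell
# System-wide fix (requires root) - add the library dir and refresh the cache:
#   echo /usr/local/pgsql/lib >> /etc/ld.so.conf
#   ldconfig
# Per-session alternative that needs no root:
export LD_LIBRARY_PATH=/usr/local/pgsql/lib:$LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH" | grep -c /usr/local/pgsql/lib
```

The LD_LIBRARY_PATH route is handy for confirming that a missing library path really is the cause before editing system files.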
Re: [GENERAL] Re: newline character handling
maybe we need a keyword DOS|UNIX or perhaps TEXT|BINARY to tell postgresql to pick DOS style or UNIX style line endings... Maybe we just need to make sure that the files we are using are in the correct format for the platform they're being processed on. ; ) steve
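Converting between the two formats is a one-liner anyway, which is part of the argument against adding a keyword: a DOS file just has a carriage return before each newline, and stripping it yields a UNIX file.

```shell
# Build a small DOS-format file (CR LF line endings) and convert it:
printf 'line one\r\nline two\r\n' > dosfile.txt
tr -d '\r' < dosfile.txt > unixfile.txt
# The converted copy is two bytes shorter - one CR removed per line:
wc -c < dosfile.txt     # 20
wc -c < unixfile.txt    # 18
```

Running the input through tr (or dos2unix, where installed) before COPY sidesteps the whole question of which line ending the server should expect.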
[GENERAL] Postgres and clustering/scaling....
Our company is getting to where we need to think of turning our Postgres server into a cluster. We've come up with a few very viable ideas, but we'd love to hear of anything that others have done along this line. steve
Re: [GENERAL] LockRelease: locktable lookup failed, no lock
I'm getting "LockRelease: locktable lookup failed, no lock" trying to insert a row into a database. Well, I think I tracked down the problem - an index on a "text" field. Once I removed the index, everything works well. The odd thing is that not *all* inserts would cause the error. I suppose I ought to dig into the docs when I get a few spare minutes... What is the version of PostgreSQL and what kind of platform are you using? Lemmee see... pgsql 6.5.3 on redhat linux 6.1, kernel 2.2.12-20smp. The machine currently isn't running multi-CPU; that will happen soon. At bootup, linux finds only one chip, and does pseudo-epic. : ) steve
[GENERAL] Re: LockRelease: locktable lookup failed, no lock
Grr. Although I'm no longer getting lock releases on this table, a daily vacuum analyze still shows: NOTICE: Rel reef: Uninitialized page 492 - fixing NOTICE: Rel reef: Uninitialized page 498 - fixing NOTICE: Rel reef: Uninitialized page 499 - fixing The database is a fledgling mail archive: when a message comes in, a program parses the message for the relevant information, inserts it into the table, and it's done. Since it's in very early development, there are only very occasional queries on it. Here's the schema that I used to create the table: create sequence reef_sequence increment 2; create table reef( reef_index int4 default nextval('reef_sequence') primary key, subject varchar(256), author varchar(256), author_email varchar(256), message text, date date, time time, is_reply int ); Now I know that there are problems with that, like "is_reply" should be a bool, not an int. This is a very old project that I'm just starting up again... go easy on me. : ) steve
[GENERAL] Question on confusing behavior
Today, I told someone the syntax for "count" without thinking, and got it wrong - and in so doing, I found some interesting behavior. select count('table.fieldname'); works. however... select count('table.fieldname') where table.fieldname='something'; not only barfs, it kills your connection to the back-end. Then... select count(table.fieldname) where table.fieldname='something'; works fine. So, if the field name to count is enclosed in ticks, it works if there is no where clause, but not if there is a where clause. Now I'm not an expert at SQL, so I realize that it could be because I used totally illegal syntax. But, then I'd think that postgres would barf consistently. Anyone care to enlighten me as to what's going on? steve