Re: [PERFORM] Postgres over Linux NBD or NFS
There are some less expensive baby-SAN options coming out now - Dell has a rebranded EMC baby SAN (which of course doesn't work with any other EMC system...) that starts at about $6000 or so. I just read the announcement - I don't know anything else about it. While there have been some reports of underwhelming performance for database applications, the Apple XRaid has a sweet price point, particularly if you're in an industry that they want some exposure in (we're in financial services, they almost gave it to us...$7000 for 2TB, batteries, accessory kits, etc.), and a decent feature set. It works with non-Apple stuff.

The baby-SANs don't necessarily do many of the things that you can get out of a full-blown EMC/NetApp rig, but then again, you're not paying for it either. There are a lot of lower-cost storage options popping up now as IDE/SATA disk arrays proliferate. You can get external RAID boxes that talk SCSI or fiber with IDE disks for dirt these days. Small, too. Portable. I see little need to buy massive boxes with internal storage arrays anymore.

On Jun 22, 2004, at 7:50 AM, Christopher Browne wrote:

In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] (Andrew Rawnsley) transmitted:

On Jun 21, 2004, at 2:02 PM, Andrew Hammond wrote: We're looking for an alternative to fiber-channel disk arrays for mass storage. One of the ideas that we're exploring would involve having the cluster on an NFS mounted filesystem. Another technology we're looking at is the Linux NBD (Network Block Device).

No idea about NBDs, but it's generally accepted that running over NFS would significantly decrease reliability and performance, i.e. it would be a Bad Move (tm). Not sure what you think to gain. I sure wouldn't trust NFS with a production database. What exactly are you trying to gain, avoid, or do?

The point of the exercise is to try to come up with something that is a near-substitute for a SAN.
With a SAN, you have a box with a whole lot of disk in it, and then your database servers connect to that box, typically via something like fibrechannel.

One of the goals is for this to allow trying out Opterons at low risk. Should performance turn out to suck or there be some other disqualification, it's simple to hook the disk up to something else instead.

The other goal is to be able to stick LOTS of disk into one box, and dole it out to multiple servers. It's more expensive to set up and manage 3 RAID arrays than it is to set up and manage just 1, because you have to manage 3 sets of disk hardware rather than 1.

But I'm getting convinced that the attempt to get this clever about it is counterproductive unless you have outrageous amounts of money to throw at it.

- NFS may well be acceptable if you buy into something with potent FS semantics, as with NetApp boxes. But they're REALLY expensive.

- FibreChannel offers interesting options in conjunction with a fairly smart SAN box and Veritas, where you could have 5TB of storage in one box, and then assign 2TB apiece to two servers, and the other 1TB to a third. But the pricing premium again leaps out at ya.

The poor man's approach involves trying to fake this by building a disk box running Linux that exports the storage either as a filesystem (using NFS) or as disk blocks (NBD). NFS clearly doesn't provide the filesystem semantics needed to get decent reliability; with NBD, it's not clear what happens :-(.

Barring that, it means building a separate RAID array for each server, and living with the limitation that a full set of disk hardware has to be devoted to each DB server.

-- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','acm.org'). http://www3.sympatico.ca/cbbrowne/ Rules of the Evil Overlord #46. "If an advisor says to me 'My liege, he is but one man. What can one man possibly do?', I will reply 'This.' and kill the advisor."
http://www.eviloverlord.com/

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html

Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com
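For anyone who wants to experiment with the NBD half of this thread, a minimal sketch of the setup looks roughly like the following. Hostnames, paths, and sizes are illustrative, and the port-style nbd-server invocation assumes the classic nbd-tools syntax; check your version's man pages.

```shell
# On the storage box: create a backing file (or use a raw device) and
# export it as a block device on TCP port 2000.
dd if=/dev/zero of=/export/pgdisk.img bs=1M count=10240   # ~10GB backing file
nbd-server 2000 /export/pgdisk.img

# On the database server: attach the export as /dev/nbd0 and put an
# ordinary local filesystem on it.
modprobe nbd
nbd-client storagebox 2000 /dev/nbd0
mkfs -t ext3 /dev/nbd0
mount /dev/nbd0 /var/lib/pgsql/data
```

Whether fsync() on the client actually forces blocks to stable storage on the server is exactly the open question raised above, so treat this as an experiment, not a production recipe.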
Re: [PERFORM] Postgres over Linux NBD or NFS
How about iSCSI? This is exactly what it's for - presenting a bunch of remote SCSI hardware as if it were local. There are several reference implementations on SourceForge, from Intel, Cisco, and others. I've never tried it myself, but I would if I had the need. And let's face it, there are some very big players selling very pricey kit that uses it, so you should have pretty high confidence that the fundamentals are strong.

M

The other goal is to be able to stick LOTS of disk into one box, and dole it out to multiple servers. It's more expensive to set up and manage 3 RAID arrays than it is to set up and manage just 1, because you have to manage 3 sets of disk hardware rather than 1. [snip] The poor man's approach involves trying to fake this by building a disk box running Linux that exports the storage either as a filesystem (using NFS) or as disk blocks (NBD). NFS clearly doesn't provide the filesystem semantics needed to get decent reliability; with NBD, it's not clear what happens :-(.
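To make the "remote SCSI presented as local" idea concrete, here is a sketch of the initiator side using the open-iscsi `iscsiadm` tool (a later Linux stack than the SourceForge implementations mentioned above; the target address and IQN are illustrative):

```shell
# Ask the storage box which targets it offers
iscsiadm -m discovery -t sendtargets -p 192.168.1.10

# Log in to one of the discovered targets (the IQN is illustrative)
iscsiadm -m node -T iqn.2004-06.com.example:storage.pgdisk \
         -p 192.168.1.10 --login

# The LUN now shows up as an ordinary SCSI disk (e.g. /dev/sdb) that
# can be partitioned, formatted, and mounted like local storage.
mkfs -t ext3 /dev/sdb1
mount /dev/sdb1 /var/lib/pgsql/data
```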
Re: [PERFORM] [BULK] Problems with vacuum!
As you suggest, I've tried to upgrade to kernel 2.4.28, but it seems that nothing changed! I built a new machine with Red Hat 8 (kernel 2.4.28) and 1GB RAM, using the same parameters I had used before. After the boot, I've got 800MB of free memory. If I launch a pg_dump, then the system swaps (only 1 or 2 MB) and the free memory becomes 20MB. Now, if I try a vacuumdb, the system partially uses the cached memory, then begins to swap again. With 700MB of cached memory it's very strange that the system swaps again. Does anyone know why this happens?

Best regards,
Sgarbossa Domenico
X Tecnica S.R.L. www.xtecnica.com
Tel: 049/9409154 - 049/5970297 Fax: 049/9400288

- Original Message - From: Scott Marlowe [EMAIL PROTECTED] To: Joshua D. Drake [EMAIL PROTECTED] Cc: Tom Lane [EMAIL PROTECTED]; Domenico Sgarbossa [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Friday, June 18, 2004 8:58 PM Subject: Re: [PERFORM] [BULK] Problems with vacuum!

I believe it was more like the kernel was tuned to make it less common, but certain things can still trigger it. I know the problem was still there in 2.4.24 on the last server I was playing with, but it was a lot less of a problem than it had been under 2.4.9 on an earlier machine with the same basic amount of memory.

On Fri, 2004-06-18 at 12:47, Joshua D. Drake wrote: Hello, I would have to double check BUT I believe this is fixed in later 2.4.x kernels as well. If you don't want to go through the hassle of 2.6 (although it really is a nice kernel) then upgrade to 2.4.26. Sincerely, Joshua D. Drake

Scott Marlowe wrote: On Fri, 2004-06-18 at 09:11, Tom Lane wrote: Domenico Sgarbossa [EMAIL PROTECTED] writes: so when the users go home, I've got something like 15/2kb free ram, the rest is cached and 0kb of swap... It seems that when pg_dump starts, the cached memory isn't released, so the system begins to swap. A sane kernel should drop disk buffers rather than swapping.
We heard recently about a bug in some versions of the Linux kernel that causes it to prefer swapping to discarding disk cache, though. It sounds like that's what you're hitting. Look into newer kernels ... This was a common problem in the Linux 2.4 series kernels, but has supposedly been fixed in the 2.6 kernels. Having lots of memory and turning off swap will fix the problem in 2.4, but if you run out of real mem, you're hosed.
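On the 2.6 kernels mentioned here, the swap-versus-cache preference became tunable via `vm.swappiness` (0-100, default 60); this knob is not available on stock 2.4 kernels, so treat the following as a 2.6-only sketch:

```shell
# Lower values make the kernel prefer dropping page cache over
# swapping out process memory.
sysctl -w vm.swappiness=10

# Persist the setting across reboots
echo "vm.swappiness = 10" >> /etc/sysctl.conf
```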
Re: [PERFORM] Postgres over Linux NBD or NFS
| What exactly are you trying to gain, avoid, or do?

Gain: separate database storage from processing. This lets me move clusters from one server to another easily. Just stop the postgres instance on server A and unmount its filesystem, then mount it on server B and start a postgres instance on server B. It gives me some fail-over capability, as well as scalability and a lot of flexibility in balancing load over multiple servers.

Avoid: paying for brutally expensive FC gear.

-- Andrew Hammond 416-673-4138 [EMAIL PROTECTED]
Database Administrator, Afilias Canada Corp.
CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A
[PERFORM] postgresql and openmosix migration
Hi, I am trying to make a cluster out of any database: postgresql, mysql, or any other free database. I have looked at openmosix patched with the migshm patch for shared memory support, and it seems that neither works fully. Postgresql in particular uses "shared memory but not the system semaphores for locking it". Thus apparently it won't benefit from an openmosix cluster. In addition, mysql doesn't seem to migrate because it is multithreaded. Any ideas of how I can cluster my database (around 800 GB in size, so even partial replication is not really practical)? If interested, this is my source for openmosix and migshm information: http://howto.ipng.be/MigSHM-openMosix/x90.html Thanks.
Re: [PERFORM] reindex and copy - deadlock?
Hi, I have changed "reindex table my_table" to: psql ... -c "drop index my_index; create index my_index;" We still experience the same hang problem. I was told that this time, the process was in "create index my_index;" before the PG server was bounced. When I logged in to the database, I found that my_index is still there. I do not know what caused this to happen, and I am also confused. If "create index my_index" was killed by -9, then my_index should not be present in the database, because it had been dropped before being recreated. On the other hand, if "drop index my_index;" was killed, then how can drop index (which is DDL, right?) be blocked? There must be some other process(es) holding an exclusive lock on my_index, which is not our case from pg_locks. Tom, we are in the process of installing the backend with --enable-debug. Thanks,

--- Tom Lane [EMAIL PROTECTED] wrote: Litao Wu [EMAIL PROTECTED] writes: One difference between these two databases is the one having the REINDEX problem is using the NTFS file system. Oh? That's interesting. Is it possible it's the root of the problem? I would not expect it to show this particular symptom --- if the backtrace is accurate. But there are nearby places that might have FS-dependent behavior. Can you do anything about my request for a stack trace from a debug-enabled build? regards, tom lane
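To check for a blocker from within the database, something along these lines against the pg_locks view can help (column names per the 7.x-era view; rows for transaction-level locks have a NULL relation and are skipped by the join):

```sql
-- Show which backend holds or is waiting for a lock on each relation
SELECT l.pid, c.relname, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
ORDER BY c.relname, l.granted DESC;
```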
Re: [PERFORM] Hi!
Janio,

"I am trying to install postgresql-7.4.3, simple installation. I did the ./configure command in the postgresql source directory. During the configure process I received the following message:"

This is the wrong list for this question. Please try PGSQL-ADMIN. You're much more likely to get help there. Sorry!

-- Josh Berkus
Aglio Database Solutions
San Francisco
[PERFORM] after using pg_resetxlog, db lost
The pg_resetxlog was run as root. It caused ownership problems with pg_control and the xlog files. Now we have no access to the data through psql. The data is still there under /var/lib/pgsql/data/base/17347 (PWFPM_DEV DB name), but there is no reference to 36 of our tables in pg_class. Also, the 18 other tables that are reported in this database have no data in them. Is there any way to have the database resync or make it aware of the data under /var/lib/pgsql/data/base/17347? How can this problem be resolved? There are actually 346 db files adding up to 134 GB in this database. Below are error messages from when the database was being started. I am not sure of when pg_resetxlog was run. I suspect it was run to get rid of the "invalid primary checkpoint record". The postgresql DB had an error trying to start up. The error was:

Jun 22 13:17:53 murphy postgres[27430]: [4-1] LOG: invalid primary checkpoint record
Jun 22 13:17:53 murphy postgres[27430]: [5-1] LOG: could not open file /var/lib/pgsql/data/pg_xlog/ (log file 0, segment 0): No such file or directory
Jun 22 13:18:49 murphy postgres[28778]: [6-1] LOG: invalid secondary checkpoint record
Jun 22 13:18:49 murphy postgres[28778]: [7-1] PANIC: could not locate a valid checkpoint record
Jun 22 13:26:01 murphy postgres[30770]: [6-1] LOG: database system is ready
Jun 22 13:26:02 murphy postgresql: Starting postgresql service: succeeded
Jun 22 13:26:20 murphy postgres[30789]: [2-1] PANIC: could not access status of transaction 553
Jun 22 13:26:20 murphy postgres[30789]: [2-2] DETAIL: could not open file /var/lib/pgsql/data/pg_clog/: No such file or directory
Jun 22 13:26:20 murphy postgres[30789]: [2-3] STATEMENT: COMMIT

and

Jun 22 13:26:20 murphy postgres[30791]: [10-1] LOG: redo starts at 0/250
Jun 22 13:26:20 murphy postgres[30791]: [11-1] LOG: file /var/lib/pgsql/data/pg_clog/ doesn't exist, reading as zeroes
Jun 22 13:26:20 murphy postgres[30791]: [12-1] LOG: record with zero length at 0/2000E84
Jun 22 13:26:20 murphy postgres[30791]: [13-1] LOG: redo done at 0/2000E60
Jun 22 13:26:20 murphy postgres[30791]: [14-1] WARNING: xlog flush request 213/7363F354 is not satisfied --- flushed only to 0/2000E84
Jun 22 13:26:20 murphy postgres[30791]: [14-2] CONTEXT: writing block 840074 of relation 17347/356768772
Jun 22 13:26:20 murphy postgres[30791]: [15-1] WARNING: xlog flush request 213/58426648 is not satisfied --- flushed only to 0/2000E84

and

Jun 22 13:38:23 murphy postgres[1460]: [2-1] ERROR: xlog flush request 210/E757F150 is not satisfied --- flushed only to 0/2074CA0
Jun 22 13:38:23 murphy postgres[1460]: [2-2] CONTEXT: writing block 824605 of relation 17347/356768772

We are using a SAN for our storage device.
Re: [PERFORM] postgresql and openmosix migration
Hi Bill, I am more often in the needing-help category than the giving-help category when it comes to advice about using postgresql. I have found it to be an extremely powerful tool and by far the best performance/price for my work. I think you will get some excellent answers and help to your performance questions if you send the list details about specific queries that are running too slow. If you are willing to throw more/bigger hardware at the problem, let people know that when you ask, and they will tell you if your bottleneck can be alleviated through more ram, disks, cpu or whatever. Having been watching this list for some time now, I suspect most of the performance problems can be improved using non-intuitive query or configuration modifications (for example, replacing min()/max() as suggested by Mr. Wolf).

The heavy hitters on the list will usually ask for an explain analyze of your query. If your query is "select * from foo", then change it to "EXPLAIN ANALYZE select * from foo" and post the output. It will look something like this:

QUERY PLAN
---
Seq Scan on foo (cost=0.00..1.04 rows=4 width=44) (actual time=8.46..8.47 rows=4 loops=1)
Total runtime: 19.63 msec
(2 rows)

I'm sure your data is confidential; mine is too. The good news is that none of your data is included in the query plan -- only technical details about what the database is doing.

If your problem might involve the application that works with the data, give some details about that. For example, if you're using a Java application, let people know what driver version you use, what jvm, and other related info. There are lurkers on this list using just about every programming language imaginable on more platforms than you can shake a stick at (I don't care how good you are at shaking sticks, either). The more details you give, the better help you're going to get, and you'd be amazed at the results I've seen people get with a judicious amount of tweaking.

The other day someone had a query that took hours drop to less than 10 minutes by using some techniques prescribed by members on the list. Bringing 30-60 second queries down to 2-3 seconds is commonplace. You seem to be ready to throw money at the problem by investing in new hardware, but I would suggest digging into the performance problems first. Too many times we've seen people on the list say, "I've just spent $x0,000 on a new xyz and I'm still having problems with this query." Oftentimes the true solution is rewriting queries, tweaking config parameters, adding RAM and upgrading disks (in that order, I believe).

As I found out even today on the SQL list, it's best to ask questions in this form: "I want to do this... I've been trying this... I'm getting this... which is problematic because..." The more clearly you state the abstract goal, the more creative answers you'll get, with people often suggesting things you'd never considered. I hope this helps and I hope that you achieve your goals of a well-performing application.

Matthew Nuzum | Makers of Elite Content Management System
www.followers.net | View samples of Elite CMS in action
[EMAIL PROTECTED] | http://www.followers.net/portfolio/

-----Original Message----- From: [EMAIL PROTECTED] [mailto:pgsql-performance-[EMAIL PROTECTED]] On Behalf Of Bill Sent: Tuesday, June 22, 2004 1:31 PM To: Josh Berkus Cc: [EMAIL PROTECTED] Subject: Re: [PERFORM] postgresql and openmosix migration

Ok, so maybe someone on this group will have a better idea. We have a database of financial information, and this has literally millions of entries. I have installed indices, but for the rather computationally demanding processes we like to use, like a select query to find the commodity with the highest monthly or annual returns, the computer generally runs unacceptably slow. So, other than clustering, how could I achieve a speed increase in these complex queries? Is this better in mysql or postgresql? Thanks.
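The min()/max() rewrite Matthew alludes to works because PostgreSQL of this era cannot use an index for a bare aggregate; the table and column names below are illustrative, not from Bill's schema:

```sql
-- Slow: forces a full sequential scan to compute the aggregate
SELECT max(monthly_return) FROM commodity_returns;

-- Equivalent result, but able to walk an index on monthly_return
SELECT monthly_return
FROM commodity_returns
ORDER BY monthly_return DESC
LIMIT 1;
```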
Re: [PERFORM] after using pg_resetxlog, db lost
Shea,Dan [CIS] [EMAIL PROTECTED] writes: The pg_resetxlog was run as root. It caused ownership problems of pg_control and xlog files. Now we have no access to the data now through psql. The data is still there under /var/lib/pgsql/data/base/17347 (PWFPM_DEV DB name). But there is no reference to 36 of our tables in pg_class. Also the 18 other tables that are reported in this database have no data in them. Is there anyway to have the database resync or make it aware of the data under /var/lib/pgsql/data/base/17347? How can this problem be resolved? What this sounds like is that you reset the transaction counter along with the xlog, so that those tables appear to have been created by transactions in the future. This could be repaired by doing pg_resetxlog with a more appropriate initial transaction ID, but figuring out what that value should be is not easy :-( What I'd suggest is grabbing pg_filedump from http://sources.redhat.com/rhdb/ and using it to look through pg_class (which will be file $PGDATA/base/yourdbnumber/1259) to see the highest transaction ID mentioned in any row of pg_class. Then pg_resetxlog with a value a bit larger than that. Now you should be able to see all the rows in pg_class ... but this doesn't get you out of the woods yet, unless there are very-recently-created tables shown in pg_class. I'd suggest next looking through whichever tables you know to be recently modified to find the highest transaction ID mentioned in them, and finally doing another pg_resetxlog with a value a few million greater than that. Then you should be okay. The reason you need to do this in two steps is that you'll need to look at pg_class.relfilenode to get the file names of your recently-modified tables. Do NOT modify the database in any way while you are running with the intermediate transaction ID setting. 
Jun 22 13:38:23 murphy postgres[1460]: [2-1] ERROR: xlog flush request 210/E757F150 is not satisfied --- flushed only to 0/2074CA0 Looks like you also need a larger initial WAL offset in your pg_resetxlog command. Unlike the case with transaction IDs, there's no need to try to be somewhat accurate in the setting --- I'd just use a number WAY beyond what you had, maybe like 1/0. Finally, the fact that all this happened suggests that you lost the contents of pg_control (else pg_resetxlog would have picked up the right values from it). Be very sure that you run pg_resetxlog under the same locale settings (LC_COLLATE,LC_CTYPE) that you initially initdb'd with. Otherwise you're likely to have nasty index-corruption problems later. Good luck. Next time, don't let amateurs fool with pg_resetxlog (and anyone who'd run it as root definitely doesn't know what they're doing). It is a wizard's tool. Get knowledgeable advice from the PG lists before you use it rather than after. regards, tom lane
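A sketch of the two-step procedure Tom describes, using the database OID from this thread; the flag names follow the RHDB pg_filedump and 7.4-era pg_resetxlog tools, and the transaction ID shown is purely illustrative:

```shell
# Step 1: dump interpreted item data for pg_class (file 1259) and look
# for the highest transaction IDs mentioned in tuple headers.
pg_filedump -i $PGDATA/base/17347/1259 | grep -i xmin

# Step 2: reset with a transaction ID a bit beyond the highest one seen
# (-x sets the next transaction ID; the value here is made up).
pg_resetxlog -x 5000000 $PGDATA
```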
Re: [PERFORM] after using pg_resetxlog, db lost
Tom Lane wrote: Shea,Dan [CIS] [EMAIL PROTECTED] writes: The pg_resetxlog was run as root. It caused ownership problems with pg_control and the xlog files. Now we have no access to the data through psql. The data is still there under /var/lib/pgsql/data/base/17347 (PWFPM_DEV DB name), but there is no reference to 36 of our tables in pg_class. Also, the 18 other tables that are reported in this database have no data in them. Is there any way to have the database resync or make it aware of the data under /var/lib/pgsql/data/base/17347? How can this problem be resolved? What this sounds like is that you reset the transaction counter along with the xlog, so that those tables appear to have been created by transactions in the future. This could be repaired by doing pg_resetxlog with a more appropriate initial transaction ID, but figuring out what that value should be is not easy :-(

Tom - would there be any value in adding this to a pg_dump? I'm assuming the numbers attached to tables etc. are their OIDs anyway, so it might be a useful reference in cases like this.

-- Richard Huxton
Archonet Ltd
Re: [PERFORM] postgresql and openmosix migration
Bill wrote: Ok, so maybe someone on this group will have a better idea. We have a database of financial information, and this has literally millions of entries. I have installed indices, but for the rather computationally demanding processes we like to use, like a select query to find the commodity with the highest monthly or annual returns, the computer generally runs unacceptably slow. So, other than clustering, how could I achieve a speed increase in these complex queries? Is this better in mysql or postgresql?

If the bottleneck is really computational, not I/O, you might try PL/R in conjunction with the rpvm R package. rpvm allows R to make use of pvm to split its load among a cluster. See:

R: http://www.r-project.org/
PL/R: http://www.joeconway.com/plr/
rpvm: http://cran.r-project.org/src/contrib/Descriptions/rpvm.html http://cran.r-project.org/doc/packages/rpvm.pdf

I haven't had a chance to play with this myself yet, but I hope to relatively soon. HTH, Joe
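For flavor, a PL/R function is declared like any other procedural-language function; a minimal sketch, assuming PL/R is already installed in the database (the function body here is trivial and does not use rpvm):

```sql
-- Compute the median of a float8 array inside R; unnamed arguments
-- are visible to the R body as arg1, arg2, ...
CREATE OR REPLACE FUNCTION r_median(float8[]) RETURNS float8 AS '
  median(arg1)
' LANGUAGE 'plr';

SELECT r_median(ARRAY[1.0, 2.0, 10.0]);  -- the middle value, 2
```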