[PERFORM] arrays and indexes
Hi all - I've got a schema I'm working on modifying, and I need some help getting the best performance out. The original schema has a many-to-many linkage between a couple of tables, using a two-column linkage table. This is used to represent groups of people and their relationship to an object (authors, copyrightholders, maintainers). This worked fine, and, with the right indexes, is quite zippy. Approximate schema:

table content (
    contentid serial,
    name text,
    ...
    authorgroupid int,
    cpholdergroupid int,
    maintgroupid int
)

table groups (
    personid text,
    groupid int
)

Note that neither groupid nor personid is unique.

Now the users want not just groups, but ordered lists. Well, that's just fine: we could do it with another column in the groups linkage table, and some additional logic in the middleware for detecting identical groups, but it occurred to me that PG's array types are just the ticket for ordered lists like this. So, by dropping arrays of personids (authors, copyrightholders, maintainers, ...) into the content table, I can do everything I need. Only one problem. Retrieving all the content for a particular person/role is fairly common. Queries of the form:

SELECT * from content c join groups g on c.authorgroupid = g.groupid where g.personid = 'ross';

work fine and use the index on groups.personid. In the new schema, the same thing is:

SELECT * from content where 42 = ANY (authors);

This works fine, but for the life of me I can't find nor figure out how to build an index that will be used to speed this along. Any ideas? I'm using 7.4.3, BTW.

Ross
--
Ross Reedstrom, Ph.D.
[EMAIL PROTECTED] Research Scientist phone: 713-348-6166 The Connexions Project http://cnx.rice.edu fax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] arrays and indexes
On Mon, Jul 26, 2004 at 02:27:20AM -0400, Greg Stark wrote: Ross J. Reedstrom [EMAIL PROTECTED] writes: In the new schema, the same thing is: SELECT * from content where 42 = ANY (authors); Works fine, but for the life of me I can't find nor figure out how to build an index that will be used to speed this along. Any ideas? Well, that's basically the problem with denormalized data like this. Have you resolved what you're going to do if two sessions try to add a user to the same group at the same time? Or how you'll go about removing a user from all his groups in one shot? We've got plenty of interlocks in the middleware to handle the first (mainly because this is an authoring system where everyone has to agree to participate, and acknowledge the open license on the materials). Second, they _can't_ be removed: we're effectively a write-only archive. Even if we weren't, it would be a rare event and could go slowly (loop over groups in the middleware, probably). Basically, if you denormalize in this fashion it becomes hard to use the groups as anything but single monolithic objects. Whereas normalized data can be queried and updated from other points of view, like in the case you name above. These groups _really are_ ideal for Joe Conway's work on arrays: we need ordered vectors, so we'd be sorting all the time otherwise. They're static, and they're read-only. The one thing they're not is fixed, known size (sorry Merlin). They work fine for the query as shown: the only issue is performance. Postgres does have a way to do what you ask, though. It involves GiST indexes and the operators from the contrib/intarray directory in the Postgres source. Well, yes, that's how it used to be done. I figured the new array support should be able to handle it without the add-on, however. However, I warn you in advance that this is fairly esoteric stuff and will take some time to get used to.
And at least in my case I found the indexes didn't actually help much for my data sets, probably because they just weren't big enough to benefit. I know that they should help in this case: we've got lots of content, and any particular author or maintainer will be in a small fraction of it - i.e. it's ideal for an index. And the current joined case uses an index, when it's available. I'll take a look at the GiST/contrib work, anyway. Thanks - Ross ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
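For reference, the contrib/intarray route mentioned above looks roughly like this. This is a sketch only: it assumes the intarray contrib module is installed in the database, that the authors column is int[], and it uses the opclass and operator names from the 7.4-era module:

```sql
-- GiST index over the integer array, using intarray's opclass
CREATE INDEX content_authors_idx
    ON content USING gist (authors gist__int_ops);

-- intarray's overlap operator && can use that index,
-- unlike the plain  42 = ANY (authors)  form
SELECT * FROM content WHERE authors && '{42}'::int[];
```

The result set is the same as the ANY() query; only the operator (and hence index usability) differs.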
[PERFORM] TCP network cost
Recently I've been working on improving the performance of a system that delivers files stored in postgresql as bytea data. I was surprised at just how much of a penalty I find moving from a domain-socket connection to a TCP connection, even on localhost. For one particular 40MB file (nothing outrageous) I see ~2.5 sec. to download w/ the domain socket, but ~45 sec. for a TCP connection (either localhost, the name of localhost, or from another machine 5 hops away - on campus, gigabit LAN). Similar numbers for 8.2.3 or 8.3.6 (on Linux/Debian etch + backports). So, why the 20-fold penalty for using TCP? Any clues on how to trace what's up in the network IO stack?

Ross
--
Ross Reedstrom, Ph.D. reeds...@rice.edu Systems Engineer Admin, Research Scientist phone: 713-348-6166 The Connexions Project http://cnx.org fax: 713-348-3665 Rice University MS-375, Houston, TX 77005 GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] TCP network cost
On Tue, Feb 17, 2009 at 12:20:02AM -0700, Rusty Conover wrote: Try running tests with ttcp to eliminate any PostgreSQL overhead and find out the real bandwidth between the two machines. If its results are also slow, you know the problem is TCP related and not PostgreSQL related.

I did in fact run a simple netcat client/server pair and verified that I can transfer that file in 0.12 sec on localhost (or hostname), 0.35 sec over the net, so the TCP stack and network are not to blame. This is purely a postgresql-code issue, I believe.

On Tue, Feb 17, 2009 at 10:13:40AM -0800, Aaron Turner wrote: TCP has additional overhead, as well as going through the IP stack, which for non-tuned Linux kernels is pretty limiting.

Right. Already tuned those so long ago, I failed to mention it. Note the 'bare' transfer times added above. Nothing to write home about (~3Mb/sec) but another order of magnitude faster than the postgresql transfer.

long story short, there are things in /proc you can use to increase buffers and window sizes which will help with large TCP streams (like a 40MB file for example). There's a lot of documentation on the net for how to tune the Linux IP stack so I won't repeat it here. Now, having your DB box 5 hops away is going to add a lot of latency, and any packet loss is going to kill TCP throughput - especially if you increase window sizes. I'd recommend something like mtr to map the network traffic (make sure you run it both ways in case you have an asymmetric routing situation) for a long period of time to look for hiccups.

The 5 hops are on campus, gigabit all the way, w/ reasonable routing - and not the issue: I see the same times from another machine attached to the same switch (which is the real use-case, actually.)

Ross
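A raw-socket baseline like the netcat test above can also be scripted, which makes it easy to repeat on different hosts. A minimal pure-Python sketch (loopback only; payload size chosen to match the 40MB file under discussion) that measures a transfer with no postgres in the loop:

```python
import socket
import threading
import time

PAYLOAD = b"x" * (40 * 1024 * 1024)  # ~40MB, like the file in question

def serve(listener):
    # accept one connection, push the whole payload, then close
    conn, _ = listener.accept()
    with conn:
        conn.sendall(PAYLOAD)

def measure_loopback():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # ephemeral port
    listener.listen(1)
    port = listener.getsockname()[1]
    t = threading.Thread(target=serve, args=(listener,))
    t.start()
    start = time.time()
    chunks = []
    with socket.create_connection(("127.0.0.1", port)) as s:
        while True:
            buf = s.recv(1 << 16)
            if not buf:  # server closed: transfer complete
                break
            chunks.append(buf)
    elapsed = time.time() - start
    t.join()
    listener.close()
    return b"".join(chunks), elapsed

data, elapsed = measure_loopback()
print(len(data), elapsed)
```

If this runs in a fraction of a second while the equivalent libpq fetch takes tens of seconds, the stack and NIC are exonerated, as observed above.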
Re: [PERFORM] TCP network cost
On Tue, Feb 17, 2009 at 01:59:55PM -0700, Rusty Conover wrote: On Feb 17, 2009, at 1:04 PM, Ross J. Reedstrom wrote: What is the client software you're using? libpq? python w/ psycopg (or psycopg2), which wraps libpq. Same results w/ either version. I think I'll try network sniffing to see if I can find where the delays are happening. Ross
Re: [PERFORM] TCP network cost
On Tue, Feb 17, 2009 at 03:14:55PM -0600, Ross J. Reedstrom wrote: On Tue, Feb 17, 2009 at 01:59:55PM -0700, Rusty Conover wrote: What is the client software you're using? libpq? python w/ psycopg (or psycopg2), which wraps libpq. Same results w/ either version. It's not python networking per se's fault: sending the file via a SimpleHTTPServer, and fetching w/ wget, takes on the order of 0.5 sec. as well. I think I'll try network sniffing to see if I can find where the delays are happening. I'm no TCP/IP expert, but some packet capturing and wireshark analysis makes me suspicious about flow control. The 'netcat' transfer shows lots of packets from server -> client, w/ deltaTs of 8 - 200 usec (that's micro-sec), mostly in the 10-20 range. The client -> server 'ack's seem bursty, happening only every 50-100 packets, then a few back-to-back, all taking 10-20 usec. I also see occasional lost packets, retransmits, and TCP Window Updates in this stream. The FIN packet comes after 8553 packets. For the libpq-driven transfer, I see lots of packets flowing both ways. It seems about every other packet from server to client is 'ack'ed. Each of these 'ack's takes 10 usec to send, but seems to cause the transfer to 'reset', since the next packet from the server doesn't arrive for 2-2.5 ms (that's milli-sec!) FIN happens at 63155 packets. No lost packets, no renegotiation, etc. Capturing a localhost transfer shows the same pattern, although now almost every single packet from server -> client takes ~3 ms. So, TCP experts out there, what's the scoop? Is libpq/psycopg being very conservative, or am I barking up the wrong tree? Are there network socket properties I need to be tweaking? Does framing up for TCP just take that long when the bits are coming from the DB? I assume the unix-domain socket case still uses the full postgresql messaging protocol, but wouldn't need to worry about network byte order, etc. All the postgres tunable knobs I can see seem to talk about disk IO, rather than net IO.
Can someone point me at some doco about net IO?

Ross
Re: [PERFORM] TCP network cost
[note: sending a message that's been sitting in 'drafts' since last week] Summary: the C client and the large-object API from python both send bits in reasonable time, but I suspect there's still room for improvement in libpq over TCP: I'm suspicious of the 6x difference. Detailed analysis will probably find it's all down to memory allocation and extra copying of bits around (client side). Ross

On Wed, Feb 18, 2009 at 01:44:23PM +, Gregory Stark wrote: There's not much Postgres can do to mess up TCP/IP. The only things that come to mind are a) making lots of short-lived connections and b) getting caught by Nagle when doing lots of short operations and blocking waiting on results.

The hint re: Nagle sent me off hunting. It looks like libpq _should_ be setting NODELAY on both sides of the socket. However, tcptrace output does show (what I understand to be) the stereotypical every-other-packet-acked stairstep of a delayed-ack/Nagle interaction (as described here: http://www.stuartcheshire.org/papers/NagleDelayedAck/ ). Walking through the libpq code, though, it sets NODELAY, so Nagle should be out of the picture. This may be a red herring, though. See below.

What libpq (or other interface) operations are you doing exactly?

I'm using psycopg from python. My cut-down test case is:

con = psycopg.connect('dbname=mydb user=myuser port=5433 host=myhost')
cur = con.cursor()
start = DateTime.now()
cur.execute("select file from files where fileid=1")
data = cur.fetchone()[0]
end = DateTime.now()
f = open('/dev/null', 'w')
f.write(data)
f.close()
cur.close()
print "tcp socket: %s" % str(end - start)

I've since written a minimal C app, and it's doing much better, down to about 7 sec for a local TCP connection (either localhost or hostname). So, I get to blame the psycopg wrapper for ~30 sec of delay. I'm suspicious of memory allocation, myself.
The tcp traces (tcpdump + tcptrace + xplot are a cool set of tools, btw) indicate that the backend's taking ~0.35 sec to process the query and start sending bits, and using a domain socket w/ that code gets the file in 1.3 - 1.4 sec, so I'm still seeing a 6-fold slowdown for going via TCP (6 sec. vs. 1 sec.) Sending the raw file via apache (localhost) takes ~200 ms. Moving to a large-object based implementation would seem to confirm that: psycopg2 (snapshot of svn head) manages to pull a lo version of the file in times equivalent to the C client (7 sec local). I'll probably move the system to use that, since there's really almost no use-case for access to the insides of these files from SQL.

[also, your Mail-Followup-To has a bogus email address in it. Please don't do that]

Hmm, not on purpose. I'll take a look.
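One thing that's easy to verify from the client side is whether NODELAY is actually in effect on a given socket, as libpq is supposed to arrange. A small stand-alone sketch of the check (a plain socket is shown here, not one handed back by libpq):

```python
import socket

def nagle_disabled(sock):
    # True when TCP_NODELAY is set, i.e. Nagle's algorithm is off
    return bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(nagle_disabled(s))   # kernel default, normally False
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
print(nagle_disabled(s))   # True once set
s.close()
```

Running the same getsockopt check against the fd underneath a live libpq connection would confirm whether Nagle really is out of the picture.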
Re: [PERFORM] TCP network cost
On Thu, Feb 19, 2009 at 02:09:04PM +0100, PFC wrote: python w/ psycopg (or psycopg2), which wraps libpq. Same results w/ either version. I've seen psycopg2 saturate a 100 Mbps ethernet connection (direct connection with crossover cable) between postgres server and client during a benchmark... I had to change the benchmark to not retrieve a large TEXT column to remove this bottleneck... this was last year, so versions are probably different, but I don't think this matters a lot...

Here's the core of the problem: I in fact need to transfer exactly that: a large single field (bytea in my case). I suspect psycopg[12] is having issues w/ memory allocation, but that's just an unsupported gut feeling. The final upshot is that I need to restructure my config to use the large-object API (and hence a snapshot of psycopg2) to get decent throughput.

You should test by sending a large (100 MB) amount of data through netcat. This should give you your maximum wire speed. Use /dev/zero as the test source, and use pv (pipe viewer) to measure throughput:

box 1 : pv /dev/zero | nc -lp 12345
box 2 : nc (ip) 12345 > /dev/null

On gigabit lan you should get 100 MB/s, on 100BaseT about 10 MB/s.

112 MB/s, and 233 MB/s for localhost. Thanks for the pointer to pv: looks like a nice tool. Investigating this problem has led me to a number of nice 'old school' tools: the others are tcptrace and xplot.org. I've been hand-reading tcpdump output, or clicking around in ethereal/wireshark. I like tcptrace's approach.

Ross
Re: [PERFORM] TCP network cost
Excellent. I'll take a look at this and report back here. Ross On Mon, Feb 23, 2009 at 04:17:00PM -0500, Tom Lane wrote: Ross J. Reedstrom reeds...@rice.edu writes: Summary: C client and large-object API python both send bits in reasonable time, but I suspect there's still room for improvement in libpq over TCP: I'm suspicious of the 6x difference. Detailed analysis will probably find it's all down to memory allocation and extra copying of bits around (client side) I wonder if the backend isn't contributing to the problem too. It chops its sends up into 8K units, which doesn't seem to create huge overhead in my environment but maybe it does in yours. It'd be interesting to see what results you get from the attached quick-and-dirty patch (against HEAD, but it should apply back to at least 8.1). regards, tom lane
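Tom's 8K-send-unit hypothesis can also be poked at outside the backend. A toy loopback sketch (the sizes and helper names are mine, not the backend's) that transfers the same payload with different send-unit sizes, so the timings can be compared directly:

```python
import socket
import threading
import time

def send_in_chunks(conn, payload, chunk):
    # mimic a server that hands the kernel `chunk`-sized pieces at a time
    for i in range(0, len(payload), chunk):
        conn.sendall(payload[i:i + chunk])

def timed_transfer(payload, chunk):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def server():
        conn, _ = listener.accept()
        with conn:
            send_in_chunks(conn, payload, chunk)

    t = threading.Thread(target=server)
    t.start()
    received = bytearray()
    start = time.time()
    with socket.create_connection(listener.getsockname()) as s:
        while len(received) < len(payload):
            buf = s.recv(1 << 16)
            if not buf:
                break
            received.extend(buf)
    elapsed = time.time() - start
    t.join()
    listener.close()
    return bytes(received), elapsed

payload = b"x" * (8 * 1024 * 1024)
for chunk in (8192, len(payload)):
    data, dt = timed_transfer(payload, chunk)
    print(chunk, len(data), round(dt, 3))
```

On a loopback transfer the difference may well be in the noise, as Tom suggests for his environment; the point is only to have a postgres-free way to see whether 8K send units cost anything on a given stack.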
Re: [PERFORM] mysql to postgresql, performance questions
On Sat, Mar 20, 2010 at 10:47:30PM -0500, Andy Colson wrote: I guess, for me, once I started using PG and learned enough about it (all dbs have their own quirks and dark corners) I was in love. It wasn't important which db was fastest at xyz, it was which tool do I know, and trust, that can solve problem xyz. (I added the and trust as an afterthought, because I do have one very important 100% uptime required mysql database that is running. It's my MythTV box at home, and I have to ask permission from my GF before I take the box down to upgrade anything. And heaven forbid if it crashes or anything. So I do have experience with care and feeding of mysql. And no, I'm not kidding.) And I choose PG.

Andy, you are so me! I have the exact same one-and-only-one mission critical mysql DB, but the gatekeeper is my wife. And experience with that instance has made me love and trust PostgreSQL even more.

Ross
Re: [PERFORM] mysql to postgresql, performance questions
On Tue, Mar 23, 2010 at 03:22:01PM -0400, Tom Lane wrote: Ross J. Reedstrom reeds...@rice.edu writes: Andy, you are so me! I have the exact same one-and-only-one mission critical mysql DB, but the gatekeeper is my wife. And experience with that instance has made me love and trust PostgreSQL even more. So has anyone looked at porting MythTV to PG?

My understanding from perusing mailing list archives is that there have been multiple attempts to provide a database-neutral layer and support different backend databases (mostly w/ PG as the driver), but the lead developer has been something between disinterested and actively hostile to the idea. I think this page http://www.mythtv.org/wiki/PostgreSQL_Support says it all: deleted PostgreSQL Support (Outdated, messy and unsupported) And the Wayback machine version: http://web.archive.org/web/20080521003224/http://mythtv.org/wiki/index.php/PostgreSQL_Support

Ross
Re: [PERFORM] postgres 9 query performance
On Sun, Jan 30, 2011 at 05:18:15PM -0500, Tom Lane wrote: Andres Freund and...@anarazel.de writes: What happens if you change the left join event.origin on event.id = origin.eventid into join event.origin on event.id = origin.eventid ? The EXISTS() requires that origin is not null anyway. (Not sure why the planner doesn't recognize that though). Sloppy thinking in reduce_outer_joins() is why. Fixed now: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=1df57f63f3f60c684aa8918910ac410e9c780713 regards, tom lane

This is one of the reasons I love open source in general, and PostgreSQL in particular: Tom has the bandwidth to notice these kinds of workarounds being discussed on support lists, and turn them immediately into improvements in the planner. Partly because (I assume, based on the commit message) Andres's parenthetical comment red-flagged it for him, since he knew he could trust Andres's opinion that there was probably a planner improvement hiding here. Amazing!

Ross
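The transformation at issue here (reduce_outer_joins) is easy to illustrate in isolation. A generic sketch with hypothetical table names, not the original poster's schema:

```sql
-- The WHERE clause is strict in origin's columns: a null-extended row
-- produced by the LEFT JOIN could never pass it.  So this ...
SELECT e.*
  FROM event e
  LEFT JOIN origin o ON e.id = o.eventid
 WHERE o.eventid IS NOT NULL;

-- ... is equivalent to this, which the planner can reorder more freely:
SELECT e.*
  FROM event e
  JOIN origin o ON e.id = o.eventid;
```

The workaround discussed above is doing this rewrite by hand; the linked commit teaches the planner to do it itself.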
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 12:44:23PM -0500, Chris Browne wrote: mladen.gog...@vmsinfo.com (Mladen Gogala) writes: Hints are not even that complicated to program. The SQL parser should compile the list of hints into a table, and the optimizer should check whether any of the applicable access methods exist in the table. If it does - use it. If not, ignore it. This looks to me like a philosophical issue, not a programming issue. It's worth looking back to what has already been elaborated on in the ToDo. http://wiki.postgresql.org/wiki/Todo --- Optimizer hints (not wanted) Optimizer hints are used to work around problems in the optimizer and introduce upgrade and maintenance issues. We would rather have the problems reported and fixed. We have discussed a more sophisticated system of per-class cost adjustment instead, but a specification remains to be developed.

And as to the 'wait around for a new version to fix that': there are constantly excellent examples of exactly this happening, all the time with PostgreSQL - most recent example I've seen: http://archives.postgresql.org/pgsql-performance/2011-01/msg00337.php The wait often isn't long, at all.

Ross
Re: [PERFORM] Really really slow select count(*)
On Tue, Feb 08, 2011 at 03:52:31PM -0600, Kevin Grittner wrote: Scott Marlowe scott.marl...@gmail.com wrote: Greg Smith g...@2ndquadrant.com wrote: Kevin and I both suggested a fast plus timeout then immediate behavior is what many users seem to want. Are there any settings in postgresql.conf that would make it unsafe to use -m immediate? I don't think so. There could definitely be problems if someone cuts power before your shutdown completes, though. (I hear that those firefighters like to cut power to a building before they grab those big brass nozzles to spray a stream of water into a building. Go figure...)

Following you off topic: I know of one admin type who has stated I don't care what sort of fine the power company wants to give me, if my property's on fire, I'm going to pull the meter, in order to hand it to the first responder, rather than have them sit there waiting for the power tech to arrive while my house burns.

Back on topic, I like the idea of a timed escalation. That means there are two things to configure, though: timeout(s?) and the set of states to escalate through. I can see different use cases for different sets. Hmmm:

pg_ctl -m s:10:f:5:i restart

for smart, 10 sec. timeout, escalate to fast, 5 sec., then immediate? Not sure how that would interact w/ -t. Perhaps:

pg_ctl -t 10 -m s -t 5 -m f -m i restart

Some video-processing tools do things like that: the order of options impacts their interaction.

Ross
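For what it's worth, the escalation described above can be approximated today in a small wrapper, without any new pg_ctl syntax. A sketch (the helper name and the default timeout plan are mine; it relies only on pg_ctl's existing -m/-t/-w behavior, where a timed-out wait exits nonzero):

```python
import subprocess

def stop_with_escalation(datadir,
                         plan=(("smart", 10), ("fast", 5), ("immediate", 60)),
                         pg_ctl="pg_ctl"):
    # Try each shutdown mode in turn; "-w -t N" makes pg_ctl wait up to
    # N seconds for shutdown and exit nonzero on timeout, at which point
    # we escalate to the next, more forceful mode.
    for mode, timeout in plan:
        cmd = [pg_ctl, "stop", "-D", datadir, "-m", mode,
               "-t", str(timeout), "-w"]
        if subprocess.call(cmd) == 0:
            return mode  # shutdown completed at this level
    return None  # even immediate failed
```

So `stop_with_escalation('/var/lib/postgresql/data')` would stand in for the hypothetical `pg_ctl -m s:10:f:5:i` flag, and the plan tuple plays the role of the per-mode timeouts being debated.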