Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Hannes Dorbath
Ron Mayer wrote: Greg Smith wrote: On Wed, 1 Apr 2009, Scott Carey wrote: Write caching on SATA is totally fine. There were some old ATA drives that when paried with some file systems or OS's would not be safe. There are some combinations that have unsafe write barriers. But there is a

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Mark Kirkwood
Mark Kirkwood wrote: Rebuilt with 256K chunksize: transaction type: TPC-B (sort of) scaling factor: 100 number of clients: 24 number of transactions per client: 12000 number of transactions actually processed: 288000/288000 tps = 942.852104 (including connections establishing) tps = 943.019223

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off of using XFS because of the problems outlined at http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc , with more testing showing similar issues at

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Thu, 2 Apr 2009, James Mansion wrote: Might have to give the whole disk to ZFS with Solaris to give it confidence to enable write cache Confidence, sure, but not necessarily performance at the same time. The ZFS Kool-Aid gets bitter sometimes too, and I worry that its reputation causes

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Thu, 2 Apr 2009, Scott Carey wrote: The big one, is this quote from the linux kernel list: Right now, if you want a reliable database on Linux, you _cannot_ properly depend on fsync() or fdatasync(). Considering how much Linux is used for critical databases, using these functions, this

[PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
I'm writing a plpgsql function that effectively does a merge join on the results of two queries. Now, it appears that I cannot read the results of two queries as streams in plpgsql, so I need to copy the contents of one query into an array first, and then iterate over the second query

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Robert Haas
On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling matt...@flymine.org wrote: I'm writing a plpgsql function that effectively does a merge join on the results of two queries. Now, it appears that I cannot read the results of two queries as streams in plpgsql, so I need to copy the contents of one

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Robert Haas wrote: On the other hand, the query planner should be figuring out which index to use without any help from you. If it's not, something is wrong. Unfortunately it cannot tell that SELECT l1.id AS id1, l2.id AS id2 FROM location l1, location l2 WHERE l1.start =

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Robert Haas wrote: On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling matt...@flymine.org wrote: I'm writing a plpgsql function that effectively does a merge join on the results of two queries. Why not just use SQL to do the join? Because the merge condition is: WHERE

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: On Fri, 3 Apr 2009, Robert Haas wrote: Why not just use SQL to do the join? Because the merge condition is: WHERE l1.start = l2.end AND l2.start = l1.end and merge joins in postgres only currently cope with the case where the merge condition

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: I have discovered that creating large arrays in plpgql is rather slow. In fact, it seems to be O(n^2). For variable-width element types, yeah. Don't go that way. ... alternatively is there a way to read two results streams simultaneously? Use

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: On Fri, 3 Apr 2009, Tom Lane wrote: I don't actually believe that a standard merge join algorithm will work with an intransitive join condition ... A standard merge join should work absolutely fine, depending on how it's implemented. If the

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Tom Lane wrote: Not unless you have sorted the inputs in some way that has more knowledge than the equal operator represents. Otherwise you can have elements drop out that might still be needed to match to a later left-hand element. Of course. You certainly have to

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Matthew Wakeling wrote: On Fri, 3 Apr 2009, Tom Lane wrote: Not unless you have sorted the inputs in some way that has more knowledge than the equal operator represents. Otherwise you can have elements drop out that might still be needed to match to a later left-hand

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: On Fri, 3 Apr 2009, Tom Lane wrote: Not unless you have sorted the inputs in some way that has more knowledge than the equal operator represents. Otherwise you can have elements drop out that might still be needed to match to a later left-hand

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Tom Lane wrote: Intuitively, it seems like 1-D overlaps is a tractable enough operator that you should be able to make something merge-like work. But it's more complicated than I think you realize. It's tractable when the two sides are symmetrical, but not so much when

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling matt...@flymine.org wrote:  genes = '{}';  next_new = 1;  FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid, intermine_start, intermine_end LOOP     genes[next_new] = loc;     IF (next_new %

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:02 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Apr 3, 2009 at 9:32 AM, Matthew Wakeling matt...@flymine.org wrote:  genes = '{}';  next_new = 1;  FOR loc IN SELECT location.* FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid,

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Matthew Wakeling
On Fri, 3 Apr 2009, Merlin Moncure wrote: select array(SELECT location FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid, intermine_start, intermine_end)) into genes; Yeah, that works nicely. this will make array of location records. when you access the records to do

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:15 AM, Matthew Wakeling matt...@flymine.org wrote: On Fri, 3 Apr 2009, Merlin Moncure wrote: select array(SELECT location FROM location, gene WHERE location.subjectid = gene.id ORDER BY objectid, intermine_start, intermine_end)) into genes; Yeah, that works nicely.

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 9:17 AM, Matthew Wakeling matt...@flymine.org wrote: So, I have a view. The query that the view uses can be written two different ways, to use two different indexes. Then I use the view in another query, under some circumstances the first way will be quick, and under

Re: [PERFORM] Rewriting using rules for performance

2009-04-03 Thread Merlin Moncure
On Fri, Apr 3, 2009 at 11:52 AM, Merlin Moncure mmonc...@gmail.com wrote: On Fri, Apr 3, 2009 at 9:17 AM, Matthew Wakeling matt...@flymine.org wrote: So, I have a view. The query that the view uses can be written two different ways, to use two different indexes. Then I use the view in another

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Simon Riggs
On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: Matthew Wakeling matt...@flymine.org writes: On Fri, 3 Apr 2009, Robert Haas wrote: Why not just use SQL to do the join? Because the merge condition is: WHERE l1.start = l2.end AND l2.start = l1.end and merge joins in postgres

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Alvaro Herrera
Simon Riggs wrote: On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: I don't actually believe that a standard merge join algorithm will work with an intransitive join condition ... I think it's a common enough problem that having a non-standard join algorithm written for that case

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes: On Fri, 2009-04-03 at 10:04 -0400, Tom Lane wrote: I don't actually believe that a standard merge join algorithm will work with an intransitive join condition ... I think it's a common enough problem that having a non-standard join algorithm written

Re: [PERFORM] plpgsql arrays

2009-04-03 Thread Nathan Boley
Uh, no, it wouldn't.  Visually:        L1      -        L2      ---        L3      -        R1                     At L2, you'd conclude that you're done matching R1. No, you should conclude that you're done matching L2. You

[PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
Hello! Sorry for the wall of text here. I'm working on a performance POC and I'm using pgbench and could use some advice. Mostly I want to ensure that my test is valid and that I'm using pgbench properly. The story behind the POC is that my developers want to pull web items from the database

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
David Kerr d...@mr-paradox.net writes: The results actually have surprised me, the database isn't really tuned and i'm not working on great hardware. But still I'm getting: caling factor: 1 number of clients: 400 number of transactions per client: 50 number of transactions actually

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
On Fri, Apr 03, 2009 at 04:43:29PM -0400, Tom Lane wrote: - I'm not really sure how to evaulate the tps, I've read in this forum that - some folks are getting 2k tps so this wouldn't appear to be good to me. - - Well, you're running a custom transaction definition so comparing your - number to

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Scott Marlowe
On Fri, Apr 3, 2009 at 1:53 PM, David Kerr d...@mr-paradox.net wrote: Here is my transaction file: \setrandom iid 1 5 BEGIN; SELECT content FROM test WHERE item_id = :iid; END; and then i executed: pgbench -c 400 -t 50 -f trans.sql -l The results actually have surprised me, the

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, David Kerr wrote: Here is my transaction file: \setrandom iid 1 5 BEGIN; SELECT content FROM test WHERE item_id = :iid; END; Wrapping a SELECT in a BEGIN/END block is unnecessary, and it will significantly slow down things for two reason: the transactions overhead

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
David Kerr d...@mr-paradox.net writes: On Fri, Apr 03, 2009 at 04:43:29PM -0400, Tom Lane wrote: - How much more real is the target hardware than what you have? - You appear to need about a factor of 10 better disk throughput than - you have, and that's not going to be too cheap. The

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Tom Lane
Greg Smith gsm...@gregsmith.com writes: pgbench is extremely bad at simulating large numbers of clients. The pgbench client operates as a single thread that handles both parsing the input files, sending things to clients, and processing their responses. It's very easy to end up in a

[PERFORM] Using IOZone to simulate DB access patterns

2009-04-03 Thread Josh Berkus
All, I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone. Thing is, IOZone offers a huge complex series of parameters, so I'd really like to have some idea of how to configure it so its results are applicable to

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote: - Greg Smith gsm...@gregsmith.com writes: - pgbench is extremely bad at simulating large numbers of clients. The - pgbench client operates as a single thread that handles both parsing the - input files, sending things to clients, and

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread David Kerr
Gah - sorry, setting up pgbouncer for my Plan B. I meant -pgbench- Dave Kerr On Fri, Apr 03, 2009 at 04:34:58PM -0700, David Kerr wrote: - On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote: - - Greg Smith gsm...@gregsmith.com writes: - - pgbench is extremely bad at simulating large

Re: [PERFORM] Using IOZone to simulate DB access patterns

2009-04-03 Thread Josh Berkus
On 4/3/09 4:12 PM, Josh Berkus wrote: All, I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone. Related to this: is IOZone really multi-threaded? I'm doing a test run right now, and only one CPU is actually active.

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread david
On Fri, 3 Apr 2009, Greg Smith wrote: Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off of using XFS because of the problems outlined at http://zork.net/~nick/mail/why-reiserfs-is-teh-sukc , with more testing showing similar

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, da...@lang.hm wrote: also note that the message from Ted was back in 2004, there has been a _lot_ of work done on XFS in the last 4 years. Sure, I know they've made progress, which is why I didn't also bring up older ugly problems like delayed allocation issues reducing

Re: [PERFORM] Question on pgbench output

2009-04-03 Thread Greg Smith
On Fri, 3 Apr 2009, Tom Lane wrote: However, I don't think anyone else has been pgbench'ing transactions where client-side libpq has to absorb (and then discard) a megabyte of data per xact. I wouldn't be surprised that that eats enough CPU to make it an issue. David, did you pay any

Re: [PERFORM] Raid 10 chunksize

2009-04-03 Thread Scott Carey
On 4/3/09 6:05 PM, da...@lang.hm da...@lang.hm wrote: On Fri, 3 Apr 2009, Greg Smith wrote: Hannes sent this off-list, presumably via newsgroup, and it's certainly worth sharing. I've always been scared off of using XFS because of the problems outlined at