Re: [PERFORM] Very specialised query
On Mon, 30 Mar 2009, Marc Mamin wrote:
> Are your objects limited to some smaller ranges of your whole interval? If yes, you may possibly reduce the ranges to search for while using an additional table with the min(start), max(end) of each object...

No, they aren't. However, even if they were, that would not actually speed up the query at all. We are already looking up in the index by objectid, and that would be an equivalent constraint to limiting by the available range of start/end values.

I'm currently arguing with plpgsql over this problem, but it looks like it will run reasonably fast.

Matthew

--
If you're thinking "Oh no, this lecturer thinks Turing Machines are a feasible method of computation, where's the door?", then you are in luck. There are some there, there, and by the side there. Oxygen masks will not drop from the ceiling... -- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
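The procedural approach Matthew alludes to can be sketched as a sweep over one object's locations: sort by start, keep a list of intervals that are still "open", and pair each new interval with the open ones. This is only an illustration of the general technique, not the actual plpgsql function from the thread; the `Location` class and `overlapping_pairs` name are invented here.

```python
# Sketch of an interval-overlap sweep (illustrative names, not from the thread).
from dataclasses import dataclass

@dataclass
class Location:
    id: int
    start: int
    end: int

def overlapping_pairs(locations):
    """Return (id1, id2) for every pair of intervals that overlap.

    Touching endpoints count as an overlap, matching the <= comparisons
    used in the SQL later in this thread.
    """
    pairs = []
    active = []  # intervals whose end we have not yet swept past
    for loc in sorted(locations, key=lambda l: l.start):
        # drop intervals that ended before this one starts
        active = [a for a in active if a.end >= loc.start]
        for a in active:
            pairs.append((a.id, loc.id))
        active.append(loc)
    return pairs

locs = [Location(1, 0, 10), Location(2, 5, 15), Location(3, 20, 30)]
print(overlapping_pairs(locs))  # [(1, 2)] — interval 3 overlaps nothing
```

Because each interval is compared only against the currently open ones, the work is roughly proportional to the number of overlapping pairs rather than to all n² combinations, which is why a procedural sweep can beat a self-join here.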
Re: [PERFORM] I have a fusion IO drive available for testing
On Fri, Mar 27, 2009 at 4:33 PM, David Rees dree...@gmail.com wrote:
> On Fri, Mar 27, 2009 at 10:30 AM, da...@lang.hm wrote:
>> On Thu, 26 Mar 2009, Dave Cramer wrote:
>>> So far using dd I am seeing around 264MB/s on ext3, 335MB/s on ext2 write speed. So the question becomes: what is the best filesystem for this drive?
>> Until the current mess with ext3 and fsync gets resolved, I would say it would probably be a bad choice. I consider ext4 too new, so I would say XFS or ext2 (depending on whether you need the journal or not).
> If you're worried about the performance implications of ext3 in data=ordered mode, the best thing to do is to mount the filesystem in data=writeback mode instead. If you're only using the filesystem for PostgreSQL data or logs, your data will be just as safe, except that now data and metadata won't be forced to disk in the order they were written. And you still get the benefit of a journal, so fscks after a crash will be fast.
> XFS probably is a decent choice, but I don't have much experience with it except on a desktop system, where I can tell you that having write barriers on absolutely kills the performance of anything that does a lot of filesystem metadata updates. Again, not a big concern if the filesystem is only being used for PostgreSQL data or logs.
> -Dave

So I tried writing directly to the device: it gets around 250MB/s, and reads at around 500MB/s.

The client is using Red Hat, so XFS is not an option.

Dave
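For reference, the data=writeback mode David describes is set as a mount option. A minimal illustrative /etc/fstab entry might look like the following, assuming the Fusion-io device shows up as /dev/fioa1 and is dedicated to PostgreSQL data (both the device name and mount point are placeholders, not from the thread):

```
# /etc/fstab -- illustrative entry only
/dev/fioa1  /var/lib/pgsql  ext3  noatime,data=writeback  0  2
```

Note that data=writeback only relaxes the ordering of data relative to journaled metadata; PostgreSQL still relies on fsync for its own durability guarantees, which is why this is considered safe for a filesystem holding only database files.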
Re: [PERFORM] I have a fusion IO drive available for testing
Dave Cramer wrote:
> So I tried writing directly to the device, gets around 250MB/s, reads at around 500MB/s. The client is using redhat so xfs is not an option.

I'm using Red Hat and XFS, and have been for years. Why is XFS not an option with Red Hat?

Craig
Re: [PERFORM] I have a fusion IO drive available for testing
craig_ja...@emolecules.com (Craig James) writes:
> Dave Cramer wrote:
>> So I tried writing directly to the device, gets around 250MB/s, reads at around 500MB/s. The client is using redhat so xfs is not an option.
> I'm using Red Hat and XFS, and have been for years. Why is XFS not an option with Red Hat?

If you report practically any kind of problem while you're using XFS, or JFS, or the like, their support offering is to tell you to use a supported filesystem.

--
let name=cbbrowne and tld=cbbrowne.com in name ^ @ ^ tld;; http://linuxdatabases.info/info/linuxxian.html
The only thing better than TV with the sound off is Radio with the sound off. -- Dave Moon
Re: [PERFORM] Very specialised query
On Mon, 30 Mar 2009, Віталій Тимчишин wrote:
> select case when n = 1 then id1 else id2 end,
>        case when n = 1 then id2 else id1 end
> from (
>     SELECT l1.id AS id1, l2.id AS id2
>     FROM location l1, location l2
>     WHERE l1.objectid = 22893
>       AND l2.objectid = 22893
>       AND l1.id <> l2.id
>       AND l1.start <= l2.end
>       AND l1.end >= l2.start
>       AND l1.start <= l2.start) a,
>     (values (1),(2)) b(n)

It is a nice idea. However, the planner gets the join the wrong way round:

select distinct
    case when n = 1 then id1 else id2 end,
    case when n = 1 then id2 else id1 end
FROM (
    select l1.id AS id1, l2.id AS id2
    FROM location l1, location l2
    WHERE l1.id <> l2.id
      AND l1.objectid = l2.objectid
      AND l1.start <= l2.end
      AND l2.start <= l1.end
      AND l1.start <= l2.start
) AS a,
(values (1), (2)) b(n);

                                  QUERY PLAN
------------------------------------------------------------------------------
 Unique  (cost=7366497963.75..7637346831.94 rows=36113182426 width=12) (actual time=1642178.623..2206678.691 rows=139606782 loops=1)
   ->  Sort  (cost=7366497963.75..7456780919.81 rows=36113182426 width=12) (actual time=1642178.619..1899057.147 rows=166377424 loops=1)
         Sort Key: (CASE WHEN (*VALUES*.column1 = 1) THEN l1.subjectid ELSE l2.subjectid END), (CASE WHEN (*VALUES*.column1 = 1) THEN l2.subjectid ELSE l1.subjectid END)
         Sort Method:  external merge  Disk: 3903272kB
         ->  Nested Loop  (cost=0.00..592890483.66 rows=36113182426 width=12) (actual time=85.333..984211.011 rows=166377424 loops=1)
               ->  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2 width=4) (actual time=0.002..0.008 rows=2 loops=1)
               ->  Nested Loop  (cost=0.00..25596373.62 rows=18056591213 width=8) (actual time=42.684..322743.335 rows=83188712 loops=2)
                     Join Filter: ((l1.subjectid <> l2.subjectid) AND (l1.intermine_start <= l2.intermine_end))
                     ->  Seq Scan on location l1  (cost=0.00..78076.79 rows=3490079 width=16) (actual time=0.008..3629.672 rows=3490079 loops=2)
                     ->  Index Scan using location_test_obj_start on location l2  (cost=0.00..3.89 rows=152 width=16) (actual time=0.005..0.038 rows=25 loops=6980158)
                           Index Cond: ((l2.objectid = l1.objectid) AND (l2.intermine_start <= l1.intermine_end) AND (l1.intermine_start <= l2.intermine_start))
 Total runtime: 2339619.383 ms

The outer nested loop has the VALUES scan as its main loop and the complicated join as its leaf, so the complicated overlap-finding join gets run twice. Oh, there's also the great big sort and unique, but I think I can get rid of that.

Matthew

--
Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are. -- Kyle Hearn
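One way to stop the planner from re-running the expensive join once per VALUES row would be a common table expression: in 8.4, a WITH subquery is materialized exactly once. This is only a sketch (untested against the schema in the thread, and assuming the comparison operators mangled by the archive were meant to be <= and <>):

```sql
WITH pairs AS (
    SELECT l1.id AS id1, l2.id AS id2
    FROM location l1, location l2
    WHERE l1.id <> l2.id
      AND l1.objectid = l2.objectid
      AND l1.start <= l2.end
      AND l2.start <= l1.end
      AND l1.start <= l2.start
)
SELECT CASE WHEN n = 1 THEN id1 ELSE id2 END,
       CASE WHEN n = 1 THEN id2 ELSE id1 END
FROM pairs, (VALUES (1), (2)) b(n);
```

The CTE acts as an optimization fence here, so the overlap join runs once and only the cheap doubling against the two-row VALUES list is repeated.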
Re: [PERFORM] Very specialised query
> The outer nested loop has the VALUES scan as its main loop and the complicated join as its leaf, so the complicated overlap-finding join gets run twice.

That's weird. What do you have as your statistics target? The planner's row estimate is off by a few orders of magnitude, so increasing it may help.

BTW: one of the constraints is redundant: l1.start <= l2.start implies l1.start <= l2.end, so the latter can be removed, as far as I can see.

> Oh, there's also the great big sort and unique, but I think I can get rid of that.

As far as I can see, duplicates will occur if and only if l1.start = l2.start AND l1.end = l2.end. They can easily be filtered out by adding "where n = 1 or l1.start != l2.start or l1.end != l2.end" to the outer select.
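The duplicate-filtering trick can be checked in a few lines of Python: when two intervals have identical ranges, the symmetric join already produces the match in both orders, so keeping only the n = 1 row for such pairs yields each ordering exactly once. The `Loc` class and `expand` function are invented here purely to demonstrate the logic, not code from the thread.

```python
# Demonstration of the n = 1 / n = 2 expansion with the duplicate filter.
from dataclasses import dataclass

@dataclass
class Loc:
    id: int
    start: int
    end: int

def expand(matches):
    """Emit both orderings of each matched pair, applying the filter
    'n = 1 or l1.start != l2.start or l1.end != l2.end'."""
    out = []
    for l1, l2 in matches:
        for n in (1, 2):
            # keep n = 2 only when the ranges differ; identical ranges
            # already appear in matches in both orders, so n = 1 suffices
            if n == 2 and (l1.start, l1.end) == (l2.start, l2.end):
                continue
            out.append((l1.id, l2.id) if n == 1 else (l2.id, l1.id))
    return out

a, b = Loc(1, 0, 10), Loc(2, 0, 10)   # identical ranges: match in both orders
c, d = Loc(3, 0, 10), Loc(4, 5, 15)   # different ranges: match appears once
print(expand([(a, b), (b, a)]))  # [(1, 2), (2, 1)] — no duplicates
print(expand([(c, d)]))          # [(3, 4), (4, 3)]
```

With the filter in place, the DISTINCT (and hence the 3.9GB external sort in the plan above) becomes unnecessary.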
Re: [PERFORM] Strange behavior: pgbench and new Linux kernels
On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
> On this benchmark 2.6.25 is the worst kernel yet: It seems I have a lot of work ahead of me here to nail down and report what's going on here.

I don't remember seeing a follow-up on this issue from last year. Are there still any particular kernels to avoid based on this?

Thanks,

-Kevin
[PERFORM] How to get parallel restore in PG 8.4 to work?
Hi,

For performance reasons (obviously ;)) I'm experimenting with parallel restore in PG 8.4. I grabbed the latest source snapshot (of today, March 30) and compiled it with zlib support. I dumped a DB from PG 8.3.5 (using maximum compression). I got this message however:

postg...@mymachine:/home/henk/postgresql-8.4/bin$ time ./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc /home/henk/test-databases/dumps/db_test.custom
pg_restore: [archiver] WARNING: archive is compressed, but this installation does not support compression -- no data will be available
pg_restore: [archiver] cannot restore from compressed archive (compression not supported in this installation)

So initially it seemed it's only possible to do a parallel pg_restore from an uncompressed dump in the custom format. I therefore tried an uncompressed dump, but this too failed. This last part was a little problematic anyway, since pg_restore absolutely wants to read its input from a file and does not accept any input from stdin. I assume reading from a file is necessary for the multiple parallel processes to each read their own part of the file, something which might be difficult to do when reading from stdin.

Apart from the fact that it simply doesn't work for me at the moment, I see a major problem with this approach: dumping in the custom format (option -Fc) is far slower than dumping in the plain format. Even if the parallel restore speeds things up, the combined time of a dump and a restore could still be worse than that of a plain dump and restore.

I'm aware that I might be hitting some bugs, as a development snapshot is by definition not stable. Also, perhaps I'm missing something. My question is thus: could someone advise me how to get parallel restore to work, and how to speed up a dump in the custom file format?
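For what it's worth, the slow-dump concern can be sidestepped by disabling compression while keeping the custom format, which is what parallel restore needs. An illustrative command sequence (database names and job count are placeholders):

```
# Custom-format dump with compression turned off (-Z0), then a
# parallel restore with 8 jobs reading the dump file directly.
pg_dump -Fc -Z0 -f db_test.dump db_test
pg_restore -j 8 -d db_test_restored db_test.dump
```

The custom format's per-object table of contents is what lets the parallel workers seek to their own sections of the file, which is also why pg_restore -j cannot read from stdin.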
Many thanks in advance
Re: [PERFORM] How to get parallel restore in PG 8.4 to work?
henk de wit henk53...@hotmail.com writes:
> For performance reasons (obviously ;)) I'm experimenting with parallel restore in PG 8.4. I grabbed the latest source snapshot (of today, March 30) and compiled this with zlib support. I dumped a DB from PG 8.3.5 (using maximum compression). I got this message however:
> postg...@mymachine:/home/henk/postgresql-8.4/bin$ time ./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc /home/henk/test-databases/dumps/db_test.custom
> pg_restore: [archiver] WARNING: archive is compressed, but this installation does not support compression -- no data will be available
> pg_restore: [archiver] cannot restore from compressed archive (compression not supported in this installation)

As far as one can tell from here, you built *without* zlib support. This is unrelated to parallel restore as such.

regards, tom lane
Re: [PERFORM] Strange behavior: pgbench and new Linux kernels
On Tue, 31 Mar 2009, Kevin Grittner wrote:
> On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
>> On this benchmark 2.6.25 is the worst kernel yet: It seems I have a lot of work ahead of me here to nail down and report what's going on here.
> I don't remember seeing a follow-up on this issue from last year. Are there still any particular kernels to avoid based on this?

I never got any confirmation that the patches that came out of my discussions with the kernel developers were ever merged. I'm in the middle of a bunch of pgbench tests this week, and one of the things I planned to try was seeing whether the behavior has changed in 2.6.28 or 2.6.29. I'm speaking about pgbench at the PostgreSQL East conference this weekend and will have an update by then (along with a new toolchain for automating large quantities of pgbench tests).

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD