Re: [PERFORM] Very specialised query

2009-03-31 Thread Matthew Wakeling

On Mon, 30 Mar 2009, Marc Mamin wrote:

Are your objects limited to smaller ranges of your whole interval? If so,
you may be able to reduce the ranges to search by using an additional table
holding the min(start) and max(end) of each object...


No, they aren't. However, even if they were, that would not actually speed 
up the query at all. We are already looking up in the index by objectid, 
and that would be an equivalent constraint to limiting by the available 
range of start/end values.
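(For reference, the kind of auxiliary table being described would look
something like this sketch — the object_extent name is an assumption for
illustration, not the real schema:

    -- Summary of each object's overall extent:
    CREATE TABLE object_extent AS
        SELECT objectid, min(start) AS min_start, max("end") AS max_end
        FROM location
        GROUP BY objectid;
        -- "end" needs quoting here: END is a reserved word in SQL

Probing this first would narrow the start/end range searched per object,
which is exactly the constraint the objectid index lookup already provides.)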


I'm currently arguing with plpgsql over this problem, but it looks like 
it will run reasonably fast.


Matthew

--
If you're thinking "Oh no, this lecturer thinks Turing Machines are a feasible
method of computation, where's the door?", then you are in luck. There are
some there, there, and by the side there. Oxygen masks will not drop from the
ceiling...  -- Computer Science Lecturer



Re: [PERFORM] I have a fusion IO drive available for testing

2009-03-31 Thread Dave Cramer
On Fri, Mar 27, 2009 at 4:33 PM, David Rees <dree...@gmail.com> wrote:

 On Fri, Mar 27, 2009 at 10:30 AM, da...@lang.hm wrote:
  On Thu, 26 Mar 2009, Dave Cramer wrote:
   So far using dd I am seeing around 264MB/s on ext3, 335MB/s on ext2 write
   speed. So the question becomes: what is the best filesystem for this drive?

  Until the current mess with ext3 and fsync gets resolved, I would say it
  would probably be a bad choice. I consider ext4 too new, so I would say XFS
  or ext2 (depending on whether you need the journal or not).

 If you're worried about the performance implications of ext3 in
 data=ordered mode, the best thing to do is to mount the filesystem in
 data=writeback mode instead.

 If you're only using the filesystem for PostgreSQL data or logs, your
 data will be just as safe, except that data and metadata won't be
 forced to disk in the order they were written.

 And you still get the benefit of a journal so fsck's after a crash will be
 fast.
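 For illustration, data=writeback is just a mount option; an /etc/fstab
 entry might look like this (the device name and mount point here are
 assumptions, not from this thread):

     /dev/fioa1  /var/lib/pgsql  ext3  noatime,data=writeback  0 2

 Note that ext3 (at least historically) refuses to switch data journaling
 modes on a live remount, so this takes effect at the next mount.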

 XFS probably is a decent choice, but I don't have much experience with
 it except on a desktop system where I can tell you that having write
 barriers on absolutely kills performance of anything that does a lot
 of filesystem metadata updates.  Again, not a big concern if the
 filesystem is only being used for PostgreSQL data or logs.

 -Dave


So I tried writing directly to the device: writes run at around 250MB/s and
reads at around 500MB/s.
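
For reference, raw-device numbers like these usually come from something like
the following (the device name is an assumption):

    dd if=/dev/zero of=/dev/fioa bs=1M count=16384 oflag=direct   # write test
    dd if=/dev/fioa of=/dev/null bs=1M count=16384 iflag=direct   # read test

oflag=direct/iflag=direct bypass the page cache, so the drive itself is being
measured rather than RAM.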

The client is using Red Hat, so XFS is not an option.

Dave


Re: [PERFORM] I have a fusion IO drive available for testing

2009-03-31 Thread Craig James

Dave Cramer wrote:
So I tried writing directly to the device: writes run at around 250MB/s and
reads at around 500MB/s.

The client is using Red Hat, so XFS is not an option.


I'm using Red Hat and XFS, and have been for years. Why is XFS not an option 
with Red Hat?

Craig



Re: [PERFORM] I have a fusion IO drive available for testing

2009-03-31 Thread Chris Browne
craig_ja...@emolecules.com (Craig James) writes:
 Dave Cramer wrote:
 So I tried writing directly to the device: writes run at around 250MB/s,
 and reads at around 500MB/s.

 The client is using Red Hat, so XFS is not an option.

 I'm using Red Hat and XFS, and have been for years. Why is XFS not an option 
 with Red Hat?

If you report practically any kind of problem, and you're using XFS,
or JFS, or such, their support offering is to tell you to use a
supported filesystem.
-- 
let name = "cbbrowne" and tld = "cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/linuxxian.html
"The only thing better than TV with the sound off is Radio with the sound
off." -- Dave Moon



Re: [PERFORM] Very specialised query

2009-03-31 Thread Matthew Wakeling

On Mon, 30 Mar 2009, Віталій Тимчишин wrote:

select
case when n = 1 then id1 else id2 end,
case when n = 2 then id1 else id2 end

from (
SELECT
   l1.id AS id1,
   l2.id AS id2
FROM
   location l1,
   location l2
WHERE
       l1.objectid = 22893
   AND l2.objectid = 22893
   AND l1.id <> l2.id
   AND l1.start < l2.end
   AND l1.end > l2.start
   AND l1.start < l2.start) a, (values (1),(2)) b(n)
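
The (values (1),(2)) b(n) cross join is what makes each pair come out in both
orders. A standalone illustration of the trick (the numbers are made up):

    SELECT CASE WHEN n = 1 THEN x ELSE y END AS first,
           CASE WHEN n = 1 THEN y ELSE x END AS second
    FROM (VALUES (10, 20)) pair(x, y), (VALUES (1), (2)) b(n);
    -- returns (10,20) and (20,10): each pair found once is emitted both ways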


It is a nice idea. However, the planner gets the join the wrong way round:

select distinct
case when n = 1 then id1 else id2 end,
case when n = 1 then id2 else id1 end
FROM (
select
l1.id AS id1,
l2.id AS id2
FROM
location l1,
location l2
WHERE
l1.id <> l2.id
AND l1.objectid = l2.objectid
AND l1.start <= l2.end
AND l2.start <= l1.end
AND l1.start <= l2.start
) AS a,
(values (1), (2)) b(n);

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=7366497963.75..7637346831.94 rows=36113182426 width=12)
         (actual time=1642178.623..2206678.691 rows=139606782 loops=1)
   ->  Sort  (cost=7366497963.75..7456780919.81 rows=36113182426 width=12)
             (actual time=1642178.619..1899057.147 rows=166377424 loops=1)
         Sort Key: (CASE WHEN (*VALUES*.column1 = 1) THEN l1.subjectid
                         ELSE l2.subjectid END),
                   (CASE WHEN (*VALUES*.column1 = 1) THEN l2.subjectid
                         ELSE l1.subjectid END)
         Sort Method:  external merge  Disk: 3903272kB
         ->  Nested Loop  (cost=0.00..592890483.66 rows=36113182426 width=12)
                          (actual time=85.333..984211.011 rows=166377424 loops=1)
               ->  Values Scan on *VALUES*  (cost=0.00..0.03 rows=2 width=4)
                     (actual time=0.002..0.008 rows=2 loops=1)
               ->  Nested Loop  (cost=0.00..25596373.62 rows=18056591213 width=8)
                     (actual time=42.684..322743.335 rows=83188712 loops=2)
                     Join Filter: ((l1.subjectid <> l2.subjectid) AND
                                   (l1.intermine_start <= l2.intermine_end))
                     ->  Seq Scan on location l1
                           (cost=0.00..78076.79 rows=3490079 width=16)
                           (actual time=0.008..3629.672 rows=3490079 loops=2)
                     ->  Index Scan using location_test_obj_start on location l2
                           (cost=0.00..3.89 rows=152 width=16)
                           (actual time=0.005..0.038 rows=25 loops=6980158)
                           Index Cond: ((l2.objectid = l1.objectid) AND
                                        (l2.intermine_start <= l1.intermine_end) AND
                                        (l1.intermine_start <= l2.intermine_start))
 Total runtime: 2339619.383 ms

The outer nested loop has the VALUES scan as its driving side and the
complicated join as the inner side, so the complicated overlap-finding join
gets run twice (hence loops=2 above).


Oh, there's also the great big sort and unique, but I think I can get rid 
of that.
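
One shape that would run the expensive join only once and skip the
sort/unique — offered purely as an untested sketch, not something from this
thread — is to materialise the pairs and emit the swapped copies afterwards
(WITH queries are materialised in 8.4):

    WITH a AS (
        SELECT l1.id AS id1, l2.id AS id2
        FROM location l1, location l2
        WHERE l1.id <> l2.id
          AND l1.objectid = l2.objectid
          AND l1.start <= l2."end"   -- "end" quoted: END is reserved in SQL
          AND l2.start <= l1."end"
          AND l1.start <= l2.start
    )
    SELECT id1, id2 FROM a
    UNION ALL
    SELECT id2, id1 FROM a;

Pairs with equal starts still appear in both orders inside a, so those would
still need the duplicate handling discussed in the follow-up below.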


Matthew

--
Contrary to popular belief, Unix is user friendly. It just happens to be
very selective about who its friends are. -- Kyle Hearn


Re: [PERFORM] Very specialised query

2009-03-31 Thread Віталій Тимчишин


 The outer nested loop has the VALUES scan as its driving side and the
 complicated join as the inner side, so the complicated overlap-finding join
 gets run twice.


That's weird. What do you have as your statistics target? The planner's
estimate is off by a few orders of magnitude, so increasing it may help.
BTW, one of the constraints is redundant: l1.start <= l2.start implies
l1.start <= l2.end (since l2.start <= l2.end), so the latter can be removed,
as far as I can tell.




 Oh, there's also the great big sort and unique, but I think I can get rid
 of that.


As far as I can see, duplicates will occur if and only if l1.start ==
l2.start AND l1.end == l2.end.
Those can easily be filtered out by adding n = 1 OR l1.start != l2.start OR
l1.end != l2.end to the outer select's WHERE clause.
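
Spelled out as a sketch (note the inner select would have to expose the
start/end columns too, which the version posted above does not; column names
follow the thread's shorthand, with "end" quoted since END is reserved):

    SELECT CASE WHEN n = 1 THEN id1 ELSE id2 END,
           CASE WHEN n = 1 THEN id2 ELSE id1 END
    FROM (
        SELECT l1.id AS id1, l2.id AS id2,
               l1.start AS start1, l1."end" AS end1,
               l2.start AS start2, l2."end" AS end2
        FROM location l1, location l2
        WHERE l1.id <> l2.id
          AND l1.objectid = l2.objectid
          AND l2.start <= l1."end"
          AND l1.start <= l2.start
    ) AS a, (VALUES (1), (2)) b(n)
    WHERE n = 1 OR start1 <> start2 OR end1 <> end2;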


Re: [PERFORM] Strange behavior: pgbench and new Linux kernels

2009-03-31 Thread Kevin Grittner
 On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
  On this benchmark 2.6.25 is the worst kernel yet:
  It seems I have a lot of work ahead of me here
  to nail down and report what's going on here.

I don't remember seeing a follow-up on this issue from last year.
Are there still any particular kernels to avoid based on this?
 
Thanks,
 
-Kevin



[PERFORM] How to get parallel restore in PG 8.4 to work?

2009-03-31 Thread henk de wit

Hi,

For performance reasons (obviously ;)) I'm experimenting with parallel restore
in PG 8.4. I grabbed the latest source snapshot (of today, March 30) and
compiled this with zlib support. I dumped a DB from PG 8.3.5 (using maximum
compression). I got this message however:

postg...@mymachine:/home/henk/postgresql-8.4/bin$ time
./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc
/home/henk/test-databases/dumps/db_test.custom
pg_restore: [archiver] WARNING: archive is compressed, but this
installation does not support compression -- no data will be available
pg_restore: [archiver] cannot restore from compressed archive (compression
not supported in this installation)

So initially it seemed it's only possible to do a pg_restore from an
uncompressed pg_dump custom-format archive. I therefore tried an uncompressed
dump, but this too failed. That last attempt was a little problematic anyway,
since pg_restore absolutely wants to read its input from a file and does not
accept any input from stdin. I assume reading from a file is necessary so that
the multiple parallel processes can each read their own part of the file,
something that might be difficult to do when reading from stdin.

Apart from the fact that it simply doesn't work for me at the moment, I see a
major problem with this approach: dumping in the custom format (option -Fc) is
far slower than dumping in the plain format. Even if the parallel restore
speeds things up, the combined time of a dump and restore could still come out
worse than a plain dump and restore. I'm aware that I might be hitting some
bugs, as a development snapshot is by definition not stable. Also, perhaps I'm
missing something.

My question is thus: could someone advise me how to get parallel restore to
work, and how to speed up a dump in the custom file format?

Many thanks in advance



Re: [PERFORM] How to get parallel restore in PG 8.4 to work?

2009-03-31 Thread Tom Lane
henk de wit <henk53...@hotmail.com> writes:
 For performance reasons (obviously ;)) I'm experimenting with parallel 
 restore in PG 8.4. I grabbed the latest source snapshot (of today, March 30) 
 and compiled this with zlib support. I dumped a DB from PG 8.3.5 (using 
 maximum compression). I got this message however:
 postg...@mymachine:/home/henk/postgresql-8.4/bin$ time
 ./pg_restore -p 5434 -h localhost -U henk -d db_test -j 8 -Fc
 /home/henk/test-databases/dumps/db_test.custom
 pg_restore: [archiver] WARNING: archive is compressed, but this
 installation does not support compression -- no data will be available
 pg_restore: [archiver] cannot restore from compressed archive (compression
 not supported in this installation)

As far as one can tell from here, you built *without* zlib support.
This is unrelated to parallel restore as such.
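
For reference, a couple of quick ways to confirm what the build picked up
(paths assumed from the session quoted above):

    grep HAVE_LIBZ /home/henk/postgresql-8.4/src/include/pg_config.h
    # a zlib-enabled build shows:  #define HAVE_LIBZ 1
    ldd /home/henk/postgresql-8.4/bin/pg_restore | grep libz
    # a zlib-enabled pg_restore links against libz.so

configure enables zlib by default and aborts if the zlib headers are missing,
so a build without it usually means --without-zlib was passed or a different
set of binaries is being run than the one just built.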

regards, tom lane



Re: [PERFORM] Strange behavior: pgbench and new Linux kernels

2009-03-31 Thread Greg Smith

On Tue, 31 Mar 2009, Kevin Grittner wrote:

 On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
  On this benchmark 2.6.25 is the worst kernel yet:
  It seems I have a lot of work ahead of me here
  to nail down and report what's going on here.

 I don't remember seeing a follow-up on this issue from last year.
 Are there still any particular kernels to avoid based on this?


I never got any confirmation that the patches that came out of my 
discussions with the kernel developers were ever merged.  I'm in the 
middle of a bunch of pgbench tests this week, and one of the things I 
planned to try was seeing if the behavior has changed in 2.6.28 or 2.6.29. 
I'm speaking about pgbench at the PostgreSQL East conference this weekend 
and will have an update by then (along with a new toolchain for automating 
large quantities of pgbench tests).
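
For context, the kind of pgbench run involved looks something like this
(the scale and client counts here are placeholders, not the exact tests from
the original thread; -T needs a recent pgbench, older versions take a -t
transaction count instead):

    pgbench -i -s 100 pgbench        # build a scale-factor-100 dataset
    pgbench -S -c 8 -T 60 pgbench    # select-only workload, 8 clients, 60s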


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
