Re: [PERFORM] Cost estimate vs. actual - do I care?

2012-01-01 Thread Mark Mielke
3), 2), 1). The planner needs the right information to make the right decision. However, the planner rarely has perfect information, so the algorithms need to be able to cope with some amount of imperfection while still generally making the right decision. There are also a limited

Re: [PERFORM] Does auto-analyze work on dirty writes?

2011-02-04 Thread Mark Mielke
On 02/04/2011 10:41 AM, Tom Lane wrote: 1. Autovacuum fires when the stats collector's insert/update/delete counts have reached appropriate thresholds. Those counts are accumulated from messages sent by backends at transaction commit or rollback, so they take no account of what's been done by

Does auto-analyze work on dirty writes? (was: Re: [HACKERS] [PERFORM] Slow count(*) again...)

2011-02-03 Thread Mark Mielke
On 02/03/2011 09:45 PM, Conor Walsh wrote: My understanding is that auto-analyze will fire only after my transaction is completed, because it is a seperate daemon. If I do like so: BEGIN; COPY ...; -- Dangerously un-analyzed SELECT complicated-stuff ...; END; Auto-analyze does not benefit me,

Re: [PERFORM] wal_synch_method = open_sync safe on RHEL 5.5?

2010-06-17 Thread Mark Mielke
The conclusion I read was that Linux O_SYNC behaves like O_DSYNC on other systems. For WAL, this seems satisfactory? Personally, I use fdatasync(). I wasn't able to measure a reliable difference for my far more smaller databases, and fdatasync() seems reliable and fast enough, that fighting

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Mark Mielke
On 02/23/2010 04:22 PM, da...@lang.hm wrote: On Tue, 23 Feb 2010, Aidan Van Dyk wrote: * da...@lang.hm da...@lang.hm [100223 15:05]: However, one thing that you do not get protection against with software raid is the potential for the writes to hit some drives but not others. If this happens

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Mark Mielke
On 02/22/2010 08:04 PM, Greg Smith wrote: Arjen van der Meijden wrote: That's weird. Intel's SSD's didn't have a write cache afaik: I asked Intel about this and it turns out that the DRAM on the Intel drive isn't used for user data because of the risk of data loss, instead it is used as

Re: [PERFORM] SATA drives performance

2009-12-24 Thread Mark Mielke
On 12/24/2009 10:51 AM, Greg Smith wrote: 7. If you have 3 equal disks, try doing some experiments. My inclination would be to set them all up with ext4... I have yet to yet a single positive thing about using ext4 for PostgreSQL. Stick with ext3, where the problems you might run into are

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Mark Mielke
On 11/17/2009 01:51 PM, Greg Smith wrote: Merlin Moncure wrote: I am right now talking to someone on postgresql irc who is measuring 15k iops from x25-e and no data loss following power plug test. The funny thing about Murphy is that he doesn't visit when things are quiet. It's quite possible

Re: [PERFORM] UUID as primary key

2009-10-10 Thread Mark Mielke
On 10/10/2009 01:14 AM, tsuraan wrote: The most significant impact is that it takes up twice as much space, including the primary key index. This means fewer entries per block, which means slower scans and/or more blocks to navigate through. Still, compared to the rest of the overhead of an

Re: [PERFORM] UUID as primary key

2009-10-09 Thread Mark Mielke
On 10/09/2009 12:56 PM, tsuraan wrote: I have a system where it would be very useful for the primary keys for a few tables to be UUIDs (actually MD5s of files, but UUID seems to be the best 128-bit type available). What is the expected performance of using a UUID as a primary key which will

Re: [PERFORM] Best suiting OS

2009-10-04 Thread Mark Mielke
On 10/01/2009 03:44 PM, Denis Lussier wrote: I'm a BSD license fan, but, I don't know much about *BSD otherwise (except that many advocates say it runs PG very nicely). On the Linux side, unless your a dweeb, go with a newer, popular well supported release for Production. IMHO, that's RHEL

Maybe OT, not sure Re: [PERFORM] Best suiting OS

2009-10-04 Thread Mark Mielke
This is kind of OT, unless somebody really is concerned with understanding the + and - of distributions, and is willing to believe the content of this thread as being accurate and objective... :-) On 10/04/2009 08:42 PM, Scott Marlowe wrote: On Sun, Oct 4, 2009 at 8:05 AM, Mark

Re: [PERFORM] Best suiting OS

2009-10-02 Thread Mark Mielke
On 10/02/2009 10:23 AM, Matthew Wakeling wrote: On Fri, 2 Oct 2009, Tom Lane wrote: You switched OSes instead of complaining to the repository maintainer that he'd forgotten a subpackage? You must have a lot of time on your hands. Camel's back, straw. Besides, both I and our sysadmin are

Re: [PERFORM] Best suiting OS

2009-10-02 Thread Mark Mielke
On 10/02/2009 01:20 PM, Merlin Moncure wrote: I know I'm in the minority here, but I _always_ compile postgresql myself directly from official sources. It's easy enough and you never know when you have to do an emergency patch or cassert build, etc. +1 I decided to do this as soon as I

Re: [PERFORM] What exactly is postgres doing during INSERT/UPDATE ?

2009-08-30 Thread Mark Mielke
On 08/30/2009 11:40 AM, Merlin Moncure wrote: For random writes, raid 5 has to write a minimum of two drives, the data being written and parity. Raid 10 also has to write two drives minimum. A lot of people think parity is a big deal in terms of raid 5 performance penalty, but I don't --

Re: [PERFORM] Memory reporting on CentOS Linux

2009-08-15 Thread Mark Mielke
On 08/15/2009 11:39 AM, Jeremy Carroll wrote: Linux strives to always use 100% of memory at any given time. Therefore the system will always throw free memory into swap cache. The kernel will (and can) take any memory away from the swap cache at any time for resident (physical) memory for

Re: [PERFORM] Will Postgres ever lock with read only queries?

2009-07-27 Thread Mark Mielke
On 07/27/2009 08:54 PM, Robert James wrote: Hi. I'm seeing some weird behavior in Postgres. I'm running read only queries (SELECT that is - no UPDATE or DELETE or INSERT is happening at all). I can run one rather complicated query and the results come back... eventually. Likewise with

Re: [PERFORM] hyperthreaded cpu still an issue in 8.4?

2009-07-21 Thread Mark Mielke
On 07/21/2009 10:36 AM, Grzegorz Jaƛkiewicz wrote: On Tue, Jul 21, 2009 at 3:16 PM, Scott Marlowescott.marl...@gmail.com wrote: On Tue, Jul 21, 2009 at 6:42 AM, Doug Hunleyd...@hunley.homeip.net wrote: Just wondering is the issue referenced in

Re: [PERFORM] Bundling postgreSQL with my Java application

2009-07-06 Thread Mark Mielke
On 07/06/2009 03:17 AM, Saurabh Dave wrote: No offense intended - but have you looked at the documentation for postgresql.conf? If you are going to include PostgreSQL in your application, I'd highly recommend you understand what you are including. :-) I had a look into the documentation of

Re: [PERFORM] Six PostgreSQL questions from a pokerplayer

2009-07-06 Thread Mark Mielke
On 07/06/2009 06:23 AM, Stephen Frost wrote: * Craig Ringer (cr...@postnewspapers.com.au) wrote: What that does mean, though, is that if you don't have significantly more RAM than a 32-bit machine can address (say, 6 to 8 GB), you should stick with 32-bit binaries. I'm not sure this

Re: [PERFORM] Bundling postgreSQL with my Java application

2009-07-05 Thread Mark Mielke
On 07/06/2009 01:48 AM, Saurabh Dave wrote: We are bundling PostgreSQL 8.3.7 with our Java based application. We observe that in some systems the Database access becomes very slow after running it for couple of days. We understand that postgresql.conf needs to be adjusted as per the system

Re: [PERFORM] Nested Loop Killer on 8.1

2009-06-25 Thread Mark Mielke
On 06/25/2009 04:36 PM, Greg Stark wrote: AND web_user_property_directory_outbox.prop_key like 'location_node_directory_outbox' Why use like for a constant string with no % or _ characters? If you used = the planner might be able to come up with a better estimate Any reason why

Re: [PERFORM] Scalability in postgres

2009-06-04 Thread Mark Mielke
of some sort. Cheers, mark -- Mark Mielke m...@mielke.cc

Re: [PERFORM] Scalability in postgres

2009-06-04 Thread Mark Mielke
da...@lang.hm wrote: On Thu, 4 Jun 2009, Mark Mielke wrote: You should really only have as 1X or 2X many threads as there are CPUs waiting on one monitor. Beyond that is waste. The idle threads can be pooled away, and only activated (with individual monitors which can be far more easily

Re: [PERFORM] Scalability in postgres

2009-06-04 Thread Mark Mielke
OR are slow? It takes theory to answer why and so, what do we do about it? Cheers, mark -- Mark Mielke m...@mielke.cc -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] RAID arrays and performance

2008-09-19 Thread Mark Mielke
into a temporary table, then SELECT to join the results, and pull all of the results, doing additional processing (UPDATE) as you pull? Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] Postgres not using array

2008-08-21 Thread Mark Mielke
to be significant and possibly out-perform the 3.0 Ghz x 1. If you usually only have one query running at the same time, I expect the 3.0 Ghz x 1 to always win. PostgreSQL isn't good at splitting the load from a single client across multiple CPU cores. Cheers, mark -- Mark Mielke [EMAIL

Re: [PERFORM] file system and raid performance

2008-08-16 Thread Mark Mielke
be considered on their own. Personally, I use data=writeback for most purposes, but use data=journal for /mail and /home. In these cases, I find even the default ext3 mode to be fewer guarantees than I am comfortable with. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql

Re: [PERFORM] Using PK value as a String

2008-08-12 Thread Mark Mielke
cost, just in case. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] Using PK value as a String

2008-08-12 Thread Mark Mielke
Gregory Stark wrote: Mark Mielke [EMAIL PROTECTED] writes: - Increased keyspace. Even if keyspace allocation is performed, an int4 only has 32-bit of keyspace to allocate. The IPv4 address space is already over 85% allocated as an example of how this can happen. 128-bits has a LOT more

Re: [PERFORM] Using PK value as a String

2008-08-12 Thread Mark Mielke
integers with a good random number source. :-) ) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] file system and raid performance

2008-08-07 Thread Mark Mielke
is that modern Linux distributions do not benefit from noatime as much as they have in the past. In this case, noatime vs default would probably be measuring % noise. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] ??: Postgresql update op is very very slow

2008-06-26 Thread Mark Mielke
, but is only kept as pointers that allow any part of the table to be re-built on access. The UPDATE statement could be recorded cheaply, but queries against the UPDATE statement might be very expensive. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing

Re: [PERFORM] Posible planner improvement?

2008-05-21 Thread Mark Mielke
entirely? If t1.id = t2.id, I would expect the planner to substitute them freely in terms of identities? Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org

Re: [PERFORM] Group by more efficient than distinct?

2008-04-22 Thread Mark Mielke
Matthew Wakeling wrote: On Mon, 21 Apr 2008, Mark Mielke wrote: This surprises me - hash values are lossy, so it must still need to confirm against the real list of values, which at a minimum should require references to the rows to check against? Is PostgreSQL doing something beyond my

Re: [PERFORM] Group by more efficient than distinct?

2008-04-22 Thread Mark Mielke
Matthew Wakeling wrote: On Tue, 22 Apr 2008, Mark Mielke wrote: The poster I responded to said that the memory required for a hash join was relative to the number of distinct values, not the number of rows. They gave an example of millions of rows, but only a few distinct values. Above, you

Re: [PERFORM] Group by more efficient than distinct?

2008-04-21 Thread Mark Mielke
at a minimum should require references to the rows to check against? Is PostgreSQL doing something beyond my imagination? :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http

Re: [PERFORM] Group by more efficient than distinct?

2008-04-21 Thread Mark Mielke
Mark Mielke wrote: PFC wrote: Actually, the memory used by the hash depends on the number of distinct values, not the number of rows which are processed... Consider : SELECT a GROUP BY a SELECT a,count(*) GROUP BY a In both cases the hash only holds discinct values. So if you

Re: [PERFORM] SQL Function Slowness, 8.3.0

2008-04-16 Thread Mark Mielke
, Gavin -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] POSIX file updates

2008-03-31 Thread Mark Mielke
on the premise that POSIX enforces such a thing, or that systems are POSIX compliant. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql

Re: [PERFORM] count * performance issue

2008-03-10 Thread Mark Mielke
this truth. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] count * performance issue

2008-03-07 Thread Mark Mielke
can take time if the index is large (therefore not instantaneous). Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] count * performance issue

2008-03-05 Thread Mark Mielke
a table scan for databases that can accurately determine counts using only the index, but it's still a relatively slow operation, and people don't normally need an accurate count for records in the range of 100,000+? :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Mark Mielke
shine? Curious. Thanks, mark -- Mark Mielke [EMAIL PROTECTED] -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your Subscription: http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.orgextra=pgsql-performance

Re: [PERFORM] How to allocate 8 disks

2008-03-03 Thread Mark Mielke
Matthew wrote: On Mon, 3 Mar 2008, Mark Mielke wrote: Has anybody been able to prove to themselves that RAID 0 vs RAID 1+0 is faster for these sorts of loads? My understanding is that RAID 1+0 *can* reduce latency for reads, but that it relies on random access, whereas RAID 0 performs best

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-27 Thread Mark Mielke
Bill Moran wrote: In response to Mark Mielke [EMAIL PROTECTED]: Bill Moran wrote: I'm fairly sure that FreeBSD's GEOM does. Of course, it couldn't be doing consistency checking at that point. According to this: http://www.freebsd.org/cgi/man.cgi?query=gmirrorapropos=0sektion

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
had too much egg nog... :-) Yep - checking consistency on read would eliminate the performance benefits of RAID under any redundant configuration. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
[EMAIL PROTECTED] wrote: On Wed, 26 Dec 2007, Mark Mielke wrote: Florian Weimer wrote: seek/read/calculate/seek/write since the drive moves on after the read), when you read you must read _all_ drives in the set to check the data integrity. I don't know of any RAID implementation

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
. Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
at all. I note that you also disagree with Dave, in that you are not claiming it performs consistency checks on read. No system does this as performance would go to the crapper. Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
. This is the default balance algorithm. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
, wal, and other parts, and RAID 0 for a build partition. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
would work) Yep. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings

Re: [PERFORM] With 4 disks should I go for RAID 5 or RAID 10

2007-12-26 Thread Mark Mielke
of the system size), I would suggest RAID 1+0 on all four as sensible compromise. If you can put more in - start to consider breaking it up. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9

[PERFORM] Combining two bitmap scans out performs a single regular index scan?

2007-12-08 Thread Mark Mielke
? The good thing is that bitmap scan seems to be well optimized. Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] Combining two bitmap scans out performs a single regular index scan?

2007-12-08 Thread Mark Mielke
Tom Lane wrote: Mark Mielke [EMAIL PROTECTED] writes: To find records after a certain time, I must do one of: select * from icpric where audtdate ? or (audtdate = ? and audttime ?) In recent releases (at least 8.2, don't remember about 8.1), a row comparison is what you want

Re: [PERFORM] RAID arrays and performance

2007-12-04 Thread Mark Mielke
. Just a suggestion... I recall talk of more intelligent table scanning algorithms, and the use of asynchronous I/O to benefit from RAID arrays, but the numbers prepared to convince people that the change would have effect have been less than impressive. Cheers, mark -- Mark Mielke [EMAIL

Re: [PERFORM] RAID arrays and performance

2007-12-04 Thread Mark Mielke
, it is wasting it's time. I am not trying to discourage you - only trying to ensure that you have reasonable expectations. 12X is far too optimistic. Please show one of your query plans and how you as a person would design which pages to request reads for. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] RAID arrays and performance

2007-12-04 Thread Mark Mielke
James Mansion wrote: Mark Mielke wrote: This assumes that you can know which pages to fetch ahead of time - which you do not except for sequential read of a single table. Why doesn't it help to issue IO ahead-of-time requests when you are scanning an index? You can read-ahead in index pages

Re: [PERFORM] RAID arrays and performance

2007-12-04 Thread Mark Mielke
, neither may be the solution to your problem. Or they may be. We wouldn't know without numbers. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] RAID arrays and performance

2007-12-04 Thread Mark Mielke
James Mansion wrote: Mark Mielke wrote: PostgreSQL or the kernel should already have the hottest pages in memory, so the value of doing async I/O is very likely the cooler pages that are unique to the query. We don't know what the cooler pages are until we follow three tree down. I'm

Re: [PERFORM] Union within View vs.Union of Views

2007-11-03 Thread Mark Mielke
. :-) Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend

Re: [PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1

2007-09-11 Thread Mark Mielke
Decibel! wrote: On Mon, Sep 10, 2007 at 06:22:06PM -0400, Mark Mielke wrote: In my case, I set effective_cache_size to 25% of the RAM available to the system (256 Mbytes), for a database that was about 100 Mbytes or less. I found performance to increase when reducing random_page_cost from

Re: [PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1

2007-09-10 Thread Mark Mielke
for me. I think this means that the planner doesn't understand my database size : effective memory size ratio. :-) Anyways - my point is that if you change the default to 10 you may hurt people like me. Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast

Re: [PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1

2007-09-10 Thread Mark Mielke
first. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] random_page_costs - are defaults of 4.0 realistic for SCSI RAID 1

2007-09-10 Thread Mark Mielke
may have been 8.1. I am also curious to see what the current algorithm is with regard to effective_cache_size. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] Transaction Log

2007-08-29 Thread Mark Mielke
, but are filled with commodity RAM instead of a magnetic platter, and a battery that lasts a few weeks without external power. Cheers, mark -- Mark Mielke [EMAIL PROTECTED]

Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux

2007-08-22 Thread Mark Mielke
Are you able to show that the dirty pages are all coming from postgres? Cheers, mark -- Mark Mielke [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings