[PERFORM] Bad query plan with high-cardinality column

2013-02-28 Thread Alexander Staubo
I have a planner problem that looks like a bug, but I'm not familiar enough with how planner the works to say for sure. This is my schema: create table comments ( id serial primary key, conversation_id integer, created_at timestamp ); create index

[PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
I have a problem with a query that is planned wrong. This is my schema: create table comments ( id serial primary key, conversation_id integer, created_at timestamp ); create index comments_conversation_id_index on comments (conversation_id); create index

Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
On Friday, February 22, 2013 at 21:33 , Tom Lane wrote: The reason is that the LIMIT may stop the query before it's scanned all of the index. The planner estimates on the assumption that the desired rows are roughly uniformly distributed within the created_at index, and on that assumption, it

Re: [PERFORM] Bad query plan with high-cardinality column

2013-02-22 Thread Alexander Staubo
On Friday, February 22, 2013 at 21:47 , Kevin Grittner wrote: I suspect you would be better off without those two indexes, and instead having an index on (conversation_id, created_at). Not just for the query you show, but in general. Indeed, that solved it, thanks! In my experience these

Re: [PERFORM] Sorting by an arbitrary criterion

2009-07-09 Thread Alexander Staubo
On Thu, Jul 9, 2009 at 6:26 PM, Craig Jamescraig_ja...@emolecules.com wrote: Suppose I have a large table with a small-cardinality CATEGORY column (say, categories 1..5).  I need to sort by an arbitrary (i.e. user-specified) mapping of CATEGORY, something like this:  1 = 'z'  2 = 'a'  3 =

Re: [PERFORM] How would you store read/unread topic status?

2009-06-23 Thread Alexander Staubo
On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebramate...@siteduzero.com wrote: This flags table has more or less the following fields: UserID - TopicID - LastReadAnswerID We are doing pretty much same thing. My problem is that everytime a user READS a topic, it UPDATES this flags table to

Re: [PERFORM] How would you store read/unread topic status?

2009-06-23 Thread Alexander Staubo
On Tue, Jun 23, 2009 at 1:12 PM, Mathieu Nebramate...@siteduzero.com wrote: The flags table keeps track of every topic a member has visited and remembers the last answer which was posted at this moment. It allows the user to come back a few days after and immediately jump to the last answer he

[PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
Here's the query: select photos.* from photos inner join event_participations on event_participations.user_id = photos.creator_id and event_participations.attend = true inner join event_instances on event_instances.id = event_participations.event_instance_id where (

Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
On Tue, Jun 16, 2009 at 3:56 PM, Dave Dutcherd...@tridecap.com wrote: -Original Message- From: Alexander Staubo    -  Nested Loop  (cost=0.00..5729774.95 rows=10420 width=116) (actual time=262614.470..262614.470 rows=0 loops=1)          Join Filter: ((photos.taken_at

Re: [PERFORM] Yet another slow nested loop

2009-06-16 Thread Alexander Staubo
On Tue, Jun 16, 2009 at 4:36 PM, Tom Lanet...@sss.pgh.pa.us wrote: Actually the easiest way to fix that is to get rid of the LIMIT. (Maybe use a cursor instead, and fetch only twenty rows.)  LIMIT magnifies the risks from any estimation error, and you've got a lot of that here ... There's no

Re: [PERFORM] Storing sensor data

2009-05-28 Thread Alexander Staubo
On Thu, May 28, 2009 at 2:54 PM, Ivan Voras ivo...@freebsd.org wrote: The volume of sensor data is potentially huge, on the order of 500,000 updates per hour. Sensor data is few numeric(15,5) numbers. The size of that dataset, combined with the apparent simplicity of your schema and the

Re: [PERFORM] Storing sensor data

2009-05-28 Thread Alexander Staubo
On Thu, May 28, 2009 at 5:06 PM, Ivan Voras ivo...@freebsd.org wrote: If you require precise data with the ability to filter, aggregate and correlate over multiple dimensions, something like Hadoop -- or one of the Hadoop-based column database implementations, such as HBase or Hypertable --

Re: [PERFORM] Bad plan for nested loop + limit

2009-03-30 Thread Alexander Staubo
On Sun, Mar 1, 2009 at 4:32 AM, Robert Haas robertmh...@gmail.com wrote: What do you have default_statistics_target set to?  If it's less than 100, you should probably raise it to 100 and re-analyze (the default value for 8.4 will be 100, but for 8.3 and prior it is 10). Changing it to 100

Re: [PERFORM] Bad plan for nested loop + limit

2009-02-28 Thread Alexander Staubo
On Fri, Feb 27, 2009 at 11:54 PM, Robert Haas robertmh...@gmail.com wrote: The problem here is that the planner estimates the cost of a Limit plan node by adding up (1) the startup cost of the underlying plan node, in this case 0 for the nestjoin, and (2) a percentage of the run cost, based on

Re: [PERFORM] Bad plan for nested loop + limit

2009-02-27 Thread Alexander Staubo
On Sun, Feb 15, 2009 at 5:45 PM, Alexander Staubo a...@bengler.no wrote: On Sun, Feb 15, 2009 at 5:29 AM, David Wilson david.t.wil...@gmail.com wrote: On Sat, Feb 14, 2009 at 5:25 PM, Alexander Staubo a...@bengler.no wrote: Output from explain analyze:  Limit  (cost=0.00..973.63 rows=4

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-16 Thread Alexander Staubo
On Sun, Feb 15, 2009 at 6:35 PM, Greg Smith gsm...@gregsmith.com wrote: http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm goes over this topic, with Appendix B: pg_stat_bgwriter sample analysis covering a look at what to do based on a pg_stat_bgwriter snapshot. Wonderful, thank

Re: [PERFORM] Bad plan for nested loop + limit

2009-02-15 Thread Alexander Staubo
On Sun, Feb 15, 2009 at 5:29 AM, David Wilson david.t.wil...@gmail.com wrote: On Sat, Feb 14, 2009 at 5:25 PM, Alexander Staubo a...@bengler.no wrote: Output from explain analyze: Limit (cost=0.00..973.63 rows=4 width=48) (actual time=61.554..4039.704 rows=1 loops=1) - Nested Loop

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-14 Thread Alexander Staubo
On Fri, Feb 13, 2009 at 6:35 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: You should definitely set effective_cache_size. Wow -- I set this to 10GB (5GB for shared buffers + another 5GB for cache), and today's average write frequency went from 20MB/sec to just 1MB/sec. The documentation

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-14 Thread Alexander Staubo
On Sat, Feb 14, 2009 at 9:49 AM, Craig Ringer cr...@postnewspapers.com.au wrote: Is there any chance you had pg_xlog stored separately on your old database, and I/O for it wasn't being recorded? No, the database files have always been on a single volume. Alexander. -- Sent via

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-14 Thread Alexander Staubo
On Sat, Feb 14, 2009 at 8:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alexander Staubo a...@bengler.no writes: Wow -- I set this to 10GB (5GB for shared buffers + another 5GB for cache), and today's average write frequency went from 20MB/sec to just 1MB/sec. The documentation suggests

[PERFORM] Bad plan for nested loop + limit

2009-02-14 Thread Alexander Staubo
This dinky little query takes about 4 seconds to run: select event_occurrences.* from event_occurrences join section_items on section_items.subject_id = event_occurrences.event_id and section_items.subject_type = 'Event' and section_items.sandbox_id = 16399 where

[PERFORM] I/O increase after upgrading to 8.3.5

2009-02-13 Thread Alexander Staubo
After upgrading from 8.2 to 8.3.5, the write load on our database server has increased dramatically and inexplicably -- as has the CPU usage. Here's a Munin graph of iostat showing the sudden increase in blocks written/sec: http://purefiction.net/paste/Totals-iostatwrite1-week.png We expected

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-13 Thread Alexander Staubo
On Fri, Feb 13, 2009 at 3:46 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Alexander Staubo a...@bengler.no wrote: After upgrading from 8.2 to 8.3.5, the write load on our database server has increased dramatically and inexplicably -- as has the CPU usage. Did you do a VACUUM ANALYZE

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-13 Thread Alexander Staubo
On Fri, Feb 13, 2009 at 12:53 PM, Alexander Staubo a...@bengler.no wrote: The upgrade was done with dump/restore using pg_dump -Fc. The old database lived on a SAN volume, whereas the new database lives on a local disk volume. I need to correct myself: The Munin graphs were never set to track

Re: [PERFORM] I/O increase after upgrading to 8.3.5

2009-02-13 Thread Alexander Staubo
On Fri, Feb 13, 2009 at 5:17 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Could you show the non-commented lines from old and new postgresql.conf files, please? Attached. The differences are not performance-related, as far as I can see, aside from the additional of synchronous_commit =

Re: [PERFORM] Identifying the nature of blocking I/O

2008-08-25 Thread Alexander Staubo
On Mon, Aug 25, 2008 at 3:34 AM, Scott Carey [EMAIL PROTECTED] wrote: DTrace is available now on MacOSX, Solaris 10, OpenSolaris, and FreeBSD. Linux however is still in the dark ages when it comes to system monitoring, especially with I/O. While that's true, newer 2.6 kernel versions at least

Re: [PERFORM] Query Optimization with Kruskal’s Algorithm

2008-05-07 Thread Alexander Staubo
On 5/7/08, Tarcizio Bini [EMAIL PROTECTED] wrote: I'm working on optimizing queries using the Kruskal algorithm (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4318118). That paper looks very interesting. I would love to hear what the PostgreSQL committers think of this algorithm.

Re: [PERFORM] Putting files into fields in a table

2007-12-13 Thread Alexander Staubo
On 12/13/07, Campbell, Lance [EMAIL PROTECTED] wrote: I am looking at the possibility of storing files in some of my database tables. My concern is obviously performance. I have configured PostgreSQL to take advantage of Linux file caching. So my PostgreSQL does not have a large setting for

Re: [PERFORM] PostgreSQL performance on various distribution stock kernels

2007-11-26 Thread Alexander Staubo
On 11/26/07, Damon Hart [EMAIL PROTECTED] wrote: So, what's different between these tests? I'm seeing performance differences of between +65% to +90% transactions per second of the OpenVZ kernel running on the HN over the stock Fedora 8 kernel. Is this reflective of different emphasis between

Re: [PERFORM] Estimation problem with a LIKE clause containing a /

2007-11-07 Thread Alexander Staubo
On 11/7/07, Guillaume Smet [EMAIL PROTECTED] wrote: While studying a query taking forever after an ANALYZE on a never analyzed database (a bad estimate causes a nested loop on a lot of tuples), I found the following problem: [snip] Total runtime: 31.097 ms [snip] Total runtime: 31.341 ms

Re: [PERFORM] TEXT or LONGTEXT?

2007-09-24 Thread Alexander Staubo
On 9/24/07, Fabiola Fernández [EMAIL PROTECTED] wrote: I have a database with an amount of tables and in several of them I have an attribute for a semantic definition, for which I use a field of type text. I am trying to decide if it would be worth using LONGTEXT instead of TEXT, as maybe it

Re: [PERFORM] Replication

2007-06-15 Thread Alexander Staubo
On 6/15/07, Craig James [EMAIL PROTECTED] wrote: I don't think I can use PGPool as the replicator, because then it becomes a new single point of failure that could bring the whole system down. If you're using it for INSERT/UPDATE, then there can only be one PGPool server. Are you sure? I

Re: [PERFORM] Replication

2007-06-14 Thread Alexander Staubo
On 6/15/07, Craig James [EMAIL PROTECTED] wrote: [snip] Is this a good summary of the status of replication? Have I missed any important solutions or mischaracterized anything? * Mammoth Replicator, commercial. * Continuent uni/cluster, commercial

Re: [PERFORM] Some info to share: db_STRESS Benchmark results

2007-05-31 Thread Alexander Staubo
On 5/31/07, Dimitri [EMAIL PROTECTED] wrote: just wanted to share some benchmark results from one long performance study comparing MySQL, PostgreSQL and Oracle transactions throughput and engine scalability on T2000 and V890 (under Solaris). Interesting, if awfully cryptic. The lack of axis

Re: [PERFORM] Feature suggestion : FAST CLUSTER

2007-05-27 Thread Alexander Staubo
On 5/27/07, PFC [EMAIL PROTECTED] wrote: PostgreSQL humiliates InnoDB on CPU-bound workloads (about 2x faster since I run it on dual core ; InnoDB uses only one core). However, InnoDB can automatically cluster tables without maintenance. How does it know what to cluster by? Does it

Re: [PERFORM] LIKE search and performance

2007-05-23 Thread Alexander Staubo
On 5/23/07, Andy [EMAIL PROTECTED] wrote: An example would be: SELECT * FROM table WHERE name like '%john%' or street like '%srt%' Anyway, the query planner always does seq scan on the whole table and that takes some time. How can this be optimized or made in

Re: [PERFORM] Performace comparison of indexes over timestamp fields

2007-05-22 Thread Alexander Staubo
On 5/22/07, Arnau [EMAIL PROTECTED] wrote: On older versions of PostgreSQL, at least in my experience, queries on timestamps fields even having indexes where performing quite bad mainly sequential scans where performed. PostgreSQL uses B-tree indexes for scalar values. For an expression

Re: [PERFORM] Performace comparison of indexes over timestamp fields

2007-05-22 Thread Alexander Staubo
On 5/22/07, Steinar H. Gunderson [EMAIL PROTECTED] wrote: On Tue, May 22, 2007 at 02:39:33PM +0200, Alexander Staubo wrote: PostgreSQL uses B-tree indexes for scalar values. For an expression such as t between a and b, I believe it's going to match both sides of the table independently (ie

Re: [PERFORM] Tips Tricks for validating hardware/os

2007-05-22 Thread Alexander Staubo
On 5/22/07, Stephane Bailliez [EMAIL PROTECTED] wrote: Out of curiosity, can anyone share his tips tricks to validate a machine before labelling it as 'ready to use postgres - you probably won't trash my data today' ? I'm looking for a way to stress test components especially kernel/disk to

Re: [PERFORM] bitmap index and IS NULL predicate

2007-05-15 Thread Alexander Staubo
On 5/15/07, Jason Pinnix [EMAIL PROTECTED] wrote: Does the bitmap index not store a bit vector for the NULL value (i.e. a bit vector that contains a 1 for each row with a NULL value and 0 for other rows) ? You should be able to do this with a conditional index: create index ... (col) where

Re: [PERFORM] Best OS for Postgres 8.2

2007-05-08 Thread Alexander Staubo
On 5/8/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: [snip] I personally don't trust reiserfs, jfs seems to be a tools for transitioning from AIX more then anything else [...] What makes you say this? I have run JFS for years with complete satisfaction, and I have never logged into an AIX

Re: [PERFORM] specific query (not all) on Pg8 MUCH slower than Pg7

2007-05-08 Thread Alexander Staubo
On 5/8/07, Tom Lane [EMAIL PROTECTED] wrote: You're not getting the indexscan optimization of the LIKE clause, which is most likely due to having initdb'd the 8.1 installation in something other than C locale. You can either redo the initdb in C locale (which might be a good move to fix other

Re: [PERFORM] pg_stat_* collection

2007-05-03 Thread Alexander Staubo
On 5/3/07, Greg Smith [EMAIL PROTECTED] wrote: Today's survey is: just what are *you* doing to collect up the information about your system made available by the various pg_stat views? I have this hacked together script that dumps them into a file, imports them into another database, and then

Re: [PERFORM] index structure for 114-dimension vector

2007-05-01 Thread Alexander Staubo
On 5/1/07, Andrew Lazarus [EMAIL PROTECTED] wrote: Let me just thank the list, especially for the references. (I found similar papers myself with Google: and to think I have a university library alumni card and barely need it any more!) I'll write again on the sorts of results I get. Looking

Re: [PERFORM] index structure for 114-dimension vector

2007-04-26 Thread Alexander Staubo
On 4/20/07, Andrew Lazarus [EMAIL PROTECTED] wrote: I have a table with 2.5 million real[] arrays. (They are points in a time series.) Given a new array X, I'd like to find, say, the 25 closest to X in some sense--for simplification, let's just say in the usual vector norm. Speed is critical

Re: [PERFORM] Simple query, 10 million records...MySQL ten times faster

2007-04-26 Thread Alexander Staubo
On 24 Apr 2007 14:26:46 -0700, zardozrocks [EMAIL PROTECTED] wrote: I have this table: CREATE TABLE test_zip_assoc ( id serial NOT NULL, f_id integer DEFAULT 0 NOT NULL, lat_radians numeric(6,5) DEFAULT 0.0 NOT NULL, long_radians numeric(6,5) DEFAULT 0.0 NOT NULL );

Re: [PERFORM] Simple query, 10 million records...MySQL ten times faster

2007-04-26 Thread Alexander Staubo
On 4/27/07, Alexander Staubo [EMAIL PROTECTED] wrote: [snip] PostGIS implements the whole GIS stack, and it's so good at this that it's practically the de facto tool among GIS analysts. Installing PostGIS into a database is simple, and once you have done this, you can augment your table

Re: [PERFORM] Warm - standby system.

2007-04-24 Thread Alexander Staubo
On 4/24/07, Nimesh Satam [EMAIL PROTECTED] wrote: Can anybody let me know the steps which are supposed to be followed to make the standby machine for read access? and how it should be one. Not possible at the moment. The warm standby is not hot -- it cannot be used for queries while it's

Re: [PERFORM] Two hard drives --- what to do with them?

2007-02-24 Thread Alexander Staubo
On Feb 25, 2007, at 04:39 , Carlos Moreno wrote: I do have the option to configure it in RAID-0, but I'm sort of reluctant; I think there's the possibility that having two filesystems that can be accessed truly simultaneously can be more beneficial. The question is: does PostgreSQL have

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Alexander Staubo
On Dec 15, 2006, at 04:09 , Ron wrote: At 07:27 PM 12/14/2006, Alexander Staubo wrote: Sorry, I neglected to include the pertinent graph: http://purefiction.net/paste/pgbench2.pdf In fact, your graph suggests that using arch specific options in addition to -O3 actually =hurts

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-15 Thread Alexander Staubo
On Dec 15, 2006, at 17:53 , Ron wrote: At 09:50 AM 12/15/2006, Greg Smith wrote: On Fri, 15 Dec 2006, Merlin Moncure wrote: The slower is probably due to the unroll loops switch which can actually hurt code due to the larger footprint (less cache coherency). The cache issues are so

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-14 Thread Alexander Staubo
On Dec 14, 2006, at 16:00 , Greg Smith wrote: On Wed, 13 Dec 2006, Ron wrote: The slowest results, Michael's, are on the system with what appears to be the slowest CPU of the bunch; and the ranking of the rest of the results seem to similarly depend on relative CPU performance. This is

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-14 Thread Alexander Staubo
On Dec 14, 2006, at 20:28 , Ron wrote: Can you do runs with just CFLAGS=-O3 and just CFLAGS=-msse2 - mfpmath=sse -funroll-loops -m64 - march=opteron -pipe as well ? All right. From my perspective, the effect of -O3 is significant, whereas architecture-related optimizations have no

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-14 Thread Alexander Staubo
On Dec 15, 2006, at 01:16 , Ron wrote: At 05:39 PM 12/14/2006, Alexander Staubo wrote: On Dec 14, 2006, at 20:28 , Ron wrote: Can you do runs with just CFLAGS=-O3 and just CFLAGS=-msse2 - mfpmath=sse -funroll-loops -m64 - march=opteron -pipe as well ? All right. From my perspective

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-12 Thread Alexander Staubo
On Dec 11, 2006, at 23:22 , Daniel van Ham Colchete wrote: I ran this test at a Gentoo test machine I have here. It's a Pentium 4 3.0GHz (I don't know witch P4) Try cat /proc/cpuinfo. TESTS RESULTS == On a dual-core Opteron 280 with 4G RAM with an LSI PCI-X Fusion-MPT SAS

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-12 Thread Alexander Staubo
On Dec 12, 2006, at 13:32 , Michael Stone wrote: On Tue, Dec 12, 2006 at 12:29:29PM +0100, Alexander Staubo wrote: I suspect the hardware's real maximum performance of the system is ~150 tps, but that the LSI's write cache is buffering the writes. I would love to validate this hypothesis

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Alexander Staubo
On Dec 11, 2006, at 04:35 , Tatsuo Ishii wrote: That's not the whole story. UTF-8 and other variable-width encodings don't provide a 1:1 mapping of logical characters to single bytes; in particular, combination characters opens the possibility of multiple different byte sequences mapping to the

Re: [PERFORM] File Systems Compared

2006-12-06 Thread Alexander Staubo
On Dec 6, 2006, at 16:40 , Brian Wipf wrote: All tests are with bonnie++ 1.03a [snip] Care to post these numbers *without* word wrapping? Thanks. Alexander. ---(end of broadcast)--- TIP 4: Have you searched our list archives?

Re: [PERFORM] query produces 1 GB temp file

2006-10-27 Thread Alexander Staubo
While I can't explain why PostgreSQL would use that memory, I recommend looking into tweaking the work_mem parameter. This setting specifies how much memory PostgreSQL on certain temporary data structures (hash tables, sort vectors) until it starts using temporary files. Quoting the docs:

Re: [PERFORM] Optimization of this SQL sentence

2006-10-17 Thread Alexander Staubo
On Oct 17, 2006, at 11:33 , Ruben Rubio wrote: CREATE TABLE comment ( idcomment int4 NOT NULL DEFAULT nextval('comment_idcomment_seq'::regclass), [snip 28 columns] CONSTRAINT comment_pkey PRIMARY KEY (idcomment) ) Ficha structure: No indexes in ficha Ficha rows: 17.850 CREATE TABLE

Re: [PERFORM] Optimization of this SQL sentence

2006-10-17 Thread Alexander Staubo
On Oct 17, 2006, at 17:10 , Craig A. James wrote: These tables are particularly egregious examples of ignorant database design. You need to understand the relational model This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgress community, an

Re: [PERFORM] Optimization of this SQL sentence

2006-10-17 Thread Alexander Staubo
On Oct 17, 2006, at 17:29 , Mario Weilguni wrote: Am Dienstag, 17. Oktober 2006 11:52 schrieb Alexander Staubo: Lastly, note that in PostgreSQL these length declarations are not necessary: contacto varchar(255), fuente varchar(512), prefijopais varchar(10) Enforcing length

Re: [PERFORM] Poor performance on very simple query ?

2006-10-03 Thread Alexander Staubo
On Oct 3, 2006, at 13:25 , Arnaud Lesauvage wrote: The problem is that simple select queries with the primary key in the WHERE statement take very long to run. For example, this query returns only 7 rows and takes about 1 second to run ! SELECT * FROM table1 WHERE gid in