[GENERAL] Re: Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
Chris Travers, 23.08.2012 05:22:
> The fact that this allows you to create essentially derived values from
> groups of re-used columns is itself remarkable and can be used to
> implement path traversal etc. which is not directly supported in
> PostgreSQL in the sense that it is in Oracle or DB2.

What exactly do you mean by path traversal here? If you are talking about Oracle's CONNECT BY, then this is fully supported in PostgreSQL using a recursive common table expression.

Thomas

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
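[For readers unfamiliar with the equivalence Thomas mentions, here is a minimal sketch of CONNECT BY-style traversal using WITH RECURSIVE. The table and column names are illustrative only, not from the thread:]

```sql
-- Hypothetical adjacency-list table:
CREATE TABLE parts (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES parts(id),
    name      text NOT NULL
);

-- Roughly equivalent to Oracle's
--   SELECT ... START WITH parent_id IS NULL CONNECT BY PRIOR id = parent_id
WITH RECURSIVE tree AS (
    SELECT id, parent_id, name, 1 AS depth,
           name AS path
    FROM parts
    WHERE parent_id IS NULL
  UNION ALL
    SELECT p.id, p.parent_id, p.name, t.depth + 1,
           t.path || ' > ' || p.name
    FROM parts p
    JOIN tree t ON p.parent_id = t.id
)
SELECT * FROM tree ORDER BY path;
```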
Re: [GENERAL] What text format is this and can I import it into Postgres?
Mike Christensen, 23.08.2012 02:41:
> Oh, also if anyone knows of a way to export an Access database to
> Postgres, that might be helpful. I don't have a copy of Access.

If you have a Windows box, you can try SQL Workbench/J. Even though it is a Java application, it can connect to an Access database using the JDBC/ODBC bridge. As every Windows box has an ODBC driver for Access built in, you don't need any additional drivers (you do need a Java runtime, obviously). An example connection URL is shown in the manual:

http://www.sql-workbench.net/manual/profiles.html#odbc

On a 64-bit system, setting up the ODBC connection can be tricky, though, as both 32-bit and 64-bit drivers are registered, I believe. The Java runtime has to match the bitness of the ODBC driver in order to work. I have no idea if using JDBC/ODBC would work on a Unix/Linux box.

SQL Workbench can export any database it can connect to, to various output formats (http://www.sql-workbench.net/manual/command-export.html). The text/csv exports can be imported into Postgres.

Regards,
Thomas (I'm the author of SQL Workbench/J)
[GENERAL] At what point does a big table start becoming too big?
I have a table with 40 million rows and haven't had any performance issues yet. Are there any rules of thumb as to when a table starts getting too big? For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc?
Re: [GENERAL] Alternatives to very large tables with many performance-killing indicies?
On Wed, Aug 22, 2012 at 05:56:27PM -0700, Jeff Janes wrote:
> It's wide-ish, too, 98 columns.

How many of the columns are NULL for any given row? Or perhaps better, what is the distribution of values for any given column? For a given column, is there some magic value (NULL, 0, 1, -1, '') which most of the rows have?

In particular, if the data is sparse, as in lots of NULLs, and you don't need to search on those, you might consider partial indexes. If you create partial indexes for only the non-NULL entries, postgres is smart enough to use them when you query for something that is not NULL. Example:

db=# create temp table foo (a int4, b int4);
CREATE TABLE
db=# insert into foo (a) select generate_series(1,10);
INSERT 0 10
db=# update foo set b=1 where a=1;
UPDATE 1
db=# create index bar on foo(b) where b is not null;
CREATE INDEX
db=# explain select * from foo where b=1;
                            QUERY PLAN
------------------------------------------------------------------
 Bitmap Heap Scan on foo  (cost=4.38..424.59 rows=500 width=8)
   Recheck Cond: (b = 1)
   ->  Bitmap Index Scan on bar  (cost=0.00..4.26 rows=500 width=0)
         Index Cond: (b = 1)
(4 rows)

In this case a row update will only update indexes with non-NULL rows, which may cut the overhead considerably.

Have a nice day,
--
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he
> does not attach much importance to his own thoughts.
  -- Arthur Schopenhauer
Re: [GENERAL] At what point does a big table start becoming too big?
> I have a table with 40 million rows and haven't had any performance
> issues yet. Are there any rules of thumb as to when a table starts
> getting too big? For example, maybe if the index size is 6x the amount
> of ram, if the table is 10% of total disk space, etc?

My rule here is that a table is too big when performance starts degrading beyond an acceptable level. If the database and server are delivering consistent and acceptable performance levels despite an index being 6x RAM or a table consuming 10% of disk, then I tend to leave it be until an issue is raised.

Cheers
Martin

=
Romax Technology Limited
Rutherford House
Nottingham Science & Technology Park
Nottingham, NG7 2PZ
England
Telephone numbers: +44 (0)115 951 88 00 (main)
For other office locations see: http://www.romaxtech.com/Contact
E-mail: i...@romaxtech.com
Website: www.romaxtech.com
Confidentiality Statement: This transmission is for the addressee only and contains information that is confidential and privileged. Unless you are the named addressee, or authorised to receive it on behalf of the addressee, you may not copy or use it, or disclose it to anyone else. If you have received this transmission in error please delete from your system and contact the sender. Thank you for your cooperation.
=
[GENERAL] Result from Having count
Hello ppl,

I try to make a query and see how many ids have more than one row. A few records:

 ids | val | some
-----+-----+------
 a   | 1   | x
 a   | 1   | v
 b   | 1   | x
 b   | 2   | c

I focus on ids and val with:

SELECT ids, val FROM table WHERE ids = 'a' GROUP BY ids, val HAVING COUNT(ids) > 1;

and the result is:

 ids | val
-----+-----
 a   | 1

Well, in this condition pgsql should not return me a positive result, because in the documentation I read that HAVING COUNT works on the GROUP clause, and when I group these two records based on ids = 'a' they become one row, and my condition is whether the result after grouping is greater than 1. I use postgresql 9.1.4 x64.

Can anyone tell me what I miss?

Regards,
Hristo Simeonov
Re: [GENERAL] Result from Having count
Am 23.08.2012 09:52, schrieb Condor:
> Hello ppl, I try to make a query and see how many ids have more than one
> row. A few records:
>
>  ids | val | some
>  a   | 1   | x
>  a   | 1   | v
>  b   | 1   | x
>  b   | 2   | c
>
> I focus on ids and val with:
>
> SELECT ids, val FROM table WHERE ids = 'a' GROUP BY ids, val HAVING COUNT(ids) > 1;
>
> and the result is:
>
>  ids | val
>  a   | 1
>
> Well, in this condition pgsql should not return me a positive result,
> because in the documentation I read that HAVING COUNT works on the GROUP
> clause, and when I group these two records based on ids = 'a' they
> become one row, and my condition is whether the result after grouping is
> greater than 1. I use postgresql 9.1.4 x64. Can anyone tell me what I
> miss?

Not sure I understand you correctly, but maybe count() will work for you. Maybe you would need some primary key for good values.

cheers,
Frank
Re: [GENERAL] Result from Having count
On , Frank Lanitz wrote:
> Am 23.08.2012 09:52, schrieb Condor:
>> Hello ppl, I try to make a query and see how many ids have more than
>> one row. A few records:
>>
>>  ids | val | some
>>  a   | 1   | x
>>  a   | 1   | v
>>  b   | 1   | x
>>  b   | 2   | c
>>
>> I focus on ids and val with:
>>
>> SELECT ids, val FROM table WHERE ids = 'a' GROUP BY ids, val HAVING COUNT(ids) > 1;
>>
>> and the result is:
>>
>>  ids | val
>>  a   | 1
>>
>> Well, in this condition pgsql should not return me a positive result,
>> because in the documentation I read that HAVING COUNT works on the
>> GROUP clause, and when I group these two records based on ids = 'a'
>> they become one row, and my condition is whether the result after
>> grouping is greater than 1. I use postgresql 9.1.4 x64. Can anyone
>> tell me what I miss?
>
> Not sure I understand you correctly, but maybe count() will work for
> you. Maybe you would need some primary key for good values.
>
> cheers,
> Frank

Sorry for my email; after some thinking I understood my error and changed the query to:

SELECT COUNT(DISTINCT val), ids FROM table WHERE ids = 'a' GROUP BY ids HAVING COUNT(DISTINCT val) > 1;

and it works. Thanks.

Hristo C.
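[For readers following along, here is a worked sketch of why the original query behaved as it did and what the corrected one returns, using the sample rows from the thread. The table name `t` is illustrative:]

```sql
CREATE TABLE t (ids text, val int, some text);
INSERT INTO t VALUES
    ('a', 1, 'x'),
    ('a', 1, 'v'),
    ('b', 1, 'x'),
    ('b', 2, 'c');

-- COUNT(ids) counts *rows* per (ids, val) group. The group ('a', 1)
-- contains two rows, so it passes HAVING COUNT(ids) > 1 even though
-- 'a' has only one distinct val:
SELECT ids, val FROM t GROUP BY ids, val HAVING COUNT(ids) > 1;
-- ids | val
--  a  |  1

-- Counting distinct vals per ids answers the actual question
-- ("which ids have more than one value?"):
SELECT ids, COUNT(DISTINCT val) FROM t GROUP BY ids HAVING COUNT(DISTINCT val) > 1;
-- only 'b' qualifies (vals 1 and 2)
```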
Re: [GENERAL] Result from Having count
Am 23.08.2012 10:45, schrieb Condor:
> On , Frank Lanitz wrote:
>> Am 23.08.2012 09:52, schrieb Condor:
>>> Hello ppl, I try to make a query and see how many ids have more than
>>> one row. A few records:
>>>
>>>  ids | val | some
>>>  a   | 1   | x
>>>  a   | 1   | v
>>>  b   | 1   | x
>>>  b   | 2   | c
>>>
>>> I focus on ids and val with:
>>>
>>> SELECT ids, val FROM table WHERE ids = 'a' GROUP BY ids, val HAVING COUNT(ids) > 1;
>>>
>>> and the result is:
>>>
>>>  ids | val
>>>  a   | 1
>>>
>>> Well, in this condition pgsql should not return me a positive result,
>>> because in the documentation I read that HAVING COUNT works on the
>>> GROUP clause, and when I group these two records based on ids = 'a'
>>> they become one row, and my condition is whether the result after
>>> grouping is greater than 1. I use postgresql 9.1.4 x64. Can anyone
>>> tell me what I miss?
>>
>> Not sure I understand you correctly, but maybe count() will work for
>> you. Maybe you would need some primary key for good values.
>>
>> cheers,
>> Frank
>
> Sorry for my email; after some thinking I understood my error and
> changed the query to:
>
> SELECT COUNT(DISTINCT val), ids FROM table WHERE ids = 'a' GROUP BY ids HAVING COUNT(DISTINCT val) > 1;
>
> and it works.

At least I was wrong in understanding your request. ;) But glad you found a solution.

Cheers,
Frank
Re: [GENERAL] At what point does a big table start becoming too big?
On Wed, Aug 22, 2012 at 4:06 PM, Nick nboutel...@gmail.com wrote:
> I have a table with 40 million rows and haven't had any performance
> issues yet. Are there any rules of thumb as to when a table starts
> getting too big? For example, maybe if the index size is 6x the amount
> of ram, if the table is 10% of total disk space, etc?

Performance on a big table is going to depend very heavily on the sorts of queries executed against it. I don't think you can come up with a rule of thumb of that sort.

Best Wishes,
Chris Travers
[GENERAL] Rules, Windows and ORDER BY
Hello List,

I have a simple table of keys and values which periodically receives updated values. It's desirable to keep older values but, most of the time, we query only for the latest value of a particular key.

CREATE TABLE kv (
    k     bytea NOT NULL,
    at    timestamptz NOT NULL,
    realm bytea NOT NULL,
    v     bytea NOT NULL
);
CREATE INDEX ON kv USING hash(k);
CREATE INDEX ON kv (at);
CREATE INDEX ON kv USING hash(realm);

SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

It would be nice to encapsulate this common query with a VIEW; for example:

CREATE VIEW kv_new AS
SELECT * FROM kv WHERE at = (
    SELECT at FROM kv AS _
     WHERE _.k = kv.k AND _.realm = kv.realm
     ORDER BY at DESC LIMIT 1
);

I tried window functions, at first, but they were really very slow. This view is pretty sprightly but has a more complicated plan than the original query, which only has a sort followed by an index scan, and is consequently not as fast. Please find the plans below my signature.

Ideally, I'd be able to create a rule where the ORDER BY and LIMIT were simply appended to whatever SELECT was given; but I am at a loss as to how to do that. Creating a VIEW with the order and limit just gives me a table with one row in it (of course).

Is there something better than a sub-select here? I tried using one with max(at) but it's not noticeably faster. I would be interested to see how others have approached this kind of log-structured storage in Postgres. The window functions make, alas, no use of indexes.

--
Jason Dusek
pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B

EXPLAIN (COSTS FALSE, FORMAT YAML) SELECT * FROM kv WHERE k = ... AND realm = ... ORDER BY at LIMIT 1;

-[ RECORD 1 ]-
QUERY PLAN:
- Plan:
    Node Type: Limit
    Plans:
      - Node Type: Sort
        Parent Relationship: Outer
        Sort Key:
          - at
        Plans:
          - Node Type: Index Scan
            Parent Relationship: Outer
            Scan Direction: NoMovement
            Index Name: kv_k_idx
            Relation Name: kv
            Alias: kv
            Index Cond: (k = ...)
            Filter: (realm = ...)

EXPLAIN (COSTS FALSE, FORMAT YAML) SELECT * FROM kv_new WHERE k = ... AND realm = ...;

-[ RECORD 1 ]-
QUERY PLAN:
- Plan:
    Node Type: Index Scan
    Scan Direction: NoMovement
    Index Name: kv_k_idx
    Relation Name: kv
    Alias: kv
    Index Cond: (k = ...)
    Filter: ((realm = ...) AND (at = (SubPlan 1)))
    Plans:
      - Node Type: Limit
        Parent Relationship: SubPlan
        Subplan Name: SubPlan 1
        Plans:
          - Node Type: Sort
            Parent Relationship: Outer
            Sort Key:
              - _.at
            Plans:
              - Node Type: Index Scan
                Parent Relationship: Outer
                Scan Direction: NoMovement
                Index Name: kv_k_idx
                Relation Name: kv
                Alias: _
                Index Cond: (k = kv.k)
                Filter: (realm = kv.realm)
Re: [GENERAL] Amazon High I/O instances
Le mercredi 22 août 2012 à 13:30 -0400, Sébastien Lorion a écrit :
> Vincent, I would appreciate that you stop assuming things based on zero
> information about what I am doing. I understand that you are trying to
> be helpful, but I can assure you that going bare-metal only does not
> make any sense in my context.

Dude, I would appreciate you realize that approaching a newsgroup while providing zero information about what you are doing (in your own words) is not the best way to get relevant responses to your question. Ignoring repeated requests for information does not help; castigating people trying to help for not having said information at least shows a certain consistency on your part.

Lest we ridicule ourselves publicly, I suggest we leave the discussion at that and wish you luck in your endeavor.

Vincent Veyron
Re: [GENERAL] Amazon High I/O instances
On Thu, Aug 23, 2012 at 7:39 AM, Vincent Veyron vv.li...@wanadoo.fr wrote:
> Lest we ridicule ourselves publicly, I suggest we leave the discussion
> at that and wish you luck in your endeavor.

If anyone has an answer to his question, I'd appreciate hearing it, despite any faux pas that the OP has committed.
Re: [GENERAL] At what point does a big table start becoming too big?
On 2012-08-22, Nick nboutel...@gmail.com wrote:
> I have a table with 40 million rows and haven't had any performance
> issues yet. Are there any rules of thumb as to when a table starts
> getting too big?

When you need to run a query that needs to fetch too many rows.

> For example, maybe if the index size is 6x the amount of ram, if the
> table is 10% of total disk space, etc?

If you only need one row at a time and you have the indices for it, no size is too big; the larger they are, the more impressive indices are. O(log(n)) beats O(n) more and more as n grows.

--
⚂⚃ 100% natural
Re: [GENERAL] At what point does a big table start becoming too big?
In response to Martin French martin.fre...@romaxtech.com:
>> I have a table with 40 million rows and haven't had any performance
>> issues yet. Are there any rules of thumb as to when a table starts
>> getting too big? For example, maybe if the index size is 6x the amount
>> of ram, if the table is 10% of total disk space, etc?
>
> My rule here is that a table is too big when performance starts
> degrading beyond an acceptable level.

The challenge there is that if you wait until performance degrades beyond an acceptable level, you've allowed yourself to get into a situation where clients are upset and frustrated, and fixing the problem is difficult because there's so much data to manipulate to rearrange things.

And the advice I have along those lines is to establish now what constitutes unacceptable performance, and put some sort of monitoring and tracking in place to know what your performance degradation looks like and predict when you'll have to react. For example, an MRTG graph that runs an experimental query once a day during off hours and graphs the time it takes vs. the # of rows in the table will prove a valuable tool that can sometimes predict exactly when you'll have to change things before it becomes a problem. Other tricks work as well, such as having the application send an email any time a process takes more than 50% of the allowable maximum time.

The key is to have visibility into what's going on so your guesses are at least informed. People will often point out that no monitoring or trend tracking is 100% accurate, but if it allows you to predict and plan for 90% of the future issues, you'll have that much more time available to deal with the 10% that you don't expect.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Re: [GENERAL] Amazon High I/O instances
Vincent,

The original question can be summed up by: how is the general performance of PostgreSQL on Amazon IOPS? I fail to understand why that would require me to explain the specifics of my application and/or my market. The only one asking for that information is you, while others have provided useful answers, for which I am very grateful.

p.s. My name is not dude or seb; we have not raised the pigs together ...

Sébastien

On Thu, Aug 23, 2012 at 7:39 AM, Vincent Veyron vv.li...@wanadoo.fr wrote:
> Le mercredi 22 août 2012 à 13:30 -0400, Sébastien Lorion a écrit :
>> Vincent, I would appreciate that you stop assuming things based on zero
>> information about what I am doing. I understand that you are trying to
>> be helpful, but I can assure you that going bare-metal only does not
>> make any sense in my context.
>
> Dude, I would appreciate you realize that approaching a newsgroup while
> providing zero information about what you are doing (in your own words)
> is not the best way to get relevant responses to your question. Ignoring
> repeated requests for information does not help; castigating people
> trying to help for not having said information at least shows a certain
> consistency on your part.
>
> Lest we ridicule ourselves publicly, I suggest we leave the discussion
> at that and wish you luck in your endeavor.
>
> Vincent Veyron
Re: [GENERAL] Can column name aliases be supported?
Thanks for all the replies!

The real problem has nothing to do with names. I just used that as a vehicle for articulating the problem.

The view approach is workable, at least for queries. Thanks for the input on that and the idea to replicate the various aliases in the view! I take it that duplication of the aliases as formal table columns, synched up with a before trigger, is a bad idea? Is there any fruit on the end of that branch? (After looking at RULES, I expect only a single dried raisin :-))

Regarding the RULES approach, I've never used RULES before, so please bear with me a bit :-) My stab at this using the base example...

create rule lastnamealias1 on update PEOPLE.lname do instead update PEOPLE.last_name;
create rule lastnamealias2 on update PEOPLE.surname do instead update PEOPLE.last_name;

Does that look correct (given that last_name is the actual column name)?

I too would like to shoot the clients. But as they say, the customer is always right!

Thanks again everyone!

-----Original Message-----
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Chris Angelico
Sent: Thursday, August 23, 2012 1:23 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Can column name aliases be supported?

On Thu, Aug 23, 2012 at 2:19 PM, Craig Ringer ring...@ringerc.id.au wrote:
> On 08/23/2012 11:56 AM, Chris Angelico wrote:
>> Here's an out-of-the-box suggestion. Drop the column altogether and
>> have a single column name. Trying to divide names up never works
>> properly. Does surname mean family name? Not all cultures put the
>> family name last. Is last_name simply the part of the name after the
>> last space?
>
> +1 to that, and it gets way worse:
> http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

Yes, that link was posted on python-list a little while back, and that's what I had in mind as I was writing that up. Couldn't remember the actual link though. Thanks!

> And while you're at it, read this:
> http://www.joelonsoftware.com/articles/Unicode.html

Definitely. I disagree with Joel Spolsky on many things, but I agree with that post. These days, Unicode is an absolute necessity. Our PHP-based web site has a number of issues with Unicode input, but at least everything that goes through the database (we use Postgres for everything) is safe.

ChrisA
Re: [GENERAL] At what point does a big table start becoming too big?
On Wed, Aug 22, 2012 at 6:06 PM, Nick nboutel...@gmail.com wrote:
> I have a table with 40 million rows and haven't had any performance
> issues yet. Are there any rules of thumb as to when a table starts
> getting too big? For example, maybe if the index size is 6x the amount
> of ram, if the table is 10% of total disk space, etc?

Well, that begs the question: ...and do what? I guess you probably mean partitioning. Partitioning doesn't reduce index size -- it makes total index size *bigger* since you have to duplicate higher nodes in the index -- unless you can exploit the table structure around the partition so that fewer fields have to be indexed.

Where partitioning helps is by speeding certain classes of bulk operations, like deleting a bunch of rows -- maybe you can set it up so that a partition can be dropped instead for a huge efficiency win. Partitioning also helps by breaking up administrative operations such as vacuum, analyze, cluster, create index, reindex, etc. So I'd argue that it's time to start thinking about plan 'b' when you find yourself getting concerned about performance of those operations.

Partitioning aside, the way to reduce the number of rows you're dealing with is to explore reorganizing your data: classic normalization or use of arrays are a couple of examples of things you can try.

merlin
Re: [GENERAL] Amazon High I/O instances
On 08/23/2012 07:39 PM, Vincent Veyron wrote:
> Le mercredi 22 août 2012 à 13:30 -0400, Sébastien Lorion a écrit :
>> Vincent, I would appreciate that you stop assuming things based on zero
>> information about what I am doing. I understand that you are trying to
>> be helpful, but I can assure you that going bare-metal only does not
>> make any sense in my context.
>
> Dude, I would appreciate you realize that approaching a newsgroup while
> providing zero information about what you are doing

In this case, what he's doing is seeking generalized performance measurements. I don't think details were particularly necessary until it got pulled off-track.

I'll be interested to hear if you have any results Sébastien, or if anyone else does. It's good to have data on the increasingly popular cloud platforms out there.

--
Craig Ringer
Re: [GENERAL] At what point does a big table start becoming too big?
On Thu, Aug 23, 2012 at 6:46 AM, Merlin Moncure mmonc...@gmail.com wrote:
> On Wed, Aug 22, 2012 at 6:06 PM, Nick nboutel...@gmail.com wrote:
>> I have a table with 40 million rows and haven't had any performance
>> issues yet. Are there any rules of thumb as to when a table starts
>> getting too big? For example, maybe if the index size is 6x the amount
>> of ram, if the table is 10% of total disk space, etc?
>
> Well, that begs the question: ...and do what? I guess you probably mean
> partitioning. Partitioning doesn't reduce index size -- it makes total
> index size *bigger* since you have to duplicate higher nodes in the
> index -- unless you can exploit the table structure around the partition
> so that fewer fields have to be indexed.

Depending on the operation, it may make the effective index size bigger or smaller. For example, if querying only one child table, your effective index size is much smaller. However, if you are worried about that, partial indexes rock :-D

Best Wishes,
Chris Travers
Re: [GENERAL] At what point does a big table start becoming too big?
Chris Travers chris.trav...@gmail.com writes:
> On Thu, Aug 23, 2012 at 6:46 AM, Merlin Moncure mmonc...@gmail.com wrote:
>> Partitioning doesn't reduce index size -- it makes total index size
>> *bigger* since you have to duplicate higher nodes in the index --
>> unless you can exploit the table structure around the partition so that
>> fewer fields have to be indexed.
>
> Depending on the operation it may make the effective index size bigger
> or smaller. For example if querying only one child table your effective
> index size is much smaller.

I tend to think of it like this: partitioning means *manually* replacing the first level of index search. As such, it is almost never a win for either complexity or performance of simple searches and updates. As Merlin said, pretty much the only compelling reason to do it is if you can match up the partition boundaries with bulk tasks that are common in your application, such as dropping a month's worth of data at a time.

			regards, tom lane
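[The drop-a-month pattern discussed above can be sketched with the inheritance-based partitioning available in the 9.1-era releases this thread concerns. Table and column names are illustrative, not from the thread:]

```sql
CREATE TABLE measurements (
    logged_at timestamptz NOT NULL,
    value     numeric
);

-- One child table per month, with a CHECK constraint so that
-- constraint exclusion lets the planner skip irrelevant partitions:
CREATE TABLE measurements_2012_08 (
    CHECK (logged_at >= '2012-08-01' AND logged_at < '2012-09-01')
) INHERITS (measurements);

-- Removing August's data is then a cheap metadata operation,
-- instead of a bulk DELETE followed by a costly VACUUM:
DROP TABLE measurements_2012_08;
```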
Re: [GENERAL] Can column name aliases be supported?
Updateable views. This is great. I didn't know about these. Absolutely delicious! I found a great example here...

http://vibhorkumar.wordpress.com/2011/10/28/instead-of-trigger/

The problem of a user updating one alias remains, but I have no problem bouncing them if they attempt that. Maybe return a "-E- You cannot update multiple aliases of the same base value" (something like that). I'd just have to detect this circumstance, raise the message and bounce (return OLD). Does this sound doable?

-----Original Message-----
From: Craig Ringer [mailto:ring...@ringerc.id.au]
Sent: Thursday, August 23, 2012 9:52 AM
To: Gauthier, Dave
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Can column name aliases be supported?

On 08/23/2012 09:32 PM, Gauthier, Dave wrote:
> The view approach for queries is workable, at least for queries. Thanks
> for the input on that and the idea to replicate the various aliases in
> the view!

The key issue with all your approaches is whether the client can ever `UPDATE` the view. If they can, and you duplicate the column under multiple aliases, what happens when they `UPDATE` two of the aliases to different values?

> I too would like to shoot the clients. But as they say, the customer is
> always right!

Heh, I wouldn't shoot them. Fire the client, I mean: http://sixrevisions.com/project-management/how-to-fire-bad-clients/

--
Craig Ringer
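[The detect-and-bounce approach described above can be sketched with a 9.1 INSTEAD OF trigger. This is one possible shape, not the thread's actual code; the table, view, and function names are all hypothetical, and a real version would want a primary key rather than matching on the name itself:]

```sql
-- Hypothetical base table with one real column and a view exposing aliases:
CREATE TABLE people (last_name text);

CREATE VIEW people_v AS
    SELECT last_name,
           last_name AS lname,
           last_name AS surname
    FROM people;

CREATE FUNCTION people_v_update() RETURNS trigger AS $$
BEGIN
    -- Bounce any update where the aliases disagree (i.e. the client
    -- changed one alias but not the others, or set them to conflicting
    -- values), as Dave proposes above:
    IF NEW.last_name IS DISTINCT FROM NEW.lname
       OR NEW.lname IS DISTINCT FROM NEW.surname THEN
        RAISE EXCEPTION '-E- You cannot update multiple aliases of the same base value';
    END IF;
    UPDATE people SET last_name = NEW.last_name
     WHERE last_name = OLD.last_name;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER people_v_upd
    INSTEAD OF UPDATE ON people_v
    FOR EACH ROW EXECUTE PROCEDURE people_v_update();
```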
Re: [GENERAL] Can column name aliases be supported?
On 08/23/2012 09:32 PM, Gauthier, Dave wrote:
> The view approach for queries is workable, at least for queries. Thanks
> for the input on that and the idea to replicate the various aliases in
> the view!

The key issue with all your approaches is whether the client can ever `UPDATE` the view. If they can, and you duplicate the column under multiple aliases, what happens when they `UPDATE` two of the aliases to different values?

> I too would like to shoot the clients. But as they say, the customer is
> always right!

Heh, I wouldn't shoot them. Fire the client, I mean: http://sixrevisions.com/project-management/how-to-fire-bad-clients/

--
Craig Ringer
[GENERAL] Cannot Run EnterpriseDB Postgresql Installation
Hi,

I just formatted my machine and installed a fresh Win7 x64. I also installed VS2012, since I do .NET development. In the backend I use PostgreSQL, so I downloaded the latest postgresql 9.1.5 installation. But I am not able to install it. Upon executing the file postgresql-9.1.5-1-windows.exe, it asks for UAC and I say yes; then I get a black window and another window named "console" with "(Downloads) 1 %" written in it. I have installed older versions before, and also on many clients, but this is the first time I am encountering this issue. There are also no logs anywhere. Please help, as my development is fully stranded because of this issue.
Re: [GENERAL] Rules, Windows and ORDER BY
Jason Dusek jason.du...@gmail.com writes:
> I have a simple table of keys and values which periodically receives
> updated values. It's desirable to keep older values but, most of the
> time, we query only for the latest value of a particular key.
>
> CREATE TABLE kv (
>     k     bytea NOT NULL,
>     at    timestamptz NOT NULL,
>     realm bytea NOT NULL,
>     v     bytea NOT NULL
> );
> CREATE INDEX ON kv USING hash(k);
> CREATE INDEX ON kv (at);
> CREATE INDEX ON kv USING hash(realm);
>
> SELECT * FROM kv WHERE k = $1 AND realm = $2 ORDER BY at DESC LIMIT 1;

If you want to make that fast, an index on (k, realm, at) would help. Those indexes that you did create are next to useless for this, and furthermore hash indexes are quite unsafe for production.

			regards, tom lane
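[Tom's suggestion can be sketched concretely. The composite index follows his advice directly; the DISTINCT ON view is a commonly used alternative formulation for latest-row-per-key views, added here as an assumption rather than something proposed in the thread:]

```sql
-- Composite b-tree matching the query's equality columns plus the sort column:
CREATE INDEX ON kv (k, realm, at DESC);

-- The latest-value lookup can then be satisfied from this single index:
SELECT * FROM kv
 WHERE k = $1 AND realm = $2
 ORDER BY at DESC
 LIMIT 1;

-- A view over all keys using DISTINCT ON, which keeps the first row of
-- each (k, realm) group under the given ordering, i.e. the newest one:
CREATE VIEW kv_latest AS
    SELECT DISTINCT ON (k, realm) *
    FROM kv
    ORDER BY k, realm, at DESC;
```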
Re: [GENERAL] NULL value comparison
On Aug 22, 2012, at 9:38 PM, Craig Ringer wrote:
> On 08/23/2012 10:32 AM, Michael Sacket wrote:
>> The good news is I now have the proper constraints in place and the app
>> and its 130 tables are working with PostgreSQL in less than a day.
>
> Wow, that's cool, especially without SQL changes. What was the previous
> database? I'm curious now.

The database was OpenBase. I did an ascii backup and used some scripts I created last summer to make some small changes (mostly changing types on the create statements)... then ran it through psql. On the WebObjects side I just had to update the external types in the EOModel to match the PostgreSQL ones. Turned out to be a fairly straightforward process.

In any case, we've been using PostgreSQL for all new development for about 1.5 years and have found it to be quite reliable, quick, and elegant. OpenBase was quick and reliable for us as well; however, due to its limited adoption outside of MacOS X and WebObjects, we were always having to write our own tools and frameworks to work with it.
Re: [GENERAL] At what point does a big table start becoming too big?
I have a table with 40 million rows and haven't had any performance issues yet. Are there any rules of thumb as to when a table starts getting too big? For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc? My rule here is that a table is too big when performance starts degrading beyond an acceptable level. The challenge there is that if you wait until performance degrades beyond an acceptable level, you've allowed yourself to get into a situation where clients are upset and frustrated, and fixing the problem is difficult because there's so much data to manipulate to rearrange things. Apologies, I could/should have phrased that better... My rule here is that a table is too big when performance starts degrading beyond a MEASURABLE level. :) Cheers Martin
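The rule-of-thumb ratios mentioned above (index size vs. RAM, table vs. disk) are easy to keep an eye on with the built-in size functions; a minimal sketch (table name hypothetical; pg_table_size()/pg_indexes_size() need PostgreSQL 9.0 or later):

```sql
-- compare heap and index sizes against available RAM / disk
SELECT relname,
       pg_size_pretty(pg_table_size(oid))   AS heap_size,
       pg_size_pretty(pg_indexes_size(oid)) AS index_size
  FROM pg_class
 WHERE relname = 'my_big_table';
```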
Re: [GENERAL] Performance implications of numeric?
On Tue, Aug 21, 2012 at 11:27 PM, Wells Oliver wellsoli...@gmail.com wrote: We have a lot of tables which store numeric data. These tables all use the numeric type, where the values are 95% integer values. We used numeric because it eliminated the need for casting during division to yield a floating point value. I'm curious as to whether this would have performance and/or disk size implications. Would converting these columns to integer (or double precision on the handful that require the precision) and forcing developers to use explicit casting be worth the time? Thanks for any clarification. Calculations against numeric are several orders of magnitude slower than native binary operations. Fortunately the time the database spends doing these types of calculations is often a tiny fraction of overall execution time and I advise giving numeric a whirl unless you measure a big performance hit. Let's put it this way: native binary types are a performance hack that come with all kinds of weird baggage that percolate up and uglify your code: your example given is a classic case in point. Database integer types are not in fact integers but a physically constrained approximation of them. Floating point types are even worse. Another example: I just found out for the first time (after many years of programming professionally) that -2147483648 / -1 raises a hardware exception: this is exactly the kind of thing that makes me think that rote use of hardware integer types is a terribly bad practice. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
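Merlin's overflow example can be reproduced directly in SQL; the numeric version simply widens past the int4 range, while the int4 version errors out (PostgreSQL traps the condition and raises an error rather than crashing on the hardware exception):

```sql
-- numeric has no fixed width, so the quotient is representable
SELECT (-2147483648)::numeric / -1;

-- int4 cannot represent +2147483648, so this raises
-- "ERROR: integer out of range"
SELECT (-2147483648)::int4 / (-1)::int4;
```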
Re: [GENERAL] Amazon High I/O instances
On Wed, Aug 22, 2012 at 4:12 PM, Alan Hodgson ahodg...@simkin.ca wrote: On Wednesday, August 22, 2012 04:10:01 PM Andrew Hannon wrote: Just looking into High IO instances for a DB deployment. In order to get past 1TB, we are looking at RAID-0. I have heard (http://hackerne.ws/item?id=4266119) there might be a problem if TRIM isn't supported. Does anyone know if it is and has anyone used RAID-0 on these instances? (Linux of course…) Just use LVM striping. If it turns out to be an issue; that seems to be mostly conjecture. I note that the SSDs are only instance storage. The data will be gone when the instance goes away. I have used instance storage in replicated setups but it always feels rather fragile unless your data really is transient or you can maintain 2 replicas. Their other new service, provisioned IOPS for EBS, might be more useful for a persistent database. Although not nearly SSD speeds, of course. Yeah -- I should have mentioned that: you absolutely must run hs/sr or some other strategy that maintains your data. I guess you might as well turn off fsync, right? merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cannot Run EnterpriseDB Postgresql Installation
Hello, Can we get a screenshot? Also you can check for any partial installation logs in your %TEMP% as install-postgresql.log or bitrock_installer_.log. Check the %TEMP% of the Administrator as well (if you don't see any logs in the %TEMP% of the logged-in user). On Thu, Aug 23, 2012 at 8:01 PM, javad M mjavad...@gmail.com wrote: Hi, i just formatted my machine and installed fresh win7 x64. Also installed VS2012 since i do .net development. In backend i use postgresql so downloaded latest postgresql 9.1.5 installation. But, i am not able to install. Upon executing file postgresql-9.1.5-1-windows.exe it asks for UAC and i say yes, then i get a black window and another window named console with (Downloads) 1 % written in it. I have installed old versions before and also on many clients but this is the first time i am encountering this issue. There are also no logs anywhere. Please help, as my development is fully stranded because of this issue. -- Regards, Sachin Srivastava EnterpriseDB, India
Re: [GENERAL] Amazon High I/O instances
On 08/23/12 6:49 AM, Craig Ringer wrote: In this case, what he's doing is seeking generalized performance measurements. I don't think details were particularly necessary until it got pulled off-track. 42. Performance measurements without a very narrow definition of 'performance' are useless. Depending on the nature of the application workload, postgres can stress completely different aspects of the system (cpu vs read IO performance vs write IO performance being the big three). -- john r pierce  N 37, W 122  santa cruz ca  mid-left coast
[GENERAL] Should I drop / create extensions after upgraging?
Hello all! I'm on Ubuntu 11.10, using 9.1.4. There was a new package for 9.1.5, and I just finished my upgrade. I make use of the uuid-ossp and tablefunc extensions. What is the expected upgrade procedure for these extensions? A drop followed by a create? Or is there something else less invasive? I can't find any references to upgrades on http://www.postgresql.org/docs/9.1/static/sql-createextension.html, nor on http://wiki.postgresql.org/wiki/ExtensionPackaging. Where should I have looked to find a reference to upgrading? Thanks! François Beausoleil -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Should I drop / create extensions after upgraging?
On 23 Srpen 2012, 18:19, François Beausoleil wrote: Hello all! I'm on Ubuntu 11.10, using 9.1.4. There was a new package for 9.1.5, and I just finished my upgrade. I make use of the uuid-ossp and tablefunc extensions. What is the expected upgrade procedure for these extensions? A drop followed by a create? Or is there something else less invasive? I can't find any references to upgrades on http://www.postgresql.org/docs/9.1/static/sql-createextension.html, nor on http://wiki.postgresql.org/wiki/ExtensionPackaging. Where should I have looked to find a reference to upgrading? The minor updates (e.g. from 9.1.4 to 9.1.5) are drop-in binary replacements. You don't need to do anything special about the extensions; it just works (unless there's something really broken). Tomas
Re: [GENERAL] Should I drop / create extensions after upgraging?
Tomas Vondra t...@fuzzy.cz writes: On 23 Srpen 2012, 18:19, François Beausoleil wrote: I'm on Ubuntu 11.10, using 9.1.4. There was a new package for 9.1.5, and I just finished my upgrade. I make use of the uuid-ossp and tablefunc extensions. What is the expected upgrade procedure for these extensions? A drop followed by a create? Or is there something else less invasive? The minor updates (e.g. from 9.1.4 to 9.1.5) are a drop-in binary replacements. You don't need to do anything special about the extensions, it just works (unless there's something really broken). Or perhaps more to the point: if there were something special you had to do, the release notes for 9.1.5 would have told you about it. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Cannot Run EnterpriseDB Postgresql Installation
One more thing i forgot to add The console - help - about window reads Tcl for Windows Tcl 8.5.9 Tk 8.5.9 So i searched and tried downloading and running the latest Tcl available from http://www.activestate.com/activetcl And what do you know, the same thing as in the screenshot running that tcl setup Maybe some issue with the tcl in the postgresql setup I tried older postgresql setup version like 9.0.9, 9.1.4, x64 setups but same issue. For some reason the tcl component is creating some issue, Javad On Thu, Aug 23, 2012 at 11:32 PM, javad M mjavad...@gmail.com wrote: i have attached the screenshot. I searched everywhere for any log file. but nothing. On Thu, Aug 23, 2012 at 9:10 PM, Sachin Srivastava sachin.srivast...@enterprisedb.com wrote: Hello, Can we get a screenshot? Also you can check for any partial installation logs in your %TEMP% as install-postgresql.log or bitrock_installer_.log. Check the %TEMP% of the Administrator as well (If you dont see any logs in the %TEMP% of the logged in user) On Thu, Aug 23, 2012 at 8:01 PM, javad M mjavad...@gmail.com wrote: Hi, i just formatted my machine and installed fresh win7 x64. Also installed VS2012 since i do .net developement. In backend i use postgresql so downloaded latest postgresql 9.1.5 installation. But, i am not able to install. Upon executing file postgresql-9.1.5-1-windows.exe it asks for UAC and i say yes then i get a black window and another window named console with (Downloads) 1 % written in it. I have installed old version before and also on many clients but this is first time i am encountering this issue. There are also no logs anywhere. Please help, as my development is fully stranded because of this issue. -- Regards, Sachin Srivastava EnterpriseDB, India
[GENERAL] Confirming \timing output
With \timing set on, I run an update statement and it reports Time: 0.524 ms Is that really 0.524 ms? As in 524 nanoseconds? Also, is this wallclock time or some sort of indication of how much cpu it took? Thanks for any answers !
Re: [GENERAL] Amazon High I/O instances
I think both kinds of tests (general and app specific) are complementary and useful in their own way. At a minimum, if the general ones fail, why go to the expense of doing the specific ones? Setting up a meaningful application test can take a lot of time and it can be hard to pinpoint exactly where in the stack the performance drops occur. The way I see it, synthetic benchmarks allow you to isolate the layers somewhat and serve as a base to validate application tests done later on. It surprises me that asking for the general perf behavior of a platform is controversial. Sébastien On Thu, Aug 23, 2012 at 11:51 AM, John R Pierce pie...@hogranch.com wrote: On 08/23/12 6:49 AM, Craig Ringer wrote: In this case, what he's doing is seeking generalized performance measurements. I don't think details were particularly necessary until it got pulled off-track. 42. Performance measurements without a very narrow definition of 'performance' are useless. Depending on the nature of the application workload, postgres can stress completely different aspects of the system (cpu vs read IO performance vs write IO performance being the big three). -- john r pierce  N 37, W 122  santa cruz ca  mid-left coast
Re: [GENERAL] Confirming \timing output
On Aug 23, 2012, at 11:13 AM, Gauthier, Dave dave.gauth...@intel.com wrote: With \timing set on, I run an update statement and it reports Time: 0.524 ms Is that really 0.524 ms? As in 524 nanoseconds? 0.524ms = 524000ns Perhaps you meant microseconds? 0.524ms = 524us If all your data happens to be in RAM cache, simple queries can execute very fast! Unless you have a reason to believe it's wrong, I would trust it to be accurate :-) Also, is this wallclock time or some sort of indication of how much cpu it took? Thanks for any answers ! \timing measures wall time. There's a more detailed discussion of the difference between this and e.g. EXPLAIN ANALYZE here: http://postgresql.1045698.n5.nabble.com/What-does-timing-measure-td4289329.html -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Amazon High I/O instances
On 08/23/12 11:24 AM, Sébastien Lorion wrote: I think both kinds of tests (general and app specific) are complementary and useful in their own way. At a minimum, if the general ones fail, why go to the expense of doing the specific ones? Setting up a meaningful application test can take a lot of time and it can be hard to pinpoint exactly where in the stack the performance drops occur. The way I see it, synthetic benchmarks allow you to isolate the layers somewhat and serve as a base to validate application tests done later on. It surprises me that asking for the general perf behavior of a platform is controversial. I don't use AWS at all. But it shouldn't take more than a couple hours to spin up an instance, populate a pgbench database and run a series of pgbench runs against it, and do the same against any other sort of system you wish to use as your reference. I like to test with a database about twice the size of the available memory if I'm testing IO. I've found that with pgbench -i -s <scale>, a scale of 10000 generates a 1 billion row table and uses about 150GB (and an hour or so to initialize on fast IO hardware). I then run pgbench with -c of about 2-4X the cpu/thread count, -j of about -c/16, and a -t of at least 10000 (so each client connection runs 10000 transactions). On a modest but decent 2U class 2-socket dedicated server with a decent raid card and raid10 across enough spindles, I can see numbers as high as 5000 transactions/second with 15krpm rust, and 7000-8000 with a couple MLC SSDs striped. Trying to raid10 a bunch of SATA 7200 disks gives numbers more like 1000. Using host based raid, without a write-back cache in the raid card, gives numbers about 1/2 the above. The IOPS during these tests hit around 12000 or 15000 small writes/second. Doing this level of IO on a midsized SAN can often cause the SAN CPU to run at 80%+ so if there's other activity on the SAN from other hosts, good luck.
In a heavily virtualized shared-everything environment, I'm guessing your numbers will be all over the place and consistency will be difficult to achieve. -- john r pierce  N 37, W 122  santa cruz ca  mid-left coast
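John's recipe, condensed into commands (the scale and client counts are illustrative, sized for a machine where ~150GB is roughly twice RAM; the database name "bench" is a placeholder):

```shell
# initialize: scale 10000 ~= 1 billion pgbench_accounts rows, ~150GB on disk
pgbench -i -s 10000 bench

# run: -c at 2-4x the cpu/thread count, -j about -c/16, and -t large
# enough that each client connection runs a meaningful number of transactions
pgbench -c 64 -j 4 -t 10000 bench
```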
Re: [GENERAL] Confirming \timing output
On 08/23/12 11:13 AM, Gauthier, Dave wrote: Time: 0.524 ms Is that really 0.524 ms? As in 524 nanoseconds? 0.524 MILLIseconds, as in 524 microseconds. Microseconds is commonly abbreviated us. AFAIK, it's elapsed time, not CPU time. -- john r pierce  N 37, W 122  santa cruz ca  mid-left coast
[GENERAL] Windows SIngle Sign On - LINUX Server
Hi All, We are currently running PostgreSQL 8.4 on Windows Server 2003 and are planning to move the instance to Ubuntu 10.4 - yay! At the same time we will also upgrade to 9.1. One nice feature that we leverage from the windows configuration is the ability for windows clients to use AD SSO, i.e. SSPI. This was really easy to set up and configure. If we move to Linux I was wondering if anyone could provide a howto reference or some tips on how to set up auth configuration to provide SSO to windows clients? I've read the GSSAPI/Kerberos authentication section of the docs, but it's still unclear to me the exact process to follow. In particular, what things would I need to configure on the Ubuntu server and which tasks will need to be done on the Windows domain controller (which is managed by a third party service provider who knows little about PostgreSQL or Ubuntu). We are using a Windows Server 2008 for the domain control. However I know little about its setup or configuration; I only know it's our windows domain realm. Regards, Jeremy
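For what it's worth, the moving parts usually look something like the sketch below. The realm, paths and address range are placeholders, and the keytab has to be produced on the AD side (e.g. by creating a service account and exporting its keytab with ktpass), so treat this as an outline rather than a recipe:

```
# On the Windows DC: create an AD account for the service, map the SPN
#   POSTGRES/server.example.com@EXAMPLE.COM
# onto it, and export a keytab for the Ubuntu server.

# postgresql.conf on the Ubuntu server:
krb_server_keyfile = '/etc/postgresql/9.1/main/postgres.keytab'

# pg_hba.conf: authenticate domain clients via GSSAPI
host  all  all  10.0.0.0/8  gss  include_realm=0  krb_realm=EXAMPLE.COM
```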
Re: [GENERAL] How hard would a path operator be to implement in PostgreSQL
On Tue, Aug 21, 2012 at 2:56 AM, Craig Ringer ring...@ringerc.id.au wrote: On 08/21/2012 03:01 PM, Martijn van Oosterhout wrote: Well, Postgres in principle supports arrays of records, so I've wondered if a relationship join could stuff all the objects in a single field of the response using an aggregate. I think what's always prevented this from working is that client would have to parse the resulting output text output, which is practically impossible in the face of custom types. That's where the new JSON support is interesting; it provides a much more commonly understood and easier to parse structured form for results, so trees (but not more general graphs) can be returned. I'd go beyond 'interesting' and call it a complete game changer if you are consuming data in a language that has good support for JSON (especially javascript naturally). Another good option if you're consuming structured data in C is libpqtypes. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
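As a sketch of why the 9.2 JSON support matters here (table and column names hypothetical): a parent row with its detail rows can come back as one well-defined value any JSON library can parse, instead of a text-rendered array of custom record types the client would have to deparse itself.

```sql
-- one row per order, detail lines nested as a JSON array
SELECT row_to_json(o) AS order_doc
  FROM (
    SELECT ord.id,
           ord.customer,
           (SELECT array_agg(l)
              FROM order_lines l
             WHERE l.order_id = ord.id) AS lines
      FROM orders ord
  ) o;
```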
Re: [GENERAL] Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
On Wed, Aug 22, 2012 at 10:22 PM, Chris Travers chris.trav...@gmail.com wrote: I have now been working with table inheritance for a while and after starting to grapple with many of the use cases it has have become increasingly impressed with this feature. I also think that some of the apparent limitations fundamentally follow from the support for multiple inheritance, and multiple inheritance itself is so useful I would not want to see this go away. Inheritance really starts to come to its own once you start using table methods, and some features that are useful in some sorts of inheritance modelling are useless in others. The problem with postgres table inheritance is that it doesn't really solve the problem that people wanted solved: to be able to define an set of specific extra attributes for each row depending on some characteristic of that row. The feature only tantalizingly masquerades as such. Until it found use in table partitioning, I found the inheritance feature to be basically useless. merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Problems with timestamp with time zone and old dates?
HI Scott. Thanks a lot for the feedback. I ended up setting the client time zone to GMT on my connections, and that has fixed the problem for us. It's actually an awesome solution, we can now expect all timestamps to be returned in a consistent fashion. Thanks for prodding me on that and sending me down that road! Michael. On Wed, Aug 22, 2012 at 12:19 AM, Scott Marlowe scott.marl...@gmail.com wrote: On Tue, Aug 21, 2012 at 10:12 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Tue, Aug 21, 2012 at 10:08 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Tue, Aug 21, 2012 at 3:29 PM, Michael Clark codingni...@gmail.com wrote: For example, if I insert like so: INSERT INTO sometable (startdate) values ('1750-08-21 21:17:00+00:00'); What's the reason for you inserting with an offset instead of letting the client timezone set it for you? Just wondering. Note that if you just want to get out what you're putting in (GMT) you can do this: select startdate at time zone 'GMT' from sometable ; timezone - 1750-08-21 21:17:00 Or you could just use plain timestamps, not timestamps with time zone.
Re: [GENERAL] Problems with timestamp with time zone and old dates?
On Wed, Aug 22, 2012 at 5:00 AM, hubert depesz lubaczewski dep...@depesz.com wrote: On Tue, Aug 21, 2012 at 05:29:14PM -0400, Michael Clark wrote: For example, if I insert like so: INSERT INTO sometable (startdate) values ('1750-08-21 21:17:00+00:00'); I get the following when I select: SELECT startdate FROM sometable; startdate -- 1750-08-21 15:59:28-05:17:32 (1 row) It's impossible to tell without knowing what is your time zone, but I don't see anything particularly strange about it. Non-integer offsets do happen, and in the past there were more commonly used. Based on what I can see in timezone data, it looks that time zone America/Toronto used this offset until 1895. Ah, I see. That makes sense. Thanks for the info! It's appreciated! Michael.
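The thread's numbers can be checked directly: the odd -05:17:32 is America/Toronto's pre-1895 local mean time, and it disappears once the client time zone is something unambiguous:

```sql
SET timezone = 'America/Toronto';
SELECT timestamptz '1750-08-21 21:17:00+00';
-- 1750-08-21 15:59:28-05:17:32   (as in the original report)

SET timezone = 'GMT';
SELECT timestamptz '1750-08-21 21:17:00+00';
-- 1750-08-21 21:17:00+00
```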
[GENERAL] need substring based on delimiter
Hi: I want to create a string from the first 3 elements of a csv (for example). The csv is longer than 3 elements. Example... aaa,bbb,ccc,ddd,eee,fff,ggg I want the string aaa,bbb,ccc. Tried splitting this to an array (precursor to appending elements 1,2,3), but failed to be able to reference the elements returned... stdb_a0=# select string_to_array('aaa,bbb,ccc,ddd,eee,fff,ggg',','); string_to_array --- {aaa,bbb,ccc,ddd,eee,fff,ggg} (1 row) stdb_a0=# select string_to_array('aaa,bbb,ccc,ddd,eee,fff,ggg',',')[1]; ERROR: syntax error at or near [ If there were a way position could return the pos of the 3rd comma, I could use substring. But I don't think it can :-(
Re: [GENERAL] Statistical aggregates with intervals
Thomas, On Wed, Aug 22, 2012 at 12:25 PM, Thomas Munro mu...@ip9.org wrote: I noticed that 'avg' works on 'interval', but 'stddev' and 'variance' don't: I don't know why, but you could convert 'interval' into something else where all the functions work: CREATE OR REPLACE FUNCTION interval_to_seconds(interval) RETURNS double precision AS $$ SELECT (extract(days from $1) * 86400) + (extract(hours from $1) * 3600) + (extract(minutes from $1) * 60) + extract(seconds from $1); $$ LANGUAGE SQL; Cheers, Chris -- Christopher Swingley Fairbanks, Alaska http://swingleydev.com/ cswin...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] need substring based on delimiter
On 08/23/2012 02:30 PM, Gauthier, Dave wrote: I want to create a string from the first 3 elements of a csv (for example). The csv is longer than 3 elements. Example... aaa,bbb,ccc,ddd,eee,fff,ggg I want the string aaa,bbb,ccc. select array_to_string ( ( string_to_array ( 'aaa,bbb,ccc,ddd,eee,fff,ggg',',' ) )[1:3],',' ); array_to_string - aaa,bbb,ccc (1 row) HTH, Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Statistical aggregates with intervals
Hi, On 24 August 2012 07:39, Christopher Swingley cswin...@gmail.com wrote: I don't know why, but you could convert 'interval' into something else where all the functions work: CREATE OR REPLACE FUNCTION interval_to_seconds(interval) RETURNS double precision AS $$ SELECT (extract(days from $1) * 86400) + (extract(hours from $1) * 3600) + (extract(minutes from $1) * 60) + extract(seconds from $1); $$ LANGUAGE SQL; Looks complicated. You can extract 'epoch': db=# select now() - (now() - interval '1 day'); ?column? -- 1 day (1 row) db=# select extract(epoch from (now() - (now() - interval '1 day'))); date_part --- 86400 (1 row) -- Ondrej Ivanic (ondrej.iva...@gmail.com) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
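Either conversion makes the missing aggregates usable; for example (values illustrative), taking stddev over epoch seconds and multiplying back into an interval:

```sql
-- stddev/variance don't accept interval directly, but do accept the
-- epoch-seconds representation; interval * double turns it back
SELECT stddev(extract(epoch FROM d)) * interval '1 second' AS stddev_interval
  FROM (VALUES (interval '1 hour'),
               (interval '2 hours'),
               (interval '4 hours')) t(d);
```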
Re: [GENERAL] At what point does a big table start becoming too big?
On Wed, Aug 22, 2012 at 4:06 PM, Nick nboutel...@gmail.com wrote: I have a table with 40 million rows and haven't had any performance issues yet. Are there any rules of thumb as to when a table starts getting too big? No. Assuming you decided it were too big, what could you do about it? If there are chunks of data that you don't need anymore, why wait for the table to become too big before removing it? And partitioning very often isn't the answer, either. There are very few problems that ill-conceived partitioning won't make worse. And there are very many problems which even the best-conceived partitioning will fail to improve. If you have one of the cases where partitioning is a good solution, don't wait for the table to become 'too big'. Just go do it. For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc? If you need to be able to rapidly insert new rows in bulk, and you can't sort them before inserting because there are multiple indexes with completely different sort order and they cover the entire key range of at least some of the indexes, then your performance will collapse long before you get to 6x the amount of RAM. But, what can you do about it? Maybe partitioning will fix this, maybe it won't. If it will, why wait for a rule of thumb to be met? If it won't, what do you actually do once the rule of thumb is met? I guess one rule of thumb I would have is, if for some reason I had to cluster or reindex the table, how long would it take to do so? If that is much longer than I can reasonably schedule as a maintenance window, I would be worried. Cheers, Jeff
Re: [GENERAL] At what point does a big table start becoming too big?
On Thu, Aug 23, 2012 at 6:37 AM, Bill Moran wmo...@potentialtech.com wrote: In response to Martin French martin.fre...@romaxtech.com: I have a table with 40 million rows and haven't had any performance issues yet. Are there any rules of thumb as to when a table starts getting too big? For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc? My rule here is that a table is too big when performance starts degrading beyond an acceptable level. The challenge there is that if you wait until performance degrades beyond an acceptable level, you've allowed yourself to get into a situation where clients are upset and frustrated, and fixing the problem is difficult because there's so much data to manipulate to rearrange things. Yes, I agree with that. And the advice I have along those lines is to establish now what constitutes unacceptable performance, and put some sort of monitoring and tracking in place to know what your performance degradation looks like and predict when you'll have to react. For example, a MRTG graph that runs an experimental query once a day during off hours and graphs the time it takes vs. the # of rows in the table will prove a valuable tool that can sometimes predict exactly when you'll have to change things before it becomes a problem. This seems inconsistent with your previous advice. By the time your experimental query shows a problem, you no longer have any maintenance windows left large enough to fix it. Unless your experimental query was a reindex or something non-production like that, in which case running it on a production server, even off-hours, doesn't seem like a good idea. Cheers, Jeff -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] FETCH in subqueries or CTEs
Hi all I've noticed that FETCH doesn't seem to be supported in subqueries or in CTEs. Is there a specific reason for that, beyond nobody's needed it and implemented it? I'm not complaining at all, merely curious. A toy example: DECLARE somecursor CURSOR FOR SELECT generate_series(1,1000) SELECT * FROM ( FETCH ALL FROM somecursor ) x; produces: ERROR: syntax error at or near FETCH LINE 1: SELECT * FROM ( FETCH ALL FROM somecursor ) x; ^ Same with a CTE: WITH x AS ( FETCH ALL FROM somecursor ) SELECT * FROM x; ERROR: syntax error at or near FETCH LINE 1: WITH x AS ( FETCH ALL FROM somecursor ) SELECT * FROM x; -- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
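Not an answer to the "why", but the usual workaround is to drain the cursor inside a set-returning PL/pgSQL function, which can appear in FROM; a minimal sketch for an integer-producing cursor (the function name is mine):

```sql
CREATE OR REPLACE FUNCTION drain_cursor(cur refcursor)
RETURNS SETOF integer AS $$
DECLARE
    v integer;
BEGIN
    LOOP
        FETCH cur INTO v;        -- FETCH is fine inside PL/pgSQL
        EXIT WHEN NOT FOUND;
        RETURN NEXT v;
    END LOOP;
END
$$ LANGUAGE plpgsql;

-- usable where a bare FETCH is not:
-- SELECT * FROM drain_cursor('somecursor') x;
```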
Re: [GENERAL] Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
On Thu, Aug 23, 2012 at 12:36 PM, Merlin Moncure mmonc...@gmail.com wrote: On Wed, Aug 22, 2012 at 10:22 PM, Chris Travers chris.trav...@gmail.com wrote: I have now been working with table inheritance for a while and after starting to grapple with many of the use cases it has have become increasingly impressed with this feature. I also think that some of the apparent limitations fundamentally follow from the support for multiple inheritance, and multiple inheritance itself is so useful I would not want to see this go away. Inheritance really starts to come to its own once you start using table methods, and some features that are useful in some sorts of inheritance modelling are useless in others. The problem with postgres table inheritance is that it doesn't really solve the problem that people wanted solved: to be able to define an set of specific extra attributes for each row depending on some characteristic of that row. The feature only tantalizingly masquerades as such. This is true to some extent. I may have found a solution there, which is to build your inheritance solutions on top of the solutions used for table partitioning. It's one reason why I say you should start working with it on table partitioning before you try to do set/subset modelling elsewhere. Interestingly one appealing solution doesn't really work (which is to put a check constraint which checks the tableoid column, presumably because this isn't set on insert until after the check constraint fires). One thing I have found looking through Oracle and DB2 docs is that their table inheritance seems to have all the same problems as ours and their solutions to these problems seem rather broken from a pure relational perspective. For example, Oracle and DB2 make extensive use of OID's here (which must be recorded in some sort of system catalog somewhere given what they do with them), and they have functions to take a reference to a row and operators to dereference the row. 
This all sounds well and good until you come across the IS DANGLING operator, which returns true when the reference is no longer there... In other words, as far as I can see, nobody else has come up with a sane foreign key solution for inherited tables either.

Until it found use in table partitioning, I found the inheritance feature to be basically useless. I think one can actually borrow techniques from table partitioning to solve the problems associated with inheritance. However, here's what turned me around on table inheritance:

1) First, in LedgerSMB, we started using it to create consistent interfaces to sets of storage tables. The storage tables would behave differently, but would essentially inherit interfaces from their parents. In this regard, you can think of an inheritance tree as a partitioned table set, but where the partitioning is necessary because foreign key fields reference different tables in different children. We use this, for example, to avoid having to have a global notes table or global file attachments table, and it gives us clear control over where these can be attached, along with central maintenance of data structures. In cases like file attachments, where foreign keys to inheritance trees ended up being needed, we started out with a more complex but workable solution, but I think we are going to a simpler one. This is a good thing. In essence, what we did was use inheritance to give us variable target tables for a foreign key column. I would still like to see inheritable foreign key constraints because that would make some things a lot easier, but the fact that foreign keys are not, by default, copied in means that you can override the destination in the child table. It isn't the documented use, but it actually works very well.

2) Secondly, I re-read Stonebraker's Object-Relational Database: The Next Wave and had a rather sudden epiphany.
Relational databases are about modelling your data so you can ensure consistency and gain as many answers as you can. Object-relational modelling adds interfaces (possibly written in arbitrary programming languages) to derive additional information from stored information. The example he gives could be summarized in English as "Give me all pictures of sunsets taken within 20 miles of Sacramento", where whether a picture is of a sunset is determined by analyzing the graphic itself. Thus you have to add features that allow you to plug into the query to answer that question, and you have to have a planner capable of optimizing such a query. I also read some other papers which discussed table inheritance and what sort of modelling problems it was designed to solve (the main one is actually part/whole modelling, where a row may be a whole in itself and also a part of another whole --- for example, we might sell timing belts, but they might also come included in an engine assembly).

3) I was talking with Matt Trout regarding object-oriented programming in Perl, and
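A minimal sketch of the LedgerSMB-style pattern from point 1 above (all table names here are invented for illustration): the parent supplies the shared "interface" columns, and each child supplies its own foreign key target, which works precisely because FOREIGN KEY constraints are not copied to children by default.

```sql
-- Shared interface for notes; each child decides what ref_key points at.
CREATE TABLE note (
    id      serial PRIMARY KEY,
    subject text NOT NULL,
    body    text NOT NULL,
    ref_key integer NOT NULL
);

-- Hypothetical invoice/customer tables are assumed to exist.
CREATE TABLE invoice_note (
    FOREIGN KEY (ref_key) REFERENCES invoice (id)
) INHERITS (note);

CREATE TABLE customer_note (
    FOREIGN KEY (ref_key) REFERENCES customer (id)
) INHERITS (note);

-- Querying the parent still sees the rows of every child:
-- SELECT subject, body FROM note;
```

Each child enforces its own referential integrity while the parent gives one central place to maintain the data structure and to query all notes at once.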
Re: [GENERAL] FETCH in subqueries or CTEs
On Fri, 2012-08-24 at 09:35 +0800, Craig Ringer wrote:

Hi all. I've noticed that FETCH doesn't seem to be supported in subqueries or in CTEs. Is there a specific reason for that, beyond nobody's needed it and implemented it? I'm not complaining at all, merely curious.

1. Cursors have their own snapshot, so it would be kind of like looking at two snapshots of data at the same time. That would be a little strange.

2. For regular subqueries, it would also be potentially non-deterministic, because the FETCH operation has the side effect of advancing the cursor. So, if you had something like SELECT * FROM (FETCH 1 FROM mycursor) x WHERE FALSE, it's not clear whether the FETCH would execute or not. After the query, it may or may not have advanced the cursor, depending on whether the optimizer decided it didn't need to compute the subquery.

3. Cursors are really meant for more effective interaction with the client; a cursor is not really meant as an operator (and it doesn't change the results, anyway). You can already do LIMIT/OFFSET in a subquery if you need that kind of thing.

All that being said, there may be some use case for something like what you are describing, if you get creative.

Regards, Jeff Davis

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
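The LIMIT/OFFSET alternative Jeff mentions in point 3 might look like this minimal sketch:

```sql
-- Grab a slice of rows inside a subquery instead of FETCHing from a cursor:
SELECT x.n
FROM (
    SELECT n
    FROM generate_series(1, 1000) AS s(n)
    ORDER BY n
    LIMIT 10 OFFSET 20   -- rows 21 through 30
) AS x;
```

Unlike a cursor, this is fully planned as part of the outer query and has no side effects, so the optimizer is free to skip it entirely.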
Re: [GENERAL] Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
One other thing that seems worth mentioning is that as soon as you jump from relational to object-relational modelling, the latter is richer and hence more complex than the former. Because object-relational modelling is a much expanded semantic superset of relational modelling, the antipatterns are a much expanded superset there too. Additionally, because the patterns are not yet well understood, the whole area needs to be seen as somewhat immature. That doesn't mean that the features are primarily useful as foot-guns, but it does mean that productive use of features like this involves careful documentation of patterns and antipatterns.

Best Wishes, Chris Travers
Re: [GENERAL] Can column name aliases be supported?
On 08/23/2012 10:19 PM, Gauthier, Dave wrote:

Updateable views. This is great. I didn't know about these. Absolutely delicious! I found a great example here... http://vibhorkumar.wordpress.com/2011/10/28/instead-of-trigger/ The problem of a user updating more than one alias remains, but I have no problem bouncing them if they attempt that. Maybe return a "-E- You cannot update multiple aliases of the same base value" (something like that). I'd just have to detect this circumstance, raise the message and bounce (return OLD). Does this sound doable?

With a RULE, probably not. With a `DO INSTEAD` view trigger - available in Pg 9.1 and above - yes. I'd recommend using a view trigger instead of a rule if at all possible. Rules are tricksy things and sooner or later they'll bite you.

-- Craig Ringer
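A rough sketch of what Craig is suggesting, for Pg 9.1 and above (the table, view and column names here are hypothetical, invented for illustration):

```sql
-- Base table and a view exposing the same value under two aliases:
CREATE TABLE widget (id int PRIMARY KEY, val text);
CREATE VIEW widget_alias AS
    SELECT id, val AS alias_a, val AS alias_b FROM widget;

CREATE FUNCTION widget_alias_upd() RETURNS trigger AS $$
BEGIN
    -- Bounce updates that change both aliases of the same base value.
    IF NEW.alias_a IS DISTINCT FROM OLD.alias_a
       AND NEW.alias_b IS DISTINCT FROM OLD.alias_b THEN
        RAISE EXCEPTION 'You cannot update multiple aliases of the same base value';
    END IF;
    -- Otherwise route the single changed alias back to the base column.
    UPDATE widget
       SET val = CASE WHEN NEW.alias_a IS DISTINCT FROM OLD.alias_a
                      THEN NEW.alias_a ELSE NEW.alias_b END
     WHERE id = OLD.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER widget_alias_instead
    INSTEAD OF UPDATE ON widget_alias
    FOR EACH ROW EXECUTE PROCEDURE widget_alias_upd();
```

An UPDATE changing only one alias is rewritten against the base table; one changing both raises the error and is bounced.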
Re: [GENERAL] Some thoughts on table inheritance (which is uniquely awesome on PostgreSQL)
Hi,

On 24 August 2012 11:44, Chris Travers chris.trav...@gmail.com wrote: One thing I have found looking through Oracle and DB2 docs is that their table inheritance seems to have all the same problems as ours and their solutions to these problems seem rather broken from a pure relational perspective.

I can second that. Additionally, some vendors tried to fix partitioning (which uses table inheritance) issues by creating all sorts of extensions like CREATE TABLE ... PARTITION BY and ALTER TABLE ... ALTER PARTITION ..., which create all sorts of issues that are not documented at all; you just get responses like "yes, we know about this bug; a fix is not yet available." Many people have asked for SQL wrappers for table partitioning, but that's not easy to do. I would be happy to have out-of-the-box auto-routing for insert/update/copy statements.

-- Ondrej Ivanic (ondrej.iva...@gmail.com)
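Lacking that auto-routing, the usual workaround is a hand-written routing trigger along the lines of the partitioning example in the PostgreSQL documentation (the parent table, partition name and date range below are made up for illustration):

```sql
-- Route inserts on a hypothetical "measurement" parent to a monthly child.
CREATE FUNCTION measurement_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= DATE '2012-08-01' AND NEW.logdate < DATE '2012-09-01' THEN
        INSERT INTO measurement_2012_08 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for date %', NEW.logdate;
    END IF;
    RETURN NULL;  -- row already routed; keep it out of the parent table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_route
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert();
```

Unlike a rule-based approach, a BEFORE INSERT trigger like this also routes rows loaded with COPY, though UPDATEs that move a row across partition boundaries still need separate handling.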
[GENERAL] Postgresql 9.1 on VMWare ESXi 5.0
Dear friends,

Has anyone experienced extreme slowness running PostgreSQL 9.1.4 on virtualized CentOS 5.8 on VMware ESXi 5.0 (with all paravirtualized drivers)? By extreme slowness, consider that a query that brings back one record from a table with 5000 records (using the PK as the criterion) takes 200ms... for comparison, it takes 0ms on my local server (not virtualized). I can see that the disks have almost the same performance as physical disks: 110Mb/s.

Your help will be greatly appreciated!

Regards, Edson
Re: [GENERAL] At what point does a big table start becoming too big?
On Thu, 23 Aug 2012 17:56:37 -0700 Jeff Janes jeff.ja...@gmail.com wrote:

On Thu, Aug 23, 2012 at 6:37 AM, Bill Moran wmo...@potentialtech.com wrote:

In response to Martin French martin.fre...@romaxtech.com: "I have a table with 40 million rows and haven't had any performance issues yet. Are there any rules of thumb as to when a table starts getting too big? For example, maybe if the index size is 6x the amount of RAM, if the table is 10% of total disk space, etc?" My rule here is that a table is too big when performance starts degrading beyond an acceptable level.

The challenge there is that if you wait until performance degrades beyond an acceptable level, you've allowed yourself to get into a situation where clients are upset and frustrated, and fixing the problem is difficult because there's so much data to manipulate to rearrange things.

Yes, I agree with that.

And the advice I have along those lines is to establish now what constitutes unacceptable performance, and put some sort of monitoring and tracking in place to know what your performance degradation looks like and predict when you'll have to react. For example, an MRTG graph that runs an experimental query once a day during off hours and graphs the time it takes vs. the # of rows in the table will prove a valuable tool that can sometimes predict exactly when you'll have to change things before it becomes a problem.

This seems inconsistent with your previous advice. By the time your experimental query shows a problem, you no longer have any maintenance windows left large enough to fix it. Unless your experimental query was a reindex or something non-production like that, in which case running it on a production server, even off-hours, doesn't seem like a good idea.

Perhaps I didn't explain the approach sufficiently.
If you can establish something like, "This specific SELECT has to run in under 5 minutes to meet the client's expectations", you can then time how long that query takes each time it's run (by capturing that information in the application, for example ... or by running it in some automated fashion ... possibly other methods as well). If you capture that runtime on a regular basis and put the results on a graph in concert with other relevant data, such as the number of rows in the related tables, size of the data, etc., you quickly get a good picture of how fast things are growing, and frequently you can project the line out into the future and say things like "if we don't come up with a better way to do this by Sept of next year, we're going to exceed our allowed run time." You can then take that very detailed information to business planners and point out that they need to schedule developer time _before_ then if they don't want the application to slow down below the allowable level.

Unless you work for somewhere that has unlimited resources, your time is always split between feature requests, day-to-day operations, firefighting, etc. In my experience, keeping things like this under control is often a matter of having enough information to justify why your optimization project is more important than whizbang feature X that marketing wants so badly. Of course, if you work somewhere with unlimited resources, you should let me know so I can send in my resume.

And none of what I'm suggesting is intended to belittle the other suggestions either -- if you know of a way to optimize the data better, why not do it now? If you can be purging old data, why wait until performance is a problem to start purging, etc. It's just another trick to have in your bag.

-- Bill Moran wmo...@potentialtech.com
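If the daily timing runs are captured in a table, Postgres's built-in linear-regression aggregates can do the projection directly (the table and column names here are hypothetical; 300000 ms is the five-minute budget from the example, and the sketch assumes growth stays roughly linear):

```sql
-- perf_log(run_date date, row_count bigint, runtime_ms numeric),
-- one row per daily timing run of the monitored query.
SELECT regr_slope(runtime_ms, row_count)     AS ms_per_row,
       regr_intercept(runtime_ms, row_count) AS fixed_ms,
       (300000 - regr_intercept(runtime_ms, row_count))
           / regr_slope(runtime_ms, row_count) AS rows_at_limit
FROM perf_log;
```

Comparing rows_at_limit against the table's growth rate tells you roughly when the query will blow its budget, which is exactly the kind of number business planners can act on.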
Re: [GENERAL] Confirming \timing output
On 08/24/2012 02:30 AM, John R Pierce wrote:

On 08/23/12 11:13 AM, Gauthier, Dave wrote: Time: 0.524 ms. Is that really 0.524 ms? As in 524 nanoseconds?

0.524 MILLIseconds, as in 524 microseconds. Microseconds is commonly abbreviated "us".

They should be µs (micro: µ, seconds: s). Sadly, many setups still can't type anything outside 7-bit ASCII, even in 2012 :-(

-- Craig Ringer
Re: [GENERAL] At what point does a big table start becoming too big?
Hi,

On 23 August 2012 23:37, Bill Moran wmo...@potentialtech.com wrote: And the advice I have along those lines is to establish now what constitutes unacceptable performance, and put some sort of monitoring and tracking in place to know what your performance degradation looks like and predict when you'll have to react. For example, an MRTG graph that runs an experimental query once a day during off hours and graphs the time it takes vs. the # of rows in the table will prove a valuable tool that can sometimes predict exactly when you'll have to change things before it becomes a problem. Other tricks work as well, such as having the application send an email any time a process takes more than 50% of the allowable maximum time.

I like to use APDEX (http://apdex.org/specs.html). You can configure your database to log timing for all statements and then calculate the APDEX score based on the last N log entries (N > 10). The APDEX score is a weighted score based on the number of datapoints within three zones:

0 ... T: Satisfied Zone
T ... F (= 4*T): Tolerating Zone
4T ...: Frustrated Zone

You can choose T (or F; then T = F/4), i.e. under normal circumstances all queries should finish under 20ms (T = 20ms, F = 4T = 80ms). The APDEX score is:

score = (Satisfied count + Tolerating count / 2) / Total samples

You can get this number, for example, every minute and plot it using Ganglia / MRTG / ...

-- Ondrej Ivanic (ondrej.iva...@gmail.com)
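With T = 20 ms, the score formula above can be computed in a single SQL query over a hypothetical samples(ms) table holding the last N response times:

```sql
-- Satisfied: ms <= T (20 ms); Tolerating: T < ms <= 4T (80 ms);
-- everything slower counts as Frustrated (weight 0).
SELECT (sum(CASE WHEN ms <= 20 THEN 1 ELSE 0 END)
        + sum(CASE WHEN ms > 20 AND ms <= 80 THEN 1 ELSE 0 END) / 2.0)
       / count(*) AS apdex_score
FROM samples;
```

For example, with samples of 10, 15, 5, 30, 25 and 90 ms, that is (3 + 2/2) / 6 ≈ 0.67, which would land in the "tolerating" band of most APDEX rating scales.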
Re: [GENERAL] Confirming \timing output
On 08/23/12 7:31 PM, Craig Ringer wrote: 0.524 MILLIseconds, as in 524 microseconds. Microseconds is commonly abbreviated us. They should be µs (micro: µ, seconds: s). Sadly, many setups still can't type anything outside 7-bit ASCII even in 2012.

yeah, I know. I could enter the alt+xyz sequence, except this laptop keyboard doesn't have a number pad, and I was way, way too lazy to find and copy/paste one, or to use charmap.

-- john r pierce                N 37, W 122
   santa cruz ca            mid-left coast
Re: [GENERAL] Postgresql 9.1 on VMWare ESXi 5.0
On 08/24/2012 10:18 AM, Edson Richter wrote: Dear friends, has anyone experienced extreme slowness running PostgreSQL 9.1.4 on virtualized CentOS 5.8 on VMware ESXi 5.0 (with all paravirtualized drivers)?

https://wiki.postgresql.org/wiki/Slow_Query_Questions

Examine `EXPLAIN ANALYZE` output for virtual and physical, and compare. If it's complicated, post both to explain.depesz.com. Examine `iostat`, `vmstat` and `top` to see where the bottlenecks lie. Etc.

-- Craig Ringer
Re: [GENERAL] Amazon High I/O instances
I will be setting up an instance in the coming days and will post the results here. While reading on the subject, I found this interesting discussion on YCombinator: http://news.ycombinator.com/item?id=4264754

Sébastien

On Thu, Aug 23, 2012 at 2:41 PM, John R Pierce pie...@hogranch.com wrote:

On 08/23/12 11:24 AM, Sébastien Lorion wrote: I think both kinds of tests (general and app-specific) are complementary and useful in their own way. At a minimum, if the general ones fail, why go to the expense of doing the specific ones? Setting up a meaningful application test can take a lot of time and it can be hard to pinpoint exactly where in the stack the performance drops occur. The way I see it, synthetic benchmarks allow you to isolate the layers somewhat and serve as a base to validate application tests done later on. It surprises me that asking for the general perf behavior of a platform is controversial.

I don't use AWS at all. But it shouldn't take more than a couple of hours to spin up an instance, populate a pgbench database and run a series of pgbench runs against it, and do the same against any other sort of system you wish to use as your reference.

I like to test with a database about twice the size of the available memory if I'm testing IO. I've found that pgbench -i -s <scale>, for <scale>=10000, generates a 1 billion row table and uses about 150GB (and an hour or so to initialize on fast IO hardware). I then run pgbench with -c of about 2-4X the cpu/thread count, -j of about -c/16, and a -t of at least 10000 (so each client connection runs 10000 transactions).

On a modest but decent 2U-class 2-socket dedicated server with a decent raid card and raid10 across enough spindles, I can see numbers as high as 5000 transactions/second with 15krpm rust, and 7000-8000 with a couple of MLC SSDs striped. Trying to raid10 a bunch of SATA 7200 disks gives numbers more like 1000. Using host-based raid, without a write-back cache in the raid card, gives numbers about half the above.
The IOPS during these tests hit around 12000 to 15000 small writes/second. Doing this level of IO on a midsized SAN can often cause the SAN CPU to run at 80%+, so if there's other activity on the SAN from other hosts, good luck. In a heavily virtualized shared-everything environment, I'm guessing your numbers will be all over the place and consistency will be difficult to achieve.

-- john r pierce                N 37, W 122
   santa cruz ca            mid-left coast
Re: [GENERAL] FETCH in subqueries or CTEs
Hello

2012/8/24 Craig Ringer ring...@ringerc.id.au:

Hi all. I've noticed that FETCH doesn't seem to be supported in subqueries or in CTEs. Is there a specific reason for that, beyond nobody's needed it and implemented it? I'm not complaining at all, merely curious. A toy example:

DECLARE somecursor CURSOR FOR SELECT generate_series(1,1000);
SELECT * FROM ( FETCH ALL FROM somecursor ) x;

produces:

ERROR:  syntax error at or near "FETCH"
LINE 1: SELECT * FROM ( FETCH ALL FROM somecursor ) x;
                        ^

Same with a CTE:

WITH x AS ( FETCH ALL FROM somecursor ) SELECT * FROM x;

ERROR:  syntax error at or near "FETCH"
LINE 1: WITH x AS ( FETCH ALL FROM somecursor ) SELECT * FROM x;

You can't mix planned and unplanned statements together - think about stored plans every time.

Regards

Pavel

-- Craig Ringer