[GENERAL] SR: pg_xlog

2013-08-27 Thread salah jubeh
Hello, 


I have a question regarding streaming replication:

When does pg_xlog need to be shipped in streaming replication, and when not? I
have faced different issues in two existing setups: in the first setup, when
shipping the pg_xlogs, a timeline issue arose; in another setup, when not
shipping the log files, I get a "the database system is starting up" FATAL
error.


Regards

Re: [GENERAL] SR: pg_xlog

2013-08-27 Thread Albe Laurenz
salah jubeh wrote:
 I have a question regarding streaming replication:
 
 When does pg_xlog need to be shipped in streaming replication, and when not? I have
 faced different issues in two existing setups: in the first setup, when shipping the
 pg_xlogs, a timeline issue arose; in another setup, when not shipping the log files,
 I get a "the database system is starting up" FATAL error.

You don't need log shipping for streaming replication as such.

The timeline problem should be solved in PostgreSQL 9.3.

But it is still a good idea to use log shipping:
As you describe, you will usually need archived WALs to start replication,
unless the base backup is really recent.
Another case is when there is an interruption or the standby cannot
replicate fast enough.  Then replication will fall behind and might
need WAL archives to catch up.
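
(For illustration, a typical 9.x standby recovery.conf that combines
streaming with a WAL archive as a fallback; the host name and archive path
below are placeholders:)

standby_mode = 'on'
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
restore_command = 'cp /mnt/wal_archive/%f "%p"'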

Yours,
Laurenz Albe



[GENERAL] OLAP

2013-08-27 Thread Alban Hertroys
Hi all,

At work we have a system that's built on top of a proprietary OLAP database
system (Rocket Gentia). We're looking into replacing that system with
something that's still supported and in such a way that we can also access
the data from our reporting software (WebFOCUS by Information Builders).

Because I'm always advocating PG, I was asked whether PG would be suitable
for this, but I'm not really familiar enough with OLAP databases to be able
to comment on that.

I got three prerequisites for a solution, namely:
1. It must contain correct information,
2. It must be fast and
3. It must be easy to maintain the data and the models; that's a task for a
3rd party back-end application, but it would be helpful to be able to name
something to the higher-ups.

Next to that, because we're also going to access the system using our
reporting software (which is read-only access), it would be best if the
entire data model and all the business rules are stored inside the database
so that we're looking at the data in the same way that the back-end sees
it.

For size, we're looking at about 20 years of sales and shipment data all
over the world (although mostly in Europe) for about 5mln sold products per
year.

I suspect there might be some middleware that handles the models and
dimensions and stuff and manages triggers on relational tables in PG or a
setup like that.
I've seen an old reference to Cybertec OLAP, but they don't seem to carry
a product like that, judging by their site.

I'm looking for suggestions for something that would be able to do this.

Cheers,
Alban.
-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: [GENERAL] OLAP

2013-08-27 Thread Paul Jungwirth
Hi Alban,

I think Postgres works great for OLAP work, and Amazon's Redshift is
even based on Postgres. 100 million sales should be no problem at
all. My understanding is Greenplum also builds on top of Postgres, so
if you ever do outgrow your Postgres installation, that would be an
easy migration path. One Postgres OLAP tool to consider is Pentaho.
That will save you lots of time around ETL, ad-hoc reporting, and
other standard OLAP functionality.

Good luck!
Paul



On Tue, Aug 27, 2013 at 8:12 AM, Alban Hertroys haram...@gmail.com wrote:
 Hi all,

 At work we have a system that's built on top of a proprietary OLAP database
 system (Rocket Gentia). We're looking into replacing that system with
 something that's still supported and in such a way that we can also access
 the data from our reporting software (WebFOCUS by information builders).

 Because I'm always advocating PG, I was asked whether PG would be suitable
 for this, but I'm not really familiar enough with OLAP databases to be able
 to comment on that.

 I got three prerequisites for a solution, namely:
 1. It must contain correct information,
 2. It must be fast and
 3. It must be easy to maintain the data and the models; that's a task for a
 3rd party back-end application, but it would be helpful to be able to name
 something to the higher-ups.

 Next to that, because we're also going to access the system using our
 reporting software (which is read-only access), it would be best if the
 entire data model and all the business rules are stored inside the database
 so that we're looking at the data in the same way that the back-end sees
 it.

 For size, we're looking at about 20 years of sales and shipment data all
 over the world (although mostly in Europe) for about 5mln sold products per
 year.

 I suspect there might be some middleware that handles the models and
 dimensions and stuff and manages triggers on relational tables in PG or a
 setup like that.
 I've seen an old reference to Cybertec OLAP, but they don't seem to carry
 a product like that if I watch their site.

 I'm looking for suggestions for something that would be able to do this.

 Cheers,
 Alban.
 --
 If you can't see the forest for the trees,
 Cut the trees and you'll see there is no forest.



-- 
_
Pulchritudo splendor veritatis.




[GENERAL] Pgbouncer help

2013-08-27 Thread Yelai, Ramkumar IN BLR STS
HI



In our current project, we are opening several PostgreSQL connections. A few
connections are frequently used and a few only occasionally. Hence we plan to
adopt connection pooling to avoid opening too many connections.  We plan to use
"Pgbouncer".  Most of the pgbouncer examples show how to configure it, but they
do not explain how to use it from C++.



Please provide an example of how to use it from C++.



Thanks & Regards,

Ramkumar



[GENERAL] A QUESTION ABOUT evaluation of two-dimensional Associative Arrays

2013-08-27 Thread superzhangfei
Hello
This is Robin.
I'm learning about the package feature of EnterpriseDB.
I wrote a package like:

CREATE OR REPLACE PACKAGE BODY pkg_test_d2 IS
    TYPE t_d2_tab IS TABLE OF integer INDEX BY integer;
    TYPE t_d2_arr IS TABLE OF t_d2_tab INDEX BY integer;
    v_D2_Value t_d2_arr;
    v_D2_Max integer DEFAULT 0;
    PROCEDURE xxx(INOUT ...) IS
    BEGIN
        ...
        cnt := 1;
        LOOP
            cnt := cnt + 1;
            IF XXX ...
                v_D2_Value(v_D2_Max)(cnt) := 0;
            END IF;
        ...

But it cannot be executed. I modified v_D2_Value(v_D2_Max)(cnt) to
v_D2_Value[v_D2_Max][cnt], so it could be loaded into the database, but when
the package was executed, an error occurred.
I need a way to assign a value to the variable v_D2_Value directly, like
v_D2_Value(v_D2_Max)(cnt) := 0.
Could you give me some advice please?
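
(For comparison: in stock PostgreSQL PL/pgSQL, as opposed to EDB's SPL
packages, the closest equivalent is a two-dimensional array, which uses
[][] subscripts for both reads and assignments; a minimal sketch:)

DO $$
DECLARE
    v_d2 integer[][] := array_fill(0, ARRAY[3,3]);  -- 3x3 array, all zeros
BEGIN
    v_d2[2][3] := 42;   -- element assignment also uses square brackets
    RAISE NOTICE 'v_d2[2][3] = %', v_d2[2][3];
END
$$;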
 
Best Regards

Re: [GENERAL] Pgbouncer help

2013-08-27 Thread Jeff Janes
On Tue, Aug 27, 2013 at 1:34 AM, Yelai, Ramkumar IN BLR STS
ramkumar.ye...@siemens.com wrote:
 HI



 In our current project, we are opening several PostgreSQL connections. A few
 connections are frequently used and a few only occasionally. Hence we plan
 to adopt connection pooling to avoid opening too many connections.  We plan
 to use “Pgbouncer”.  Most of the pgbouncer examples show how to configure it,
 but they do not explain how to use it from C++.



 Please provide an example of how to use it from C++.

pgbouncer is designed to look (to the client) just like a normal
postgresql server.

If you want all connections to the database to go through pgbouncer,
you can move the real server to a different port, and then start up
pgbouncer on that vacated port.  In this case, the clients do not need
to make any changes at all to their configuration.

If you want to keep the real server on the same port as it currently
is and to use a special port to go through pgbouncer, then you need to
change the clients to use that new port number.  You do this the same
way you would change the client to use a different port if that
different port were a regular postgresql server.
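
(Concretely, nothing in the C++ code changes except the connection parameters
handed to whatever client library you already use, e.g. libpq's PQconnectdb();
the host and ports below are only examples:)

Direct to PostgreSQL:  host=dbhost port=5432 dbname=mydb user=myuser
Through pgbouncer:     host=dbhost port=6432 dbname=mydb user=myuser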

Cheers,

Jeff




Re: [GENERAL] how to use aggregate functions in this case

2013-08-27 Thread David Johnston
David Johnston wrote
 
 Janek Sendrowski wrote
 Hi,
 
 thanks for all your answers.
 
 I'll have a try with the contains operator and the intrange, but
 before I'd like to know if this would work:
 
 CASE WHEN a >= 0 AND a < 25
 CASE WHEN a >= 25 AND a < 50
 
 There wouldn't be a double endpoint. I just have to decide which
 range the endpoint includes.
 
 Janek Sendrowski
 Yes, using explicit comparison operators with AND will work just fine;
 it's just a little more verbose, so the other options, if available, are
 preferred from a readability standpoint.  I do not know whether the
 different options may have different performance characteristics.
 
 David J.

And just for completeness:

WITH range_def (low, high, label) AS (
VALUES (0, 25, '0 to 25'::varchar), (25, 50, '25 to 50')
)
SELECT ...
FROM source_data
JOIN range_def ON (val >= low AND val < high)
[the rest of the query]
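
(Filled in with an aggregate, and assuming a hypothetical source_data table
with an integer column val, the complete query might look like:)

WITH range_def (low, high, label) AS (
VALUES (0, 25, '0 to 25'::varchar), (25, 50, '25 to 50')
)
SELECT r.label, count(*) AS cnt
FROM source_data s
JOIN range_def r ON (s.val >= r.low AND s.val < r.high)
GROUP BY r.label
ORDER BY r.label;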

David J.









[GENERAL] regexp idea

2013-08-27 Thread AI Rumman
Hi,

I have a string like:
Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A
final, June 5, 2013

I need to extract date part from the string.

I used the following:
regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )

But it gives me the result August, as it stops at Augustine.

In my case, the date can be in different formats; some records may use a ','
and some may not.

Any idea how to achieve this?

Thanks.


Re: [GENERAL] regexp idea

2013-08-27 Thread David Johnston
rummandba wrote
 Hi,
 
 I have a string like:
 Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A
 final, June 5, 2013
 
 I need to extract date part from the string.
 
 I used the follows:
 regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
 
 But it gives me result August as it stops at Augustine.
 
 In my case, date can be in different formats, some record may use , or
 some may not.
 
 Any idea to achieve this?
 
 Thanks.

Not sure how you expect to match June with that particular expression but
to solve the mis-matching of Augustine you can use the word-boundary
escapes \m (word-start) and \M (word-end).

Unless you need fuzzy matching on the month name you should simply list all
twelve months and possible recognized abbreviations as well.

^.*\m(June|July|August|September)\M[, a-zA-Z0-9]+

I'd consider helping more with forming an actual expression but a single
input sample with zero context on how such a string is created gives little
to work with.

Though after the month there likely cannot be a letter, so a better
definition would be:

\m(August)[, ]+(\d+)[, ]+(\d+)
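
(For instance, run against the sample string from the original post, a sketch
that lists the full month names and captures month, day and year:)

SELECT regexp_matches(
  'Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013',
  E'\\m(January|February|March|April|May|June|July|August|September|October|November|December)\\M[, ]+(\\d+)[, ]+(\\d+)');

 regexp_matches
----------------
 {June,5,2013}
(1 row)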

HTH

David J.








Re: [GENERAL] OLAP

2013-08-27 Thread Alban Hertroys
On Aug 27, 2013, at 19:07, Paul Jungwirth p...@illuminatedcomputing.com wrote:

 Hi Alban,
 
 I think Postgres works great for OLAP work

What do you base that on?
I don't really doubt that the database layer is up to the task, I'm much more 
worried about maintaining the model and the cube data and all that typical OLAP 
stuff that I've mostly just heard about.

 , and Amazon's Redshift is
 even based on Postgres. 100 million sales should be no problem at
 all. My understanding is Greenplum also builds on top of Postgres, so
 if you ever do outgrow your Postgres installation, that would be an
 easy migration path.

What's the benefit of GreenPlum for OLAP? Isn't it a columnar database? And a 
pretty old fork of Postgres at that?
GreenPlum has a pretty steep price-tag too.

I didn't really look into Redshift, perhaps I should…

Anyway, I'm not at all sure I want to use some product that's heavily modified 
from Postgres. If it is, it has to be really really good.

 One Postgres OLAP tool to consider is Pentaho.
 That will save you lots of time around ETL, ad-hoc reporting, and
 other standard OLAP functionality.

How is Pentaho an OLAP tool? Aren't you mixing up a few things?
We already use Pentaho for ETL, so I'm a bit familiar with it. Why do you 
consider it suitable for managing an OLAP database?

How would Pentaho manage cube rollup triggers, business models, dimensions and 
stuff like that?
We don't want to hand code those, that's far too error-prone and far too much 
work to keep track of. That stuff needs to be automated, preferably similar to 
what we're used to from Gentia (well, not me - I can't make heads or tails of 
Gentia, but the person who asked me about PG's suitability has been developing 
with it for years). That's what we're comparing to.

Unfortunately, I can't find any decent information about Gentia for reference. 
Apparently these days they're all about NoSQL databases and such. That's not 
what we have - I guess the clunky GUI is a hint that it's something of the 
past...


BTW, please don't top-post.



Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] Pgbouncer help

2013-08-27 Thread Steve Crawford

On 08/27/2013 10:40 AM, Jeff Janes wrote:


pgbouncer is designed to look (to the client) just like a normal
postgresql server

However...

Since clients are reusing previously accessed server sessions, be sure 
to consider the implications of the different pool types and reset options.


For example, if you have multi-statement transactions you cannot, of 
course, use statement-level pooling since the server connection is 
released after the statement.


And if you set any runtime parameters (set time zone to..., set 
statement timeout..., etc.) then you will probably need to use 
session-level pooling, and you will need to set server_reset_query 
appropriately.  Otherwise you risk either ending up with parameters set 
to unexpected values by a previously connected client, or having 
parameters you set disappear when your next statement is assigned to a 
different server connection.


A similar issue exists if you use temporary tables: you need to be 
sure to stick with the same server connection while your processing 
needs the temporary table, and you need to clean it up when you release 
the connection so it doesn't use extra resources and doesn't interfere 
with statements issued by a subsequent client.
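
(A minimal pgbouncer.ini fragment for that kind of setup, as a sketch;
DISCARD ALL is the usual catch-all reset:)

[pgbouncer]
pool_mode = session
; run before a server connection is handed to the next client
server_reset_query = DISCARD ALL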


For more, see the following if you haven't read them already:
http://pgbouncer.projects.pgfoundry.org/doc/config.html
http://wiki.postgresql.org/wiki/PgBouncer

Cheers,
Steve





Re: [GENERAL] regexp idea

2013-08-27 Thread Rob Sargent

On 08/27/2013 12:44 PM, AI Rumman wrote:

Hi,

I have a string like:
Gloucester Catholic vs. St. Augustine baseball, South Jersey 
Non-Public A final, June 5, 2013


I need to extract date part from the string.

I used the follows:
regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )

But it gives me result August as it stops at Augustine.

In my case, date can be in different formats, some record may use , 
or some may not.


Any idea to achieve this?

Thanks.
select regexp_replace('Gloucester Catholic vs. St. Augustine baseball, 
South Jersey Non-Public A final, June 5, 2013', 
E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)', E'\\2 \\4' );

++
| regexp_replace |
++
| June  5, 2013  |
++
(1 row)





Re: [GENERAL] regexp idea

2013-08-27 Thread AI Rumman
Thanks. That's awesome.
Do you have any good guide where I may get more knowledge on REGEXP?


On Tue, Aug 27, 2013 at 3:57 PM, Rob Sargent robjsarg...@gmail.com wrote:

 On 08/27/2013 12:44 PM, AI Rumman wrote:

 Hi,

 I have a string like:
 Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A
 final, June 5, 2013

 I need to extract date part from the string.

 I used the follows:
 regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )

 But it gives me result August as it stops at Augustine.

 In my case, date can be in different formats, some record may use , or
 some may not.

 Any idea to achieve this?

 Thanks.

 select regexp_replace('Gloucester Catholic vs. St. Augustine baseball,
 South Jersey Non-Public A final, June 5, 2013',
 E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)', E'\\2 \\4' );
 ++
 | regexp_replace |
 ++
 | June  5, 2013  |
 ++
 (1 row)







Re: [GENERAL] regexp idea

2013-08-27 Thread Rob Sargent

On 08/27/2013 02:04 PM, AI Rumman wrote:

Thanks. That's awesome.
Do you have any good guide where I may get more knowledge on REGEXP?




http://www.postgresql.org/docs/9.1/static/functions-matching.html is the 
best starting point


Re: [GENERAL] OLAP

2013-08-27 Thread Jerry Sievers
Alban Hertroys haram...@gmail.com writes:

 On Aug 27, 2013, at 19:07, Paul Jungwirth p...@illuminatedcomputing.com 
 wrote:

 Hi Alban,
 
 I think Postgres works great for OLAP work

 What do you base that on?
 I don't really doubt that the database layer is up to the task, I'm much more 
 worried about maintaining the model and the cube data and all that typical 
 OLAP stuff that I've mostly just heard about.

 , and Amazon's Red Shift is
 even based on Postgres. 100 million sales should be not problem at
 all. My understanding is Greenplum also builds on top of Postgres, so
 if you ever do outgrow your Postgres installation, that would be an
 easy migration path.

 What's the benefit of GreenPlum for OLAP? Isn't it a columnar database? And a 
 pretty old fork of Postgres at that?
 GreenPlum has a pretty steep price-tag too.

Vertica is another case of an analytics-focused platform that derived
from Postgres, version 8.0 IIRC.

By the time I first looked at it, about 4 years ago, it only
superficially resembled Postgres.  Performance was absolutely
stunning in terms of how quickly it processed queries over certain
kinds of data, provided the set of queries expected to be run
over that data was identifiable in advance.

Sample queries are given to a modeler, which in turn creates a set of
projections: physical manifestations of the backend storage
intended to optimize for this specialized workload.

Vertica and, I presume, Greenplum are *not* well suited for an OLTP
role, so it takes a fair amount of learning to make good use of them.

Just FWIW.

Re: [GENERAL] pg_extension_config_dump() with a sequence

2013-08-27 Thread Moshe Jacobson
On Tue, Aug 20, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:


 Well, I think you did it wrong, or else you're using a PG version that
 predates some necessary fix, because it works for me.



 To debug, you might try looking in pg_extension to see if the extconfig
 entry for your extension includes the OID of the sequence.  If not, you
 messed up somehow in updating the extension.  If so, you must need a
 newer version of pg_dump (you did not answer the question what version
 you're using).


Sorry for the delayed response. I am using postgres 9.1.4 with pg_dump of
the same version.

I did basically the same thing as you, and it didn't work for me:

I created a simple extension myext as follows:

CREATE SEQUENCE sq_pk_myitem;
CREATE TABLE tb_myitem
(
myitem integer primary key default nextval('sq_pk_myitem'),
data text
);

SELECT pg_catalog.pg_extension_config_dump('tb_myitem', '');
SELECT pg_catalog.pg_extension_config_dump('sq_pk_myitem', '');


Then I created a database for it and installed it:

postgres@moshe=devmain:postgres=# create database mydb;
CREATE DATABASE
postgres@moshe=devmain:postgres=# \c mydb
You are now connected to database mydb as user postgres.
postgres@moshe=devmain:mydb=# create extension myext;
CREATE EXTENSION
postgres@moshe=devmain:mydb=# \d tb_myitem
                          Table "public.tb_myitem"
 Column |  Type   |                     Modifiers
--------+---------+----------------------------------------------------
 myitem | integer | not null default nextval('sq_pk_myitem'::regclass)
 data   | text    |
Indexes:
    "tb_myitem_pkey" PRIMARY KEY, btree (myitem)
postgres@moshe=devmain:mydb=# \dx+ myext
Objects in extension "myext"
  Object Description
----------------------
 sequence sq_pk_myitem
 table tb_myitem
(2 rows)

postgres@moshe=devmain:mydb=# \q
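
(As Tom suggested earlier, whether the sequence is registered in extconfig
can be checked with a query along these lines, a sketch:)

SELECT e.extname, c.oid::regclass AS config_relation
FROM pg_extension e
JOIN pg_class c ON c.oid = ANY (e.extconfig)
WHERE e.extname = 'myext';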


Then I tried to pg_dump it:

(0)(0j)[jehsom@moshe ~]$ pg_dump -U postgres mydb
--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


--
-- Name: hstore; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS hstore WITH SCHEMA public;


--
-- Name: EXTENSION hstore; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION hstore IS 'data type for storing sets of (key, value)
pairs';


--
-- Name: myext; Type: EXTENSION; Schema: -; Owner:
--

CREATE EXTENSION IF NOT EXISTS myext WITH SCHEMA public;


--
-- Name: EXTENSION myext; Type: COMMENT; Schema: -; Owner:
--

COMMENT ON EXTENSION myext IS 'my extension';


SET search_path = public, pg_catalog;

--
-- Data for Name: sq_pk_myitem; Type: TABLE DATA; Schema: public; Owner:
postgres
--

COPY sq_pk_myitem  FROM stdin;
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  cannot copy from sequence
sq_pk_myitem
pg_dump: The command was: COPY public.sq_pk_myitem  TO stdout;
(1)(0j)[jehsom@moshe ~]$


And I got the error here. I'm not sure why this happens because it doesn't
happen on another server here. Any help would be appreciated.

Thanks!

-- 
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
mo...@neadwerx.com | www.neadwerx.com

Quality is not an act, it is a habit. -- Aristotle


Re: [GENERAL] OLAP

2013-08-27 Thread Pavel Stehule
On 28 Aug 2013 0:05, Jerry Sievers gsiever...@comcast.net wrote:

 Alban Hertroys haram...@gmail.com writes:

  On Aug 27, 2013, at 19:07, Paul Jungwirth p...@illuminatedcomputing.com
wrote:
 
  Hi Alban,
 
  I think Postgres works great for OLAP work
 
  What do you base that on?
  I don't really doubt that the database layer is up to the task, I'm
much more worried about maintaining the model and the cube data and all
that typical OLAP stuff that I've mostly just heard about.
 
  , and Amazon's Red Shift is
  even based on Postgres. 100 million sales should be not problem at
  all. My understanding is Greenplum also builds on top of Postgres, so
  if you ever do outgrow your Postgres installation, that would be an
  easy migration path.
 
  What's the benefit of GreenPlum for OLAP? Isn't it a columnar database?
And a pretty old fork of Postgres at that?
  GreenPlum has a pretty steep price-tag too.

  Vertica is another case of an analytics-focused platform that derived
  from Postgres, version 8.0 IIRC

Vertica uses a similar interface, but internally uses nothing from pg. It
was written from scratch.


Re: [GENERAL] OLAP

2013-08-27 Thread Carlos Saritama
According to what you write, Pentaho best fits your needs.



Re: [GENERAL] OLAP

2013-08-27 Thread Carlos Saritama
Check out Saiku, the future of open-source interactive OLAP
(http://analytical-labs.com).



Re: [GENERAL] Is there any method to limit resource usage in PG?

2013-08-27 Thread 高健
Hi:

Now the situation is this:
In the testing environment,
even when my customer changed shared_buffers from 1024MB to 712MB or 512MB,
the total memory consumption is still almost the same.

I think that PG always uses as much resource as it can.
For a query-and-insert action,
firstly, the data is pulled into the private memory of the backend process
which is serving the client.
Then the backend process pushes the data into shared memory, here into
shared_buffers.
If shared_buffers is not big enough to hold all the result data, then
part of the data will be in shared_buffers,
and the rest will remain in the backend process's memory.

Is my understanding right?

Best Regards


2013/8/27 Jeff Janes jeff.ja...@gmail.com

 On Sun, Aug 25, 2013 at 11:08 PM, 高健 luckyjack...@gmail.com wrote:
  Hello:
 
  Sorry for disturbing.
 
  I am now encountering a serious problem: memory is not enough.
 
  My customer reported that when they run a program they found the total
  memory and disk i/o usage all reached the threshold value (80%).
 
  That program is written by Java.
  It uses JDBC to pull data out of the DB, with a query that joins some
  tables together; it will return about 3,000,000 records.
  Then the program will use JDBC again to write the records row by row,
  to insert into another table in the DB.

 What is using the memory, the postgres backend or the client program?

 Cheers,

 Jeff



Re: [GENERAL] What is the relationship between checkpoint and wal

2013-08-27 Thread 高健
Hello

Thanks for replying.

It is really a complicated concept.
So I think that in a mission-critical environment, it is not a good choice
to turn full_page_writes off.

Best Regards


2013/8/27 Jeff Janes jeff.ja...@gmail.com

 On Sun, Aug 25, 2013 at 7:57 PM, 高健 luckyjack...@gmail.com wrote:
  Hi :
 
  Thanks to Alvaro! Sorry for replying lately.
 
  I have understood a little about it.
 
  But the description of full_page_writes made me even more confused. Sorry
  that maybe I go to another problem:
 
  It is said:
 
 http://www.postgresql.org/docs/9.2/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES
  --
  When this parameter is on, the PostgreSQL server writes the entire
 content
  of each disk page to WAL during the first modification of that page
 after a
  checkpoint. This is needed because a page write that is in process
 during an
  operating system crash might be only partially completed, leading to an
  on-disk page that contains a mix of old and new data.
  ---
 
  Let me imagine that:
  On a disk page, there are following data:
 
  id=1 val=1 with transaction id  of 1001
  id=2 val=2 with transaction id  of 1002
  id=3 val=3 with transaction id  of 1003
 
  If I start the DB,
  and begin with transaction id 2002 dealing with the data of id=2, making
  val 20,
  then with transaction id 2003 dealing with the data of id=3, making val 30.
 
  If with full_page_writes = off,
  when my checkpoint occurs, it succeeds with transaction 2002 but fails
  with 2003 because of a crash.

 A checkpoint either succeeds or fails.  It cannot succeed with some
 transactions and fail with others.

  Then disk page will be of:
 
  id=1 val=1 with transaction id  of 1001--maybe this is the very old
 data
  id=2 val=20 with transaction id  of 2002--This is now new data
  id=3 val=3 with transaction id  of 1003--This is old data.


 Postgres does not do in-place updates, it marks the old row as
 obsolete and creates a new one.

 id=1 val= 1 with transaction id  of 1001
 id=2 val= 2 with transaction id  of 1002 xmax of 2002
 id=3 val= 3 with transaction id  of 1003 xmax of 2003
 id=2 val=20 with transaction id  of 2002
 id=3 val=30 with transaction id of 2003

 Of course the whole point of a torn page write is that you don't know how
 much got written, so you don't know what is actually on the disk.

  When the DB restarts from the crash,
  I think that there are WAL data for transaction ids 2002 and 2003, because
  WAL is written to wal_buffer before data is written to shared_buffers.

 The wal is written out of wal_buffer and flushed, before the
 corresponding block is written out of shared_buffer.

  So if the online WAL files are OK, there will be no data lost, and
  roll-forward and roll-back can happen.
  If some online WAL file is damaged during the crash,
  there might be some data lost, but if we have archived logs, we can restore
  back up to the archived WAL's latest transaction id.

 Most WAL records would have no problem being applied to a block that
 is an otherwise uncorrupted mix of old and new.

 But some WAL records have instructions that amount to "grab 134 bytes
 from offset 7134 in the block and move them to offset 1623".  If the
 block is an unknown mix of old and new data, that can't be carried out
 safely.

 Cheers,

 Jeff



Re: [GENERAL] pg_extension_config_dump() with a sequence

2013-08-27 Thread Tom Lane
Moshe Jacobson mo...@neadwerx.com writes:
 On Tue, Aug 20, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, I think you did it wrong, or else you're using a PG version that
 predates some necessary fix, because it works for me.

 Sorry for the delayed response. I am using postgres 9.1.4 with pg_dump of
 the same version.

Ah.  I think that you are missing this 9.1.7 fix:

commit 5110a96992e508b220a7a6ab303b0501c4237b4a
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Fri Oct 26 12:12:53 2012 -0400

In pg_dump, dump SEQUENCE SET items in the data not pre-data section.

Represent a sequence's current value as a separate TableDataInfo dumpable
object, so that it can be dumped within the data section of the archive
rather than in pre-data.  This fixes an undesirable inconsistency between
the meanings of --data-only and --section=data, and also fixes dumping
of sequences that are marked as extension configuration tables, as per a
report from Marko Kreen back in July.  The main cost is that we do one more
SQL query per sequence, but that's probably not very meaningful in most
databases.

Back-patch to 9.1, since it has the extension configuration issue even
though not the --section switch.


regards, tom lane




Re: [GENERAL] OLAP

2013-08-27 Thread Jayadevan
Alban Hertroys-4 wrote
 How is Pentaho an OLAP tool? Aren't you mixing up a few things?
 We already use Pentaho for ETL, so I'm a bit familiar with it. Why do you
 consider it suitable for managing an OLAP database?
 
 How would Pentaho manage cube rollup triggers, business models, dimensions
 and stuff like that?
 We don't want to hand code those, that's far too error-prone and far too
 much work to keep track of. That stuff needs to be automated, preferably
 similar to what we're used to from Gentia (well, not me - I can't make
 heads or tails of Gentia, but the person who asked me about PG's
 suitability has been developing with it for years). That's what we're
 comparing to.

Pentaho, Jasper and Actuate are the common OLAP tools. Pentaho's ETL tool
(Kettle) is more popular than their reporting solution. Similarly, Jasper's
reporting tool is more popular than their ETL (they integrate Talend). The
OLAP functionality in both comes from Mondrian, as far as I know.
The cubes, dimensions etc. can be defined using a UI tool and uploaded to the
reporting server, either Jasper or Pentaho. These typically support MDX,
in addition to standard SQL.
http://en.wikipedia.org/wiki/MultiDimensional_eXpressions







Re: [GENERAL] OLAP

2013-08-27 Thread bricklen
On Tue, Aug 27, 2013 at 3:04 PM, Jerry Sievers gsiever...@comcast.net wrote:

 Vertica is another case of an analytics-focused platform that derived
 from Postgres, version 8.0 IIRC.


FWIW,
You might be thinking of Greenplum, and I think it forked from pg 8.1.
Vertica was built from scratch, it is not using Postgres as the basis for
the db. This is according to the engineers that I have spoken to in the
past at Vertica. I believe there was some code ported over for their
command line client vsql, but I could be wrong about that. At the very
least, vsql looks and acts like psql from a user's point of view.

I agree absolutely with you about the dazzling speed of certain types of
queries -- one of my tests was a simple aggregate query (a bunch of SUM()
and GROUP BY's) over about 500M rows and it returned results in a handful
of seconds. The same query in our Postgresql server with 2x the number of
disks took handfuls of minutes IIRC. On the flip side, I found some of the
complex Window functions performed on par or better in Postgresql than in
Vertica.


Re: [GENERAL] Is there any method to limit resource usage in PG?

2013-08-27 Thread John R Pierce

On 8/27/2013 6:49 PM, 高健 wrote:

For a query and insert action,
firstly, the data is pulled into the private memory of the backend
process which is serving the client.



if you're returning a single result of 3 million records, yes, you're 
going to need memory to store that entire result set before you can do 
anything with it.


again, if you're just writing this data into another table, why not do 
it all in SQL ?


INSERT INTO newtable (field1,field2,field3,...) SELECT your 
complex 3 million row query here;


that will do the whole thing without having to move any data into client 
space.   this will be faster and more memory efficient.


now, if your Java client HAS to process the data it's selecting and do 
complex stuff with it that you just don't think SQL can/should do, then 
you'll need to use a CURSOR.


DECLARE cursorname CURSOR FOR SELECT your messy query here;

then loop on ...
FETCH 100 FROM cursorname;

to return the next 100 rows of this cursor.  once you're done with it,

CLOSE cursorname;

will delete the cursor.
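
(Note that a plain, non-holdable cursor only lives inside a transaction
block, so the whole exchange looks roughly like this; the cursor name is
arbitrary and the query body is elided:)

BEGIN;
DECLARE bigscan CURSOR FOR SELECT your messy query here;
FETCH 100 FROM bigscan;   -- repeat until it returns no rows
CLOSE bigscan;
COMMIT;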

I suggest you read the specifics of these cursor commands to fully 
understand them.




--
john r pierce  37N 122W
somewhere on the middle of the left coast


