[GENERAL] SR: pg_xlog
Hello, I have a question regarding streaming replication: when does pg_xlog need to be shipped in streaming replication, and when not? I have faced different issues in two existing setups: in the first setup, a timeline issue arose when shipping the pg_xlogs; in the other setup, when not shipping the log files I get a FATAL error: "the database system is starting up". Regards
Re: [GENERAL] SR: pg_xlog
salah jubeh wrote: I have a question regarding streaming replication: when does pg_xlog need to be shipped in streaming replication, and when not? I have faced different issues in two existing setups: in the first setup, a timeline issue arose when shipping the pg_xlogs; in the other setup, when not shipping the log files I get a FATAL error: "the database system is starting up".

You don't need log shipping for streaming replication as such. The timeline problem should be solved in PostgreSQL 9.3. But it is still a good idea to use log shipping: as you describe, you will usually need archived WALs to start replication, unless the base backup is really recent. Another case is when there is an interruption or the standby cannot replicate fast enough. Then replication will fall behind and might need WAL archives to catch up. Yours, Laurenz Albe -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
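For reference, a combined streaming-plus-archiving setup of the kind described above might look like the following sketch (9.x-era syntax; all host names, user names, and paths are illustrative placeholders, not values from this thread):

```
# postgresql.conf on the primary -- archive WAL so the standby can catch up
wal_level = hot_standby
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'

# recovery.conf on the standby
standby_mode = 'on'

# streaming replication: connect to the primary and stream WAL directly
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'

# fallback to archived WAL when streaming is interrupted or falls behind
restore_command = 'cp /mnt/wal_archive/%f "%p"'
```

With restore_command in place the standby can recover from the archive instead of failing with "the database system is starting up" when streaming alone cannot supply the needed WAL.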
[GENERAL] OLAP
Hi all, At work we have a system that's built on top of a proprietary OLAP database system (Rocket Gentia). We're looking into replacing that system with something that's still supported, in such a way that we can also access the data from our reporting software (WebFOCUS by Information Builders). Because I'm always advocating PG, I was asked whether PG would be suitable for this, but I'm not really familiar enough with OLAP databases to be able to comment on that.

I got three prerequisites for a solution, namely: 1. It must contain correct information, 2. It must be fast, and 3. It must be easy to maintain the data and the models; that's a task for a 3rd-party back-end application, but it would be helpful to be able to name something to the higher-ups. Next to that, because we're also going to access the system using our reporting software (which has read-only access), it would be best if the entire data model and all the business rules are stored inside the database, so that we're looking at the data in the same way that the back-end sees it.

For size, we're looking at about 20 years of sales and shipment data from all over the world (although mostly in Europe), for about 5 million products sold per year. I suspect there might be some middleware that handles the models, dimensions, and such, and manages triggers on relational tables in PG, or a setup like that. I've seen an old reference to Cybertec OLAP, but they don't seem to carry a product like that, judging from their site. I'm looking for suggestions for something that would be able to do this. Cheers, Alban. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
Re: [GENERAL] OLAP
Hi Alban, I think Postgres works great for OLAP work, and Amazon's Redshift is even based on Postgres. 100 million sales should be no problem at all. My understanding is that Greenplum also builds on top of Postgres, so if you ever do outgrow your Postgres installation, that would be an easy migration path. One Postgres OLAP tool to consider is Pentaho. That will save you lots of time around ETL, ad-hoc reporting, and other standard OLAP functionality. Good luck! Paul

On Tue, Aug 27, 2013 at 8:12 AM, Alban Hertroys haram...@gmail.com wrote: [original message quoted in full; snipped]
-- Pulchritudo splendor veritatis.
[GENERAL] Pgbouncer help
Hi, In our current project we are opening several PostgreSQL connections. A few connections are frequently used and a few are used only occasionally. Hence we plan to adopt connection pooling to avoid opening more connections, and we plan to use Pgbouncer. Most of the pgbouncer examples show how to configure it, but they do not explain how to use it from C++. Please provide me an example of how to use it in C++. Thanks Regards, Ramkumar
[GENERAL] A QUESTION ABOUT evaluation of two-dimensional Associative Arrays
Hello This is Robin. I'm learning about the package feature of EnterpriseDB. I wrote a package like:

CREATE OR REPLACE PACKAGE BODY pkg_test_d2 IS
  TYPE t_d2_tab IS TABLE OF integer INDEX BY integer;
  TYPE t_d2_arr IS TABLE OF t_d2_tab INDEX BY integer;
  v_D2_Value t_d2_arr;
  v_D2_Max integer DEFAULT 0;
  PROCEDURE xxx(INOUT ...) IS
  BEGIN
    ...
    cnt := 1;
    LOOP
      cnt := cnt + 1;
      IF XXX
        v_D2_Value(v_D2_Max)(cnt) := 0;
      END IF;

But it cannot be executed. I modified v_D2_Value(v_D2_Max)(cnt) to v_D2_Value[v_D2_Max][cnt], so it can be loaded into the database, but when the package is executed an error occurs. I need a way to assign a value to the variable v_D2_Value directly, like v_D2_Value(v_D2_Max)(cnt) := 0. Could you give me some advice please? Best Regards
Re: [GENERAL] Pgbouncer help
On Tue, Aug 27, 2013 at 1:34 AM, Yelai, Ramkumar IN BLR STS ramkumar.ye...@siemens.com wrote: [quoted question snipped]

pgbouncer is designed to look (to the client) just like a normal PostgreSQL server. If you want all connections to the database to go through pgbouncer, you can move the real server to a different port, and then start up pgbouncer on the vacated port. In this case, the clients do not need to make any changes at all to their configuration. If you want to keep the real server on the same port it currently uses, and to use a special port to go through pgbouncer, then you need to change the clients to use that new port number. You do this the same way you would change the client to use a different port if that different port were a regular PostgreSQL server. Cheers, Jeff
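To make the above concrete, a minimal pgbouncer.ini for the "move the real server, put pgbouncer on the vacated port" setup might look like the following sketch; all addresses, ports, database names, and paths here are illustrative placeholders, not values from this thread:

```
; pgbouncer.ini -- example values only
[databases]
; clients connect to "mydb" on pgbouncer's port; pgbouncer forwards
; to the real server, which here has been moved to port 5433
mydb = host=127.0.0.1 port=5433 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 5432          ; the port the clients already use
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session
```

The C++ side then needs no pgbouncer-specific code at all: an ordinary libpq connection string such as `host=127.0.0.1 port=5432 dbname=mydb` works unchanged, because pgbouncer speaks the normal PostgreSQL wire protocol.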
Re: [GENERAL] how to use aggregate functions in this case
David Johnston wrote: Janek Sendrowski wrote: Hi, thanks for all your answers. I'll have a try with the contains operator and the intrange, but before that I'd like to know if this would work:

CASE WHEN a >= 0 AND a < 25 ...
CASE WHEN a >= 25 AND a < 50 ...

There wouldn't be a double endpoint. I just have to decide which range the endpoint includes. Janek Sendrowski

Yes, using explicit comparison operators with AND will work just fine; it's just a little more verbose, so the other options, if available, are preferred from a readability standpoint. I do not know whether the different options may have different performance characteristics. David J.

And just for completeness:

WITH range_def (low, high, label) AS (
  VALUES (0, 25, '0 to 25'::varchar),
         (25, 50, '25 to 50')
)
SELECT ... FROM source_data
JOIN range_def ON (val >= low AND val < high)
[the rest of the query]

David J.
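As an aside, the half-open convention discussed above (each boundary belongs to exactly one range, so there is no double endpoint) can be sketched outside SQL as well. A small illustrative Python version, with example edges and labels rather than anything from the thread:

```python
import bisect

# Half-open buckets [0, 25) and [25, 50): every boundary value
# belongs to exactly one bucket, mirroring "val >= low AND val < high".
edges = [0, 25, 50]
labels = ['0 to 25', '25 to 50']

def bucket(val):
    """Return the label of the half-open bucket containing val, or None."""
    # bisect_right gives the insertion point after any equal edge,
    # so an exact boundary (e.g. 25) falls into the *next* bucket.
    i = bisect.bisect_right(edges, val) - 1
    if i < 0 or i >= len(labels):
        return None  # below the first edge or at/above the last edge
    return labels[i]
```

Calling bucket(25) returns '25 to 50', never '0 to 25', which is exactly the "which range includes the endpoint" decision made once instead of per CASE branch.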
[GENERAL] regexp idea
Hi, I have a string like: Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013 I need to extract the date part from the string. I used the following: regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' ) But it gives me the result "August", as it stops at "Augustine". In my case, the date can be in different formats; some records may use a comma and some may not. Any idea how to achieve this? Thanks.
Re: [GENERAL] regexp idea
rummandba wrote: Hi, I have a string like: Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013 I need to extract the date part from the string. I used the following: regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' ) But it gives me the result "August", as it stops at "Augustine". In my case, the date can be in different formats; some records may use a comma and some may not. Any idea how to achieve this? Thanks.

Not sure how you expect to match June with that particular expression, but to solve the mis-matching of "Augustine" you can use the word-boundary escapes \m (word-start) and \M (word-end). Unless you need fuzzy matching on the month name, you should simply list all twelve months and any recognized abbreviations as well. ^.*\m(June|July|August|September)\M[, a-zA-Z0-9]+ I'd consider helping more with forming an actual expression, but a single input sample with zero context on how such a string is created gives little to work with. Though after the month there likely cannot be a letter, so a better definition would be: \m(August)[, ]+(\d)+[, ]+(\d+) HTH David J.
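For what it's worth, the same word-boundary fix carries over to other regex engines. A quick Python sketch (Python uses \b where Postgres uses \m/\M; the sample string is the one from the thread, the full month list is as suggested above):

```python
import re

title = ("Gloucester Catholic vs. St. Augustine baseball, "
         "South Jersey Non-Public A final, June 5, 2013")

# \b requires a word boundary on both sides of the month name, so
# "August" can no longer match inside "Augustine".
months = (r'January|February|March|April|May|June|July|August|'
          r'September|October|November|December')
m = re.search(r'\b(%s)\b +(\d{1,2}),? +(\d{4})' % months, title)
print(m.groups())  # ('June', '5', '2013')
```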
Re: [GENERAL] OLAP
On Aug 27, 2013, at 19:07, Paul Jungwirth p...@illuminatedcomputing.com wrote: Hi Alban, I think Postgres works great for OLAP work

What do you base that on? I don't really doubt that the database layer is up to the task; I'm much more worried about maintaining the model and the cube data and all that typical OLAP stuff that I've mostly just heard about.

, and Amazon's Redshift is even based on Postgres. 100 million sales should be no problem at all. My understanding is Greenplum also builds on top of Postgres, so if you ever do outgrow your Postgres installation, that would be an easy migration path.

What's the benefit of Greenplum for OLAP? Isn't it a columnar database? And a pretty old fork of Postgres at that? Greenplum has a pretty steep price-tag too. I didn't really look into Redshift, perhaps I should… Anyway, I'm not at all sure I want to use some product that's heavily modified from Postgres. If it is, it has to be really, really good.

One Postgres OLAP tool to consider is Pentaho. That will save you lots of time around ETL, ad-hoc reporting, and other standard OLAP functionality.

How is Pentaho an OLAP tool? Aren't you mixing up a few things? We already use Pentaho for ETL, so I'm a bit familiar with it. Why do you consider it suitable for managing an OLAP database? How would Pentaho manage cube rollup triggers, business models, dimensions and stuff like that? We don't want to hand-code those; that's far too error-prone and far too much work to keep track of. That stuff needs to be automated, preferably similar to what we're used to from Gentia (well, not me - I can't make heads or tails of Gentia, but the person who asked me about PG's suitability has been developing with it for years). That's what we're comparing to. Unfortunately, I can't find any decent information about Gentia for reference. Apparently these days they're all about NoSQL databases and such.
That's not what we have - I guess the clunky GUI is a hint that it's something of the past... BTW, please don't top-post. On Tue, Aug 27, 2013 at 8:12 AM, Alban Hertroys haram...@gmail.com wrote: [original message quoted in full; snipped]
Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Re: [GENERAL] Pgbouncer help
On 08/27/2013 10:40 AM, Jeff Janes wrote: On Tue, Aug 27, 2013 at 1:34 AM, Yelai, Ramkumar IN BLR STS ramkumar.ye...@siemens.com wrote: [quoted question snipped] pgbouncer is designed to look (to the client) just like a normal postgresql server

However... Since clients are reusing previously accessed server sessions, be sure to consider the implications of the different pool types and reset options. For example, if you have multi-statement transactions you cannot, of course, use statement-level pooling, since the server connection is released after each statement. And if you set any runtime parameters (set time zone to..., set statement timeout..., etc.) then you will probably need to use session-level pooling, and you will need to set server_reset_query appropriately; otherwise you risk either ending up with parameters set to unexpected values by a previously connected client, or having parameters you set disappear when your next statement is assigned to a different server connection. A similar issue exists if you use temporary tables: you need to be sure to stick with the same server connection while your processing needs the temporary table, and you need to clean it up when you release the connection so it doesn't use extra resources and doesn't interfere with statements issued by a subsequent client.
For more, see the following if you haven't read them already: http://pgbouncer.projects.pgfoundry.org/doc/config.html http://wiki.postgresql.org/wiki/PgBouncer Cheers, Steve
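A sketch of the two settings discussed above, as they might appear in pgbouncer.ini (the values are illustrative, not recommendations from this thread):

```
; pgbouncer.ini fragment -- example values only
[pgbouncer]
; session pooling: a client keeps its server connection until it
; disconnects, which accommodates multi-statement transactions plus
; SET parameters and temporary tables that must survive between statements
pool_mode = session

; reset session state before handing the connection to the next client,
; so leftover SET values and temp tables don't leak between clients
server_reset_query = DISCARD ALL
```

DISCARD ALL is a standard PostgreSQL command that drops temporary tables and resets session parameters, which is why it is a common choice for server_reset_query.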
Re: [GENERAL] regexp idea
On 08/27/2013 12:44 PM, AI Rumman wrote: [quoted question snipped]

select regexp_replace('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013',
       E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)',
       E'\\2 \\4');

 regexp_replace
----------------
 June 5, 2013
(1 row)
Re: [GENERAL] regexp idea
Thanks. That's awesome. Do you have any good guide where I can get more knowledge on regular expressions? On Tue, Aug 27, 2013 at 3:57 PM, Rob Sargent robjsarg...@gmail.com wrote: [quoted message snipped]
Re: [GENERAL] regexp idea
On 08/27/2013 02:04 PM, AI Rumman wrote: Thanks. That's awesome. Do you have any good guide where I may get more knowledge on REGEXP? [earlier quoted messages snipped] http://www.postgresql.org/docs/9.1/static/functions-matching.html is the best starting point
Re: [GENERAL] OLAP
Alban Hertroys haram...@gmail.com writes: [earlier discussion quoted; snipped] What's the benefit of Greenplum for OLAP? Isn't it a columnar database? And a pretty old fork of Postgres at that? Greenplum has a pretty steep price-tag too.

Vertica is another case of an analytics-focused platform that derived from Postgres, version 8.0 IIRC. By the time I first looked at it, about 4 years ago, it only superficially resembled Postgres. Performance was absolutely shocking in terms of how quickly it processed queries over certain kinds of data... provided the set of expected queries to be run over that data was identifiable in advance. Sample queries are given to a modeler, which in turn creates a set of projections: physical manifestations of the backend storage intended to optimize for this specialized workload. Vertica, and I presume Greenplum, are *not* well suited to an OLTP role, so it takes a fair amount of learning to make good use of them. Just FWIW. [rest of quoted message snipped]
Re: [GENERAL] pg_extension_config_dump() with a sequence
On Tue, Aug 20, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, I think you did it wrong, or else you're using a PG version that predates some necessary fix, because it works for me. To debug, you might try looking in pg_extension to see if the extconfig entry for your extension includes the OID of the sequence. If not, you messed up somehow in updating the extension. If so, you must need a newer version of pg_dump (you did not answer the question what version you're using).

Sorry for the delayed response. I am using postgres 9.1.4 with pg_dump of the same version. I did basically the same thing as you, and it didn't work for me. I created a simple extension myext as follows:

CREATE SEQUENCE sq_pk_myitem;

CREATE TABLE tb_myitem (
    myitem integer primary key default nextval('sq_pk_myitem'),
    data   text
);

SELECT pg_catalog.pg_extension_config_dump('tb_myitem', '');
SELECT pg_catalog.pg_extension_config_dump('sq_pk_myitem', '');

Then I created a database for it and installed it:

postgres@moshe=devmain:postgres=# create database mydb;
CREATE DATABASE
postgres@moshe=devmain:postgres=# \c mydb
You are now connected to database mydb as user postgres.
postgres@moshe=devmain:mydb=# create extension myext;
CREATE EXTENSION
postgres@moshe=devmain:mydb=# \d tb_myitem
                         Table "public.tb_myitem"
 Column |  Type   |                     Modifiers
--------+---------+----------------------------------------------------
 myitem | integer | not null default nextval('sq_pk_myitem'::regclass)
 data   | text    |
Indexes:
    "tb_myitem_pkey" PRIMARY KEY, btree (myitem)

postgres@moshe=devmain:mydb=# \dx+ myext
Objects in extension "myext"
   Object Description
-------------------------
 sequence sq_pk_myitem
 table tb_myitem
(2 rows)

postgres@moshe=devmain:mydb=# \q

Then I tried to pg_dump it:

(0)(0j)[jehsom@moshe ~]$ pg_dump -U postgres mydb
--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner:
--
CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;

--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner:
--
COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';

--
-- Name: hstore; Type: EXTENSION; Schema: -; Owner:
--
CREATE EXTENSION IF NOT EXISTS hstore WITH SCHEMA public;

--
-- Name: EXTENSION hstore; Type: COMMENT; Schema: -; Owner:
--
COMMENT ON EXTENSION hstore IS 'data type for storing sets of (key, value) pairs';

--
-- Name: myext; Type: EXTENSION; Schema: -; Owner:
--
CREATE EXTENSION IF NOT EXISTS myext WITH SCHEMA public;

--
-- Name: EXTENSION myext; Type: COMMENT; Schema: -; Owner:
--
COMMENT ON EXTENSION myext IS 'my extension';

SET search_path = public, pg_catalog;

--
-- Data for Name: sq_pk_myitem; Type: TABLE DATA; Schema: public; Owner: postgres
--
COPY sq_pk_myitem FROM stdin;
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: cannot copy from sequence "sq_pk_myitem"
pg_dump: The command was: COPY public.sq_pk_myitem TO stdout;
(1)(0j)[jehsom@moshe ~]$

And I got the error here. I'm not sure why this happens, because it doesn't happen on another server here. Any help would be appreciated. Thanks!
-- Moshe Jacobson Nead Werx, Inc. | Manager of Systems Engineering 2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339 mo...@neadwerx.com | www.neadwerx.com Quality is not an act, it is a habit. -- Aristotle
Re: [GENERAL] OLAP
On 28. 8. 2013 at 0:05, Jerry Sievers gsiever...@comcast.net wrote: [earlier discussion quoted; snipped] Vertica is another case of an analytics-focused platform that derived from Postgres, version 8.0 IIRC

Vertica uses a similar interface, but internally uses nothing from PG; it was written from scratch. [rest of quoted message snipped]
That will save you lots of time around ETL, ad-hoc reporting, and other standard OLAP functionality. How is Pentaho an OLAP tool? Aren't you mixing up a few things? We already use Pentaho for ETL, so I'm a bit familiar with it. Why do you consider it suitable for managing an OLAP database? How would Pentaho manage cube rollup triggers, business models, dimensions and stuff like that? We don't want to hand code those, that's far too error-prone and far too much work to keep track of. That stuff needs to be automated, preferably similar to what we're used to from Gentia (well, not me - I can't make heads or tails of Gentia, but the person who asked me about PG's suitability has been developing with it for years). That's what we're comparing to. Unfortunately, I can't find any decent information about Gentia for reference. Apparently these days they're all about NoSQL databases and such. That's not what we have - I guess the clunky GUI is a hint that it's something of the past... BTW, please don't top-post. On Tue, Aug 27, 2013 at 8:12 AM, Alban Hertroys haram...@gmail.com wrote: Hi all, At work we have a system that's built on top of a proprietary OLAP database system (Rocket Gentia). We're looking into replacing that system with something that's still supported and in such a way that we can also access the data from our reporting software (WebFOCUS by information builders). Because I'm always advocating PG, I was asked whether PG would be suitable for this, but I'm not really familiar enough with OLAP databases to be able to comment on that. I got three prerequisites for a solution, namely: 1. It must contain correct information, 2. It must be fast and 3. It must be easy to maintain the data and the models; that's a task for a 3rd party back-end application, but it would be helpful to be able to name something to the higher-ups. 
Next to that, because we're also going to access the system using our reporting software (which is read-only access), it would be best if the entire data model and all the business rules are stored inside the database, so that we're looking at the data in the same way that the back-end sees it. For size, we're looking at about 20 years of sales and shipment data all over the world (although mostly in Europe) for about 5mln sold products per year. I suspect there might be some middleware that handles the models and dimensions and stuff and manages triggers on relational tables in PG, or a setup like that. I've seen an old reference to Cybertec OLAP, but they don't seem to carry a product like that when I look at their site. I'm looking for suggestions for something that would be able to do this. Cheers, Alban. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest. -- _ Pulchritudo splendor veritatis.
Re: [GENERAL] OLAP
According to what you write, Pentaho best fits your needs.

On Tue, Aug 27, 2013 at 5:52 PM, Pavel Stehule pavel.steh...@gmail.com wrote: [snip - full quote of the earlier messages trimmed]
Re: [GENERAL] OLAP
Check out Saiku, the future of Open Source Interactive OLAP ( http://analytical-labs.com ).

On Tue, Aug 27, 2013 at 8:34 PM, Carlos Saritama cssarit...@gmail.com wrote: [snip - full quote of the earlier messages trimmed]
Re: [GENERAL] Is there any method to limit resource usage in PG?
Hi: Now the situation goes like this: in the testing environment, even when my customer changed shared_buffers from 1024MB to 712MB or 512MB, the total memory consumption stayed almost the same. I think that PG always uses as much resource as it can.

For a query-and-insert action: first, the data is pulled into the private memory of the backend process that is servicing the client. Then the backend process pushes the data into shared memory, i.e. into shared_buffers. If shared_buffers is not big enough to hold all the result data, then part of the data will be in shared_buffers and the rest will remain in the backend process's memory. Is my understanding right? Best Regards

2013/8/27 Jeff Janes jeff.ja...@gmail.com

On Sun, Aug 25, 2013 at 11:08 PM, 高健 luckyjack...@gmail.com wrote: Hello: Sorry for disturbing. I am now encountering a serious problem: memory is not enough. My customer reported that when they run a program, they found the total memory and disk I/O usage both reached the threshold value (80%). That program is written in Java. It uses JDBC to pull data out of the DB, with a query that joins some tables together; it returns about 3,000,000 records. Then the program uses JDBC again to write the records row by row, to insert them into another table in the DB.

What is using the memory, the postgres backend or the client program? Cheers, Jeff
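[For what it's worth, the two kinds of memory being conflated here can be inspected directly; a minimal sketch using standard parameters:]

```sql
-- shared_buffers is a single shared cache, sized once at server start;
-- lowering it does not cap a backend's private memory.
SHOW shared_buffers;
-- work_mem bounds memory per sort/hash operation inside each backend;
-- this, not shared_buffers, mainly governs per-query private memory.
SHOW work_mem;
```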
Re: [GENERAL] What is the relationship between checkpoint and wal
Hello, Thanks for replying. It is really a complicated concept. So I think that in a mission-critical environment it is not a good choice to turn full_page_writes off. Best Regards

2013/8/27 Jeff Janes jeff.ja...@gmail.com

On Sun, Aug 25, 2013 at 7:57 PM, 高健 luckyjack...@gmail.com wrote: Hi: Thanks to Alvaro! Sorry for replying late. I have understood a little about it, but the description of full_page_writes made me even more confused. Sorry that maybe I am moving on to another problem. It is said here: http://www.postgresql.org/docs/9.2/static/runtime-config-wal.html#GUC-FULL-PAGE-WRITES

--- When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint. This is needed because a page write that is in process during an operating system crash might be only partially completed, leading to an on-disk page that contains a mix of old and new data. ---

Let me imagine that on a disk page there are the following data:
id=1 val=1 with transaction id of 1001
id=2 val=2 with transaction id of 1002
id=3 val=3 with transaction id of 1003

If I start the DB, and begin with transaction id 2002 dealing with the data of id=2, making val 20, then with transaction id 2003 dealing with the data of id=3, making val 30. With full_page_writes = off, when my checkpoint occurs, it succeeds with transaction 2002 but fails with 2003 because of a crash.

A checkpoint either succeeds or fails. It cannot succeed with some transactions and fail with others.

Then the disk page will be:
id=1 val=1 with transaction id of 1001 -- maybe this is the very old data
id=2 val=20 with transaction id of 2002 -- this is now new data
id=3 val=3 with transaction id of 1003 -- this is old data.

Postgres does not do in-place updates; it marks the old row as obsolete and creates a new one.
id=1 val=1 with transaction id of 1001
id=2 val=2 with transaction id of 1002, xmax of 2002
id=3 val=3 with transaction id of 1003, xmax of 2003
id=2 val=20 with transaction id of 2002
id=3 val=30 with transaction id of 2003

Of course the whole point of a torn page write is that you don't know how much got written, so you don't know what is actually on the disk.

When the DB restarts from the crash, I think that there are WAL data for transaction ids 2002 and 2003, because WAL is written to wal_buffer before data is written to shared_buffers.

The WAL is written out of wal_buffer and flushed before the corresponding block is written out of shared_buffers.

So if the online WAL files are ok, there will be no data lost, and roll-forward and roll-back can happen. If some online WAL file was damaged during the crash, there might be some data lost; but if we have archived WAL, we can restore back up to the archived WAL's latest transaction id.

Most WAL records would have no problem being applied to a block that is an otherwise uncorrupted mix of old and new. But some WAL records have instructions that amount to grab 134 bytes from offset 7134 in the block and move them to offset 1623. If the block is an unknown mix of old and new data, that can't be carried out safely. Cheers, Jeff
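[For reference, the parameter under discussion is a one-line postgresql.conf setting; on is the default:]

```
# postgresql.conf fragment: write a full copy of each page to WAL on its
# first modification after a checkpoint, so a torn (partially written)
# page can be reconstructed from WAL during crash recovery.
full_page_writes = on
```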
Re: [GENERAL] pg_extension_config_dump() with a sequence
Moshe Jacobson mo...@neadwerx.com writes: On Tue, Aug 20, 2013 at 7:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Well, I think you did it wrong, or else you're using a PG version that predates some necessary fix, because it works for me. Sorry for the delayed response. I am using postgres 9.1.4 with pg_dump of the same version.

Ah. I think that you are missing this 9.1.7 fix:

commit 5110a96992e508b220a7a6ab303b0501c4237b4a
Author: Tom Lane t...@sss.pgh.pa.us
Date: Fri Oct 26 12:12:53 2012 -0400

    In pg_dump, dump SEQUENCE SET items in the data not pre-data section.

    Represent a sequence's current value as a separate TableDataInfo dumpable
    object, so that it can be dumped within the data section of the archive
    rather than in pre-data. This fixes an undesirable inconsistency between
    the meanings of --data-only and --section=data, and also fixes dumping of
    sequences that are marked as extension configuration tables, as per a
    report from Marko Kreen back in July. The main cost is that we do one more
    SQL query per sequence, but that's probably not very meaningful in most
    databases.

    Back-patch to 9.1, since it has the extension configuration issue even
    though not the --section switch.

regards, tom lane

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
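[For readers hitting the same issue, a minimal sketch of the setup being discussed; the object names are hypothetical. Inside an extension's install script you would mark both the configuration table and its sequence for dumping:]

```sql
-- In an extension's SQL script: ask pg_dump to include the contents of
-- a configuration table AND its id sequence (hypothetical names; the
-- second argument is an optional WHERE filter, empty here).
SELECT pg_catalog.pg_extension_config_dump('my_config', '');
SELECT pg_catalog.pg_extension_config_dump('my_config_id_seq', '');
```

Before the 9.1.7 fix above, the sequence's current value was emitted in the pre-data section, which is what broke --data-only style dumps of extension configuration sequences.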
Re: [GENERAL] OLAP
Alban Hertroys-4 wrote: How is Pentaho an OLAP tool? Aren't you mixing up a few things? We already use Pentaho for ETL, so I'm a bit familiar with it. Why do you consider it suitable for managing an OLAP database? How would Pentaho manage cube rollup triggers, business models, dimensions and stuff like that? We don't want to hand-code those; that's far too error-prone and far too much work to keep track of. That stuff needs to be automated, preferably similar to what we're used to from Gentia (well, not me - I can't make heads or tails of Gentia, but the person who asked me about PG's suitability has been developing with it for years). That's what we're comparing to.

Pentaho, Jasper and Actuate are the common OLAP tools. Pentaho's ETL tool (Kettle) is more popular than their reporting solution. Similarly, Jasper's reporting tool is more popular than their ETL (they integrate Talend). The OLAP functionality in both comes from Mondrian, as far as I know. The cubes, dimensions etc. can be defined using a UI tool and uploaded to the reporting server - either Jasper or Pentaho. These typically support MDX, in addition to standard SQL. http://en.wikipedia.org/wiki/MultiDimensional_eXpressions
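[Since MDX comes up here, a tiny illustrative query may help readers who haven't seen it; the cube, measure, and dimension names below are hypothetical, not taken from any product mentioned in the thread:]

```
-- MDX sketch with made-up names: one measure on columns,
-- the members of a product dimension level on rows.
SELECT
  { [Measures].[Sales Amount] } ON COLUMNS,
  { [Product].[Category].Members } ON ROWS
FROM [SalesCube]
WHERE ( [Time].[2013] )
```

Unlike SQL, the query addresses a pre-defined cube and its dimensions directly rather than joining relational tables.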
Re: [GENERAL] OLAP
On Tue, Aug 27, 2013 at 3:04 PM, Jerry Sievers gsiever...@comcast.net wrote: Vertica is another case of an analytics-focused platform that derived from Postgres, version 8.0 IIRC.

FWIW, you might be thinking of Greenplum, and I think it forked from pg 8.1. Vertica was built from scratch; it is not using Postgres as the basis for the db. This is according to the engineers that I have spoken to in the past at Vertica. I believe there was some code ported over for their command line client vsql, but I could be wrong about that. At the very least, vsql looks and acts like psql from a user's point of view.

I agree absolutely with you about the dazzling speed of certain types of queries -- one of my tests was a simple aggregate query (a bunch of SUM()s and GROUP BYs) over about 500M rows, and it returned results in a handful of seconds. The same query on our PostgreSQL server with 2x the number of disks took handfuls of minutes, IIRC. On the flip side, I found some of the complex window functions performed on par or better in PostgreSQL than in Vertica.
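[For context, the shape of the benchmark query described above would be something like the following; the table and column names are made up, since the actual queries are not shown in the thread:]

```sql
-- Illustrative only: a wide aggregate over a large fact table,
-- the pattern that column stores such as Vertica excel at.
SELECT region,
       product_id,
       SUM(quantity) AS total_qty,
       SUM(revenue)  AS total_revenue
FROM   sales
GROUP  BY region, product_id;
```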
Re: [GENERAL] Is there any method to limit resource usage in PG?
On 8/27/2013 6:49 PM, 高健 wrote: For a query and insert action, first, the data is pulled into the private memory of the backend process that is servicing the client.

If you're returning a single result of 3 million records, yes, you're going to need memory to store that entire result set before you can do anything with it.

Again, if you're just writing this data into another table, why not do it all in SQL?

INSERT INTO newtable (field1, field2, field3, ...) SELECT your complex 3 million row query here;

That will do the whole thing without having to move any data into client space. This will be faster and more memory efficient.

Now, if your Java client HAS to process the data it's selecting and do complex stuff with it that you just don't think SQL can/should do, then you'll need to use a CURSOR.

DECLARE cursorname CURSOR FOR SELECT your messy query here;

then loop on ... FETCH 100 FROM cursorname; to return the next 100 rows of this cursor. Once you're done with it, CLOSE cursorname; will delete the cursor. I suggest you read the specifics of these cursor commands to fully understand them.

-- john r pierce 37N 122W somewhere on the middle of the left coast
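[The FETCH-in-batches loop described above can be sketched independently of any database driver; this helper is hypothetical, not from the thread, and just shows the batching pattern a client would wrap around repeated FETCH 100 calls:]

```python
from itertools import islice

def fetch_batches(rows, batch_size=100):
    """Yield fixed-size batches from any row source, mirroring a loop
    of `FETCH 100 FROM cursorname` until no rows remain."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:  # an empty FETCH means the cursor is exhausted
            return
        yield batch

# 250 rows come back as batches of 100, 100, and 50
sizes = [len(b) for b in fetch_batches(range(250))]
```

Processing each batch and discarding it keeps client memory bounded by the batch size rather than the full 3-million-row result set.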