Re: [GENERAL] cache lookup failed for function 19119
Hi All,

Thanks for your suggestions. But the problem here seems to be with the way Slony applies triggers. The queries below give the following results in the database:

SELECT tgrelid, tgname, tgfoid FROM pg_trigger;

 tgrelid | tgname                     | tgfoid
---------+----------------------------+--------
   16470 | _clustername_denyaccess_1  |  19119
   16470 | user_defined_trig_1        |  17733
   16470 | user_defined_trig_2        |  17629
   16392 | _cluastername_denyaccess_1 |  20801

msw=# SELECT relname FROM pg_class WHERE oid = 16470;
 relname
----------
 abc_pkey
(1 row)

msw=# SELECT relname FROM pg_class WHERE oid = 16392;
 relname
---------
 abc
(1 row)

The result is like this for all the replicated tables, i.e. the denyaccess trigger is applied to the table as well as to table_pkey, with different function OIDs (19119 and 20801 in this case). The function with OID 20801 is present in pg_proc and its name is denyaccess(), but the function with OID 19119 is not present in the database, and that is why we get the error "cache lookup failed for function 19119".

Now my question is: how did the denyaccess trigger get applied to table_pkey, when the denyaccess trigger should only have been applied to the table itself, and only the user-defined triggers should have been applied to table_pkey in the slave database? Is this a known issue? What can lead to this kind of situation? Please help.

Thanks,
Tamanna

-----Original Message-----
From: David Fetter [mailto:da...@fetter.org]
Sent: Saturday, July 17, 2010 8:45 PM
To: Merlin Moncure
Cc: tamanna madaan; pgsql-general@postgresql.org
Subject: Re: [GENERAL] cache lookup failed for function 19119

On Thu, Jul 15, 2010 at 10:21:52AM -0400, Merlin Moncure wrote:
> On Thu, Jul 15, 2010 at 2:34 AM, tamanna madaan tamanna.ma...@globallogic.com wrote:
>> Hi All,
>> I am using postgres-8.1.2 and am getting this error: "cache lookup failed for function 19119". Can anyone please let me know what could have gone wrong? How can a function go missing, and which function is it talking about? Is it one of PostgreSQL's internal functions, or a user-defined function?
>> How can I get the function name corresponding to 19119?
>
> The function is either gone (it was deleted manually from pg_proc, for example, or dropped) or there is some other problem. You might be able to fix the problem by recreating (CREATE OR REPLACE) the function that is calling the function in question (your database log should be giving you some context).
>
> You are on 8.1.2, which is crazy. You need to immediately get the latest bugfix release for the 8.1 series. You might want to consider a dump/reload; read the release notes for the 8.1 series here: http://www.postgresql.org/docs/8.1/static/release.html

You might also want to note that 8.1's end of life is in November, so start planning the upgrade to 9.0 right now. You will likely need to clean up some client code in order for that to work, as modern versions of PostgreSQL don't allow some of the sloppy and dangerous things (casting automatically to and from text, e.g.) that former versions did.

Cheers,
David.
--
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
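For readers hitting the same error, a quick way to see how widespread the damage is, before touching anything, is to look for triggers whose function OID no longer resolves. A diagnostic sketch (catalog column names as in 8.1):

```sql
-- List triggers whose tgfoid has no matching pg_proc row; these are
-- exactly the triggers that raise "cache lookup failed for function N".
SELECT c.relname, t.tgname, t.tgfoid
FROM pg_trigger t
JOIN pg_class c ON c.oid = t.tgrelid
WHERE NOT EXISTS (SELECT 1 FROM pg_proc p WHERE p.oid = t.tgfoid);
```

Any rows returned point at triggers that must be dropped and recreated (for Slony, typically by redoing the subscription setup) before the affected tables can be modified again.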
Re: [GENERAL] Bitmask trickiness
On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:
>> Why on Earth would I want to store this sort of stuff in a bit string?!
>
> Because you are manipulating bits and not integers? I guess there are 10 kinds of people, those who like to think in binary and those who don't.
>
>> I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!
>
> If the fifth bit means one thing, and the 7th bit means something else, quick: which of the following have the fifth bit set and the 7th bit off: 01001101 (base 2) or 77 (base 10)?
>
>> And, fundamentally, they mean precisely the same thing.
>
> Of course. But that wasn't my point, and by my example above, one is much easier to figure out than the other if you're interested in bit twiddling.
>
>> And the '&' operator works as nicely with boring old decimals as it does with long-winded binaries, so I really don't see the point of making it more complicated than it needs to be - but I'm open to be enlightened on the matter!
>
> I fail to see how storing a binary as a binary and showing it as a binary makes things more complicated. But I'm open to someone showing me how that's true. At least hex or octal have direct and simple conversions, where each hex or octal digit represents 4 or 3 bits respectively. Decimal does not.
>
>> Thanks for the second link though. I hadn't realised that PostgreSQL was so richly-endowed with bitwise functions. Specifically, it's got the bitwise XOR I was thinking it would be nice to have in these sorts of situations:
>>
>> ims=# select * from coloursample where colour # 10 = 0;
>>  recid | colour |     descript
>> -------+--------+-------------------
>>      1 |     10 | Yellow and Orange
>> (1 row)
>>
>> Not quite sure how to apply that to my more realistic example just yet, but I think this will be very helpful, so thank you!
>
> Note you can cast integer to bitstring, but there may be some odd behaviour for sign bits and such. Which is again why I'd use the right type for the job, bit string. But it's your project.

Quoting...
> Because you are manipulating bits and not integers? I guess there are 10 kinds of people, those who like to think in binary and those who don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.

> quick: which of the following have the fifth bit set and the 7th bit off: 01001101 (base 2) or 77 (base 10)?

I'll give you that one (except that the fifth bit isn't set and the 7th bit is on!!). Now repeat for 10 million records and see how you get on. Really, I don't have to visually inspect a record to work this stuff out! So what is easier on your eye for one record is completely irrelevant as far as my code is concerned!

> one is much easier to figure out than the other if you're interested in bit twiddling.

As I say, take an incredibly simple example and everything looks, er, simple. Point of fact, I happen to know that the '1' bit is set in 21205 simply by looking at the last digit and spotting that it's not even. Fat lot of good that does me when fetching 350,000 records that happen to match 'insurance claim'.

> I fail to see how storing a binary as a binary and showing it as a binary makes things more complicated.

Because it's NOT binary. It's an encoding. Whether that encoding is displayed in binary, fluent Hebrew, Klingon or imaginary numbers is really irrelevant to me. It happens that I can look at a decimal number and work out most things (if the number is 29438, it's a fair chance the 16384 bit is set, for example; if it's 4098, I know it's 4096 + 2, and so on). If you're happier working with fancifully long strings of 1s and 0s, good on you: but it doesn't alter the fact that I'm working with encoded meanings, not binary digits, and I prefer a nice, compact display of that encoding which doesn't involve hieroglyphics.

> But it's your project.

Indeed. Doesn't mean I can do it all on my own, of course. But if it's simply a question of personal preference, I'll take mine over yours, for projects I work on, if that's OK. No hard feelings!
:-)

Regards
HJR
[GENERAL] Passing a PGconn * between two processes on Unix like systems
I have two Unix/process-related questions:

Is there a documented way to pass around an (opened) PGconn * structure between two processes on Unix?

When a process fork()s and both the parent and the child process continue to use a previously opened PGconn * structure, is that behaviour defined?

Thanks,
Marc Balmer
Re: [GENERAL] Finding last checkpoint time
On Tue, 2010-07-20 at 20:48 +0100, Thom Brown wrote:
> Or you can use pg_controldata /path/to/pgdata and look at "Time of latest checkpoint".

Right. Thanks :)
--
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
PostgreSQL RPM Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
Re: [GENERAL] Passing a PGconn * between two processes on Unix like systems
On Thu, Jul 22, 2010 at 08:35, Marc Balmer m...@msys.ch wrote:
> Is there a documented way to pass around an (opened) PGconn * structure between two processes on Unix?

No. You can probably hack up something yourself, but you'd have to look inside the struct, which is not part of the public API - so it'd be a very version-dependent (even minor-version-dependent!) hack.

> When a process fork()s and both the parent and child process continue to use a previously opened PGconn * structure, is that behaviour defined?

Yes - broken :-)

Well, the child can continue to use it *as long as the parent doesn't use it anymore*. And note that while it may be a good idea in general to close the socket in the parent, you can *not* call PQfinish() on it - that'll tell the server you're disconnecting, and the child will stop working. In theory you could do something like close(PQsocket(conn))...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [GENERAL] locating cities within a radius of another
On 21 Jul 2010, at 23:14, Joe Conway m...@joeconway.com wrote:
> If you want something simple, and not requiring PostGIS, but plpgsql instead, see:
> http://archives.postgresql.org/pgsql-sql/2003-12/msg00193.php

For completeness, the earthdistance module also provides the distance between two lat/longs; the point <@> point syntax is simple to use:
http://www.postgresql.org/docs/8.3/static/earthdistance.html

Regards
Oliver Kohll
oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll
www.agilebase.co.uk - software
Re: [GENERAL] Passing a PGconn * between two processes on Unix like systems
On 22 Jul 2010, at 8:35, Marc Balmer wrote:
> When a process fork()s and both the parent and child process continue to use a previously opened PGconn * structure, is that behaviour defined?

I recall having done this successfully, but you have to take care to synchronise access to the connection. You can't have multiple transactions running in parallel on one connection.

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see there is no forest.
[GENERAL] ECPG - Some errno definitions don't match to the manual
Hi all,

I'm looking into the ecpg part of the official manual, and I have found some strange things. I'm now investigating SQLCODE and SQLSTATE, and I have found that some of the errno definitions don't match the manual.

For example, the manual says that ECPG_CONVERT_BOOL should be `-207'. However, ECPG_CONVERT_BOOL is defined as `-211' in ecpgerrno.h.

  -207 (ECPG_CONVERT_BOOL)
  This means the host variable is of type bool and the datum in the database is neither 't' nor 'f'. (SQLSTATE 42804)
  http://www.postgresql.org/docs/9.0/static/ecpg-errors.html

  #define ECPG_NUMERIC_FORMAT -207
  #define ECPG_CONVERT_BOOL   -211
  http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/ecpg/include/ecpgerrno.h?rev=1.27;content-type=text%2Fx-cvsweb-markup;only_with_tag=REL9_0_STABLE

What does this mean? Is the manual not up to date? Any suggestions?

Regards,
--
NAGAYASU Satoshi satoshi.nagay...@gmail.com
[GENERAL] Maximum document-size of text-search?
Hi.

I'm trying to index the contents of Word documents - extracted text - which sometimes leads to quite large documents. This results in the following exception:

Caused by: org.postgresql.util.PSQLException: ERROR: index row requires 10376 bytes, maximum size is 8191

I have the following schema:

andreak=# \d origo_search_index
                       Table "public.origo_search_index"
          Column          |       Type        |            Modifiers
--------------------------+-------------------+----------------------------------
 id                       | integer           | not null default nextval('origo_search_index_id_seq'::regclass)
 entity_id                | integer           | not null
 entity_type              | character varying | not null
 field                    | character varying | not null
 search_value             | character varying | not null
 textsearchable_index_col | tsvector          |
Indexes:
    "origo_search_index_fts_idx" gin (textsearchable_index_col)
Triggers:
    update_search_index_tsvector_t BEFORE INSERT OR UPDATE ON origo_search_index
    FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger('textsearchable_index_col', 'pg_catalog.english', 'search_value')

I store all the text extracted from the documents in search_value and have the built-in trigger tsvector_update_trigger update the tsvector column.

Any hints on how to get around this issue to allow indexing large documents? I don't see how indexing only the first N bytes of the document would be of interest to anyone...

BTW: I'm using PG-9.0beta3

--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO, OfficeNet AS, Rosenholmveien 25, 1414 Trollåsen, NORWAY
Tlf: +47 24 15 38 90 / Fax: +47 24 15 38 91 / Mobile: +47 909 56 963
"The most difficult thing in the world is to know how to do a thing and to watch somebody else doing it wrong, without comment."
Re: [GENERAL] Finding last checkpoint time
> Or you can use pg_controldata /path/to/pgdata and look at "Time of latest checkpoint".

Assuming your system is using English. Otherwise, you'll have to build a collection of .po strings as we did for check_postgres.pl. Needless to say, I'd greatly prefer some other way to grab the information!

--
Greg Sabino Mullane g...@turnstep.com
PGP Key: 0x14964AC8 201007220933
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [GENERAL] locating cities within a radius of another
Oliver Kohll - Mailing Lists wrote:
> For completeness, the earthdistance module also provides the distance between two lat/longs; the point <@> point syntax is simple to use:
> http://www.postgresql.org/docs/8.3/static/earthdistance.html

I did look at earthdistance before. Revisiting it now, thanks.

So, I'm trying to figure out this syntax. The docs say:

  point <@> point -> float8 : gives the distance in statute miles between two points on the Earth's surface.

How do longitude and latitude fit into this picture? I can't find any other documentation or examples. I've got the contrib modules installed, as 'select earth()' works fine.

--
Until later, Geoffrey

"I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them."
- Thomas Jefferson
Re: [GENERAL] locating cities within a radius of another
Oliver Kohll - Mailing Lists wrote:
> For completeness, the earthdistance module also provides the distance between two lat/longs; the point <@> point syntax is simple to use:
> http://www.postgresql.org/docs/8.3/static/earthdistance.html

Trying to figure out the proper usage. My assumption: use ll_to_earth() to get point values to pass to 'point <@> point'.

First issue: ll_to_earth() returns three values, not one.

Second issue: I tried something like:

  select (ll_to_earth(46,67) <@> ll_to_earth(57,87));

I get:

  ERROR:  operator does not exist: earth <@> earth
  LINE 1: select (ll_to_earth(46,67) <@> ll_to_earth(57,87));

So I tried:

  select (4618419.15006707 <@> 4394453.66154081);

And I get:

  ERROR:  operator does not exist: numeric <@> numeric
  LINE 1: select (4618419.15006707 <@> 4394453.66154081);
                                  ^
  HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

What am I missing???

--
Until later, Geoffrey
Re: [GENERAL] locating cities within a radius of another
Oliver Kohll - Mailing Lists wrote:
> For completeness, the earthdistance module also provides the distance between two lat/longs; the point <@> point syntax is simple to use:
> http://www.postgresql.org/docs/8.3/static/earthdistance.html

Disregard my last post. Sure enough, as soon as I hit send, the light went on... My mistake was deriving my points for point <@> point from ll_to_earth().

Regards
--
Until later, Geoffrey
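To spell out the usage this thread converged on: earthdistance has two interfaces. The `<@>` operator works on plain `point` values, where the point must hold (longitude, latitude), and returns statute miles; `ll_to_earth(latitude, longitude)` instead produces a cube-based `earth` value for `earth_distance()`, which returns meters. A sketch (the coordinates are illustrative):

```sql
-- Point-based interface: point(longitude, latitude), result in statute miles.
SELECT point(-87.6, 41.9) <@> point(-73.9, 40.7) AS miles;

-- Cube-based interface: ll_to_earth(latitude, longitude), result in meters.
SELECT earth_distance(ll_to_earth(41.9, -87.6),
                      ll_to_earth(40.7, -73.9)) AS meters;
```

Mixing the two interfaces, i.e. feeding ll_to_earth() results to the point operator, produces exactly the "operator does not exist" error seen earlier in the thread.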
[GENERAL] Getting statistics for each sql statement?
Hello,

is it possible to get statistics on the usage of different SQL statements, e.g. how many INSERT or UPDATE statements per day?

log_statement_stats doesn't seem to be the right parameter, or I haven't found the output statistics for this command.

Any ideas or suggestions?

Thanks,
Stefan
RESOLVED: Re: [GENERAL] Maximum document-size of text-search?
On 07/22/2010 03:31 PM, Andreas Joseph Krogh wrote:
> I'm trying to index the contents of word-documents, extracted text, which leads to quite large documents sometimes. This results in the following Exception:
>
> Caused by: org.postgresql.util.PSQLException: ERROR: index row requires 10376 bytes, maximum size is 8191
>
> [schema snipped]

Never mind... I also had a btree index on search_value, which of course caused the problem.

--
Andreas Joseph Krogh andr...@officenet.no
Senior Software Developer / CTO
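For anyone hitting the same error with long text columns they genuinely need to look up: a btree entry must fit in roughly a third of an 8 KB page, so the usual workaround is to index a hash of the value rather than the value itself. A sketch (the index name is hypothetical; this supports equality lookups only):

```sql
-- A btree over md5(search_value) stays small no matter how long the text is.
CREATE INDEX origo_search_index_sv_md5 ON origo_search_index (md5(search_value));

-- Lookups must compare hashes the same way to use the index:
SELECT * FROM origo_search_index
WHERE md5(search_value) = md5('some document text');
```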
[GENERAL] Clarification of the simple dictionary
Hi.

It's not clear to me whether the simple dictionary uses stopwords or not. Does it? Can someone please post a complete description of what the simple dictionary does?

--
Andreas Joseph Krogh andr...@officenet.no
Re: [GENERAL] Bitmask trickiness
On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:
> On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:
>>> Why on Earth would I want to store this sort of stuff in a bit string?!
>> Because you are manipulating bits and not integers? I guess there are 10 kinds of people, those who like to think in binary and those who don't.
> I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!

<snip lots of stuff>

> Er, no. 21205 is not an integer. It's an encoded bit of magic.

In that case your database design is fundamentally broken. A database should have content fields that map to the needs of the application. As you describe your application requirements, that is a bit string and not an integer. Use bit strings and your application logic is transparent, obvious and easy to maintain. Use integers and you have to resort to magic. As you say, it's your choice, but you came here looking for advice, and the advice you were given is very good.

--
Peter Hunsberger
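For what it's worth, PostgreSQL lets you move freely between the two representations, so the storage decision and the display decision need not be the same one. A sketch:

```sql
SELECT 21205::bit(15);               -- integer shown as a bit string: 101001011010101
SELECT B'101001011010101'::int;      -- and back again: 21205

SELECT (21205 & 16) <> 0 AS bit_set; -- bit test on the integer form

-- The same test on bit strings (operands must be equal length):
SELECT (B'101001011010101' & B'000000000010000')
       <> B'000000000000000' AS bit_set;
```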
[GENERAL] optimizing daily data storage in Pg
I have been struggling with this for a while now, and have even gone down a few paths but struck out, so I turn now to the community for ideas.

First, the problem: store six daily variables for ~25 years for cells in a grid.

* Number of vars = 6
* Number of cells ~ 13 million
* Number of days ~ 9,125 (25 * 365)

Optimize the store for two different kinds of queries:

Query one: Retrieve the value of a single var for all or a portion of the cells for a single day. This is analogous to an image where every pixel is the value of a single var.

Query two: Retrieve values for all the days, or a duration of days, for a single var for a single cell. This is like grabbing a column out of a table in which each row holds all the vars for a single day.

So, I set about designing the db. The grid is in a table with 13 million rows:

CREATE TABLE cells (
    cell_id    INTEGER,
    other_data ..
) WITH (OIDS=FALSE)

A single table *where every row is one day's values for one cell* looks like so:

CREATE TABLE d (
    yr      SMALLINT,
    yday    SMALLINT,
    a       SMALLINT,
    b       SMALLINT,
    d       SMALLINT,
    e       SMALLINT,
    f       SMALLINT,
    g       SMALLINT,
    cell_id INTEGER
) WITH (OIDS=FALSE)

The data would look like so:

yr    yday  a  b  c  d  e  f  g  cell_id
1980     1  x  x  x  x  x  x  x  1
..
1980   365  x  x  x  x  x  x  x  1
...
1981     1  x  x  x  x  x  x  x  1
..
1981   365  x  x  x  x  x  x  x  1
...
2005     1  x  x  x  x  x  x  x  1
..
2005   365  x  x  x  x  x  x  x  1
..
1980     1  x  x  x  x  x  x  x  2
..
1980   365  x  x  x  x  x  x  x  2
...

I could now (theoretically) conduct my queries like so:

Query 1a: Retrieve the value of a single var for all the cells for a single day.

SELECT var FROM d WHERE yr = ? AND yday = ?;

I'm assuming I would need an index on yr and yday, or perhaps even a compound index on (yr, yday).

Query 1b: Retrieve the value of a single var for a portion of the cells for a single day.

SELECT var FROM d WHERE yr = ? AND yday = ?
AND cell_id IN (?,?,?...);

I'm assuming I would need an index on yr and yday, or perhaps even a compound index on (yr, yday), AND an index on cell_id.

Query 2: Retrieve values for all the days, or a duration of days, for a single var for a single cell.

SELECT var FROM d WHERE cell_id = ?;
SELECT var FROM d WHERE cell_id IN (?,?,?...);

Once again, an index on cell_id would assist in the above.

The problem: the above table would have 13 M * 9,125 rows ~ 118 billion rows. Huge indexes, slow queries, etc. In fact, major issues loading the data in the first place. Since I am loading data in batches, I drop the indexes (takes time), COPY data into the table (takes time), build the indexes (takes time), experiment with improving the performance (takes time), fail, rinse, lather, repeat. I actually tried the above with a subset of data (around 100 M rows) and experienced all of the above. I don't remember the query times, but they were awful.

So, I partitioned the table into years like so:

CREATE TABLE d_yyyy (
    CHECK ( yr = yyyy )
) INHERITS (d)

Hmmm... still no satisfaction. I ended up with 1 master table + 25 inherited tables. Each of the year tables now had ~4.75 billion rows (13 M * 365), and the queries were still very slow.

So, I partitioned it all by years and days like so:

CREATE TABLE d_yyyy_yday (
    CHECK ( yr = yyyy AND yday = yday )
) INHERITS (d)

Each table now has 13 million rows, and is reasonably fast (although still not satisfactorily fast), but now I have 9K tables. That has its own problems. I can't query the master table anymore, as Pg tries to lock all the tables and runs out of memory. Additionally, I can't conduct query two above anymore. I could do something like:

SELECT a FROM d_1980_1 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_2 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_3 WHERE cell_id = 1
UNION
SELECT a FROM d_1980_4 WHERE cell_id = 1
UNION
...
But the above is hardly optimal. Any suggestions, ideas, or brainstorms would be appreciated. Perhaps Pg, or even an RDBMS, is not the right tool for this problem, in which case suggestions for alternatives would be welcome as well.

Right now I am testing this on a dual Xeon dual-core 3 GHz Xserve with 12 GB RAM. The PGDATA directory is located on an attached RAID that is configured as
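Since the two query shapes pull in different directions, one direction worth measuring is to partition on one axis only (year) and index the other axes inside each child, letting constraint exclusion do the pruning. A sketch with hypothetical names:

```sql
-- One child per year keeps the table count at ~25 instead of ~9,000.
CREATE TABLE d_1980 ( CHECK ( yr = 1980 ) ) INHERITS (d);
CREATE INDEX d_1980_yday_idx ON d_1980 (yday);
CREATE INDEX d_1980_cell_idx ON d_1980 (cell_id);

SET constraint_exclusion = on;

-- Query 1: the planner prunes to the single matching year partition,
-- then uses that partition's yday index.
SELECT a FROM d WHERE yr = 1980 AND yday = 123;

-- Query 2: one cell_id index probe per partition, with no hand-written
-- 9,000-table UNION.
SELECT yr, yday, a FROM d WHERE cell_id = 42 ORDER BY yr, yday;
```

Whether ~4.75-billion-row children respond fast enough then hinges on the per-partition indexes; the author reports yearly partitioning alone was still slow, so this is a direction to test, not a fix.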
[GENERAL] How to improve performance in reporting database?
I have a daily job that pushes data from the production database into the reporting database, which, right now, is an exact copy. I have a webapp that builds lots of reports for users. Most of these reports involve elaborate joins of lookup tables and lots of summations, and they take too long to run, even after using everything I know to tune the queries.

Since I know this is read-only data, it seems like I should be able to speed everything up dramatically if I run the queries offline and then save the results into new tables. Then the web app could just grab the cached results out of these new tables and spit them out quickly. I've heard people talking about using materialized views for this, but that was with Oracle. What's the PostgreSQL way here?

More generally, any advice on running reporting databases well is welcome.

Matt
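PostgreSQL (as of 9.0) has no built-in materialized views, so the usual pattern is exactly the cached-results table described above, rebuilt by the nightly job. A sketch with hypothetical table and column names:

```sql
-- Rebuild the cached report inside one transaction, so web readers
-- always see either the old or the new version, never a half-built one.
BEGIN;
TRUNCATE report_monthly_sales;
INSERT INTO report_monthly_sales
    SELECT region,
           date_trunc('month', sold_at) AS month,
           sum(amount)                  AS total
    FROM sales
    GROUP BY region, date_trunc('month', sold_at);
COMMIT;
```

One caveat: TRUNCATE takes an exclusive lock, so readers block for the duration of the rebuild. If that matters, build the results into a fresh table and swap it in with ALTER TABLE ... RENAME instead.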
Re: [GENERAL] Getting statistics for each sql statement?
On Jul 22, 2010, at 4:50 AM, Stefan-Michael Guenther wrote:
> Is it possible to get statistics on the usage of different SQL statements, e.g. how many INSERT or UPDATE statements per day?

The usual approach is to make a log file that holds all your queries and then let something like pgFouine grovel through it.
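If per-table row counts are enough (rather than exact statement counts), the statistics collector already tracks them; sampling this view once a day and diffing gives inserts and updates per day without any log parsing. A sketch:

```sql
-- Cumulative tuples inserted/updated/deleted per table since the last stats reset.
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC;
```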
Re: [GENERAL] How to distribute quantity if same product is in multiple rows
Tim,

Thank you.

> It can be done in SQL:
>
>   SUM(kogus) OVER (PARTITION BY toode ORDER BY id) - kogus   (*1)
>
> will give you the running sum of the product up to that row. You can then subtract that value from the delivered quantity to calculate the delivered quantity for the current row.
>
> But doing so automatically is probably bad. For example, if a user has a purchase order with one position of two pieces and one position of four, it is very likely that when a shipment of four pieces arrives, the latter position shall be marked as delivered. So I would leave the decision to the user.
>
> (*1) In PostgreSQL 9.0, you might be able to use ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING instead (untested).

If four pieces arrived, the first position of 2 pieces should be marked as delivered. The second position of 4 pieces should be marked as partially delivered, by setting the undelivered quantity of this row to 2.

How can I use your suggestion for this? How can I implement this in PostgreSQL 8.1, 8.2, 8.3?

Andrus.

PS. If possible, please use cc: with my email address in reply.
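Tim's running-sum expression can be turned into a per-row allocation along these lines (a sketch; table and column names follow the thread's toode/kogus order rows; window functions require 8.4 or later):

```sql
-- Allocate a shipment of 4 pieces of one product across order rows in id order:
-- each row receives whatever is left of the shipment, capped at its own quantity.
SELECT id, toode, kogus,
       GREATEST(0, LEAST(kogus,
           4 - (SUM(kogus) OVER (PARTITION BY toode ORDER BY id) - kogus)
       )) AS delivered_now
FROM order_rows;
```

For the example in the thread (rows of 2 and 4 pieces, shipment of 4) this yields 2 for the first row and 2 for the second, i.e. the partial delivery Andrus describes. On 8.1-8.3, which lack window functions, the running sum can come from a correlated subquery instead: (SELECT COALESCE(sum(o2.kogus), 0) FROM order_rows o2 WHERE o2.toode = o.toode AND o2.id < o.id).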
Re: [GENERAL] Clarification of the simple dictionary
The easiest way to look at this is to give the simple dictionary a document with to_tsvector() and see if stopwords pop out. In my experience they do. In my experience, the simple dictionary just breaks the document down into the words separated by spaces etc. in the document. It doesn't analyze further.

John

On Jul 22, 2010, at 4:15 PM, Andreas Joseph Krogh wrote:
> Hi. It's not clear to me if the simple dictionary uses stopwords or not, does it? Can someone please post a complete description of what the simple dict. does?
[GENERAL] Are identical subqueries in unioned statements nonrepeatable?
Hi all. I had no luck finding a previous message or documentation related to the effective transaction isolation of subqueries, specifically identical subqueries in union statements. Consider the following statement executed without a transaction. select true as from_one, table_one.* from table_one where table_one.id not in (select id from reference_table) union all select false as from_one, table_two.* from table_two where table_two.id not in (select id from reference_table) Is it possible for the contents of reference_table to differ from the first select to the select on the right hand side of the union? (e.g. because some other transaction committed additional rows). Or even from row comparison to row comparison in the same select (I highly doubt that). If it is not possible, why? Is it because a single query always executes with serializable (effective) isolation? Is it because postgresql recognizes that the query is repeated and uses a single result set in both sides of the union? Is this behavior part of postgresql intentionally, or a side effect that I should not rely on? Assumption: I'm assuming that it's faster to union all after filtering by the where clauses than to union all and then filter by a single where clause. The subquery for exclude_ids is very fast and the result of each of the selects is a small fraction of the entire tables. Doing a union first would be expensive in comparison to doing a union of the filtered results. Thanks, Derrick
Re: [GENERAL] Clarification of the simple dictionary
On 07/22/2010 06:27 PM, John Gage wrote: The easiest way to look at this is to give the simple dictionary a document with to_tsvector() and see if stopwords pop out. In my experience they do. In my experience, the simple dictionary just breaks the document down into the space etc. separated words in the document. It doesn't analyze further. That's my experience too, I just want to make sure it doesn't actually have any stopwords which I've missed. Trying many phrases and checking for stopwords isn't really proving anything. Can anybody confirm the simple dict. only lowercases the words and uniques them? -- Andreas Joseph Krogh andr...@officenet.no -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Clarification of the simple dictionary
Don't guess, but read the docs: http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY 12.6.2. Simple Dictionary: The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the lower-cased form of the word is returned as the normalized lexeme. Alternatively, the dictionary can be configured to report non-stop-words as unrecognized, allowing them to be passed on to the next dictionary in the list.

d=# \dFd+ simple
                                   List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                 Description
------------+--------+-------------------+--------------+--------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword

By default it has no Init options, so it doesn't check for stopwords. On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote: On 07/22/2010 06:27 PM, John Gage wrote: The easiest way to look at this is to give the simple dictionary a document with to_tsvector() and see if stopwords pop out. In my experience they do. In my experience, the simple dictionary just breaks the document down into the space etc. separated words in the document. It doesn't analyze further. That's my experience too, I just want to make sure it doesn't actually have any stopwords which I've missed. Trying many phrases and checking for stopwords isn't really proving anything. Can anybody confirm the simple dict. only lowercases the words and uniques them? Regards, Oleg _________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
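Oleg's point can be checked directly with ts_lexize(): the built-in simple dictionary has no stopword file, so nothing is discarded, while a copy of the template can be given one. The dictionary name my_simple below is made up for illustration:

```sql
SELECT ts_lexize('simple', 'The');    -- built-in dictionary: returns {the}

-- A copy of the template that *does* check a stopword file
-- (english.stop from $SHAREDIR/tsearch_data):
CREATE TEXT SEARCH DICTIONARY my_simple (
    TEMPLATE  = pg_catalog.simple,
    STOPWORDS = english
);
SELECT ts_lexize('my_simple', 'The'); -- stopword: returns the empty array {}
```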
Re: [GENERAL] Clarification of the simple dictionary
On 07/22/2010 07:44 PM, Oleg Bartunov wrote: Don't guess, but read docs http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY 12.6.2. Simple Dictionary The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the lower-cased form of the word is returned as the normalized lexeme. Alternatively, the dictionary can be configured to report non-stop-words as unrecognized, allowing them to be passed on to the next dictionary in the list. d=# \dFd+ simple List of text search dictionaries Schema | Name | Template | Init options | Description: pg_catalog | simple | pg_catalog.simple | | simple dictionary: just lower case and check for stopword By default it has no Init options, so it doesn't check for stopwords. Guess what - I *have* read the docs, which say "...and checking it against a file of stop words". What was unclear to me was whether or not it is configured with a stopwords-file by default, which, I understand from your reply, it is not. Very good, fits my needs like a glove :-) It might be worth considering updating the docs to make this clearer? So - can we rely on simple remaining this way forever (no Init options) or is it better to make a copy of it with the same properties as today? It seems simple + the unaccent dict. available in 9.0 saves my day, thanks Mr. Bartunov.
-- Andreas Joseph Krogh andr...@officenet.no
[GENERAL] varchar[] or text[]
Hi, I know this issue has been controversially discussed. Some see no real benefit of using varchar over text. But I'm asking myself: what if I use them as array values? Is there any difference between varchar[126], varchar[1] and text[] other than the number of chars I can store in them? Performance, memory usage, ...?
Re: [GENERAL] varchar[] or text[]
This was discussed yesterday and previously. Please read the archives. There is no positive performance reason to use varchar instead of text. On 2010-07-22 05:38:14PM +0200, Armand Turpel wrote: Hi, I know this issue was controversed discussed. Some one see no really benefits of using varchar against text var. But i'm asking me what if i use it as array values. Is there any difference between varchar[126], varchar[1] and text[] else than the number of chars i can store in? Performance, memory usage, ? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- === Peter C. Lai | Bard College at Simon's Rock Systems Administrator| 84 Alford Rd. Information Technology Svcs. | Gt. Barrington, MA 01230 USA peter AT simons-rock.edu | (413) 528-7428 === -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
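One detail worth spelling out for the array case (a sketch; the table name t is invented for illustration): in PostgreSQL, array *dimension* declarations like varchar[126] are documentation only and are not enforced, so the poster's varchar[126] is the same as varchar[]. A per-element length limit has to be written varchar(126)[], and that length check is the only runtime difference from text[]:

```sql
CREATE TABLE t (
    a text[],
    b varchar(126)[],   -- elements limited to 126 characters
    c varchar[126]      -- the 126 here is ignored: same as varchar[]
);
-- The only runtime difference is the per-element length check on b:
INSERT INTO t (b) VALUES (ARRAY[repeat('x', 127)]);  -- fails: value too long
```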
Re: [GENERAL] locating cities within a radius of another
On 22 Jul 2010, at 12:57, Geoffrey wrote: For completeness, the earthdistance module also provides the distance between two lat/longs, the point <@> point syntax is simple to use: http://www.postgresql.org/docs/8.3/static/earthdistance.html Disregard my last post. Sure enough, as soon as I hit send, the light went on... I'm looking at deriving my points for point <@> point from ll_to_earth(). I constructed mine using point(longitude, latitude), where long and lat are double precision, which returns a datatype of type point. ll_to_earth() looks like it returns a datatype of type earth, so not sure if it will work. Maybe things have changed in a recent release, please let me know if so. So an example would be select point(-2.2171,56.8952) <@> point(-1.2833,51.6667) as miles; miles -- 363.202864676916 (1 row) Regards Oliver Kohll oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll www.agilebase.co.uk - software
Re: [GENERAL] How to improve performance in reporting database?
Matthew Wilson wrote: I've heard people talking about using materialized views for this, but that was with Oracle. You can build those manually with PostgreSQL if you really want them: http://wiki.postgresql.org/wiki/Materialized_Views The fundamental architecture is sound for a lot of problems in this area, you just have to figure out how to build them efficiently. In your case, you might consider whether there's a way to update the MV in batches after each data import, rather than rely on triggers to keep the data up to date. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us
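A minimal "roll your own" materialized view along the lines Greg describes might look like this sketch; the sales table, its columns, and the function name are all hypothetical, standing in for whatever the reporting query actually is:

```sql
-- Snapshot the expensive aggregate once, then serve reports from it.
CREATE TABLE mv_daily_sales AS
    SELECT sale_date, product_id, SUM(amount) AS total
    FROM   sales
    GROUP  BY sale_date, product_id;

-- Batch refresh, to be called after each data import
-- (instead of per-row triggers).
CREATE OR REPLACE FUNCTION refresh_mv_daily_sales() RETURNS void AS $$
BEGIN
    TRUNCATE mv_daily_sales;
    INSERT INTO mv_daily_sales
        SELECT sale_date, product_id, SUM(amount)
        FROM   sales
        GROUP  BY sale_date, product_id;
END;
$$ LANGUAGE plpgsql;

SELECT refresh_mv_daily_sales();
```

Because the refresh runs inside one transaction, readers see either the old snapshot or the new one, never a half-truncated table.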
Re: [GENERAL] How to improve performance in reporting database?
On Thu, Jul 22, 2010 at 8:45 AM, Matthew Wilson m...@tplus1.com wrote: I have a daily job that pushes data from the production database into the reporting database, which right now, is an exact copy. I have a webapp that builds lots of reports for users. Most of these reports involve elaborate joins of lookup tables and lots of summations, and they take too long to run, even after using everything I know to tune the queries. Since I know this is a read-only data, it seems like I should be able to speed everything up dramatically if I run the queries offline and then save the results into new tables. Then the web app could just grab the cached results out of these new tables and then spit them out quickly. I've heard people talking about using materialized views for this, but that was with Oracle. What's the postgresql way here? http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views They're kinda roll your own, but they're not that hard to do. More generally, any advice on running reporting databases well is welcome. Throw more drives and RAM at the problem, and use materialized views. Also you're often better off with fewer faster cpus than more slower ones for reporting servers (the opposite of OLTP where number of cores is far more important.) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Are identical subqueries in unioned statements nonrepeatable?
Excerpts from Derrick Rice's message of jue jul 22 12:27:31 -0400 2010: Is it possible for the contents of reference_table to differ from the first select to the select on the right hand side of the union? (e.g. because some other transaction committed additional rows). No. If it is not possible, why? Is it because a single query always executes with serializable (effective) isolation? Yes. (Actually: it's because a query is always executed with a single snapshot). Is it because postgresql recognizes that the query is repeated and uses a single result set in both sides of the union? No. Is this behavior that is part of postgresql intentionally, or a side effect that I should not rely on? It is intentional and will not be changed. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
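For readers who would rather make the single evaluation visible in the query text than rely (correctly) on the single snapshot, a common table expression, available from PostgreSQL 8.4, names the subquery once and reuses it on both sides of the union; in these releases a non-recursive WITH query is evaluated at most once per statement. Table and column names are taken from Derrick's example:

```sql
WITH ref AS (SELECT id FROM reference_table)
SELECT true  AS from_one, table_one.*
FROM   table_one
WHERE  table_one.id NOT IN (SELECT id FROM ref)
UNION ALL
SELECT false AS from_one, table_two.*
FROM   table_two
WHERE  table_two.id NOT IN (SELECT id FROM ref);
```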
Re: [GENERAL] Insert and Retrieve unsigned char sequences using C
2010/7/22 Vinícius Soares vinicius...@yahoo.com.br: Hey, thanks for your response. I did it: S8 sql[1500] = "insert into t values ( E'"; U8 *msg; msg = PQescapeByteaConn(conn, pending_cmd->cmd.value, sizeof(msg_cmd_t), &to_length); for (i=0; i < sizeof(msg_cmd_t); i++){ S8 str[20] = ""; sprintf(str, "%c", *(msg+i) ); strcat(sql, str); } strcat(sql, "' );"); PQexec(conn, sql); But it is very strange because sometimes it works but other times it does not work. is it right? That code doesn't look right: you need to make sure your 'to' buffer is big enough: it has to be at least (2*N)+1 where N is the input size. It returns a size_t, not a char*, and you should be able to just sprintf the 'to' into your query, not copy the chars in a loop. see the following fragment: #define ARGSZ 64 char my_bytea[ARGSZ]; char escaped_bytea[(2*ARGSZ)+1]; int error; size_t nbytes; nbytes = PQescapeStringConn(conn, escaped_bytea, my_bytea, sizeof(my_bytea), &error); if(error != 0) // handle error sprintf(querybuf, "insert into foo(bytea_col) values (E'%s')", escaped_bytea); like I said earlier, this is just about the absolute worst way to transfer a bytea to the server. I had to look up the docs for PQescapeStringConn -- I've never once used it in my entire life (or its even more evil cousin, PQescapeString). merlin
Re: [GENERAL] How to distribute quantity if same product is in multiple rows
(anonymous) wrote: It can be done in SQL: SUM(kogus) OVER (PARTITION BY toode ORDER BY ID) - kogus (*1) will give you the running sum of the product up to that row. You can then subtract that value from the delivered quantity to calculate the delivered quantity for the current row. But doing so automatically is probably bad. For example, if a user has a purchase order with one position of two pieces and one position of four, it is very likely that when a shipment of four pieces arrives, the latter position shall be marked as delivered. So I would leave the decision to the user. If four pieces arrived, the first position of 2 pieces should be marked as delivered. The second position of 4 pieces should be marked as partially delivered by setting the undelivered quantity of this row to 2. How can your suggestion be used for this? Que? You take the query above, join it in the UPDATE and set the delivered quantity to the minimum of the ordered quantity and taitmkogus - sumkogus. How can this be implemented in PostgreSQL 8.1, 8.2, 8.3? [...] An example for calculating running sums without window functions can be found at URI:http://archives.postgresql.org/pgsql-sql/2001-07/msg00152.php. I would rather use a PL/pgSQL function in this case, though. Tim
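For 8.1-8.3, where window functions are not available, the running sum can be sketched with a correlated subquery instead. This is an assumption-laden illustration, not the code from the linked archive message; the table name order_rows and the column names follow the thread:

```sql
-- Running sum of earlier rows for the same product, without window
-- functions.  O(n^2) per product, fine for typical purchase-order sizes.
SELECT o.id, o.toode, o.kogus,
       COALESCE((SELECT SUM(p.kogus)
                 FROM   order_rows p
                 WHERE  p.toode = o.toode
                 AND    p.id    < o.id), 0) AS qty_before_this_row
FROM   order_rows o
ORDER  BY o.toode, o.id;
```

The same subquery can be joined into an UPDATE, setting each row's delivered quantity to LEAST(kogus, taitmkogus - qty_before_this_row), as Tim describes.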
Re: [GENERAL] locating cities within a radius of another
Oliver Kohll - Mailing Lists wrote: On 22 Jul 2010, at 12:57, Geoffrey wrote: For completeness, the earthdistance module also provides the distance between two lat/longs, the point <@> point syntax is simple to use: http://www.postgresql.org/docs/8.3/static/earthdistance.html Disregard my last post. Sure enough, as soon as I hit send, the light went on... I'm looking at deriving my points for point <@> point from ll_to_earth(). I constructed mine using point(longitude, latitude), where long and lat are double precision, which returns a datatype of type point. ll_to_earth() looks like it returns a datatype of type earth, so not sure if it will work. Maybe things have changed in a recent release, please let me know if so. So an example would be select point(-2.2171,56.8952) <@> point(-1.2833,51.6667) as miles; miles -- 363.202864676916 (1 row) Perfect, that appears to work for me as well, thanks. Regards Oliver Kohll oli...@agilebase.co.uk mailto:oli...@agilebase.co.uk / +44(0)7814 828608 / skype:okohll www.agilebase.co.uk http://www.agilebase.co.uk - software -- Until later, Geoffrey I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them. - Thomas Jefferson
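To put the two halves of the earthdistance module side by side: the <@> operator works on (longitude, latitude) points and returns statute miles, while earth_distance() works on the earth values produced by ll_to_earth(latitude, longitude) and returns metres (note the reversed argument order: latitude first). A sketch using the coordinates from the thread:

```sql
-- Point-based: point(longitude, latitude), result in statute miles (~363).
SELECT point(-2.2171, 56.8952) <@> point(-1.2833, 51.6667) AS miles;

-- Cube-based: ll_to_earth(latitude, longitude), result in metres.
SELECT earth_distance(ll_to_earth(56.8952, -2.2171),
                      ll_to_earth(51.6667, -1.2833)) AS metres;
```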
Re: [GENERAL] Clarification of the simple dictionary
Andreas, I'd create my own copy of the dictionary, to be independent of system changes. Oleg On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote: On 07/22/2010 07:44 PM, Oleg Bartunov wrote: Don't guess, but read docs http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY 12.6.2. Simple Dictionary The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the lower-cased form of the word is returned as the normalized lexeme. Alternatively, the dictionary can be configured to report non-stop-words as unrecognized, allowing them to be passed on to the next dictionary in the list. d=# \dFd+ simple List of text search dictionaries Schema | Name | Template | Init options | Description: pg_catalog | simple | pg_catalog.simple | | simple dictionary: just lower case and check for stopword By default it has no Init options, so it doesn't check for stopwords. Guess what - I *have* read the docs which say ...and checking it against a file of stop words. What was unclear to me was whether or not it was configured with a stopwords-file by default, which is not the case I understand from your reply. Very good, fits my needs like a glove :-) It might be worth considering updating the docs to make this clearer? So - can we rely on simple to remain this way forever (no Init options) or is it better to make a copy of it with the same properties as today? It seems simple + the unaccent dict. available in 9.0 saves my day, thanks Mr. Bartunov.
Regards, Oleg
[GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
Hello, I have a strange problem. I test/optimize my queries with EXPLAIN ANALYZE. I get for example: Total runtime: 40.794 ms But when I run the query without EXPLAIN ANALYZE I get, for example: Time: 539.252 ms The query returns 33 rows. Why? I do checks with psql connected to the postgresql server using a socket. No SSL. Using PostgreSQL 8.4.4. -- Piotr Gasidło
Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
On 22 July 2010 21:24, Piotr Gasidło qua...@barbara.eu.org wrote: Hello, I have a strange problem. I test/optimize my queries with EXPLAIN ANALYZE. I get for example: Total runtime: 40.794 ms But when I run the query without EXPLAIN ANALYZE I get, for example: Time: 539.252 ms The query returns 33 rows. Why? I do checks with psql connected to the postgresql server using a socket. No SSL. Using PostgreSQL 8.4.4. Hi, maybe the query waits on a lock, or maybe the returned rows are very big. regards Szymon Guz
Re: [GENERAL] Clarification of the simple dictionary
By default it has no Init options, so it doesn't check for stopwords. In the first place, this functionality is a rip-snorting home run on Postgres. I congratulate Oleg who I believe is one of the authors. In the second, I too had not read (carefully) the documentation and am very happy to find that I can eliminate stop words with 'simple'. That will be a tremendous convenience going forward. It turns out that using 'english' and getting stemmed lexemes is extremely convenient too, but this functionality in 'simple' is excellent. Thanks, John -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
On 22 July 2010 21:34, Szymon Guz mabew...@gmail.com wrote: maybe the query waits on a lock, or maybe the returned rows are very big. So shouldn't EXPLAIN ANALYZE also be affected by waiting for a lock? The row has width = 313, so it's not big. I've reduced it to width = 12 (only one column, int4 type), and get EXPLAIN ANALYZE: Total runtime: 14.788 ms And plain SELECT returns: Time: 456,528 ms Or maybe Total runtime is not what I thought it was, and I should look at the psql \timing result, which for EXPLAIN ANALYZE is nearly the same as for SELECT: Time: 402,675 ms -- Piotr Gasidło
[GENERAL] Need some help on PG database transfer
Hi all, I am doing some tests on file-level copying for one of our databases on the Windows platform. Here is what I want to achieve: copy the files within the PG tablespace folder where the PG database resides to a target machine, and attach the copied data to the target PG database. What I have done is 1. Database: ABCD, OID: 12345, tablespace: e:\pg_data\, obviously, e:\pg_data has a subfolder named 12345. 2. create a database in the target, named , and it will create its own OID, for example, 16333; the folder 16333 will reside under the BASE folder. 3. then I stop the PG service, delete all files under ..\16333\*.*, 4. copy all files from e:\pg_data\12345\*.* to the folder on the target server, ..\16333\. 5. start the PG service on the target machine. It looks like it picks up most of the tables from the source database, ABCD. But it is missing tables and functions, compared to the source. Is replicating data like that doable? Does a PG tablespace contain all the data/tables in its tablespace folder? Please comment. Thanks.
Re: [GENERAL] Need some help on PG database transfer
On Thu, Jul 22, 2010 at 2:08 PM, Steeles stee...@gmail.com wrote: Hi all, I am doing some tests on the file level copying for one of database in windows platform. Here is what I want to achieve. copy files within PG tablespace folder where PG database resides to target machine. I want to attache the copied data to target PG database. That won't work. You have to pg_dump out the tables / tablespace and then psql or pg_restore it to the other db. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bitmask trickiness
On Thu, Jul 22, 2010 at 4:09 PM, Howard Rogers h...@diznix.com wrote: On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger peter.hunsber...@gmail.com wrote: On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote: there's a room-full of users who can look at code '4097' and know precisely what it means and would be mortified if I suddenly started displaying exactly the same meanings in what, to them, would look like utter gibberish. In that case, you shouldn't be describing the column as some encoded bit of magic here. It clearly has some some semantic meaning which gives you a reason to want to keep it that way. Though why your users are dealing with the raw values as stored in the database may be another issue to deal with: Personally, I'd say store it in the way that is easiest for your application logic to deal with, display it in the form that is easiest for your users to deal with. The are often two completely different things... -- Peter Hunsberger -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
On 22 July 2010 21:46, Piotr Gasidło qua...@barbara.eu.org wrote: On 22 July 2010 21:34, Szymon Guz mabew...@gmail.com wrote: (...) Something new. The query runs against a table which has been partitioned into a lot of small tables. When I combine the data into one table, the execution time of EXPLAIN ANALYZE matches the real SELECT timing. On partitioned: EXPLAIN ANALYZE SELECT ... Total runtime: 14.790 ms Time: 291,637 ms On one table with the data from all partitions and the same indexes: EXPLAIN ANALYZE SELECT ... Total runtime: 16.418 ms Time: 17,371 ms Can someone give me a clue why EXPLAIN ANALYZE doesn't work correctly with partitions? -- Piotr Gasidło
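A likely explanation, not stated in the thread: EXPLAIN ANALYZE's "Total runtime" covers execution only, while psql's \timing also includes parse, plan and network time. With many partitions, planning (constraint exclusion over every child table) can dominate, which fits the numbers above. One sketch to confirm this on 8.4, where PREPARE plans the statement once up front (table and column names are illustrative):

```sql
\timing
-- Planning cost is paid here, at PREPARE time:
PREPARE q(int) AS SELECT v FROM partitioned_table WHERE cell_id = $1;
-- EXECUTE reuses the stored plan; if this takes ~15 ms rather than
-- ~290 ms, the gap was planner overhead, not execution.
EXECUTE q(42);
```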
[GENERAL] Question about SCO openserver and postgres...
Hi! I have a problem with the max postgres connections on SCO Openserver 5.0.7, so... my boss decided to buy SCO Openserver 6.0, but this version comes in 2 editions: Starter and Enterprise. If SCO 5.0.7 only allows 95 (-3 used by superuser) connections to postgres... Do you know how many connections to postgres I can have with OpenServer in the Starter Edition or Enterprise edition?
[GENERAL] psql problem
Hi, System information: - psql 8.4.4 on a client with CentOS 5.5 (64 bits) - postgres 8.4.4 on the server with CentOS 5.5 (64 bits) - the client is connected with vpn I have a script to create a table with some comments in front. When I use the command 'psql -f script.sql' to load it, it hangs. However, if I remove the comments, OR remove some of the columns from the table, it works okay. It looks to me like psql hangs when the script file is large. I tried 'psql < script.sql' and 'cat script.sql | psql' with the same result. However, when I tried it on another client host (CentOS 5.5 32 bits), I didn't see this problem. Any idea and suggestion? Thanks, Gary
Re: [GENERAL] Bitmask trickiness
On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger peter.hunsber...@gmail.com wrote: On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote: On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote: Why on Earth would I want to store this sort of stuff in a bit string?! Because you are manipulating bits and not integers? I guess there are 10 kinds of people, those who like to think in binary and those who don't. I don't know about you, but I find looking at 21205 a darn'd sight easier than staring blankly at 101001011010101!! [snip lots of stuff] Note you can cast integer to bitstring, but there may be some odd behaviour for sign bits and such. Which is again why I'd use the right type for the job, bit string. But it's your project. Quoting... Because you are manipulating bits and not integers? I guess there are 10 kinds of people, those who like to think in binary and those who don't. Er, no. 21205 is not an integer. It's an encoded bit of magic. In that case your database design is fundamentally broken. A database should have content fields that map to the needs of the application. As you describe your application requirements, that is a bit string and not an integer. Use bit strings and your application logic is transparent, obvious and easy to maintain. Use integers and you have to resort to magic. As you say, it's your choice, but you came here looking for advice and the advice you were given is very good -- Peter Hunsberger Hi Peter: It wasn't, as the original poster pointed out, 'advice' that was given so much as personal preference. Had someone said, 'ah, but you see storing your 15 meanings in decimal uses up 5 bytes, whereas a bitstring only requires 15 bits, and over 10,000,000 records, the saving of 3 bytes per record adds up...', then that would be technical advice I could listen to, assess and make a call on. But simply saying your design is broken... wo! 
might well scare the children, but doesn't really do anything for me, because I know for a certainty that it's not broken at all. It comes down to this: I can do Boyce-Codd normal form in my sleep (...and falling asleep happens quite frequently when doing it, strangely enough), and have been doing so since 1987. I'm certainly not perfect, but I reckon I can tell from a mile away when one of my designs is broken, as you put it, and this one isn't. I haven't even begun to describe a scintilla of a percentage point of the design decisions this thing has to deal with, nor the fact that it's been running quite happily in this manner for a good couple of years... so you'll just have to take it from me that there's a room-full of users who can look at code '4097' and know precisely what it means, and who would be mortified if I suddenly started displaying exactly the same meanings in what, to them, would look like utter gibberish. Unless you, or someone else, can come up with some hard, *technical* facts as to why working with bitstring encodings of meaning is so much better than working in decimal, we're sticking with the decimal representation. I'll buy "you're forever doing implicit casts which are poor performers", or "implicit casts might break in a future release", or "it's costing you three bytes per record"... or anything else in that vein. But matters of transparency and ease of maintenance are entirely subjective things (about which I sought no advice at all, incidentally), and what works for you on those scores doesn't work for me. Regards HJR
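Whichever representation wins the argument, the search predicates are mechanical in both: PostgreSQL has a bitwise AND operator (&) for integers as well as for bit strings. A sketch with hypothetical flag values (the table docs and its columns are invented for illustration; 4097 = 4096 | 1, i.e. two flags set, echoing the thread's example code):

```sql
-- Integer representation:
SELECT * FROM docs WHERE flags & 4097 = 4097;  -- rows with both flags set
SELECT * FROM docs WHERE flags & 4097 <> 0;    -- rows with either flag set

-- The equivalent test on a bit(15) column:
SELECT * FROM docs
WHERE  bflags & B'001000000000001' = B'001000000000001';
```

Either way the predicate is not indexable by a plain b-tree, so at 10M rows a partial or expression index per flag may matter more than the choice of type.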
Re: [GENERAL] optimizing daily data storage in Pg
On 7/22/2010 9:41 AM, P Kishor wrote: I have been struggling with this for a while now, have even gone down a few paths but struck out, so I turn now to the community for ideas. First, the problem: Store six daily variables for ~ 25 years for cells in a grid. * Number of vars = 6 * Number of cells ~ 13 million * Number of days ~ 9125 (25 * 365) Optimize the store for two different kinds of queries: Query one: Retrieve the value of a single var for all or a portion of the cells for a single day. This is analogous to an image where every pixel is the value of a single var. SELECT var FROM d WHERE yr = ? AND yday = ?; SELECT var FROM d WHERE yr = ? AND yday = ? AND cell_id IN (?,?,?...); Query two: Retrieve values for all the days or a duration of days for a single var for a single cell. This is like grabbing a column out of a table in which each row holds all the vars for a single day. SELECT var FROM d WHERE cell_id = ?; SELECT var FROM d WHERE cell_id IN (?,?,?...); First, I must admit to not reading your entire email. Second, Query 1 should be fast, regardless of how you lay out the tables. Third, Query 2 will return 13M rows? I don't think it matters how you lay out the tables, returning 13M rows is always going to be slow. -Andy
Re: [GENERAL] psql problem
On Jul 22, 5:03 pm, Gary Fu gary...@sigmaspace.com wrote:
> Hi,
>
> System information:
> - psql 8.4.4 on a client with CentOS 5.5 (64 bits)
> - postgres 8.4.4 on the server with CentOS 5.5 (64 bits)
> - the client is connected with vpn
>
> I have a script to create a table with some comments in front. When I use the command 'psql -f script.sql' to load it, it hangs. However, if I remove the comments, OR remove some of the columns from the table, it works okay. It looks to me like psql hangs with a large script file. I tried 'psql < script.sql' and 'cat script.sql | psql' with the same result. However, on another client host (CentOS 5.5, 32 bits), I don't see this problem.
>
> Any ideas and suggestions?
>
> Thanks,
> Gary

Are you sure it is hanging? Maybe piping into psql via pv (http://www.ivarch.com/programs/pv.shtml) could give some insight as to where it might be hanging, or whether it is just going slowly. E.g., if your script is 240500 bytes long, the following will give a simple progress meter, elapsed time, and ETA display:

cat yourscript.file | pv -t -r -e -b -s 240500 | psql
Re: [GENERAL] Bitmask trickiness
On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:

On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger peter.hunsber...@gmail.com wrote:

On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:

Why on Earth would I want to store this sort of stuff in a bit string?!

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!

<snip lots of stuff>

Note you can cast integer to bitstring, but there may be some odd behaviour for sign bits and such. Which is again why I'd use the right type for the job, bit string. But it's your project.

Quoting...

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.

In that case your database design is fundamentally broken. A database should have content fields that map to the needs of the application. As you describe your application requirements, that is a bit string and not an integer. Use bit strings and your application logic is transparent, obvious and easy to maintain. Use integers and you have to resort to magic. As you say, it's your choice, but you came here looking for advice, and the advice you were given is very good.

-- Peter Hunsberger

Hi Peter:

It wasn't, as the original poster pointed out, 'advice' that was given so much as personal preference. Had someone said, 'ah, but you see, storing your 15 meanings in decimal uses up 5 bytes, whereas a bitstring only requires 15 bits, and over 10,000,000 records the saving of 3 bytes per record adds up...', then that would be technical advice I could listen to, assess, and make a call on.

You do realize the first page I linked to told you that, right? It's not a particularly big page. I had made the erroneous assumption you'd read the link I posted.

But simply saying your design is broken... wo! might well scare the children, but doesn't really do anything for me, because I know for a certainty that it's not broken at all.

I asked if there was a reason you were avoiding bit strings. Hardly a "your design is broken" point. You've now said why you are not using the type that was designed to handle bit strings for bit strings. I personally would store them as bit strings and change representation for users. There are some issues that come up if your bit strings are long enough to get close to the last bit in an integer (also mentioned on the links I posted that didn't get read). But other than that it should work fine.
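Scott's sign-bit caveat is easy to demonstrate outside the database. A hedged sketch in Python (the flag names are invented for illustration; a signed 32-bit column like Postgres's int4 is modeled with a struct round-trip):

```python
import struct

# Integer bitmask handling in the style of the 21205-type codes discussed
# above. Flag names and positions are invented for illustration.
COLOUR = 1 << 0
MONO   = 1 << 2
TIFF   = 1 << 4

record = COLOUR | TIFF          # stored as a plain integer
assert record & COLOUR          # test a flag with bitwise AND
assert not (record & MONO)

# The caveat: in a signed 32-bit column the top bit flips the sign, so the
# "last bit" stops being a simple positive code.
top_bit = 1 << 31
as_int4 = struct.unpack('<i', struct.pack('<I', top_bit))[0]
print(record, as_int4)
```

As long as the masks stay well below the top bit of the column type, the integer encoding and the bit-string encoding behave identically under AND/OR tests.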
Re: [GENERAL] Question about SCO openserver and postgres...
On Thu, Jul 22, 2010 at 3:24 PM, Edmundo Robles L. erob...@sensacd.com.mx wrote:
> Hi! I have a problem with the max postgres connections on SCO OpenServer 5.0.7, so... my boss decided to buy SCO OpenServer 6.0, but this version comes in 2 editions: Starter and Enterprise. If SCO 5.0.7 only allows 95 (-3 used by superuser) connections to postgres... do you know how many connections to postgres I can have with OpenServer in Starter Edition or Enterprise Edition?

Are you sure this isn't just a limit in max_connections in postgresql.conf?
[GENERAL] Changing pg_attribute.attislocal
Over time, I mistakenly did something like this:

CREATE TABLE a ();
CREATE TABLE b () INHERITS(a);
ALTER TABLE b ADD col int;
ALTER TABLE a ADD col int;

where I should have left out the third statement. Not a great loss, the only consequence I've observed being pg_attribute.attislocal = true, so dropping a.col will not drop b.col.

Is there a DDL way to change that, short of dropping the column from both tables and re-adding it to the parent alone? If not, what's the danger of updating attislocal directly? If it makes any difference, I don't actually plan to drop the column anytime soon.

Thanks,
nm
Re: [GENERAL] psql problem
On 07/22/2010 04:03 PM, Gary Fu wrote:
> Hi,
>
> System information:
> - psql 8.4.4 on a client with CentOS 5.5 (64 bits)
> - postgres 8.4.4 on the server with CentOS 5.5 (64 bits)
> - the client is connected with vpn
>
> I have a script to create a table with some comments in front. When I use the command 'psql -f script.sql' to load it, it hangs. However, if I remove the comments, OR remove some of the columns from the table, it works okay. It looks to me like psql hangs with a large script file. I tried 'psql < script.sql' and 'cat script.sql | psql' with the same result. However, on another client host (CentOS 5.5, 32 bits), I don't see this problem.
>
> Any ideas and suggestions?
>
> Thanks,
> Gary

Line endings? How about a sample? What comment style: -- /* (* # ; ' //

-Andy
Re: [GENERAL] Bitmask trickiness
On Thu, Jul 22, 2010 at 10:27 PM, Howard Rogers h...@diznix.com wrote:

On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe scott.marl...@gmail.com wrote:

On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:

On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger peter.hunsber...@gmail.com wrote:

On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:

Why on Earth would I want to store this sort of stuff in a bit string?!

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!

<snip lots of stuff>

Note you can cast integer to bitstring, but there may be some odd behaviour for sign bits and such. Which is again why I'd use the right type for the job, bit string. But it's your project.

Quoting...

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.

In that case your database design is fundamentally broken. A database should have content fields that map to the needs of the application. As you describe your application requirements, that is a bit string and not an integer. Use bit strings and your application logic is transparent, obvious and easy to maintain. Use integers and you have to resort to magic. As you say, it's your choice, but you came here looking for advice, and the advice you were given is very good.

-- Peter Hunsberger

Hi Peter:

It wasn't, as the original poster pointed out, 'advice' that was given so much as personal preference. Had someone said, 'ah, but you see, storing your 15 meanings in decimal uses up 5 bytes, whereas a bitstring only requires 15 bits, and over 10,000,000 records the saving of 3 bytes per record adds up...', then that would be technical advice I could listen to, assess, and make a call on.

You do realize the first page I linked to told you that, right? It's not a particularly big page. I had made the erroneous assumption you'd read the link I posted.

If you mean, did I read the bit in the doco where it said nothing at all in the 'these are great advantages' style I've just described, but instead makes the fairly obvious point that a bit string takes 8 bits to store a group of 8 bits (well, stone me!!)

Wow, I'm surprised you get any help with your attitude. I posted a link and asked a question and right up front got my head handed to me. To quote: "Why on Earth would I want to store this sort of stuff in a bit string?! I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!" Like I'd somehow bitten your hand when I asked my question.

PLUS has extra overhead, then yes, I did read that part of your first link... and nevertheless concluded that, overall, there is... er, some extra overhead in storing bitstrings.

Well, your initial answer certainly didn't give ANY idea that you'd read that page.

So what precisely about that first article, which I did indeed read, would you have expected to lead me to the conclusion that I'd SAVE significant amounts of space or find some other technically-compelling reason for switching?

I didn't expect such. I asked why you weren't using them, and gave you some links to read on it. It clearly states that bit strings use a bit per bit, plus some overhead. Now, I had no idea if you were dealing with bigints and 60-bit strings or 5-bit strings. In fact, you did little to really describe your project and preferences in your post. Which is why my response was short and concise; I had little to go on.

My point is that there's nothing much in it, storage-wise, either way.

Well, there is the fact that bit strings can restrict the size of the entry, so you don't accidentally get an int stored that's got more bits than your model can handle. There's also the issue that if/when you ever get close to the last bit in an int, bitstrings may behave oddly because of sign issues.

So there's no compelling technical reason to switch.

I never said there was. I simply asked a question, and got my hand bitten.

And without a technically-compelling reason, the rest of the post I was referring to simply boiled down, as far as I could tell, to a matter of personal preference. No less valid for that, of course. But ultimately, not something that would hold much sway with me.

Sure, fine, whatever you want. I wasn't trying to convince you either way. I do think using the right type for the job makes more sense, but again, it's personal preference.

But simply saying your design is broken... wo! might well scare the children, but
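For what it's worth, the decimal-versus-bitstring readability dispute running through this thread is purely a matter of presentation: the two forms round-trip losslessly, so either side could store one representation and display the other at the edge. A small illustration using the thread's own 21205 example:

```python
# The decimal code and the 15-bit string are the same value in two notations.
code = 21205
bits = format(code, '015b')   # decimal -> zero-padded 15-character bit string
print(bits)                   # the form Howard finds unreadable

# The conversion is exact in both directions, so storage format and
# display format can be chosen independently.
assert int(bits, 2) == code
```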
Re: [GENERAL] Difference between EXPLAIN ANALYZE SELECT ... total runtime and SELECT ... runtime
In response to Piotr Gasidło:
> Hello, I have a strange problem. I test/optimize my queries with EXPLAIN ANALYZE. I get, for example:
>
> Total runtime: 40.794 ms
>
> But when I run the query without EXPLAIN ANALYZE I get, for example:
>
> Time: 539.252 ms
>
> The query returns 33 rows. Why?

Maybe cheap or virtual hardware? There are some issues with functions like gettimeofday(), see here:
http://archives.postgresql.org/pgsql-general/2007-01/msg01653.php
(and the whole thread)

Regards, Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431 2EB0 389D 1DC2 3172 0C99
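The linked thread concerns the cost of gettimeofday(), which EXPLAIN ANALYZE calls around every plan-node invocation; on hardware or virtual machines with a slow clock source, instrumented and wall-clock timings can diverge noticeably. A rough, hedged way to gauge the per-call cost of reading the clock on a given machine (Python's timer standing in for gettimeofday()):

```python
import time

# Measure the per-call cost of reading the clock, the same kind of call
# EXPLAIN ANALYZE issues twice around every plan-node invocation.
N = 100_000
start = time.perf_counter()
for _ in range(N):
    time.perf_counter()
end = time.perf_counter()

per_call = (end - start) / N
print(f"~{per_call * 1e9:.0f} ns per clock read")
```

On a plan that invokes nodes millions of times, even tens of nanoseconds per read adds up, which is one reason instrumented and uninstrumented timings can disagree.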
Re: [GENERAL] Bitmask trickiness
On Fri, Jul 23, 2010 at 8:37 AM, Scott Marlowe scott.marl...@gmail.com wrote:

On Thu, Jul 22, 2010 at 3:09 PM, Howard Rogers h...@diznix.com wrote:

On Fri, Jul 23, 2010 at 12:35 AM, Peter Hunsberger peter.hunsber...@gmail.com wrote:

On Wed, Jul 21, 2010 at 11:13 PM, Howard Rogers h...@diznix.com wrote:

On Thu, Jul 22, 2010 at 1:02 PM, Scott Marlowe scott.marl...@gmail.com wrote:

Why on Earth would I want to store this sort of stuff in a bit string?!

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

I don't know about you, but I find looking at 21205 a darned sight easier than staring blankly at 101001011010101!!

<snip lots of stuff>

Note you can cast integer to bitstring, but there may be some odd behaviour for sign bits and such. Which is again why I'd use the right type for the job, bit string. But it's your project.

Quoting...

Because you are manipulating bits and not integers? I guess there are 10 kinds of people: those who like to think in binary and those who don't.

Er, no. 21205 is not an integer. It's an encoded bit of magic.

In that case your database design is fundamentally broken. A database should have content fields that map to the needs of the application. As you describe your application requirements, that is a bit string and not an integer. Use bit strings and your application logic is transparent, obvious and easy to maintain. Use integers and you have to resort to magic. As you say, it's your choice, but you came here looking for advice, and the advice you were given is very good.

-- Peter Hunsberger

Hi Peter:

It wasn't, as the original poster pointed out, 'advice' that was given so much as personal preference. Had someone said, 'ah, but you see, storing your 15 meanings in decimal uses up 5 bytes, whereas a bitstring only requires 15 bits, and over 10,000,000 records the saving of 3 bytes per record adds up...', then that would be technical advice I could listen to, assess, and make a call on.

You do realize the first page I linked to told you that, right? It's not a particularly big page. I had made the erroneous assumption you'd read the link I posted.

If you mean, did I read the bit in the doco where it said nothing at all in the 'these are great advantages' style I've just described, but instead makes the fairly obvious point that a bit string takes 8 bits to store a group of 8 bits (well, stone me!!) PLUS has extra overhead, then yes, I did read that part of your first link... and nevertheless concluded that, overall, there is... er, some extra overhead in storing bitstrings.

So what precisely about that first article, which I did indeed read, would you have expected to lead me to the conclusion that I'd SAVE significant amounts of space or find some other technically-compelling reason for switching? My point is that there's nothing much in it, storage-wise, either way. So there's no compelling technical reason to switch. And without a technically-compelling reason, the rest of the post I was referring to simply boiled down, as far as I could tell, to a matter of personal preference. No less valid for that, of course. But ultimately, not something that would hold much sway with me.

But simply saying your design is broken... wo! might well scare the children, but doesn't really do anything for me, because I know for a certainty that it's not broken at all.

I asked if there was a reason you were avoiding bit strings. Hardly a "your design is broken" point.

I'm getting a bit fed up with this thread now. It wasn't YOU that ever said 'the design is broken', and I never suggested it was. That was Peter Hunsberger, about three posts up in the thread, who wrote "In that case your database design is fundamentally broken." If you're going to take umbrage at something, please take umbrage at things that were actually directed at you in the first place!

You've now said why you are not using the type that was designed to handle bit strings for bit strings. I personally would store them as bit strings and change representation for users.

I'm a user, too. I get to see this stuff every time I do a select statement. At the command line. Which I use a lot.

There are some issues that come up if your bit strings are long enough to get close to the last bit in an integer (also mentioned on the links I posted that didn't get read).

Don't make false assumptions about other people, please. You don't know what I read or didn't read. Just because you didn't make a compelling technical argument in favour of bitstrings doesn't mean I didn't read the article you linked to... that also didn't make a compelling technical argument in favour of bitstrings.

But other than that it should work fine.

Yes, I know. I've only been using this technique for five years on Oracle! I would be very surprised indeed if it
Re: [GENERAL] optimizing daily data storage in Pg
On Thu, Jul 22, 2010 at 4:56 PM, Andy Colson a...@squeakycode.net wrote:
> On 7/22/2010 9:41 AM, P Kishor wrote:
>> I have been struggling with this for a while now, have even gone down a few paths but struck out, so I turn now to the community for ideas.
>>
>> First, the problem: store six daily variables for ~25 years for cells in a grid.
>>
>> * Number of vars = 6
>> * Number of cells ~ 13 million
>> * Number of days ~ 9125 (25 * 365)
>>
>> Optimize the store for two different kinds of queries:
>>
>> Query one: retrieve the value of a single var for all or a portion of the cells for a single day. This is analogous to an image where every pixel is the value of a single var.
>>
>> SELECT var FROM d WHERE yr = ? AND yday = ?;
>> SELECT var FROM d WHERE yr = ? AND yday = ? AND cell_id IN (?,?,?...);
>>
>> Query two: retrieve values for all the days or a duration of days for a single var for a single cell. This is like grabbing a column out of a table in which each row holds all the vars for a single day.
>>
>> SELECT var FROM d WHERE cell_id = ?;
>> SELECT var FROM d WHERE cell_id IN (?,?,?...);
>
> First, I must admit to not reading your entire email.

I am not sure how to respond to your feedback given that you haven't read the entire email. Nevertheless, thanks for writing...

> Second, Query 1 should be fast, regardless of how you lay out the tables.

It is not fast. Right now I have data for about 250,000 cells loaded. That comes to circa 92 million rows per year. Performance is pretty sucky.

> Third, Query 2 will return 13M rows? I don't think it matters how you lay out the tables, returning 13M rows is always going to be slow.

Yes, I understand that. In reality I will never get 13M rows. For display purposes, I will probably get around 10,000 to 50,000 rows. When more rows are needed, it will be to feed a model, so that can be offline (without an impatient human being waiting on the other end).

Right now, my main problem is that I have either too many rows (~4 B rows) in a manageable number of tables (25 tables), or a manageable number of rows (~13 M rows) in too many tables (~9000 tables).

--
Puneet Kishor http://www.punkish.org
Carbon Model http://carbonmodel.org
Charter Member, Open Source Geospatial Foundation http://www.osgeo.org
Science Commons Fellow, http://sciencecommons.org/about/whoweare/kishor
Nelson Institute, UW-Madison http://www.nelson.wisc.edu
---
Assertions are politics; backing up assertions with evidence is science
===
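One layout not raised in the thread (offered here only as a hedged suggestion, not something from the posts) is one row per cell per year, holding each variable's 365 daily values in an array column; its row count lands between the two extremes described above. Back-of-envelope arithmetic for the full grid:

```python
# Rough row/table counts for candidate layouts of the 13M-cell daily grid.
cells, years, days_per_year = 13_000_000, 25, 365

rows_one_per_cell_day = cells * years * days_per_year  # one row per cell per day
rows_one_per_cell_year = cells * years                 # one row per cell per year,
                                                       # daily values in a 365-element array
tables_one_per_day = years * days_per_year             # the ~9000-tables option

print(f"{rows_one_per_cell_day:,} rows vs "
      f"{rows_one_per_cell_year:,} rows vs "
      f"{tables_one_per_day:,} tables")
```

The trade-off: the array layout makes Query 2 (a cell's time series) a single-row fetch, but Query 1 (one day across many cells) must then extract one array element per row, so it only pays off if the time-series access pattern dominates.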