Re: [GENERAL] work_mem greater than 2GB issue
wickro robwick...@gmail.com writes: Hi everyone, I have a largish table (8GB). I'm doing a very simple single group by on it. I am the only user of this database. If I set work_mem to anything under 2GB (e.g. 1900MB) the postmaster process stops at that value while it's performing its group by. There is only one hash operation so that is what I would expect. But anything larger and it eats up all memory until it can't get any more (around 7.5GB on an 8GB machine). Has anyone experienced anything of this sort before? What does EXPLAIN say for both cases? I suspect what's happening is that the planner is estimating it will need 2GB to hash all the values and in fact it would need 8GB. So for values under 2GB it uses a sort and not a hash at all; for values over 2GB it's trying to use a hash and failing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
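One way to check that theory is to compare the plans at the two settings. A sketch only: the table and column names below are made up, and the plan nodes in the comments are what one would expect, not verified output.

```sql
-- Hypothetical table big_table(item text, ...); compare the chosen plan
-- just below and just above the threshold where behaviour changes.
SET work_mem = '1900MB';
EXPLAIN SELECT item, count(*) FROM big_table GROUP BY item;
-- expect GroupAggregate over a Sort if the planner thinks the hash won't fit

SET work_mem = '2500MB';
EXPLAIN SELECT item, count(*) FROM big_table GROUP BY item;
-- expect HashAggregate, which can blow past work_mem if row estimates are off
```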
Re: [GENERAL] DISTINCT ON without ORDER BY
Martijn van Oosterhout klep...@svana.org writes: SELECT * FROM foo WHERE id in (SELECT max(id) FROM foo GROUP BY bar); Is there a way to achieve the above result without a sort and without a self-join? Something like SELECT bar, (magic_agg_func(foo)).* FROM foo GROUP BY bar where you define an aggregate function magic_agg_func to remember the whole record for the largest value of id. Something like: postgres=# create function magic_transition(a,a) returns a as 'select case when $1.aid > $2.aid then $1 else $2 end' language sql; postgres=# create aggregate magic (a) (sfunc = magic_transition, stype = a); Not sure it'll be faster though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
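Spelled out against a concrete (invented) table definition, the suggestion would look roughly like this. Untested, and it assumes the table's own rowtype serves as both the aggregate's state and argument type:

```sql
-- foo's rowtype doubles as the aggregate's state and argument type.
CREATE TABLE foo (id int, bar int, payload text);

CREATE FUNCTION magic_transition(foo, foo) RETURNS foo AS
  'SELECT CASE WHEN $1.id > $2.id THEN $1 ELSE $2 END'
LANGUAGE sql;

CREATE AGGREGATE magic (foo) (sfunc = magic_transition, stype = foo);

-- One scan of foo, no self-join: the whole row with the largest id per bar.
SELECT bar, (magic(foo)).* FROM foo GROUP BY bar;
```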
Re: [GENERAL] Nooby Q: Should this take five hours? And counting?
Kenneth Tilton kentil...@gmail.com writes: A bit. I killed the indexing and jacked work_mem up to 500mb, indexing then finished in 7 min 25s. Yer a genius! FWIW creating indexes uses maintenance_work_mem. I would not expect changing work_mem to really help much. That's, uh, curious. 2: You've got a slow disk subsystem, if you're already seeing 25% IOWait with only ~2 to 3 megs a second being written. This has been passed along to management for consideration. Depends. If it's all random i/o then 8-12MB/s is about right for 4-6 drives. If there's any sequential i/o mixed in then yeah, it's pretty poor. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] Breaking national language support in Solaris 8
Peter Eisentraut pete...@gmx.net writes: This will, however, only work with GNU Gettext (as used in Linux and BSD distributions) and Solaris 9 or later, and it is not easy to provide a backward compatible mode. Eh? I thought it was trivial to provide a backward compatible mode which is just as good as the existing code. Just use regular gettext on the two strings separately and pick the right one based on the English rule. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] What are the benefits of using a clustered index?
Simon Riggs si...@2ndquadrant.com writes: Just think one index tuple points to more than one heap row. Could you expand on that? Like, uh, I have no idea what you're saying. Less index pointers, smaller index. Are you talking about Heikki's grouped-items index? The trick is: How? But that's a secondary issue to getting it on the TODO list, which is all I'm suggesting at present. Well I think we need to be clear enough at least on the what if not the how. But there's a bit of a fuzzy line between them I admit. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] PostgreSQL versus MySQL for GPS Data
Merlin Moncure mmonc...@gmail.com writes: A good rule of thumb for large is table size > working ram. Huge (really large) is 10x ram. Or better yet, large is data > working ram. Very large is data > directly attached drives... That means that without fairly expensive hardware you start talking about very large at about 4-10 TB. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Query 4-5 times slower after ANALYZE
Bill Moran wmo...@potentialtech.com writes: I opened one of those links figuring I'd take a few minutes to see if I could muster up some advice ... and just started laughing ... definitely not the type of query that one can even understand in just a few minutes! You might consider setting default_statistics_target to 100 and re-analyzing. The estimates don't look too far off but like Bill I haven't analyzed it very carefully. One other thing that stands out: using comparisons like ('now'::date - creation_date) >= expression is going to make it hard to optimize. Better to use something like creation_date <= now() - expression, both because of the now() instead of 'now'::date and because the latter is a comparison on a bare column, which can use an index on creation_date, rather than a comparison on an expression. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
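The rewrite in question, sketched with a hypothetical orders table; the second form lets the planner use a plain index on creation_date:

```sql
CREATE INDEX orders_creation_date_idx ON orders (creation_date);

-- Hard to optimize: the indexed column is buried inside an expression.
SELECT * FROM orders WHERE ('now'::date - creation_date) >= 30;

-- Indexable: bare column on one side, a value computed once on the other.
SELECT * FROM orders WHERE creation_date <= now() - interval '30 days';
```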
Re: [GENERAL] Query 4-5 times slower after ANALYZE
Oh, the other thing you could try experimenting with are these parameters. Your query has *way* more tables than the default values for these, so you would have to raise them substantially. Given that the query was running in 30s you may find that this increases the planning time by more than it saves in the query -- the reason these limits exist at all: geqo_threshold, join_collapse_limit, from_collapse_limit. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] upgrade from 7.4 to 8.3
Marc Cuypers m.cuyp...@mgvd.be writes: Error: ERROR: encoding LATIN9 does not match server's locale nl_BE.utf8 SQL state: XX000 Detail: The server's LC_CTYPE setting requires encoding UTF8. Can i only use nl_BE and UTF-8 now? Why can't i use LATIN9 anymore? Is bacula 8.3 stricter in this respect than 7.4? 8.3 is stricter about checking that the configuration makes sense. But even under 7.4 you would have had problems, you just wouldn't have been forewarned so soon. You would still only be able to use nl_BE.utf8 collation but you would have been allowed to tell the server your data was encoded with latin9. So the collation results would have been nonsensical, i.e. string comparisons would have given incorrect results. If this database is still under development and your schedule allows, one option might be to use 8.4dev from CVS. It should be released sometime in the next 3-6 months and will allow you to have a different encoding and locale for each database. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] intermittant performance problem
Tom Lane t...@sss.pgh.pa.us writes: Mike Charnoky n...@nextbus.com writes: The sampling query which runs really slow on some days looks something like this: INSERT INTO sampled_data (item_name, timestmp, ... ) SELECT item_name, timestmp, ... FROM raw_data WHERE timestmp >= ? AND timestmp < ? AND item_name=? AND some_data_field NOTNULL ORDER BY random() LIMIT ?; Hmph, I'd expect that that would run pretty slowly *all* the time :-(. There's no good way to optimize ORDER BY random(). This seems kind of unlikely but does the parameter to the LIMIT vary a lot? If it's small enough to fit all the chosen records in work_mem then you'll avoid a disk sort and do a top-k scan. If it overflows work_mem then it'll fail over to a full disk sort of all the records picked from raw_data. It does seem much more likely that whatever index you have it using on timestmp or item_name or some_data_field is sometimes being used and sometimes not. Perhaps it's switching from an index on one of those columns to an index on some other column and that's what's throwing it off. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
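One commonly suggested workaround (a sketch only, with invented parameter values): pre-filter with random() so the ORDER BY random() sort only sees a small candidate set rather than every matching row.

```sql
-- Keep roughly 0.1% of matching rows as candidates, then sample from those.
-- The fraction must be tuned so the candidate set comfortably exceeds the LIMIT.
SELECT item_name, timestmp
FROM raw_data
WHERE timestmp >= '2009-03-01' AND timestmp < '2009-03-02'
  AND item_name = 'foo'
  AND some_data_field NOTNULL
  AND random() < 0.001
ORDER BY random()
LIMIT 100;
```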
Re: [GENERAL] postgresql and LDAP
Tom Lane t...@sss.pgh.pa.us writes: JP Fletcher jpfle...@ca.afilias.info writes: I've just compiled 8.3.6 with ldap support yet I get 'FATAL: missing or erroneous pg_hba.conf file' when I try to connect. Can't help you much when you don't show us the pg_hba.conf file ;-) However, a tip that might help is that there should be more information about the problem in the postmaster log. We intentionally don't send details about the conf file's contents to the client... Perhaps we should send a HINT to the client saying to consult the postmaster logs? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] 8.3.6 build error on Debian Lenny
Josh Trutwin j...@trutwins.homeip.net writes: On Tue, 10 Mar 2009 00:49:32 -0400 Tom Lane t...@sss.pgh.pa.us wrote: Josh Trutwin j...@trutwins.homeip.net writes: snip geo_ops.c /usr/lib/gcc/i486-linux-gnu/4.3.2/cc1: symbol lookup error: /usr/lib/libmpfr.so.1: undefined symbol: __gmp_get_memory_functions [ blink... ] There's no reason for a compile to be looking into shared libraries. What I think must have happened here is that this reflects a bogus function call internally in gcc itself. Which probably means that you have a gcc version that's out-of-sync with your libmpfr.so.1 version (whatever the heck that is). I'd suggest asking around among some Debian experts, or just reinstalling instead of upgrading. Searching on Google it seems this is a typical error message when you have a hand-compiled gmp installed locally in /usr/local/lib as well as a system gmp installed in /usr/lib. So you may be getting one version of libmpfr and a mismatched version of libgmp. I would imagine this could cause problems if one is compiled statically and the other dynamically. (Or if they're both static but you list them in the wrong order on the command line.) Reinstalling would be rough - this box is remote, I felt lucky to get through the upgrade over ssh. It's usually unnecessary to reinstall Debian. I've been running unstable since 1996 and have only had to do it once, and only because I had a filesystem corruption. I'll try Debian lists / irc - hopefully don't get snarks. :) Yes well... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] postgresql and LDAP
Emanuel Calvo Franco postgres@gmail.com writes: 2009/3/10 Gregory Stark st...@enterprisedb.com: Tom Lane t...@sss.pgh.pa.us writes: However, a tip that might help is that there should be more information about the problem in the postmaster log. We intentionally don't send details about the conf file's contents to the client... Perhaps we should send a HINT to the client saying to consult the postmaster logs? Egg on my face -- we already *do* have such a hint: ereport(FATAL, (errcode(ERRCODE_CONFIG_FILE_ERROR), errmsg("missing or erroneous pg_hba.conf file"), errhint("See server log for details."))); IMHO I think is quite dangerous because someone with experience could guess some security details. How? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Newbie questions relating to transactions
Carl Sopchak carl.sopc...@cegis123.com writes: Well, the upgrade to 8.3 seemed to rid me of the command limit, but now I'm running out of memory. I have 2Gb physical and 8Gb swap (after adding 4Gb). What do you mean you're running out of memory? For the most part that's only a problem if you've configured Postgres to use more memory than your system can handle -- such as setting work_mem or shared_buffers too large. One area that can cause problems is having too many trigger executions queued up. I don't know if that's what you're running into though. Is there a way for me to run this outside of one huge transaction? This really shouldn't be using more than a few hundred megs of RAM (assuming cursor records are all stored in memory)... Personally I find it much more flexible to implement these types of jobs as external scripts connecting as a client. That lets you stop/start transactions freely. It also allows you to open multiple connections or run the client-side code on a separate machine which can have different resources available. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] Problem with Mauritius summer time (MUST)
Xavier Bugaud xavier.bug...@gloptv.com writes: Hi, Mauritius uses daylight saving since 2008-11. The Olson database has been updated to reflect the Mauritius timezone change at the end of 2008 (maybe since tzdata-2008f or tzdata-2008g, not really sure). PostgreSQL is supposed to have the correct Mauritius timezone since 8.3.5 (http://www.postgresql.org/docs/8.3/interactive/release-8-3-5.html). From a debian box configured in the Mauritius timezone:

$ date
Wed Mar 4 11:10:01 MUST 2009
$ psql -c "SELECT '2009-01-01 00:56:00 MUT'::timestamp"
      timestamp
---------------------
 2009-01-01 00:56:00
(1 row)
$ psql -c "SELECT '2009-01-01 00:56:00 MUST'::timestamp"
ERROR: invalid input syntax for type timestamp: "2009-01-01 00:56:00 MUST"

As you can see PostgreSQL does not recognize MUST (Mauritius Summer Time). For reference, here is the content of the tzdata file related to Mauritius:

# Rule  NAME       FROM  TO    TYPE  IN   ON       AT     SAVE  LETTER/S
Rule    Mauritius  1982  only  -     Oct  10       0:00   1:00  S
Rule    Mauritius  1983  only  -     Mar  21       0:00   0     -
Rule    Mauritius  2008  max   -     Oct  lastSun  2:00s  1:00  S
Rule    Mauritius  2009  max   -     Mar  lastSun  2:00s  0     -
# Zone  NAME              GMTOFF   RULES      FORMAT  [UNTIL]
Zone    Indian/Mauritius  3:50:00  -          LMT     1907  # Port Louis
                          4:00     Mauritius  MU%sT   # Mauritius Time

Is this a bug in PostgreSQL or am I missing something? This problem was encountered with PostgreSQL 8.3.5 and 8.3.6. Hm, you can use Indian/Mauritius to reference the appropriate time zone directly but that doesn't let you specify Mauritius Summer Time explicitly. I think the problem is that MUST is missing from the Default timezone_abbreviations file.

SELECT '2009-01-01 00:56:00 Indian/Mauritius'::timestamp with time zone;
      timestamptz
------------------------
 2008-12-31 19:56:00+00

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
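Until MUST appears in the shipped abbreviation list, two workarounds suggest themselves. The custom-file entry below is an untested guess at the format (offset in seconds, D marking a daylight-saving abbreviation), and the file name is invented:

```sql
-- 1. Reference the zone by full name; PostgreSQL applies the right offset.
SELECT '2009-01-01 00:56:00 Indian/Mauritius'::timestamptz;

-- 2. Copy the Default timezone_abbreviations file, append a line such as
--      MUST  18000  D    # Mauritius Summer Time, UTC+5 (hypothetical entry)
--    and point the server at the new file:
SET timezone_abbreviations = 'MyAbbrevs';
```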
Re: [GENERAL] Strange behavior: row won't delete
Phoenix Kiula phoenix.ki...@gmail.com writes: I guess my question is, how should I remove all pending locks on a table so that I can get on with the rest of the stuff? I mean, even if I can now find an offending RULE on the table, I cannot replace or remove it. You're off on the wrong track. Locks are held by transactions until the transaction commits. You need to find the transactions which are holding these locks and either commit or roll them back. You look in pg_locks to see what locks transactions are holding. In particular look for rows with granted set to t, especially locks on relations and especially ExclusiveLocks. Then you take the pid of those transactions and look in pg_stat_activity to see what they're up to. If they say idle in transaction then they're waiting for the client to do something. If they stay that way for any length of time while holding locks which block other transactions that's bad. Alternately if you see a query in pg_stat_activity which is taking a long time to run you might check whether you have a bad plan or a bad query running while holding locks, effectively doing the same thing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
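The pg_locks/pg_stat_activity cross-reference described above might look like this on an 8.x server (a sketch; procpid and current_query are the 8.x-era column names):

```sql
-- Who holds granted locks, on what relation, and what is that backend doing?
SELECT l.pid, l.locktype, l.mode, c.relname, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
LEFT JOIN pg_class c ON c.oid = l.relation
WHERE l.granted
ORDER BY l.pid;
```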
Re: [GENERAL] funny view/temp table problem with query
Alban Hertroys dal...@solfertje.student.utwente.nl writes: On Feb 26, 2009, at 11:02 AM, Grzegorz Jaśkiewicz wrote: looks like you completely misunderstood my question. I'm not surprised. What do you expect with random capitalisation, random table alias names and random indentation combined with queries getting wrapped by the mailing-list software? Uh, we get a lot of really mangled SQL and explain plans -- I don't see anything wrong with these. If the question was unclear it sounds like it's just because it's a fairly subtle problem and was hard to describe. Needing two cracks at describing the problem is pretty much par for the course here. I haven't tested the query to see what's going on but if the problem is due to random() then in 8.4 you could use WITH to guarantee that the subquery is executed precisely once and the results reused as-is subsequently. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Using xmin to identify last modified rows
Stéphane A. Schildknecht stephane.schildkne...@postgresqlfr.org writes: But, trying the same query on a non-slonified DB, I got an error, as there is no ordering operator for xid. I think that in the slon case, the query uses the implicit cast xid-xxid, and then the operator to sort xxid. You could order by age(xmin) instead. What would be the best way to get last modified rows? I'm not sure using xmin is such a great idea really. It's handy for ad-hoc queries but there are all kinds of cases where it might not give you the results you expect. You probably want to put a timestamp column on your tables and manage the date you put in them according to a policy you control. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
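The timestamp-column approach can be enforced with a trigger so applications can't forget it. A sketch with invented table and function names:

```sql
ALTER TABLE mytable ADD COLUMN last_modified timestamptz NOT NULL DEFAULT now();

CREATE FUNCTION touch_last_modified() RETURNS trigger AS $$
BEGIN
  NEW.last_modified := now();   -- the policy lives here, under your control
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_touch
  BEFORE UPDATE ON mytable
  FOR EACH ROW EXECUTE PROCEDURE touch_last_modified();
```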
Re: [GENERAL] Using xmin to identify last modified rows
Richard Broersma richard.broer...@gmail.com writes: On Wed, Feb 25, 2009 at 5:21 AM, Gregory Stark st...@enterprisedb.com wrote: I'm not sure using xmin is such a great idea really. It's handy for ad-hoc queries but there are all kinds of cases where it might not give you the results you expect. It's been a while since the following emails were written. Has the treatment of xmin changed since then, or is using a timestamp a better practice? http://archives.postgresql.org/pgsql-novice/2007-02/msg00079.php http://archives.postgresql.org/pgsql-hackers/2004-02/msg00654.php Well those emails aren't discussing evaluating when records were updated or deciding which were updated more recently than others. All they say is that in Postgres if ctid and xmin both match then you're looking at the same version of the same record. For a gui table editor or ODBC driver that's an important thing to know. If what you want to do is find records which have been updated for something like a Recently updated pages feature, it's unlikely that the desired behaviour will exactly match how Postgres works. You're better off deciding the policy you want and writing code to implement that. Some examples of how xmin might not do what you expect: The order in which transactions *start* will determine the ordering, not the order in which they commit. If you look at records you've updated in the same transaction it's even possible to see records which come from the future. If any records have been frozen you lose any record of what order they were created in. Another example is that it's impossible to ignore trivial updates -- any update will update xmin no matter how trivial, even if no columns are updated. Worse, in the future there may be changes to database internals which change when xmin is updated which won't match your desired policy. For example if we decide to replace VACUUM FULL with something which does no-op updates instead of moving tuples then you'll find records spontaneously appearing to have been recently updated. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Indexing a Bit String column
George Oakman oakm...@hotmail.com writes: Is it all I need to do? Will PgSQL know how to index properly a Bit String column? Should I build the index using a special method, e.g. CREATE INDEX myBitStringCol_idx ON myTable USING gist(myBitStringCol); No, the default will be to build a btree index which won't help these types of queries at all. You would want a GIST index if there was a built-in GIST opclass for these kinds of queries, but sadly there isn't. You could add one fairly easily but it would require C code. I think it would be a valuable addition to Postgres if you do write one. Note that something like WHERE myBitStringCol B'101' might be selecting too much of your table to make an index useful anyways. If each bit is set in half the table then you're talking about selecting 3/4 of the table in which case a full table scan would be more efficient than any index. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] \l of psql 8.4 looks ... unreadable on 80char default
Harald Armin Massa haraldarminma...@gmail.com writes: Thanks! What is your $COLUMNS set to? This should have wrapped to fit into the screen width. Also look at \pset columns: $COLUMNS was not set at all ... guess that is a usual environment variable on bash. Setting it to 80 works. COLUMNS wasn't set and \pset columns wasn't set? What environment were you running this psql command in? Was the output redirected anywhere, with \o or on the commandline? We had a long argument about how this should work a while back so I suspect people aren't psyched about reopening it, but while I don't think the current logic is right I don't think wrapping to 80 columns when your terminal is wider is one of the current broken cases. It tends to fail in the opposite direction of randomly not wrapping at all, so it's kind of surprising to see your experience. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [GENERAL] Bringing other columns along with a GROUP BY clause
Tom Lane t...@sss.pgh.pa.us writes: I have a feeling that it could also be done via the window functions feature due to be introduced in 8.4, but I'm not entirely sure how. Anybody feeling handy with those? There may be a better way but something like select * from (select *, rank() over (partition by charge order by coldspot_time desc) as r) where r = 1 ? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
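Filling in the FROM clause and subquery alias the off-the-cuff version omits, the 8.4 window-function query would presumably read something like this (the table name is invented):

```sql
-- Keep the top-ranked row per "charge" group; no self-join needed.
SELECT *
FROM (SELECT t.*,
             rank() OVER (PARTITION BY charge
                          ORDER BY coldspot_time DESC) AS r
      FROM my_table t) AS ranked
WHERE r = 1;
```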
Re: [GENERAL] Vacuums taking forever :(
Phoenix Kiula phoenix.ki...@gmail.com writes: Thanks, Gregory and Simon, for the very useful posts. I have increased the vacuum_cost_limit to 2000 for now, just to see if that has an impact. Hopefully positive. Note that that was offhand speculation. Conventional wisdom is that it should make things *worse* -- you're saying to process more pages between sleeps so it'll use more i/o. I was speculating that if you increased both vacuum_cost_limit and vacuum_cost_delay proportionally it might use the i/o more efficiently even though it's using the same amount of total bandwidth. The more normal suggestion is to increase *vacuum_cost_delay*, which tells it to sleep longer between bits of work. Don't increase it too much or vacuum will take forever. But if you increase it from 20 to 40 it should use half as much i/o bandwidth as now. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] C function question
Tom Lane t...@sss.pgh.pa.us writes: Alvaro Herrera alvhe...@alvh.no-ip.org writes: Grzegorz Jaśkiewicz wrote: looks like it really has to be defined with "char" in double quotes. I thought just char is enough... They're different types. You know, maybe we should stop holding our noses and do something about this old gotcha. That type's not going away anytime soon, but could we rename it to char1 or something like that? int1? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] Vacuums taking forever :(
Phoenix Kiula phoenix.ki...@gmail.com writes: autovacuum_vacuum_cost_delay = 20 vacuum_cost_delay = 20 These say to sleep 20ms every few pages. These cron jobs are taking over 35 minutes for a vacuum! What's the use of a vacuum if it takes that long, and the DB performance is tragic in the meantime? The fact that vacuum takes 35 minutes really shouldn't be a concern. As long as it isn't making it hard to manage vacuuming frequently enough, what do you care when vacuum finishes? You're not waiting on any results from it. In fact the point of the above parameters is to ensure vacuum goes *slowly* enough to avoid causing i/o slowdowns in the rest of the system. The real question is why your performance is tragic while vacuum is running. Sleeping 20ms periodically should really be enough to avoid causing any performance impact. Unless your system is running extremely close to its maximum throughput already and the small additional i/o is enough to tip it over the edge? Another option is to set the delay to 0, which *will* cause performance to be tragic, but for as short a time as possible. I don't recommend this approach. You could try raising the delay parameters or decreasing the vacuum_cost_limit parameters, which would make the "few pages" fewer. That would lessen the i/o impact at the expense of lengthening vacuum's run time. But if you're already at 35% of the time between vacuums being necessary then that seems like it might not be an option. Upgrading to 8.3.x would reduce the need for vacuum at all if your updates qualify for HOT updates. And 8.4 will lessen the impact of vacuums further. But if you're already running that close to the red-line then you're going to have problems soon even with less i/o from vacuum. It sounds like you need to quantify just how much i/o your system is capable of handling and how close to that level you already are. Keep in mind that random i/o is a *lot* more expensive than sequential i/o. Typical consumer drives can handle 60MB/s+ of sequential i/o but only about 1-2MB/s of random i/o! It's easy to misjudge your capacity by basing it on purely sequential i/o. (Hm. That makes me think that raising the vacuum_cost_limit parameter dramatically and the vacuum_cost_delay parameter proportionally might make it actually run faster with less i/o impact. The defaults process only a few kilobytes before sleeping, which probably causes a lot of random seeks. If you multiply both by 10 then you'll process close to a megabyte of data and then sleep for a long while. Just a thought -- I haven't tried this on a test box.) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
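The parenthetical "multiply both by 10" experiment would translate into postgresql.conf settings like these. Speculative, as the author says; this starts from the poster's 20ms delay and the stock cost limit of 200:

```
vacuum_cost_delay = 200ms
vacuum_cost_limit = 2000
```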
Re: [GENERAL] Pet Peeves?
Christopher Browne cbbro...@gmail.com writes: - Managing jobs (e.g. - pgcron) A number of people have mentioned a job scheduler. I think a job scheduler entirely inside Postgres would be a terrible idea. However a cron daemon which used Postgres as a storage backend would be very cool. It could then provide SQL functions to manipulate the schedule and allow you to set jobs that call database functions using the existing connection instead of forcing you to write an external script. This is something someone could do with no special database background, the hard part is finding a cron source base which is flexible enough to extend to use a database backend. I'm under the impression most cron daemons are based on pretty old and ossified source bases and are burdened by a lot of legacy compatibility requirements. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Full text index not being used, even though it is in the plan
Alex Neth a...@liivid.com writes: I am trying to use a full text index, but it seems to be reindexing on every query. The query plan looks fine, but the queries take extremely long (hours even). I think it is reindexing because it is notifying me that certain long words won't be indexed, as you can see below, which is what it does when I create the index. I don't think it's reindexing, it's just calling to_tsvector(), which it has to do when it rechecks rows that the index says might match. Is it possible that nearly all the full_listing values contain "view"? How does it perform with much more selective searches? If your full_listing values are quite large then recalculating the tsvector might be a lot more expensive than doing a full table scan and LIKE match for cases when nearly the whole table is going to be scanned anyways. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
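A standard way to avoid recomputing to_tsvector() during index rechecks is to store the tsvector in its own column and query that. A sketch with invented table and index names:

```sql
ALTER TABLE listings ADD COLUMN full_listing_tsv tsvector;
UPDATE listings SET full_listing_tsv = to_tsvector('english', full_listing);
CREATE INDEX listings_tsv_idx ON listings USING gin (full_listing_tsv);

-- Rechecks read the stored column instead of reparsing the document text.
SELECT * FROM listings
WHERE full_listing_tsv @@ to_tsquery('english', 'view');
```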
Re: [GENERAL] Indices types, what to use. Btree, Hash, Gin or Gist
Mohamed mohamed5432154...@gmail.com writes: My Gmail (bloody Gmail!) has been auto-replying to the last messenger (Scott) so I think we have been having a private discussion on this topic. There is an option in the Google Labs tab to make Reply All the default button -- of course then there's always a chance you'll make the opposite mistake, which can be a lot worse. Earlier I suggested that with a boolean column you could consider making it the condition on a partial index with some other key. For example you could have CREATE INDEX partial_age_male ON tab(age) WHERE gender = 'M'; CREATE INDEX partial_age_female ON tab(age) WHERE gender = 'F'; Then if you always search on age together with gender, the optimizer can use the index which only includes the records for the appropriate gender. It's basically a free index key column since it doesn't actually have to store the extra column. Note that in this example, if you were to search on just age, it wouldn't be able to use either of these indexes. In theory it could use the indexes if you search on just gender, but it would be unlikely to, for all the same reasons as previously mentioned for regular indexes. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] Pet Peeves?
rhubbell rhubb...@ihubbell.com writes: Installing a package for DBD::Pg or building it? The former would indeed be a package bug. When I installed the package I did it via CPAN, so maybe this was my mistake. Not every CPAN package is packaged for Debian, so oftentimes I don't bother checking if a Perl module exists in Debian; I just do perl -MCPAN -e 'install (DBD::Pg)' or whatever package. Ah, well that's not a mistake, but you need to check what -dev packages the CPAN module you're building requires. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Indices types, what to use. Btree, Hash, Gin or Gist
Mohamed mohamed5432154...@gmail.com writes: Hi, I have several fields that I use to match in my queries. I am curious as to which index type is best for what. Here are some examples that will help you understand. Say I have 1,000,000 rows. Speed is of the essence here; insertions and updates happen relatively less frequently than searches. Can you give examples of actual WHERE clauses? It's the combination of restrictions that actually matters. Are there specific fields which will never be used on their own, only in combination with others? I want to match against a boolean field, that is, only true or false is possible. I am thinking Btree but not sure.. correct? No index is going to be particularly effective for boolean columns unless they're very heavily skewed. You might find it useful to build separate partial indexes on other keys for each value though. I understand the Hash is not recommended. When should I use the GIN index? GIN and GiST are used for fairly specialized purposes: full text searching, geometric data types, etc. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
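The separate-partial-indexes-per-boolean-value idea suggested above might look like this (a sketch only; the table and column names are hypothetical, not from the thread):

```sql
-- hypothetical table: items(active boolean, created date, ...)
CREATE INDEX idx_created_active   ON items (created) WHERE active;
CREATE INDEX idx_created_inactive ON items (created) WHERE NOT active;

-- a query restricting on both columns can be satisfied by the matching
-- partial index, which never has to store the boolean at all:
SELECT * FROM items WHERE active AND created >= '2009-01-01';
```

The boolean acts as a free leading "key" because each partial index only contains rows with one of the two values.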
Re: [GENERAL] Pet Peeves?
Octavio Alvarez alvar...@alvarezp.ods.org writes: In any case, the results are the same as GROUPing BY from the data source.

+-------------+---------+
| Assignment  | Average |
+-------------+---------+
| Assignment1 |   94.67 |
| Assignment2 |   90.33 |
| Assignment3 |   86.67 |
+-------------+---------+

A crosstab is nothing but a presentational transform of the data set. Any information you would eventually need can be taken from the original data source, one way or another. That's why dynamic-column crosstabs are not a problem, and the DBMS should not worry about providing the information about the columns, maybe by simply not allowing the dynamic-column ones in subqueries. What about a WHERE clause like WHERE P1 > P2? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] Pet Peeves?
Octavio Alvarez alvar...@alvarezp.ods.org writes: Now, last time I checked, PostgreSQL lacked an SQL extension to create crosstab queries. Even though contrib has some helper functions, they are not official and work only if the number of columns is actually predefined. For instance, if you want to create a query to produce different paths to go through a graph, represented by 1 arc per record, you will never be able to predict the final number of columns (path steps) needed for the crosstab without doing the whole query one extra time to get the max(step). Is it the hierarchical query ability you're looking for, or pivot? The former we are actually getting in 8.4. AFAIK even in systems with pivot you still have to declare a fixed list of columns in advance anyway. Do you see a system where it works differently? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] complex custom aggregate function
Scara Maccai m_li...@yahoo.it writes: It would be very easy if the input to the custom aggregate function was ordered (because I would keep 4 internal counters), but I guess there's no way of forcing the ordering of the input to the function, right? You can with a subquery. Something like SELECT agg(foo) FROM (SELECT foo FROM tab ORDER BY bar) AS sorted However, that will produce one record per grouping. From what I read of your description you want to produce one record per input record. There isn't any efficient way to do that in current Postgres releases -- you would have to have a subquery which executed for every record and retrieved the set of data to aggregate. 8.4 will have OLAP window functions which can implement things like moving averages. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
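The two approaches mentioned above can be sketched side by side (table, column, and aggregate names here are hypothetical, not from the thread):

```sql
-- ordered input via a subquery: one output row per group,
-- the aggregate sees rows in bar order:
SELECT my_agg(foo)
FROM (SELECT foo FROM tab ORDER BY bar) AS sorted;

-- 8.4+ window function: one output row per input row, e.g. a
-- moving average over the current row and the 3 preceding rows:
SELECT bar,
       avg(foo) OVER (ORDER BY bar
                      ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM tab;
```

The window form is what the original poster wanted: per-row results without a per-row correlated subquery.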
Re: [GENERAL] Pet Peeves?
Daniel Verite dan...@manitou-mail.org writes: Gregory Stark wrote: Is it the hierarchical query ability you're looking for, or pivot? The former we are actually getting in 8.4. AFAIK even in systems with pivot you still have to declare a fixed list of columns in advance anyway. Do you see a system where it works differently? MS-Access SQL has a TRANSFORM clause that allows for crosstab queries without the need to know in advance the number of columns: http://msdn.microsoft.com/en-us/library/bb208956.aspx That's puzzling. I wonder what they do about clients requesting info about the results. Or for that matter such queries being used in subqueries or anywhere else where the surrounding code needs to know the type of results to expect. As for Oracle, it wasn't possible until recently, but now 11g has the PIVOT clause: http://www.oracle.com/technology/pub/articles/oracle-database-11g-top-features/11g-pivot.html From this, the result columns do need to be explicitly listed in advance unless you're asking for the pivot to be into an XML blob, which seems like a whole different feature really. In contrast to these clauses, PG's contrib/tablefunc looks more limited and quite harder to use. Incidentally, the work-around I've used in the past was to aggregate the rows into an array instead of separate columns. Definitely not the same of course, just a work-around. I think PIVOT is enticing too. It'll be interesting to see what happens in the standard with the divergence between MSSQL and Oracle. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [GENERAL] Pet Peeves?
rhubbell rhubb...@ihubbell.com writes: Nope, had to find it in another package called libpq-dev. That's on Ubuntu Hardy. Maybe it's a maintainer problem? What logic would lead someone to separate pg_config from everything else? Do people often just install the server and nothing else? Then what? This is actually *required* by Debian/Ubuntu packaging rules. The development environment must be packaged separately from shared libraries like libpq, or else major snafus arise when a new soversion of libpq comes out. You need to be able to have both versions installed simultaneously (in case you have programs which require both), but that won't work if they both contain things like header files or executables. BTW I ran into the need for pg_config upon installing DBD::Pg. Maybe a DBD::Pg maintainer problem? Installing a package for DBD::Pg or building it? The former would indeed be a package bug. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] Pet Peeves?
rhubbell rhubb...@ihubbell.com writes: What's your list look like? Or maybe you want everyone else to do your work for you and don't have anything to share. Heh, fair enough. Perhaps I should just say you should come to FOSDEM though :) The usual bugaboos are things like having to do a dump/restore to upgrade, weak support for partitioning, no index-only scans, etc. Things which have been mentioned several times. One thing which has *not* been mentioned, which I find positively shocking, is VACUUM. This was once our single biggest source of user complaints. Between autovacuum improvements and HOT previously, and the free space map in 8.4, the situation will be much improved. However there are still some common usage patterns where people run into problems. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] Slow first query despite LIMIT and OFFSET clause
Phoenix Kiula phoenix.ki...@gmail.com writes: Index Cond: ((user_id)::text = 'superman'::text) Filter: (status = 'Y'::bpchar) Of course for unregistered users we use user_known = 0, so they are excluded from this index. Is this not a useful partial index? I think in this SQL the user_id is always superman and the user_known always 1, which is why the guesstimate from the planner may be off? Well, the histograms are for each column separately, so the planner will take the selectivity estimates for user_id = 'superman' and status = 'Y' and multiply them. If the status values of the 'superman' records are very different from the status values of the table as a whole then this will give poor results. If that's not the case then raising the statistics target for those two columns might help. And of course if the table hasn't been analyzed recently then analyzing it more often is always good. There isn't really a good solution for cross-column stats. You could perhaps create a functional index (could still be partial too) on an expression like CASE WHEN status = 'Y' THEN user_id END which will make Postgres build stats for the result of that expression specifically. Then if you use that expression exactly as-is in the query the planner should use those statistics. I think. I haven't tried this... Tell us how it goes :) I wonder if we should look at building partial histograms for the columns in partial indexes, effectively equivalent to this... hm... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
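A sketch of that workaround, spelled out. The table name links and the exact shape of the CASE expression are assumptions for illustration (the author himself says he hasn't tried this):

```sql
-- expression index on a CASE over the two correlated columns;
-- ANALYZE then gathers statistics for the expression's results:
CREATE INDEX idx_status_user
    ON links ((CASE WHEN status = 'Y' THEN user_id END));
ANALYZE links;

-- the query must repeat the expression verbatim so the planner can
-- match it against the expression statistics:
SELECT *
FROM links
WHERE (CASE WHEN status = 'Y' THEN user_id END) = 'superman';
```

The expression is NULL for non-'Y' rows, so its statistics effectively describe the user_id distribution among status = 'Y' rows only, which is exactly the cross-column information the planner otherwise lacks.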
[GENERAL] Pet Peeves?
I'm putting together a talk on PostgreSQL Pet Peeves for discussion at FOSDEM 2009 this year. I have a pretty good idea what some of them are, of course, but I would be interested to hear if people have any complaints from personal experience. What would be most interesting is if you can explain an example of when the problem caused real inconvenience to you, since sometimes it's hard to see from a theoretical description where the real harm lies. So, what do people say? Is Postgres perfect in your world or does it do some things which rub you the wrong way? Feel free to respond on-list or, if you prefer, in personal emails. I do intend to use the ideas you give in my presentation, so mark anything you wouldn't be happy to see in a slide at a conference some day. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] Text search segmentation fault
Teodor Sigaev teo...@sigaev.ru writes: I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch is attached and I'm going to commit it ... ! Conf->flagval[(unsigned int) *s] = (unsigned char) val; ... ! Conf->flagval[*(unsigned char*) s] = (unsigned char) val; Maybe I'm missing something, but I don't understand how this fixes the problem. s is a char* so type punning it to an unsigned char * before dereferencing it is really the same as casting it to unsigned char directly, and casting it to unsigned int really ought to have done the same thing anyway. All of the changes are of this type, so I can't see how your patch could have fixed the problem. And in general, casting the pointer before dereferencing it is a whole lot scarier code which should raise eyebrows a lot faster than just a simple cast to unsigned char like you had it originally. What really boggles me is why you don't just use unsigned chars everywhere and remove all of these casts. Or would that just move the casts to strcmp and company? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] Text search segmentation fault
Gregory Stark st...@enterprisedb.com writes: Teodor Sigaev teo...@sigaev.ru writes: I reproduced the bug with a help of Grzegorz's point for 64-bit box. So, patch is attached and I'm going to commit it ... ! Conf->flagval[(unsigned int) *s] = (unsigned char) val; ... ! Conf->flagval[*(unsigned char*) s] = (unsigned char) val; Maybe I'm missing something but I don't understand how this fixes the problem. Ah, I understand how this fixes the problem. You were casting to unsigned *int*, not unsigned char, so it was sign extending first and then overflowing. So char 255 was coming out as UINT_MAX instead of 255.

#include <stdio.h>
int main(void)
{
    volatile signed char a = -1;
    printf("ud=%ud\n", (unsigned int) a);
    return 0;
}

$ ./a.out
ud=4294967295d

If you just make these all casts to (unsigned char) it should work just as well as the pointer type punning -- and be a whole lot less scary. What really boggles me is why you don't just use unsigned chars everywhere and remove all of these casts. Or would that just move the casts to strcmp and company? It still seems to me if you put a few unsigned in variable declarations you could remove piles upon piles of casts and make all of the code more readable. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Text search segmentation fault
Tom Lane t...@sss.pgh.pa.us writes: Gregory Stark st...@enterprisedb.com writes: Maybe I'm missing something but I don't understand how this fixes the problem. s is a char* so type punning it to an unsigned char * before dereferencing it is really the same as casting it to unsigned char directly No, it isn't. If char is signed then you'll get quite different results from a high-bit-set byte value, because sign extension will happen before the value is reinterpreted as unsigned. What I wrote is correct. There's no sign extension if you're casting from signed char to unsigned char, since there's no widening involved. I really think he should just change all the unsigned int casts into unsigned char and not do the type punning with pointer casts. That's just evil. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] Pet Peeves?
David Fetter da...@fetter.org writes: * No built-in ways to get the information psql gets. Seeing what psql is doing isn't an option when somebody doesn't have psql on hand. Uhm, what information are you referring to here? * No man pages for the internals. Is it just that not all of the manual is actually exported into man pages? Or is there stuff you would like to see in the manual that isn't there? * CTEs not yet integrated into the adjacency lists in pg_catalog, etc. I'm not sure what you're referring to here either. Remember to vote! This may not be so timely any more, though I suppose there's always someone somewhere holding elections :) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Pet Peeves?
Jason Long mailing.l...@supernovasoftware.com writes: Richard Huxton wrote: 1. Case-folding on column-names. Quoting is a PITA sometimes when you're transferring from a different DBMS. Be nice to have a true_case_insensitive=on flag. I was just wishing for this the other day. I'm kind of wondering what behaviour you two are looking for and what different DBMS you're referring to. I'm assuming it's not the ANSI fold-to-uppercase behaviour you're looking for. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Pet Peeves?
Steve Crawford scrawf...@pinpointresearch.com writes: 3. Date handling Sometimes I've got data with invalid dates and it would be great if it could replace all the bad ones with, say, 0000-00-00. Oh dear $DEITY, no. I think it would be best if we limited ourselves right now to discussing the problems themselves and not debating the pros and cons of possible solutions. I want to encourage people to post their peeves even if they know perfectly well the reasons why things are the way they are. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Pet Peeves?
Steve Atkins st...@blighty.com writes: 6. Where's my CPAN equivalent? Postgresql is extensible, but it's hard to find the extension you need, and often harder than it should be to install. FWIW our CPAN equivalent is pgfoundry. I don't think we quite have the critical mass yet that Perl has to really make it a big success though. Making modules more, uh, modular, so they can be installed and uninstalled smoothly and preferably without special access privileges is a recognized issue though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] How relate pg_class to pg_constraint
Bill Todd p...@dbginc.com writes: I need to join pg_class and pg_constraint to get information about constraints on a table. It appears that pg_constraint.conrelid is the foreign key but I do not see a relid column in pg_class. What column(s) define the relationship between these tables? Thanks. There's a system column called oid on all the system tables which is the primary key. It doesn't show up unless you explicitly list it in the target list of the select. So you need a join like WHERE pg_class.oid = conrelid If all you need is the name to display for users then there's a convenience type called regclass which you can use by doing SELECT conrelid::regclass FROM pg_constraint. There's a similar regtype and a few others like it too. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
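Put together, the join described above looks like this (standard catalog columns; only the output column list is my choice):

```sql
-- list constraints with their table names via the hidden oid column:
SELECT t.relname, c.conname, c.contype
FROM pg_constraint c
JOIN pg_class t ON t.oid = c.conrelid;

-- or, if only a displayable table name is needed, the regclass
-- convenience cast avoids the join entirely:
SELECT conname, conrelid::regclass AS table_name
FROM pg_constraint;
```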
Re: [GENERAL] PG's suitability for high volume environment (many INSERTs and lots of aggregation reporting)
Phoenix Kiula phoenix.ki...@gmail.com writes: My question: with that kind of volume and the underlying aggregation functions (by product id, dates, possibly IP addresses or at least countries of origin..) will PG ever be a good choice? Well, only you are able to judge that for your own data and use cases. Your query is sorting 10,000 records in half a second, which is not great but not terrible either. I think the only way you'll be able to speed that up is by changing your index design so that Postgres can access the data you need without sorting through all the irrelevant records. I suspect others already suggested this, but you might look at partial indexes. If your queries are very dynamic against relatively static data you might look at building denormalized caches of the precalculated data. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
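A denormalized cache of precalculated aggregates, as suggested above, might be sketched like this (all table and column names are hypothetical):

```sql
-- hypothetical source table: hits(product_id int, country text, created timestamp)
CREATE TABLE daily_hits AS
SELECT product_id,
       country,
       date_trunc('day', created) AS day,
       count(*) AS n
FROM hits
GROUP BY product_id, country, date_trunc('day', created);

-- reports then scan the small cache instead of re-aggregating raw rows:
SELECT product_id, sum(n)
FROM daily_hits
WHERE day >= '2009-01-01'
GROUP BY product_id;
```

The cache must of course be refreshed (or incrementally maintained) as new rows arrive, which is the trade-off for the faster reporting queries.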
Re: [GENERAL] PgUS 2008 end of year summary
Alvaro Herrera alvhe...@commandprompt.com writes: Joshua D. Drake wrote: On Mon, 2009-01-12 at 18:34 -0300, Alvaro Herrera wrote: In the future please do not spam multiple lists with the same message. Or rather, if you want the message to appear in more than one list, please CC them all in a single message instead of sending one message to each. If we do that, we get cross posting. That is why I didn't. Cross posting is not necessarily bad; in fact it's regarded as less annoying than multiposting, which is what you did. For argumentation, see here: http://www.cs.tut.fi/~jkorpela/usenet/xpost.html#why Well, cross-posting is especially annoying on subscriber-only moderated lists such as ours. Anyone who follows up to an email who isn't subscribed to all the lists will get bounce warnings for each list they're not on. (If you want it to be even more pain-free, add a Reply-To: pgsql-advocacy header or some such.) Yeah, actually that doesn't work. If you want to do that, the only way to do it properly is to Bcc the various lists with the To set to the list you want followups to go to. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [GENERAL] PgUS 2008 end of year summary
Alvaro Herrera alvhe...@commandprompt.com writes: Gregory Stark wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Well, cross-posting is especially annoying on subscriber-only moderated lists such as ours. Anyone who follows up to an email who isn't subscribed to all the lists will get bounce warnings for each list they're not on. Especially annoying? You'll get a bounce warning. That's all. You don't have to do anything about it; just wait for the moderator to approve it. In fact, when I (as the sender) get those, I just delete them. You don't think getting bounces every time you respond to a message is annoying? I get annoyed whenever a user posts with an address which bounces, and I get *really* annoyed at the similar case when someone's email address has a broken mailer which bounces to people who post to the mailing list. (If you want it to be even more pain-free, add a Reply-To: pgsql-advocacy header or some such.) Yeah, actually that doesn't work. Hmm, it doesn't work how? I admit I haven't tried it, so I'm using this message as a test (I added Reply-To: pgsql-advoc...@postgresql.org) Well, this was a wide followup to your message, which I think failed to do what you wanted. More dramatically, if I had tried to reply personally to you, your Reply-To would redirect the personal message to the list. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] What determines the cost of an index scan?
Christian Schröder c...@deriva.de writes: Where does this difference come from? Pure CPU performance? Do the additional fields in the productive database have an impact on the performance? Or do I miss something? Sure, more data takes more time to process. Other factors which could affect it are how much dead space is in the table due to previous updates and deletes, as well as how fragmented the indexes have become over time. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Error: Operator does not exist: char=integer
Aarni aa...@kymi.com writes: ERROR: operator does not exist: character varying = integer at character 286 HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. Quick fix to SQL statements, e.g. ... WHERE CAST (your_char AS INTEGER) = integer ... ... WHERE CAST (your_char AS INTEGER) IN (1,2,3,...) Note that this is *not* what was happening in 8.2. There it was casting them to text and doing a text comparison. In the case of integer and equality they're probably equivalent. However, < and > will behave quite differently. That's why the casts disappeared -- you probably weren't running the queries you thought you were running in 8.2 and previously. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
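The difference between text and numeric comparison that the reply warns about can be seen directly (a small sketch runnable in any psql session):

```sql
-- equality gives the same answer either way:
SELECT '10'::text = '10'::text;   -- true
SELECT 10 = 10;                   -- true

-- but ordering does not: as text, '9' sorts after '10'
SELECT '9'::text > '10'::text;    -- true  (lexicographic comparison)
SELECT 9 > 10;                    -- false (numeric comparison)
```

This is why the implicit-cast removal in 8.3 surfaces errors rather than silently comparing as text: the old behaviour could return the wrong rows for range predicates.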
Re: [GENERAL] Maximum reasonable free space map
Phillip Berry pbe...@stellaconcepts.com writes: So I guess my question is, is there a point where you start to see diminishing returns or even negative returns by setting the fsm too high? There is no benefit to having the FSM larger than necessary, so I suppose that qualifies as diminishing returns. The only negative effect is the reduced memory available for caches and shared buffers. You might also want to check that you don't have just a few tables which have a lot of dead space in them. If so, filling the FSM is the least of your worries. The tables with lots of dead space will perform poorly because of the time spent sifting through all that dead space. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] tup_returned/ tup_fetched
Sebastian Böhm s...@exse.net writes: One question: what actually are tup_returned and tup_fetched? RTFM -- admittedly it's not so easy to find these since you have to know they come from the following functions: pg_stat_get_tuples_returned(oid) bigint Number of rows read by sequential scans when argument is a table, or number of index entries returned when argument is an index pg_stat_get_tuples_fetched(oid) bigint Number of table rows fetched by bitmap scans when argument is a table, or table rows fetched by simple index scans using the index when argument is an index From: http://www.postgresql.org/docs/8.3/interactive/monitoring-stats.html -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [GENERAL] Postgres mail list traffic over time
Craig Ringer [EMAIL PROTECTED] writes: Gregory Stark wrote: Another idea, I wonder if the project has gone more international and therefore has more traffic at odd hours of the day for everyone. It would also mean more long-lived threads with large latencies between messages and replies. I wouldn't be at all surprised if that were the case. Alas, it's not possible to analyze usefully because so many companies use .com addresses instead of addresses under a cctld, and because so many people use webmail services like gmail that provide no geographical information in the domain. I would be curious to see the average lifespan of threads over time. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [GENERAL] Postgres mail list traffic over time
Alvaro Herrera [EMAIL PROTECTED] writes: Richard Huxton wrote: Some of the EXPLAINs on the performance list are practically impossible to read unless you've got the time to cut+paste and fix line-endings. Maybe we should start recommending people to post those via http://explain-analyze.info/ What would be really neat would be having the mailing list do something automatically. Either fix the message inline or generate a link to something like this. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Postgres mail list traffic over time
Tom Lane [EMAIL PROTECTED] writes: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: So, to a first approximation, the PG list traffic has been constant since 2000. Not the result I expected. I also was confused by its flatness. I am finding the email traffic almost impossible to continue tracking, so something different is happening, but it seems it is not volume-related. Yes, my perception also is that it's getting harder and harder to keep up with the list traffic; so something is happening that a simple volume count doesn't capture. I've noticed recently that the mailing list traffic seems very bursty. We have days with hundreds of messages on lots of different in-depth topics and other days with hardly any messages at all. I wonder if it's hard to follow because we've been picking up more simultaneous threads instead of all being on one thread together before moving on to the next one. Another idea, I wonder if the project has gone more international and therefore has more traffic at odd hours of the day for everyone. It would also mean more long-lived threads with large latencies between messages and replies. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] Storage location of temporary files
Scott Marlowe [EMAIL PROTECTED] writes: 2008/11/5 Christian Schröder [EMAIL PROTECTED]: Tomasz Ostrowski wrote: This is wrong. RAID5 is slower than RAID1. You should go for RAID1+0 for fast and reliable storage. Or RAID0 for even faster but unreliable. I did not find a clear statement about this. I agree that RAID10 would be better than RAID5, but in some situations RAID5 at least seems to be faster than RAID1. For certain read-heavy loads RAID-5 will beat RAID-1 handily. After all, from a read-only perspective, a healthy RAID-5 with n disks is equal to a healthy RAID-0 with n-1 disks. Uhm, and for a read-heavy load a RAID-1 or RAID 1+0 array with n disks is equal to a healthy RAID-0 with n disks. RAID-5 should never beat any combination of RAID-0 and RAID-1 with the same number of drives at read performance. Its advantage is that you get more capacity. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!
Re: [GENERAL] Are there plans to add data compression feature to postgresql?
Ivan Sergio Borgonovo [EMAIL PROTECTED] writes: But sorry I still can't get WHY compression as a whole and data integrity are mutually exclusive. ... Now on *average* the write operations should be faster so the risk you'll be hit by an asteroid during the time a fsync has been requested and the time it returns should be shorter. If you're not fsyncing... you've no warranty that your changes reached your permanent storage. Postgres *guarantees* that as long as everything else works correctly it doesn't lose data. Not that it minimizes the chances of losing data. It is interesting to discuss hardening against unforeseen circumstances as well but it's of secondary importance to first of all guaranteeing 100% that there is no data loss in the expected scenarios. That means Postgres has to guarantee 100% that if the power is lost mid-write that it can recover all the data correctly. It does this by fsyncing logs of some changes and depending on filesystems and drives behaving in certain ways for others -- namely that a partially completed write will leave each byte with either the new or old value. Compressed filesystems might break that assumption making Postgres's guarantee void. I don't know how these hypothetical compressed filesystems are implemented so I can't say whether they work or not. When I first wrote the comment I was picturing a traditional filesystem with each block stored compressed. That can't guarantee anything like this. However later in the discussion I mentioned that ZFS with an 8k block size could actually get this right since it never overwrites existing data, it always writes to a new location and then changes metadata pointers. I expect ext3 with data=journal might also be ok. These both have to make performance sacrifices to get there though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! 
Re: [GENERAL] Are there plans to add data compression feature to postgresql?
Scott Marlowe [EMAIL PROTECTED] writes: What is the torn page problem? Note I'm no big fan of compressed file systems, but I can't imagine them not working with databases, as I've seen them work quite reliably under Exchange Server running a DB-oriented storage subsystem. And I can't imagine them not being invisible to an application, otherwise you'd just be asking for trouble. Invisible under normal operation sure, but when something fails the consequences will surely be different and I can't see how you could make a compressed filesystem safe without a huge performance hit. The torn page problem is what happens if the system loses power or crashes when only part of the data written has made it to disk. If you're compressing or encrypting data then you can't expect the old data portion and the new data portion to make sense together. So for example if Postgres sets a hint bit on one tuple in a block, then writes out that block and the filesystem recompresses it, the entire block will change. If the system crashes when only 4k of it has reached disk then when we read in that block it will fail decompression. And if the block size of the compressed filesystem is larger than the PostgreSQL block size your problems are even more severe. Even a regular WAL-logged write to a database block can cause the subsequent database block to become unreadable if power is lost before the entire set of database blocks within the filesystem block is written. The only way I could see this working is if you use a filesystem which logs data changes like ZFS or ext3 with data=journal. Even then you have to be very careful to make the filesystem block size that the journal treats as atomic match the Postgres block size or you'll still be in trouble. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! 
Re: [GENERAL] Are there plans to add data compression feature to postgresql?
Ivan Sergio Borgonovo [EMAIL PROTECTED] writes: On Fri, 31 Oct 2008 08:49:56 + Gregory Stark [EMAIL PROTECTED] wrote: Invisible under normal operation sure, but when something fails the consequences will surely be different and I can't see how you could make a compressed filesystem safe without a huge performance hit. Pardon my naiveness but I can't get why compression and data integrity should be always considered clashing factors. Well the answer was in the next paragraph of my email, the one you've clipped out here. DB operation are supposed to be atomic if fsync actually does what it is supposed to do. So you'd have coherency assured by proper execution of fsync going down to all HW levels before it reach permanent storage. fsync lets the application know when the data has reached disk. Once it returns you know the data on disk is coherent. What we're talking about is what to do if the power fails or the system crashes before that happens. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Decreasing WAL size effects
Greg Smith [EMAIL PROTECTED] writes: Now, it would be possible to have that less sensitive archive code path zero things out, but you'd need to introduce a way to note when it's been done (so you don't do it for a segment twice) and a way to turn it off so everybody doesn't go through that overhead (which probably means another GUC). That's a bit much trouble to go through just for a feature with a fairly limited use-case that can easily live outside of the engine altogether. Wouldn't it be just as good to indicate to the archive command the amount of real data in the wal file and have it only bother copying up to that point? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Are there plans to add data compression feature to postgresql?
Scott Marlowe [EMAIL PROTECTED] writes: I'm sure this makes for a nice brochure or power point presentation, but in the real world I can't imagine putting that much effort into it when compressed file systems seem the place to be doing this. I can't really see trusting Postgres on a filesystem that felt free to compress portions of it. Would the filesystem still be able to guarantee that torn pages won't tear across adjacent blocks? What about torn pages that included hint bits being set? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] psql screen size
Alvaro Herrera [EMAIL PROTECTED] writes: I can confirm that when the pager is open, psql does not resize properly. Maybe psql is ignoring signals while the pager is open, or something. Hm, system() is documented to ignore SIGINT and SIGQUIT; I wonder if it's (erroneously?) ignoring SIGWINCH as well. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] Are there plans to add data compression feature to postgresql?
Scott Marlowe [EMAIL PROTECTED] writes: On Thu, Oct 30, 2008 at 4:41 PM, Tom Lane [EMAIL PROTECTED] wrote: Scott Marlowe [EMAIL PROTECTED] writes: On Thu, Oct 30, 2008 at 4:01 PM, Gregory Stark [EMAIL PROTECTED] wrote: I can't really see trusting Postgres on a filesystem that felt free to compress portions of it. Would the filesystem still be able to guarantee that torn pages won't tear across adjacent blocks? What about torn pages that included hint bits being set? I can't see PostgreSQL noticing it. PostgreSQL hands the OS a 512-byte block, the OS compresses it and its brethren as they go to disk, uncompresses them as they come out, and as long as what you put in is what you get back it shouldn't really matter. I think Greg's issue is exactly about what guarantees you'll have left after the data that comes back fails to be the data that went in. Sounds kinda hand-wavy to me. If compressed file systems didn't give you back what you gave them I couldn't imagine them being around for very long. I don't know, NFS has lasted quite a while. So you tell me, I write 512 bytes of data to a compressed filesystem, how does it handle the torn page problem? Is it going to have to WAL-log all data operations again? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [GENERAL] psql screen size
wstrzalka [EMAIL PROTECTED] writes: On 27 Oct, 13:16, [EMAIL PROTECTED] (Sam Mason) wrote: On Mon, Oct 27, 2008 at 01:59:42AM -0700, wstrzalka wrote: When changing the window size (and thus chars per row) psql output becomes a mess; the only rescue is to exit and run psql again. It looks like it's initializing the output params at startup and doesn't refresh them at runtime. At least in CVS HEAD it checks before every query output. However... Resizing the window when entering SQL works OK for me, but resizing when inside my pager (normally the less utility, when you get more results than will fit on the screen) causes psql to be confused when I return to it. Yes. This is exactly the case. When I browse a large paged query result and resize during that, the screen is messed up. Could you define messed up? What I see is that the query output is formatted correctly but readline still thinks the screen is the old size. (This is in CVS HEAD -- this code was definitely different in 8.3 and before so the behaviour may be different). Perhaps we need to tell readline whenever we run a subprocess and it may have missed screen resize signals. It's easy enough to work around, just resize the window again a little bit once you're at the prompt. Readline notices that and adjusts. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [GENERAL] psql screen size
Gregory Stark [EMAIL PROTECTED] writes: Could you define messed up? What I see is that the query output is formatted correctly but readline still thinks the screen is the old size. (This is in CVS HEAD -- this code was definitely different in 8.3 and before so the behaviour may be different). Perhaps we need to tell readline whenever we run a subprocess and it may have missed screen resize signals. Hm, this Bash FAQ seems to indicate this shouldn't be a problem -- the whole process group is supposed to get the window size. Psql isn't doing the job control stuff the FAQ entry talks about so the pager ought to be in the same process group. So I'm puzzled. http://tiswww.case.edu/php/chet/bash/FAQ - E11) If I resize my xterm while another program is running, why doesn't bash - notice the change? - - This is another issue that deals with job control. - - The kernel maintains a notion of a current terminal process group. Members - of this process group (processes whose process group ID is equal to the - current terminal process group ID) receive terminal-generated signals like - SIGWINCH. (For more details, see the JOB CONTROL section of the bash - man page.) - - If a terminal is resized, the kernel sends SIGWINCH to each member of - the terminal's current process group (the `foreground' process group). - - When bash is running with job control enabled, each pipeline (which may be - a single command) is run in its own process group, different from bash's - process group. This foreground process group receives the SIGWINCH; bash - does not. Bash has no way of knowing that the terminal has been resized. - - There is a `checkwinsize' option, settable with the `shopt' builtin, that - will cause bash to check the window size and adjust its idea of the - terminal's dimensions each time a process stops or exits and returns control - of the terminal to bash. Enable it with `shopt -s checkwinsize'. 
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [GENERAL] PostgreSQL 8.3.3 chooses wrong query plan when LIMIT 1 added?
This looks like another form of the cross-column dependency problem. Postgres is assuming that the revisions for all files will be evenly spread throughout the date range and apparently there's a larger variety of dates than files so it expects to find the last revision for that file fairly quickly scanning backwards through the dates. In fact of course files tend to be hot for a period of time and then mostly idle, so depending on which file you pick that may work well if it's currently hot or be absolutely terrible if it's a file that hasn't been touched recently. With the LIMIT Postgres favours the plan it thinks will return one row quickly without sorting. Without it it's favouring the plan that will return all the rows for that file_id most quickly. I'm not sure what to suggest for this case if you can't change the data model except perhaps increasing the statistics target. One thing that comes to mind though, I would have defined one of those two indexes to include both columns. Probably the file_id index, so you would have an index on revision_id and an index on file_id,revision_id. That would be a huge win for this query. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
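The composite-index suggestion can be sketched like this; the table and column names are hypothetical (the thread's actual schema isn't shown), assuming a revisions table with file_id and revision_id columns:

```sql
-- Hypothetical names: the thread describes files, revisions, and dates.
CREATE INDEX revisions_file_rev_idx ON revisions (file_id, revision_id);

-- With the composite index, a backwards index scan can satisfy both the
-- filter and the ordering, stopping at the first matching row instead of
-- scanning the global revision ordering hoping to hit the right file:
SELECT *
  FROM revisions
 WHERE file_id = 42
 ORDER BY revision_id DESC
 LIMIT 1;
```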
Re: [GENERAL] Drupal and PostgreSQL - performance issues?
Martin Gainty [EMAIL PROTECTED] writes: MGcomments prefixed with MG Incidentally that's a good way to make sure people don't see your comments. There are a few variations but the common denominator is that things prefixed with foo are quotations from earlier messages. Many mailers hide such comments or de-emphasize them to help the user concentrate on the new material. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Drupal and PostgreSQL - performance issues?
Greg Smith [EMAIL PROTECTED] writes: DB2 has automatically updated the shmmax kernel parameter from 33554432 to the recommended value 268435456. This seems like a bogus thing for an application to do though. The Redhat people seem happy with the idea but I'm pretty sure it would violate several Debian packaging rules. Generally it seems crazy for a distribution to ship configured one way by default but have packages change that behind the user's back. What if the admin set SHMMAX that way because he wanted it? What happens when a new distribution package has a new default but doesn't adjust it because it sees the admin has changed it -- even though it was actually Postgres which made the change? And you're off--it bumped that from the default 32MB to 256MB. The problem for PostgreSQL is that nobody who is motivated enough to write such magic for a large chunk of the supported platforms has had the time to do it yet. I'll get to that myself eventually even if nobody else does, as this is a recurring problem I'd like to make go away. ISTM the right way to make it go away is to allocate temporary files and mmap them instead of using sysv shared memory. Then we can mmap as much as we want. Before we lose root privileges we can even mlock as much as we want. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PQexecParams question
Grzegorz Jaśkiewicz [EMAIL PROTECTED] writes: that would be a type mismatch, heh. prepare select * from foo where a = any($1::int[]) then pass {1,2,3} -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
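Spelled out as a sketch in psql terms (the foo/a names come from the message; over PQexecParams the array would be passed as a single text parameter holding the literal '{1,2,3}'):

```sql
-- One parameter of array type instead of a variable-length IN list:
PREPARE find_foo(int[]) AS
  SELECT * FROM foo WHERE a = ANY ($1);

-- The whole array is a single parameter value:
EXECUTE find_foo('{1,2,3}');
```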
Re: [GENERAL] Chart of Accounts
justin [EMAIL PROTECTED] writes: special note do not use only 2 decimal points in the accounting tables. If your application uses 10 decimal places somewhere then every table in the database that has decimals needs to have the same precision. Nothing is more annoying where a transaction says 1.01 and the other side says 1.02 due to rounding. FWIW I think this is wrong. You need to use precisely the number of decimal places that each datum needs. If you use extra it's just as wrong as if you use too few. For example, when you buy gas/petrol at $1.999/gallon and buy 4 gallons you get charged $8.00 not $7.996. If you fail to round at that point you'll find that your totals don't agree with the amount of money in your actual bank account. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
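The petrol example can be worked through directly; rounding happens at the transaction boundary rather than being carried as extra places:

```sql
-- 4 gallons at $1.999/gallon: the customer is charged the rounded amount.
SELECT round(numeric '1.999' * 4, 2);  -- 8.00, not 7.996
-- Storing 7.996 instead would leave the ledger off by $0.004 per
-- transaction relative to the actual bank account.
```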
Re: [GENERAL] Out of memory in create index
David Wilson [EMAIL PROTECTED] writes: create index val_datestamp_idx on vals(datestamp) tablespace space2; About 30 seconds into the query, I get: ERROR: out of memory DETAIL: Failed on request of size 536870912. Increasing maintenance_work_mem from 1GB to 2GB changed nothing at all- exact same error at exact same time. Watching memory on the machine shows the out of memory error happens when the machine is only at about 35% user. create index concurrently shows an identical error. Try *lowering* maintenance_work_mem. That's how much memory you're telling the index build to use. Evidently your machine doesn't have enough RAM/swap to handle 1G of temporary sort space. In practice values over a few hundred megs don't seem to help much anyways. Try 512M or 256M. Also, a little known fact is that an index build can actually allocate maintenance_work_mem plus an extra work_mem. So if you have work_mem set unreasonably high that could be contributing to the problem. Actually, while I was writing this, I added an additional column to the index and it now appears to be completing (memory has reached about the point it had been failing at and is now holding steady, and the query has been going for significantly longer than the 30 seconds or so it took to error out previously). I sort by both columns at times, so the extra column may in fact turn out to be useful, but the failure of the single column create index in the face of the other successful creates has me confused. Can anyone shed some light on the situation? How much memory the OS allows Postgres to allocate will depend on a lot of external factors. At a guess you had some other services or queries running at the same time the first time which reduced the available memory. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! 
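The suggested fix, sketched as a session-local setting change before rebuilding the index (statement taken from the message, value from the advice above):

```sql
-- Lower the per-build sort memory for just this session:
SET maintenance_work_mem = '512MB';   -- or '256MB'
CREATE INDEX val_datestamp_idx ON vals (datestamp) TABLESPACE space2;
RESET maintenance_work_mem;
```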
Re: [GENERAL] Static functions
Gurjeet Singh [EMAIL PROTECTED] writes: On Sat, Oct 4, 2008 at 8:49 AM, Tom Lane [EMAIL PROTECTED] wrote: Gurjeet Singh [EMAIL PROTECTED] writes: Shouldn't PG make all efforts to not execute something when the result is already known? Not if said effort would cost more than is saved, which would be by far the most likely result if we tried to cache all function results. Sorry Tom, I confused STABLE with IMMUTABLE; my bad. No, this is equally untrue for immutable. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Index order
Glyn Astill [EMAIL PROTECTED] writes: I'd expect the rows starting with the caret to appear either at the start or the end, rather than in the middle; it appears as if the index ignores them. Database locale is Latin1. Latin1 isn't a locale, it's a character set (and an encoding). Your locale is probably something like en_US which generally sorts as you show above. It sounds like you're expecting the C locale sorting which is the ASCII order. You can either re-initdb your database with locale set to C, or on recent versions of Postgres you can use ORDER BY artist USING ~<~ which specifically sorts based on the binary ASCII order. If you want the query to use an index you would have to build it with something like: create index idx on foo (artist text_pattern_ops); Note that this ordering won't do anything sane with accented characters in latin1 since they'll be sorted by their binary value which comes after all the unaccented characters. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
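Putting the two suggestions together as a sketch (foo/artist names as in the message):

```sql
-- Index using the pattern operator class, which compares byte-by-byte:
CREATE INDEX idx ON foo (artist text_pattern_ops);

-- ~<~ is the text-pattern "less than" operator, so this sorts in binary
-- (ASCII) order and can use the index above:
SELECT artist FROM foo ORDER BY artist USING ~<~;
```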
Re: [GENERAL] Healing a table after massive updates
Scott Marlowe [EMAIL PROTECTED] writes: On Thu, Sep 11, 2008 at 8:56 AM, Bill Moran [EMAIL PROTECTED] wrote: In response to Alvaro Herrera [EMAIL PROTECTED]: Bill Moran wrote: In response to Gauthier, Dave [EMAIL PROTECTED]: I might be able to answer my own question... vacuum FULL (analyze is optional) CLUSTER _may_ be a better choice, but carefully read the docs regarding its drawbacks first. You may want to do some benchmarks to see if it's really needed before you commit to it as a scheduled operation. What drawbacks? There's the whole there will be two copies of the table on-disk thing that could be an issue if it's a large table. I've also found cluster to be pretty slow, even on 8.3. On a server that hits 30-40 megs a second write speed for random access during pgbench, it's writing out at 1 to 2 megabytes a second when it runs, and takes the better part of a day on our biggest table. vacuumdb -fz + reindexdb ran in about 6 hours which means we could fit it into our maintenance window. vacuum moves a lot more data per second than cluster. Alternatively you can do ALTER TABLE tab ALTER col TYPE sametype USING col; which will rewrite the table without using an index. This is much faster but has the same other disadvantages. Incidentally the visibility bugs are indeed entirely fixed in 8.3. In 8.2 and before cluster and alter table rewrites can both cause tuples to not appear for transactions which were started before the cluster or alter table, such as a long-running pg_dump. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
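A concrete sketch of the no-op rewrite trick, using a hypothetical table with an integer column (any column re-cast to its existing type will do):

```sql
-- Rewrites every row of tab without the sort that CLUSTER requires.
-- Like CLUSTER it takes an exclusive lock and transiently needs
-- double the table's disk space.
ALTER TABLE tab ALTER COLUMN col TYPE integer USING col;  -- col is already integer
```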
Re: [GENERAL] Healing a table after massive updates
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: Incidentally the visibility bugs are indeed entirely fixed in 8.3. In 8.2 and before cluster and alter table rewrites can both cause tuples to not appear for transactions which were started before the cluster or alter table such as a long-running pg_dump. AFAIK that's true only for CLUSTER, not ALTER TABLE. There would be a bunch of logical inconsistencies in altering rows and then pretending you hadn't. Uh, what's true? That the bugs are fixed or that we ever had them to begin with? Oh, are you saying that ALTER TABLE doesn't go through the rewrite code to carry along old versions of the tuples because it can't alter the old versions? That is a bit annoying. That means there's no safe way to do a table rewrite without clustering the table? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] recover in single-user backend fails
Tom Lane [EMAIL PROTECTED] writes: The only recovery strategy that I can think of in 8.1 is to blow away your WAL with pg_resetxlog, let the database come up in a damaged condition, and then try to extract data from it. Would it work if he rebuilt 8.1 with a return; as the first line of gist_redo() in gistxlog.c and when recovery (hopefully) completes immediately drop any gist indexes? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] A challenge for the SQL gurus out there...
Uwe C. Schroeder [EMAIL PROTECTED] writes: I want to get a list looking like forum_id thread_id post_id 1 6 443 2 9 123 3 3 557 ... It all boils down to me not being able to come up with a query that gives me the latest post per forum_id. In a situation like this I would probably denormalize the tables slightly by adding a forum_id key to the individual posts. That would make it hard to ever move a thread from one forum to another, though not impossible, but would help in this case as well as any other time you want to do an operation on all posts in a forum regardless of thread. If you add that column then you could index forum_id,date and get the result you're looking for instantly with a DISTINCT ON query (which is a Postgres SQL extension). SELECT DISTINCT ON (forum_id) forum_id, thread_id, post_id FROM thread ORDER BY forum_id, date DESC (actually you would have to make the index on forum_id, date DESC or make both columns DESC in the query and then re-order them in an outer query) Alternatively you could have a trigger on posts which updates a last_updated field on every thread (and possibly a recent_post_id); then you could have a query on forums which pulls the most recently updated thread directly without having to join on posts at all. That would slow down inserts but speed up views -- possibly a good trade-off for a forum system. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
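The outer-query re-ordering mentioned in the parenthetical might look like this sketch (column names as in the message):

```sql
-- DISTINCT ON requires the ORDER BY to start with forum_id, date DESC;
-- the outer query then restores the presentation order:
SELECT forum_id, thread_id, post_id
  FROM (SELECT DISTINCT ON (forum_id)
               forum_id, thread_id, post_id, date
          FROM thread
         ORDER BY forum_id, date DESC) AS latest
 ORDER BY forum_id;
```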
Re: [GENERAL] MERGE: performance advices
Ivan Sergio Borgonovo [EMAIL PROTECTED] writes: I need to merge 2 tables: update d set c1=s.c1, c2=s.c2... from s where d.pk=s.pk; insert into d (pk, c1, c2, ...) select pk, c1, c2, c3 from s where s.pk not in (select pk from d); you could try making the not in an exists. In released versions of Postgres sometimes one is better than the other. Raising work_mem might matter if it lets you do a hash join for either the IN/EXISTS or the join. There is another approach though whether it's faster depends on how many indexes you have and other factors: CREATE TABLE new_d AS SELECT DISTINCT ON (pk) pk, c1, c2, ... FROM (select 1 as t, * from s union all select 2 as t, * from d) as x ORDER BY pk, t This will pull in all the rows from both tables and sort them by pk with records from s appearing before matching records from d and then keep only the first value for each pk. Then you'll have to build indexes, swap the tables, and fix any views or rules which refer to the old table (they'll still refer to the old table, not the new table, even after renaming it to the old name). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
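The EXISTS alternative suggested above, sketched out with the table and column names from the message:

```sql
UPDATE d SET c1 = s.c1, c2 = s.c2 FROM s WHERE d.pk = s.pk;

-- NOT EXISTS form of the insert; depending on the release this may
-- plan better (e.g. as an anti-join) than NOT IN:
INSERT INTO d (pk, c1, c2)
SELECT s.pk, s.c1, s.c2
  FROM s
 WHERE NOT EXISTS (SELECT 1 FROM d WHERE d.pk = s.pk);
```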
Re: [GENERAL] indexes on functions and create or replace function
Tom Lane [EMAIL PROTECTED] writes: Since determining whether the function actually did change behavior is Turing-complete, we can't realistically try to determine that in software. So we leave it up to the user to reindex if he makes a behavioral change in an indexed function. Another option might be to mark the index invalid. Then allow the user to either reindex the index to enable it or use a magic DBA-only command to enable it asserting that a rebuild isn't necessary. Then of course someone would complain about the downtime caused by queries not using the index during the unavoidable window when the index is invalid. I'm not sure how to solve that. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Postgres 8.3 is not using indexes
Clemens Schwaighofer [EMAIL PROTECTED] writes: Any tips why this is so? They don't appear to contain the same data. If they do have you run analyze recently? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] foreign key restrictions
Richard Broersma [EMAIL PROTECTED] writes: On Sun, Aug 10, 2008 at 1:15 AM, [EMAIL PROTECTED] wrote: Since table TWO has a unique constraint on column ID, the (ID, OFONE) pair will also be unique, obviously. This statement is not completely true. The only part of the pair that is unique is ID. Also there is no unique constraint on the pair, so there is no way for PG to build a foreign key on the pair. Uhm, afaics he's right. If ID is unique not null then (ID, OFONE) also has to be unique. That is, there could be duplicate values of OFONE but they'll all have different values of ID anyways. I'm not sure if there's a fundamental reason why there has to be an index that exactly matches the foreign key or not -- offhand I can't think of one. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
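In practice the way around the restriction is to declare a unique constraint on the pair explicitly, which gives the foreign-key machinery the matching index it insists on. A minimal sketch with hypothetical table definitions:

```sql
CREATE TABLE two (
    id    integer PRIMARY KEY,
    ofone integer,
    UNIQUE (id, ofone)   -- technically redundant given the PK, but required for the FK below
);

CREATE TABLE one (
    id    integer,
    ofone integer,
    FOREIGN KEY (id, ofone) REFERENCES two (id, ofone)
);
```

The extra unique index costs some space and write overhead, which is presumably why it isn't created implicitly.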
Re: [GENERAL] Initdb problem on debian mips cobalt: Bus error
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: (Rather than trying to browbeat configure into doing this, I'd suggest manually adjusting CFLAGS in src/Makefile.global, then make clean and rebuild.) eh? either of these should work fine: ./configure --enable-debug CFLAGS=-O0 CFLAGS=-O0 ./configure --enable-debug The trouble with that approach is that it overrides *everything* that configure would normally put into CFLAGS. I only want one thing changing, please ... this is confusing enough already. Eh? $ ./configure ... configure: using CFLAGS=-O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv configure: using CPPFLAGS= -D_GNU_SOURCE configure: using LDFLAGS= -Wl,--as-needed $ ./configure CFLAGS=-O0 ... configure: using CFLAGS=-O0 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv configure: using CPPFLAGS= -D_GNU_SOURCE configure: using LDFLAGS= -Wl,--as-needed -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Initdb problem on debian mips cobalt: Bus error
Tom Lane [EMAIL PROTECTED] writes: (Rather than trying to browbeat configure into doing this, I'd suggest manually adjusting CFLAGS in src/Makefile.global, then make clean and rebuild.) eh? either of these should work fine: ./configure --enable-debug CFLAGS=-O0 CFLAGS=-O0 ./configure --enable-debug And yes, you have to do make clean. I often forget that step :( -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] AT TIME ZONE and DST in UTC-CET conversion
Jaromír Talíř [EMAIL PROTECTED] writes: postgres=# select '2008-06-01 10:10:10 UTC' AT TIME ZONE 'CET'; timezone - 2008-06-01 12:10:10 ISTM this is the one that's wrong. CET is standard time, i.e. GMT+1. If you want a timezone which switches between CET and CEST automatically you should use something like Europe/Paris. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
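A sketch of the two choices: if a fixed UTC+1 offset is genuinely what's wanted, it can be spelled out as an interval, while a full zone name such as Europe/Paris follows the tz database DST rules (switching between CET and CEST automatically):

```sql
-- fixed offset, never observes DST:
SELECT '2008-06-01 10:10:10 UTC'::timestamptz AT TIME ZONE INTERVAL '+01:00';

-- full tz rules: UTC+2 in June (CEST), UTC+1 in winter (CET):
SELECT '2008-06-01 10:10:10 UTC'::timestamptz AT TIME ZONE 'Europe/Paris';
```

Which result the original poster considers "wrong" depends entirely on which of these two behaviors was intended.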
Re: [GENERAL] Memory use in 8.3 plpgsql with heavy use of xpath()
Tom Lane [EMAIL PROTECTED] writes: Well, you tell me --- *you* reported a behavior that isn't obviously explained by the bug we found. In case it wasn't clear, the bug found was an intra-transaction memory leak. When the transaction ended the memory would be reclaimed. That doesn't seem to match the description of long term memory growth. It's possible that what you were seeing was an indirect effect of the now-known bug: if the xpath leak were to occur repeatedly on a large scale in a long-lived session, I think it's possible that memory allocation behavior might suffer due to fragmentation effects. I feel that that's a pretty hand-wavy explanation though. Another explanation is that there wasn't an inter-transaction memory leak; it was just that the high water mark would grow whenever a transaction processed more data than previous transactions. If the data set size varies a lot, most common distributions would have the majority of data sets be about the same size with a long tail of larger sets. That might look like a slow creep as the increases get rarer and rarer but do continue to happen. That's just a special case of what would be expected to happen with memory allocation anyways though. Few allocators return memory to the OS anyways. It might just be exaggerated in this case since probably a significant part of Postgres's footprint here was the per-transaction memory being used by this leak. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Functional index adding one
[EMAIL PROTECTED] writes: Hi all: I'm trying to create a functional index into column position of token table (see below). I want to make something like: CREATE INDEX token_position_func ON token (position+1); but I get: test=# CREATE INDEX token_position_func test-# ON token (position+1); ERROR: syntax error at or near + LINE 2: ON token (position+1); I think you just need another set of parentheses: CREATE INDEX token_position_func on (token(position+1)) Unless you're on a very old version of Postgres (7.3 or earlier, which had functional indexes but not expression indexes). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Functional index adding one
Gregory Stark [EMAIL PROTECTED] writes: CREATE INDEX token_position_func on (token(position+1)) Ooops, I misread that as if token were a function and not the table. Sam Mason had the right syntax. Sorry. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
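For the record, the working syntax keeps the table name outside and wraps the expression itself in an extra set of parentheses:

```sql
CREATE INDEX token_position_func ON token ((position + 1));
```

The double parentheses are what tell the parser that `position + 1` is an arbitrary expression rather than a bare column name.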
Re: [GENERAL] Switching between terminals
cinu [EMAIL PROTECTED] writes: Could anyone please tell me where I am going wrong and if there is a way I can get the same behaviour that I am getting while I am executing the through psql prompt. a) you might try hitting return occasionally in your email :) b) you maybe need to put a SELECT pg_sleep(10) between the two queries in the first file you run so that it hasn't updated both tables and exited before the second one even starts. But I'm just guessing since you haven't sent the actual files you're running. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
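A sketch of point (b), with hypothetical table names: pg_sleep() keeps the first session's transaction open long enough for the second terminal to actually observe its locks, instead of the first script finishing before the second even starts.

```sql
-- file run in the first terminal
BEGIN;
UPDATE t1 SET val = val + 1;
SELECT pg_sleep(10);   -- hold the row locks for ten seconds
UPDATE t2 SET val = val + 1;
COMMIT;
```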
Re: [GENERAL] Heavily fragmented table and index data in 8.0.3
Zoltan Boszormenyi [EMAIL PROTECTED] writes: Also, VACUUM FULL also takes too much time, on an otherwise idle database, I worked on a copy of their live database. During VACUUM, _bt_getbuf() was also called repeatedly with the block number jumping up and down. VACUUM or VACUUM FULL? VACUUM should only read the table sequentially but VACUUM FULL behaves exactly as you describe, which is one of the reasons it sucks so much. That said, I think 8.0's VACUUM does jump around when cleaning indexes. That's a big reason to upgrade to a more modern version. More recent VACUUMs (but not VACUUM FULL) do only sequential scans of both the table and indexes. VACUUM FULL also tends to bloat indexes. If you've been running VACUUM FULL regularly on this table you may want to REINDEX it. I know, 8.0.3 is quite old. But nothing jumped out from the changelog up to 8.0.15 that would explain this excessive slowness. SELECTs are pretty fast on any of the tables I tried, but INSERT hangs on this table. How does this fragmentation happen and how can we prevent this situation? I'm not sure fragmentation has a direct analogy since tuples lie entirely on one page. Unless perhaps you have a problem with TOAST data being laid out poorly. Are any individual rows in tables over 2k? The problems people do run into are either a) lots of dead space, because either vacuum (plain old vacuum, not full) wasn't run regularly or because large batch updates or deletes left space that later activity could never reuse, or b) bloated indexes, either due to the above or due to deleting many but not all tuples from a range and then never inserting into that range again. Indexes can only reuse space if you insert into the same page again or if you delete all the tuples on a page. One trick you could use if you can stand the downtime is to periodically CLUSTER the table. 
Older versions of Postgres had concurrency bugs in CLUSTER to watch out for, but as long as you don't run it at the same time as a very long-running transaction such as pg_dump it shouldn't be a problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
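On a release as old as 8.0, the maintenance suggested above would look something like this (index and table names are hypothetical; the newer CLUSTER ... USING spelling didn't exist yet, so the old index-first form is shown):

```sql
-- rebuild all of a table's bloated indexes
REINDEX TABLE big_table;

-- rewrite the table in index order; this also rebuilds its indexes compactly
CLUSTER big_table_pkey ON big_table;
```

Both take exclusive locks, so they are downtime operations as the message notes.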
Re: [GENERAL] when to reindex?
Kevin Hunter [EMAIL PROTECTED] writes: Or, assuming the REINDEX is for speed/bloat, not for corruption, perhaps an option to use the old index as a basis, rather than scanning the entire table multiple times as with a CREATE INDEX CONCURRENTLY. That's been mentioned, it ought to be on the TODO. The trick is determining *when* to use the index and when to use the table -- but that's something the planner already does quite well and we could hopefully leverage that. Note that in typical cases it would be slower. REINDEX scans the table precisely once and sorts it. The sorting will probably have to do multiple passes through temporary files which is presumably what you're referring to. But those passes are still at least sequential. A full index scan has to do random access reads and in many cases read the same page many times to get different records. The cases where it would win would be where you have a lot of dead space in the table (but not in the index), partial indexes which don't cover much of the table, or a table which is already very well clustered (possibly, depending on other factors). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] turning fsync off for WAL
Ram Ravichandran [EMAIL PROTECTED] writes: The problem that I am facing is that EC2 has no persistent storage (at least currently). So, if the server restarts for some reason, all data on the local disks are gone. The idea was to store the tables on the non-persistent local disk, and do the WAL on to an S3 mounted drive. If the server goes down for some reason, I was hoping to recover by replaying the WAL. I was hoping that by faking the fsyncs, I would not incur the actual charges from Amazon until the file system writes into S3. Also, since WAL is on a separate FS, it will not affect my disk-write rates. Ahh. I think you can use this effectively but not the way you're describing. Instead of writing the wal directly to persistentFS what I think you're better off doing is treating persistentFS as your backup storage. Use Archiving as described here to archive the WAL files to persistentFS: http://postgresql.com.cn/docs/8.3/static/runtime-config-wal.html#GUC-ARCHIVE-MODE Then if your database goes down you'll have to restore from backup (stored in persistentFS) and then run recovery from the archived WAL files (from persistentFS) and be back up. You will lose any transactions which haven't been archived yet but you can control how many transactions you're at risk of losing versus how much you pay for all the puts. The more puts the fewer transactions you'll be putting at risk but the more you'll pay. You can also trade off paying for more frequent puts of hot backup images (make sure to read how to use pg_start_backup() properly) against longer recovery times. TANSTAAFL :( If you do this then you may as well turn fsync off on the server since you're resigned to having to restore from backup on a server crash anyways... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! 
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
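The archiving setup suggested above is just the standard archive_command machinery pointed at the mounted PersistentFS volume. A sketch with hypothetical paths (note that the archive_mode GUC itself is an 8.3 addition):

```
# postgresql.conf
archive_mode    = on
archive_command = 'cp %p /mnt/persistentfs/wal_archive/%f'
```

Each completed 16MB WAL segment then becomes one put to S3, so the segment size effectively sets the floor on how much un-archived work is at risk.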
Re: [GENERAL] turning fsync off for WAL
Ram Ravichandran [EMAIL PROTECTED] writes: Hey, I am running a postgresql server on Amazon EC2. My current plan is to mount an Amazon S3 bucket as a drive using PersistentFS which is a POSIX-compliant file system. I will be using this for write-ahead-logging. The issue with S3 is that though the actual storage is cheap, they charge $1 per 100,000 put requests - so frequent fsyncs will cost me a lot. Wow, this is a fascinating situation. Are you sure the fsyncs are the only thing to worry about though? Postgres will call write(2) many times even if you disabled fsync entirely. Surely the kernel and filesystem will eventually send some of them through even if no fsyncs arrive? Is it only fsyncs on the write-ahead-log that matter? Or on the data as well? Checkpoints fsync the data files. The logs are fsynced on every commit and also whenever a buffer has to be flushed if the logs for the last changes in that buffer haven't been synced yet. I've been talking to the makers of persistentFS, and one possible solution is for the file system to disobey fsyncs. I am trying to find out the implications of this method in case of a crash. Will I only lose information since the last fsync? Or will the earlier data, in general, be corrupted due to some out-of-order writes (I remember seeing this somewhere)? There actually is an option in Postgres to not call fsync. However your fear is justified. If your file system can flush buffers to disk in a different order than they were written (and most can) then it's possible for a database with fsync off to become corrupted. Typical examples would be things like records missing index pointers (or worse, index pointers to wrong records), or duplicate or missing records (consider if an update is only partly written). This is only an issue in the event of either a kernel crash or power failure (whatever that means for a virtual machine...). In which case the only safe course of action is to restore from backup. 
It's possible that in the context of Amazon these would be rare enough events, and restoring from backups easy enough, that it might be worth considering? However a safer and more interesting option with Postgres 8.3 would be to disable synchronous_commit and set a very large wal_writer_delay. Effectively this would do the same thing, disabling fsync for every transaction, but not risk the data integrity. The default wal_writer_delay is 200ms, meaning 5 fsyncs per second, but you could raise that substantially to get fewer fsyncs, possibly into the range of minutes. If you raise it *too* far then you'll start observing fsyncs due to backends being forced to flush dirty buffers before their changes have been logged and synced. The only way to raise that threshold would be to increase shared_buffers, which would have complex effects. You'll also have to look at the checkpoint_timeout and checkpoint_segments parameters. And probably the bgwriter variables as well (lest it start trying to flush buffers whose changes haven't been logged yet too). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
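Pulling the 8.3 settings discussed above into one place, as a sketch; the values are illustrative only, and the right trade-off depends on how much unsynced work can be lost on a crash:

```
# postgresql.conf (8.3) -- illustrative values, not recommendations
fsync               = on      # keep real fsyncs, just make them rarer
synchronous_commit  = off     # commits return before their WAL is synced
wal_writer_delay    = 10s     # batch WAL syncs together
checkpoint_timeout  = 30min   # fewer checkpoints = fewer data-file fsyncs
checkpoint_segments = 64
```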
Re: [GENERAL] Bottom Posting
Tom Lane [EMAIL PROTECTED] writes: Bottom line: have some respect for your readers. Make it easy to distinguish what you wrote from the preceding material, and remember that the only reason you are quoting anything at all is to provide some context for what you are saying. We don't need to re-read the entire darn thread. I don't think the people who top-post or quote the entire message are doing it out of disrespect. They just have never been exposed to the traditional style. The main reason I posted this was to demonstrate that there's really no reason to quote the original message. My response was only to this one point and not the longer previous point. I actually think this is a more important point to get across than simply don't top post which just seems to generate lots of bottom posts that are just as bad. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Open Source CRM - Options?
Chris Browne [EMAIL PROTECTED] writes: RT has a very different purpose; it was designed to track work (e.g. - work tickets), as opposed to managing web site content. It *might* be used as a bug tracker, though with a considerably different flavour from (say) Bugzilla; as a CRM, it would be pretty unsuitable :-(. FWIW I think you guys are using CRM to mean something very different from its usual meaning. A trouble ticket tracking system like RT is pretty much precisely what is normally referred to as a CRM. What you seem to be referring to is some subset of CMS, I don't know of any specific name for this kind of system. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] active queries
Alban Hertroys [EMAIL PROTECTED] writes: THE 'active' query (not AN) is the query that's currently being executed on a connection. There can be multiple connections with an active query each. You can't execute multiple queries in parallel on the same connection. You can execute them in sequence of course. What people are trying to tell you is that you apparently have multiple threads that perform queries simultaneously on the same connection, and that's not possible. FWIW it's not just queries that are a problem. You pretty much can't call *any* libpq function on a connection while another thread is calling any other libpq call on the same connection. It is however safe (assuming you built postgres with the thread-safe option) to call two libpq calls from different threads on *different* connections. And it's always safe to call two libpq calls from different threads on the same connection as long as you ensure that one is finished before the second one begins. In either case the normal way to guarantee either property is to put a mutex around either each connection or all of libpq. You can actually have more than one query simultaneously active in the same connection using named portals. But if you called PQgetResult() on one you have to ensure all threads wait until it returns before issuing PQgetResult() (or any other libpq function) on the other portal. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
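At the SQL level, the named-portal situation described above looks like two cursors open simultaneously in one transaction (table names here are hypothetical). In libpq each FETCH is a separate command round trip, so in a threaded program every one of these statements would need the per-connection mutex held:

```sql
BEGIN;
DECLARE c1 CURSOR FOR SELECT * FROM t1;
DECLARE c2 CURSOR FOR SELECT * FROM t2;
FETCH 100 FROM c1;   -- both portals stay open between fetches
FETCH 100 FROM c2;
FETCH 100 FROM c1;
CLOSE c1;
CLOSE c2;
COMMIT;
```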
Re: [GENERAL] Psql crashes with Segmentation fault on copy from
Francisco Reyes [EMAIL PROTECTED] writes: #1 0x002a955820ae in pqPutMsgBytes (buf=0x2a9860a010, len=2147483647, So it's trying to execute an sql query that's MAXINT bytes long which is pretty off-course. #5 0x00406e95 in SendQuery ( query=0x2a9860a010 yjbjK8WKLRHKavptOqlPD4QeI24=\t200803\t2008-03-11\t0\t10.47\t08620667742MDF1\t\tMCDONALD'S F7742\t5814\t1003\t1187954\nyjbjK8WKLRHKavptOqlPD4QeI24=\t200803\t2008-03-11\t0\t16.99\t00614111869AWG1\t\tPIGGLY WIGGLY#52\t5411...) at common.c:878 #6 0x00409dfa in MainLoop (source=0x5414f0) at mainloop.c:340 #7 0x0040408d in process_file (filename=0x541d70 data/usb_t_60M.sql, single_txn=0 '\0') at command.c:1394 #8 0x004057cd in exec_command (cmd=0x541bc0 i, scan_state=0x549b30, query_buf=0x549890) at command.c:546 Could you post the first couple lines of data/usb_t_60M.sql ? Does it really have a COPY command at the beginning? Are you really doing \i data/usb_t_60M.sql or were you trying to do a copy from this file? #9 0x00406286 in HandleSlashCmds (scan_state=0x549b30, query_buf=0x549890) at command.c:92 #10 0x00409f66 in MainLoop (source=0x541070) at mainloop.c:259 #11 0x0040408d in process_file (filename=0x560750 bench-index-after.sql, single_txn=0 '\0') at command.c:1394 #12 0x004057cd in exec_command ( cmd=0x569360 Hxnh9aOwMviVVP9I=\t200803\t2008-03-10\t0\t15.54\t434240509886\t\tNU-YALE CLEANERS (BARD\t4225\t0\t11636644\n3MM9gwBGMHiyPGmUrqIEOOha1Ao=\t200803\t2008-03-11\t0\t739.95\t55417210017\t\tUSAA PC PREMIUM\t6300\t0\t183718..., scan_state=0x549c20, query_buf=0x54a480) at command.c:546 I'm a bit puzzled by this. The command being executed here seems to have been overwritten by data from the later command -- before it was overwritten it was apparently \i bench-index-after.sql. It seems something with the PsqlScanState is not being sufficient to make the lexer completely reentrant here. What version of flex is this built with? 
-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bottom Posting
Bob Pawley [EMAIL PROTECTED] writes: The concept of most lists should be the free exchange of ideas in the most efficient manner possible. What is this in response to? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] small table, huge table, and a join = slow and tough query. cake inside!
edfialk [EMAIL PROTECTED] writes: So, what I need is all kinds of things, but to start (pseudo-query): SELECT small.fips, small.geom, small.name, SUM(huge.value) from small JOIN huge on huge.fips = small.fips WHERE (SUM(huge.value)) > 500 AND huge.pollutant='co'; wonder if that makes sense. Obviously, can't have an aggregate in where clause, so I've tried a couple WHERE (SELECT) kind of things, nothing working out too well. For this case GROUP BY plus HAVING will suffice: select small.fips, small.geom, small.name, sum(huge.value) from small join huge on (huge.fips = small.fips) where huge.pollutant='co' group by small.fips, small.geom, small.name having sum(huge.value) > 500 But in more complex cases you may have to use a subquery and further where clauses or even joins outside the subquery. You could write this, for example, as: select * from small join ( select fips, sum(huge.value) as sum from huge where pollutant='co' group by fips ) as huge_sum using (fips) where huge_sum.sum > 500 Which may actually run faster. (Unfortunately Postgres doesn't use the foreign key relationship when planning, so it can't reorder the join and the where clause; it doesn't know that every huge record will have a matching small record.) any ideas on how I could speed up the query, I would be so extremely grateful. You could try an index on pollutant or (pollutant, fips), but with the numbers you're talking about they probably won't help unless you have a *lot* of different pollutants and not all that many records for each pollutant. To get anything more you would have to post an EXPLAIN output, and preferably an EXPLAIN ANALYZE output if you can find a query which completes. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Error: Could not open relation...
Howard Cole [EMAIL PROTECTED] writes: Are there likely to be serious integrity implications if Postgres failed to access/write to this table for whatever reason, or would the transaction be rolled back? The command would get an error and the transaction rolled back. I would be more concerned about integrity issues if you started having the same error with something like an xlog WAL file though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general