Re: [HACKERS] [BUGS] BUG #8542: Materialized View with another column_name does not work?
Michael Paquier <michael.paqu...@gmail.com> wrote:

> I am not sure that adding a boolean flag introducing a concept related
> to matview inside checkRuleResultList is the best approach to solve
> that. checkRuleResultList is something related only to rules, and has
> nothing related to matviews in it yet.

Well, I was tempted to keep that concept in DefineQueryRewrite() where the call is made, by calling the new bool something like requireColumnNameMatch and not having checkRuleResultList() continue to use isSelect for that purpose at all. DefineQueryRewrite() already references RELKIND_RELATION once, RELKIND_MATVIEW twice, and RELKIND_VIEW three times, so it would hardly be introducing a new concept there. I did it the way I did in the posted patch because it seemed to more nearly match what Tom was suggesting.

> I have been pondering about this issue myself, and wouldn't it be
> better to directly enforce the targetList passed in the
> checkRuleResultList call at an upper level, using the column name list
> given by the user (if there is one) instead of the list of columns of
> the SELECT query? After looking closely at the code this looks to be a
> better approach from a general viewpoint. I didn't get the time to
> write a patch, but this is the conclusion I reached after some
> analysis at least...

That doesn't sound like something we could back-patch. Anyway, I'm not sure why that would be better. We have rewrite rules on three kinds of relations -- tables, views, and materialized views. In general, a materialized view tends to support the properties of both a view and a table, with a few points in the rewrite code that need to pay attention to which kind of relation the rule applies to. Among the various validations for the rewrite, this one validation (matching column names) does not apply to a non-SELECT rule with a RETURNING clause or to a SELECT rule which populates a materialized view. If you were to move the code to exclude this validation for the latter case, wouldn't you also need to move the validation for the former case? Where would you put that?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
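For readers following along, a minimal sketch of the behavior under discussion, with hypothetical object names (see BUG #8542 for the original report; the exact error text may vary by version):

    CREATE TABLE t (id int);

    -- CREATE TABLE AS accepts an explicit column list that renames the output:
    CREATE TABLE t2 (renamed) AS SELECT id FROM t;

    -- The equivalent matview definition failed validation of its stored
    -- SELECT rule, because the rule's column names no longer matched:
    CREATE MATERIALIZED VIEW mv (renamed) AS SELECT id FROM t;
    -- ERROR:  SELECT rule's target entry 1 has different column name from "renamed"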
Re: [HACKERS] [BUGS] BUG #8542: Materialized View with another column_name does not work?
Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:

> A CREATE MATERIALIZED VIEW statement ends up being a CREATE TABLE AS
> statement underneath, with table type matview. In that case, why is
> there special treatment only for materialized views and not for CTAS
> in general, which allows column names to be specified, as in the case
> in the bug reported?

While the initial population of the matview is a lot like CTAS (and so far shares some code), the critical difference is that CTAS doesn't store a rule to support repopulating the table. This issue comes up in the validation of that rule for the matview.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] How should row-security affects ON UPDATE RESTRICT / CASCADE ?
On 10/30/2013 11:25 AM, Kohei KaiGai wrote:
> +
> +	/*
> +	 * Row-level security should be disabled in case when foreign-key
> +	 * relation is queried to check existence of tuples that references
> +	 * the primary-key being modified.
> +	 */
> +	temp_sec_context = save_sec_context | SECURITY_LOCAL_USERID_CHANGE;
> +	if (source_is_pk)
> +		temp_sec_context |= SECURITY_ROW_LEVEL_DISABLED;
> +
>  	SetUserIdAndSecContext(RelationGetForm(query_rel)->relowner,
> -						   save_sec_context | SECURITY_LOCAL_USERID_CHANGE);
> +						   temp_sec_context);

This isn't sufficient, as it doesn't kick in when the _target_ is a primary key. For example, if you run this against the rowsecurity regression test schema you should get a nice one-row insert, but instead row-security prevents even the superuser from seeing the target row from within the FK check, leading to counter-intuitive behaviour like:

    test=# insert into document(did,cid,dlevel,dauthor,dtitle)
           select 1000, cid, dlevel, dauthor, dtitle from document where did = 8;
    ERROR:  insert or update on table "document" violates foreign key constraint "document_cid_fkey"
    DETAIL:  Key (cid)=(44) is not present in table "category".

So: what's the logic behind restricting this to source_is_pk? Shouldn't *all* FKs be exempt from rowsecurity checks? I think the code should just be:

    temp_sec_context = save_sec_context | SECURITY_LOCAL_USERID_CHANGE
        | SECURITY_ROW_LEVEL_DISABLED;

i.e. ALL foreign key checks are exempt from row security filters.

This isn't a complete solution as it doesn't solve another hole. The owner of the RI target table can create a trigger that runs in this security context when an ON DELETE CASCADE, ON UPDATE CASCADE or ON DELETE SET NULL trigger is invoked. They can abuse this trigger to see rows in other tables, even ones they don't own.

Rather than completely disabling row security in triggers, we might instead need to disable row security for *tables owned by the user the RI trigger is being invoked as*. Or just for the table that's the subject of the RI check. Otherwise we get this nasty security hole:

SET SESSION AUTHORIZATION rls_regress_user2;

CREATE TABLE secrets(
    id integer primary key,
    cid integer not null,
    dummy text
);

INSERT INTO secrets (id, cid, dummy) VALUES (1, 11, 'a'), (2, 22, 'b');

ALTER TABLE secrets SET ROW SECURITY FOR ALL TO (cid = 11);

-- rls_regress_user2 only wants user0 to see the rows the row
-- security policy permits
GRANT SELECT ON secrets TO rls_regress_user0;

-- Now, naughty rls_regress_user0 wants to know what's so secret
-- in rls_regress_user2's table, so it sets up a cascade delete
-- between two of its OWN tables, completely unrelated, so it can
-- run a trigger with SECURITY_ROW_LEVEL_DISABLED rights.
SET SESSION AUTHORIZATION rls_regress_user0;

CREATE TABLE leaked (LIKE secrets);

CREATE OR REPLACE FUNCTION malicious_trigger() RETURNS TRIGGER AS $$
BEGIN
    -- Should only see cid=11, but sees both! oops!
    INSERT INTO leaked SELECT * FROM secrets;
    IF tg_op = 'INSERT' OR tg_op = 'UPDATE' THEN
        RETURN new;
    ELSE
        RETURN old;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TABLE parent(id integer primary key);

INSERT INTO parent(id) VALUES (1);

CREATE TABLE child(
    parent_id integer not null references parent(id)
        on delete cascade on update cascade
);

INSERT INTO child(parent_id) values (1);

CREATE TRIGGER malicious BEFORE UPDATE OR DELETE ON child
    FOR EACH ROW EXECUTE PROCEDURE malicious_trigger();

-- Now rls_regress_user0 only needs to delete the parent row, triggering
-- a cascade via the RI trigger to the child, which invokes the
-- malicious trigger and copies the contents of the target table.

    test=> DELETE FROM parent;
    DELETE 1
    test=> SELECT * FROM leaked;
     id | cid | dummy
    ----+-----+-------
      1 |  11 | a
      2 |  22 | b
    (2 rows)

oops! rls_regress_user0 should only be able to see id=1, but it bypassed RLS using the security context of the cascade to see id=2 as well!

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] [PATCH] pg_receivexlog: fixed to work with logical segno > 0
pg_receivexlog calculated the xlog segment number incorrectly when started after the previous instance was interrupted. Resuming streaming only worked when the physical wal segment counter was zero, i.e. for the first 256 segments or so.
---
 src/bin/pg_basebackup/pg_receivexlog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
index 031ec1a..6f9fcf4 100644
--- a/src/bin/pg_basebackup/pg_receivexlog.c
+++ b/src/bin/pg_basebackup/pg_receivexlog.c
@@ -171,7 +171,7 @@ FindStreamingStart(uint32 *tli)
 				progname, dirent->d_name);
 		disconnect_and_exit(1);
 	}
-	segno = ((uint64) log) << 32 | seg;
+	segno = (((uint64) log) << 8) | seg;

 	/*
 	 * Check that the segment has the right size, if it's supposed to be
--
1.8.0.1
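To see why the shift width matters: with the default 16MB segments there are 256 segments per xlog id, so a segment file whose name encodes high part `log` and low part `seg` corresponds to segment number log * 256 + seg. A quick sanity check of the corrected expression, with illustrative values:

    -- (log << 8) | seg is equivalent to log * 256 + seg for seg < 256
    SELECT (3::bigint << 8) | 20 AS via_shift,
           3 * 256 + 20        AS via_multiply;
    -- both yield 788

The old `<< 32` computed log * 2^32 + seg, which only agreed with the correct value while log was zero, matching the reported symptom of breaking after the first 256 segments.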
Re: [HACKERS] Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
From: Fujii Masao [mailto:masao.fu...@gmail.com]
> This is what I'm looking for! This feature is really useful for tuning
> work_mem when using full text search with pg_trgm. I'm not sure if
> it's a good idea to show the number of the fetches, because it seems
> difficult to tune work_mem from that number. How can we calculate, from
> the number of the fetches in the EXPLAIN output, how much to increase
> work_mem to avoid a lossy bitmap?

We can calculate that from the following equation in tbm_create():

    nbuckets = maxbytes /
        (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
         + sizeof(Pointer) + sizeof(Pointer))

where maxbytes is the size of memory used for the hashtable in a TIDBitmap, designated by work_mem, and nbuckets is the estimated number of hashtable entries we can have within maxbytes. From this, the size of work_mem within which we can have every hashtable entry as an exact bitmap is calculated as follows:

    work_mem = (the number of exact pages + the number of lossy pages) *
        (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
         + sizeof(Pointer) + sizeof(Pointer)) / (1024 * 1024)

I'll show you an example. The following is the result for work_mem = 1MB:

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
                               QUERY PLAN
--------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=24.907..1119.961 rows=100047 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
   Rows Removed by Index Recheck: 5484114
   Heap Blocks: exact=11975 lossy=46388
   ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=22.821..22.821 rows=100047 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
 Total runtime: 1129.334 ms
(7 rows)

So, by setting work_mem to

    work_mem = (11975 + 46388) *
        (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
         + sizeof(Pointer) + sizeof(Pointer)) / (1024 * 1024)

which is about 5MB, we have the following (note that there are no lossy heap pages!):

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
                               QUERY PLAN
--------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=42.981..120.252 rows=100047 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
   Heap Blocks: exact=58363
   ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=26.023..26.023 rows=100047 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
 Total runtime: 129.304 ms
(6 rows)

> BTW, as the EXPLAIN ANALYZE output, the number of exact/lossy heap
> pages would be fine with me. Anyway, could you add the patch into the
> next CF?

Done. Thanks,

Best regards,
Etsuro Fujita
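As a rough shortcut for that calculation: on a typical 64-bit build the per-entry size above works out to somewhere around 90 bytes. The constant below is an assumed ballpark, not an exact value; it depends on MAXALIGN and the struct sizes on the platform.

    -- assumed per-entry size: MAXALIGN(sizeof(HASHELEMENT)) +
    -- MAXALIGN(sizeof(PagetableEntry)) + 2 * sizeof(Pointer) ~= 90 bytes
    SELECT round((11975 + 46388) * 90 / (1024.0 * 1024), 1) AS work_mem_mb;
    -- => 5.0, i.e. the ~5MB used in the example above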
Re: [HACKERS] Shave a few instructions from child-process startup sequence
On Thu, Oct 31, 2013 at 11:20 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapil...@gmail.com> writes:
>> On Thu, Oct 31, 2013 at 2:41 AM, Gurjeet Singh <gurj...@singh.im> wrote:
>>> Just a small patch; hopefully useful.
>> This is a valid saving, as we are filling the array ListenSocket[] in
>> StreamServerPort() serially, so during ClosePostmasterPorts(), once it
>> encounters PGINVALID_SOCKET it is valid to break the loop. Although
>> the savings are small, considering this doesn't occur in any
>> performance path, I still think this is the right thing to do in the
>> code. It is better to register this patch in the CF app list, unless
>> someone feels this is not right.
> I think this is adding fragility for absolutely no meaningful savings.
> The existing code does not depend on the assumption that the array is
> filled consecutively and no entries are closed early. Why should we
> add such an assumption here?

IMHO, typical configurations have one, or maybe two, of these array elements populated: one for TCP 5432 (for localhost or *), and the other for Unix domain sockets. Saving 62 iterations and comparisons in the startup sequence may not be much, but IMHO it does match logic elsewhere. In fact, this was inspired by the following code in ServerLoop():

ServerLoop()
{
    ...
    /*
     * New connection pending on any of our sockets? If so, fork a child
     * process to deal with it.
     */
    if (selres > 0)
    {
        int i;

        for (i = 0; i < MAXLISTEN; i++)
        {
            if (ListenSocket[i] == PGINVALID_SOCKET)
                break;
            if (FD_ISSET(ListenSocket[i], &rmask))
            {

And looking for other precedents, I found that the initMasks() function also relies on the array being filled consecutively, with the first PGINVALID_SOCKET marking the end of the valid array members.

Best regards,
--
Gurjeet Singh
http://gurjeet.singh.im/
EnterpriseDB Inc.
[HACKERS] Save Hash Indexes
Hi,

Here's an idea: when a user asks for a hash index, transparently build a btree index over a hash function instead.

Advantages:
 - it works
 - it's crash safe
 - it's (much?) faster than a hash index anyways

Drawbacks:
 - root access concurrency
 - we need a hash_any function stable against pg_upgrade

After talking about it with Heikki, we don't seem to find ways in which the root access concurrency pattern would be visible enough to matter. Also, talking with Peter Geoghegan, it's unclear that there's a use case where a hash index would be faster than a btree index over the hash function.

Comments?

--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
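For concreteness, the manual version of the trick being proposed, as users can already write it today (object names hypothetical; hashtext() is an internal hash function, and its lack of a cross-version stability guarantee is exactly the "stable against pg_upgrade" drawback noted above):

    CREATE TABLE t (col text);

    -- index the hash of the column rather than the column itself
    CREATE INDEX t_col_hash_idx ON t (hashtext(col));

    -- equality lookups must repeat the indexed expression, keeping the
    -- original comparison to weed out hash collisions
    SELECT * FROM t WHERE hashtext(col) = hashtext('value') AND col = 'value';

The proposal amounts to having CREATE INDEX ... USING hash set this up, and having the planner apply the rewrite, automatically.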
Re: [HACKERS] Save Hash Indexes
On Fri, Nov 01, 2013 at 01:31:10PM +0000, Dimitri Fontaine wrote:
> Here's an idea: when a user asks for a hash index, transparently build
> a btree index over a hash function instead.
>
> Advantages:
>  - it works
>  - it's crash safe
>  - it's (much?) faster than a hash index anyways
>
> Drawbacks:
>  - root access concurrency
>  - we need a hash_any function stable against pg_upgrade
>
> After talking about it with Heikki, we don't seem to find ways in
> which the root access concurrency pattern would be visible enough to
> matter. Also, talking with Peter Geoghegan, it's unclear that there's
> a use case where a hash index would be faster than a btree index over
> the hash function. Comments?

Hi Dimitri,

This use of a function index as a safe hash index has been the substitute for a while. Check the previous threads. It is not a true substitute, because a hash index is O(1) for lookups but a btree is O(log n), so hash indexes have an advantage for very large numbers of entries. In fact, a recent post compared both the btree substitute and the current hash index, and for large indexes the hash allowed 2X the lookups of the equivalent btree, which is what you would expect.

The use-case is exactly for very large tables/indexes where the index does not fit in memory, to say nothing of the data itself.

Regards,
Ken
Re: [HACKERS] Save Hash Indexes
Dimitri Fontaine <dimi...@2ndquadrant.fr> writes:
> Here's an idea: when a user asks for a hash index, transparently build
> a btree index over a hash function instead.

-1. If someone asks for a hash index, they should get a hash index. If you feel the documentation isn't sufficiently clear about the problems involved, we can work on that.

The bigger picture here is that such an approach amounts to deciding that no one will ever be allowed to fix hash indexes. I'm not for that, even if I'm not volunteering to be the fixer myself.

I also don't believe your claim that this would always be faster than a real hash index. What happened to O(1) vs O(log N)?

Lastly: what real-world problem are we solving by kicking that code to the curb?

			regards, tom lane
Re: [HACKERS] Save Hash Indexes
Tom Lane <t...@sss.pgh.pa.us> writes:
> -1. If someone asks for a hash index, they should get a hash index.
> If you feel the documentation isn't sufficiently clear about the
> problems involved, we can work on that.

Fair enough.

> Lastly: what real-world problem are we solving by kicking that code
> to the curb?

Well, how long did we wait for the fix already? The lack of crash safety and replication ability is not just a drawback; it's killing user adoption.

Maybe we should implement UNLOGGED indexes, and then hash indexes would only exist in the UNLOGGED fashion?

But well, yeah, OK, my idea is a non-starter.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Save Hash Indexes
On 11/01/2013 09:49 AM, Tom Lane wrote:
> Dimitri Fontaine <dimi...@2ndquadrant.fr> writes:
>> Here's an idea: when a user asks for a hash index, transparently
>> build a btree index over a hash function instead.
> -1. If someone asks for a hash index, they should get a hash index.
> If you feel the documentation isn't sufficiently clear about the
> problems involved, we can work on that.
>
> The bigger picture here is that such an approach amounts to deciding
> that no one will ever be allowed to fix hash indexes. I'm not for
> that, even if I'm not volunteering to be the fixer myself.
>
> I also don't believe your claim that this would always be faster than
> a real hash index. What happened to O(1) vs O(log N)?
>
> Lastly: what real-world problem are we solving by kicking that code to
> the curb?

Yeah, and there's this: I've had at least one client who switched to using hash indexes and got a significant benefit from it precisely because they aren't WAL logged. They could afford to rebuild the indexes in the unlikely event of a crash, but the IO gain was worth it to them. Neither could they have tolerated unlogged tables - they wanted crash safety for their data. After talking through the various options with them they decided this was the best choice, and it might be sad to remove that choice from people.

cheers

andrew
Re: [HACKERS] Save Hash Indexes
Andrew Dunstan <and...@dunslane.net> writes:
> Yeah, and there's this: I've had at least one client who switched to
> using hash indexes and got a significant benefit from it precisely
> because they aren't WAL logged. They could afford to rebuild the
> indexes in the unlikely event of a crash, but the IO gain was worth it
> to them. Neither could they have tolerated unlogged tables - they
> wanted crash safety for their data. After talking through the various
> options with them they decided this was the best choice, and it might
> be sad to remove that choice from people.

That's an interesting story, but it seems like what it points to is the need for a general unlogged index feature, rather than depending on what's universally agreed to be an implementation deficiency of hash indexes. So it sounds like an independent topic.

			regards, tom lane
Re: [HACKERS] [GENERAL] Cannot create matview when referencing another not-populated-yet matview in subquery
Laurent Sartran <lsart...@gmail.com> wrote:

> CREATE MATERIALIZED VIEW t1 AS SELECT text 'foo' AS col1 WITH NO DATA;
> CREATE MATERIALIZED VIEW t2b AS SELECT * FROM t1
>   WHERE col1 = (SELECT LEAST(col1) FROM t1) WITH NO DATA;
> ERROR:  materialized view "t1" has not been populated
> HINT:  Use the REFRESH MATERIALIZED VIEW command.
>
> Is this behavior expected?

No, and git bisect shows that it worked until commit 5194024d72f33fb209e10f9ab0ada7cc67df45b7. Moving to -hackers list for discussion of how to fix it.

It looks like the above commit missed a trick here:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 791f336..0b47106 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -865,7 +865,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 * it is a parameterless subplan (not initplan), we suggest that it be
 	 * prepared to handle REWIND efficiently; otherwise there is no need.
 	 */
-	sp_eflags = eflags & EXEC_FLAG_EXPLAIN_ONLY;
+	sp_eflags = eflags
+		& (EXEC_FLAG_EXPLAIN_ONLY | EXEC_FLAG_WITH_NO_DATA);
 	if (bms_is_member(i, plannedstmt->rewindPlanIDs))
 		sp_eflags |= EXEC_FLAG_REWIND;

The test case provided works with this change. Does anyone see a problem with that? If not, I'll push it with the above test case added to the regression tests.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Save Hash Indexes
On 2013-11-01 09:49:57 -0400, Tom Lane wrote:
> Lastly: what real-world problem are we solving by kicking that code to
> the curb?

It makes hashed lookups much easier to use. Currently, if you want indexed access over wide columns and equality is all you need, you have to write rather awkward queries to make that happen, in each query, something like:

    WHERE hash(column) = hash('value') AND column = 'value'

So some magic that hides having to do that would be nice, even if it doesn't have O(log(n)) properties.

The interesting part is that the index gets much denser and smaller, and that the individual comparisons are much cheaper.

Greetings,

Andres Freund

--
Andres Freund
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] API bug in DetermineTimeZoneOffset()
I wrote:
> The second attached patch, to be applied after the first, removes the
> existing checks of HasCTZSet in the backend. The only visible effect
> of this, AFAICT, is that to_char's TZ format spec now delivers
> something useful instead of an empty string when a brute-force
> timezone is in use. I could be persuaded either way as to whether to
> back-patch this part. From one standpoint, this to_char behavioral
> change is clearly a bug fix --- but it's barely possible that somebody
> out there thought that returning an empty string for TZ was actually
> the intended/desirable behavior.

Any opinions about whether to back-patch this part or not? It seems like a bug fix, but on the other hand, I don't recall any complaints from the field about to_char's TZ spec not working with brute-force zones. So maybe the prudent thing is to leave well enough alone.

			regards, tom lane
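For reference, a "brute-force" zone is one set as a fixed offset rather than a named zone; the behavior under discussion can be seen with something like the following (the empty string is the pre-patch behavior being described, and the actual abbreviation returned depends on the date and zone):

    SET TIME ZONE INTERVAL '-07:00' HOUR TO MINUTE;  -- a brute-force zone
    SELECT to_char(now(), 'TZ');                     -- empty string pre-patch

    SET TIME ZONE 'America/Los_Angeles';             -- a named zone
    SELECT to_char(now(), 'TZ');                     -- e.g. PDT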
Re: [HACKERS] Save Hash Indexes
On Friday, November 1, 2013, k...@rice.edu wrote:
> On Fri, Nov 01, 2013 at 01:31:10PM +0000, Dimitri Fontaine wrote:
>> Here's an idea: when a user asks for a hash index, transparently
>> build a btree index over a hash function instead. [...]
> This use of a function index as a safe hash index has been the
> substitute for a while. Check the previous threads. It is not a true
> substitute, because a hash index is O(1) for lookups but a btree is
> O(log n), so hash indexes have an advantage for very large numbers of
> entries. In fact, a recent post compared both the btree substitute and
> the current hash index, and for large indexes the hash allowed 2X the
> lookups of the equivalent btree, which is what you would expect. The
> use-case is exactly for very large tables/indexes where the index does
> not fit in memory, to say nothing of the data itself.

Hi,

I agree too. Theoretically, hash tables are constant-time lookup while btrees are logarithmic. There may be cases where we may get better performance with btrees, but replacing hash indexes with them completely is not really a good idea, IMHO.

Is the motivation behind this idea drawn from workloads where btrees perform better, or are you trying to fix issues with hash indexes?

--
Regards,
Atri
l'apprenant
Re: [HACKERS] Save Hash Indexes
On Fri, Nov 1, 2013 at 6:31 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Here's an idea: when a user asks for a hash index, transparently build
> a btree index over a hash function instead.

Could something be added to the planner so that you can just build a btree index on a hash expression, and have the planner automatically realize it can be used for an equality query, because if a=b, then md5(a)=md5(b), at least for some data types? A general ability to recognize things like that seems like it might have other uses as well.

Cheers,

Jeff
Re: [HACKERS] Save Hash Indexes
On Fri, Nov 1, 2013 at 6:31 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Here's an idea: when a user asks for a hash index, transparently build
> a btree index over a hash function instead. [...]
> Also, talking with Peter Geoghegan, it's unclear that there's a use
> case where a hash index would be faster than a btree index over the
> hash function. Comments?

I haven't met a single Heroku user who has stumbled into this landmine. Normally I am very wary of such landmine-laden features, but this one is there and doesn't seem to have detonated at any point. I guess the obscure nature of those indexes and the stern warning in the documentation have sufficed to discourage their use.

Given that experience, I think foreclosing fixing hash indexes is premature, and doesn't seem to be hurting inexperienced users of this stripe. Maybe there are yet other reasons to argue the subject, though... it's not like the current user base relies on the behavior as-is either, having seemingly steered clear just about altogether.
Re: [HACKERS] Something fishy happening on frogmouth
On Fri, Nov 01, 2013 at 12:27:31AM -0400, Robert Haas wrote:
> On Thu, Oct 31, 2013 at 7:48 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>> On 31.10.2013 16:43, Robert Haas wrote:
>>> There should be no cases where the main shared memory segment gets
>>> cleaned up and the dynamic shared memory segments do not.
>> 1. initdb -D data1
>> 2. initdb -D data2
>> 3. postgres -D data1
>> 4. killall -9 postgres
>> 5. postgres -D data2
>>
>> The system V shmem segment orphaned at step 4 will be cleaned up at
>> step 5. The DSM segment will not. Note that
>> dynamic_shared_memory_type='mmap' will give the desired behavior.
> If we want the behavior, we could mimic what the main shared memory
> code does here: instead of choosing a random value for the control
> segment identifier and saving it in a state file, start with something
> like port * 100 + 100 (the main shared memory segment uses port * 100,
> so we'd want something at least slightly different) and search forward
> one value at a time from there until we find an unused ID.

This approach used for the main sysv segment has its own problems. If the first postmaster had to search forward but the second postmaster does not, the second postmaster will not reach the old segment to clean it up.

It might be suitably-cheap insurance to store the DSM control segment handle in PGShmemHeader. Then if, by whatever means good or bad, we find a main sysv segment to clean up, we can always clean up the associated DSM segment(s).

--
Noah Misch
EnterpriseDB
http://www.enterprisedb.com
Re: [HACKERS] Save Hash Indexes
On Fri, Nov 1, 2013 at 8:52 AM, Daniel Farina <dan...@heroku.com> wrote:
> On Fri, Nov 1, 2013 at 6:31 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
>> Also, talking with Peter Geoghegan, it's unclear that there's a use
>> case where a hash index would be faster than a btree index over the
>> hash function. Comments?
> I haven't met a single Heroku user who has stumbled into this
> landmine. [...] it's not like the current user base relies on the
> behavior as-is either, having seemingly steered clear just about
> altogether.

In addendum, though, some users *have* done the hash-function + btree trick to make keys smaller. They tend to resort to full-blown cryptographic hashes and assume/enforce non-collision, but it's somewhat awkward and finicky. So while perhaps commandeering the 'hash' index type is a bit over-aggressive, making that use case work better would probably carry positive impact.

I can say most of my indexes over text are *never* used for range queries, so hashing them down, one would think, would be great. The fiddliness of applying expression indexes over all that forecloses getting the benefits that might be possible there: only certain heavy workloads will receive that level of care.
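A sketch of the pattern being described (table and column names hypothetical): users shrink a wide text key by indexing a cryptographic digest of it, enforcing uniqueness on the digest while assuming collisions won't occur in practice:

    CREATE TABLE docs (body text);

    -- a unique index on the digest is far smaller than one on the full
    -- column; it enforces key uniqueness only under the assumption that
    -- md5 never collides on the data at hand
    CREATE UNIQUE INDEX docs_body_md5_idx ON docs (md5(body));

    -- lookups must repeat the expression to use the index
    SELECT * FROM docs
    WHERE md5(body) = md5('some text') AND body = 'some text';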
Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install
Hi Andrew,

On Mon, Sep 23, 2013 at 6:43 PM, Andrew Dunstan <and...@dunslane.net> wrote:
> I'm working on it. It appears to have a slight problem or two I want
> to fix at the same time, rather than backpatch something broken.

Any progress on this? I notice that the fixes didn't make it into 9.3.1.

Regards,
Marti
Re: [HACKERS] Save Hash Indexes
On 11/01/2013 03:49 PM, Andres Freund wrote:
> On 2013-11-01 09:49:57 -0400, Tom Lane wrote:
>> Lastly: what real-world problem are we solving by kicking that code
>> to the curb?
> It makes hashed lookups much easier to use. Currently, if you want
> indexed access over wide columns and equality is all you need, you
> have to write rather awkward queries to make that happen, something
> like: WHERE hash(column) = hash('value') AND column = 'value'. So some
> magic that hides having to do that would be nice, even if it doesn't
> have O(log(n)) properties.

Isn't there a way to do this using a special operator class?

> The interesting part is that the index gets much denser and smaller,
> and that the individual comparisons are much cheaper.

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
[HACKERS] buffile.c resource owner breakage on segment extension
Hi,

The attached testcase demonstrates that it currently is possible that buffile.c segments get created belonging to the wrong resource owner, leading to WARNINGs à la "temporary file leak: File %d still referenced", ERRORs like "write failed", asserts and segfaults.

The problem is that while BufFileCreateTemp() callers like tuplestore take care to use proper resource owners for it, they don't during BufFileWrite()->BufFileDumpBuffer()->extendBufFile(). The last in that chain creates a new tempfile which then gets owned by CurrentResourceOwner. Which, like in the testcase, might be a subtransaction's one.

While not particularly nice, given the API, it seems best for buffile.c to remember the resource owner used for the original segment and temporarily set that during the extension.

Greetings,

Andres Freund

--
Andres Freund
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index ac8cd1d..feb7011 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -38,6 +38,7 @@
 #include "storage/fd.h"
 #include "storage/buffile.h"
 #include "storage/buf_internals.h"
+#include "utils/resowner.h"

 /*
  * We break BufFiles into gigabyte-sized segments, regardless of RELSEG_SIZE.
@@ -67,6 +68,7 @@ struct BufFile
 	bool		isTemp;			/* can only add files if this is TRUE */
 	bool		isInterXact;	/* keep open over transactions? */
 	bool		dirty;			/* does buffer need to be written? */
+	ResourceOwner resowner;		/* Resowner to use when extending */

 	/*
 	 * current pos is position of start of buffer within the logical file.
@@ -118,11 +120,21 @@ static void
 extendBufFile(BufFile *file)
 {
 	File		pfile;
+	ResourceOwner saved = CurrentResourceOwner;
+
+	/*
+	 * Temporarily set resource owner to the one used for
+	 * BufFileCreateTemp so the new segment has the same lifetime as
+	 * the first.
+	 */
+	CurrentResourceOwner = file->resowner;

 	Assert(file->isTemp);
 	pfile = OpenTemporaryFile(file->isInterXact);
 	Assert(pfile >= 0);

+	CurrentResourceOwner = saved;
+
 	file->files = (File *) repalloc(file->files,
 					(file->numFiles + 1) * sizeof(File));
 	file->offsets = (off_t *) repalloc(file->offsets,
@@ -155,7 +167,7 @@ BufFileCreateTemp(bool interXact)
 	file = makeBufFile(pfile);
 	file->isTemp = true;
 	file->isInterXact = interXact;
-
+	file->resowner = CurrentResourceOwner;
 	return file;
 }

And the testcase:

SET work_mem = '100MB';

DROP TABLE IF EXISTS data CASCADE;
CREATE TABLE data(data text);

CREATE FUNCTION leak_file()
RETURNS SETOF data
LANGUAGE plpgsql AS $$
DECLARE
    i int;
BEGIN
    FOR i IN 1..3 LOOP
        DECLARE
            r record;
        BEGIN
            -- make sure we need to extend a tempfile past one segment in a subtransaction
            FOR i IN 1..30 LOOP
                r := ('('||repeat('quite a bit of stupid text', 100)||')')::data;
                RETURN NEXT r;
            END LOOP;
            RAISE EXCEPTION 'frakkedifrak';
        EXCEPTION WHEN others THEN
            RAISE NOTICE 'got exception %', sqlerrm;
        END;
    END LOOP;
END
$$;

COPY (SELECT * FROM leak_file()) TO '/dev/null';
Re: [HACKERS] missing RelationCloseSmgr in FreeFakeRelcacheEntry?
Hi Heikki, All,

On 2013-10-29 02:16:23 +0100, Andres Freund wrote:
> Looking a bit closer, it seems to me that the fake relcache
> infrastructure neglects the chance that something used the fake entry
> to read something, which will have done a RelationOpenSmgr(). Which in
> turn will have set rel->rd_smgr->smgr_owner to rel. When we then
> pfree() the fake relation in FreeFakeRelcacheEntry, we have a dangling
> pointer.
>
> It confuses me a bit that this doesn't cause issues during recovery
> more frequently... The code seems to have been that way since
> a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c, which introduced fake
> relcache entries. It looks like XLogCloseRelationCache() indirectly
> has done a RelationCloseSmgr().

This looks like it was caused by a213f1ee6c5a1bbe1f074ca201975e76ad2ed50c, which you committed; any chance you could commit the trivial fix?

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 5429d5e..ee7c892 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode)
 void
 FreeFakeRelcacheEntry(Relation fakerel)
 {
+	RelationCloseSmgr(fakerel);
 	pfree(fakerel);
 }

Greetings,

Andres Freund
Re: [HACKERS] Save Hash Indexes
On Fri, Nov 1, 2013 at 9:49 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The bigger picture here is that such an approach amounts to deciding
> that no one will ever be allowed to fix hash indexes. I'm not for
> that, even if I'm not volunteering to be the fixer myself.

Yeah. I have thought about doing some work on this, although I don't know when I'll ever have the time.

As far as I can see, the basic problem is that we need a better way of splitting buckets. Right now, splitting a bucket means locking it, scanning the entire bucket chain and moving ~1/2 the tuples to the new bucket, and then unlocking it. Since this is a long operation, we have to use heavyweight locks to protect the buckets, which is bad for concurrency. Since it involves a modification to an arbitrarily large number of pages that has to be atomic, it's not possible to WAL-log it sensibly -- and in fact, I think this is really the major obstacle to implementing WAL-logging for hash indexes.

I think we could work around this problem as follows. Suppose we have buckets 1..N, and the split will create bucket N+1. We need a flag that can be set and cleared on the metapage, call it split-in-progress, and a flag that can optionally be set on particular index tuples, call it moved-by-split. We will allow scans of all buckets and insertions into all buckets while the split is in progress, but (as now) we will not allow more than one split to be in progress at the same time.

We start the split by updating the metapage to increment the number of buckets and set the split-in-progress flag. While the split-in-progress flag is set, any scan of bucket N+1 first scans that bucket, ignoring any tuples flagged moved-by-split, and then ALSO scans bucket (N+1)/2. The moved-by-split flag never has any effect except when scanning the highest-numbered bucket that existed at the start of that particular scan, and then only if the split-in-progress flag was also set at that time.

Once the split operation has set the split-in-progress flag, it will begin scanning bucket (N+1)/2. Every time it finds a tuple that properly belongs in bucket N+1, it will insert the tuple into bucket N+1 with the moved-by-split flag set. Tuples inserted by anything other than a split operation will leave this flag clear, and tuples inserted while the split is in progress will target the same bucket that they would hit if the split were already complete. Thus, bucket N+1 will end up with a mix of moved-by-split tuples, coming from bucket (N+1)/2, and unflagged tuples coming from parallel insertion activity. When the scan of bucket (N+1)/2 is complete, we know that bucket N+1 now contains all the tuples that are supposed to be there, so we clear the split-in-progress flag. Future scans of both buckets can proceed normally.

Of course, there's a bit of a problem here: we haven't actually removed any tuples from bucket (N+1)/2. That's not a correctness problem; we'll check for an exact hash-value match before returning any tuples from this bucket from the scan, so tuples that are still present in the bucket but no longer belong there will just be ignored anyway. But it's clearly something that needs to be fixed before we can really declare victory. We need to wait for the completion of any scans that began before we finished populating bucket N+1, because otherwise we might remove tuples that they're still expecting to find in bucket (N+1)/2. I'm not sure if there's some clever trick we can use to accomplish this; e.g. if the scan will always maintain a pin, we could take a buffer cleanup lock on all the buffers in buckets N+1 and (N+1)/2 in the same sequence a scan would visit them. Of course, even if that works, it might be a while if somebody's left a cursor lying around. And even if scans can't be counted on to maintain a pin, a point about which I'm unsure off-hand, then things get even trickier. But maybe there's some solution.

Anyway, once we out-wait concurrent scans, we can go back and nuke everything from (N+1)/2 that no longer maps onto that bucket. It might be possible to make this work opportunistically, like HOT cleanup: if we're scanning a buffer, and RecentGlobalXmin has progressed enough that we know there can't be any pre-split scans in progress (or if we can recognize in some other inexpensive way that old scans must be gone), and we see a tuple there that no longer maps onto that bucket, we scan the page and remove all such tuples.

This is full of hand-waving, but I'd be curious to know whether the approach is perceived to have any merit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [BUGS] BUG #8573: int4range memory consumption
g.vanluffe...@qipc.com writes:
> int4range (and any other range function) consumes much memory when
> used in a select statement on a big table.

The problem is that range_out leaks memory, as a consequence of creating a number of intermediate strings that it doesn't bother to free. I don't believe it's the only output function that leaks memory, but it does so with particular vim: now that we've increased the initial size of StringInfo buffers, it's probably wasting upwards of 2K per call.

While we could doubtless hack range_out to release those strings again, it seems to me that that's just sticking a finger in the dike. I'm inclined to think that we really ought to solve this class of problems once and for all by fixing printtup.c to run the output functions in a temporary memory context, which we could reset once per row to recover all memory used. That would also let us get rid of the inadequate kluges that printtup currently uses to try to minimize leakage, namely forced detoasting of varlena fields and forced pfree'ing of the strings returned by output functions. (There is no other context in which we imagine that it's safe to pfree a function's result, and there are a number of datatypes for which it'd make sense to return constant strings, were it not for this kluge.)

It's possible that this would result in some net slowdown in tuple output; but it's also possible that eliminating the retail pfree's in favor of a single context reset per tuple would make for a net savings. In any case, we're already using a reset-per-row approach to memory management of output function calls in COPY OUT, and I know for a fact that we've squeezed that code path as hard as we could.

Thoughts?

			regards, tom lane
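A simple way to exercise the code path in question (illustrative only; the rate of memory growth depends on version and build):

    -- each range_out call allocates intermediate strings that, before the
    -- fix being discussed, accumulate until the query completes
    SELECT int4range(g, g + 10) FROM generate_series(1, 1000000) g;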
Re: [HACKERS] API bug in DetermineTimeZoneOffset()
On Fri, Nov 1, 2013 at 10:50 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Any opinions about whether to back-patch this part or not? It seems
> like a bug fix, but on the other hand, I don't recall any complaints
> from the field about to_char's TZ spec not working with brute-force
> zones. So maybe the prudent thing is to leave well enough alone.

I vote for leaving it alone until somebody complains.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] buffile.c resource owner breakage on segment extension
Andres Freund <and...@2ndquadrant.com> writes:
> While not particularly nice, given the API, it seems best for
> buffile.c to remember the resource owner used for the original segment
> and temporarily set that during the extension.

Hm, yeah, that seems right. It's just like repalloc keeping the memory chunk in its original context. The comments here are a bit inadequate...

			regards, tom lane
Handle LIMIT/OFFSET before select clause (was: [HACKERS] Feature request: optimizer improvement)
On Oct 31, 2013, at 11:04 AM, Joe Love <j...@primoweb.com> wrote:
> In postgres 9.2 I have a function that is relatively expensive. When I
> write a query such as:
>
>     select expensive_function(o.id), o.* from offering o
>     where valid='Y' order by name limit 1;
>
> the query runs slow and appears to be running the function on each ID,
> which in this case should be totally unnecessary as it really only
> needs to run on 1 row. When I rewrite the query like so:
>
>     select expensive_function(o.id), o.*
>     from (select * from offering where valid='Y' order by name limit 1) o;
>
> the expensive function only runs once and is thus much faster. I would
> think that the optimizer could handle this situation, especially when
> limit or offset is used and the expensive function is not used in a
> group by, order by or where.

Does anyone know what the SQL standard says about this, if anything? I can't see any way that this would change the result set, but of course if the function has external effects this would make a difference...
Re: [HACKERS] [BUGS] BUG #8573: int4range memory consumption
On Nov 1, 2013, at 2:08 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The problem is that range_out leaks memory, as a consequence of
> creating a number of intermediate strings that it doesn't bother to
> free. [...] I'm inclined to think that we really ought to solve this
> class of problems once and for all by fixing printtup.c to run the
> output functions in a temporary memory context, ... we're already
> using a reset-per-row approach to memory management of output function
> calls in COPY OUT, and I know for a fact that we've squeezed that code
> path as hard as we could.

+1. COPY is actually the case I was worried about... if you're dealing with large amounts of data in other clients, ISTM that other things will bottleneck before the extra memory context.
Re: [HACKERS] Feature request: Optimizer improvement
On Oct 31, 2013, at 2:57 PM, Kevin Grittner <kgri...@ymail.com> wrote:
> Joe Love <j...@primoweb.com> wrote:
>> In postgres 9.2 I have a function that is relatively expensive.
> What did you specify in the COST clause on the CREATE FUNCTION
> statement?

Should that really matter in this case? ISTM we should always handle LIMIT before moving on to the SELECT clause...?
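For reference, per-call cost is declared on the function itself, for example (the function signature here is assumed from the thread; the default for procedural-language functions is COST 100):

    -- tell the planner each call is ~100x more expensive than the default
    ALTER FUNCTION expensive_function(integer) COST 10000;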
Re: Handle LIMIT/OFFSET before select clause (was: [HACKERS] Feature request: optimizer improvement)
Jim Nasby <j...@nasby.net> writes:
> On Oct 31, 2013, at 11:04 AM, Joe Love <j...@primoweb.com> wrote:
>> In postgres 9.2 I have a function that is relatively expensive. When
>> I write a query such as:
>>     select expensive_function(o.id), o.* from offering o
>>     where valid='Y' order by name limit 1;
> Does anyone know what the SQL standard says about this, if anything?

The computational model is that you evaluate the SELECT list before sorting; this must be so since you can write

    select somefunc(x) as y from tab order by y;

In the general case, therefore, it's impossible to avoid evaluating the function at all rows. I'm not sure what the spec says about whether it's okay to skip evaluation of functions that would be evaluated in a naive implementation of the computational model, so it's possible that what the OP is asking for is directly contrary to spec. But more likely they permit implementations to skip unnecessary calls, if indeed they address this at all.

As far as PG goes, I think the excess calls would only occur if the plan includes an explicit sort step, since the select list would be evaluated before the sort step. If you've got an index on name (or maybe you'd need (valid, name) if there aren't many rows with valid = 'Y'), I'd expect it to pick out the minimal name row with the index, avoiding any sort, and then the function would only be evaluated on the single fetched row. But these are implementation details, not anything you're going to find support for in the spec.

			regards, tom lane
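Concretely, the sort-avoiding plan described above would be enabled by an index like this (hypothetical DDL for the reporter's schema):

    CREATE INDEX offering_valid_name_idx ON offering (valid, name);

    -- the planner can then fetch the first valid = 'Y' row in name order
    -- straight from the index, with no Sort node, so expensive_function()
    -- is evaluated for just that one row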
Re: [HACKERS] Feature request: Optimizer improvement
On Friday, November 1, 2013, Jim Nasby wrote:
> On Oct 31, 2013, at 2:57 PM, Kevin Grittner <kgri...@ymail.com> wrote:
>> Joe Love <j...@primoweb.com> wrote:
>>> In postgres 9.2 I have a function that is relatively expensive.
>> What did you specify in the COST clause on the CREATE FUNCTION
>> statement?
> Should that really matter in this case? ISTM we should always handle
> LIMIT before moving on to the SELECT clause...?

+1. It sounds logical.

--
Regards,
Atri
l'apprenant
Re: [HACKERS] [BUGS] BUG #8573: int4range memory consumption
I wrote:
> It's possible that this would result in some net slowdown in tuple
> output; but it's also possible that eliminating the retail pfree's in
> favor of a single context reset per tuple would make for a net
> savings. In any case, we're already using a reset-per-row approach to
> memory management of output function calls in COPY OUT, and I know for
> a fact that we've squeezed that code path as hard as we could.

It appears that indeed, the reset-per-row approach is marginally faster than the existing code. It's just barely faster with a couple of columns of output; for instance, I get about 660 vs 665 msec for

    select x, x from generate_series(1,1000000) x;

but the advantage grows for more columns, which is expected since we're getting rid of more pfree's. With ten integer columns I see 1650 vs 1710 msec, for example. So I see no downside to fixing it like this, and will work on a complete patch.

			regards, tom lane
[HACKERS] appendPQExpBufferVA vs appendStringInfoVA
Tom committed some changes to appendStringInfoVA a few weeks ago which allow it to return the required buffer size if the current buffer is not big enough. Looking at appendPQExpBufferVA, I'm thinking it would be nice if it could make use of the new pvsnprintf function to bring the same potential performance improvement there too.

My vision of how appendPQExpBufferVA would look after the change is pretty much exactly the same as appendStringInfoVA, which makes me think... why do we even have appendPQExpBufferVA? The only reason that I can think of is that it is used in front-end applications and allocates memory differently... Is this the only reason, or is there some other special reason for this function that I can't think of?

If someone wants to give me some guidance on how, or if, all this should be re-factored, I'll happily supply a patch.

Regards

David Rowley
Re: [HACKERS] appendPQExpBufferVA vs appendStringInfoVA
David Rowley <dgrowle...@gmail.com> writes:
> Tom committed some changes to appendStringInfoVA a few weeks ago which
> allow it to return the required buffer size if the current buffer is
> not big enough. Looking at appendPQExpBufferVA, I'm thinking it would
> be nice if it could make use of the new pvsnprintf function to bring
> the same potential performance improvement there too.

Uh ... it does contain pretty much the same algorithm now. We can't simply use pvsnprintf there because exit-on-error is no good for libpq's purposes, so unless we want to rethink that, a certain amount of code duplication is unavoidable. But they both understand about C99 vsnprintf semantics now.

			regards, tom lane
Re: [HACKERS] appendPQExpBufferVA vs appendStringInfoVA
On Fri, Nov 1, 2013 at 9:27 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> David Rowley <dgrowle...@gmail.com> writes:
>> Tom committed some changes to appendStringInfoVA a few weeks ago
>> which allow it to return the required buffer size if the current
>> buffer is not big enough. [...]
> Uh ... it does contain pretty much the same algorithm now. We can't
> simply use pvsnprintf there because exit-on-error is no good for
> libpq's purposes, so unless we want to rethink that, a certain amount
> of code duplication is unavoidable. But they both understand about C99
> vsnprintf semantics now.

I have often found it frustrating that we have appendStringInfo* for the backend and appendPQExpBuffer* for the frontend. It'd be nice to have one API that could be used in both places, somehow. There seems to be a lot of interest (including on my part) in writing code that can be compiled in either environment.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Feature request: Optimizer improvement
Jim Nasby wrote:
> Should that really matter in this case? ISTM we should always handle
> LIMIT before moving on to the SELECT clause...?

SELECT generate_series(1,10) LIMIT 1;

Here the set-returning function in the SELECT list is what produces the rows in the first place, so the LIMIT cannot be applied until the SELECT clause has at least partially been evaluated.

David J.
Re: [HACKERS] dsm use of uint64
On Mon, 2013-10-28 at 12:17 -0400, Robert Haas wrote:
> On Sun, Oct 27, 2013 at 11:34 PM, Noah Misch <n...@leadboat.com> wrote:
>> On Fri, Oct 25, 2013 at 10:11:41PM -0400, Robert Haas wrote:
>>> When I wrote the dynamic shared memory patch, I used uint64
>>> everywhere to measure sizes - rather than, as we do for the main
>>> shared memory segment, Size. This now seems to me to have been the
>>> wrong decision;

This change is now causing compiler warnings on 32-bit platforms. You can see them here, for example:

http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2013-11-01%2020%3A45%3A01&stg=make