[HACKERS] Small comment fix in sinvaladt.c
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 09f41c1..44d02c5 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -214,7 +214,7 @@ SInvalShmemSize(void)
 }
 
 /*
- * SharedInvalBufferInit
+ * CreateSharedInvalidationState
  *		Create and initialize the SI message buffer
  */
 void

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY.
On Wed, Jul 17, 2013 at 7:11 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Kevin Grittner <kgri...@postgresql.org> writes:
>> Add support for REFRESH MATERIALIZED VIEW CONCURRENTLY.
>
> The buildfarm members that use -DCLOBBER_CACHE_ALWAYS say this patch
> is broken.

Looks like rd_indpred is not correct if the index relation is fresh.
Something like this works for me.

diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index edd34ff..46149ee 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -634,7 +634,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid)
 
 		/* Skip partial indexes. */
 		indexRel = index_open(index->indexrelid, RowExclusiveLock);
-		if (indexRel->rd_indpred != NIL)
+		if (RelationGetIndexPredicate(indexRel) != NIL)
 		{
 			index_close(indexRel, NoLock);
 			ReleaseSysCache(indexTuple);

-- 
Hitoshi Harada
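The fix works because a freshly built relcache entry may not have its derived fields filled in yet; the accessor computes and caches the value on demand, while reading the struct field directly only works if something already populated it. A minimal standalone model of this lazy-cache accessor pattern (all names here are invented for illustration, not PostgreSQL's actual code):

```c
#include <assert.h>
#include <stddef.h>

/* Model of a relcache-style entry whose derived field is filled lazily. */
typedef struct ModelIndexRel
{
    int built;         /* has the "predicate" been computed yet? */
    int pred;          /* cached value (0 plays the role of NIL) */
    int raw_has_pred;  /* what the underlying catalog actually says */
} ModelIndexRel;

/* Accessor: computes and caches the derived value on first use,
 * the way RelationGetIndexPredicate does for rd_indpred. */
static int
get_index_predicate(ModelIndexRel *rel)
{
    if (!rel->built)
    {
        rel->pred = rel->raw_has_pred;  /* expensive catalog lookup here */
        rel->built = 1;
    }
    return rel->pred;
}
```

Reading `fresh.pred` directly on a just-opened entry would wrongly report "no predicate", which is exactly the CLOBBER_CACHE_ALWAYS failure mode: cache clobbering guarantees the entry is always fresh.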
Re: [HACKERS] refresh materialized view concurrently
On Tue, Jul 9, 2013 at 12:50 PM, Kevin Grittner <kgri...@ymail.com> wrote:
> Thanks again!  New patch attached.

After a couple more attempts at breaking it, I mark this as ready to
go.  One small question: why do we use multiple unique indexes if they
exist?  Isn't one index enough?

-- 
Hitoshi Harada
Re: [HACKERS] [PATCH] Add transforms feature
On Sun, Jul 7, 2013 at 12:06 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> On Thu, 2013-07-04 at 02:18 -0700, Hitoshi Harada wrote:
>> As someone suggested in the previous thread, it might be a variant of
>> CAST.  CREATE CAST (hstore AS plpython2u)?  Or CREATE LANGUAGE
>> TRANSFORM might sound better.  In either case, I think we are missing
>> the discussion on the standard overloading.
>
> LANGUAGE isn't a concept limited to the server side in the SQL
> standard.  I could go with something like CREATE SERVER TRANSFORM.

I like it better than the current one.

>> - dependency loading issue
>>
>> Although most of the use cases are via CREATE EXTENSION, it is not
>> great to make users load the dependency manually.  Is it possible to
>> load hstore.so and plpython2u.so from the _PG_init of
>> hstore_plpython2u?  Because the author of the transform should
>> certainly know the name of the shared library in the database
>> installation, writing down the shared library names in the init
>> function sounds reasonable.  Or do we still need to consider cases
>> where plpython2u.so is renamed to something else?
>
> I don't like my solution very much either, but I think I like this one
> even less.  I think the identity of the shared library for the hstore
> type is the business of the hstore extension, and other extensions
> shouldn't mess with it.  The interfaces exposed by the hstore
> extension are the types and functions, and that's what we are allowed
> to use.  If that's not enough, we need to expose more interfaces.

OK, my idea was worse, because the symbol resolution happens before
_PG_init anyway.  But what I wonder is: why can't we load dependency
libraries automatically?  plpython can load libpython automatically, I
guess.  Is it only a matter of the LD_LIBRARY_PATH setup?

>> - function types
>>
>> Although I read the suggestion to use the internal type as the
>> argument of the from_sql function, I guess it exposes another
>> security risk.
>> Since we don't know what SQL type can safely be passed to the
>> from_sql function, we cannot check whether the function is the right
>> one at CREATE TRANSFORM time.  A non-superuser can add a user-defined
>> type he owns, and create a transform whose from_sql function takes
>> internal, which then crashes with this user-defined type.  A possible
>> compromise is to let only superusers create transforms, or to come up
>> with a nice way to allow a func(sql_type) -> internal signature.
>
> Good point.  My original patch allowed func(sql_type) -> internal, but
> I took that out because people had security concerns.  I'd be OK with
> restricting transform creation to superusers in the first cut.

Yeah, I think it's better to limit it to superusers.

>> - create or replace causes inconsistency
>>
>> I tried:
>> * create transform python to hstore (one-way transform)
>> * create function f(h hstore) language python
>> * create or replace transform hstore to python and python to hstore
>>   (both ways)
>> * call f() -- causes an error, since it misses the hstore-to-python
>>   transform.  It is probably looking at the old definition.
>
> What error exactly?  Can you show the full test case?  There might be
> some caching going on.

Here is the full set of what's happening.
test=# create extension hstore;
CREATE EXTENSION
test=# create extension plpython2u;
CREATE EXTENSION
test=# CREATE FUNCTION hstore_to_plpython2(val internal) RETURNS internal
test-# LANGUAGE C STRICT IMMUTABLE
test-# AS '$libdir/hstore_plpython2', 'hstore_to_plpython';
CREATE FUNCTION
test=# CREATE FUNCTION plpython2_to_hstore(val internal) RETURNS hstore
test-# LANGUAGE C STRICT IMMUTABLE
test-# AS 'hstore_plpython2', 'plpython_to_hstore';
CREATE FUNCTION
test=# CREATE TRANSFORM FOR hstore LANGUAGE plpython2u (
test(#   FROM SQL WITH FUNCTION hstore_to_plpython2(internal),
test(#   TO SQL WITH FUNCTION plpython2_to_hstore(internal)
test(# );
CREATE TRANSFORM
test=# create or replace function f(h hstore) returns hstore as $$
test$# h['b'] = 10
test$# return h
test$# $$ language plpython2u;
CREATE FUNCTION
test=# select f('a=>1');
          f
----------------------
 "a"=>"1", "b"=>"10"
(1 row)

test=# CREATE OR REPLACE TRANSFORM FOR hstore LANGUAGE plpython2u (
test(#   TO SQL WITH FUNCTION plpython2_to_hstore(internal)
test(# );
CREATE TRANSFORM
test=# select f('a=>1');
          f
----------------------
 "a"=>"1", "b"=>"10"
(1 row)

test=# \c
You are now connected to database "test" as user "haradh1".
test=# select f('a=>1');
ERROR:  TypeError: 'str' object does not support item assignment
CONTEXT:  Traceback (most recent call last):
  PL/Python function "f", line 2, in <module>
    h['b'] = 10
PL/Python function "f"

After reconnecting to the database with \c, the function behavior
changed because we did CREATE OR REPLACE.  If we go with a dependency
between function and transform, we shouldn't support OR REPLACE, I
guess.  plpython could throw away its cache, but I'd rather not support
OR REPLACE in this case.

-- 
Hitoshi Harada
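The session above shows a stale-cache effect: the function keeps using the transform that was in force when it was first compiled, and only a fresh session (the `\c`) picks up the replacement. A tiny standalone C model of that behavior, with a per-session cache that is looked up once and never invalidated (all names here are invented stand-ins, not PL/Python's real code):

```c
#include <assert.h>
#include <stddef.h>

/* Two "transforms": the original keeps the value, the replacement
 * doubles it.  These stand in for the to-SQL conversion functions. */
static int identity_xform(int v) { return v; }
static int doubling_xform(int v) { return v * 2; }

/* Global "catalog": which transform is currently defined. */
static int (*catalog_xform)(int) = identity_xform;

/* Per-session function cache, built on first call and never
 * invalidated -- modeling a PL's cached I/O conversion setup. */
typedef struct FuncCache
{
    int (*xform)(int);
} FuncCache;

static int
call_cached(FuncCache *fc, int v)
{
    if (fc->xform == NULL)
        fc->xform = catalog_xform;  /* catalog lookup on first call only */
    return fc->xform(v);
}
```

Replacing the catalog entry after the first call changes nothing for the existing cache; only a new cache (a reconnect) sees the replacement, which is the inconsistency the email reports.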
Re: [HACKERS] [PATCH] Add transforms feature
On Thu, Jul 4, 2013 at 2:18 AM, Hitoshi Harada <umi.tan...@gmail.com> wrote:
> For now, that's it.  I'm going to dig more later.

After looking into the rest of the change:

- TYPTYPE_DOMAIN is not supported.  Why did you specifically disallow
  it?
- ParseFuncOrColumn now prohibits finding a function returning
  internal, but is it effective?  I think it's already disallowed, or
  there is no way to call it.
- get_transform_oid and get_transform are redundant.

-- 
Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Sat, Jul 6, 2013 at 9:20 AM, Kevin Grittner <kgri...@ymail.com> wrote:
> Hitoshi Harada <umi.tan...@gmail.com> wrote:
>> Oops!
>
> Indeed.  Thanks for the careful testing.
>
>> drop materialized view if exists mv;
>> drop table if exists foo;
>> create table foo(a, b) as values(1, 10);
>> create materialized view mv as select * from foo;
>> create unique index on mv(a);
>> insert into foo select * from foo;
>> refresh materialized view mv;
>> refresh materialized view concurrently mv;
>>
>> test=# refresh materialized view mv;
>> ERROR:  could not create unique index "mv_a_idx"
>> DETAIL:  Key (a)=(1) is duplicated.
>> test=# refresh materialized view concurrently mv;
>> REFRESH MATERIALIZED VIEW
>
> Fixed by scanning the temp table for duplicates before generating the
> diff:
>
> test=# refresh materialized view concurrently mv;
> ERROR:  new data for "mv" contains duplicate rows without any NULL columns
> DETAIL:  Row: (1,10)

I think the point is not to check for duplicate rows; it is about a
unique key constraint violation.  So if you change the original table
foo to values(1, 10), (1, 20), the issue is still reproduced.  In the
non-concurrent operation it is checked by reindex_index when swapping
the heap, but apparently we are not doing the constraint check in
concurrent mode.

>> [ matview with all columns covered by unique indexes fails ]
>
> Fixed.

>> Other than these, I've found the index is opened with NoLock, relying
>> on the ExclusiveLock on the parent matview, and ALTER INDEX SET
>> TABLESPACE or something similar can run concurrently, but it is
>> presumably safe.  DROP INDEX and REINDEX would be blocked by the
>> ExclusiveLock.
>
> Since others were also worried that an index definition could be
> modified while another process is holding an ExclusiveLock on its
> table, I changed this.

OK, the others are resolved.
One thing I need to apologize for is make_temptable_name_n: I pointed
out that the previous coding was a potential memory overrun, but
actually the relation name is generated by make_new_heap, so unless
that function produces an absurdly long name, there is no way to build
a name big enough to overrun NAMEDATALEN.  So +1 for reverting
make_temptable_name_n, which was invented needlessly.

I've found another issue, regarding matview_maintenance_depth.  If the
SPI calls fail (pg_cancel_backend is quite possible) and error out in
the middle of processing, this value stays above zero, so subsequently
we can issue DML on the matview.  We should make sure the value becomes
zero before jumping out of this function.

-- 
Hitoshi Harada
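The matview_maintenance_depth concern is the classic "counter left incremented on an error exit" bug: an error raised mid-refresh longjmps out of the function, skipping the decrement. A small standalone sketch of the fix shape (restore the saved value on the error path, the way PG_TRY/PG_CATCH or PG_ENSURE_ERROR_CLEANUP would be used), modeling elog(ERROR) with setjmp/longjmp; the names here are invented for illustration:

```c
#include <assert.h>
#include <setjmp.h>

static int maintenance_depth = 0;  /* models matview_maintenance_depth */
static jmp_buf err_ctx;

/* Models an SPI call that may raise an error (e.g. query cancel). */
static void
run_spi_step(int should_fail)
{
    if (should_fail)
        longjmp(err_ctx, 1);       /* models elog(ERROR) */
}

/* Refresh that restores the depth even when a step errors out. */
static int
refresh_with_cleanup(int should_fail)
{
    int saved = maintenance_depth; /* set before setjmp, unchanged after */

    maintenance_depth++;
    if (setjmp(err_ctx) != 0)
    {
        maintenance_depth = saved; /* cleanup on the error path */
        return -1;
    }
    run_spi_step(should_fail);
    maintenance_depth = saved;     /* normal path */
    return 0;
}
```

Without the assignment in the error branch, a cancelled refresh would leave the depth at 1 and direct DML on the matview would keep being allowed, which is exactly the bug reported.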
Re: [HACKERS] Have REFRESH MATERIALIZED VIEW run as the MV owner
On Fri, Jul 5, 2013 at 9:45 AM, Noah Misch <n...@leadboat.com> wrote:
> REFRESH MATERIALIZED VIEW should temporarily switch the current user
> ID to the MV owner.  REINDEX and VACUUM do so to let privileged users
> safely maintain objects owned by others, and REFRESH MATERIALIZED VIEW
> belongs in that class of commands.

I was trying for a while to understand why this is safe.  REINDEX and
VACUUM make sense to me because they never have side effects as far as
I know, but an MV can contain volatile functions, which could perform
some unintended operation that should be invoked by no one but the
owner.  For example, suppose the function creates a permanent table per
call and doesn't clean it up, a later maintenance operation is supposed
to clean it up, and the owner schedules the REFRESH and that
maintenance once a day.  A non-owner user can now refresh it so many
times that the disk gets full.  Or is that operation supposed to be
restricted by the security context you are adding?

-- 
Hitoshi Harada
Re: [HACKERS] [PATCH] Add transforms feature
On Friday, July 5, 2013, Peter Eisentraut wrote:
> On 7/3/13 7:15 PM, Josh Berkus wrote:
>> I'm not comfortable with having all of the transform mappings in the
>> main contrib/ directory though.  Can we add a subdirectory called
>> "transforms" containing all of these?
>
> I don't see any value in that.  The data types they apply to are in
> contrib after all.

I guess his suggestion is a contrib/transforms directory, not a
transforms directory at the top level.  Do you still see no value in
that?

-- 
Hitoshi Harada
Re: [HACKERS] [PATCH] Add transforms feature
On Thu, Jun 13, 2013 at 8:11 PM, Peter Eisentraut <pete...@gmx.net> wrote:
> A transform is an SQL object that supplies the functions for
> converting between data types and procedural languages.  For example,
> a transform could arrange that hstore is converted to an appropriate
> hash or dictionary object in PL/Perl or PL/Python.

The patch applies, and the regression tests including contrib pass on
my Mac.  I know I came late, but I have a few questions.

- vs. the SQL standard

I'm worried about overloading the standard.  As the documentation says,
the SQL standard defines a CREATE TRANSFORM syntax that is exactly the
same as this proposal's but for a different purpose.  The standard
feature is data conversion between client- and server-side data types,
I guess.  I am concerned because if someone wants to implement this SQL
standard feature in the future, there will be no way to do it without
breaking things.  I'd be happier if the subsequent clause were
different.  CREATE TYPE has two different syntaxes, one for composite
types and one for internal user-defined types (I'm not sure either is
defined in the standard, though), and I think we could do something
like that.  Or, as someone suggested in the previous thread, it might
be a variant of CAST: CREATE CAST (hstore AS plpython2u)?  Or CREATE
LANGUAGE TRANSFORM might sound better.  In either case, I think we are
missing the discussion on overloading the standard syntax.

- dependency loading issue

Although most of the use cases are via CREATE EXTENSION, it is not
great to make users load the dependency manually.  Is it possible to
load hstore.so and plpython2u.so from the _PG_init of
hstore_plpython2u?  Because the author of a transform should certainly
know the name of the shared library in the database installation,
writing down the shared library names in the init function sounds
reasonable.  Or do we still need to consider cases where plpython2u.so
is renamed to something else?
- function types

Although I read the suggestion to use the internal type as the argument
of the from_sql function, I guess it exposes another security risk.
Since we don't know what SQL type can safely be passed to the from_sql
function, we cannot check whether the function is the right one at
CREATE TRANSFORM time.  A non-superuser can add a user-defined type he
owns, and create a transform whose from_sql function takes internal,
which then crashes with this user-defined type.  A possible compromise
is to let only superusers create transforms, or to come up with a nice
way to allow a func(sql_type) -> internal signature.

- create or replace causes inconsistency

I tried:
* create transform python to hstore (one-way transform)
* create function f(h hstore) language python
* create or replace transform hstore to python and python to hstore
  (both ways)
* call f() -- causes an error, since it misses the hstore-to-python
  transform.  It is probably looking at the old definition.

- "create function" then "create transform" is not prohibited

I saw your post in the previous discussion:

>> * I don't think recording dependencies on transforms used when
>>   creating functions is a good idea, as the transform might get
>>   created after the function already exists.  That seems to be a
>>   pretty confusing behaviour.
>
> We need the dependencies, because otherwise dropping a transform would
> break or silently alter the behavior of functions that depend on it.
> That sounds like my worst nightmare, thinking of some applications
> that would be affected by that.  But your point is a good one.  I
> think this could be addressed by prohibiting the creation of a
> transform that affects functions that already exist.

However, I don't see this prohibition on CREATE TRANSFORM when such a
function already exists.  Are you not planning to address this issue?

For now, that's it.  I'm going to dig more later.

Thanks,
Hitoshi Harada
Re: [HACKERS] in-catalog Extension Scripts and Control parameters (templates?)
On Thu, Jul 4, 2013 at 12:46 AM, Jaime Casanova <ja...@2ndquadrant.com> wrote:
> On Thu, Jul 4, 2013 at 2:42 AM, Jaime Casanova <ja...@2ndquadrant.com> wrote:
>> create extension test version '123';
>> CREATE EXTENSION
>>
>> postgres=# \df
>>                        List of functions
>>  Schema | Name | Result data type | Argument data types | Type
>> --------+------+------------------+---------------------+------
>> (0 rows)
>
> Actually, what this did was to create a "123" schema and put the
> functions there.  But that schema is inaccessible and undroppable, and
> dropping the extension leaves the schema around.

Hm?  I guess '123' is not a schema; it's the version.

-- 
Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Thu, Jun 27, 2013 at 12:19 AM, Hitoshi Harada <umi.tan...@gmail.com> wrote:
> On Wed, Jun 26, 2013 at 1:38 AM, Kevin Grittner <kgri...@ymail.com> wrote:
>> New version attached.
>
> Will take another look.

Oops!

drop materialized view if exists mv;
drop table if exists foo;
create table foo(a, b) as values(1, 10);
create materialized view mv as select * from foo;
create unique index on mv(a);
insert into foo select * from foo;
refresh materialized view mv;
refresh materialized view concurrently mv;

test=# refresh materialized view mv;
ERROR:  could not create unique index "mv_a_idx"
DETAIL:  Key (a)=(1) is duplicated.
test=# refresh materialized view concurrently mv;
REFRESH MATERIALIZED VIEW

Here's one more.

create table foo(a, b, c) as values(1, 2, 3);
create materialized view mv as select * from foo;
create unique index on mv (a);
create unique index on mv (b);
create unique index on mv (c);
insert into foo values(2, 3, 4);
insert into foo values(3, 4, 5);
refresh materialized view concurrently mv;

test=# refresh materialized view concurrently mv;
ERROR:  syntax error at or near "FROM"
LINE 1: UPDATE public.mv x SET  FROM pg_temp_2.pg_temp_16615_2 d WHE...
                                ^
QUERY:  UPDATE public.mv x SET  FROM pg_temp_2.pg_temp_16615_2 d WHERE d.tid IS NOT NULL AND x.ctid = d.tid

Other than these, I've found the index is opened with NoLock, relying
on the ExclusiveLock on the parent matview, and ALTER INDEX SET
TABLESPACE or something similar can run concurrently, but it is
presumably safe.  DROP INDEX and REINDEX would be blocked by the
ExclusiveLock.

-- 
Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Tue, Jun 25, 2013 at 9:07 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tan...@gmail.com> wrote:
>> If I don't miss something, the requirement for the CONCURRENTLY
>> option is to allow simple SELECT readers to read the matview
>> concurrently while the view is populating the new data, while
>> INSERT/UPDATE/DELETE and SELECT FOR UPDATE/SHARE are still blocked.
>> So I wonder why it is not possible just to acquire an ExclusiveLock
>> on the matview while populating the data and swap the relfile by
>> taking a short AccessExclusiveLock.  This lock escalation is no
>> deadlock hazard, I suppose, because a concurrent operation would
>> block the other at the point the ExclusiveLock is acquired, and
>> ExclusiveLock conflicts with AccessExclusiveLock.  Then you don't
>> need the complicated SPI logic or the unique key index dependency.
>
> This is no good.  One, all lock upgrades are deadlock hazards.  In
> this case, that plays out as follows: suppose that the session running
> REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
> else.  Some other process takes an AccessShareLock on the materialized
> view and then tries to take a conflicting lock on the other object.
> Kaboom, deadlock.  Granted, the chances of that happening in practice
> are small, but it IS the reason why we typically try to avoid having
> long-running operations perform lock upgrades.  Users get really
> annoyed when their DDL runs for an hour and then rolls back.

OK, that's not safe.  What I was thinking was something similar to
compare-and-swap, where the whole operation is atomic under an
AccessExclusiveLock.  What if we release the ExclusiveLock once the new
matview heap is created, and re-acquire an AccessExclusiveLock before
trying the swap?
Note a matview is a little different from an index, which I know people
are discussing in the REINDEX CONCURRENTLY thread, in that the content
of a matview does not change incrementally (at least at this point); it
only changes fully, in the swapping operation of the same REFRESH
MATERIALIZED VIEW command.  The only race condition is that between
releasing the ExclusiveLock and re-acquiring the AccessExclusiveLock,
someone else can go ahead with the same operation and create another
new heap.  If that happens, let's abort, because I guess that's the way
our transaction system works anyway: in unique key index insertion, for
example, if I find another backend inserting the same value into the
index, I wait for it to finish its work; if its transaction commits I
give up, otherwise I go ahead.  Maybe it's annoying if an hour-long
operation finally gets aborted, but my purpose is actually achieved by
the other backend.  If the primary goal of this feature is to let
readers read the matview concurrently, it should be OK?  Hmm, but in
such cases the first backend always wins and the second one, which may
come an hour later, loses, so we cannot get the result from the latest
command...  I still wonder if there should be some way.

> Two, until we get MVCC catalog scans, it's not safe to update any
> system catalog tuple without an AccessExclusiveLock on some locktag
> that will prevent concurrent catalog scans for that tuple.  Under
> SnapshotNow semantics, concurrent readers can fail to see that the
> object is present at all, leading to mysterious failures --
> especially if some of the object's catalog scans are seen and others
> are missed.

So what I'm saying above is to take an AccessExclusiveLock when
swapping the relfile in the catalog.  This doesn't violate your
statement, I suppose.  I'm actually still skeptical about MVCC catalog
scans, because even if you make catalog lookups MVCC, the relfile on
the filesystem is not MVCC.
If session 1 changes the relfilenode in pg_class, commits the
transaction, and deletes the old relfile from the filesystem, a
concurrent session 2 that took its snapshot before session 1 made the
change keeps running, tries to open the relation, grabs the old relfile
and opens it from the filesystem -- ERROR: relfile not found.  So
everyone actually needs to see up-to-date information that synchronizes
with what the filesystem says, and that's SnapshotNow.  In my
experimental compare-and-swap idea above, the compare phase needs to
see the most recent valid information; otherwise it never learns that
someone did something new.  Since I haven't read the whole thread,
maybe this has already been discussed, but it would help if you
clarified this concern.

Thanks,
-- 
Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Wed, Jun 26, 2013 at 1:38 AM, Kevin Grittner <kgri...@ymail.com> wrote:
> Hitoshi Harada <umi.tan...@gmail.com> wrote:
>> I spent a few hours to review the patch.
>
> Thanks!
>
>> As far as I can tell, the overall approach is as follows.
>>
>> - create a new temp heap as non-concurrent does, but with
>>   ExclusiveLock on the matview, so that readers wouldn't be blocked
>
> Non-concurrent creates the heap in the matview's tablespace and
> namespace, so the "temp" part is different in concurrent generation.
> This difference is why concurrent can be faster when few rows change.

It's still not clear to me why you need temp in concurrent and not in
non-concurrent.  If this type of operation always creates a temp table
and just swaps it with the existing one, why can't we just make it temp
always?  And if performance is the only concern, is temp better than
just turning off WAL for the table, or an UNLOGGED table?

> Also, before the next step there is an ANALYZE of the temp table, so
> the planner can make good choices in the next step.
>
>> - with this temp table and the matview, query FULL JOIN and extract
>>   the difference between the original matview and the temp heap (via
>>   SPI)
>
> Right; into another temp table.
>
>> - this operation requires a unique index for performance reasons (or
>>   correctness reasons too)
>
> It is primarily for correctness in the face of duplicate rows which
> have no nulls.  Do you think the reasons need to be better documented
> with comments?

Ah, yes, even after looking at the patch I was confused about whether
it was for performance or correctness.  It's a shame we cannot refresh
concurrently if we have duplicate rows in the matview.  I thought it
would make sense to allow it without a unique key if it were only a
performance tradeoff.

> I also modified the confusing error message to something close to the
> suggestion from Robert.
>
> What to do about the Assert that the matview is not a system relation
> seems like material for a separate patch.
> After review, I'm inclined to remove the test altogether, so that
> extensions can create matviews in pg_catalog.

I like this better.

> New version attached.

Will take another look.

Thanks,
-- 
Hitoshi Harada
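The FULL JOIN diff step discussed above boils down to a set difference over a unique key: rows only in the current matview must be deleted, rows only in the newly computed data must be inserted, and matching rows are left untouched so concurrent readers see minimal churn. A toy standalone model of that diff over sorted unique keys (invented names; the real patch does this in SQL via SPI):

```c
#include <assert.h>

/* Given the current matview keys and the freshly computed keys (both
 * sorted, acting as the unique index), count the rows to delete and
 * insert -- the "diff" the patch derives with a FULL JOIN. */
static void
diff_counts(const int *old_rows, int n_old,
            const int *new_rows, int n_new,
            int *deletes, int *inserts)
{
    int i = 0, j = 0;

    *deletes = *inserts = 0;
    while (i < n_old || j < n_new)
    {
        if (j >= n_new || (i < n_old && old_rows[i] < new_rows[j]))
        {
            (*deletes)++;       /* only in the matview: delete */
            i++;
        }
        else if (i >= n_old || old_rows[i] > new_rows[j])
        {
            (*inserts)++;       /* only in the new data: insert */
            j++;
        }
        else
        {
            i++;                /* present in both: leave untouched */
            j++;
        }
    }
}
```

This also shows why a unique key matters for correctness: without one, a "matching" row is ambiguous when duplicates exist, which is exactly the duplicate-row restriction the thread discusses.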
Re: [HACKERS] in-catalog Extension Scripts and Control parameters (templates?)
On Mon, Jun 24, 2013 at 4:20 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Jaime Casanova <ja...@2ndquadrant.com> writes:
>> just tried to build this one, but it doesn't apply cleanly anymore...
>> specially the ColId_or_Sconst construct in gram.y
>
> Please find attached a new version of the patch, v7, rebased to the
> current master tree and with some more cleanup.  I've been using the
> new grammar entry NonReservedWord_or_Sconst; I'm not sure about that.

I played a bit with this patch.

- If I have a control file that has the same name as a template, does
  CREATE EXTENSION pick up the control file?  Is this by design?
- Though a control file is kind of global information among different
  databases, pg_extension_template is not.  That sounds a little weird
  to me.
- Why do we need the WITH () clause even when I don't need any options?
- I copied and pasted from my plv8.control file into a template script,
  and MODULE_PATHNAME didn't work.  By design?
- foo=# create template for extension ex2 version '1.0'
         with (requires ex1) as $$ $$;
  ERROR:  template option "requires" takes an argument
- create template ex2, create extension ex2, alter template ex2 rename
  to ex3, create extension ex3, drop template ex3:
  foo=# drop template for extension ex3 version '1.0';
  ERROR:  cannot drop unrecognized object 3179 16429 0 because other
  objects depend on it
- A template that is created in another template script does not appear
  to depend on the parent template.
- Without a control file or template, an attempt to create a new
  extension gives:
  foo=# create extension plv8;
  ERROR:  extension "plv8" has no default control template
  Could that be better, like 'extension "plv8" has no default control
  file or template'?
- Do we really need separate tables, pg_extension_template and
  pg_extension_control?
- Looks like both tables don't have toast tables.  Some experimenting
  gives:
  ERROR:  row is too big: size 8472, maximum size 8160

Thanks,
-- 
Hitoshi Harada
Re: [HACKERS] in-catalog Extension Scripts and Control parameters (templates?)
On Thu, Jun 27, 2013 at 2:49 AM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
> Hi,
>
> Thanks a lot for your review!  Some answers here; a new version of the
> patch with fixes by Tuesday.
>
> Hitoshi Harada <umi.tan...@gmail.com> writes:
>> - create template ex2, create extension ex2, alter template ex2
>>   rename to ex3, create extension ex3, drop template ex3:
>>   foo=# drop template for extension ex3 version '1.0';
>>   ERROR:  cannot drop unrecognized object 3179 16429 0 because other
>>   objects depend on it
>
> Well, if I'm following, you're trying to remove a non-existing object.
> I guess you would prefer a better error message, right?

Right.  "unrecognized object x y z" doesn't look good.

>> - a template that is created in another template script does not
>>   appear to depend on the parent template.
>
> I don't think that should be automatically the case, even if I admit I
> didn't think about that case.

Really?  My understanding is that everything created under an extension
depends on the extension, which in turn depends on the template.  Why
is only a template the exception?

-- 
Hitoshi Harada
Re: [HACKERS] extensible external toast tuple support snappy prototype
On Thu, Jun 27, 2013 at 6:08 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 19, 2013 at 3:27 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>> There will be a newer version of the patch coming today or tomorrow,
>> so there's probably no point in looking at the one linked above
>> before that...
>
> This patch is marked as "Ready for Committer" in the CommitFest app.
> But it is not clear to me where the patch is that is being proposed
> for commit.

No, this conversation is about patch #1153, "pluggable toast
compression", which is in "Needs Review"; you may be confusing it with
#1127, "extensible external toast tuple support".

-- 
Hitoshi Harada
Re: backend hangs at immediate shutdown (Re: [HACKERS] Back-branch update releases coming in a couple weeks)
On Thu, Jun 20, 2013 at 3:40 PM, MauMau <maumau...@gmail.com> wrote:
>>> Here, "reliable" means that the database server is certainly shut
>>> down when pg_ctl returns, not telling a lie that "I shut down the
>>> server processes for you, so you do not have to be worried that some
>>> postgres process might still remain and write to disk."  I suppose
>>> reliable shutdown is crucial especially in an HA cluster.  If pg_ctl
>>> stop -mi gets stuck forever when there is an unkillable process (in
>>> what situations does this happen?  OS bug, or NFS hard mount?), I
>>> think the DBA has to notice this situation from the unfinished
>>> pg_ctl, investigate the cause, and take corrective action.
>>
>> So you're suggesting that keeping the postmaster up is a useful sign
>> that the shutdown is not going well?  I'm not really sure about this.
>> What do others think?
>
> I think you are right, and there is no harm in leaving postgres
> processes in an unkillable state.  I'd like to leave the decision to
> you and/or others.

+1 for leaving the processes rather than waiting for long (or possibly
forever; remember, not everyone runs postgres on such cluster ware).
I'm sure some users rely on the current behavior of immediate shutdown.
If someone wants to ensure processes are finished when pg_ctl returns,
isn't that fast shutdown, not immediate shutdown?  To me the current
immediate shutdown is reliable, in that it without doubt returns
control back to the terminal after killing the postmaster, at least.

Thanks,
-- 
Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgri...@ymail.com> wrote:
> Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for 9.4
> CF1.  The goal of this patch is to allow a refresh without interfering
> with concurrent reads, using transactional semantics.

I spent a few hours reviewing the patch.  As far as I can tell, the
overall approach is as follows.

- create a new temp heap as non-concurrent does, but with ExclusiveLock
  on the matview, so that readers wouldn't be blocked
- with this temp table and the matview, query FULL JOIN and extract the
  difference between the original matview and the temp heap (via SPI)
- this operation requires a unique index for performance reasons (or
  correctness reasons too)
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI to do the merge
- close these temp heaps

If I don't miss something, the requirement for the CONCURRENTLY option
is to allow simple SELECT readers to read the matview concurrently
while the view is populating the new data, while INSERT/UPDATE/DELETE
and SELECT FOR UPDATE/SHARE are still blocked.  So I wonder why it is
not possible just to acquire an ExclusiveLock on the matview while
populating the data and swap the relfile by taking a short
AccessExclusiveLock.  This lock escalation is no deadlock hazard, I
suppose, because a concurrent operation would block the other at the
point the ExclusiveLock is acquired, and ExclusiveLock conflicts with
AccessExclusiveLock.  Then you don't need the complicated SPI logic or
the unique key index dependency.

Assuming I'm asking something wrong and we go with the current
approach, here's what I've found in the patch.

- I'm not sure if the unique key index requirement is reasonable,
  because users only add the CONCURRENTLY option and are not asking for
  merging or incremental update.
- This could be an overflow of the diffname buffer:
+ quoteRelationName(tempname, tempRel); + strcpy(diffname, tempname); + strcpy(diffname + strlen(diffname) - 1, "_2\""); - As others pointed out, quoteOneName can be replaced with quote_identifier - This looks harmless, but I wonder if you need to change relkind. *** 665,672 make_new_heap(Oid OIDOldHeap, Oid NewTableSpace) OldHeap->rd_rel->relowner, OldHeapDesc, NIL, ! OldHeap->rd_rel->relkind, ! OldHeap->rd_rel->relpersistence, false, RelationIsMapped(OldHeap), true, --- 680,687 OldHeap->rd_rel->relowner, OldHeapDesc, NIL, ! RELKIND_RELATION, ! relpersistence, false, RelationIsMapped(OldHeap), true, Since OldHeap->rd_rel->relkind has been working with 'm', too, not only 'r'? - I found the two additional parameters on make_new_heap ugly, but couldn't come up with a better solution. Maybe we can pass the Relation of the old heap to the function instead of the Oid.. That's about it from me. Thanks, -- Hitoshi Harada
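The diffname overflow flagged above can be sketched in isolation. This is a hypothetical, standalone helper (`make_diff_name` is not the matview.c code): it mimics the quoted-name surgery the patch performs, but bounds the write with snprintf() and reports when the destination buffer is too small instead of writing past it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration of the diffname concern above: building the
 * diff relation name with strcpy() ignores the destination size.  A
 * bounded snprintf() truncates safely, and its return value tells us
 * when the buffer would have overflowed.
 */
static int
make_diff_name(char *dst, size_t dstlen, const char *quoted_temp_name)
{
    size_t len = strlen(quoted_temp_name);

    /* drop the closing quote, then append _2" -- as the patch intends */
    int n = snprintf(dst, dstlen, "%.*s_2\"",
                     (int) (len > 0 ? len - 1 : 0), quoted_temp_name);

    return (n >= 0 && (size_t) n < dstlen) ? 0 : -1;    /* -1: too small */
}
```

In the real code the input length is bounded by identifier quoting rules, so an alternative is simply sizing diffname to hold the worst case; the point is that the bound has to be established somewhere.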
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On Thu, Jun 20, 2013 at 7:24 PM, Craig Ringer cr...@2ndquadrant.com wrote: I've missed this feature more than once, and am curious about whether any more recent changes may have made it cleaner to tackle this, or whether consensus can be formed on adding the new entries to btree's opclass to avoid the undesirable explicit lookups of the '+' and '-' operators. As far as I know, later development didn't add anything to help this conversation. I initially thought range types or knn gist would add something, but they were something else, far from this. On the other hand, if this makes it, it'll also open doors to range PARTITION BY for the CREATE TABLE command, so the impact will be bigger than you may think. I also later found that we are missing not only the notion of '+' or '-', but also the notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to raise an ERROR if the offset value is negative, but that is not always easy if you think about interval and numeric types as opposed to the int64 used in ROWS BETWEEN. Thanks, -- Hitoshi Harada
Re: [HACKERS] refresh materialized view concurrently
On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada umi.tan...@gmail.com wrote: On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner kgri...@ymail.com wrote: Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for 9.4 CF1. The goal of this patch is to allow a refresh without interfering with concurrent reads, using transactional semantics. I spent a few hours to review the patch. Oh, BTW, though it is not part of this patch, I came across this. ! /* ! * We're not using materialized views in the system catalogs. ! */ Assert(!IsSystemRelation(matviewRel)); Of course we don't have builtin matviews on the system catalogs, but it is possible to create one with allow_system_table_mods=true, so the Assert doesn't look good to me. Thanks, -- Hitoshi Harada
Re: [HACKERS] Patch for removng unused targets
On Thu, Jun 20, 2013 at 12:19 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: From: Hitoshi Harada [mailto:umi.tan...@gmail.com] I guess the patch works fine, but what I'm saying is it might be limited to small use cases. Another instance of this that I can think of is the ORDER BY clause of window specifications, which you may want to remove from the target list as well, in addition to the ORDER BY of the query. It will just not be removed by this approach, simply because it is looking at only parse->sortClause. Certainly you can add more rules to the new function to look at the window specification, but then I'm not sure what we are missing. Yeah, I thought about the extension to the window ORDER BY case, too. But I'm not sure it's worth complicating the code, considering that the objective of this optimization is to improve full-text search related things if I understand correctly, though general solutions would be desirable as you mentioned. Ah, I see the use case now. Makes sense. So, as it stands it doesn't have a critical issue, but a more generalized approach would be desirable. That said, I don't have a strong objection to the current patch, and am just posting one thought to see if others may have the same opinion. OK. I'll also wait for others' comments. For review, an updated version of the patch is attached, which fixed the bug using the approach that directly uses the clause information in the parse tree. I tried several ways but I couldn't find big problems. Small typo: s/rejunk/resjunk/ I defer to committer. Thanks, -- Hitoshi Harada
Re: [HACKERS] Support for RANGE ... PRECEDING windows in OVER
On Fri, Jun 21, 2013 at 3:20 AM, Craig Ringer cr...@2ndquadrant.com wrote: On 06/21/2013 05:32 PM, Hitoshi Harada wrote: I also later found that we are missing not only notion of '+' or '-', but also notion of 'zero value' in our catalog. Per spec, RANGE BETWEEN needs to detect ERROR if the offset value is negative, but it is not always easy if you think about interval, numeric types as opposed to int64 used in ROWS BETWEEN. Zero can be tested for with `val = (@ val)` ie `val = abs(val)`. That should make sense for any type in which the concept of zero makes sense. Yeah, I mean, it needs to know if offset is negative or not by testing with zero. So we need zero value or is_negative function for each type. Thanks, -- Hitoshi Harada
Re: [HACKERS] extensible external toast tuple support snappy prototype
On Wed, Jun 5, 2013 at 8:01 AM, Andres Freund and...@2ndquadrant.com wrote: Two patches attached: 1) add snappy to src/common. The integration needs some more work. 2) Combined patch that adds indirect tuples and snappy compression. Those could be separated, but this is an experiment so far... I took a look at them a little. This proposal is a superset of patch #1127. https://commitfest.postgresql.org/action/patch_view?id=1127 - endian.h is not found on my Mac. Commented it out, it builds clean. - I don't see what the added is_inline flag means in toast_compress_datum(). Obviously not used, but I wonder what the intention was. - By this, * compression method. We could just use the two bytes to store 3 other * compression methods but maybe we better don't paint ourselves in a * corner again. you mean two bits, not two bytes? And the patch adds snappy-c in src/common. I definitely like the idea of having pluggability of compression algorithms for datums, but I am not sure if this location is a good place to add it. Maybe we want a modern algorithm other than pglz for different components across the system in core, and it's better to let users choose which to add. The mapping between the index number and the compression algorithm should be consistent for the entire life of the database, so it should be defined at initdb time. From a core maintainability perspective and the binary size of postgres, I don't think we want to put dozens of different algorithms into core in the future. And maybe someone will want to try a BSD-incompatible algorithm privately. I guess it's ok to use one more byte to indicate which compression is used for the value. It is a compressed datum and we don't expect something short anyway. I don't see big problems in this patch other than how to manage the pluggability, but it is a WIP patch anyway, so I'm going to mark it as Returned with Feedback. Thanks, -- Hitoshi Harada
Re: [HACKERS] Patch for removng unused targets
On Wed, Jun 19, 2013 at 4:49 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Hi Harada-san, Thank you for the review. I think that the parse tree has enough information to do this optimization and that the easiest way to do it is to use that information, though I might not have understood your comments correctly. So, I would like to fix the bug by simply modifying the removability check in adjust_targetlist() so that the resjunk column is not used in GROUP BY, DISTINCT ON and window PARTITION/ORDER BY, besides ORDER BY. No? I am open to any comments. I guess the patch works fine, but what I'm saying is it might be limited to small use cases. Another instance of this that I can think of is the ORDER BY clause of window specifications, which you may want to remove from the target list as well, in addition to the ORDER BY of the query. It will just not be removed by this approach, simply because it is looking at only parse->sortClause. Certainly you can add more rules to the new function to look at the window specification, but then I'm not sure what we are missing. So, as it stands it doesn't have a critical issue, but a more generalized approach would be desirable. That said, I don't have a strong objection to the current patch, and am just posting one thought to see if others may have the same opinion. Thanks, -- Hitoshi Harada
Re: [HACKERS] extensible external toast tuple support
On Fri, Jun 14, 2013 at 4:06 PM, Andres Freund and...@2ndquadrant.com wrote: Here's the updated version. It shouldn't contain any obvious WIP pieces anymore, although I think it needs some more documentation. I am just not sure where to add it yet, postgres.h seems like a bad place :/ I have a few comments and questions after reviewing this patch. - heap_tuple_fetch_attr doesn't need to be updated to reflect the ONDISK macro? - I'm not sure if the plural for datum is good to use. Datum values? - -1 from me to using an enum for tag types, as I don't think it needs to be one. This looks more like other kinds of macros such as relkind, but I know there are pros/cons - don't we need a cast for the tag value comparison in the VARTAG_SIZE macro, since tag is uint8 and the enum is a signed int? - Is there a better way to represent the ONDISK size, instead of the magic number 18? I'd suggest constructing it with sizeof(varatt_external). Other than that, the patch applies fine and make check runs, though I don't think the new code path is exercised well, as no one is creating INDIRECT datums yet. Also, I wonder how we could add more compression in datum, as well as how we are going to add more compression options in the database. I'd love to see pluggability here, as surely the core cannot support dozens of different compression algorithms, but the data on disk is critical and we cannot do anything like user-defined functions. The algorithms should be optional builtins so that once the system is initialized the plugin should not go away. Anyway pluggable compression is off-topic here. Thanks, -- Hitoshi Harada
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Sat, Jun 15, 2013 at 1:30 PM, Jeff Davis pg...@j-davis.com wrote: On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote: I've redone the leadlag function changes to use datumCopy as you suggested. However, I've had to remove the NOT_USED #ifdef around datumFree so I can use it to avoid building up large numbers of copies of Datums in the memory context while a query is executing. I've attached the revised patch... Comments: WinGetResultDatumCopy() calls datumCopy, which will just copy in the current memory context. I think you want it in the per-partition memory context, otherwise the last value in each partition will stick around until the query is done (so many partitions could be a problem). That should be easy enough to do by switching to the winobj-winstate-partcontext memory context before calling datumCopy, and then switching back. For that matter, why store the datum again at all? You can just store the offset of the last non-NULL value in that partition, and then fetch it using WinGetFuncArgInPartition(), right? We really want to avoid any per-tuple allocations. I believe WinGetFuncArgInPartition is a bit slow if the offset is far from the current row. So it might make sense to store the last-seen value, but I'm not sure if we need to copy datum every time. I haven't looked into the new patch. Thanks, -- Hitoshi Harada
Re: [HACKERS] extensible external toast tuple support
On Tue, Jun 18, 2013 at 1:58 AM, Andres Freund and...@2ndquadrant.comwrote: On 2013-06-18 00:56:17 -0700, Hitoshi Harada wrote: On Fri, Jun 14, 2013 at 4:06 PM, Andres Freund and...@2ndquadrant.com wrote: Here's the updated version. It shouldn't contain any obvious WIP pieces anymore, although I think it needs some more documentation. I am just not sure where to add it yet, postgres.h seems like a bad place :/ I have a few comments and questions after reviewing this patch. Cool! - heap_tuple_fetch_attr doesn't need to be updated to reflect ONDISK macro? It calls toast_fetch_datum() for any kind of external datum, so it should be fine since ONDISK is handled in there. toast_fetch_datum doesn't expect the input is INDIRECT. At least I see the code path in the same file around toast_insert_or_update() where we have a chance to (possibly accidentally) try to fetch ONDISK toasted value from non-ONDISK datum. - -1 from me to use enum for tag types, as I don't think it needs to be. This looks more like other kind macros such as relkind, but I know there's pros/cons Well, relkind cannot easily be a enum because it needs to be stored in a char field. I like using enums because it gives you the chance of using switch()es at some point and getting warned about missed cases. Why do you dislike it? I put -1 just because it doesn't have to be now. If you argue relkind is char field, tag is also uint8. - don't we need cast for tag value comparison in VARTAG_SIZE macro, since tag is unit8 and enum is signed int? I think it should be fine (and the compiler doesn't warn) due to the integer promotion rules. - Is there better way to represent ONDISK size, instead of magic number 18? I'd suggest constructing it with sizeof(varatt_external). I explicitly did not want to do that, since the numbers really don't have anything to do with the struct size anymore. Otherwise the next person reading that part will be confused because there's a 8byte struct with the enum value 1. 
Or somebody will try adding another type of external tuple that also needs 18 bytes by copying the sizeof(). Which will fail in ugly, non-obvious ways. Sounds reasonable. I was just confused when I looked at it first. Other than that, the patch applies fine and make check runs, though I don't think the new code path is exercised well, as no one is creating INDIRECT datums yet. Yea, I only use the API in the changeset extraction patch. That actually worries me too. But I am not sure where to introduce usage of it in core without making the patchset significantly larger. I was thinking of adding an option to heap_form_tuple/heap_fill_tuple to allow it to reference _4B Datums instead of copying them, but that would require quite some adjustments on the callsites. Maybe you can create a user-defined function that creates such a datum for testing, just in order to demonstrate it works fine. Also, I wonder how we could add more compression in datum, as well as how we are going to add more compression options in the database. I'd love to see pluggability here, as surely the core cannot support dozens of different compression algorithms, but the data on disk is critical and we cannot do anything like user-defined functions. The algorithms should be optional builtins so that once the system is initialized the plugin should not go away. Anyway pluggable compression is off-topic here. Separate patchset by now, yep ;). I just found it. Will look into it. Thanks,
Re: [HACKERS] Patch for removng unused targets
On Tue, Jun 18, 2013 at 5:15 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Hi Alexander, I wrote: From: Tom Lane [mailto:t...@sss.pgh.pa.us] resjunk means that the target is not supposed to be output by the query. Since it's there at all, it's presumably referenced by ORDER BY or GROUP BY or DISTINCT ON, but the meaning of the flag doesn't depend on that. What you would need to do is verify that the target is resjunk and not used in any clause besides ORDER BY. I have not read your patch, but I rather imagine that what you've got now is that the parser checks this and sets the new flag for consumption far downstream. Why not just make the same check in the planner? I've created a patch using this approach. I've rebased the above patch against the latest head. Could you review the patch? If you have no objection, I'd like to mark the patch ready for committer. Sorry, I've had a cleanup of the patch. Please find attached the patch. Don't forget about window functions! test=# EXPLAIN (ANALYZE, VERBOSE) SELECT *, count(*) over (partition by slow_func(x,y)) FROM test ORDER BY slow_func(x,y) LIMIT 10; QUERY PLAN --- Limit (cost=0.28..3.52 rows=10 width=16) (actual time=20.860..113.764 rows=10 loops=1) Output: x, y, (count(*) OVER (?)) -> WindowAgg (cost=0.28..324.27 rows=1000 width=16) (actual time=20.858..113.747 rows=10 loops=1) Output: x, y, count(*) OVER (?) -> Index Scan using test_idx on public.test (cost=0.28..59.27 rows=1000 width=16) (actual time=10.563..113.530 rows=11 loops=1) Output: slow_func(x, y), x, y Total runtime: 117.889 ms (7 rows) And I don't think it's a good idea to rely on the parse tree to see if we can remove those unused columns from the target list, because there should be a lot of optimization that has been done through grouping_planner, and the parse tree is not necessarily representing the corresponding elements at this point.
I think it'd be better to see path keys to find out the list of elements that may be removed, rather than SortClause, which would be a more generalized approach. Thanks, -- Hitoshi Harada
Re: [HACKERS] Parallel Sort
On Wed, May 15, 2013 at 11:11 AM, Noah Misch n...@leadboat.com wrote: On Wed, May 15, 2013 at 08:12:34AM +0900, Michael Paquier wrote: The concept of clause parallelism for backend worker is close to the concept of clause shippability introduced in Postgres-XC. In the case of XC, the equivalent of the master backend is a backend located on a node called Coordinator that merges and organizes results fetched in parallel from remote nodes where data scans occur (on nodes called Datanodes). The backends used for tuple scans across Datanodes share the same data visibility as they use the same snapshot and transaction ID as the backend on Coordinator. This is different from the parallelism as there is no idea of snapshot import to worker backends. Worker backends would indeed share snapshot and XID. However, the code in XC planner used for clause shippability evaluation is definitely worth looking at just considering the many similarities it shares with parallelism when evaluating if a given clause can be executed on a worker backend or not. It would be a waste to implement twice the same thing is there is code already available. Agreed. Local parallel query is very similar to distributed query; the specific IPC cost multipliers differ, but that's about it. I hope we can benefit from XC's experience in this area. I believe the parallel execution is much easier to be done if the data is partitioned. Of course it is possible to make only the sort operation parallel but then the question would be how to split and pass each tuple to workers. XC and Greenplum use notion of hash distributed table that enables the parallel sort (XC doesn't perform parallel sort on replicated table, I guess). For postgres, I don't think hash distributed table is foreseeable option, but MergeAppend over inheritance is a good choice to run in parallel. 
You won't even need to modify many lines of sort execution code if you correctly dispatch the work, as it's just to split and assign the subnode of query plan to workers. Transactions and locks will be tricky though, and we might end up introducing small set of snapshot sharing infra for the former and notion of session id rather than process id for the latter. I don't think SnapshotNow is the problem as anyway executor is reading catalogs with that today. Thanks, Hitoshi -- Hitoshi Harada
Re: [HACKERS] Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Sat, Mar 23, 2013 at 3:25 PM, Nicholas White n.j.wh...@gmail.com wrote: Thanks - I've added it here: https://commitfest.postgresql.org/action/patch_view?id=1096 . I've also attached a revised version that makes IGNORE and RESPECT UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST). Hm, you made another lookahead in base_yylex to make them unreserved -- looks ok, but not sure if there was no way to do it. You might want to try byref types such as text. It seems you need to copy the datum to save the value in appropriate memory context. Also, try to create a view on those expressions. I don't think it correctly preserves it. Thanks, -- Hitoshi Harada
[HACKERS] Validation in to_date()
to_date() doesn't check the date range, which results in unreadable data like this. foo=# create table t as select to_date('-12-10 BC', '-MM-DD BC')::timestamp; SELECT 1 foo=# table t; ERROR: timestamp out of range Attached is a patch (to_date_check.patch) to add IS_VALID_JULIAN() as we do in date_in(). I think to_date() should follow it because it is the entrance place to check sanity. Thanks, -- Hitoshi Harada
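The check being proposed bounds the date to the range PostgreSQL's Julian-day arithmetic can represent. The sketch below reproduces the shape of IS_VALID_JULIAN as a standalone function; the constants follow my reading of datetime.h (4714-11-24 BC through just before year 5874898), but treat them as illustrative rather than authoritative.

```c
#include <stdbool.h>

/*
 * Sketch of the range check IS_VALID_JULIAN performs and which the
 * patch adds to to_date(): reject dates before the Julian epoch
 * (Nov 24, 4714 BC) and on or after the maximum supported year.
 * Constants are quoted from memory of datetime.h -- illustrative only.
 */
#define JULIAN_MINYEAR  (-4713)         /* == 4714 BC */
#define JULIAN_MINMONTH 11
#define JULIAN_MINDAY   24
#define JULIAN_MAXYEAR  5874898

static bool
julian_in_range(int y, int m, int d)
{
    bool after_min = y > JULIAN_MINYEAR ||
        (y == JULIAN_MINYEAR &&
         (m > JULIAN_MINMONTH ||
          (m == JULIAN_MINMONTH && d >= JULIAN_MINDAY)));

    return after_min && y < JULIAN_MAXYEAR;
}
```

Running the same check in to_date() means out-of-range input fails at input time, rather than producing a stored value that later errors on read.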
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On Fri, Dec 28, 2012 at 12:24 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2012/12/28 Peter Eisentraut pete...@gmx.net: On Mon, 2012-12-17 at 16:34 -0500, Peter Eisentraut wrote: Yes, this would be a good solution for some applications, but the only way I can think of to manage the compatibility issue is to invent some function attribute system like CREATE FUNCTION ... OPTIONS (call_convention 'xyz') An alternative that has some amount of precedent in the Python world would be to use comment pragmas, like this: CREATE FUNCTION foo(a,b,c) AS $$ # plpython: module import x from __future__ import nex_cool_feature def helper_function(x): ... def __pg_main__(a,b,c): defined function body here $$; The source code parser would look for this string on, say, the first two lines, and then decide which way to process the source text. This way we could get this done fairly easily without any new infrastructure outside the language handler. this concept looks stronger and cleaner +1 I think the same idea is used in PL/v8 What part of PL/v8 do you think is the same? It parses the body as other PLs do, nothing special. plv8 also wanted this notion of function settings before introducing find_function(), which natively loads other JS functions that are defined via CREATE FUNCTION. Of course it is not optimal, but it turned out it is not terribly slow. Maybe it depends on the language runtime. So for now, PL/v8 is managing to do it and is happy with the existing mechanism. It could be improved somehow, but I don't want it to be more complicated than it is now in order to improve small stuff. Thanks, -- Hitoshi Harada
Re: [HACKERS] hash_search and out of memory
On Fri, Oct 19, 2012 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: Hitoshi Harada umi.tan...@gmail.com writes: On Thu, Oct 18, 2012 at 8:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm not terribly comfortable with trying to use a PG_TRY block to catch an OOM error - there are too many ways that could break, and this code path is by definition not very testable. I think moving up the expand_table action is probably the best bet. Will you submit a patch? Here it is. I factored out the bucket finding code to re-calculate it after expansion. I didn't like that too much. I think a better solution is just to do the table expansion at the very start of the function, along the lines of the attached patch. This requires an extra test of the action parameter, but I think that's probably cheaper than an extra function call. It's definitely cheaper than recalculating the hash etc when a split does occur. OK. Looks better. But isn't nentries a little bogus now? + /* + * Can't split if running in partitioned mode, nor if frozen, nor if + * table is the subject of any active hash_seq_search scans. Strange + * order of these tests is to try to check cheaper conditions first. + */ + if (!IS_PARTITIONED(hctl) && !hashp->frozen && + hctl->nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor && + !has_seq_scans(hashp)) + (void) expand_table(hashp); hctl->nentries + 1 sounds appropriate? Thanks, -- Hitoshi Harada
Re: [HACKERS] hash_search and out of memory
On Thu, Oct 18, 2012 at 8:35 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Hitoshi Harada umi.tan...@gmail.com writes: If OOM happens during expand_table() in hash_search_with_hash_value() for RelationCacheInsert, the palloc-based allocator does throw errors. I think that when that was designed, we were thinking that palloc-based hash tables would be thrown away anyway after an error, but of course that's not true for long-lived tables such as the relcache hash table. I'm not terribly comfortable with trying to use a PG_TRY block to catch an OOM error - there are too many ways that could break, and this code path is by definition not very testable. I think moving up the expand_table action is probably the best bet. Will you submit a patch? Here it is. I factored out the bucket finding code to re-calculate it after expansion. Thanks, -- Hitoshi Harada (attachment: hashoom.patch)
[HACKERS] hash_search and out of memory
If OOM happens during expand_table() in hash_search_with_hash_value() for RelationCacheInsert, the hash table entry is allocated and stored in the hash table, but idhentry->reldesc remains NULL. Since OOM causes AbortTransaction(), in AtEOXact_RelationCache() this NULL pointer is referenced and we hit SIGSEGV. The fix would be either to catch the OOM error with PG_TRY() and undo the hash entry, or to do the expansion first and put the entry in later. The latter is a bit ugly because we have to re-calculate the hash bucket after we decide to expand, so the former looks better. Can you think of other solutions? Thanks, -- Hitoshi Harada
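The expansion-first ordering described here (and adopted in the follow-up patch) can be illustrated with a toy chained hash table. This is a simplified sketch, not PostgreSQL's dynahash: the point is that if the only fallible allocation besides the entry itself -- growing the bucket array -- happens before the new entry is linked in, a failure leaves the table exactly as it was, with no half-initialized entry for cleanup code to trip over.

```c
#include <stdlib.h>

/* Toy chained hash table (not dynahash) demonstrating grow-before-insert. */
typedef struct Entry { int key; void *payload; struct Entry *next; } Entry;
typedef struct Table { Entry **buckets; int nbuckets; int nentries; } Table;

static int
maybe_expand(Table *t)
{
    if (t->nentries + 1 <= t->nbuckets)     /* load factor 1: no growth */
        return 0;
    Entry **nb = calloc(t->nbuckets * 2, sizeof(Entry *));
    if (nb == NULL)
        return -1;                          /* fail before touching the table */
    for (int i = 0; i < t->nbuckets; i++)   /* rehash into the new array */
        for (Entry *e = t->buckets[i], *nx; e != NULL; e = nx)
        {
            nx = e->next;
            int b = e->key % (t->nbuckets * 2);
            e->next = nb[b];
            nb[b] = e;
        }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets *= 2;
    return 0;
}

static Entry *
table_insert(Table *t, int key)
{
    /* Grow first: an OOM here cannot orphan a partially built entry. */
    if (maybe_expand(t) < 0)
        return NULL;
    Entry *e = malloc(sizeof(Entry));
    if (e == NULL)
        return NULL;
    e->key = key;
    e->payload = NULL;      /* caller fills this in afterwards */
    int b = key % t->nbuckets;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    t->nentries++;
    return e;
}
```

Note the `nentries + 1` in the growth test: since the check now runs before the insert, the "would-be" entry count is the one to compare against the fill factor -- the same off-by-one raised later in this thread.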
[HACKERS] date_in and buffer overrun
It seems date_in() has a risk of buffer overrun. If the input is '.', it sets field[0] to the beginning of workbuf and goes into DecodeDate(). This function checks null-termination of the head of the string, but it can go beyond the end of the string inside the first loop and replace some bytes with zero. The worst scenario we've seen is overwrite of the stack frame, in which the compiler rearranged the memory allocation of local variables in date_in() and workbuf is at a lower address than field. I tried to attach a patch file but failed somehow, so I paste the fix here. Thanks, diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c index d827d7d..b81960a 100644 --- a/src/backend/utils/adt/datetime.c +++ b/src/backend/utils/adt/datetime.c @@ -2176,9 +2176,13 @@ DecodeDate(char *str, int fmask, int *tmask, bool *is2digits, while (*str != '\0' && nf < MAXDATEFIELDS) { /* skip field separators */ - while (!isalnum((unsigned char) *str)) + while (*str != '\0' && !isalnum((unsigned char) *str)) str++; + /* or it may not be what we expected... */ + if (*str == '\0') + return DTERR_BAD_FORMAT; + field[nf] = str; if (isdigit((unsigned char) *str)) { -- Hitoshi Harada
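The overrun pattern this patch fixes can be reproduced outside the backend. The standalone sketch below (a hypothetical helper, not DecodeDate() itself) shows why the separator-skipping loop must also test for the terminating NUL: on input consisting only of separators, such as ".", the unguarded loop would walk past the end of the string.

```c
#include <ctype.h>

/*
 * Minimal reproduction of the pattern fixed above.  Skipping separators
 * with while (!isalnum(*str)) walks past the terminating NUL when the
 * input is all separators (e.g. ".") -- reading, and in DecodeDate()
 * writing, beyond the end of the buffer.  The guarded loop stops at
 * end-of-string and reports a format error instead, as the patch does.
 */
static int
next_field(const char **strp)
{
    const char *str = *strp;

    while (*str != '\0' && !isalnum((unsigned char) *str))
        str++;              /* skip separators, but never pass the NUL */

    if (*str == '\0')
        return -1;          /* no field found: bad format */

    *strp = str;            /* leave caller positioned at the field */
    return 0;
}
```

The same guard appears twice in the real fix: once in the outer field loop and once in the separator skip, because either loop can be the one that reaches the end of the buffer first.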
Re: [HACKERS] date_in and buffer overrun
On Mon, Oct 1, 2012 at 3:30 PM, Hitoshi Harada umi.tan...@gmail.com wrote: lower address than field. Ugh, s/lower/higher/ -- Hitoshi Harada
Re: [HACKERS] Oid registry
On Tue, Sep 25, 2012 at 1:06 AM, Simon Riggs si...@2ndquadrant.com wrote: On 24 September 2012 21:26, Andrew Dunstan and...@dunslane.net wrote: Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better. I think we're missing something in the discussion here. Why can't you find out the oid of the type by looking it up by name? That mechanism is used throughout postgres already and seems to work just fine. Sure, but how do you know the type named hstore is what you know as hstore? We don't have a global consensus that a specific type name means exactly what everyone thinks it is. Thanks, -- Hitoshi Harada
Re: [HACKERS] htup header reorganization breaks many extension modules
On Tue, Sep 25, 2012 at 5:30 PM, Peter Eisentraut pete...@gmx.net wrote: I haven't followed the details of the htup header reorganization, but I have noticed that a number of external extension modules will be broken because of the move of GETSTRUCT() and to a lesser extent heap_getattr(). Of course some #ifdefs can fix that, but it seems annoying to make everyone do that. Maybe this could be reconsidered to reduce the impact on other projects. But isn't it only adding #include "access/htup_details.h"? I'd not argue it's harmful unless I missed your point. Thanks, -- Hitoshi Harada
Re: [HACKERS] Oid registry
On Mon, Sep 24, 2012 at 8:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: Andrew Dunstan and...@dunslane.net writes: On 09/24/2012 09:37 PM, Peter Eisentraut wrote: Could you fill the rest of us in with some technical details about why this might be necessary and what it aims to achieve? Well, an obvious case is how record_to_json handles fields. If it knows nothing about the type all it can do is output the string value. That doesn't work well for types such as hstore. If it could reliably recognize a field as an hstore it might well be able to do lots better. Um ... that is an entirely unconvincing use case. We would not put any code into core that knows specifically about a non-core datatype, or at least I sure hope we'd not do such a klugy thing. It would be far better to invent an extensibility scheme (plug-in of some sort) for record_to_json. I don't think (and don't hope) the core should know about external data types, but I have certainly seen a lot of use cases where an external project wants to know about another external data type. Say, if plv8 wants to convert hstore into a javascript object. It is possible for users to define a function that accepts hstore as an argument, but how does plv8 know the input is actually hstore? Of course you can look up the type name by consulting SysCache and see if the type name is hstore, but it's expensive to do that for every function invocation, so caching the hstore's oid in plv8 is the current workaround, which is not truly safe because the hstore's oid may change while cached. I can tell similar stories around array creation, which needs the element type oid. The core code can do some special stuff by using pre-defined type or function oid macros, and I guess it is reasonable for external developers to want to do such things. Thanks, -- Hitoshi Harada
Re: [HACKERS] pg_reorg in core?
On Thu, Sep 20, 2012 at 7:05 PM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, During the last PGCon, I heard that some community members would be interested in having pg_reorg directly in core. Just to recall, pg_reorg is a tool developed by NTT that makes it possible to reorganize a table without taking locks on it. The technique it uses is to create a temporary copy of the table with a CREATE TABLE AS, whose definition changes depending on whether the table is reorganized as with a VACUUM FULL or a CLUSTER. Then it follows this mechanism:
- triggers are created to redirect all the DML that occurs on the table to an intermediate log table.
- indexes are created on the temporary table based on what the user wishes.
- the logs registered during the index creation are applied.
- the names of the freshly created table and the old table are swapped.
- the useless objects are dropped.
I'm not familiar with pg_reorg, but I wonder why we need a separate program for this task. I know pg_reorg is ok as an external program per se, but if we could optimize CLUSTER (or VACUUM, which I'm a little pessimistic about) in the same way, it would be much nicer than having an additional binary + extension. Isn't it possible to do the same thing within the CLUSTER command? Maybe CLUSTER .. CONCURRENTLY? Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
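In SQL terms, the mechanism described above amounts to something like the following sketch (table, column, and index names are hypothetical; the trigger definitions and the final swap are elided):

```sql
-- 1. Log table that triggers fill with concurrent DML on tbl.
CREATE TABLE tbl_log (LIKE tbl);

-- 2. Reorganized copy of the table (CLUSTER-style rewrite on column a).
CREATE TABLE tbl_new AS SELECT * FROM tbl ORDER BY a;

-- 3. Rebuild the indexes the user asked for.
CREATE INDEX tbl_new_a_idx ON tbl_new (a);

-- 4. Apply the rows accumulated in tbl_log to tbl_new, then swap the
--    table names and drop the old objects under a short exclusive lock.
```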
Re: [HACKERS] [PATCH]Tablesample Submission
On Tue, Aug 21, 2012 at 8:08 AM, Qi Huang huangq...@outlook.com wrote: Please add your patch here: https://commitfest.postgresql.org/action/commitfest_view/open -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company Hi, Robert. I added it under Miscellaneous. https://commitfest.postgresql.org/action/patch_view?id=918
- The patch does not apply cleanly against the latest master; outfuncs.c, allpaths.c and cost.h have rejected hunks. make check fails in a lot of cases, up to 26 out of 133. I didn't look into each failure, but I suggest rebasing on the latest master and making sure the regression tests pass.
- Parts of the patch don't follow our coding standards. Please adjust brace positions, for example. In header include lists and Makefiles, place new files in alphabetical order.
- The patch doesn't include any documentation. Consider adding a doc patch for a feature as big as this.
- You should update kwlist.h for REPEATABLE, but I'm not sure whether REPEATABLE should become a reserved keyword yet.
- I don't see why you created T_TableSampleInfo. TableSampleInfo looks fine as a simple struct rather than a Node.
- If we want to disallow a cursor over a sampled table, we should check it in the parser, not the planner. In the wiki page, one of the TODOs mentions cursor support, but how difficult is it? What does the standard say?
- s/skiped/skipped/
- Don't we need to reset the seed on ExecReScanSampleScan?
- Should we add a new executor node SampleScan? It seems everything about random sampling is happening under heapam.
- It looks like BERNOULLI allocates a heap tuple array beforehand and copies all the tuples into it. That doesn't look acceptable when you are dealing with a large number of rows in the table.
- As the wiki says, BERNOULLI relies on the statistics of the table, which doesn't sound good to me. Of course we could call this a restriction and say good-bye to users who hadn't run ANALYZE first, but that is too hard for a normal user.
We may need a quick-and-rough count(*) for this. That is pretty much all I have so far. I haven't read all the code nor the standard, so I might be wrong somewhere. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
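For what it's worth, one cheap approximation is the planner's own row-count estimate from the statistics, which avoids a full scan (it is only as fresh as the last VACUUM or ANALYZE):

```sql
-- Quick-and-rough row count, no table scan:
SELECT reltuples::bigint AS approx_rows
FROM pg_class
WHERE relname = 'tbl';
```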
[HACKERS] Plan cache and name space behavior in 9.2
I came across a new behavior in 9.2 with nested plpgsql functions and a temporary table.

create or replace function chck(tname text) returns void as $$
declare
  cnt int;
begin
  select count(*) from pg_attribute
    where attrelid = tname::regclass::oid into cnt;
end;
$$ language plpgsql;

create or replace function train(tname text) returns text as $$
declare
begin
  perform chck(tname);
  return 'OK';
end;
$$ language plpgsql;

create or replace function test(tname text) returns text as $$
declare
  result text;
begin
  result = train(tname);
  return result;
end;
$$ language plpgsql;

drop table if exists tbl;
create table tbl(a int);
select test('tbl');  -- call 1
drop table if exists tbl;
create temp table tbl(a int);
select test('tbl');  -- call 2
drop table tbl;

I expected tname::regclass in the function chck() to succeed, but it actually fails on the first run in the session; the second run of the script succeeds. I assume the cause is the new plan cache behavior. On the first run of the script, the temporary namespace had not yet been created when chck() was called at call 1, so the saved search path does not include it; at call 2 the cached plan then searches that old saved path for 'tbl' and fails to find it. If call 1 is removed, the script runs successfully. The second run of the script in the same session succeeds because the temporary namespace was created during the first run. I confirmed it runs fine in 9.1. Is this a bug in 9.2 or expected behavior of the new plan cache? Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
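One possible workaround — a sketch, not something tested against this exact case — is to schema-qualify the temp table name so that resolution does not depend on the cached search_path at replan time:

```sql
-- Qualify explicitly instead of relying on the saved search_path:
select test('pg_temp.tbl');  -- call 2, with pg_temp named outright
```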
Re: [HACKERS] Plan cache and name space behavior in 9.2
On Fri, Sep 14, 2012 at 12:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hitoshi Harada umi.tan...@gmail.com writes: I expected success in tname::regclass in the function chck(), but it actually fails for your first run in the session. Really? Not for me. In the example as given, I see success for call 1 and then an error at call 2, which is occurring because we're trying to replan the query with the original search_path, which doesn't include the temp schema since it didn't exist yet. I'm saying the same thing, actually: I see success for call 1 and an error at call 2, which was not observed in 9.1 and older. A replan would have failed in previous versions too, but that's much less likely in previous versions since you'd need to see a relcache invalidation on one of the referenced tables to make one happen. I don't think so. I tried it on 9.1 and it succeeded. I found this while testing an external module that has been running since 8.4, so I wonder whether we should call this a behavior change or a bug. And I would agree that a replan failure makes sense if the function were marked immutable or stable, but all the functions defined in my example are volatile. I'm not sure how others feel, but it is at least surprising to me that call 2 keeps the state of call 1 even though the function is volatile. I have not been tracking the periodic discussion of plan cache vs. search_path, but what is the beneficial use case of the new behavior? Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Missing optimization when filters are applied after window functions
On Thu, May 17, 2012 at 7:26 AM, Volker Grabsch v...@notjusthosting.com wrote: Hitoshi Harada schrieb: On Wed, May 16, 2012 at 12:50 AM, Volker Grabsch v...@notjusthosting.com wrote: I propose the following general optimization: If all window functions are partitioned by the same first field (here: id), then any filter on that field should be executed before WindowAgg. So a query like this: I think that's possible. Currently the planner doesn't think any qualification from the upper query can be pushed down to a query that has a window function. It would be possible to let it push down if the expression matches the PARTITION BY expression. Sounds great! However, the challenge is that a query may have a number of window functions that have different PARTITION BY expressions. At the time of pushing down in the planning, it is not obvious which window function comes first. I don't really understand what you mean by which window function comes first, because to my understanding, all window functions of a query belong to the same level in the query hierarchy. But then, my knowledge of PostgreSQL internals isn't very deep, either. No, you can specify as many window specifications as you like. Say, SELECT count(*) over (w1), count(*) over (w2) FROM tbl WINDOW w1 AS (PARTITION BY x ORDER BY y), w2 AS (PARTITION BY y ORDER BY x); and in the same query level there are different types of window specifications. The code as it stands today has no semantics about which of w1 or w2 runs first (probably w1 will run first, but the query semantics don't enforce anything). One idea is to restrict such optimization to only the case of a single window function; the other is to generalize it and cover a lot of cases. From a practical point of view, the restriction to a single window function wouldn't be that bad, although I'd prefer to think about the number of different windows rather than the number of window functions.
In other words, every optimization that is correct for a single window function is also correct for multiple window functions if they all use the same window. Yeah, I mean multiple windows, not window functions. That said, our planner on window functions has a lot of improvement to be done. Every kind of optimization I see is like what I raised above: it can be done easily by hacking in a small special case, or it can be done by generalizing to cover most cases. My understanding is our project tends to prefer the latter; it takes a little more time but covers more use cases. I'd also prefer to see a general solution, as this provides less room for unpleasant surprises (e.g. This query is only slightly different from the previous one. Why does it take so much longer?). On the other hand, any small improvement is a big step forward regarding window functions. Unfortunately, I can't volunteer on that, as it is currently impossible for me to allocate enough time for this. However, any pointer to where to look in the source (or in the manual) would be of great help. Maybe I'll find at least enough time to provide a rough proposal, or to improve existing attempts to solve this issue. Look at subquery_is_pushdown_safe in allpaths.c. That is where we stop considering pushing the upper qualification down into the sub-query if the sub-query has any window function expressions. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
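A concrete sketch of the pushdown being discussed (table and column names are hypothetical):

```sql
-- Today the planner evaluates the filter above the WindowAgg node.
-- Since the filter is on the PARTITION BY column, it could safely be
-- pushed below WindowAgg: rows with id <> 42 can never contribute to
-- the sum computed for the id = 42 partition.
SELECT *
FROM (
  SELECT id, v, sum(v) OVER (PARTITION BY id) AS s
  FROM tbl
) sub
WHERE id = 42;
```

A filter on a non-partitioning column (say, WHERE v > 0) could not be pushed down this way, since it would change which rows feed the window aggregate.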
Re: [HACKERS] Missing optimization when filters are applied after window functions
On Wed, May 16, 2012 at 12:50 AM, Volker Grabsch v...@notjusthosting.com wrote: I propose the following general optimization: If all window functions are partitioned by the same first field (here: id), then any filter on that field should be executed before WindowAgg. So a query like this: I think that's possible. Currently the planner doesn't think any qualification from the upper query can be pushed down to a query that has a window function. It would be possible to let it push down if the expression matches the PARTITION BY expression. However, the challenge is that a query may have a number of window functions that have different PARTITION BY expressions. At the time of pushing down in the planning, it is not obvious which window function comes first. One idea is to restrict such optimization to only the case of a single window function; the other is to generalize it and cover a lot of cases. That said, our planner on window functions has a lot of improvement to be done. Every kind of optimization I see is like what I raised above: it can be done easily by hacking in a small special case, or by generalizing to cover most cases. My understanding is our project tends to prefer the latter; it takes a little more time but covers more use cases. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Last gasp
On Sat, Apr 14, 2012 at 2:28 PM, Jay Levitt jay.lev...@gmail.com wrote: Christopher Browne wrote: On Thu, Apr 12, 2012 at 6:11 PM, Jay Levitt jay.lev...@gmail.com wrote: Rather than extend the CF app into a trivial-patch workflow app, it might be worth looking at integrating it with github. There's a reluctance to require a proprietary component that could disappear on us without notice. Excellent point. I was thinking that GitHub's API would allow archival exporting to counter that, along the lines of let's take advantage of it for the next five years until it goes south, and THEN we could write our own. But I can see how that might not be the best choice for a project that expects to preserve history for a few decades. GitHub does offer an enterprise version that you can self-host, but it seems to be priced per-user and intended solely for intranet use. If the feature set is desirable, though, I wonder if Postgres is big/high profile enough for them to figure out some sort of better arrangement. They *love* it when big open-source projects use GitHub as their public repo - they'll email and blog announcements about it - and if there's interest I'd be happy to open a conversation with them. The existence of git itself is a result of *exactly* that circumstance, as Linux kernel developers had gotten dependent on BitKeeper, whereupon the owner decided to take his toys home, at which point they were left bereft of their SCM tool. http://kerneltrap.org/node/4966 Good history lesson there, with a great outcome. I expect that it would be more worthwhile to look into enhancements to git workflow such as Gerrit (http://code.google.com/p/gerrit/). I don't know that Gerrit is THE answer, but there are certainly projects that have found it of value, and it doesn't have the oops, it's proprietary problem. I've looked at it in conjunction with Jenkins CI; it looked nice but was way too heavy-weight for a four-person startup (what's code review?).
It's probably much more suitable for a project of this size. Gerrit's a full-featured code review app with a tolerable UI; I was thinking of GitHub more as a great lightweight UI for doc patches and other trivial patches where you might have lots of casual review and comments but no need for, say, recording regression tests against each patch version. e.g.: https://github.com/rails/rails/pull/5730 Also, for doc patches, GitHub has the great advantage of in-place editing right from the web UI. I don't know whether GitHub's pull requests or Gerrit is a good tool for us (I doubt it, actually), but I've been thinking about how we could improve our review process in terms of both the human process and the tooling. Just as we have our own simple CF app (while there are a bunch of tools like JIRA out there), I'd think we could have our own review UI connected to the rest of our toolset, including the CF app. I know we want the mail archive history of the whole discussion, but still, giving feedback to the submitter via email is hard work, and later readers cannot easily follow it all. From a human- rather than technology-oriented perspective: I was shocked to find that you folks *WANT* reviews from non-contributors. It was my assumption as a newcomer that if I don't feel well-versed enough to submit patches yet, the last thing you'd want me to do was to look over someone else's patch and say Yeah, that looks good, any more than I care if my mom thinks my latest web app is very nice. I see now that the Reviewing a Patch wiki page explains this, but maybe this info should be pushed higher into the docs and web site; a How can I contribute page, open calls for reviewers on the non-hackers mailing lists, things like that. Or maybe just make the wiki page bright red and blink a lot. I found myself enjoying reviewing patches even in areas where I don't have strong knowledge. I strongly believe we should encourage more and more people who haven't worked on a particular area to review patches in it.
The more eyeballs there are, the better the quality we get. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [JDBC] Regarding GSoc Application
On Tue, Apr 10, 2012 at 8:41 PM, Atri Sharma atri.j...@gmail.com wrote: Hi All, I think we are back on the initial approach I proposed (hooking directly into the JVM and executing Java code that calls JDBC). I think the best way to do this is to create a JVM that executes the Java code and give control of the JVM to the native API. I agree, the only need of PL/Java that is apparent here is the need of the Java internals (JDK et al). If we set them up independently, then we can have the FDW wrapping JDBC directly through JNI. JNI would call pure Java functions to connect over JDBC. I think we can proceed with this. Once we are done with the API calling Java functions, I think the rest of the path is easily mapped (writing Java functions to connect over JDBC). Please let me know your opinions on this. I think Multicorn is a good example; it invokes Python from the FDW routines, though it does not use PL/Python. http://multicorn.org/ Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [BUGS] BUG #6572: The example of SPI_execute is bogus
On Wed, Apr 4, 2012 at 8:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: umi.tan...@gmail.com writes: http://www.postgresql.org/docs/9.1/static/spi-spi-execute.html === SPI_execute(INSERT INTO foo SELECT * FROM bar, false, 5); will allow at most 5 rows to be inserted into the table. === This seems not true unless I'm missing something. Hmm ... that did work as described, until we broke it :-(. This is an oversight in the 9.0 changes that added separate ModifyTuple nodes to plan trees. ModifyTuple doesn't return after each updated row, unless there's a RETURNING clause; which means that the current_tuple_count check logic in ExecutePlan() no longer stops execution as intended. Given the lack of complaints since 9.0, maybe we should not fix this but just redefine the new behavior as being correct? But it seems mighty inconsistent that the tuple limit would apply if you have RETURNING but not when you don't. In any case, the ramifications are wider than one example in the SPI docs. Thoughts? To be honest, I was surprised when I found that the tcount parameter is said to apply even to INSERT. I believe people think that parameter exists to limit memory consumption when returning tuples, and thus that it applies only to SELECT or DML with RETURNING. So I'm +1 for not fixing it and instead redefining the new behavior as correct. Who wants to limit the number of rows processed inside the backend, from SPI? Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd out of memory problem.
On Thu, Mar 29, 2012 at 7:38 PM, Peter Eisentraut pete...@gmx.net wrote: On tis, 2012-03-27 at 00:53 +0100, Greg Stark wrote: Hm. So my original plan was dependent on adding the state-merge function we've talked about in the past. Not all aggregate functions necessarily can support such a function but I think all or nearly all the builtin aggregates can. Certainly min, max, count, sum, avg, stddev, array_agg can, which are most of what people do. That would be a function which can take two state variables and produce a new state variable. This information could also be useful to have in PL/Proxy (or similar FDWs) to be able to integrate aggregate computation into the language. Currently, you always have to do the state merging yourself. I don't know exactly how PL/Proxy or pgpool accomplishes multi-phase aggregation, but in theory the proposal above is a state-merge function, so it doesn't apply to general aggregate results that have already passed through the final function. Aggregates that have no final function are, of course, fine to apply a state-merge function to directly. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
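To illustrate the distinction: merging avg() across partitions has to happen on the transition state (sum, count), not on the finalized averages. This is the manual workaround a state-merge function would automate (table and column names are hypothetical):

```sql
-- Averaging the per-partition averages would weight each partition
-- equally; the correct merge combines the states, then finalizes.
SELECT sum(s) / sum(n) AS merged_avg
FROM (
  SELECT sum(v)::numeric AS s, count(v) AS n FROM part1
  UNION ALL
  SELECT sum(v)::numeric, count(v) FROM part2
) states;
```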
Re: [HACKERS] Finer Extension dependencies
On Wed, Mar 28, 2012 at 8:52 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Mar 28, 2012 at 11:28 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: In practice, however, that sounds like a real pain in the neck. I would expect most people who were packaging extensions to handle a situation like this by forcing the user to provide the name of the function to be called, either via a control table or via a GUC. And once you've done that, it makes no sense to shove a feature dependency into the extension, because the user might very well just write an appropriate function themselves and tell the extension to call that. I don't know what you're talking about here, all I can say is that it has nothing to do with what the patch is implementing. What's in the patch is a way to depend on known versions of an extension rather than the extension wholesale, whatever the version. Using feature dependencies allows avoiding two particularly messy things: - imposing a version numbering scheme with a comparator - maintaining a separate feature matrix So instead of having to say foo version 1.2 is now doing buzz and having an extension depend on foo = 1.2, you can say that your extension depends on the buzz feature. That's about it. Based on this information, it seems that I've misinterpreted the purpose of the patch. Since extension features seem to live in a global namespace, I assumed that the purpose of the patch was to allow both extension A and extension B to provide feature F, and extension C to depend on F rather than A or B specifically. What I understand you to be saying here is that's not really what you're trying to accomplish. Instead, you're just concerned about allowing some but not all versions of package A to provide feature F, so that other extensions can depend on F to get the specific version of A that they need (and not, as I had assumed, so that they can get either A or B). Let me think more about that.
Maybe I'm just easily confused here, or maybe there is something that should be changed in the code or docs; I'm not sure yet. On a more prosaic note, you seem to have made a mistake when generating the v5 diff. It includes reverts of a couple of unrelated, recent patches. WTF? WTF? On a further note, I have spent a heck of a lot more time reviewing other people's patches this CommitFest than you have, and I don't appreciate this. If you'd rather that I didn't spend time on this patch, I have plenty of other things to do with my time. Frankly, I'm still against this patch. Since I started to review it I've never been convinced by the use case. Yeah, someone said it would be useful to him, but as a developer of some PGXN modules I don't see it. I totally agree with Robert's point that a feature is not standardized and nobody can tell how you can depend on the feature in the end. Mind you, I've never heard of building a dependency on a bare string name in other packaging systems. If you want to introduce the concept of a version dependency rather than a feature-name dependency, do *that*; I don't think feature dependency solves it. I know you, Dimitri, have been working hard on this and other patches, but it seems to me that the quality of both the design and the patch code is not adequate at this point. I think I agree we are not 100% happy with the current dependency system of extensions, but we need more time to let the idea mature rather than rushing and pushing something premature. The cost we would pay if we rushed this into this release would be higher than what we'd get from it, I think. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Finer Extension dependencies
On Thu, Mar 29, 2012 at 12:51 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Hitoshi Harada umi.tan...@gmail.com writes: Frankly, I'm still against this patch. Since I started to review it I've never been convinced by the use case. Yeah, someone said it would be useful to him, but as a developer of some PGXN modules I don't see it. I totally agree with Robert's point that a feature is not standardized and nobody can tell how you can depend on the feature in the end. Mind you, I've never heard of building a dependency on its Ok, we might need to find another word for the concept here. Will think; would appreciate native speakers' thoughts. name as a string in other packaging systems. If you want to introduce the concept of a version dependency rather than a feature-name dependency, do *that*; I don't think feature dependency solves it. I don't want to introduce version dependency, because I don't think we need it. If you want to compare what we're doing here with, say, debian packaging, then look at how they package libraries. The major version number is now part of the package name and you depend on that directly. So let's take the shortcut to directly depend on the “feature” name. For a PostgreSQL extension example, we could pick ip4r. That will soon include support for ipv6 (it's already done code-wise, missing docs update). If you want to use ip4r for storing ipv6, you will simply require “ip6r” or whatever feature name is provided by the extension including it. So my question is: why can't you depend on ip4r in that case? If some version of the module introduces ipv6 support, then let's depend on that version. It doesn't explain why a string feature name is needed. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
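In control-file terms, the two approaches being contrasted look roughly like this. The version-comparison syntax in the second line is hypothetical — PostgreSQL's extension mechanism has no version comparator, which is exactly the gap under discussion:

```
# feature-name dependency (what the patch implements):
requires = 'ip6r'

# version dependency (the alternative being argued for):
requires = 'ip4r >= 2.0'   # hypothetical syntax, not implemented
```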
Re: [HACKERS] Odd out of memory problem.
On Mon, Mar 26, 2012 at 5:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark st...@mit.edu writes: On Mon, Mar 26, 2012 at 6:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: Could you give us a brain dump on the sketch? I've never seen how to do it without unreasonable overhead. Hm. So my original plan was dependent on adding the state-merge function we've talked about in the past. Not all aggregate functions necessarily can support such a function but I think all or nearly all the builtin aggregates can. Certainly min,max, count, sum, avg, stddev, array_agg can which are most of what people do. That would be a function which can take two state variables and produce a new state variable. I'd rather not invent new requirements for aggregate implementations if we can avoid it. However now that I've started thinking about it further I think you could solve it with less complexity by cheating in various ways. For example if you limit the hash size to 1/2 of work_mem then you when you reach that limit you could just stuff any tuple that doesn't match a hash entry into a tuplesort with 1/2 of work_mem and do the regular level break logic on the output of that. Or just start dumping such tuples into a tuplestore, while continuing to process tuples that match the hashagg entries that are already in existence. Once the input is exhausted, read out the hashagg entries we have, flush the hashagg table, start reading from the tuplestore. Repeat as needed. I like this idea because the only thing you give up is predictability of the order of output of aggregated entries, which is something that a hashagg isn't guaranteeing anyway. In particular, we would still have a guarantee that any one aggregate evaluation processes the matching tuples in arrival order, which is critical for some aggregates. 
The main problem I can see is that if we start to flush after work_mem is X% full, we're essentially hoping that the state values for the existing aggregates won't grow by more than 1-X%, which is safe for many common aggregates but fails for some like array_agg(). Ultimately, for ones like that, it'd probably be best to never consider hashing at all. I guess we could invent an unsafe for hash aggregation flag for aggregates that have unbounded state-size requirements. From what I've learned in the last couple of months, array_agg is not ready for fallback strategies like dumping to a tuplestore; even state merging is not possible for it. The problem is that the executor doesn't know how to serialize/deserialize the internal-type trans value. So in one implementation, the existence of a merge function is a flag to switch back to sort grouping instead of hash aggregation, and array_agg is one such aggregate. That said, if you invent a new flag to mark an aggregate as not dump-ready, it would be worth inventing a state-merge function in the aggregate infrastructure anyway. One way I can imagine, without a state-merge function or dumping to a tuplestore, would be to sort the hash table contents and the rest of the input so that we can switch to sort grouping: since we have the hash table in memory, we can definitely sort its entries, and everything that arrives after the fallback can be written to disk and read back after the scan finishes. With sorted state values and sorted input, we can continue the rest of the work. Anyway, the memory usage problem is not limited to array_agg and hash aggregation. Even if you can say the hash table exceeds X% of work_mem, how can you tell that other operators, such as Sort, are not using the rest of the memory? One approach I could see to avoid this is to assign an arbitrary slice of work_mem to each operator and account for it locally, but that approach would skew occasionally and is not perfect either.
Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] CREATE DOMAIN json vs built-in json
I've noticed our plv8 regression test now fails. It has CREATE DOMAIN json AS text ... and validates the text via v8's JSON.parse(), which was working before the built-in json type was introduced. The test itself can be fixed simply by creating a schema, but my surprise is that we allow a domain whose name is the same as that of a built-in base type. Is that intentional? Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
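For the record, the two types can coexist because they live in different schemas: the domain is created in the user's schema (typically public) while the built-in type lives in pg_catalog, so which one an unqualified name resolves to depends on search_path. A minimal illustration:

```sql
-- The built-in type is pg_catalog.json; the domain becomes public.json.
CREATE DOMAIN public.json AS text;

-- Which type 'x'::json resolves to now depends on search_path; with
-- pg_catalog implicitly first (the default), the built-in type wins.
```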
Re: [HACKERS] Finer Extension dependencies
On Fri, Feb 24, 2012 at 2:09 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: Hitoshi Harada umi.tan...@gmail.com writes: I confirmed DROP EXTENSION is fixed now. In turn, it seems to me requires doesn't work. My test ext2.control looks like: I'm very sorry about that. It's all about playing with pg_depend and I've failed to spend enough time on that very topic to send a patch that just works, it seems. I'm going to fix that over the week-end. Thanks for your reviewing so far. Quickly reviewed the patch and found some issues:
- There is some mixing of pg_extension_feature and pg_extension_features.
- The doc says pg_extension_features has four columns, but that's not true.
- Line 608 is bad. In the loop, provides_itself is repeatedly flipped between true and false, and I guess that's not what you meant.
- Line 854+: you can fold the two blocks into one. They are similar, and by passing a provides list built with list_make1 when control->provides == NIL you can do it in one block.
- s/trak/track/
- Line 960: you missed updating the classId for the dependency.
That's pretty much it from me. I just looked at the patch and have no opinion on the grand architecture. Marking Waiting on Author. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Finer Extension dependencies
On Mon, Feb 13, 2012 at 3:18 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hi, Sorry for the delays, I'm back on PostgreSQL related work again.
> Hitoshi Harada umi.tan...@gmail.com writes:
>> I just tried DROP EXTENSION now, and found it broken :(
> Please find v2 of the patch. I did change the dependency management in between the simple cases and the more challenging ones and forgot that I had to retest it all in between, which is what happens on a tight schedule and when working at night, I guess.

The patch partially fails to apply due to the pg_proc column changes from the leakproof patch, but I could apply it manually. I confirmed DROP EXTENSION is fixed now. In turn, it seems to me that "requires" doesn't work. My test ext2.control looks like:

comment = 'sample1'
default_version = '1.0'
requires = 'featZ'
relocatable = true

Yet this extension can be installed against a cleanly initialized database. I double-checked that there's no entry for featz in pg_extension_feature. Also, I found that if the control file has duplicate names in "provides", the error is not friendly ("duplicate entry for pg_extension_feature", or something). The same happens if "provides" contains the extension name itself. I'll look at it more, but I'm giving these comments so far so that you can find solutions to them soon.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Memory usage during sorting
On Sat, Feb 11, 2012 at 11:34 AM, Jeff Janes jeff.ja...@gmail.com wrote:
> On Wed, Feb 8, 2012 at 1:01 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
>> On Sun, Jan 15, 2012 at 4:59 PM, Jeff Janes jeff.ja...@gmail.com wrote:
>>> The attached patch allows it to reuse that memory. On my meager system it reduced the building of an index on an integer column in a skinny 200 million row totally randomly ordered table by about 3% from a baseline of 25 minutes.
>> Just to give a standard review: this patch is a one-line change, applies cleanly, and builds OK. I'm not quite sure what exactly you're trying to accomplish, but it seems to me that it avoids the first dumptuples cycle by forcing availMem = 0 even if it's negative.
> Yes. Currently when it switches to the TSS_BUILDRUNS part of a tape-sort, it starts by calling WRITETUP a large number of times consecutively, to work off the memory deficit incurred by the 3 blocks per tape of tape overhead, and then after that calls WRITETUP about once per puttuple. Under my patch, it would only call WRITETUP about once per puttuple, right from the beginning.
>> I read your comments as saying it avoids alternating reads and writes to scattered memory, which defeats the memory cache, during the merge phase; but actually it doesn't affect the merge phase, only the initial dump phase, does it?
> It affects the building of the runs. But this building of the runs is not a simple dump; it is itself a mini merge phase, in which it merges the existing in-memory priority queue against the still-incoming tuples from the node which invoked the sort. By using less memory than it could, the resulting runs are smaller than they could be, and so will sometimes necessitate an additional layer of merging later on. (This effect is particularly large for the very first run being built. Generally, by merging incoming tuples into the in-memory tuples, you can create runs that are 1.7 times the size of what fits in memory. By wasting some memory, we are getting 1.7 times the size of a smaller starting point. But for the first run, it is worse than that. Most of the benefit that leads to that 1.7 multiplier comes at the very early stage of each run-build. But by initially using the full memory, then writing out a bunch of tuples without doing any merge of the incoming, we have truncated the part that gives the most benefit.)
> My analysis that the freed memory is never reused (because we refuse to reuse it ourselves and it is too fragmented to be reused by anyone else, like the palloc or VM system) only applies to the run-building phase. So "never" was a bit of an overstatement. By the time the last initial run is completely written out to tape, the heap used for the priority queue should be totally empty. So at this point the allocator would have the chance to congeal all of the fragmented memory back into larger chunks, or maybe it parcels the allocations back out again in an order such that the unused space is contiguous and could be meaningfully paged out. But is it worth worrying about how much we fragment memory and whether we overshoot our promises by 10 or 20%?
>> If so, I'm not so convinced that your benchmark's 3% gain came from this change. Correct me, as I'm probably wrong.
> I've now done more complete testing: building an index on a 200,000,000-row table with an integer column populated in random order with integers from 1..500,000,000, non-unique, on a machine with 2GB of RAM and 600MB of shared_buffers. It improves things by 6-7 percent at the low end of the work-mem sizes tested; the rest are in the noise, except at 77936 KB, where it reproducibly makes things 4% worse, for reasons I haven't figured out. So maybe the best thing to do is, rather than micromanaging memory usage, simply not set maintenance_work_mem way too low. (But that is the default.)
I've tested here with only a million rows of mixed integer/text (table size is 80MB or so) with default settings, running CREATE INDEX continuously, but couldn't find a performance improvement. The numbers vary from -2% to +2%, which I think is just noise. While I agree with your point that avoiding the first dump could make sense, I guess it depends on the situation: if the dump goes out with lots of tuples (which should happen when availMem is big), writing many tuples at a time will be faster than writing little by little later. I'm not sure about the conclusion, but given this discussion, I'm inclined to mark this Returned with Feedback.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Memory usage during sorting
On Sun, Jan 15, 2012 at 4:59 PM, Jeff Janes jeff.ja...@gmail.com wrote:
> The attached patch allows it to reuse that memory. On my meager system it reduced the building of an index on an integer column in a skinny 200 million row totally randomly ordered table by about 3% from a baseline of 25 minutes.

Just to give a standard review: this patch is a one-line change, applies cleanly, and builds OK. I'm not quite sure what exactly you're trying to accomplish, but it seems to me that it avoids the first dumptuples cycle by forcing availMem = 0 even if it's negative. I read your comments as saying it avoids alternating reads and writes to scattered memory, which defeats the memory cache, during the merge phase; but actually it doesn't affect the merge phase, only the initial dump phase, does it? If so, I'm not so convinced that your benchmark's 3% gain came from this change. Correct me, as I'm probably wrong. Anyway, it would be nice to update the comment just above the change so that it stays accurate.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Memory usage during sorting
On Sat, Feb 4, 2012 at 10:06 AM, Jeff Janes jeff.ja...@gmail.com wrote:
> The worst thing about the current memory usage is probably that big machines can't qsort more than 16,777,216 tuples no matter how much memory they have, because memtuples has to live entirely within a single allocation, which is currently limited to 1GB. It is probably too late to fix this problem for 9.2. I wish I had started there rather than here. This 16,777,216 tuple limitation will get even more unfortunate if the specializations that speed up qsort but not external sort get accepted.

I think it's a fair ask to extend our 1GB palloc limitation into 64-bit space. I see a lot of applications that want more memory from a single palloc call in user-defined functions, aggregates, etc. As you may have noticed, however, the areas of Postgres involved need to be investigated deeply. I don't know where it's safe to allow it and where it isn't. varlena types are unsafe, but it might be possible to extend the varlena header to 64 bits in some major release.

> Attached is a completely uncommittable proof of concept/work in progress patch to get around the limitation. It shows a 2-fold improvement when indexing an integer column on a 50,000,000 row randomly ordered table.

In any case, we need a bird's-eye sketch to attack this, but I guess it's worthwhile, and at some future point we definitely must do it, though I don't know whether that's the next release or three releases from now.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Thu, Feb 2, 2012 at 3:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> [ working on this patch now ... ]
> Matthew Draper matt...@trebex.net writes:
>> On 25/01/12 18:37, Hitoshi Harada wrote:
>>> Should we throw an error in such an ambiguous case? Or did you make it happen intentionally? If the latter, we should also mention the rule in the manual.
>> I did consider it, and felt it was the most consistent:
> I believe the issue here is that a two-part name A.B has two possible interpretations (once we have eliminated table references supplied by the current SQL command inside the function). Per the comment:
>
> * A.B: A = record-typed parameter name, B = field name
> * or: A = function name, B = parameter name
>
> If both interpretations are feasible, what should we do? The patch tries them in the above order, but I think the other order would be better. My argument is this: the current behavior doesn't provide any out other than changing the function or parameter name. Now presumably, if someone is silly enough to use a parameter name the same as the function's name, he's got a good reason to do so and would not like to be forced to change it. If we prefer the function.parameter interpretation, then he still has a way to get to a field name: he just has to use a three-part name, function.parameter.field. If we prefer the parameter.field interpretation, but he needs to refer to a scalar parameter, the only way to do it is to use an unqualified reference, which might be infeasible (eg, if it also matches a column name exposed in the SQL command).
> Another problem with the current implementation is that if A matches a parameter name, but ParseFuncOrColumn fails (ie, the parameter is not of composite type or doesn't contain a field named B), then the hook just fails to resolve anything; it doesn't fall back to consider the function-name-first alternative. So that's another usability black mark. We could probably complicate the code enough so it did consider the function.parameter case at that point, but I don't think that there is a strong enough argument for this precedence order to justify such complication.
> In short, I propose swapping the order in which these cases are tried.

+1 from me.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Finer Extension dependencies
On Mon, Jan 23, 2012 at 3:06 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
> On Mon, Jan 23, 2012 at 2:00 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
>> Hitoshi Harada umi.tan...@gmail.com writes:
>>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
>>> It should, what happens when you try? :)
>>> I just tried DROP EXTENSION now, and found it broken :(
>>> db1=# create extension kmeans;
>>> CREATE EXTENSION
>>> db1=# drop extension kmeans;
>>> ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
>>> HINT:  You can drop extension feature kmeans instead.
>> Can you provide me the test case you've been using? That looks like a bug I need to fix, indeed (unless the problem lies in the test case, which would mean I need to tighten things some more).
> The test case is just above: createdb db1, then create and drop the extension. The kmeans extension is on PGXN. I also tried my small test extension named ext1, which contains only one plpgsql function; I created and then dropped it, and reproduced the problem.

Ping. If there are no updates soon, I'll mark this Returned with Feedback.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Sun, Jan 29, 2012 at 1:08 AM, Matthew Draper matt...@trebex.net wrote:
> On 25/01/12 18:37, Hitoshi Harada wrote:
>>> I'm still not sure whether to just revise (almost) all the SQL function examples to use parameter names, and declare them the right choice; as it's currently written, named parameters still seem rather second-class.
>> Agreed.
> I'll try a more comprehensive revision of the examples.
>> The patch seems OK, except for an example I've just found:
>>
>> db1=# create function t(a int, t t) returns int as $$ select t.a $$ language sql;
>> CREATE FUNCTION
>> db1=# select t(0, row(1, 2)::t);
>>  t
>> ---
>>  1
>> (1 row)
>>
>> Should we throw an error in such an ambiguous case? Or did you make it happen intentionally? If the latter, we should also mention the rule in the manual.
> I did consider it, and felt it was the most consistent:
>
> # select t.x, t, z from (select 1) t(x), (select 2) z(t);
>  x | t |  z
> ---+---+-----
>  1 | 2 | (2)
> (1 row)
>
> I haven't yet managed to find the above behaviour described in the documentation either, though. To me, it feels like an obscure corner case, whose description would leave the rules seeming more complicated than they generally are.

Makes sense to me. I marked this Ready for Committer.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Mon, Jan 23, 2012 at 7:13 PM, Matthew Draper matt...@trebex.net wrote:
> On 19/01/12 20:28, Hitoshi Harada wrote:
>> (Now it occurred to me that forgetting the #include of parse_func.h might cause this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know) I fixed it here and it now works in my environment.
> Well spotted; that's exactly what I'd done. :/
>> The regression tests pass and the feature seems to work as intended, but it seems to me that it needs more test cases and documentation. For the tests, I believe at least we need the ambiguous case given upthread, so that we can ensure compatibility is kept. For the documentation, it should describe the name resolution rule, as stated in the patch comment.
> Attached are a new pair of patches, fixing the missing include (and the other warning), plus addressing the above. I'm still not sure whether to just revise (almost) all the SQL function examples to use parameter names, and declare them the right choice; as it's currently written, named parameters still seem rather second-class.

Agreed.

The patch seems OK, except for an example I've just found:

db1=# create function t(a int, t t) returns int as $$ select t.a $$ language sql;
CREATE FUNCTION
db1=# select t(0, row(1, 2)::t);
 t
---
 1
(1 row)

Should we throw an error in such an ambiguous case? Or did you make it happen intentionally? If the latter, we should also mention the rule in the manual. Other than that, the patch looks good to me.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Finer Extension dependencies
On Sat, Jan 21, 2012 at 9:20 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hi, Thank you for reviewing this patch!
> Hitoshi Harada umi.tan...@gmail.com writes:
>> The patch applies with one reject, which I could fix easily. make check passed.
> Bitrot happens fast in this season… will produce another version of the patch.
>> Table "pg_catalog.pg_extension_feature"
>>    Column   | Type | Modifiers
>> ------------+------+-----------
>>  extoid     | oid  | not null
>>  extfeature | name | not null
>> Indexes:
>>     "pg_extension_feature_index" UNIQUE, btree (extoid, extfeature)
>>     "pg_extension_feature_oid_index" UNIQUE, btree (oid)
>>
>> * I'm not quite sure why pg_extension_feature_index needs the extoid column.
> That allows managing features per extension: you need to know which extension is providing which feature to be able to solve dependencies.

Do you mean you want a UNIQUE constraint via this index? The only usage I found searches a feature by its name alone, so I wondered whether extoid is necessary.

>> * I have a big question about adding a two-column catalog. I don't mind the actual number of columns, but if the table has only two columns, it implies the design may be bad. The only other two-column catalog is pg_largeobject_metadata.
> We need each feature to be a full PostgreSQL object so that we can use the dependency tracking. That allows DROP EXTENSION foo to cascade to extensions that depend on feature(s) provided by foo.

I guess if we spend more time we'll figure out what a feature actually is, and then see what kinds of columns/attributes are needed to represent it. Although I agree we can add them later, again, this may imply the design is premature. (It's OK if I'm the only person who thinks so.)

>> Next, some questions:
>> - Why is the finer dependency needed? Do you have a tangible example that struggles with the current dependency granularity? As an extension developer of several extensions, I feel fine about the existing dependency on the extension as a whole.
> The problem is not yet very apparent only because extensions are very new. The main thing we address with this patch is depending on a feature that appeared while developing an extension, or that gets removed down the line. It allows depending on features and avoids having to compare version numbers and maintain a list of which version number provides which feature. This feature has been asked for by several extension users, beginning even before 9.1 was released.
>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
> It should, what happens when you try? :)

I just tried DROP EXTENSION now, and found it broken :(

db1=# create extension kmeans;
CREATE EXTENSION
db1=# drop extension kmeans;
ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
HINT:  You can drop extension feature kmeans instead.
db1=# drop extension kmeans cascade;
ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
HINT:  You can drop extension feature kmeans instead.

Am I missing something? I'm confused about why this happens.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Finer Extension dependencies
On Mon, Jan 23, 2012 at 2:00 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hitoshi Harada umi.tan...@gmail.com writes:
>> - What happens on DROP EXTENSION ... CASCADE? Does it work?
>> It should, what happens when you try? :)
>> I just tried DROP EXTENSION now, and found it broken :(
>> db1=# create extension kmeans;
>> CREATE EXTENSION
>> db1=# drop extension kmeans;
>> ERROR:  cannot drop extension kmeans because extension feature kmeans requires it
>> HINT:  You can drop extension feature kmeans instead.
> Can you provide me the test case you've been using? That looks like a bug I need to fix, indeed (unless the problem lies in the test case, which would mean I need to tighten things some more).

The test case is just above: createdb db1, then create and drop the extension. The kmeans extension is on PGXN. I also tried my small test extension named ext1, which contains only one plpgsql function; I created and then dropped it, and reproduced the problem.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Thu, Jan 19, 2012 at 1:58 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
> On Wed, Jan 18, 2012 at 1:11 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
>> On Sat, Jan 14, 2012 at 8:10 AM, Matthew Draper matt...@trebex.net wrote:
>>> I just remembered to make time to advance this from WIP to proposed patch this week... and then worked out I'm rudely dropping it into the last commitfest at the last minute. :/
>> The patch applies clean against master but compiles with warnings:
>>
>> functions.c: In function ‘prepare_sql_fn_parse_info’:
>> functions.c:212: warning: unused variable ‘argnum’
>> functions.c: In function ‘sql_fn_post_column_ref’:
>> functions.c:341: warning: implicit declaration of function ‘ParseFuncOrColumn’
>> functions.c:345: warning: assignment makes pointer from integer without a cast
>>
>> (Now it occurred to me that forgetting the #include of parse_func.h might cause this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know)
> I fixed it here and it now works in my environment. The regression tests pass and the feature seems to work as intended, but it seems to me that it needs more test cases and documentation. For the tests, I believe at least we need the ambiguous case given upthread, so that we can ensure compatibility is kept. For the documentation, it should describe the name resolution rule, as stated in the patch comment.
> Aside from that, I wondered at first what happens if the function is schema-qualified. Say,
>
> CREATE FUNCTION s.f(a int) RETURNS int AS $$ SELECT b FROM t WHERE a = s.f.a $$ LANGUAGE sql;
>
> It actually errors out, since a function-name-qualified parameter reference only accepts the function name without a schema, but it looked weird to me at first. No better idea from me at the moment, though. I mark this Waiting on Author for now.

It's been a few days since my last comment; do you plan to send a new patch? If there's no reply, I'll mark this Returned with Feedback.
Thanks,
-- Hitoshi Harada
Re: [HACKERS] Finer Extension dependencies
On Sun, Dec 18, 2011 at 6:36 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
> Hi, The extensions work we began in 9.1 is not yet finished entirely (*cough*), so I'm opening a new patch series here by attacking the dependency problems. Some people want us to manage extension version numbers with sorting semantics so that we are able to depend on foo >= 1.2 and crazy things like this, and I think the need is reasonable and easier than that to address.

[Attachments: extension-provides.v1.patch, extensions.docx]

The patch applies with one reject, which I could fix easily. make check passed.

haradh1-mac:postgresql-head haradh1$ patch -p1 < ~/Downloads/extension-provides.v1.patch
patching file contrib/pg_upgrade_support/pg_upgrade_support.c
patching file doc/src/sgml/catalogs.sgml
Hunk #2 succeeded at 3063 (offset 11 lines).
Hunk #3 succeeded at 6865 (offset 11 lines).
patching file doc/src/sgml/extend.sgml
patching file src/backend/catalog/Makefile
patching file src/backend/catalog/dependency.c
patching file src/backend/catalog/system_views.sql
patching file src/backend/commands/extension.c
patching file src/include/catalog/dependency.h
patching file src/include/catalog/indexing.h
patching file src/include/catalog/pg_extension_feature.h
patching file src/include/catalog/pg_proc.h
Hunk #1 succeeded at 4341 (offset 25 lines).
patching file src/include/commands/extension.h
patching file src/test/regress/expected/rules.out
patching file src/test/regress/expected/sanity_check.out
Hunk #1 succeeded at 103 (offset 1 line).
Hunk #2 FAILED at 163.
1 out of 2 hunks FAILED -- saving rejects to file src/test/regress/expected/sanity_check.out.rej

What this patch does, basically:
- adds pg_extension_feature to store the feature names provided by extensions
- the extension control file now has a "provides" parameter listing features, comma separated
- when creating an extension, the backend looks up the features its control file requires
- the installed extension has dependencies on those features

So, the first thing is the catalog.

db1=# \d pg_extension_feature
Table "pg_catalog.pg_extension_feature"
   Column   | Type | Modifiers
------------+------+-----------
 extoid     | oid  | not null
 extfeature | name | not null
Indexes:
    "pg_extension_feature_index" UNIQUE, btree (extoid, extfeature)
    "pg_extension_feature_oid_index" UNIQUE, btree (oid)

* I'm not quite sure why pg_extension_feature_index needs the extoid column.
* I have a big question about adding a two-column catalog. I don't mind the actual number of columns, but if a table has only two columns, it implies the design may be bad. The only other two-column catalog is pg_largeobject_metadata.

Next, some questions:
- Why is the finer dependency needed? Do you have a tangible example that struggles with the current dependency granularity? As an extension developer of several extensions, I feel fine about the existing dependency on the extension as a whole.
- What happens on DROP EXTENSION ... CASCADE? Does it work?
- How does pg_upgrade interact with this? The dependencies change.

A minor memo: in list_extension_features(), I guess the size argument for repalloc is bogus.

So, that's pretty much what I've reviewed quickly. I'll need more time to look at the details, but I'd like more input on the high-level design and direction first.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Wed, Jan 18, 2012 at 1:11 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
> On Sat, Jan 14, 2012 at 8:10 AM, Matthew Draper matt...@trebex.net wrote:
>> I just remembered to make time to advance this from WIP to proposed patch this week... and then worked out I'm rudely dropping it into the last commitfest at the last minute. :/
> The patch applies clean against master but compiles with warnings:
>
> functions.c: In function ‘prepare_sql_fn_parse_info’:
> functions.c:212: warning: unused variable ‘argnum’
> functions.c: In function ‘sql_fn_post_column_ref’:
> functions.c:341: warning: implicit declaration of function ‘ParseFuncOrColumn’
> functions.c:345: warning: assignment makes pointer from integer without a cast
>
> (Now it occurred to me that forgetting the #include of parse_func.h might cause this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know)

I fixed it here and it now works in my environment. The regression tests pass and the feature seems to work as intended, but it seems to me that it needs more test cases and documentation. For the tests, I believe at least we need the ambiguous case given upthread, so that we can ensure compatibility is kept. For the documentation, it should describe the name resolution rule, as stated in the patch comment.

Aside from that, I wondered at first what happens if the function is schema-qualified. Say,

CREATE FUNCTION s.f(a int) RETURNS int AS $$ SELECT b FROM t WHERE a = s.f.a $$ LANGUAGE sql;

It actually errors out, since a function-name-qualified parameter reference only accepts the function name without a schema, but it looked weird to me at first. No better idea from me at the moment, though. I mark this Waiting on Author for now.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Publish checkpoint timing and sync files summary data to pg_stat_bgwriter
On Sun, Jan 15, 2012 at 10:46 PM, Greg Smith g...@2ndquadrant.com wrote:
> On 01/16/2012 01:28 AM, Greg Smith wrote:
>> - I can't tell for sure if this is working properly when log_checkpoints is off. This now collects checkpoint end time data in all cases, whereas before it ignored that work if log_checkpoints was off.
> ...and there's at least one I missed located already: inside of md.c. I'd forgotten how many spots where timing calls are optimized out are floating around this code path.

I think the CF app page for this patch has a broken link due to a wrong message-id.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Patch: Allow SQL-language functions to reference parameters by parameter name
On Sat, Jan 14, 2012 at 8:10 AM, Matthew Draper matt...@trebex.net wrote:
> I just remembered to make time to advance this from WIP to proposed patch this week... and then worked out I'm rudely dropping it into the last commitfest at the last minute. :/

The patch applies clean against master but compiles with warnings:

functions.c: In function ‘prepare_sql_fn_parse_info’:
functions.c:212: warning: unused variable ‘argnum’
functions.c: In function ‘sql_fn_post_column_ref’:
functions.c:341: warning: implicit declaration of function ‘ParseFuncOrColumn’
functions.c:345: warning: assignment makes pointer from integer without a cast

Then I ran make check but hit a bunch of crashes. Looking closer, I found that the FieldSelect returned from ParseFuncOrColumn is truncated to a 32-bit pointer, and subsequent operations on it are broken. An unnecessary cltq is inserted after the call:

0x0001001d8288 sql_fn_post_column_ref+748: mov $0x0,%eax
0x0001001d828d sql_fn_post_column_ref+753: callq 0x100132f75 ParseFuncOrColumn
0x0001001d8292 sql_fn_post_column_ref+758: cltq
0x0001001d8294 sql_fn_post_column_ref+760: mov %rax,-0x48(%rbp)

My environment:

10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:32:41 PDT 2011; root:xnu-1504.15.3~1/RELEASE_X86_64 x86_64

$ gcc -v
Using built-in specs.
Target: i686-apple-darwin10
Configured with: /var/tmp/gcc/gcc-5666.3~6/src/configure --disable-checking --enable-werror --prefix=/usr --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin10 --program-prefix=i686-apple-darwin10- --host=x86_64-apple-darwin10 --target=i686-apple-darwin10 --with-gxx-include-dir=/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)

(Now it occurred to me that forgetting the #include of parse_func.h might cause this breakage..., so I'll fix it here and continue to test, but if you'll fix it yourself, let me know)

Regards,
-- Hitoshi Harada
Re: [HACKERS] Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers
On Sat, Jan 14, 2012 at 6:48 AM, Robert Haas robertmh...@gmail.com wrote:
> On Sat, Jan 14, 2012 at 5:25 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
>> The patch looks OK, though I wonder if we could have a way to release the lock on the namespace well before the end of the transaction.
> Well, that would kind of miss the point, wouldn't it? I mean, the race is that the process dropping the schema might not see the newly created object. The only way to prevent that is to hold some sort of lock until the newly created object is visible, which doesn't happen until commit.
>> I know it's a limited situation, though. Maybe the right way would be to check the namespace at the end of the transaction if any object was created, rather than locking it.
> And then what? If you find that you created an object in a namespace that's been subsequently dropped, what do you do about it? As far as I can see, your only real choice would be to roll back the transaction that uncovers the problem, which is likely to produce more rollbacks than just letting the deadlock detector sort it out.

Right. I thought this patch introduced the lock on the schema for SELECT, but actually we'd already been locking the schema on SELECT before the patch, so with it the behavior becomes consistent between SELECT and CREATE. And after looking more, I agree that my point was moot.

Thanks,
-- Hitoshi Harada
Re: [HACKERS] Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers
On Fri, Jan 13, 2012 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 29, 2011 at 10:10 AM, Robert Haas robertmh...@gmail.com wrote: I have plans to try to improve this, but it's one of those things that I care about more than the people who write the checks do, so it hasn't quite gotten to the top of the priority list yet. All right... I have a patch that I think fixes this, at least so far as relations are concerned. I rewrote RangeVarGetAndCheckCreationNamespace() extensively, did surgery on its callers, and then modified CREATE OR REPLACE VIEW and ALTER relkind .. SET SCHEMA to make use of it rather than doing things as they did before. So this patch prevents (1) concurrently dropping a schema in which a new relation is being created, (2) concurrently dropping a schema into which an existing relation is being moved, and (3) using CREATE OR REPLACE VIEW to attempt to obtain a lock on a relation you don't own (the command would eventually fail, of course, but if there were, say, an outstanding AccessShareLock on the relation, you'd queue up for the lock and thus prevent any further locks from being granted, despite your lack of any rights on the target).

The patch looks ok, though I wonder if we could have a way to release the lock on the namespace well before the end of the transaction. Since the lock is held the whole time, the possibility of deadlock is now greater. Say:

-- tx1
BEGIN;
SELECT * FROM s.x;
DROP SCHEMA t;

-- tx2
BEGIN;
SELECT * FROM t.y;
DROP SCHEMA s;

I know it's a limited situation, though. Maybe the right way would be to check the namespace at the end of the transaction if any object has been created, rather than locking it.

It doesn't fix any of the non-relation object types. That would be nice to do, but I think it's material for a separate patch.

I took a quick look under src/include/catalog, and the objects that have a namespace are:
- collation
- constraint
- conversion
- extension
- operator
- operator class
- operator family
- function (proc)
- ts_config
- ts_dict
- ts_parser
- ts_template
- (what's missing?)

I agree with you that it's not worth doing everything, but functions are nice to have. I don't mind if we don't have it for the other objects.

Thanks,
-- Hitoshi Harada
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
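The tx1/tx2 scenario above is the classic opposite-order locking hazard: once namespace locks are held until commit, two transactions that touch schemas s and t in opposite orders can each block waiting for the other. A toy sketch (Python locks standing in for schema locks, not PostgreSQL code) of the textbook mitigation — acquiring locks in one global order. PostgreSQL does not do this for namespace locks; as Robert notes upthread, it lets the deadlock detector sort out the cycle instead:

```python
import threading

lock_s = threading.Lock()                       # stand-in for the lock on schema s
lock_t = threading.Lock()                       # stand-in for the lock on schema t
GLOBAL_ORDER = {id(lock_s): 0, id(lock_t): 1}   # one agreed acquisition order

def transaction(name, locks, log):
    # Acquire every needed lock in the global order, whatever order the
    # transaction happens to name them in; this makes the tx1/tx2 cycle
    # impossible, because no one holds a later lock while waiting on an
    # earlier one.
    ordered = sorted(locks, key=lambda l: GLOBAL_ORDER[id(l)])
    for l in ordered:
        l.acquire()
    try:
        log.append(name)                        # the transaction body
    finally:
        for l in reversed(ordered):
            l.release()

log = []
t1 = threading.Thread(target=transaction, args=("tx1", [lock_s, lock_t], log))
t2 = threading.Thread(target=transaction, args=("tx2", [lock_t, lock_s], log))
t1.start(); t2.start(); t1.join(); t2.join()
print(sorted(log))                              # both finish: ['tx1', 'tx2']
```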
Re: [HACKERS] Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers
On Sat, Jan 14, 2012 at 2:25 AM, Hitoshi Harada umi.tan...@gmail.com wrote: On Fri, Jan 13, 2012 at 2:05 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 29, 2011 at 10:10 AM, Robert Haas robertmh...@gmail.com wrote: I have plans to try to improve this, but it's one of those things that I care about more than the people who write the checks do, so it hasn't quite gotten to the top of the priority list yet. All right... I have a patch that I think fixes this, at least so far as relations are concerned. I rewrote RangeVarGetAndCheckCreationNamespace() extensively, did surgery on its callers, and then modified CREATE OR REPLACE VIEW and ALTER relkind .. SET SCHEMA to make use of it rather than doing things as they did before. So this patch prevents (1) concurrently dropping a schema in which a new relation is being created, (2) concurrently dropping a schema into which an existing relation is being moved, and (3) using CREATE OR REPLACE VIEW to attempt to obtain a lock on a relation you don't own (the command would eventually fail, of course, but if there were, say, an outstanding AccessShareLock on the relation, you'd queue up for the lock and thus prevent any further locks from being granted, despite your lack of any rights on the target). The patch looks ok, though I wonder if we could have a way to release the lock on the namespace well before the end of the transaction. Since the lock is held the whole time, the possibility of deadlock is now greater. Say:

-- tx1
BEGIN;
SELECT * FROM s.x;
DROP SCHEMA t;

-- tx2
BEGIN;
SELECT * FROM t.y;
DROP SCHEMA s;

Although I wrote "I know it's a limited situation, though. Maybe the right way would be to check the namespace at the end of the transaction if any object is created, rather than locking it.", what's actually surprising to me is that even SELECT holds a lock on the namespace until the end of the transaction. The ideal way would be to lock only on CREATE or other DDL.
Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reprise: pretty print viewdefs
On Tue, Dec 27, 2011 at 8:02 AM, Andrew Dunstan and...@dunslane.net wrote: Updated, with docs and tests. Since the docs mark the versions of pg_get_viewdef() that take text as the first param as deprecated, I removed that variant of the new flavor. I left adding extra psql support to another day - I think this already does a good job of cleaning it up without any extra adjustments. I'll add this to the upcoming commitfest.

I've looked at (actually tested with folks in reviewfest, so not so in detail :P) the patch. It applies with some hunk offsets and compiles cleanly; documentation and tests are prepared. make install passed without failures.

$ patch -p1 < ~/Downloads/viewdef.patch
patching file doc/src/sgml/func.sgml
Hunk #1 succeeded at 13736 (offset -22 lines).
Hunk #2 succeeded at 13747 (offset -22 lines).
patching file src/backend/utils/adt/ruleutils.c
patching file src/include/catalog/pg_proc.h
Hunk #1 succeeded at 3672 (offset -43 lines).
patching file src/include/utils/builtins.h
Hunk #1 succeeded at 604 (offset -20 lines).
patching file src/test/regress/expected/polymorphism.out
Hunk #1 succeeded at 1360 (offset -21 lines).
patching file src/test/regress/expected/rules.out
patching file src/test/regress/expected/with.out
patching file src/test/regress/sql/rules.sql

The code change is only in ruleutils.c, which looks sane to me, apart from some trailing white space. I wonder if it's worth working on more than the target list, namely the FROM clause and expressions. For example, pg_get_viewdef('pg_stat_activity', true):

# select pg_get_viewdef('pg_stat_activity'::regclass, true);
 pg_get_viewdef
--
 SELECT s.datid, d.datname, s.procpid, s.usesysid, u.rolname AS usename, +
 s.application_name, s.client_addr, s.client_hostname, s.client_port, +
 s.backend_start, s.xact_start, s.query_start, s.waiting, s.current_query +
 FROM pg_database d, pg_stat_get_activity(NULL::integer) s(datid, procpid, usesysid, application_name, current_query, waiting, xact_start, query_start, backend_start, client_addr, client_hostname, client_port), pg_authid u+
 WHERE s.datid = d.oid AND s.usesysid = u.oid;
(1 row)

It doesn't help much, although the SELECT list is wrapped within 80 characters. For expressions, I mean:

=# select pg_get_viewdef('v', true);
 pg_get_viewdef
-
 SELECT a.a, a.a - 1 AS b, a.a - 2 AS c, a.a - 3 AS d, +
 a.a / 1 AS long_long_name, a.a * 1000 AS bcd, +
 CASE +
 WHEN a.a = 1 THEN 1 +
 WHEN (a.a + a.a + a.a + a.a + a.a + a.a + a.a + a.a + a.a + a.a + a.a + a.a) = 2 THEN 2+
 ELSE 10 +
 END AS what_about_case_expression +
 FROM generate_series(1, 100) a(a);
(1 row)

So my conclusion is that it's better than nothing, but we could do a better job here. From a timeline perspective, it'd be ok to apply this patch and improve it further in 9.3+.

Thanks,
-- Hitoshi Harada
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] JSON for PG 9.2
On Tue, Dec 13, 2011 at 1:13 PM, Merlin Moncure mmonc...@gmail.com wrote: Mozilla SpiderMonkey seems like a good fit: it compiles to a dependency free .so, has excellent platform support, has a stable C API, and while it's C++ internally makes no use of exceptions (in fact, it turns them off in the c++ compiler). ISTM to be a suitable foundation for an external module, 'in core' parser, pl, or anything really. When I started to think about PL/js, I compared the three of SpiderMonkey, SquirrelFish, and V8. SpiderMonkey at that time (around 2009) was neither fast nor small in memory, while what you raise, as well as its advanced features like JS1.7 (pure yield!), was attractive. Also, SpiderMonkey was a little harder to build on an arbitrary platform (including Windows) than V8. SquirrelFish was the fastest of the three, but it's tied to WebKit and also hard to build by itself. Dunno how they've changed since then. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Feature proposal: www_fdw
On Sun, Nov 27, 2011 at 10:28 AM, Alexander Soudakov cyga...@gmail.com wrote: Hello. I finished developing www_fdw: https://github.com/cyga/www_fdw/ It has docs/examples: https://github.com/cyga/www_fdw/wiki I haven't uploaded it to pgxn or pgfoundry yet. I want to ask for the following 2 things: 1. testing from the community; 2. how can I add my extension to the official fdw list: http://wiki.postgresql.org/wiki/Foreign_data_wrappers Is there any procedure for it? You need a community login to edit the wiki. Go to https://www.postgresql.org/account/login/?next=/account/ to sign up. Now that you have uploaded it to PGXN, I hope many people will test it. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] (PATCH) Adding CORRESPONDING to Set Operations
On Mon, Nov 14, 2011 at 6:09 AM, Kerem Kat kerem...@gmail.com wrote: On Mon, Nov 14, 2011 at 15:32, Tom Lane t...@sss.pgh.pa.us wrote: Kerem Kat kerem...@gmail.com writes: Corresponding is currently implemented in the parse/analyze phase. If it were to be implemented in the planning phase, explain output would likely be as you expect it to be. It's already been pointed out to you that doing this at parse time is unacceptable, because of the implications for reverse-listing of rules (views). regards, tom lane I am well aware of that thank you.

I took a quick look at the patch and found some miscellaneous points, including:
- don't use // style comments
- be consistent about spaces around parentheses, e.g. in if and foreach
- ereport should include an error position wherever possible, especially for syntax errors
- I'm not convinced that a new corresponding_union.sql test file should be added; I'd prefer to include the new tests in union.sql
- CORRESPONDING BY should take a column name list, not an expression list; it's not legal to say CORRESPONDING BY (1 + 1)
- the column list rule should be presented in the documentation, too
- I don't see why you call checkWellFormedRecursionWalker on the corresponding clause

And more than the points above, Tom's point is the biggest blocker. I'd suggest reworking it so that the target list of the Query of the RangeTblEntry at the top of the tree has only the columns that match the filter. That way, I guess you can keep the original information as well as the filtered top-most target list. Eventually you need to work on the planner, too. Though I've not read all of the patch, the design rework should be done first. I'll mark this as Waiting on Author. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
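For readers unfamiliar with the feature under review: CORRESPONDING restricts a set operation to the column names common to both inputs (or, with CORRESPONDING BY, to a named subset of them) before applying the operation. A rough sketch of the UNION case — the helper below is hypothetical, not code from the patch, and it ignores SQL's UNION ALL/bag semantics:

```python
# Sketch of UNION CORRESPONDING [BY (...)]: project each input onto
# the shared column names, then take the ordinary set union.
def union_corresponding(left_cols, left_rows, right_cols, right_rows, by=None):
    common = [c for c in left_cols if c in right_cols]
    if by is not None:                       # CORRESPONDING BY (col, ...)
        common = [c for c in by if c in common]
    def project(cols, rows):
        idx = [cols.index(c) for c in common]
        return {tuple(r[i] for i in idx) for r in rows}
    return common, project(left_cols, left_rows) | project(right_cols, right_rows)

cols, rows = union_corresponding(["a", "b"], {(1, 2), (3, 4)},
                                 ["b", "c"], {(4, 9)})
print(cols, sorted(rows))    # ['b'] [(2,), (4,)]
```

This also shows why a column name list (not an expression list) is the right grammar for CORRESPONDING BY: the clause only ever selects from the shared names.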
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2011/10/25 Shigeru Hanada shigeru.han...@gmail.com:

Connection management
=====================
The pgsql_fdw establishes a new connection when a foreign server is first accessed in the local session. The established connection is shared between all foreign scans in the local query, and even between scans in following queries. Connections are discarded when the current transaction aborts, so that an unexpected failure won't cause a connection leak. This is implemented with the resource owner mechanism.

I have a doubt here about sharing one connection per server. What if there are simultaneous scans in the same plan? Say:

- Nested Loop
  - Foreign Scan to table T1 on server A
  - Foreign Scan to table T2 on server A

Okay, you are thinking about Foreign Join, so the example above is too simple. But it is always possible to execute such a query if the foreign scan nodes are far apart, isn't it? As far as I can see from your explanation, scan T1 and scan T2 share the same connection. Now the join node scans one row from the left (T1) while asking for rows from the right (T2), without fetching all the rows from the left. If T2 sends a request to server A, the connection's pending result (for T1) is discarded. Do I understand correctly?

Regards,
-- Hitoshi Harada
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
On Sat, Oct 29, 2011 at 8:13 AM, Tom Lane t...@sss.pgh.pa.us wrote: Hitoshi Harada umi.tan...@gmail.com writes: I have a doubt here about sharing one connection per server. What if there are simultaneous scans in the same plan? Say:

- Nested Loop
  - Foreign Scan to table T1 on server A
  - Foreign Scan to table T2 on server A

Okay, you are thinking about Foreign Join, so the example above is too simple. But it is always possible to execute such a query if the foreign scan nodes are far apart, isn't it? As far as I can see from your explanation, scan T1 and scan T2 share the same connection. Now the join node scans one row from the left (T1) while asking for rows from the right (T2), without fetching all the rows from the left. If T2 sends a request to server A, the connection's pending result (for T1) is discarded. Do I understand correctly? I have not looked at the code, but ISTM the way that this has to work is that you set up a portal for each active scan. Then you can fetch a few rows at a time from any one of them. Hmm, true. Looking back at the original proposal (I didn't look at the code either), there seems to be a cursor mode. ISTM it is hard for the fdw to know what the whole plan tree looks like, so do we then always use a cursor, regardless of the estimated row counts? I haven't had much experience with cursors myself, but are they as efficient as non-cursor execution? Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
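Harada's concern can be modeled without a server: if all scans on one server share a single connection, and issuing a new query discards the previous pending result set, an interleaved nested loop breaks — which is exactly what Tom's one-portal-per-scan suggestion avoids. A toy model (hypothetical class, not pgsql_fdw or libpq code):

```python
# Toy model: one shared connection per server, where issuing a new
# query discards any pending result set, unless each scan gets its
# own portal/cursor.
class SharedConnection:
    def __init__(self):
        self.pending = None                   # (scan_name, iterator of rows)

    def execute(self, scan_name, rows):
        if self.pending is not None:
            print("discarding pending result of", self.pending[0])
        self.pending = (scan_name, iter(rows))

    def fetch(self, scan_name):
        name, it = self.pending
        if name != scan_name:
            raise RuntimeError(scan_name + "'s result was discarded")
        return next(it)

conn = SharedConnection()                     # one connection to server A
conn.execute("T1", ["t1-row1", "t1-row2"])
conn.fetch("T1")                              # nested loop pulls one outer row
conn.execute("T2", ["t2-row1"])               # inner scan reuses the connection...
try:
    conn.fetch("T1")                          # ...and T1's remaining rows are gone
except RuntimeError as e:
    print("oops:", e)
```

With one portal per active scan, each scan keeps its own pending result and the interleaving works.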
Re: [HACKERS] Underspecified window queries in regression tests
2011/10/17 Tom Lane t...@sss.pgh.pa.us: Hitoshi Harada umi.tan...@gmail.com writes: 2011/10/15 Tom Lane t...@sss.pgh.pa.us: I can't recall whether there was some good reason for underspecifying these test queries. It looks like all the problematic ones were added in commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 Extend the set of frame options supported for window functions, which means it was either me or Hitoshi-san who wrote them that way, but memory is not serving this afternoon. I don't remember if I wrote that part or not, but I'd like to add explicit ORDER BY to the test cases. It doesn't appear that any particular reason stopped us from specifying it. So +1 for adding the clauses. I looked at this more closely and realized that the reason for doing it like that was to test window frames defined using ROWS rather than RANGE. If we fully specify the window function's input ordering then there's no very interesting distinction between the two, because no rows will have any peers. So adding ORDER BY would in fact reduce the scope of the tests. At this point I'm inclined to leave it alone. Maybe we could think of some other test cases (perhaps using some other function than SUM) which would both exercise the difference between RANGE and ROWS mode, and not be sensitive to the detailed input ordering. But I doubt it's really worth the trouble. Ah, you meant the ORDER BY in the window specification (OVER clause); I thought you meant the query's ORDER BY. Yes, it matters in the RANGE case, and we don't have rich frame support for RANGE (like n PRECEDING ...), so the cases where ORDER BY affects the result are limited. I agree with leaving it alone. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
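Tom's point — that fully specifying the input ordering leaves no peer rows, and without peers RANGE and ROWS frames coincide — can be demonstrated with any window-function implementation. A sketch using Python's bundled sqlite3 (assuming an SQLite new enough for window functions, 3.25+): with ORDER BY on a non-unique key, RANGE gives all peers the same running sum, while ROWS is strictly cumulative.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(k TEXT, v INT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [("a", 1), ("a", 2), ("b", 3)])

def sums(frame):
    # ORDER BY k is underspecified: the two 'a' rows are peers.
    q = "SELECT sum(v) OVER (ORDER BY k " + frame + ") FROM t"
    return sorted(r[0] for r in con.execute(q))

range_sums = sums("RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW")
rows_sums = sums("ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW")
print(range_sums)   # both 'a' peers get the same sum: [3, 3, 6]
print(rows_sums)    # strictly cumulative; how the sum splits between the
                    # peers depends on the (unspecified) order within the group
```

Adding v to the ORDER BY removes the peers and the two frame types return identical results — which is why the regression tests deliberately leave the ordering underspecified.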
Re: [HACKERS] Underspecified window queries in regression tests
2011/10/17 Greg Stark st...@mit.edu: On Fri, Oct 14, 2011 at 9:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: We could hack around this by adding more columns to the result so that an index-only scan doesn't work. But I wonder whether it wouldn't be smarter to add ORDER BY clauses to the window function calls. I've been known to argue against adding just-in-case ORDER BYs to the regression tests in the past; but these cases bother me more because a plan change will not just rearrange the result rows but change their contents, making it really difficult to verify that nothing's seriously wrong. I'm not sure if it applies to this case but I recall I was recently running queries on Oracle that included window functions and it wouldn't even let me run them without ORDER BY clauses in the window definition. I don't know if it cleverly determines that the ORDER BY will change the results or if Oracle just requires ORDER BY on all window definitions or what. AFAIK, the current standard doesn't say clearly whether all or some window functions require an ORDER BY clause in the window specification. Some window functions like rank and row_number are meaningless if it is omitted, so some implementations don't allow it to be omitted. And I believe Oracle implemented it before the standard, so that'd be why the details differ from the spec. We designed it per spec, and omitting the clause doesn't violate any part of the standard. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Underspecified window queries in regression tests
2011/10/15 Tom Lane t...@sss.pgh.pa.us: I can't recall whether there was some good reason for underspecifying these test queries. It looks like all the problematic ones were added in commit ec4be2ee6827b6bd85e0813c7a8993cfbb0e6fa7 Extend the set of frame options supported for window functions, which means it was either me or Hitoshi-san who wrote them that way, but memory is not serving this afternoon. I don't remember if I wrote that part or not, but I'd like to add explicit ORDER BY to the test cases. It doesn't appear that any particular reason stopped us from specifying it. So +1 for adding the clauses. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Questions and experiences writing a Foreign Data Wrapper
2011/8/26 Albe Laurenz laurenz.a...@wien.gv.at: I wrote: I wrote an FDW for Oracle to a) learn some server coding and b) see how well the FDW API works for me. I have released the software on PgFoundry: http://oracle-fdw.projects.postgresql.org/ Would it make sense to mention that in chapter 5.10 of the documentation? Let's share it on PGXN! There are already three FDWs, and I'm going to add one more. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Short document fix
In the CREATE DOMAIN reference page of the current HEAD, it says:

---
CREATE DOMAIN us_postal_code AS TEXT
CHECK(
   VALUE ~ '^\\d{5}$'
OR VALUE ~ '^\\d{5}-\\d{4}$'
);
---

but I believe it should conform to the standard string style now that the default is standard_conforming_strings = on. I didn't grep to see if there are other pages like this.

Regards,
-- Hitoshi Harada
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
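The substance of the fix: with standard_conforming_strings = on, a backslash in an ordinary string literal is no longer an escape character, so the patterns should be written with single backslashes in SQL ('^\d{5}$' and '^\d{5}-\d{4}$'); the doubled form would look for a literal backslash before the d. The intended patterns, checked here with Python's re as a stand-in for POSIX regex matching:

```python
import re

# The postal-code patterns the CREATE DOMAIN example intends; under
# standard_conforming_strings = on they are written in SQL with
# single backslashes.
five = re.compile(r"^\d{5}$")
nine = re.compile(r"^\d{5}-\d{4}$")

print(bool(five.match("12345")))        # True
print(bool(nine.match("12345-6789")))   # True
print(bool(five.match("1234a")))        # False
```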
[HACKERS] Re: Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))
2011/7/27 Yeb Havinga yebhavi...@gmail.com: On 2011-07-22 17:35, Hitoshi Harada wrote: 2011/7/23 Yeb Havinga yebhavi...@gmail.com: A few days ago I read Tomas Vondra's blog post about dss tpc-h queries on PostgreSQL at http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in which he showed how to manually pull up a dss subquery to get a large speed up. Initially I thought: cool, this is probably now handled by Hitoshi's patch, but it turns out the subquery type in the dss query is different. The original and rewritten queries are below. The debug_print_plan output shows the subquery is called from an opexpr (l_quantity, subquery output) and the sublink type is EXPR_SUBLINK. Looking at the source code, pull_up_sublink only considers ANY and EXISTS sublinks. I'm wondering if this could be expanded to deal with EXPR sublinks. Clearly in the example Tomas has given this can be done. I'm wondering if there are show stoppers that prevent this from being possible in the general case, but can't think of any, other than the case of a sublink returning NULL while the opexpr is part of a larger OR expression or IS NULL; in which case it should not be pulled up, or perhaps it could be pulled up as an outer join. Thoughts? Good catch. I was not aware of the sublink case, so I'm not sure if it is possible, but I believe it will be worth modifying the optimizer to handle them in the same way. Since my latest proposal is based on parameterized NestLoop, the first step is how to transform the sublink expression into a join. I bet there are chances in simple cases, since we have the Semi/Anti Join technique. On the other hand, those pseudo-join types easily fail to be transformed into joins in cases like the above, where there is a filter clause other than the join qual expression. If the TPC benchmark can be sped up, that's a good use case which, as you pointed out, I was missing.
Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pull up aggregate sublink (was: Parameterized aggregate subquery (was: Pull up aggregate subquery))
2011/7/27 Tom Lane t...@sss.pgh.pa.us: Yeb Havinga yebhavi...@gmail.com writes: A few days ago I read Tomas Vondra's blog post about dss tpc-h queries on PostgreSQL at http://fuzzy.cz/en/articles/dss-tpc-h-benchmark-with-postgresql/ - in which he showed how to manually pull up a dss subquery to get a large speed up. Initially I thought: cool, this is probably now handled by Hitoshi's patch, but it turns out the subquery type in the dss query is different. Actually, I believe this example is the exact opposite of the transformation Hitoshi proposes. Tomas was manually replacing an aggregated subquery by a reference to a grouped table, which can be a win if the subquery would be executed enough times to amortize calculation of the grouped table over all the groups (some of which might never be demanded by the outer query). Hitoshi was talking about avoiding calculations of grouped-table elements that we don't need, which would be a win in different cases. Or at least that was the thrust of his original proposal; I'm not sure where the patch went since then. My first proposal, which was about pulling up an aggregate-like sublink expression, is the exact opposite of this (Tomas pushed the sublink expression down into the join subquery). But my latest proposal is built on parameterized NestLoop, so I think my latest patch might help something for the *second* query. Actually the problem is the same: we want to reduce grouping work that is not interesting to the final output by filtering with the other relations' expressions. In this case, if the joined lineitem-part relation has very few rows because of the WHERE conditions (p_brand, p_container), we don't want to calculate the avg over the huge lineitem, because we know almost all of the avg results will not be in the upper result. However, the current optimizer cannot pass the upper query's conditions (like the fact that it will have only a few rows) down to the lower aggregate query.

This leads me to think that we need to represent both cases as the same sort of query and make a cost-based decision as to which way to go. Thinking of it as a pull-up or push-down transformation is the wrong approach because those sorts of transformations are done too early to be able to use cost comparisons. Wrapping up my thoughts above and reading this paragraph, it might be separate work to make the sublink expression look the same as a join. But what we want to solve is the same goal, I think. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
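The two shapes being compared — a correlated aggregate (recompute the group's avg for every outer row) versus a join against a pre-grouped table (compute every group's avg once) — always return the same answer; the question is purely one of cost, which is Tom's argument for deciding it in the planner. A toy sketch with made-up data, in the spirit of TPC-H Q17's l_quantity < 0.5 * avg condition:

```python
from collections import defaultdict

lineitem = [(1, 10), (1, 50), (2, 5), (2, 15), (3, 100)]  # (partkey, quantity)
parts_of_interest = {1, 2}      # stand-in for the p_brand/p_container filter

# Correlated form: re-derive avg(quantity) for each outer row's partkey.
def avg_for(pk):
    qs = [q for k, q in lineitem if k == pk]
    return sum(qs) / len(qs)

correlated = [(k, q) for k, q in lineitem
              if k in parts_of_interest and q < 0.5 * avg_for(k)]

# Grouped-table form: compute all group avgs once, then join and filter.
acc = defaultdict(lambda: [0, 0])
for k, q in lineitem:
    acc[k][0] += q
    acc[k][1] += 1
grouped_avg = {k: s / n for k, (s, n) in acc.items()}
pushed = [(k, q) for k, q in lineitem
          if k in parts_of_interest and q < 0.5 * grouped_avg[k]]

print(correlated == pushed)     # True: same answer, different cost profile
```

Note the grouped form computes the partkey-3 average that the filter never asks for — cheap here, but exactly the waste Harada wants to avoid when the uninteresting groups dominate.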
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/7/22 Yeb Havinga yebhavi...@gmail.com: On 2011-07-02 10:02, Hitoshi Harada wrote: Although I still need to think about suitable regression test case, the patch itself can be reviewed again. You may want to try some additional tests as you imagine after finding my test case gets quicker. Hello Hitoshi-san, Hi, I double checked that I had applied the patch (git diff shows the patch), installed and restarted postgres. The database is a fresh created database with no edits in postgresql.conf. :( I updated the patch. Could you try attached once more? The issafe switch seems wrong. Thanks, -- Hitoshi Harada aggjoin-20110722.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/7/23 Yeb Havinga yebhavi...@gmail.com: On 2011-07-22 16:17, Hitoshi Harada wrote: :( I updated the patch. Could you try the attached once more? The issafe switch seems wrong. Works like a charm :-). However, now there is always a copyObject of a subquery even when the subquery is not safe for qual pushdown. The problem with the previous issafe was that it was only assigned for rel->baserestrictinfo != NIL. If it is assigned before the if statement, it still works. See the attached patch, which avoids the subquery copy for unsafe subqueries, and also exits best_inner_subqueryscan before the palloc of differenttypes in the case of unsafe queries. Ah, yeah, right. Too quick a fix bloated my brain :P Thanks for testing! I'll check it more. -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/7/5 Hitoshi Harada umi.tan...@gmail.com: 2011/7/5 Yeb Havinga yebhavi...@gmail.com: Hello Hitosh, list, Attached is revised version. I failed to attached the patch. I'm trying again. I'm currently unable to test, since on holiday. I'm happy to continue testing once returned but it may not be within the bounds of the current commitfest, sorry. Thanks for replying. Regarding the time remained until the end of this commitfest, I'd think we should mark this item Returned with Feedback if no objection appears. I will be very happy if Yeb tests my latest patch after he comes back. I'll make up my mind around regression test. It seems that Yeb marked Returned with Feedback. Thanks for the review again. Still, I'd appreciate if further review is done on my latest patch. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: add GiST support for BOX @> POINT queries
2011/6/19 Hitoshi Harada umi.tan...@gmail.com: 2011/6/17 Andrew Tipton andrew.t.tip...@gmail.com: At this point I'm a bit lost -- while pg_amop.h has plenty of examples of crosstype comparison operators for btree index methods, there are none for GiST. Is GiST somehow a special case in this regard? It was I that was lost. As Tom mentioned, GiST indexes have records in pg_amop in their specialized way. I found gist_point_consistent has some kind of hack for that, and the pg_amop records for point_ops have multiple crosstypes for that. So, if I understand correctly, your first approach of modifying gist_box_consistent was the right way, although trivial issues should be fixed. Also, you may want to follow point_ops if you are worried about whether the counterpart operator of the commutator should be registered or not. Looking around those mechanisms, it occurred to me that you mentioned only box @> point. Why don't you add circle @> point and poly @> point as well as box? Is that hard? It looks like the time to wrap up. I marked Returned with Feedback on this patch, since a response from the author has not come for a while. You may think the fix was pretty easy and the patch small, but a more general approach was preferred, I guess. Looking forward to seeing it in better shape next time! Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/7/5 Yeb Havinga yebhavi...@gmail.com: Hello Hitosh, list, Attached is revised version. I failed to attached the patch. I'm trying again. I'm currently unable to test, since on holiday. I'm happy to continue testing once returned but it may not be within the bounds of the current commitfest, sorry. Thanks for replying. Regarding the time remained until the end of this commitfest, I'd think we should mark this item Returned with Feedback if no objection appears. I will be very happy if Yeb tests my latest patch after he comes back. I'll make up my mind around regression test. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/6/29 Yeb Havinga yebhavi...@gmail.com: On 2011-06-17 09:54, Hitoshi Harada wrote: While reviewing the gist/box patch, I found some planner APIs that can replace parts in my patch. Also, comments in includes wasn't updated appropriately. Revised patch attached. Hello Hitoshi-san, I read your latest patch implementing parameterizing subquery scans. Attached is revised version. 1) In the email from june 9 with the patch You wrote: While IndexScan is simple since its information like costs are well known by the base relation, SubqueryScan should re-plan its Query to gain that, which is expensive. Initial concerns I had were caused by misinterpreting 'replanning' as: for each outer tuple, replan the subquery (it sounds a bit like 'ReScan'). Though the general comments in the patch are helpful, I think it would be an improvement to describe why subqueries are planned more than once, i.e. something like best_inner_subqueryscan Try to find a better subqueryscan path and its associated plan for each join clause that can be pushed down, in addition to the path that is already calculated by set_subquery_pathlist. I changed comments around set_subquery_pathlist and best_inner_subqueryscan. 2) Since 'subquery_is_pushdown_safe' is invariant under join clause push down, it might be possible to have it called only once in set_subquery_pathlist, i.e. by only attaching rel-preprocessed_subquery if the subquery_is_pushdown_safe. I modified as you suggested. 3) /* * set_subquery_pathlist * Build the (single) access path for a subquery RTE */ This unchanged comment is still correct, but 'the (single) access path' might have become a bit misleading now there are multiple paths possible, though not by set_subquery_pathlist. As noted #1. 4) somewhere in the patch s/regsitered/registered/ Fixed. 5) Regression tests are missing; I think at this point they'd aid in speeding up development/test cycles. I'm still thinking about it. 
I can add a complex test, but the concept of the regression tests is to focus on small, simple cases; I don't want pg_regress to take much longer than before.

6) Before patching Postgres, I could execute the test with the size_l and size_m tables; after patching against current git HEAD (the patch applies without errors), I get the following error when running the example query: ERROR: plan should not reference subplan's variable I get the same error with the version from June 9 with current git HEAD. Fixed. Some variable initialization was wrong.

7) Since both set_subquery_pathlist and best_inner_subqueryscan push down clauses, as well as add a path and subplan, could a generalized function be made to support both, to reduce duplicate code? Not touched, as answered before.

Although I still need to think about a suitable regression test case, the patch itself can be reviewed again. You may want to try some additional tests as you imagine after finding my test case gets quicker. Thanks, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/7/2 Hitoshi Harada umi.tan...@gmail.com: 2011/6/29 Yeb Havinga yebhavi...@gmail.com: On 2011-06-17 09:54, Hitoshi Harada wrote: While reviewing the gist/box patch, I found some planner APIs that can replace parts in my patch. Also, comments in includes weren't updated appropriately. Revised patch attached. Hello Hitoshi-san, I read your latest patch implementing parameterizing subquery scans. Attached is the revised version. I failed to attach the patch. I'm trying again.

1) In the email from June 9 with the patch you wrote: While IndexScan is simple since its information like costs are well known by the base relation, SubqueryScan should re-plan its Query to gain that, which is expensive. Initial concerns I had were caused by misinterpreting 'replanning' as: for each outer tuple, replan the subquery (it sounds a bit like 'ReScan'). Though the general comments in the patch are helpful, I think it would be an improvement to describe why subqueries are planned more than once, i.e. something like: best_inner_subqueryscan -- Try to find a better subqueryscan path and its associated plan for each join clause that can be pushed down, in addition to the path that is already calculated by set_subquery_pathlist. I changed comments around set_subquery_pathlist and best_inner_subqueryscan.

2) Since 'subquery_is_pushdown_safe' is invariant under join clause push down, it might be possible to have it called only once in set_subquery_pathlist, i.e. by only attaching rel->preprocessed_subquery if the subquery is pushdown-safe. I modified as you suggested.

3)
/*
 * set_subquery_pathlist
 *	  Build the (single) access path for a subquery RTE
 */
This unchanged comment is still correct, but 'the (single) access path' might have become a bit misleading now that there are multiple paths possible, though not by set_subquery_pathlist. As noted in #1.

4) Somewhere in the patch: s/regsitered/registered/ Fixed.
5) Regression tests are missing; I think at this point they'd aid in speeding up development/test cycles. I'm still thinking about it. I can add a complex test, but the concept of the regression tests is to focus on small, simple cases; I don't want pg_regress to take much longer than before.

6) Before patching Postgres, I could execute the test with the size_l and size_m tables; after patching against current git HEAD (the patch applies without errors), I get the following error when running the example query: ERROR: plan should not reference subplan's variable I get the same error with the version from June 9 with current git HEAD. Fixed. Some variable initialization was wrong.

7) Since both set_subquery_pathlist and best_inner_subqueryscan push down clauses, as well as add a path and subplan, could a generalized function be made to support both, to reduce duplicate code? Not touched, as answered before.

Although I still need to think about a suitable regression test case, the patch itself can be reviewed again. You may want to try some additional tests as you imagine after finding my test case gets quicker.
Thanks,
--
Hitoshi Harada

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 681f5f8..039fd7f 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1557,6 +1557,17 @@ _outTidPath(StringInfo str, TidPath *node)
 }
 
 static void
+_outSubqueryPath(StringInfo str, SubqueryPath *node)
+{
+	WRITE_NODE_TYPE("SUBQUERYPATH");
+
+	_outPathInfo(str, (Path *) node);
+	WRITE_NODE_FIELD(subplan);
+	WRITE_NODE_FIELD(subrtable);
+	WRITE_NODE_FIELD(subrowmark);
+}
+
+static void
 _outForeignPath(StringInfo str, ForeignPath *node)
 {
 	WRITE_NODE_TYPE("FOREIGNPATH");
@@ -1738,9 +1749,6 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
 	WRITE_NODE_FIELD(indexlist);
 	WRITE_UINT_FIELD(pages);
 	WRITE_FLOAT_FIELD(tuples, "%.0f");
-	WRITE_NODE_FIELD(subplan);
-	WRITE_NODE_FIELD(subrtable);
-	WRITE_NODE_FIELD(subrowmark);
 	WRITE_NODE_FIELD(baserestrictinfo);
 	WRITE_NODE_FIELD(joininfo);
 	WRITE_BOOL_FIELD(has_eclass_joins);
@@ -2948,6 +2956,9 @@ _outNode(StringInfo str, void *obj)
 		case T_TidPath:
 			_outTidPath(str, obj);
 			break;
+		case T_SubqueryPath:
+			_outSubqueryPath(str, obj);
+			break;
 		case T_ForeignPath:
 			_outForeignPath(str, obj);
 			break;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 47ab08e..354a3e5 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -31,6 +31,7 @@
 #include "optimizer/planner.h"
 #include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
+#include "optimizer/subselect.h"
 #include "optimizer/var.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -677,6 +678,12 @@ has_multiple_baserels(PlannerInfo *root)
 /*
  * set_subquery_pathlist
  *	  Build the (single) access path for a subquery RTE
+ *
+ * Although we build only one access path
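[Editor's note: the shape of the new SubqueryPath node can be inferred from the WRITE_*_FIELD calls in _outSubqueryPath above. The struct below is a reconstruction for illustration only, not the patch's actual header change.]

```c
/* Reconstructed sketch, inferred from _outSubqueryPath(); the real
 * definition lives in the patch's header changes, which are not quoted
 * in this message. */
typedef struct SubqueryPath
{
	Path		path;			/* common Path fields */
	Plan	   *subplan;		/* plan for the (possibly parameterized) subquery */
	List	   *subrtable;		/* range table for the subplan */
	List	   *subrowmark;		/* PlanRowMarks for the subplan */
} SubqueryPath;
```

Note that the diff also removes these same three fields from _outRelOptInfo, which fits the patch's design: the subplan now hangs off each candidate path rather than off the relation, so multiple parameterized subplans can compete.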
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/6/30 Yeb Havinga yebhavi...@gmail.com: On 2011-06-29 19:22, Hitoshi Harada wrote: Other things are all good points. Thanks for the elaborate review! More than anything, I'm going to fix the 6) issue, at least to find the cause. Some more questions:

8) Why are the cheapest start path and cheapest total path in best_inner_subqueryscan the same? Because best_inner_indexscan has the two. Actually one of them is enough so far. I aligned it with the existing interface, but they might be merged into one.

10) I have a hard time imagining use cases that will actually result in an alternative plan, especially since not all subqueries are allowed to have quals pushed down into them, and, like Simon Riggs pointed out, many users will write queries like this with the subqueries pulled up. If it is the case that the subqueries that can't be pulled up largely overlap with the ones that are not pushdown-safe (limit, set operations etc.), there might be few actual use cases for this patch. I have seen many cases where this planner hack would help significantly, and which were difficult to rewrite. Why were they difficult to rewrite? Because quals on size_m (and they have quals on size_l in fact) are usually very complicated (5-10 op clauses), and the join+agg part itself is a kind of subquery in another big query. Of course there were workarounds, like splitting the statement in two: filter size_m first, then aggregate size_l by the result of the first statement. However, it's against instinct. The reason the planner is in an RDBMS is to let users write statements as simply as needed. I don't know if the example I raise here is common or not, but I believe it represents a one-to-many relation simply, so there should be many users who are currently stuck with slow query performance without realizing it. I think the most important thing for this patch to go forward is to have a few examples from which it's clear that the patch is beneficial. What will be good examples to show the benefit of the patch?
I guess the test case of size_m/size_l shows it. What do you think the case lacks? Regards, -- Hitoshi Harada
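[Editor's note: for concreteness, the size_m/size_l case discussed in this thread has roughly the following shape. The schema and column names below are guesses for illustration; the thread's actual test DDL is not reproduced here.]

```sql
-- Hypothetical one-to-many schema: a medium "master" table and a large
-- detail table (names are illustrative, not the thread's actual DDL).
CREATE TABLE size_m (id int PRIMARY KEY, attr int);
CREATE TABLE size_l (m_id int REFERENCES size_m(id), val numeric);

-- The pattern the patch targets: join a heavily filtered size_m against
-- an aggregate over size_l.  Without parameterization, the subquery
-- aggregates all of size_l before the join; with the patch, the join
-- clause (l.m_id = m.id) can be pushed down into the subquery scan so
-- only the matching groups are aggregated.
SELECT m.id, l.total
FROM size_m m
JOIN (SELECT m_id, sum(val) AS total
      FROM size_l
      GROUP BY m_id) l ON l.m_id = m.id
WHERE m.attr = 42;
```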
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/6/29 Yeb Havinga yebhavi...@gmail.com: On 2011-06-17 09:54, Hitoshi Harada wrote: While reviewing the gist/box patch, I found some planner APIs that can replace parts in my patch. Also, comments in includes weren't updated appropriately. Revised patch attached. Hello Hitoshi-san, Hi Yeb, I read your latest patch implementing parameterizing subquery scans. 6) Before patching Postgres, I could execute the test with the size_l and size_m tables; after patching against current git HEAD (the patch applies without errors), I get the following error when running the example query: ERROR: plan should not reference subplan's variable I get the same error with the version from June 9 with current git HEAD. It seems that something has changed since I first developed the patch. The last version and the one before it are not so different from each other, especially in terms of runtime behavior, but it might not have been tested enough. I tried rewinding the local git branch to around June 10, but the same error occurs. The error message itself is not in parts changed recently, so I guess some surrounding change affects it. 7) Since both set_subquery_pathlist and best_inner_subqueryscan push down clauses, as well as add a path and subplan, could a generalized function be made to support both, to reduce duplicate code? I tried it but decided to leave it as is, for future extensibility. The subquery parameterization will (can) be extended in more complicated ways; I'm not sure we'd be better off merging some small parts into one at the cost of future capability. Other things are all good points. Thanks for the elaborate review! More than anything, I'm going to fix the 6) issue, at least to find the cause. Regards, -- Hitoshi Harada
Re: [HACKERS] Adding Japanese README
2011/6/30 Itagaki Takahiro itagaki.takah...@gmail.com: On Thu, Jun 30, 2011 at 09:42, Tatsuo Ishii is...@postgresql.org wrote: BTW, I will talk to some Japanese-speaking developers about my idea if the community agrees to add a Japanese README to the source tree, so that I am not the only one contributing to this project. Now, if someone wanted to set up a web site or Wiki page with translations, that would be up to them. IMHO, the Wiki approach seems more reasonable than a README file. It will be suitable for adding non-Japanese translations, and non-core developers can join to translate or fix the docs. +1. If we really want to prove the demand, let's start with the Wiki, which is less invasive than a README (though I doubt such pages would be kept up to date in the end). Regards, -- Hitoshi Harada
Re: [HACKERS] Patch: add GiST support for BOX @> POINT queries
2011/6/17 Andrew Tipton andrew.t.tip...@gmail.com: On Fri, Jun 10, 2011 at 22:16, Hitoshi Harada umi.tan...@gmail.com wrote: Isn't it worth adding a new consistent function for those purposes? The approach in the patch as it stands looks like a kludge to me. Thanks for your review. Coming back to this patch after a few months' time, I have to say it looks pretty hackish to my eyes as well. :) I've attempted to add a new consistent function, gist_boxpoint_consistent(), but the GiST subsystem doesn't call it -- it continues to call gist_box_consistent(). My very simple testcase is:

CREATE TABLE test (key TEXT PRIMARY KEY, boundary BOX NOT NULL);
CREATE INDEX ON test USING gist (boundary);
INSERT INTO test VALUES ('a', '(2,2,5,5)'), ('b', '(4,4,8,8)'), ('c', '(7,7,11,11)');
SELECT * FROM test WHERE boundary @> '(4,4)'::POINT;

Prior to my patch, this query is executed as a straightforward seqscan. Once I add a new strategy to pg_amop.h:

+ DATA(insert ( 2593 603 600 7 s 433 783 0 ));

(603 is the BOX oid, 600 is the POINT oid, and 433 is the @> operator oid): ...the plan switches to an index scan and gist_box_consistent() is called; at this point, the query fails to return the correct results. But even after adding the new consistent proc to pg_proc.h:

+ DATA(insert OID = 8000 ( gist_boxpoint_consistent PGNSP PGUID 12 1 0 0 f f f t f i 5 0 16 2281 600 23 26 2281 _null_ _null_ _null_ _null_ gist_boxpoint_consistent _null_ _null_ _null_ ));

And adding it as a new support function in pg_amproc.h:

+ DATA(insert ( 2593 603 600 1 8000 ));
+ DATA(insert ( 2593 603 600 2 2583 ));
+ DATA(insert ( 2593 603 600 3 2579 ));
+ DATA(insert ( 2593 603 600 4 2580 ));
+ DATA(insert ( 2593 603 600 5 2581 ));
+ DATA(insert ( 2593 603 600 6 2582 ));
+ DATA(insert ( 2593 603 600 7 2584 ));

...my gist_boxpoint_consistent() function still doesn't get called.
At this point I'm a bit lost -- while pg_amop.h has plenty of examples of crosstype comparison operators for btree index methods, there are none for GiST. Is GiST somehow a special case in this regard? It was I who was lost. As Tom mentioned, GiST indexes have records in pg_amop in their own specialized way. I found that gist_point_consistent has some kind of hack for that, and the pg_amop records for point_ops have multiple crosstypes for that. So, if I understand correctly, your first approach of modifying gist_box_consistent was the right way, although trivial issues should be fixed. Also, you may want to follow point_ops if you are worried about whether the commutator's counterpart operator should be registered or not. Looking around those mechanisms, it occurred to me that you mentioned only box @> point. Why don't you add circle @> point and poly @> point as well as box? Is that hard? Regards, -- Hitoshi Harada
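[Editor's note: the analogous cases for the other geometric types suggested above would look roughly like this. This is a sketch only; whether these queries can use a GiST index depends on the corresponding crosstype catalog entries actually being added, as discussed in the message.]

```sql
-- Hypothetical tables, mirroring the box testcase earlier in the thread.
CREATE TABLE ctest (key text PRIMARY KEY, boundary circle NOT NULL);
CREATE INDEX ON ctest USING gist (boundary);
CREATE TABLE ptest (key text PRIMARY KEY, boundary polygon NOT NULL);
CREATE INDEX ON ptest USING gist (boundary);

SELECT * FROM ctest WHERE boundary @> '(4,4)'::point;  -- circle @> point
SELECT * FROM ptest WHERE boundary @> '(4,4)'::point;  -- polygon @> point
```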
Re: [HACKERS] Parameterized aggregate subquery (was: Pull up aggregate subquery)
2011/6/10 Hitoshi Harada umi.tan...@gmail.com: 2011/6/9 Robert Haas robertmh...@gmail.com: On Thu, Jun 9, 2011 at 2:28 AM, Hitoshi Harada umi.tan...@gmail.com wrote: BTW, as I changed the title and design from the previous post, should I throw away the old commitfest entry and make a new one? Nah, just edit the existing entry and change the title. Also add a link to the new patch, of course. Ok, done. While reviewing the gist/box patch, I found some planner APIs that can replace parts in my patch. Also, comments in includes weren't updated appropriately. Revised patch attached. Regards, -- Hitoshi Harada (Attachment: aggjoin-20110617.patch)
Re: [HACKERS] planinstr, showing planner time on EXPLAIN
2011/6/17 Robert Haas robertmh...@gmail.com: On Tue, Jun 14, 2011 at 11:18 PM, Hitoshi Harada umi.tan...@gmail.com wrote: Yesterday on PGXN I released the first version of planinstr, a plugin module to append planner time to EXPLAIN. I post this here since it is mostly for developers. Wow, that is awesome. I sorta wish we had something like that in core -- and I don't even think it should be optional, I think it should just always give you that additional piece of information. Thanks for the positive feedback. The current design passes the instrumented time from planner to executor as a global variable. To get it into core, we would need to add a field for that, which I don't think is too difficult. Anyway, I still like the module approach, as it doesn't need any change in core: it's free for anyone to use, even on older versions. Regards, -- Hitoshi Harada
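[Editor's note: the global-variable hand-off described above can be sketched with the standard planner_hook. This is an illustration of the mechanism only, not planinstr's actual source; see the PGXN and Github links in the thread for the real code.]

```c
/* Sketch: time the planner via planner_hook and stash the result in a
 * global for later display (e.g. by an EXPLAIN hook).  Illustrative
 * only -- not planinstr itself.  Hook signature matches 9.x-era cores. */
#include "postgres.h"
#include "fmgr.h"
#include "optimizer/planner.h"
#include "portability/instr_time.h"

PG_MODULE_MAGIC;

static planner_hook_type prev_planner_hook = NULL;
static double last_plan_msec = 0.0;	/* the "global variable" mentioned above */

static PlannedStmt *
timing_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
{
	PlannedStmt *result;
	instr_time	start, stop;

	INSTR_TIME_SET_CURRENT(start);
	if (prev_planner_hook)
		result = prev_planner_hook(parse, cursorOptions, boundParams);
	else
		result = standard_planner(parse, cursorOptions, boundParams);
	INSTR_TIME_SET_CURRENT(stop);

	INSTR_TIME_SUBTRACT(stop, start);
	last_plan_msec = INSTR_TIME_GET_MILLISEC(stop);
	return result;
}

void
_PG_init(void)
{
	prev_planner_hook = planner_hook;
	planner_hook = timing_planner;
}
```

An EXPLAIN-side hook would then read last_plan_msec and append the "Plan time: ... ms" line shown in the example output below; getting this into core would mean carrying the timing in a proper field instead of a global, as the message says.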
[HACKERS] planinstr, showing planner time on EXPLAIN
Yesterday on PGXN I released the first version of planinstr, a plugin module to append planner time to EXPLAIN. I post this here since it is mostly for developers. http://www.pgxn.org/dist/planinstr/

db1=# load '$libdir/planinstr';
LOAD
db1=# explain select * from pg_class;
                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on pg_class  (cost=0.00..141.87 rows=3287 width=194)
 Plan time: 0.119 ms

db1=# explain analyze select count(*) from size_m;
                          QUERY PLAN
---------------------------------------------------------------
 Aggregate  (cost=26272.00..26272.01 rows=1 width=0) (actual time=51.938..51.938 rows=1 loops=1)
   ->  Append  (cost=0.00..23147.00 rows=125 width=0) (actual time=0.037..45.809 rows=2 loops=1)
         ->  Seq Scan on size_m  (cost=0.00..847.00 rows=2 width=0) (actual time=0.037..41.863 rows=2 loops=1)
 .. snip ..
         ->  Seq Scan on myt1000 size_m  (cost=0.00..22.30 rows=1230 width=0) (actual time=0.001..0.001 rows=0 loops=1)
 Total runtime: 75.217 ms
 Plan time: 61.353 ms
(1005 rows)

This may help make planner performance regressions visible during internal logic refactoring, etc. Hope this helps someone. Feel free to tell me if a similar mechanism already exists, or if you want richer interfaces. Github is here: https://github.com/umitanuki/planinstr Also feel free to fork and send pull requests! Regards, -- Hitoshi Harada