Re: [HACKERS] PDF builds broken again
Hi,

On Wed, 2014-07-23 at 15:53 +0200, Magnus Hagander wrote:
> I ended up splitting the paragraph in two in order to get the PDFs to
> build. I've applied a patch for this to 9.0 only so we can keep
> building PDFs.

Thanks for looking at this -- you saved me some time.

Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR
Re: [HACKERS] Use unique index for longer pathkeys.
On Tue, Jul 22, 2014 at 2:49 PM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> Sorry, the previous version had bugs. It stamps over the stack and
> crashes X( The attached is the bug-fixed version, with no
> substantial changes.
>
> > On Tue, Jul 15, 2014 at 2:17 PM, Kyotaro HORIGUCHI <
> > horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > >
> > > Hi, the attached is the revised version.
> >
> > Thanks Horiguchi-San for the updated patch.
>
> # By the way, this style of calling a person is quite common
> # among Japanese since the first-name basis implies very close
> # relationship or it frequently conveys offensive shade.

I don't know whether I have ever offended you or any other Japanese community member in our interactions, but that was never my intention. The reason I addressed you this way is that during my first four years of work I worked very closely with Japanese colleagues, and this form of address was quite common then; however, I am not sure I have always used it consistently. If I have used anything that is not appropriate in your culture, please feel free to let me know, and I will certainly not do it again.

> > Today while looking into the updated patch, I was wondering why we can't
> > eliminate useless keys in query_pathkeys when we actually build them in
> > standard_qp_callback(), basically somewhat similar to what we do in
> > standard_qp_callback->make_pathkeys_for_sortclauses->pathkey_is_redundant().
>
> I agree that standard_qp_callback is basically preferable.
>
> > We already have index information related to base_rels before building
> > query pathkeys. I noticed that you mentioned the below in your original
> > mail, which indicates that information related to inheritance tables is
> > built only after set_base_rel_sizes():
> >
> > "These steps take place between set_base_rel_sizes() and
> > set_base_rel_pathlists() in make_one_rel().
> > The reason for the position is that step 2 above needs all
> > inheritance tables to be expanded in PlannerInfo, and
> > set_base_rel_sizes() (currently) does that."
>
> Sorry, my memory had worn down. After some reconfirmation, this
> description turned out to be a bit (quite?) wrong.
>
> collect_common_primary_pathkeys needs root->eq_classes to be set up
> beforehand, not appendrels, because build_index_pathkeys doesn't work
> before every EC member for all relations involved is registered.
>
> standard_qp_callback registers EC members in the following path, but
> only for the primary(?) tlist of the subquery, so EC members for the
> child relations are not registered at that time.
>
>   ...->make_pathkeys_for_sortclauses->make_pathkey_from_sortop
>      ->make_pathkey_from_sortinfo->get_eclass_for_sort_expr
>
> EC members for the child rels are registered in
>
>   set_base_rel_sizes->set_rel_size->set_append_rel_size
>      ->add_child_rel_equivalences
>
> So trim_query_pathkeys (collect_common...) doesn't work before
> set_base_rel_sizes(). These EC members need to be registered at least
> if root->query_pathkeys exists, so this seems to be done in
> standard_qp_callback by adding a separate inheritance tree walk.

Do we really need that information to eliminate useless keys from query_pathkeys?

 * We have to make child entries in the EquivalenceClass data
 * structures as well. This is needed either if the parent
 * participates in some eclass joins (because we will want to consider
 * inner-indexscan joins on the individual children) or if the parent
 * has useful pathkeys (because we should try to build MergeAppend
 * paths that produce those sort orderings).

Referring to the above comment, I think we don't need it to eliminate useless keys from query_pathkeys based on primary key info, but I might be missing some case; could you please elaborate on the case or cases where this would be useful?
I think there is one more disadvantage in the way the current patch is done: you need to collect index pathkeys for all relations, irrespective of whether they will be of any use for eliminating useless pathkeys from query_pathkeys. One trivial case that comes to mind is a query involving multiple relations where the ORDER BY is based on columns of only some of the tables involved.

> # rel->has_eclass_joins seems not to be working, but that is not the
> # topic here.
>
> > Could you please explain why the index information built in the above
> > path is not sufficient, or is there any other case which I am missing?
>
> Is the description above enough to be the explanation for the
> place? Sorry for the inaccurate description.

No issues; the above description is sufficient to explain why you have written the patch in its current form.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
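The trimming rule under discussion, per the thread's subject, is that once the leading sort columns cover a unique index, any further pathkeys cannot change the row order and are therefore useless. As a hedged illustration of just that rule (a Python sketch with hypothetical names, not the planner code itself):

```python
def trim_sort_keys(sort_keys, unique_key):
    """Keep only the prefix of sort_keys up to the point where some
    unique key is fully covered; keys after that point are redundant,
    since the covered columns already determine each row uniquely."""
    seen = set()
    for i, key in enumerate(sort_keys, start=1):
        seen.add(key)
        if unique_key <= seen:  # all unique-key columns already sorted on
            return sort_keys[:i]
    return sort_keys
```

For example, with a unique key on (id), a sort on ["id", "created_at"] trims to ["id"]; if the unique key is never covered, the list is left alone.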
Re: [HACKERS] SKIP LOCKED DATA (work in progress)
On 24 July 2014 00:52, Thomas Munro wrote:
> On 23 July 2014 13:15, David Rowley wrote:
>> I'm also wondering about this block of code in general:
>>
>>     if (erm->waitPolicy == RWP_WAIT)
>>         wait_policy = LockWaitBlock;
>>     else if (erm->waitPolicy == RWP_SKIP)
>>         wait_policy = LockWaitSkip;
>>     else /* erm->waitPolicy == RWP_NOWAIT */
>>         wait_policy = LockWaitError;
>>
>> Just after this code heap_lock_tuple() is called, and if that returns
>> HeapTupleWouldBlock, the code does a goto lnext, which will then repeat
>> that whole block again. I'm wondering why there are 2 enums for the same
>> thing; if you just had 1 then you'd not need this block of code at all,
>> you could just pass erm->waitPolicy to heap_lock_tuple().
>
> True. Somewhere upthread I mentioned the difficulty I had deciding
> how many enumerations were needed for the various subsystems, ie
> which headers and types they were allowed to share. Then I put off
> working on this for so long that a nice example came along that showed
> me the way: the lock strength enums LockTupleMode (heapam.h) and
> RowMarkType (plannodes.h). The wait policy enums LockWaitPolicy
> (heapam.h) and RowWaitPolicy (plannodes.h) mirror them closely, and
> the same type of enumeration translation takes place in nodeLockRows.c
> immediately above the code you pasted. I don't have any more
> principled argument than monkey-see-monkey-do for that one...

On reflection, I agree that this sucks, and I would like to unify the three new enums in the current patch (see below for a recap) into one that can be passed between parser, planner, executor and heap access manager code, as I think you are suggesting. My only question is: in which header should the enum be defined, so that all those modules can include it?
Best regards,
Thomas Munro

Enumeration explosion recap:

* parsenodes.h defines enum LockClauseWaitPolicy, which is used in the
  structs LockClause and RowMarkClause, for use by the parser code
* plannodes.h defines enum RowWaitPolicy, which is used in the structs
  PlanRowMark and ExecRowMark, for use by the planner and executor code
  (numbers coincide with LockClauseWaitPolicy)
* heapam.h defines enum LockWaitPolicy, which is used as a parameter to
  heap_lock_tuple, for use by heap access code

The parser produces LockClauseWaitPolicy values. InitPlan converts these to RowWaitPolicy values in execMain.c. Then nodeLockRows.c converts RowWaitPolicy values to LockWaitPolicy values (by if-then-else) so it can call heap_lock_tuple. This roughly mirrors what happens to lock strength information.

The unpatched code simply passes a boolean 'nowait' flag around. An earlier version of my patch passed a pair of booleans around. Simon's independent patch [1] uses an int in the various node structs and the heap_lock_tuple function, and in execnodes.h it has macros to give names to the values, and that is included by access/heapam.h.

[1] http://www.postgresql.org/message-id/537610d5.3090...@2ndquadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
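As an illustration of why the recap's three-way translation is mechanical, here is a hedged Python mirror of the enums listed above. The names come from the recap; the assumption (stated there for the first two, and implied by the nodeLockRows.c if-then-else for the third) is that the numeric values line up, which is exactly what makes a single shared enum viable:

```python
from enum import IntEnum

# Hypothetical mirrors of the three C enums from the recap; the value
# correspondence below is an assumption for illustration.
class LockClauseWaitPolicy(IntEnum):  # parsenodes.h, parser
    WAIT = 0
    SKIP = 1
    NOWAIT = 2

class RowWaitPolicy(IntEnum):         # plannodes.h, planner/executor
    WAIT = 0
    SKIP = 1
    NOWAIT = 2

class LockWaitPolicy(IntEnum):        # heapam.h, heap access
    BLOCK = 0
    SKIP = 1
    ERROR = 2

def row_to_heap(policy: RowWaitPolicy) -> LockWaitPolicy:
    # The if-then-else block quoted earlier, collapsed: when the numeric
    # values coincide, "translation" is just a value-preserving cast --
    # which is why one shared enum would remove the block entirely.
    return LockWaitPolicy(int(policy))
```

With a single enum, `row_to_heap` (and its parser-to-planner counterpart) would disappear; the open question in the message above is only where that one definition should live.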
Re: [HACKERS] Minor inaccuracy in jsonb_path_ops documentation
On Thu, Jul 24, 2014 at 3:02 PM, Tom Lane wrote:
> The fact that the planner can avoid that in the presence of a
> simple constant comparison value doesn't make it a good idea to
> use jsonb_path_ops indexes if you do this type of query a lot,
> because in some usages the planner won't see a constant comparison
> value, and if so it'll likely choose an indexscan by default.

Well, with idiomatic usage a constant value generally will be used. But I do see your point.

--
Peter Geoghegan
Re: [HACKERS] Why does xlog.c not treat COMMIT_PREPARED/ABORT_PREPARED as commit/abort?
On 2014-07-24 17:53:17 -0400, Tom Lane wrote:
> There's a bunch of code in xlog.c that special-cases commit/abort records
> for purposes of controlling replay, ie whether you can pause before/after
> a particular xlog record, extract a timestamp from it, etc. This code
> does not, and apparently never has, counted COMMIT_PREPARED or
> ABORT_PREPARED as commit/abort records. Is there a good reason for that,
> or is it just an oversight?

Seems like an oversight to me.

> I noticed this in investigation of bug #11032, which is a side effect
> of the fact that xlog.c doesn't believe there's any timestamp to be
> found in COMMIT_PREPARED records. But more generally, they can't be
> used as recovery targets, and I don't see a reason for that.

> Assuming we agree that this is a bug, is it something to back-patch,
> or do we not want to change back-branch behavior here?

I started to argue that we should change the SetLatestXTime() cases but not stop for prepared xacts, but I couldn't really think of a reason why that would be a good plan. So I vote for backpatching the whole thing.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Minor inaccuracy in jsonb_path_ops documentation
Peter Geoghegan writes:
> The jsonb documentation says of the jsonb_path_ops GIN opclass:
> """
> A disadvantage of the jsonb_path_ops approach is that it produces no
> index entries for JSON structures not containing any values, such as
> {"a": {}}. If a search for documents containing such a structure is
> requested, it will require a full-index scan, which is quite slow.
> jsonb_path_ops is therefore ill-suited for applications that often
> perform such searches.
> """
> The reference to a full index scan seems questionable.

Well, if you get an indexscan, it will be a full index scan, so I find this warning appropriate. The fact that the planner can avoid that in the presence of a simple constant comparison value doesn't make it a good idea to use jsonb_path_ops indexes if you do this type of query a lot, because in some usages the planner won't see a constant comparison value, and if so it'll likely choose an indexscan by default.

Getting into exactly when the planner will or won't save your bacon is the kind of detail I'd rather not put in user-facing docs.

regards, tom lane
[HACKERS] Why does xlog.c not treat COMMIT_PREPARED/ABORT_PREPARED as commit/abort?
There's a bunch of code in xlog.c that special-cases commit/abort records for purposes of controlling replay, ie whether you can pause before/after a particular xlog record, extract a timestamp from it, etc. This code does not, and apparently never has, counted COMMIT_PREPARED or ABORT_PREPARED as commit/abort records. Is there a good reason for that, or is it just an oversight?

I noticed this in investigation of bug #11032, which is a side effect of the fact that xlog.c doesn't believe there's any timestamp to be found in COMMIT_PREPARED records. But more generally, they can't be used as recovery targets, and I don't see a reason for that.

Assuming we agree that this is a bug, is it something to back-patch, or do we not want to change back-branch behavior here?

regards, tom lane
[HACKERS] Minor inaccuracy in jsonb_path_ops documentation
The jsonb documentation says of the jsonb_path_ops GIN opclass:

"""
A disadvantage of the jsonb_path_ops approach is that it produces no index entries for JSON structures not containing any values, such as {"a": {}}. If a search for documents containing such a structure is requested, it will require a full-index scan, which is quite slow. jsonb_path_ops is therefore ill-suited for applications that often perform such searches.
"""

The reference to a full index scan seems questionable. This text should indicate that a sequential scan can be expected. Even without any statistics, in the event of not being able to extract any hash values as GIN keys, the optimizer prefers a sequential scan.

Example with a query where one jsonb_path_ops GIN hash value is generated:

postgres=# explain analyze select count(*) from test where j @> '{"tags":[{"term":"postgres"}]}';
                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4732.72..4732.73 rows=1 width=0) (actual time=0.513..0.513 rows=1 loops=1)
   ->  Bitmap Heap Scan on test  (cost=33.71..4729.59 rows=1253 width=0) (actual time=0.107..0.496 rows=100 loops=1)
         Recheck Cond: (j @> '{"tags": [{"term": "postgres"}]}'::jsonb)
         Heap Blocks: exact=100
         ->  Bitmap Index Scan on ttt  (cost=0.00..33.40 rows=1253 width=0) (actual time=0.076..0.076 rows=100 loops=1)
               Index Cond: (j @> '{"tags": [{"term": "postgres"}]}'::jsonb)
 Planning time: 0.083 ms
 Execution time: 0.560 ms
(8 rows)

Example of the empty-query hazard, with no such hash values:

postgres=# explain select count(*) from test where j @> '{"tags":[]}';
                           QUERY PLAN
------------------------------------------------------------------
 Aggregate  (cost=191519.46..191519.47 rows=1 width=0)
   ->  Seq Scan on test  (cost=0.00..191516.33 rows=1253 width=0)
         Filter: (j @> '{"tags": []}'::jsonb)
 Planning time: 0.073 ms
(4 rows)

gincostestimate() does at least have the ability to anticipate that a full index scan will be required, and that actually makes the optimizer do the right thing here.
Maybe the text quoted above is intended to indicate that if there were an index scan, it would have to be a full index scan, but that doesn't seem appropriate for user-facing documentation like this.

--
Peter Geoghegan
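The behavior under discussion follows from how jsonb_path_ops forms its keys: roughly one hashed key per root-to-leaf path ending in a value. A hedged Python sketch of that idea (an illustration only, not the actual C extraction code, with Python's `hash` standing in for the real hash function) shows why a structure containing no values contributes no keys at all:

```python
def path_ops_keys(doc):
    """Collect one key per root-to-leaf chain that ends in a scalar
    value. Empty objects/arrays contribute nothing, so a query like
    j @> '{"tags": []}' has no keys with which to probe the index."""
    keys = []

    def walk(node, path):
        if isinstance(node, dict):
            for k, v in node.items():
                walk(v, path + (k,))
        elif isinstance(node, list):
            for v in node:
                walk(v, path)
        else:  # scalar leaf: the whole chain becomes one hashed key
            keys.append(hash(path + (node,)))

    walk(doc, ())
    return keys
```

Under this sketch, `path_ops_keys({"a": {}})` and `path_ops_keys({"tags": []})` are both empty, which is why the index cannot help with such containment queries and the planner falls back as described above.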
Re: [HACKERS] gaussian distribution pgbench -- splits Bv6
Fabien COELHO wrote:
> I also have a problem with assert & Assert. I finally figured out
> that Assert is not compiled in by default, thus it is generally
> ignored. So it is more for debugging purposes when activated than
> for guarding against some unexpected user errors.

Yes, Assert() is for debugging during development. If you need to deal with user error, use regular if () and exit() as appropriate (ereport() in the backend). We mostly avoid assert() in our own code.

--
Álvaro Herrera	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
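The same split exists in other languages: Python's `assert` statement is likewise stripped when run with `-O`, much as Assert() is compiled out without assertions enabled. A hedged illustration of the rule described above (explicit checks for user errors, assertions only for internal invariants; the function name is invented for the example):

```python
def parse_threshold(arg: str) -> float:
    # User error: always checked, the moral equivalent of
    # if () + exit() (or ereport() in the backend).
    try:
        value = float(arg)
    except ValueError:
        raise SystemExit(f"invalid threshold: {arg!r}")
    if value <= 0:
        raise SystemExit("threshold must be positive")
    # Internal invariant: disappears under `python -O`, just as
    # Assert() disappears in a non-assert-enabled build -- so it must
    # never be the only thing standing between bad input and a crash.
    assert value > 0
    return value
```

The point is that the `assert` line documents and double-checks a condition the explicit validation already guarantees; removing it changes nothing for users, which is exactly the property a debug-only check should have.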
Re: [HACKERS] Verbose output of pg_dump not show schema name
On Thu, Jul 24, 2014 at 4:36 PM, Robert Haas wrote: > > On Wed, Jul 23, 2014 at 5:48 PM, Fabrízio de Royes Mello > wrote: > > > > On Thu, Apr 17, 2014 at 9:15 PM, Michael Paquier < michael.paqu...@gmail.com> > > wrote: > >> > >> On Fri, Apr 18, 2014 at 4:29 AM, Fabrízio de Royes Mello > >> wrote: > >> > > >> > On Thu, Apr 17, 2014 at 12:46 PM, Bruce Momjian > >> > wrote: > >> >> > >> >> On Thu, Apr 17, 2014 at 11:44:37AM -0400, Tom Lane wrote: > >> >> > Bruce Momjian writes: > >> >> > > The idea is that we only need quotes when there are odd characters > >> >> > > in > >> >> > > the identifier. We do that right now in some places, though I > >> >> > > can't > >> >> > > find them in pg_dump. I know psql does that, see quote_ident(). > >> >> > > >> >> > I think our general style rule is that identifiers embedded in > >> >> > messages > >> >> > are always double-quoted. There's an exception for type names, but > >> >> > not otherwise. You're confusing the message case with printing SQL. > >> >> > >> >> OK. I was unclear if a status _display_ was a message like an error > >> >> message. > >> >> > >> > > >> > The attached patch fix missing double-quoted in "dumping contents of > >> > table.." message and add schema name to other messages: > >> > - "reading indexes for table \"%s\".\"%s\"\n" > >> > - "reading foreign key constraints for table \"%s\".\"%s\"\n" > >> > - "reading triggers for table \"%s\".\"%s\"\n" > >> > > >> > - "finding the columns and types of table \"%s\".\"%s\"\n" > >> > - "finding default expressions of table \"%s\".\"%s\"\n" > >> > - "finding check constraints for table \"%s\".\"%s\"\n" > >> Cool additions. There may be a more elegant way to check if namespace > >> is NULL, but I couldn't come up with one myself. So patch may be fine. > >> > > > > Hi all, > > > > I think this small patch was lost. There are something wrong? > > Did it get added to a CommitFest? > > I don't see it there. 
> Given this is a very small and simple patch I thought it's not necessary...

Added to the next CommitFest.

Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 3aebac8..5c34a63 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -713,8 +713,8 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
 		_becomeOwner(AH, te);
 		_selectOutputSchema(AH, te->namespace);
 
-		ahlog(AH, 1, "processing data for table \"%s\"\n",
-			  te->tag);
+		ahlog(AH, 1, "processing data for table \"%s\".\"%s\"\n",
+			  AH->currSchema, te->tag);
 
 		/*
 		 * In parallel restore, if we created the table earlier in
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 412f52f..3c7e88f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1400,7 +1400,17 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 	const char *column_list;
 
 	if (g_verbose)
-		write_msg(NULL, "dumping contents of table %s\n", classname);
+	{
+		/* Print namespace information if available */
+		if (tbinfo->dobj.namespace != NULL &&
+			tbinfo->dobj.namespace->dobj.name != NULL)
+			write_msg(NULL, "dumping contents of table \"%s\".\"%s\"\n",
+					  tbinfo->dobj.namespace->dobj.name,
+					  classname);
+		else
+			write_msg(NULL, "dumping contents of table \"%s\"\n",
+					  classname);
+	}
 
 	/*
 	 * Make sure we are in proper schema.
We will qualify the table name
@@ -5019,8 +5029,17 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables)
 			continue;
 
 		if (g_verbose)
-			write_msg(NULL, "reading indexes for table \"%s\"\n",
-					  tbinfo->dobj.name);
+		{
+			/* Print namespace information if available */
+			if (tbinfo->dobj.namespace != NULL &&
+				tbinfo->dobj.namespace->dobj.name != NULL)
+				write_msg(NULL, "reading indexes for table \"%s\".\"%s\"\n",
+						  tbinfo->dobj.namespace->dobj.name,
+						  tbinfo->dobj.name);
+			else
+				write_msg(NULL, "reading indexes for table \"%s\"\n",
+						  tbinfo->dobj.name);
+		}
 
 		/* Make sure we are in proper schema so indexdef is right */
 		selectSourceSchema(fout, tbinfo->dobj.namespace->dobj.name);
@@ -5385,8 +5404,17 @@ getConstraints(Archive *fout, TableInfo tblinfo[], int numTables)
 			continue;
 
 		if (g_verbose)
-			write_msg(NULL, "reading foreign key constraints for table \"%s\"\n",
-					  tbinfo->dobj.name);
+		{
+			/* Print namespace information if available */
+			if (tbinfo->dobj.namespace != NULL &&
+				tbinfo->dobj.namespace->dobj.name != NULL)
+				write_msg(NULL, "reading foreign key constraints for table \"%s\".\"%s\"\n",
+						  tbinfo->dobj.namespace->dobj.name,
+						  tbinfo->dobj.name);
+			else
+				write_msg(NULL, "reading fore
Re: [HACKERS] Least Active Transaction ID function
Hi Robert,

On Thu, Jul 24, 2014 at 9:32 PM, Robert Haas wrote:
> On Wed, Jul 23, 2014 at 3:53 PM, Rohit Goyal wrote:
> > Hi All,
> >
> > On Wed, Jul 23, 2014 at 5:01 PM, Rohit Goyal wrote:
> >> Hi All,
> >>
> >> I am doing programming with the postgresql source code. I want to find
> >> the function which can give me the least active transaction id
> >> currently in the system.
> >>
> >> Is there any function which can give me that?
> >>
> >> Regards,
> >> Rohit Goyal
> >
> > 1> I know that somewhere there is an active transaction list in
> > postgresql. At any point of time, I want to get the smallest transaction
> > present in this active tx list, or the transaction id before which all
> > smaller transactions are committed/aborted.
> >
> > Is there any function which can give me this value?
>
> See the RecentXmin calculation in GetSnapshotData.

This was really helpful.

1. Can I use this xmin variable directly anywhere in my code, since it is a global variable?
2. What is the difference between RecentXmin and RecentGlobalXmin? I read the description, but any small detail can clear my mind. :)

Thanks in advance!!

> > 2> I found a function GetStableLatestTransactionId(); please tell me
> > what this function gives. I was not able to understand the description
> > given above it.
>
> I don't know how to help with this; the description seems clear to me.

This is not important now, as you have already told me the variable and file for RecentXmin. :)

Regards,
Rohit

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Regards,
Rohit Goyal
Re: [HACKERS] Verbose output of pg_dump not show schema name
On Wed, Jul 23, 2014 at 5:48 PM, Fabrízio de Royes Mello wrote: > > On Thu, Apr 17, 2014 at 9:15 PM, Michael Paquier > wrote: >> >> On Fri, Apr 18, 2014 at 4:29 AM, Fabrízio de Royes Mello >> wrote: >> > >> > On Thu, Apr 17, 2014 at 12:46 PM, Bruce Momjian >> > wrote: >> >> >> >> On Thu, Apr 17, 2014 at 11:44:37AM -0400, Tom Lane wrote: >> >> > Bruce Momjian writes: >> >> > > The idea is that we only need quotes when there are odd characters >> >> > > in >> >> > > the identifier. We do that right now in some places, though I >> >> > > can't >> >> > > find them in pg_dump. I know psql does that, see quote_ident(). >> >> > >> >> > I think our general style rule is that identifiers embedded in >> >> > messages >> >> > are always double-quoted. There's an exception for type names, but >> >> > not otherwise. You're confusing the message case with printing SQL. >> >> >> >> OK. I was unclear if a status _display_ was a message like an error >> >> message. >> >> >> > >> > The attached patch fix missing double-quoted in "dumping contents of >> > table.." message and add schema name to other messages: >> > - "reading indexes for table \"%s\".\"%s\"\n" >> > - "reading foreign key constraints for table \"%s\".\"%s\"\n" >> > - "reading triggers for table \"%s\".\"%s\"\n" >> > >> > - "finding the columns and types of table \"%s\".\"%s\"\n" >> > - "finding default expressions of table \"%s\".\"%s\"\n" >> > - "finding check constraints for table \"%s\".\"%s\"\n" >> Cool additions. There may be a more elegant way to check if namespace >> is NULL, but I couldn't come up with one myself. So patch may be fine. >> > > Hi all, > > I think this small patch was lost. There are something wrong? Did it get added to a CommitFest? I don't see it there. 
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
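Bruce's remark earlier in this thread, that quotes are only strictly needed when an identifier contains odd characters, is the rule psql's quote_ident() applies when printing SQL (as opposed to Tom's message-style rule of always quoting). A hedged Python approximation of that quoting rule, for illustration only and not the actual C code (the real rule also quotes SQL keywords, which this sketch ignores):

```python
import re

def maybe_quote(ident: str) -> str:
    # A plain lower-case identifier (letters, digits, underscores, not
    # starting with a digit) can stand bare; anything else is wrapped
    # in double quotes, with embedded double quotes doubled.
    if re.fullmatch(r'[a-z_][a-z0-9_]*', ident):
        return ident
    return '"' + ident.replace('"', '""') + '"'
```

So a schema-qualified name in SQL output would be rendered as `maybe_quote(schema) + "." + maybe_quote(table)`, whereas the pg_dump status messages in the patch above always use `\"%s\".\"%s\"`, per the message-style convention.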
Re: [HACKERS] Least Active Transaction ID function
On Wed, Jul 23, 2014 at 3:53 PM, Rohit Goyal wrote:
> Hi All,
>
> On Wed, Jul 23, 2014 at 5:01 PM, Rohit Goyal wrote:
>> Hi All,
>>
>> I am doing programming with the postgresql source code. I want to find
>> the function which can give me the least active transaction id
>> currently in the system.
>>
>> Is there any function which can give me that?
>>
>> Regards,
>> Rohit Goyal
>
> 1> I know that somewhere there is an active transaction list in
> postgresql. At any point of time, I want to get the smallest transaction
> present in this active tx list, or the transaction id before which all
> smaller transactions are committed/aborted.
>
> Is there any function which can give me this value?

See the RecentXmin calculation in GetSnapshotData.

> 2> I found a function GetStableLatestTransactionId(); please tell me
> what this function gives. I was not able to understand the description
> given above it.

I don't know how to help with this; the description seems clear to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] gaussian distribution pgbench -- splits Bv6
> Thank you for your great documentation and fixing work!!! It is very
> helpful for understanding our feature.

Hopefully it will help make it, or part of it, pass through.

> I added two features in gauss_B_4.patch.
>
> 1) Add gaussianProbability() function
> It is the same as exponentialProbability(), and the feature is the same
> as before.

Ok, that is better for readability and easy reuse.

> 2) Add result of "max/min percent of the range"
> It is almost the same as the --exponential option's result. However, the
> max percent of the range is at the center of the distribution and the
> min percent is at the far side of the distribution. Here is an example
> of the output:

Ok, good, that makes it homogeneous with the exponential case.

> + pgbench_account's aid selected with a truncated gaussian distribution
> + standard deviation threshold: 5.0
> + decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%
> + probability of max/min percent of the range: 4.0% 0.0%
>
> And I added an explanation about this in the document.

This is a definite improvement. I tested these minor changes and everything seems ok. Attached is a very small update: one word removed from the doc, and one redundant declaration removed from the code.

I also have a problem with assert & Assert. I finally figured out that Assert is not compiled in by default, thus it is generally ignored. So it is more for debugging purposes when activated than for guarding against some unexpected user errors.
--
Fabien.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index e07206a..0247a05 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 #ifdef HAVE_SYS_SELECT_H
 #include
 #endif
@@ -173,6 +174,11 @@
 bool is_connect; /* establish connection for each transaction */
 bool is_latencies; /* report per-command latencies */
 int main_pid; /* main process id used in log filename */
 
+/* gaussian/exponential distribution tests */
+double threshold; /* threshold for gaussian or exponential */
+bool use_gaussian = false;
+bool use_exponential = false;
+
 char *pghost = "";
 char *pgport = "";
 char *login = NULL;
@@ -294,11 +300,11 @@
 static int num_commands = 0; /* total number of Command structs */
 static int debug = 0; /* debug flag */
 
 /* default scenario */
-static char *tpc_b = {
+static char *tpc_b_fmt = {
 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"\\setrandom bid 1 :nbranches\n"
 	"\\setrandom tid 1 :ntellers\n"
 	"\\setrandom delta -5000 5000\n"
@@ -312,11 +318,11 @@
 };
 
 /* -N case */
-static char *simple_update = {
+static char *simple_update_fmt = {
 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"\\setrandom bid 1 :nbranches\n"
 	"\\setrandom tid 1 :ntellers\n"
 	"\\setrandom delta -5000 5000\n"
@@ -328,9 +334,9 @@
 };
 
 /* -S case */
-static char *select_only = {
+static char *select_only_fmt = {
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"SELECT abalance FROM
pgbench_accounts WHERE aid = :aid;\n"
 };
 
@@ -377,6 +383,8 @@ usage(void)
 	"  -v, --vacuum-all         vacuum all four standard tables before tests\n"
 	"  --aggregate-interval=NUM aggregate data over NUM seconds\n"
 	"  --sampling-rate=NUM      fraction of transactions to log (e.g. 0.01 for 1%%)\n"
+	"  --exponential=NUM        exponential distribution with NUM threshold parameter\n"
+	"  --gaussian=NUM           gaussian distribution with NUM threshold parameter\n"
 	"\nCommon options:\n"
 	"  -d, --debug              print debugging output\n"
 	"  -h, --host=HOSTNAME      database server host or socket directory\n"
@@ -2329,6 +2337,30 @@ process_builtin(char *tb)
 	return my_commands;
 }
 
+/*
+ * compute the probability of the truncated gaussian random generation
+ * to draw values in the i-th slot of the range.
+ */
+static double gaussianProbability(int i, int slots, double threshold)
+{
+	assert(1 <= i && i <= slots);
+	return (0.50 * (erf (threshold * (1.0 - 1.0 / slots * (2.0 * i - 2.0)) / sqrt(2.0)) -
+					erf (threshold * (1.0 - 1.0 / slots * 2.0 * i) / sqrt(2.0))) /
+			erf (threshold / sqrt(2.0)));
+}
+
+/*
+ * compute the probability of the truncated exponential random generation
+ * to draw values in the i-th slot of the range.
+ */
+static double exponentialProbability(int i, int slots, double threshold)
+{
+	assert(1 <= i && i <= slots);
+	return (exp(- threshold * (i - 1) / slots)
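As a cross-check of the "decile percents" quoted earlier in this thread, the gaussianProbability() formula from the patch can be transcribed directly into Python (math.erf standing in for C's erf); this is a verification sketch, not part of the patch:

```python
from math import erf, sqrt

def gaussian_probability(i, slots, threshold):
    """Probability that the truncated gaussian generator draws a value
    in the i-th of `slots` equal slices of the range (1-based), per the
    gaussianProbability() formula in the patch."""
    assert 1 <= i <= slots
    t, s = threshold, sqrt(2.0)
    return (0.50 * (erf(t * (1.0 - 1.0 / slots * (2.0 * i - 2.0)) / s)
                    - erf(t * (1.0 - 1.0 / slots * 2.0 * i) / s))
            / erf(t / s))

# Recompute the threshold-5.0 deciles reported by the patch's output.
deciles = [gaussian_probability(i, 10, 5.0) for i in range(1, 11)]
print(" ".join("%.1f%%" % (100.0 * p) for p in deciles))
```

The ten probabilities telescope to a total of 1, and rounding them to one decimal reproduces the quoted line "decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%".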
Re: [HACKERS] parametric block size?
>> Note that I was more asking about the desirability of the feature;
>> the implementation is another, although also relevant, issue. To me
>> it is really desirable given the potential performance impact, but
>> maybe we should not care about 10%?
>
> 10% performance improvement sounds good, no doubt. What will happen to
> performance for people with the same block size? I mean, if you run a
> comparison of current HEAD vs. patched with identical BLCKSZ, is there
> a decrease in performance? I expect there will be some, although I'm
> not sure to what extent.

I do not understand the question. Do you mean comparing the current 'compile-time set block size' vs. a hypothetical 'adaptive initdb-time block size' version, which does not really exist yet? I cannot answer that, but I would not expect significant differences. If there is a significant performance impact, that would surely be no good.

> People who pg_upgrade for example will be stuck with whatever blcksz
> they had on the original installation and so will be unable to benefit
> from this improvement.

Sure. What I'm looking at is just to have a postmaster executable which tolerates several block sizes, but they must be set & chosen when initdb-ing anyway.

> I admit I'm not sure where's the breakeven point, i.e. what's the loss
> we're willing to tolerate. It might be pretty small.

Minimal performance impact wrt the current version, got that!

--
Fabien.
Re: [HACKERS] Remove comment about manually flipping attnotnull
Andres Freund writes: > catalogs.sgml has this comment: >This represents a not-null constraint. It is possible to >change this column to enable or disable the constraint. > Someone on irc just took that as "permission" to do so manually... Did anything especially bad happen? Obviously, turning it on wouldn't have verified that the column contains no nulls, but hopefully anyone who's bright enough to do manual catalog updates would realize that. I think things would generally have worked otherwise; in particular, sinval signalling would have happened (IIRC). > That comment has been there since 1efd7330c/2000-11-29, in the > initial commit adding catalogs.sgml. Does somebody object to just > removing the second part of the comment in all branches? Seems like a reasonable idea, in view of the fact that we're thinking about adding pg_constraint entries for NOT NULL, in which case frobbing attnotnull by itself would definitely be a bad thing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Remove comment about manually flipping attnotnull
Hi, catalogs.sgml has this comment: This represents a not-null constraint. It is possible to change this column to enable or disable the constraint. Someone on irc just took that as "permission" to do so manually... That comment has been there since 1efd7330c/2000-11-29, in the initial commit adding catalogs.sgml. Does somebody object to just removing the second part of the comment in all branches? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
Fabien COELHO wrote: Hi, > Note that I was more asking about the desirability of the feature, > the implementation is another, although also relevant, issue. To me > it is really desirable given the potential performance impact, but > maybe we should not care about 10%? 10% performance improvement sounds good, no doubt. What will happen to performance for people with the same block size? I mean, if you run a comparison of current HEAD vs. patched with identical BLCKSZ, is there a decrease in performance? I expect there will be some, although I'm not sure to what extent. People who pg_upgrade for example will be stuck with whatever blcksz they had on the original installation and so will be unable to benefit from this improvement. I admit I'm not sure where the break-even point is, i.e. what loss we're willing to tolerate. It might be pretty small. > About your point: if we really have to do without dynamic stack > allocation (C99 is only 15, not ripe for adult use yet, maybe when > it turns 18 or 21, depending on the state :-), a possible workaround > would be to allocate a larger area with some MAX_BLCKSZ with an ifdef > for compilers that really would not support dynamic stack > allocation. Moreover, it might be possible to hide it more or less > cleanly in a macro. Maybe we could try to use dynamic stack allocation on compilers that support it, and use your MAX_BLCKSZ idea on the rest. Of course, finding all problematic code sites might prove difficult. I pointed out the one case I'm familiar with because of working with similar code recently. > I had to put "-pedantic -Werror" to manage to > get an error on dynamic stack allocation with "gcc -std=c89". Yeah, I guess in practice it will work everywhere except very old dinosaurs and Windows. But see a thread elsewhere about supporting VAXen; we don't appear to be prepared to drop support for dinosaurs just yet. 
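The fallback being discussed — a C99 variable-length array on compilers that support it, a MAX_BLCKSZ-sized buffer on the rest, hidden behind a macro — could look roughly like the sketch below. All names here (MAX_BLCKSZ's value, DECLARE_PAGE_BUFFER, copy_page) are invented for illustration; nothing like this exists in the tree.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical compile-time cap; a real build would get this from pg_config.h. */
#define MAX_BLCKSZ 32768

/*
 * On compilers with C99 VLAs, size the buffer to the runtime block size;
 * elsewhere fall back to a fixed maximum-size buffer.  Either way the
 * allocation stays on the stack, so no palloc inside critical sections.
 */
#if defined(__GNUC__) && !defined(_MSC_VER)
#define DECLARE_PAGE_BUFFER(name, blcksz)	char name[blcksz]
#else
#define DECLARE_PAGE_BUFFER(name, blcksz)	char name[MAX_BLCKSZ]
#endif

static int
copy_page(const char *src, int blcksz, char *out)
{
	DECLARE_PAGE_BUFFER(buf, blcksz);	/* stack allocation, never palloc */

	memcpy(buf, src, blcksz);
	memcpy(out, buf, blcksz);
	return blcksz;
}
```

The non-VLA branch is what makes the "block size" setting effectively a "max block size": the stack cost is always MAX_BLCKSZ even when the configured size is smaller.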
-- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Some bogus results from prairiedog
Robert Haas writes: > On Tue, Jul 22, 2014 at 8:14 PM, Tom Lane wrote: >> and it sure looks to me like that >> "ConflictsWithRelationFastPath(&lock->tag" is looking at the tag of the >> shared-memory lock object you just released. If someone else had managed >> to recycle that locktable entry for some other purpose, the >> ConflictsWithRelationFastPath call might incorrectly return true. >> >> I think s/&lock->tag/locktag/ would fix it, but maybe I'm missing >> something. > ...this is probably the real cause of the failures we've actually been > seeing. I'll go back-patch that change. For the archives' sake --- I took another look at the two previous instances we'd seen of FastPathStrongRelationLocks assertion failures: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-03-25%2003%3A15%3A03 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2014-04-06%2017%3A04%3A00 Neither of these tripped at the bug site in LockRefindAndRelease, but that's not so surprising, because there was no Assert there at the time. An occurrence of the bug would have silently left a negative entry in the FastPathStrongRelationLocks->count[] array, which would have triggered an assertion by the next process unlucky enough to use the same array entry. In the prairiedog case, the assert happened in a test running concurrently with the prepared_xacts test (which is presumably where the bug occurred), so that this seems a highly plausible explanation. In the rover_firefly case, the assert happened quite a bit later than prepared_xacts; but it might've been just luck that that particular array entry (out of 1024) wasn't re-used for awhile. So it's not certain that this one bug explains all three cases, but it seems well within the bounds of plausibility that it does. 
Also, the narrow width of the race condition window (from LWLockRelease(partitionLock) to an inlined test in the next line) is consistent with the very low failure rate we've seen in the buildfarm. BTW, I wonder whether it would be interesting for testing purposes to have a build option that causes SpinLockRelease and LWLockRelease to do a sched_yield before returning. One could assume that that would greatly increase the odds of detecting this type of bug. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 docs current as of
On Thu, Jul 24, 2014 at 5:17 PM, Tom Lane wrote: > Magnus Hagander writes: >> 9.4 beta docs are listed as "Current as of 2014-05-10". >> I'm assuming that's just a step we missed in the version stamping? > > No, what that means is that nobody has looked at the commit logs since > then to see if the 9.4 release notes need any updates. Since we don't > release-note simple bug fixes in a branch before the .0 release, > 9.4 isn't yet getting the same notes as the back branches; it requires > a scan of the commit logs with different criteria, ie look for feature > changes. And I didn't do that over the weekend (I barely got the > back-branch notes done :-(). Ah. I just did a "git log" on it and saw there were a number of updates in the relnotes themselves, didn't reflect on the fact that nobody had checked them against *other* updates to the tree. > It will get done at least once before 9.4.0, but I suspect that any > changes as a result of that will be pretty minor, so I'm not terribly > upset that it didn't happen for beta2. Nope, I agree. I just thought it meant something else... -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] gaussian distribution pgbench -- splits v4
Hi, Thank you for your great documentation and fix work! It is very helpful for understanding our feature. I added two features in gauss_B_4.patch. 1) Add gaussianProbability() function It is the same as exponentialProbability(), and the feature itself is the same as before. 2) Add result of "max/min percent of the range" It is almost the same as the --exponential option's result. However, the max percent of the range is at the center of the distribution, and the min percent of the range is at the far edge of the distribution. Here is the output example:
+ pgbench_account's aid selected with a truncated gaussian distribution
+ standard deviation threshold: 5.0
+ decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%
+ probability of max/min percent of the range: 4.0% 0.0%
And I added an explanation of this in the documentation. I really appreciate your work! Best regards, -- Mitsumasa KONDO gauss_B_5.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
On 2014-07-24 11:17:15 -0400, Robert Haas wrote: > I think you might be approaching this problem from the wrong end, > though. Yep. > The question in my mind is: why does the > StartTransactionCommand() / CommitTransactionCommand() pair in > ProcessCatchupEvent() end up writing a commit record? The obvious > possibility that occurs to me is that maybe rereading the invalidated > catalog entries causes a HOT prune, and maybe there ought to be some > way for a transaction that has only done HOT pruning to commit > asynchronously, just as we already do for transactions that only > modify temporary tables. Or, failing that, maybe there's a way to > suppress synchronous commit for this particular transaction. I think we should do what the first paragraph in http://archives.postgresql.org/message-id/20140707155113.GB1136%40alap3.anarazel.de outlined. As Tom says somewhere downthread that requires some code review, but other than that it should get rid of a fair amount of problems. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.4 docs current as of
Magnus Hagander writes: > 9.4 beta docs are listed as "Current as of 2014-05-10". > I'm assuming that's just a step we missed in the version stamping? No, what that means is that nobody has looked at the commit logs since then to see if the 9.4 release notes need any updates. Since we don't release-note simple bug fixes in a branch before the .0 release, 9.4 isn't yet getting the same notes as the back branches; it requires a scan of the commit logs with different criteria, ie look for feature changes. And I didn't do that over the weekend (I barely got the back-branch notes done :-(). It will get done at least once before 9.4.0, but I suspect that any changes as a result of that will be pretty minor, so I'm not terribly upset that it didn't happen for beta2. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations
On Wed, Jul 23, 2014 at 12:17 PM, MauMau wrote: > From: "Tom Lane" >> This seems like a pretty unsafe suggestion, because the smgr level doesn't >> know or care whether relations are temp; files are files. In any case it >> would only paper over one specific instance of whatever problem you're >> worried about, because sinval messages definitely do need to be sent in >> general. > > I'm sorry I don't show the exact problem yet. Apart from that, I understood > that you insist it's not appropriate for smgr to be aware of whether the > file is a temporary relation, in terms of module layering. However, it > doesn't seem so in the current implementation. md.c, which is a layer under > or part of smgr, uses SmgrIsTemp(). In addition, as the name suggests, > SmgrIsTemp() is a function of smgr, which is defined in smgr.h. So, it's > not inappropriate for smgr to use it. > > What I wanted to ask is whether and why sinval messages are really necessary > for session-private objects like temp relations. I thought shared inval is, > as the name indicates, for objects "shared" among sessions. If so, sinval > for session-private objects doesn't seem to match the concept. I think the problem here is that it actually is possible for one session to access the temporary objects of another session: rhaas=# create temp table fructivore (a int); CREATE TABLE rhaas=# select 'fructivore'::regclass::oid; oid --- 24578 (1 row) Switch windows: rhaas=# select 24578::regclass; regclass -- pg_temp_2.fructivore (1 row) rhaas=# alter table pg_temp_2.fructivore add column b int; ALTER TABLE Now, we could prohibit that specific thing. But at the very least, it has to be possible for one session to drop another session's temporary objects, because autovacuum does it eventually, and superusers will want to do it sooner to shut autovacuum up. So it's hard to reason about whether and to what extent it's safe to not send sinval messages for temporary objects. 
I think you might be approaching this problem from the wrong end, though. The question in my mind is: why does the StartTransactionCommand() / CommitTransactionCommand() pair in ProcessCatchupEvent() end up writing a commit record? The obvious possibility that occurs to me is that maybe rereading the invalidated catalog entries causes a HOT prune, and maybe there ought to be some way for a transaction that has only done HOT pruning to commit asynchronously, just as we already do for transactions that only modify temporary tables. Or, failing that, maybe there's a way to suppress synchronous commit for this particular transaction. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
Resent: previous message was stalled because of a bad "From:". ISTM that a desirable and reasonably simple to implement feature would be to be able to set the blocksize at "initdb" time, and "postgres" could use the value found in the database instead of a compile-time one. I think you will find it more difficult to implement than it seems at first. [...] There's a performance argument here as well. Static allocation is likely faster than palloc, and there are likely many other places where having things like BLCKSZ or MaxIndexTuplesPerPage as compile-time constants saves a few cycles. A 10% speedup is nice, but I wouldn't want to pay 1% for everybody to get back 10% for people who are willing to fiddle with the block size. Yes, I agree that it would not make much sense to have such a feature with a significant performance penalty for most people. From what I have seen, ISTM that palloc can be avoided altogether, either with dynamic stack allocation when supported (that is in most cases?), or maybe in some cases by allocating a larger safe area. In that case, the "block size" setting would be a "max block size", and all valid block sizes below it (eg for 8 kB: 1, 2, 4 and 8 kB) would be allowed. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checkpointer crashes on slave in 9.4 on windows
On Thu, Jul 24, 2014 at 5:38 AM, Amit Kapila wrote: > Revised patch initialize the WALInsertLocks and call > LWLockRegisterTranche() in XLOGShmemInit() which makes their > initialization similar to what we do at other places. OK, that seems all right. Will commit and back-patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] parametric block size?
On Tue, Jul 22, 2014 at 1:22 PM, Alvaro Herrera wrote: > Fabien wrote: >> ISTM that a desirable and reasonably simple to implement feature >> would be to be able to set the blocksize at "initdb" time, and >> "postgres" could use the value found in the database instead of a >> compile-time one. > > I think you will find it more difficult to implement than it seems at > first. For one thing, there are several macros that depend on the block > size and the algorithms involved cannot work with dynamic sizes; > consider MaxIndexTuplesPerPage which is used in PageIndexMultiDelete() > for instance. That value is used to allocate an array in the stack, > but that doesn't work if the array size is dynamic. (Actually it works > almost everywhere, but that feature is not in C89 and thus it fails on > Windows). That shouldn't be a problem, you say, just palloc() the array > -- except that that function is called within critical sections in some > places (e.g. _bt_delitems_vacuum) and you cannot use palloc there. There's a performance argument here as well. Static allocation is likely faster than palloc, and there are likely many other places where having things like BLCKSZ or MaxIndexTuplesPerPage as compile-time constants saves a few cycles. A 10% speedup is nice, but I wouldn't want to pay 1% for everybody to get back 10% for people who are willing to fiddle with the block size. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Use unique index for longer pathkeys.
On Tue, Jul 22, 2014 at 5:19 AM, Kyotaro HORIGUCHI wrote: > # By the way, this style of calling a person is quite common > # among Japanese since the first-name basis implies very close > # relationship or it frequently conveys offensive shade. But I'm > # not sure what should I call others who're not Japases native. Typical usage on this mailing list is to refer to individuals by first name (e.g. Tom, Alvaro, Heikki) or by full name (e.g. Tom Lane, Alvaro Herrera, Heikki Linnakangas). The last name is typically included only when it might otherwise be confusing (for example, you will rarely see people use just "Greg" because there are several people by that name who are reasonably active, but Heikki's last name is almost never mentioned). That having been said, if you want to use the Japanese style, I don't think anyone here will mind. I am personally not totally familiar with it, so please forgive me in turn if I should address you or anyone else in a manner that would be considered too familiar in another culture. I try not to do that, but I might make some mistakes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] shm_mq bug
In testing some new code I'm working on that uses shm_mq to do Cool Stuff (TM), I discovered that my code was mysteriously failing assertions when I hit ^C in the middle of the test. This led me on a lengthy, mostly-misguided hunt for the culprit. I eventually discovered that the problem wasn't in my new code at all, but was rather an oversight in the shm_mq stuff I previously committed. So, I intend to commit and back-patch the attached fix shortly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

commit 42900c00c831607ec435228e346ccbb8617be521
Author: Robert Haas
Date:   Wed Jul 23 16:29:40 2014 -0400

    Prevent shm_mq_send from reading uninitialized memory.

    shm_mq_send_bytes didn't invariably initialize *bytes_written before
    returning, which would cause shm_mq_send to read from uninitialized
    memory and add the value it found there to mqh->mqh_partial_bytes.
    This could cause the next attempt to send a message via the queue to
    fail an assertion (if the queue was detached) or copy data from a
    garbage pointer value into the queue (if non-blocking mode was in use).

diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 6f9c3a3..d96627a 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -676,7 +676,10 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
 		/* Bail out if the queue has been detached. */
 		if (detached)
+		{
+			*bytes_written = sent;
 			return SHM_MQ_DETACHED;
+		}
 
 		if (available == 0)
 		{
@@ -691,12 +694,16 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, void *data, bool nowait,
 			if (nowait)
 			{
 				if (shm_mq_get_receiver(mq) == NULL)
+				{
+					*bytes_written = sent;
 					return SHM_MQ_WOULD_BLOCK;
+				}
 			}
 			else if (!shm_mq_wait_internal(mq, &mq->mq_receiver, mqh->mqh_handle))
 			{
 				mq->mq_detached = true;
+				*bytes_written = sent;
 				return SHM_MQ_DETACHED;
 			}
 			mqh->mqh_counterparty_attached = true;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
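The bug class the patch fixes is worth restating in miniature: an out-parameter must be assigned on *every* return path, including early returns, or callers read uninitialized memory. A toy sketch with invented names (this is not the shm_mq code itself):

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SEND_OK, SEND_DETACHED } send_result;

/*
 * Simulated byte-at-a-time send loop.  The essential point: the
 * *bytes_written out-parameter is set before BOTH return statements,
 * so the caller can safely accumulate partial progress either way.
 */
static send_result
send_bytes(int detached, size_t to_send, size_t *bytes_written)
{
	size_t		sent = 0;

	while (sent < to_send)
	{
		if (detached)
		{
			*bytes_written = sent;	/* the fix: report progress here too */
			return SEND_DETACHED;
		}
		sent++;					/* pretend one byte went out */
	}
	*bytes_written = sent;
	return SEND_OK;
}
```

Dropping the assignment in the detached branch reproduces the original bug: the caller adds whatever garbage happens to be in *bytes_written to its running total.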
Re: [HACKERS] Some bogus results from prairiedog
On Tue, Jul 22, 2014 at 8:14 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Jul 22, 2014 at 12:24 AM, Tom Lane wrote: >>> Anyway, to cut to the chase, the crash seems to be from this: >>> TRAP: FailedAssertion("!(FastPathStrongRelationLocks->count[fasthashcode] > >>> 0)", File: "lock.c", Line: 2957) >>> So there is still something rotten in the fastpath lock logic. > >> Gosh, that sucks. > >> The inconstancy of this problem would seem to suggest some kind of >> locking bug rather than a flat-out concurrency issue, but it looks to >> me like everything relevant is marked volatile. > > I don't think that you need any big assumptions about machine-specific > coding issues to spot the problem. I don't think that I'm making what could be described as big assumptions; I think we should fix and back-patch the PPC64 spinlock change. But...

> The assert in question is here:
>
> 	/*
> 	 * Decrement strong lock count.  This logic is needed only for 2PC.
> 	 */
> 	if (decrement_strong_lock_count
> 		&& ConflictsWithRelationFastPath(&lock->tag, lockmode))
> 	{
> 		uint32		fasthashcode = FastPathStrongLockHashPartition(hashcode);
>
> 		SpinLockAcquire(&FastPathStrongRelationLocks->mutex);
> 		Assert(FastPathStrongRelationLocks->count[fasthashcode] > 0);
> 		FastPathStrongRelationLocks->count[fasthashcode]--;
> 		SpinLockRelease(&FastPathStrongRelationLocks->mutex);
> 	}
>
> and it sure looks to me like that "ConflictsWithRelationFastPath(&lock->tag" is looking at the tag of the shared-memory lock object you just released. If someone else had managed to recycle that locktable entry for some other purpose, the ConflictsWithRelationFastPath call might incorrectly return true.
>
> I think s/&lock->tag/locktag/ would fix it, but maybe I'm missing something.

...this is probably the real cause of the failures we've actually been seeing. I'll go back-patch that change. 
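The rule Tom's one-line fix enforces can be sketched with toy types (all names below are invented; this is not the real lock.c code): anything you still need from a shared-memory entry after releasing the lock that protects it must be copied into local storage first, because the entry can be recycled the instant the lock is gone.

```c
#include <assert.h>

typedef struct { int relid; int mode; } ToyLockTag;

static ToyLockTag shared_entry;		/* stands in for the shared LOCK object */

static void
release_partition_lock(void)
{
	/* Simulate another backend recycling the entry right after release. */
	shared_entry.relid = -1;
	shared_entry.mode = -1;
}

static int
conflicts_with_fast_path(const ToyLockTag *tag, int lockmode)
{
	return tag->relid == 42 && lockmode == 7;
}

static int
release_and_check(int lockmode)
{
	ToyLockTag	locktag = shared_entry;	/* snapshot BEFORE releasing */

	release_partition_lock();			/* entry may now be reused */

	/* Safe: consults the local copy, not the (recycled) shared entry. */
	return conflicts_with_fast_path(&locktag, lockmode);
}
```

Checking `&shared_entry` instead of the local `locktag` after the release is exactly the narrow race Tom describes; with the snapshot, the window disappears.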
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exporting Table-Specified BLOBs Only?
The reason this is needed from the export/dump side is that the database can become huge due to the number of datasheets added to the database. These datasheets are not necessary to troubleshoot the setup and, sometimes, the datasheets are secure-sensitive. In either case, they're not necessary when we need a copy of the customer's database to troubleshoot and they make the transport and import of the database horribly time consuming. Thanks for the response. Hopefully this can be addressed one day. Cheers. -Original Message- From: Robert Haas [mailto:robertmh...@gmail.com] Sent: Thursday, July 24, 2014 7:31 AM To: Braunstein, Alan Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Exporting Table-Specified BLOBs Only? On Mon, Jul 21, 2014 at 2:14 PM, Braunstein, Alan wrote: > What do I need? > > A method of using pg_dump to selectively export BLOBs with OID’s used > in the tables specified with --table --table > Hmm. If you take a full backup using pg_dump -Fc, you can then use pg_restore -l and pg_restore -L to find and selectively restore whatever objects you want; e.g. restore the tables first, then fetch the list of OIDs from the relevant columns and restore those particular blobs. But I don't think we've got a tool built into core for doing this kind of filtering on the dump side. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Production block comparison facility
On Thu, Jul 24, 2014 at 8:36 PM, Andres Freund wrote: > On 2014-07-24 20:35:04 +0900, Michael Paquier wrote: >> - FPW/page consistency check is done after converting them to hex. >> This is done only this way to facilitate viewing the page diffs with a >> debugger. A best method would be to perform the checks using >> MASK_MARKER (which should be moved to bufmask.h btw). It may be better >> to put all this hex magic within a WAL_DEBUG ifdef. > > Can't you just do "p/x whatever" in the debugger to display things in > hex? Well yes :p -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Production block comparison facility
On 2014-07-24 20:35:04 +0900, Michael Paquier wrote: > - FPW/page consistency check is done after converting them to hex. > This is done only this way to facilitate viewing the page diffs with a > debugger. A best method would be to perform the checks using > MASK_MARKER (which should be moved to bufmask.h btw). It may be better > to put all this hex magic within a WAL_DEBUG ifdef. Can't you just do "p/x whatever" in the debugger to display things in hex? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Production block comparison facility
On Thu, Jul 24, 2014 at 12:35 AM, Simon Riggs wrote: > On 23 July 2014 15:14, Michael Paquier wrote: > >> I have spent some time digging more into this idea and finished with the >> patch attached > > Thank you for investigating the idea. I'll review by Monday. OK, thanks. Here are a couple of things that are not really necessary for the feature but that I did to facilitate testing and review of the patch: - Some information is logged to the user as DEBUG1 even if the current page and FPW are consistent. It may be better removed. - The FPW/page consistency check is done after converting them to hex. This is done this way only to facilitate viewing the page diffs with a debugger. A better method would be to perform the checks using MASK_MARKER (which should be moved to bufmask.h btw). It may be better to put all this hex magic within a WAL_DEBUG ifdef. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
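The masking approach mentioned above could be sketched like this (standalone toy code; the MASK_MARKER value and the offsets are assumptions for illustration, not bufmask.c itself): bytes that may legitimately differ between a replayed page and its full-page image — the page LSN, hint bits — are overwritten with a marker on both copies before comparing, so they can never cause a false mismatch.

```c
#include <assert.h>
#include <string.h>

#define MASK_MARKER 0x40	/* assumed filler byte */

/* Overwrite a byte range that is allowed to differ between the copies. */
static void
mask_range(char *page, size_t off, size_t len)
{
	memset(page + off, MASK_MARKER, len);
}

/*
 * Compare two page images after masking the "don't care" region in both.
 * Note this mutates the caller's buffers; a real implementation would
 * work on scratch copies.
 */
static int
pages_consistent(char *a, char *b, size_t pagesz,
				 size_t dontcare_off, size_t dontcare_len)
{
	mask_range(a, dontcare_off, dontcare_len);
	mask_range(b, dontcare_off, dontcare_len);
	return memcmp(a, b, pagesz) == 0;
}
```

Compared with the hex-conversion approach, this keeps the comparison a plain memcmp, and any remaining difference is a genuine inconsistency.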
Re: [HACKERS] Exporting Table-Specified BLOBs Only?
On Mon, Jul 21, 2014 at 2:14 PM, Braunstein, Alan wrote: > What do I need? > > A method of using pg_dump to selectively export BLOBs with OID’s used in the > tables specified with --table --table Hmm. If you take a full backup using pg_dump -Fc, you can then use pg_restore -l and pg_restore -L to find and selectively restore whatever objects you want; e.g. restore the tables first, then fetch the list of OIDs from the relevant columns and restore those particular blobs. But I don't think we've got a tool built into core for doing this kind of filtering on the dump side. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 9.4 docs current as of
9.4 beta docs are listed as "Current as of 2014-05-10". I'm assuming that's just a step we missed in the version stamping? Needs to go on a checklist? Should we backpatch a fix for that? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Stating the significance of Lehman & Yao in the nbtree README
On Thu, Jul 24, 2014 at 9:28 AM, Peter Geoghegan wrote: > On Wed, Jul 23, 2014 at 8:52 PM, Amit Kapila wrote: > > As such there is no problem in saying the way you have mentioned, but > > I feel it would be better if we can mention the mechanism of _bt_search() > > as quoted by you upthread in the first line. > > "> In more concrete terms, _bt_search() releases and only then acquires > >> read locks during a descent of the tree (by calling > >> _bt_relandgetbuf()), and, perhaps counterintuitively, that's just > >> fine." > > I guess I could say that too. Okay. > > One more point, why you think it is important to add this new text > > on top? I think adding new text after "Lehman and Yao don't require read > > locks, .." paragraph is okay. > > I've added it to the top because it's really the most important point > on Lehman and Yao. It's the _whole_ point. Consider how it's > introduced here, for example: > http://db.cs.berkeley.edu/jmh/cs262b/treeCCR.html > > Why should I "bury the lead"? I think even if you want to keep it at the top, maybe we could have another heading like: Concurrency Considerations with the Lehman & Yao Approach. However, I think we can leave this point for the committer to decide. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Checkpointer crashes on slave in 9.4 on windows
On Thu, Jul 24, 2014 at 12:14 AM, Robert Haas wrote: > On Mon, Jul 21, 2014 at 4:16 AM, Amit Kapila wrote: > > So, this problem was introduced by Heikki's commit, 68a2e52bbaf98f136a96b3a0d734ca52ca440a95, to replace XLogInsert slots with regular LWLocks. I think the problem here is that the initialization code here really doesn't belong in InitXLOGAccess at all: > > 1. I think WALInsertLocks is just another global variable that needs > to be saved and restored in EXEC_BACKEND mode and that it therefore > ought to participate in the save_backend_variables() mechanism instead > of having its own special-purpose mechanism to save and restore the > value. It seems to me that we don't need to save/restore variables that point to shared memory which we initialize during process startup. We initialize shared memory during each process start in the call below: SubPostmasterMain() { .. .. CreateSharedMemoryAndSemaphores(false, 0); } A few other examples of similar variables: MultiXactShmemInit() { .. OldestMemberMXactId = MultiXactState->perBackendXactIds; OldestVisibleMXactId = OldestMemberMXactId + MaxOldestSlot; } CreateSharedProcArray() { .. allProcs = ProcGlobal->allProcs; } However, I think it is better to initialize WALInsertLocks in XLOGShmemInit(), as we do for other cases and as you suggested in point 2. > 2. And I think that the LWLockRegisterTranche call belongs in > XLOGShmemInit(), so that it's parallel to the other call in > CreateLWLocks. Agreed. The revised patch initializes WALInsertLocks and calls LWLockRegisterTranche() in XLOGShmemInit(), which makes their initialization similar to what we do at other places. With Regards, Amit Kapila. 
EnterpriseDB: http://www.enterprisedb.com fix_checkpointer_crash_on_slave_v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW
Hi Fujita-san,

My coworker Takaaki EITOKU reviewed the patch, and here are some comments from him.

2014-07-08 16:07 GMT+09:00 Etsuro Fujita : ...

> In the patch postgresPlanForeignModify has been modified so that if, in
> addition to the above condition, the followings are satisfied, then the
> ForeignScan and ModifyTable node will work that way.
>
> - There are no local BEFORE/AFTER triggers.

The reason the check ignores INSTEAD OF triggers here is that an INSTEAD OF trigger would prevent executing the UPDATE/DELETE statement against a foreign table at all, right?

> - In UPDATE it's safe to evaluate expressions to assign to the target
> columns on the remote server.

IIUC, in addition to the target expressions, the whole WHERE clause should be safe to push down. Though that is obviously required, mentioning it in the documentation and source comments would help users and developers.

>
> Here is a simple performance test.
>
> On remote side:
> postgres=# create table t (id serial primary key, inserted timestamp
> default clock_timestamp(), data text);
> CREATE TABLE
> postgres=# insert into t(data) select random() from generate_series(0,
> 9);
> INSERT 0 10
> postgres=# vacuum t;
> VACUUM
>
> On local side:
> postgres=# create foreign table ft (id integer, inserted timestamp, data
> text) server myserver options (table_name 't');
> CREATE FOREIGN TABLE
>
> Unpatched:
> postgres=# explain analyze verbose delete from ft where id < 1;
> QUERY PLAN
> ---
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=1275.255..1275.255 rows=0 loops=1)
>    Remote SQL: DELETE FROM public.t WHERE ctid = $1
>    -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=1.180..52.095 rows= loops=1)
>       Output: ctid
>       Remote SQL: SELECT ctid FROM public.t WHERE ((id < 1)) FOR
> UPDATE
> Planning time: 0.112 ms
> Execution time: 1275.733 ms
> (7 rows)
>
> Patched (Note that the DELETE command has been pushed down.):
> postgres=# explain analyze verbose delete
from ft where id < 1;
> QUERY PLAN
> ---
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=0.006..0.006 rows=0 loops=1)
>    -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=0.001..0.001 rows=0 loops=1)
>       Output: ctid
>       Remote SQL: DELETE FROM public.t WHERE ((id < 1))
> Planning time: 0.101 ms
> Execution time: 8.808 ms
> (6 rows)

We found that this patch speeds up the DELETE case remarkably, as you describe above, but we saw less than a 2x speed-up in UPDATE cases. Do you have any numbers for UPDATE cases?

Some more random thoughts:

* Naming of the new behavior
You named this optimization "Direct Update", but I'm not sure that this is intuitive enough to express this behavior. I would like to hear the opinions of native speakers.

* Macros for # of fdw_private elements
In postgres_fdw.c you defined MaxFdwScanFdwPrivateLength and MinFdwScanFdwPrivateLength for determining the mode, but the number of fdw_private elements is not a ranged value but an enum-like value (I mean that fdw_private holds 3 or 7, not 4~6), so Max and Min seem inappropriate as prefixes. IMO those macros should be named so that the names represent the behavior, or define a macro to determine the mode, say IS_SHIPPABLE(foo).

* Comparison of macros
Comparisons against MaxFdwScanFdwPrivateLength and MinFdwScanFdwPrivateLength should use == instead of >= or <= to detect unexpected values. Adding an assert macro seems good too.

-- Shigeru HANADA
Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max
Hi Haribabu,

Sorry for being late. Thank you for sharing the updated patch. The sgml changes seem not to be working, i.e.:

> postgres=# select max('192.168.1.5', '192.168.1.4');
> ERROR: function max(unknown, unknown) does not exist
> LINE 1: select max('192.168.1.5', '192.168.1.4');
>        ^
> HINT: No function matches the given name and argument types. You might
> need to add explicit type casts.
> postgres=# select min('192.168.1.5', '192.168.1.4');
> ERROR: function min(unknown, unknown) does not exist
> LINE 1: select min('192.168.1.5', '192.168.1.4');
>        ^
> HINT: No function matches the given name and argument types. You might
> need to add explicit type casts.

I would suggest the following or similar changes, e.g. in doc/src/sgml/func.sgml:

> max(<replaceable class="parameter">expression</replaceable>)
> - any array, numeric, string, or date/time type
> + any inet, array, numeric, string, or date/time type
>   same as argument type
>   maximum value of
> @@ -12204,7 +12228,7 @@ NULL baz(3 rows)
> min(<replaceable class="parameter">expression</replaceable>)
> - any array, numeric, string, or date/time type
> + any inet, array, numeric, string, or date/time type
>   same as argument type
>   minimum value of

wrote:
> On Mon, Jul 7, 2014 at 6:59 PM, Asif Naeem wrote:
> > Hi Haribabu,
> >
> > Thank you for sharing the patch. I have spent some time to review the
> > changes. Overall patch looks good to me, make check and manual testing seems
> > run fine with it. There seems no related doc/sgml changes ?. Patch added
> > network_smaller() and network_greater() functions but in PG source code,
> > general practice seems to be to use "smaller" and "larger" as related
> > function name postfix e.g. timestamp_smaller()/timestamp_larger(),
> > interval_smaller/interval_larger(), cashsmaller()/cashlarger() etc.
> Thanks.
>
> Thanks for reviewing the patch.
>
> I corrected the function names as smaller and larger.
> and also added documentation changes.
>
> Updated patch attached in the mail.
>
> Regards,
> Hari Babu
> Fujitsu Australia