Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Thu, Apr 3, 2014 at 11:44 PM, Tom Lane wrote:
> Personally, I have paid no attention to this thread and have no intention
> of doing so before feature freeze.

Anything that you missed was likely musings on how to further generalize
SortSupport. The actual additions to SortSupport and tuplesort proposed are
rather small: a simple abstraction to allow for semi-reliable normalized
keys, and a text sort support function to use it.

> Perhaps I shouldn't lay my own guilt trip on other committers --- but
> I think it would be a bad precedent to not deal with the existing patch
> queue first.

I think that that's a matter of personal priorities for committers. I am
not in a position to tell anyone what their priorities ought to be, and
giving extra attention to this patch may be unfair. It doesn't have to be a
zero-sum game, though. Attention from a committer to, say, this patch does
not necessarily have to come at the expense of another, if for example this
patch piques somebody's interest and causes them to put extra work into it
on top of what they'd already planned to look at. Again, under these
somewhat unusual circumstances, that seems like something that some
committer might be inclined to do, without it being altogether
unreasonable. Then again, perhaps not.
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] four minor proposals for 9.5
On Tue, Apr 1, 2014 at 11:42 PM, Pavel Stehule wrote:
> 2014-03-27 17:56 GMT+01:00 Pavel Stehule:
>> So I'll prepare some prototypes in the next month for
>>
>> 1. logging the execution time of cancelled queries,
>> 2. tracking query lock time
>
> When I thought about this proposal, I found that our implementation is
> more or less a hack that works well in GoodData, but can be inconsistent
> with native PostgreSQL. So in this proposal I plan something different
> from what we use, but I hope it is more consistent with upstream.
>
> What I miss is the execution time of cancelled queries. The same can be
> interesting for queries that were stopped for any reason -- temp file
> limits can stop queries after 5 minutes, out of memory, etc.
>
> It is not hard to implement printing the duration for cancelled queries,
> but it is impossible to do it for every kind of exception. But there is a
> way: we can use the "log line prefix" space. Currently there is no
> possibility to print the duration, so we could introduce a new symbol %D
> for duration.

Does this mean that for any exception/error it will print the duration if
%D is used, not only for cancelled queries, or are you planning it just for
some specific logs, such as when a query is cancelled?

One more thing: currently we can also find this out in a crude way
(pg_stat_activity has the query_start time and log_line_prefix has the end
time), but I think having it in one place in the log will be more useful.

> The same technique I would use for printing lock time -- it could be
> printed via the symbol %L.

Above you mentioned that you are planning to have three different lock
times (table, tuple and others), so will this one symbol (%L) cover all
three lock times? Are you planning to add new log messages for this, or to
use the existing infrastructure?

One general point: won't it be a bit difficult to parse a log line having
so many times? Should we consider logging each time with a special marker
(for example: tup_lock_wait_time:1000ms)?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
[HACKERS] A question about code in DefineRelation()
If I understand correctly, foreign tables cannot have an OID column, but
the following code in DefineRelation() assumes that foreign tables *can*
have that column:

    /*
     * Create a tuple descriptor from the relation schema.  Note that this
     * deals with column names, types, and NOT NULL constraints, but not
     * default values or CHECK constraints; we handle those below.
     */
    descriptor = BuildDescForRelation(schema);

    localHasOids = interpretOidsOption(stmt->options,
                                       (relkind == RELKIND_RELATION ||
                                        relkind == RELKIND_FOREIGN_TABLE));

Is this intended to support an OID column on foreign tables in the future?

Thanks,

Best regards,
Etsuro Fujita
Re: [HACKERS] Securing "make check" (CVE-2014-0067)
On Fri, Apr 04, 2014 at 02:36:05AM +, YAMAMOTO Takashi wrote:
> > Thanks. To avoid socket path length limitations, I lean toward placing the
> > socket temporary directory under /tmp rather than placing under the CWD:
> >
> > http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com
>
> openvswitch has some tricks to overcome the socket path length
> limitation using symlink. (or procfs where available)
> iirc these were introduced for debian builds which use deep CWD.

That's another reasonable approach. Does it have a notable advantage over
placing the socket in a subdirectory of /tmp? Offhand, the security and
compatibility consequences look similar.
-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
Peter Geoghegan writes:
> I think that those are objectively very large reductions in a cost
> that figures prominently in most workloads. Based solely on those
> facts, but also on the fairly low complexity of the patch, it may be
> worth considering committing this before 9.4 goes into feature freeze,

Personally, I have paid no attention to this thread and have no intention
of doing so before feature freeze. There are three dozen patches at
https://commitfest.postgresql.org/action/commitfest_view?id=21
that have moral priority for consideration for 9.4. Not all of them are
going to get in, certainly, and I'm already feeling a lot of guilt about
the small amount of time I've been able to devote to reviewing/committing
patches this cycle. Spending time now on patches that didn't even exist at
the submission deadline feels quite unfair to me.

Perhaps I shouldn't lay my own guilt trip on other committers --- but
I think it would be a bad precedent to not deal with the existing patch
queue first.

			regards, tom lane
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Thu, Apr 3, 2014 at 3:19 PM, Thom Brown wrote:
> Master: 38.450082 / 37.440701 (avg: 37.9453915)
> Patch: 153.321946 / 145.004726 (avg: 149.163336)

I think that those are objectively very large reductions in a cost that
figures prominently in most workloads. Based solely on those facts, but
also on the fairly low complexity of the patch, it may be worth considering
committing this before 9.4 goes into feature freeze, purely as something
that just adds a SortSupport function for the default text opclass, with
more or less the same cases accelerated as with the existing
SortSupport-supplying opclasses. There have been only very modest
expansions to the SortSupport and tuplesort code. I haven't generalized
this to work with other areas where a normalized key could be put to good
use, but I see no reason to block on that. Obviously I've missed the final
commitfest deadline by a wide margin.

I don't suggest this lightly, and it in no small part has a lot to do with
the patch being simple and easy to reason about. The patch could almost
(but not quite) be written as part of a third-party text operator class's
sort support routine. I think that if an individual committer were willing
to commit this at their own discretion before feature freeze, outside of
the formal commitfest process, that would not be an unreasonable thing in
these particular, somewhat unusual circumstances.

I defer entirely to the judgement of others here - this is not an issue
that I feel justified in expressing a strong opinion on, and in fact I
don't have such an opinion anyway. However, by actually looking at the
risks and the benefits here, I think everyone will at least understand why
I'd feel justified in broaching the topic. This is a very simple idea, and
a rather old one at that.
-- 
Peter Geoghegan
Re: [HACKERS] Securing "make check" (CVE-2014-0067)
> Thanks. To avoid socket path length limitations, I lean toward placing the
> socket temporary directory under /tmp rather than placing under the CWD:
>
> http://www.postgresql.org/message-id/flat/20121129223632.ga15...@tornado.leadboat.com

openvswitch has some tricks to overcome the socket path length limitation
using symlinks (or procfs where available). IIRC these were introduced for
Debian builds, which use a deep CWD.

http://git.openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=blob;f=lib/socket-util.c;h=aa0c7196da9926de38b7388b8e28ead12e12913e;hb=HEAD

Look at shorten_name_via_symlink and shorten_name_via_proc.

YAMAMOTO Takashi
Re: [HACKERS] WAL format and API changes (9.5)
On 2014-04-03 19:33:12 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-04-03 19:08:27 -0400, Tom Lane wrote:
> >> A somewhat more relevant concern is where are we going to keep the state
> >> data involved in all this. Since this code is, by definition, going to be
> >> called in critical sections, any solution involving internal pallocs is
> >> right out.
>
> > We actually already allocate memory in XLogInsert() :(, although only in
> > the first XLogInsert() a backend does...
>
> Ouch. I wonder if we should put an Assert(not-in-critical-section)
> into MemoryContextAlloc.

XLogInsert() is using malloc() directly, so that wouldn't detect this
case... It's not a bad idea tho. I wonder how far the regression tests
get... Not even through initdb. GetVirtualXIDsDelayingChkpt() is to blame.

Greetings,

Andres Freund
-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] WAL format and API changes (9.5)
Andres Freund writes:
> On 2014-04-03 19:08:27 -0400, Tom Lane wrote:
>> A somewhat more relevant concern is where are we going to keep the state
>> data involved in all this. Since this code is, by definition, going to be
>> called in critical sections, any solution involving internal pallocs is
>> right out.

> We actually already allocate memory in XLogInsert() :(, although only in
> the first XLogInsert() a backend does...

Ouch. I wonder if we should put an Assert(not-in-critical-section)
into MemoryContextAlloc.

			regards, tom lane
Re: [HACKERS] WAL format and API changes (9.5)
On 2014-04-03 19:08:27 -0400, Tom Lane wrote:
> Robert Haas writes:
> > On Thu, Apr 3, 2014 at 10:14 AM, Heikki Linnakangas wrote:
> >> (Instead of using a new XLogRegisterBuffer() function to register the
> >> buffers, perhaps they should be passed to XLogInsert as a separate list or
> >> array. I'm not wedded on the details...)
>
> > That would have the advantage of avoiding the function-call overhead,
> > which seems appealing.
>
> You're kidding, right? One function call per buffer touched by an update
> is going to be lost in the noise.
>
> A somewhat more relevant concern is where are we going to keep the state
> data involved in all this. Since this code is, by definition, going to be
> called in critical sections, any solution involving internal pallocs is
> right out.

We actually already allocate memory in XLogInsert() :(, although only in
the first XLogInsert() a backend does... I didn't remember it, but I
complained about it before:
http://archives.postgresql.org/message-id/4FEB223A.3060707%40enterprisedb.com
I didn't realize the implications for it back then and thus didn't press
hard for a change, but I think it should be fixed.

> The existing arrangement keeps all its data in local variables
> of the function doing the update, which is probably a nice property to
> preserve if we can manage it. That might force us to do it as above.

We could simply allocate enough scratch space for the state at process
startup. I don't think there'll ever be a need for XLogInsert() to be
reentrant, so that should be fine.

> I'd still like something that *looks* more like a list of function calls,
> though, even if they're macros under the hood. The existing approach to
> setting up the XLogRecData chains is awfully prone to bugs of
> omission.

Agreed. There's some pretty ugly code around this.

Greetings,

Andres Freund
-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] get_fn_expr_variadic considered harmful
I wrote:
> Robert Haas writes:
>> Well, that argues for the choice of trying to make them equivalent
>> again, I suppose, but it sounds like there are some nasty edge cases
>> that won't easily be filed down. I think your idea of redefining
>> funcvariadic to be true only for VARIADIC ANY is probably a promising
>> approach to that solution, but as you say it leaves some problems
>> unsolved.

> I think what we'll have to do is tell complainants to recreate any
> affected indexes or rules after installing 9.3.5. Given the relatively
> small number of complaints, I don't think it's worth working harder,
> nor taking risks like inserting catalog lookups into readfuncs.c.

After some thought, it seems to me that the best solution is to redefine
funcvariadic as meaning "the last actual argument is a variadic array",
which means it will always be true for a VARIADIC non-ANY function. In
HEAD, that leads to a nice patch that actually simplifies the logic in
ruleutils.c, as attached.

In 9.3, we will not be able to assume that funcvariadic has that meaning,
so we'll have to use the existing decompilation logic (which basically
ignores funcvariadic unless it's a VARIADIC ANY function). I've not made
that version of the patch yet but it should be pretty straightforward.

			regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 32c0135..e54d46d 100644
*** a/contrib/postgres_fdw/deparse.c
--- b/contrib/postgres_fdw/deparse.c
*** deparseFuncExpr(FuncExpr *node, deparse_
*** 1548,1562 ****
  	procform = (Form_pg_proc) GETSTRUCT(proctup);
  
  	/* Check if need to print VARIADIC (cf. ruleutils.c) */
! 	if (OidIsValid(procform->provariadic))
! 	{
! 		if (procform->provariadic != ANYOID)
! 			use_variadic = true;
! 		else
! 			use_variadic = node->funcvariadic;
! 	}
! 	else
! 		use_variadic = false;
  
  	/* Print schema name only if it's not pg_catalog */
  	if (procform->pronamespace != PG_CATALOG_NAMESPACE)
--- 1548,1554 ----
  	procform = (Form_pg_proc) GETSTRUCT(proctup);
  
  	/* Check if need to print VARIADIC (cf. ruleutils.c) */
! 	use_variadic = node->funcvariadic;
  
  	/* Print schema name only if it's not pg_catalog */
  	if (procform->pronamespace != PG_CATALOG_NAMESPACE)
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 2b4ade0..941b101 100644
*** a/doc/src/sgml/xfunc.sgml
--- b/doc/src/sgml/xfunc.sgml
*** CREATE OR REPLACE FUNCTION retcomposite(
*** 3152,3160 ****
      is zero based.  get_call_result_type can also be used
      as an alternative to get_fn_expr_rettype.
      There is also get_fn_expr_variadic, which can be used to
!     find out whether the call contained an explicit VARIADIC
!     keyword.  This is primarily useful for VARIADIC "any"
!     functions, as described below.
--- 3152,3161 ----
      is zero based.  get_call_result_type can also be used
      as an alternative to get_fn_expr_rettype.
      There is also get_fn_expr_variadic, which can be used to
!     find out whether variadic arguments have been merged into an array.
!     This is primarily useful for VARIADIC "any" functions,
!     since such merging will always have occurred for variadic functions
!     taking ordinary array types.
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 5934ab0..cc46084 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*** ParseFuncOrColumn(ParseState *pstate, Li
*** 556,561 ****
--- 556,572 ----
  	make_fn_arguments(pstate, fargs, actual_arg_types, declared_arg_types);
  
  	/*
+ 	 * If the function isn't actually variadic, forget any VARIADIC
+ 	 * decoration on the call.  (Perhaps we should throw an error instead,
+ 	 * but historically we've allowed people to write that.)
+ 	 */
+ 	if (!OidIsValid(vatype))
+ 	{
+ 		Assert(nvargs == 0);
+ 		func_variadic = false;
+ 	}
+ 
+ 	/*
  	 * If it's a variadic function call, transform the last nvargs arguments
  	 * into an array --- unless it's an "any" variadic.
  	 */
*** ParseFuncOrColumn(ParseState *pstate, Li
*** 584,589 ****
--- 595,605 ----
  		newa->location = exprLocation((Node *) vargs);
  
  		fargs = lappend(fargs, newa);
+ 
+ 		/* We could not have had VARIADIC marking before ... */
+ 		Assert(!func_variadic);
+ 		/* ... but now, it's a VARIADIC call */
+ 		func_variadic = true;
  	}
  
  	/*
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 566b4c9..b1bac86 100644
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*** static char *get_relation_name(Oid relid
*** 401,407 ****
  static char *generate_relation_name(Oid relid, List *namespaces);
  static char *generate_function_name(Oid funcid, int nargs,
  						List *argnames, Oid *argtypes,
! 						bool was_variadic, bool *use_variadic_p);
  static char *
Re: [HACKERS] WAL format and API changes (9.5)
Robert Haas writes:
> On Thu, Apr 3, 2014 at 10:14 AM, Heikki Linnakangas wrote:
>> (Instead of using a new XLogRegisterBuffer() function to register the
>> buffers, perhaps they should be passed to XLogInsert as a separate list or
>> array. I'm not wedded on the details...)

> That would have the advantage of avoiding the function-call overhead,
> which seems appealing.

You're kidding, right? One function call per buffer touched by an update
is going to be lost in the noise.

A somewhat more relevant concern is where are we going to keep the state
data involved in all this. Since this code is, by definition, going to be
called in critical sections, any solution involving internal pallocs is
right out. The existing arrangement keeps all its data in local variables
of the function doing the update, which is probably a nice property to
preserve if we can manage it. That might force us to do it as above.

I'd still like something that *looks* more like a list of function calls,
though, even if they're macros under the hood. The existing approach to
setting up the XLogRecData chains is awfully prone to bugs of omission.
There are a few places that went so far as to set up custom macros to help
with that. I'm not a fan of doing the same thing in randomly different
ways in different places; but if we did it that way uniformly, it might be
better/safer.

			regards, tom lane
Re: [HACKERS] Fwd: SSL auth question
Dmitry Voronin writes:
> I know it. So, my second question is:
> How can I add support for this extension in PostgreSQL? I want
> PostgreSQL to accept a connection with the cert auth method when the
> certificate has my extension with the critical flag set.

Seems like this is a question you should direct to OpenSSL people, not us.
Postgres itself knows nothing to speak of about SSL certificates; it just
delegates all that processing to openssl.

			regards, tom lane
Re: [HACKERS] WAL format and API changes (9.5)
On Thu, Apr 3, 2014 at 10:14 AM, Heikki Linnakangas wrote:
> (Instead of using a new XLogRegisterBuffer() function to register the
> buffers, perhaps they should be passed to XLogInsert as a separate list or
> array. I'm not wedded on the details...)

That would have the advantage of avoiding the function-call overhead,
which seems appealing.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Fwd: SSL auth question
Thank you for the answer!

I know it. So, my second question is: how can I add support for this
extension in PostgreSQL? I want PostgreSQL to accept a connection with the
cert auth method when the certificate has my extension with the critical
flag set.

03.04.2014, 04:33, "Wim Lewis":
> On 1 Apr 2014, at 11:38 PM, carriingfat...@ya.ru wrote:
>> I set certificate auth on postgresql 9.3. I generate an SSL certificate
>> with my custom extension. So, OpenSSL reads it, PostgreSQL accepts it if
>> this extension is not critical, but if I set this extension critical,
>> PostgreSQL denies the connection.
>
> I think that is the correct behavior. The "critical" bit tells PostgreSQL
> (or other software) what to do if it does not understand the extension:
> if there's an unknown extension with the critical bit set, then the
> certificate can't be validated. If the critical bit is not set, then the
> unknown extension is ignored, and the certificate is processed as if the
> extension weren't there.
>
> See this section of RFC 5280:
> http://tools.ietf.org/html/rfc5280#section-4.2
>
> The idea is that you can set the critical bit for extensions that are
> supposed to *restrict* the usability of the certificate, so that the
> certificate won't be used in undesired ways by software that doesn't
> understand the extension.

Best regards,
Dmitry Voronin
[HACKERS] PostgreSQL Columnar Store for Analytic Workloads
Dear Hackers,

We at Citus Data have been developing a columnar store extension for
PostgreSQL. Today we are excited to open source it under the Apache v2.0
license.

This columnar store extension uses the Optimized Row Columnar (ORC) format
for its data layout, which improves upon the RCFile format developed at
Facebook, and brings the following benefits:

* Compression: Reduces in-memory and on-disk data size by 2-4x. Can be
  extended to support different codecs. We used the functions in
  pg_lzcompress.h for compression and decompression.

* Column projections: Only reads column data relevant to the query.
  Improves performance for I/O bound queries.

* Skip indexes: Stores min/max statistics for row groups, and uses them to
  skip over unrelated rows.

We used the PostgreSQL FDW APIs to make this work. The extension doesn't
implement the writable FDW API, but it uses the process utility hook to
enable the COPY command for columnar tables. This extension uses
PostgreSQL's internal data type representation to store data in the table,
so this columnar store should support all data types that PostgreSQL
supports.

We tried the extension on the TPC-H benchmark with a 4GB scale factor on an
m1.xlarge Amazon EC2 instance, and query performance improved by 2x-3x
compared to regular PostgreSQL tables. Note that we flushed the page cache
before each test to see the impact on disk I/O. When data is cached in
memory, the performance of cstore_fdw tables was close to the performance
of regular PostgreSQL tables.

For more information, please visit:

* our blog post: http://citusdata.com/blog/76-postgresql-columnar-store-for-analytics
* our github page: https://github.com/citusdata/cstore_fdw

Feedback from you is really appreciated.

Thanks,
-- 
Hadi
[HACKERS] Firing trigger if only
Good afternoon all.

I have a problem with triggers on PostgreSQL 8.4. I have a trigger on a
specific table (articles) that fires on UPDATE statements:

CREATE OR REPLACE FUNCTION trigger_articles_update()
  RETURNS trigger AS
$BODY$
BEGIN
    INSERT INTO article_change(article_id, change_type)
    VALUES (OLD.article_id, 2);
    RETURN NULL;
END
$BODY$
  LANGUAGE 'plpgsql' VOLATILE
  COST 100;
ALTER FUNCTION trigger_articles_update() OWNER TO postgres;

I have 2 different applications that perform updates on the table articles
(written in Delphi and using ZeosDB). My problem is that I want the trigger
to fire only when the update is performed by the first application, but not
by the second. I know that triggers are supposed to fire on every change to
the table, but this is a specific problem that I have. Any hint
appreciated. 8)

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Firing-trigger-if-only-tp5798484.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] [COMMITTERS] pgsql: Add ALTER TABLESPACE ... MOVE command
Alvaro Herrera writes:
> Stephen Frost wrote:
>> Add ALTER TABLESPACE ... MOVE command

> I just noticed that this commit added the new commands under the
> "ALTER THING name RENAME TO" production (which were originally for
> RenameStmt); since commit d86d51a95 had already added some
> AlterTableSpaceOptionsStmt nodes to the possible results, maybe this
> wasn't so bad in itself; but still it seems quite unlike the way we
> organize our parse productions.

Ick. That's just plain sloppy. Please create a separate production, *and*
a separate comment header. Commit d86d51a95 was pretty damn awful in this
regard as well, but let's clean them both up, not make it worse. Existing
precedent would suggest inventing two new productions named the same as
the parse node types they produce, viz AlterTableSpaceMoveStmt and
AlterTableSpaceOptionsStmt.

			regards, tom lane
Re: [HACKERS] GSoC 2014 proposal
On Thu, Apr 3, 2014 at 11:21 PM, Heikki Linnakangas wrote:
> On 04/03/2014 04:15 PM, Alexander Korotkov wrote:
>> On Wed, Apr 2, 2014 at 2:22 PM, Alexander Korotkov wrote:
>>> On Tue, Apr 1, 2014 at 2:23 PM, Heikki Linnakangas
>>> <hlinnakan...@vmware.com> wrote:
>>>> The BIRCH algorithm as described in the paper describes building a
>>>> tree in memory. If I understood correctly, you're suggesting to use a
>>>> pre-built GiST index instead. Interesting idea! There are a couple of
>>>> significant differences between the CF tree described in the paper and
>>>> GiST:
>>>>
>>>> 1. In GiST, a leaf item always represents one heap tuple. In the CF
>>>> tree, a leaf item represents a cluster, which consists of one or more
>>>> tuples. So the CF tree doesn't store an entry for every input tuple,
>>>> which makes it possible to keep it in memory.
>>>>
>>>> 2. In the CF tree, "all entries in a leaf node must satisfy a
>>>> threshold requirement, with respect to a threshold value T: the
>>>> diameter (or radius) has to be less than T". GiST imposes no such
>>>> restrictions. An item can legally be placed anywhere in the tree;
>>>> placing it badly will just lead to degraded search performance, but
>>>> it's still a legal GiST tree.
>>>>
>>>> 3. A GiST index, like any other index in PostgreSQL, holds entries
>>>> also for deleted tuples, until the index is vacuumed. So you cannot
>>>> just use information from a non-leaf node and use it in the result,
>>>> as the information summarized at a non-leaf level includes noise from
>>>> the dead tuples.
>>>>
>>>> Can you elaborate how you are planning to use a GiST index to
>>>> implement BIRCH? You might also want to take a look at SP-GiST;
>>>> SP-GiST is more strict in where in the tree an item can be stored,
>>>> and lets the operator class specify exactly when a node is split etc.
>>>
>>> Hmmm, it's likely I've imagined something quite outside of this paper,
>>> and even already suggested it to Ivan... :) I need a little time to
>>> rethink it.
>>
>> Using GiST we can implement BIRCH-like clustering like so:
>> 1) Build a CF tree as a GiST index without the restriction of the T
>> threshold value.
>> 2) Scan the CF tree with threshold T with some auxiliary operator. If
>> the consistent method sees a CF entry whose diameter is greater than T
>> then it returns true. Otherwise it returns false and puts this CF entry
>> into an output area (could be either in-memory or a temporary table).
>> 3) Process the other steps of the algorithm as usual.
>
> I still don't understand how that would work. You can't trust the
> non-leaf entries, because their CF value can contain deleted entries. So
> you have to scan every tuple anyway. Once you do that, what's the point
> of the index?

Yeah, it is a limitation of this idea. It's not going to be an
auto-updatable CF. The user can build the index on a freshly vacuumed
table and play with clustering for some time. Updates on the table during
that time would be either an allowed clustering error or prohibited.
Another potential solution is to let this index hold some snapshot of the
data. But that seems not possible to do in an extension now.

-- 
With best regards,
Alexander Korotkov.
Re: [HACKERS] GSoC 2014 proposal
On 04/03/2014 04:15 PM, Alexander Korotkov wrote: On Wed, Apr 2, 2014 at 2:22 PM, Alexander Korotkov wrote: On Tue, Apr 1, 2014 at 2:23 PM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote: The BIRCH algorithm as described in the paper describes building a tree in memory. If I understood correctly, you're suggesting to use a pre-built GiST index instead. Interesting idea! There are a couple of signifcant differences between the CF tree described in the paper and GiST: 1. In GiST, a leaf item always represents one heap tuple. In the CF tree, a leaf item represents a cluster, which consists of one or more tuples. So the CF tree doesn't store an entry for every input tuple, which makes it possible to keep it in memory. 2. In the CF tree, "all entries in a leaf node must satisfy a threshold requirement, with respect to a threshold value T: the diameter (or radius) has to be less than T". GiST imposes no such restrictions. An item can legally be placed anywhere in the tree; placing it badly will just lead to degraded search performance, but it's still a legal GiST tree. 3. A GiST index, like any other index in PostgreSQL, holds entries also for deleted tuples, until the index is vacuumed. So you cannot just use information from a non-leaf node and use it in the result, as the information summarized at a non-leaf level includes noise from the dead tuples. Can you elaborate how you are planning to use a GiST index to implement BIRCH? You might also want to take a look at SP-GiST; SP-GiST is more strict in where in the tree an item can be stored, and lets the operator class to specify exactly when a node is split etc. Hmmm, it's likely I've imagined something quite outside of this paper, and even already suggested it to Ivan... :) I need a little time to rethink it. Using GiST we can implement BIRCH-like clustering like so: 1) Build a CF tree as GiST index without restriction of T threshold value. 2) Scan CF tree with threshold T with some auxiliary operator. 
If the consistent method sees a CF entry whose diameter is greater than T, it returns true. Otherwise it returns false and puts the CF entry into an output area (either in-memory or a temporary table). 3) Process the other steps of the algorithm as usual. I still don't understand how that would work. You can't trust the non-leaf entries, because their CF value can contain deleted entries. So you have to scan every tuple anyway. Once you do that, what's the point of the index? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
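The threshold test at the heart of step 2 follows directly from the CF entry definition in the BIRCH paper, where a cluster feature is the tuple (N, LS, SS). Below is a minimal standalone sketch of that test, assuming one-dimensional points; the struct and function names are invented for illustration, and this is not PostgreSQL code:

```c
/* A one-dimensional cluster feature (CF) entry as in the BIRCH paper:
 * n = number of points, ls = linear sum, ss = sum of squares. */
typedef struct CFEntry
{
    double n;
    double ls;
    double ss;
} CFEntry;

/* Squared cluster diameter: the average pairwise squared distance of
 * the points the entry summarizes. Comparing squares avoids a sqrt()
 * per visited entry. Entries with fewer than two points have diameter 0. */
static double cf_diameter2(const CFEntry *cf)
{
    if (cf->n < 2)
        return 0.0;
    return (2.0 * cf->n * cf->ss - 2.0 * cf->ls * cf->ls) /
           (cf->n * (cf->n - 1.0));
}

/* The auxiliary "consistent" test from step 2: descend into an entry
 * only while its diameter still exceeds the threshold T. */
static int cf_needs_descent(const CFEntry *cf, double t)
{
    return cf_diameter2(cf) > t * t;
}
```

An auxiliary GiST operator along these lines would keep descending only into entries whose summarized cluster is still too loose for the requested threshold, emitting the rest to the output area.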
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 3 April 2014 19:05, Peter Geoghegan wrote: > On Thu, Apr 3, 2014 at 1:23 PM, Thom Brown wrote: >> I'm getting an error when building this: > > Sorry. I ran this through Valgrind, and forgot to reset where the > relevant macro is define'd before submission. Attached revision should > build without issue. Looking good: -T 100 -n -f sort.sql Master: 21.670467 / 21.718653 (avg: 21.69456) Patch: 66.888756 / 66.888756 (avg: 66.888756) 3.08x speedup -c 80 -j 80 -T 100 -n -f sort.sql Master: 38.450082 / 37.440701 (avg: 37.9453915) Patch: 153.321946 / 145.004726 (avg: 149.163336) 3.93x speedup -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add ALTER TABLESPACE ... MOVE command
Stephen Frost wrote: > Add ALTER TABLESPACE ... MOVE command > > This adds a 'MOVE' sub-command to ALTER TABLESPACE which allows moving sets of > objects from one tablespace to another. This can be extremely handy and > avoids > a lot of error-prone scripting. ALTER TABLESPACE ... MOVE will only move > objects the user owns, will notify the user if no objects were found, and can > be used to move ALL objects or specific types of objects (TABLES, INDEXES, or > MATERIALIZED VIEWS). I just noticed that this commit added the new commands under the "ALTER THING name RENAME TO" production (which were originally for RenameStmt); since commit d86d51a95 had already added some AlterTableSpaceOptionsStmt nodes to the possible results, maybe this wasn't so bad in itself; but still it seems quite unlike the way we organize our parse productions. If we don't want to add new productions for these new node types, I think at least this comment update is warranted: diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 2867fa2..359bb8c 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -7009,6 +7009,8 @@ opt_force:FORCE { $$ = TRUE; } /* * * ALTER THING name RENAME TO newname + * ALTER TABLESPACE name MOVE blah + * ALTER TABLESPACE name SET/RESET blah * */ Other thoughts? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Useless "Replica Identity: NOTHING" noise from psql \d
On 03/27/2014 01:09 PM, Tom Lane wrote: Alvaro Herrera writes: Note that "make check" in contrib is substantially slower than "make installcheck" --- it creates a temporary installation for each contrib module. If we make buildfarm run installcheck for all modules, that might be a problem for some animals. Would it be possible to have a target that runs "make installcheck" for most modules and "make check" only for those that need the separate installation? Maybe we can have a file listing modules that don't support installcheck, for instance, and then use $(filter) and $(filter-out) to produce appropriate "$(MAKE) -C" loops. This seems like a good idea to me; the slower animals will be putting lots of cycles into pretty-much-useless "make check" builds if we don't. Rather than a separate file, though, I think a distinct target in contrib/Makefile would be the best mechanism for keeping the list of modules that lack installcheck support. Perhaps "make special-check" or some name along that line? Also, there's the vcregress.pl business. The way it essentially duplicates pg_upgrade/test.sh is rather messy; and now that test_decoding also needs similar treatment, it's not looking so good. Should we consider redoing that stuff in a way that allows both MSVC and make-based systems to run those tests? Agreed, but I'm not volunteering to fix that one ;-) I've been kind of hoping that someone would step up on both these items, but the trail seems to have gone cold. I'm going to put out the new buildfarm release with the new module to run the test_decoding check. But of course it won't have MSVC support. These can go on my long TODO list unless someone else gets there first. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Thu, Apr 3, 2014 at 1:23 PM, Thom Brown wrote: > I'm getting an error when building this: Sorry. I ran this through Valgrind, and forgot to reset where the relevant macro is define'd before submission. Attached revision should build without issue. -- Peter Geoghegan *** a/src/backend/executor/nodeMergeAppend.c --- b/src/backend/executor/nodeMergeAppend.c *** ExecInitMergeAppend(MergeAppend *node, E *** 136,141 --- 136,142 sortKey->ssup_collation = node->collations[i]; sortKey->ssup_nulls_first = node->nullsFirst[i]; sortKey->ssup_attno = node->sortColIdx[i]; + sortKey->leading = (i == 0); PrepareSortSupportFromOrderingOp(node->sortOperators[i], sortKey); } *** a/src/backend/utils/adt/varlena.c --- b/src/backend/utils/adt/varlena.c *** *** 17,22 --- 17,23 #include #include + #include "access/nbtree.h" #include "access/tuptoaster.h" #include "catalog/pg_collation.h" #include "catalog/pg_type.h" *** *** 29,34 --- 30,36 #include "utils/bytea.h" #include "utils/lsyscache.h" #include "utils/pg_locale.h" + #include "utils/sortsupport.h" /* GUC variable */ *** typedef struct *** 50,61 --- 52,85 int skiptable[256]; /* skip distance for given mismatched char */ } TextPositionState; + typedef struct + { + char *buf1; /* 1st string, or poorman original string buf */ + char *buf2; /* 2nd string, or leading key/poor man blob */ + int buflen1; + int buflen2; + #ifdef HAVE_LOCALE_T + pg_locale_t locale; + #endif + } TextSortSupport; + + /* + * This should be large enough that most strings will be fit, but small enough + * that we feel comfortable putting it on the stack + */ + #define TEXTBUFLEN 1024 + #define DatumGetUnknownP(X) ((unknown *) PG_DETOAST_DATUM(X)) #define DatumGetUnknownPCopy(X) ((unknown *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_UNKNOWN_P(n) DatumGetUnknownP(PG_GETARG_DATUM(n)) #define PG_GETARG_UNKNOWN_P_COPY(n) DatumGetUnknownPCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_UNKNOWN_P(x) PG_RETURN_POINTER(x) + static void btpoorman_worker(SortSupport ssup, Oid 
collid); + static int bttextfastcmp_c(Datum x, Datum y, SortSupport ssup); + static int bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup); + static int bttextfastcmp_locale_poorman(Datum x, Datum y, SortSupport ssup); + static Datum bttext_convert(Datum d, SortSupport ssup); static int32 text_length(Datum str); static text *text_catenate(text *t1, text *t2); static text *text_substring(Datum str, *** varstr_cmp(char *arg1, int len1, char *a *** 1356,1365 } else { ! #define STACKBUFLEN 1024 ! ! char a1buf[STACKBUFLEN]; ! char a2buf[STACKBUFLEN]; char *a1p, *a2p; --- 1380,1387 } else { ! char a1buf[TEXTBUFLEN]; ! char a2buf[TEXTBUFLEN]; char *a1p, *a2p; *** varstr_cmp(char *arg1, int len1, char *a *** 1393,1416 int a2len; int r; ! if (len1 >= STACKBUFLEN / 2) { a1len = len1 * 2 + 2; a1p = palloc(a1len); } else { ! a1len = STACKBUFLEN; a1p = a1buf; } ! if (len2 >= STACKBUFLEN / 2) { a2len = len2 * 2 + 2; a2p = palloc(a2len); } else { ! a2len = STACKBUFLEN; a2p = a2buf; } --- 1415,1438 int a2len; int r; ! if (len1 >= TEXTBUFLEN / 2) { a1len = len1 * 2 + 2; a1p = palloc(a1len); } else { ! a1len = TEXTBUFLEN; a1p = a1buf; } ! if (len2 >= TEXTBUFLEN / 2) { a2len = len2 * 2 + 2; a2p = palloc(a2len); } else { ! a2len = TEXTBUFLEN; a2p = a2buf; } *** varstr_cmp(char *arg1, int len1, char *a *** 1475,1485 } #endif /* WIN32 */ ! if (len1 >= STACKBUFLEN) a1p = (char *) palloc(len1 + 1); else a1p = a1buf; ! if (len2 >= STACKBUFLEN) a2p = (char *) palloc(len2 + 1); else a2p = a2buf; --- 1497,1507 } #endif /* WIN32 */ ! if (len1 >= TEXTBUFLEN) a1p = (char *) palloc(len1 + 1); else a1p = a1buf; ! 
if (len2 >= TEXTBUFLEN) a2p = (char *) palloc(len2 + 1); else a2p = a2buf; *** bttextcmp(PG_FUNCTION_ARGS) *** 1683,1688 --- 1705,2039 PG_RETURN_INT32(result); } + Datum + bttextsortsupport(PG_FUNCTION_ARGS) + { + SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0); + Oidcollid = ssup->ssup_collation; + + btpoorman_worker(ssup, collid); + + PG_RETURN_VOID(); + } + + static void + btpoorman_worker(SortSupport ssup, Oid collid) + { + TextSortSupport *tss; + + /* + * If LC_COLLATE = C, we can make things quite a bit faster by using + * memcmp() rather than strcoll(). To minimize the per-comparison + * overhead, we make this decision just once for the whole
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 3 April 2014 17:52, Peter Geoghegan wrote: > On Mon, Mar 31, 2014 at 7:35 PM, Peter Geoghegan wrote: >> Okay. Attached revision only trusts strxfrm() blobs (as far as that >> goes) when the buffer passed to strxfrm() was sufficiently large that >> the blob could fully fit. > > Attached revision has been further polished. I've added two additional > optimizations: > > * Squeeze the last byte out of each Datum, so that on a 64-bit system, > the full 8 bytes are available to store strxfrm() blobs. > > * Figure out when the strcoll() bttextfastcmp_locale() comparator is > called, if it was called because a poor man's comparison required it > (and *not* because it's the non-leading key in the traditional sense, > which implies there are no poorman's normalized keys in respect of > this attribute at all). This allows us to try and get away with a > straight memcmp if and when the lengths of the original text strings > match, on the assumption that when the initial poorman's comparison > didn't work out, and when the string lengths match, there is a very > good chance that both are equal, and on average it's a win to avoid > doing a strcoll() (along with the attendant copying around of buffers > for NULL-termination) entirely. Given that memcmp() is so much cheaper > than strcoll() anyway, this seems like a good trade-off. I'm getting an error when building this: In file included from printtup.c:23:0: ../../../../src/include/utils/memdebug.h:21:31: fatal error: valgrind/memcheck.h: No such file or directory compilation terminated. 
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -I../../../src/include -D_GNU_SOURCE -c -o analyze.o analyze.c -MMD -MP -MF .deps/analyze.Po make[4]: *** [printtup.o] Error 1 make[4]: Leaving directory `/home/thom/Development/postgresql/src/backend/access/common' make[3]: *** [common-recursive] Error 2 make[3]: Leaving directory `/home/thom/Development/postgresql/src/backend/access' make[2]: *** [access-recursive] Error 2 make[2]: *** Waiting for unfinished jobs -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Mon, Mar 31, 2014 at 7:35 PM, Peter Geoghegan wrote: > Okay. Attached revision only trusts strxfrm() blobs (as far as that > goes) when the buffer passed to strxfrm() was sufficiently large that > the blob could fully fit. Attached revision has been further polished. I've added two additional optimizations: * Squeeze the last byte out of each Datum, so that on a 64-bit system, the full 8 bytes are available to store strxfrm() blobs. * Figure out when the strcoll() bttextfastcmp_locale() comparator is called, if it was called because a poor man's comparison required it (and *not* because it's the non-leading key in the traditional sense, which implies there are no poorman's normalized keys in respect of this attribute at all). This allows us to try and get away with a straight memcmp if and when the lengths of the original text strings match, on the assumption that when the initial poorman's comparison didn't work out, and when the string lengths match, there is a very good chance that both are equal, and on average it's a win to avoid doing a strcoll() (along with the attendant copying around of buffers for NULL-termination) entirely. Given that memcmp() is so much cheaper than strcoll() anyway, this seems like a good trade-off. 
-- Peter Geoghegan *** a/src/backend/executor/nodeMergeAppend.c --- b/src/backend/executor/nodeMergeAppend.c *** ExecInitMergeAppend(MergeAppend *node, E *** 136,141 --- 136,142 sortKey->ssup_collation = node->collations[i]; sortKey->ssup_nulls_first = node->nullsFirst[i]; sortKey->ssup_attno = node->sortColIdx[i]; + sortKey->leading = (i == 0); PrepareSortSupportFromOrderingOp(node->sortOperators[i], sortKey); } *** a/src/backend/utils/adt/varlena.c --- b/src/backend/utils/adt/varlena.c *** *** 17,22 --- 17,23 #include #include + #include "access/nbtree.h" #include "access/tuptoaster.h" #include "catalog/pg_collation.h" #include "catalog/pg_type.h" *** *** 29,34 --- 30,36 #include "utils/bytea.h" #include "utils/lsyscache.h" #include "utils/pg_locale.h" + #include "utils/sortsupport.h" /* GUC variable */ *** typedef struct *** 50,61 --- 52,85 int skiptable[256]; /* skip distance for given mismatched char */ } TextPositionState; + typedef struct + { + char *buf1; /* 1st string, or poorman original string buf */ + char *buf2; /* 2nd string, or leading key/poor man blob */ + int buflen1; + int buflen2; + #ifdef HAVE_LOCALE_T + pg_locale_t locale; + #endif + } TextSortSupport; + + /* + * This should be large enough that most strings will be fit, but small enough + * that we feel comfortable putting it on the stack + */ + #define TEXTBUFLEN 1024 + #define DatumGetUnknownP(X) ((unknown *) PG_DETOAST_DATUM(X)) #define DatumGetUnknownPCopy(X) ((unknown *) PG_DETOAST_DATUM_COPY(X)) #define PG_GETARG_UNKNOWN_P(n) DatumGetUnknownP(PG_GETARG_DATUM(n)) #define PG_GETARG_UNKNOWN_P_COPY(n) DatumGetUnknownPCopy(PG_GETARG_DATUM(n)) #define PG_RETURN_UNKNOWN_P(x) PG_RETURN_POINTER(x) + static void btpoorman_worker(SortSupport ssup, Oid collid); + static int bttextfastcmp_c(Datum x, Datum y, SortSupport ssup); + static int bttextfastcmp_locale(Datum x, Datum y, SortSupport ssup); + static int bttextfastcmp_locale_poorman(Datum x, Datum y, SortSupport ssup); + static Datum 
bttext_convert(Datum d, SortSupport ssup); static int32 text_length(Datum str); static text *text_catenate(text *t1, text *t2); static text *text_substring(Datum str, *** varstr_cmp(char *arg1, int len1, char *a *** 1356,1365 } else { ! #define STACKBUFLEN 1024 ! ! char a1buf[STACKBUFLEN]; ! char a2buf[STACKBUFLEN]; char *a1p, *a2p; --- 1380,1387 } else { ! char a1buf[TEXTBUFLEN]; ! char a2buf[TEXTBUFLEN]; char *a1p, *a2p; *** varstr_cmp(char *arg1, int len1, char *a *** 1393,1416 int a2len; int r; ! if (len1 >= STACKBUFLEN / 2) { a1len = len1 * 2 + 2; a1p = palloc(a1len); } else { ! a1len = STACKBUFLEN; a1p = a1buf; } ! if (len2 >= STACKBUFLEN / 2) { a2len = len2 * 2 + 2; a2p = palloc(a2len); } else { ! a2len = STACKBUFLEN; a2p = a2buf; } --- 1415,1438 int a2len; int r; ! if (len1 >= TEXTBUFLEN / 2) { a1len = len1 * 2 + 2; a1p = palloc(a1len); } else { ! a1len = TEXTBUFLEN; a1p = a1buf; } ! if (len2 >= TEXTBUFLEN / 2) { a2len = len2 * 2 + 2; a2p = palloc(a2len); } else { ! a2len = TEXTBUFLEN; a2p = a2buf; } *** varstr_cmp(char *arg1, int len1, char *a *** 1475,1485 } #endif /* WIN32 */ ! if (len1 >= STACKBUFLEN) a1p = (char *) palloc(len1 + 1);
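The prefix-then-fallback scheme described above can be sketched independently of the tuplesort machinery. This is an illustrative sketch, not the patch's code: POORMAN_LEN stands in for the 8 bytes of a 64-bit Datum, and the helper names are made up. Zero-padding short blobs is safe because strxfrm() output contains no NUL bytes, so a shorter blob always compares lower at the pad position:

```c
#include <stdlib.h>
#include <string.h>

#define POORMAN_LEN 8   /* stand-in for the width of a 64-bit Datum */

/* Fill key with the first POORMAN_LEN bytes of the strxfrm() blob for s,
 * zero-padded so that short blobs compare correctly with memcmp(). */
static void poorman_key(const char *s, unsigned char *key)
{
    size_t need = strxfrm(NULL, s, 0);      /* required blob length */
    char   *blob = malloc(need + 1);

    strxfrm(blob, s, need + 1);
    memset(key, 0, POORMAN_LEN);
    memcpy(key, blob, need < POORMAN_LEN ? need : POORMAN_LEN);
    free(blob);
}

/* Compare via the cheap fixed-size prefix first. A nonzero memcmp()
 * result is authoritative, because the first differing blob byte falls
 * within the prefix; a tie is inconclusive and falls back to the full
 * locale-aware comparison. */
static int poorman_cmp(const unsigned char *k1, const unsigned char *k2,
                       const char *s1, const char *s2)
{
    int c = memcmp(k1, k2, POORMAN_LEN);

    if (c != 0)
        return c;
    return strcoll(s1, s2);
}
```

The patch goes further (e.g. trying a plain memcmp() of the originals when the prefixes tie and the string lengths match), but the core trade-off is the one above: most comparisons are decided by an 8-byte memcmp() instead of a strcoll().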
[HACKERS] PostgreSQL Columnar Store for Analytic Workloads
Dear Hackers, We at Citus Data have been developing a columnar store extension for PostgreSQL. Today we are excited to open source it under the Apache v2.0 license. This columnar store extension uses the Optimized Row Columnar (ORC) format for its data layout, which improves upon the RCFile format developed at Facebook, and brings the following benefits: * Compression: Reduces in-memory and on-disk data size by 2-4x. Can be extended to support different codecs. We used the functions in pg_lzcompress.h for compression and decompression. * Column projections: Only reads column data relevant to the query. Improves performance for I/O bound queries. * Skip indexes: Stores min/max statistics for row groups, and uses them to skip over unrelated rows. We used the PostgreSQL FDW APIs to make this work. The extension doesn't implement the writable FDW API, but it uses the process utility hook to enable the COPY command for columnar tables. This extension uses PostgreSQL's internal data type representation to store data in the table, so this columnar store should support all data types that PostgreSQL supports. We tried the extension on the TPC-H benchmark with a 4GB scale factor on an m1.xlarge Amazon EC2 instance, and query performance improved by 2x-3x compared to regular PostgreSQL tables. Note that we flushed the page cache before each test to see the impact on disk I/O. When data is cached in memory, the performance of cstore_fdw tables was close to the performance of regular PostgreSQL tables. For more information, please visit: * our blog post: http://citusdata.com/blog/76-postgresql-columnar-store-for-analytics * our github page: https://github.com/citusdata/cstore_fdw Feedback from you is really appreciated. Thanks, -- Hadi
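The skip-index idea is easy to illustrate in miniature: keep min/max statistics per row group for a column, and prune any group whose range cannot contain the value being searched for. A toy sketch with invented names (the real extension stores such per-row-group statistics on disk alongside the column data):

```c
#include <stdbool.h>

/* Per-row-group min/max statistics for one int column (toy model). */
typedef struct RowGroupStats
{
    int min;
    int max;
} RowGroupStats;

/* A row group can be skipped for the predicate "col = value" when the
 * value falls outside the group's [min, max] range. */
static bool can_skip_equal(const RowGroupStats *s, int value)
{
    return value < s->min || value > s->max;
}

/* Count how many of ngroups row groups survive pruning and must
 * actually be read from disk. */
static int groups_to_scan(const RowGroupStats *stats, int ngroups, int value)
{
    int i, n = 0;

    for (i = 0; i < ngroups; i++)
        if (!can_skip_equal(&stats[i], value))
            n++;
    return n;
}
```

With clustered or naturally ordered data, most groups fall entirely outside the predicate's range, so most of the I/O is avoided without any per-row work.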
Re: [HACKERS] WAL format and API changes (9.5)
On 04/03/2014 07:11 PM, Tom Lane wrote: More generally, I'm pretty sure that your proposal is already going to involve some small growth of WAL records compared to today, Quite possible. but I think that's probably all right; the benefits seem significant. Yep. OTOH, once we store the relfilenode+block in a common format, we can then try to optimize that format more heavily. Just as an example, omit the tablespace oid in the RelFileNode, when it's the default tablespace (with a flag bit indicating we did that). Or use a variable-length encoding for the block number, on the assumption that smaller numbers are more common. Probably not worth the extra complexity, but we can easily experiment with that kind of stuff once we have the infrastructure in place. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
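The variable-length block number encoding Heikki floats could look like a standard base-128 varint, where small numbers take a single byte. This is a sketch under that assumption; no such encoding exists in the WAL format today:

```c
#include <stdint.h>

/* Encode v as a base-128 varint: 7 data bits per byte, least significant
 * group first, high bit set on every byte except the last. Returns the
 * number of bytes written (at most 5 for a 32-bit value). */
static int varint_encode(uint32_t v, unsigned char *buf)
{
    int n = 0;

    while (v >= 0x80)
    {
        buf[n++] = (unsigned char) (v | 0x80);
        v >>= 7;
    }
    buf[n++] = (unsigned char) v;
    return n;
}

/* Decode a varint from buf; stores the number of bytes consumed in *len. */
static uint32_t varint_decode(const unsigned char *buf, int *len)
{
    uint32_t v = 0;
    int      shift = 0, n = 0;

    while (buf[n] & 0x80)
    {
        v |= (uint32_t) (buf[n++] & 0x7F) << shift;
        shift += 7;
    }
    v |= (uint32_t) buf[n++] << shift;
    *len = n;
    return v;
}
```

Block numbers below 2^21 (i.e. relations under 16GB with 8kB pages) would fit in three bytes instead of four, at the cost of one branchy loop per block reference — which is the complexity trade-off the message alludes to.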
Re: [HACKERS] WAL format and API changes (9.5)
Heikki Linnakangas writes: > On 04/03/2014 06:37 PM, Tom Lane wrote: >> +1, but one important step here is finding the data to be replayed. >> That is, a large part of the complexity of replay routines has to do >> with figuring out which parts of the WAL record were elided due to >> full-page-images, and locating the remaining parts. What can we do >> to make that simpler? > We can certainly add more structure to the WAL records, but any extra > information you add will make the records larger. It might be worth it, > and would be lost in the noise for more complex records like page > splits, but we should keep frequently-used records like heap insertions > as lean as possible. Sure, but in simple WAL records there would just be a single data chunk that would be present in the normal case and not present in the FPI case. Seems like we ought to be able to handle that degenerate case with no significant wastage (probably just a flag bit someplace). More generally, I'm pretty sure that your proposal is already going to involve some small growth of WAL records compared to today, but I think that's probably all right; the benefits seem significant. > Hmm. You could register a separate XLogRecData chain for each buffer. > Plus one more chain for the data not associated with a buffer. That would probably work. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format and API changes (9.5)
On 04/03/2014 06:37 PM, Tom Lane wrote: Also, IIRC there are places that WAL-log full pages that aren't in a shared buffer at all (btree build does this I think). How will that fit into this model? Hmm. We could provide a function for registering a block with given content, without a Buffer. Something like: XLogRegisterPage(int id, RelFileNode, BlockNumber, Page) Let's simplify that, and have one new function, XLogOpenBuffer, which returns a return code that indicates which of the four cases we're dealing with. A typical redo function looks like this: if (XLogOpenBuffer(0, &buffer) == BLK_REPLAY) { /* Modify the page */ ... PageSetLSN(page, lsn); MarkBufferDirty(buffer); } if (BufferIsValid(buffer)) UnlockReleaseBuffer(buffer); The '0' in the XLogOpenBuffer call is the ID of the block reference specified in the XLogRegisterBuffer call, when the WAL record was created. +1, but one important step here is finding the data to be replayed. That is, a large part of the complexity of replay routines has to do with figuring out which parts of the WAL record were elided due to full-page-images, and locating the remaining parts. What can we do to make that simpler? We can certainly add more structure to the WAL records, but any extra information you add will make the records larger. It might be worth it, and would be lost in the noise for more complex records like page splits, but we should keep frequently-used records like heap insertions as lean as possible. Ideally, if XLogOpenBuffer (bad name BTW) returns BLK_REPLAY, it would also calculate and hand back the address/size of the logged data that had been pointed to by the associated XLogRecData chain item. The trouble here is that there might've been multiple XLogRecData items pointing to the same buffer. Perhaps the magic ID number you give to XLogOpenBuffer should be thought of as identifying an XLogRecData chain item, not so much a buffer? 
It's fairly easy to see what to do when there's just one chain item per buffer, but I'm not sure what to do if there's more than one. Hmm. You could register a separate XLogRecData chain for each buffer. Along the lines of: rdata[0].data = data for buffer rdata[0].len = ... rdata[0].next = &rdata[1]; rdata[1].data = more data for same buffer rdata[1].len = ... rdata[1].next = NULL; XLogRegisterBuffer(0, buffer, &rdata[0]); At replay: if (XLogOpenBuffer(0, &buffer, &xldata, &len) == BLK_REPLAY) { /* xldata points to the data registered for this buffer */ } Plus one more chain for the data not associated with a buffer. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
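The per-buffer chain idea is straightforward to model outside the backend: each chain item points at one chunk of record data, and record assembly walks the chains, concatenating chunks. A simplified standalone model follows — the data/len/next fields mirror the real XLogRecData struct, but the type and function names here are invented:

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of an XLogRecData chain item. */
typedef struct RecData
{
    const void     *data;   /* start of one chunk of record data */
    size_t          len;    /* length of the chunk */
    struct RecData *next;   /* next chunk in this chain, or NULL */
} RecData;

/* Total payload length of one chain. */
static size_t chain_len(const RecData *rd)
{
    size_t total = 0;

    for (; rd != NULL; rd = rd->next)
        total += rd->len;
    return total;
}

/* Copy a chain's chunks into a contiguous record buffer; returns the
 * number of bytes copied. This is roughly what record assembly has to
 * do for each registered chain. */
static size_t chain_copy(const RecData *rd, char *dst)
{
    char *p = dst;

    for (; rd != NULL; rd = rd->next)
    {
        memcpy(p, rd->data, rd->len);
        p += rd->len;
    }
    return (size_t) (p - dst);
}
```

Keeping one chain per registered buffer, as proposed, means replay can be handed exactly the contiguous span belonging to that buffer without re-deriving chunk boundaries.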
Re: [HACKERS] It seems no Windows buildfarm members are running find_typedefs
On 04/03/2014 12:01 AM, Tom Lane wrote: Andrew Dunstan writes: For OSX we'd construct the list via File::Find to recurse through the directories. So, something like this: Thanks for the tips. The attached patch against buildfarm 4.11 seems to work, cf http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=dromedary&dt=2014-04-03%2003%3A33%3A16&stg=typedefs I tried it on current OSX (10.9.2) as well as dromedary's 10.6.8. It does not work on prairiedog (10.4.11) --- the debug info format seems to be different that far back. Can't say that's worth worrying about. Thanks, applied. BTW, after looking a bit more closely at what this added to the typedefs list, I realize that this mechanism also collects typedefs that appear in files that are compiled but never installed, for example src/timezone/zic and the ecpg regression tests. This seems like a Good Thing, since certainly pgindent is going to hit the source files for those programs too. I wonder if we ought to switch over to scanning the .o files on all platforms? Sure, if someone wants to work out what's involved. Do the MSVC tools have some gadget for extracting symbols from .o files in a similar way? It would be nice to get them into the mix too. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL format and API changes (9.5)
Heikki Linnakangas writes: > The big change in creating WAL records is that the buffers involved in > the WAL-logged operation are explicitly registered, by calling a new > XLogRegisterBuffer function. Seems reasonable, especially if we can make the buffer numbering business less error-prone. > void XLogRegisterBuffer(int blockref_id, Buffer buffer, bool buffer_std) > blockref_id: An arbitrary ID given to this block reference. It is used > in the redo routine to open/restore the same block. > buffer: the buffer involved > buffer_std: is the page in "standard" page layout? > That's for the normal cases. We'll need a couple of variants for also > registering buffers that don't need full-page images, and perhaps also a > function for registering a page that *always* needs a full-page image, > regardless of the LSN. A few existing WAL record types just WAL-log the > whole page, so those ad-hoc full-page images could be replaced with this. Why not just one function with an additional flags argument? Also, IIRC there are places that WAL-log full pages that aren't in a shared buffer at all (btree build does this I think). How will that fit into this model? > (While we're at it, perhaps we should let XLogInsert set the LSN of all > the registered buffers, to reduce the amount of boilerplate code). Yeah, maybe so. I'm not sure why that was separate to begin with. > Let's simplify that, and have one new function, XLogOpenBuffer, which > returns a return code that indicates which of the four cases we're > dealing with. A typical redo function looks like this: > if (XLogOpenBuffer(0, &buffer) == BLK_REPLAY) > { > /* Modify the page */ > ... > PageSetLSN(page, lsn); > MarkBufferDirty(buffer); > } > if (BufferIsValid(buffer)) > UnlockReleaseBuffer(buffer); > The '0' in the XLogOpenBuffer call is the ID of the block reference > specified in the XLogRegisterBuffer call, when the WAL record was created. +1, but one important step here is finding the data to be replayed. 
That is, a large part of the complexity of replay routines has to do with figuring out which parts of the WAL record were elided due to full-page-images, and locating the remaining parts. What can we do to make that simpler? Ideally, if XLogOpenBuffer (bad name BTW) returns BLK_REPLAY, it would also calculate and hand back the address/size of the logged data that had been pointed to by the associated XLogRecData chain item. The trouble here is that there might've been multiple XLogRecData items pointing to the same buffer. Perhaps the magic ID number you give to XLogOpenBuffer should be thought of as identifying an XLogRecData chain item, not so much a buffer? It's fairly easy to see what to do when there's just one chain item per buffer, but I'm not sure what to do if there's more than one. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb is also breaking the rule against nameless unions
On 2014-04-02 23:50:19 +0200, Andres Freund wrote: > > > I just tried it on clang. It builds clean with -Wc11-extensions except > > > warning about _Static_assert(). That's possibly fixable with some > > > autoconf trickery. > > > > Ah. That sounds promising. What clang version is that? > > It's debian's clang-3.5, which is built from trunk IIUC: 1:3.5~svn201651-1 > > I have some patches that fix the configure tests to properly use > -Werror, but I am too tired to check their validity now. So, three patches attached: 1) fix a couple of warnings clang outputs in -pedantic. All of them somewhat valid and not too ugly to fix. 2) Use -Werror in a couple more configure checks. That allows to compile pg using both -Wc11-extensions and -Wc99-extensions in a halfway sane fashion. I think this is sane, but I'd welcome comments. 3) Fix C89 compliance issue in pgbench, plus minor cleanups. I don't like the fix here, but I don't really have a better idea. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services >From 3219b14205f33e4e6c3682cf9ca795159d59e611 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Thu, 3 Apr 2014 12:51:40 +0200 Subject: [PATCH 1/3] Fix a bunch of somewhat pedantic compiler warnings. In C89 it's not allowed to have a trailing comma after the last enumerator value, compound literals have to be compile time constant and strictly speaking a void * pointer isn't a function pointer... 
--- src/backend/access/transam/xlog.c | 2 +- src/bin/pg_dump/parallel.c | 5 - src/bin/psql/tab-complete.c| 10 +- src/include/catalog/objectaccess.h | 2 +- src/include/pgstat.h | 2 +- src/include/utils/jsonapi.h| 2 +- 6 files changed, 13 insertions(+), 10 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f9d6609..5096999 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -651,7 +651,7 @@ typedef enum XLOG_FROM_ANY = 0, /* request to read WAL from any source */ XLOG_FROM_ARCHIVE, /* restored using restore_command */ XLOG_FROM_PG_XLOG, /* existing file in pg_xlog */ - XLOG_FROM_STREAM, /* streamed from master */ + XLOG_FROM_STREAM /* streamed from master */ } XLogSource; /* human-readable names for XLogSources, for debugging output */ diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c index 6f2634b..7d6e146 100644 --- a/src/bin/pg_dump/parallel.c +++ b/src/bin/pg_dump/parallel.c @@ -558,7 +558,10 @@ ParallelBackupStart(ArchiveHandle *AH, RestoreOptions *ropt) { /* we are the worker */ int j; - int pipefd[2] = {pipeMW[PIPE_READ], pipeWM[PIPE_WRITE]}; + int pipefd[2]; + + pipefd[0] = pipeMW[PIPE_READ]; + pipefd[1] = pipeWM[PIPE_WRITE]; /* * Store the fds for the reverse communication in pstate. 
Actually diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 1d69b95..202ffce 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -781,7 +781,7 @@ static const pgsql_thing_t words_after_create[] = { /* Forward declaration of functions */ -static char **psql_completion(char *text, int start, int end); +static char **psql_completion(const char *text, int start, int end); static char *create_command_generator(const char *text, int state); static char *drop_command_generator(const char *text, int state); static char *complete_from_query(const char *text, int state); @@ -790,7 +790,7 @@ static char *_complete_from_query(int is_schema_query, const char *text, int state); static char *complete_from_list(const char *text, int state); static char *complete_from_const(const char *text, int state); -static char **complete_from_variables(char *text, +static char **complete_from_variables(const char *text, const char *prefix, const char *suffix); static char *complete_from_files(const char *text, int state); @@ -812,7 +812,7 @@ void initialize_readline(void) { rl_readline_name = (char *) pset.progname; - rl_attempted_completion_function = (void *) psql_completion; + rl_attempted_completion_function = psql_completion; rl_basic_word_break_characters = WORD_BREAKS; @@ -834,7 +834,7 @@ initialize_readline(void) * completion_matches() function, so we don't have to worry about it. */ static char ** -psql_completion(char *text, int start, int end) +psql_completion(const char *text, int start, int end) { /* This is the variable we'll return. */ char **matches = NULL; @@ -3847,7 +3847,7 @@ complete_from_const(const char *text, int state) * to support quoting usages. */ static char ** -complete_from_variables(char *text, const char *prefix, const char *suffix) +complete_from_variables(const char *text, const char *prefix, const char *suffix) { char **matches; char **varnames; diff --git a/src/include/ca
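For reference, the two C89 rules that the first patch works around look like this in isolation (a toy example, not code from the patch): C89 forbids a trailing comma after the last enumerator, and initializers for automatic aggregates must be compile-time constants.

```c
/* C89 requires the last enumerator to stand bare, with no trailing comma: */
typedef enum
{
    SRC_ANY = 0,
    SRC_ARCHIVE,
    SRC_STREAM              /* no trailing comma in C89 */
} Source;

/* C89 array initializers for automatic objects must be constant
 * expressions; runtime values have to be assigned element by element: */
static void fill_pipefd(int read_fd, int write_fd, int pipefd[2])
{
    pipefd[0] = read_fd;    /* not: int pipefd[2] = {read_fd, write_fd}; */
    pipefd[1] = write_fd;
}
```

C99 relaxed both rules, which is why the original code compiled cleanly everywhere except under pedantic C89-conformance warnings.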
Re: [HACKERS] json(b) equality rules
On Thu, Apr 3, 2014 at 7:43 AM, Andrew Dunstan wrote: > I don't think we should follow these rules at all. Json is not Javascript, > and these are Javascript rules, not Json rules. I'm entirely opposed to > treating 0, "0", false, and [0] as equal. The equality rule we actually have > for jsonb is the correct one, I believe. +1 -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] quiet inline configure check misses a step for clang
On 2014-04-03 10:15:33 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2014-04-03 09:43:20 -0400, Tom Lane wrote: > >> I object to the latter; you're proposing to greatly increase the warning > >> noise seen with any compiler that issues a warning for this without caring > >> about .h vs .c. For somebody who finds gcc -pedantic unusable, I would > >> think you'd have a bit more sympathy for people using other compilers. > > > Yea, but which compilers are that? The only one in the buildfarm I could > > find a couple weeks back was acc, and there's a flag we could add to the > > relevant template that silences it. I also don't think that very old > > platforms won't usually be used for active development, so a louder > > build there doesn't really have the same impact as noisy builds for > > actively developed on platforms. > > Didn't we already have this discussion last year? The main points > are all mentioned in > > http://www.postgresql.org/message-id/ca+tgmoyjnc4b+8y01grnal52gtpbzc3zsc4sdnw4lgxhqt3...@mail.gmail.com To which I replied: http://www.postgresql.org/message-id/20131224141631.gf26...@alap2.anarazel.de It really seems like an odd behaviour if a compiler behaved that way. But even if some decade+ old compiler gets this wrong: I am not going to shed many tears. We're talking about HP-UX's ac++ here. If binaries get a bit more bloated there... I really am not trying to win the inline fight here through the backdoor, I only want to make clang use inlines again. As just written in the other message, I just don't see any easy and nice way to write the autoconf test more robustly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] quiet inline configure check misses a step for clang
On 2014-04-03 09:43:20 -0400, Tom Lane wrote: > Andres Freund writes: > > The current quiet inline test doesn't work for clang. As e.g. evidenced in > > http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gull&dt=2014-04-03%2007%3A49%3A26&stg=configure > > configure thinks it's not quiet. > > > Which means that postgres compiled with a recent clang will be noticably > > slower than it needs to be. > > > The reason for that is that clang is smart and warns about static inline > > if they are declared locally in the .c file, but not if they are > > declared in a #included file. That seems to be a reasonable > > behaviour... > > > I think that needs to be fixed. We either can make the configure test > > considerably more complex or simply drop the requirement for quiet > > inline. > > I object to the latter; Do you have an idea how to make the former work? The whole autoconf getup doesn't really seem to support generating two files for a test. I've looked a bit around and it seems like it'd be a fair amount of effort to do it solely in autoconf. The easiest seems to be to just have a header for the test in the sourcetree, but that seems fairly ugly... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] quiet inline configure check misses a step for clang
Andres Freund writes: > On 2014-04-03 09:43:20 -0400, Tom Lane wrote: >> I object to the latter; you're proposing to greatly increase the warning >> noise seen with any compiler that issues a warning for this without caring >> about .h vs .c. For somebody who finds gcc -pedantic unusable, I would >> think you'd have a bit more sympathy for people using other compilers. > Yea, but which compilers are that? The only one in the buildfarm I could > find a couple weeks back was acc, and there's a flag we could add to the > relevant template that silences it. I also don't think that very old > platforms won't usually be used for active development, so a louder > build there doesn't really have the same impact as noisy builds for > actively developed on platforms. Didn't we already have this discussion last year? The main points are all mentioned in http://www.postgresql.org/message-id/ca+tgmoyjnc4b+8y01grnal52gtpbzc3zsc4sdnw4lgxhqt3...@mail.gmail.com regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WAL format and API changes (9.5)
I'd like to make some changes to the WAL format in 9.5. I want to annotate each WAL record with the blocks that it modifies. Every WAL record already includes that information, but it's done in an ad hoc way, differently in every rmgr. The RelFileNode and block number are currently part of the WAL payload, and it's the REDO routine's responsibility to extract them. I want to include that information in a common format for every WAL record type.

That makes life a lot easier for tools that are interested in knowing which blocks a WAL record modifies. One such tool is pg_rewind; it currently has to understand every WAL record the backend writes. There's also a tool out there called pg_readahead, which prefetches the blocks accessed by WAL records, to speed up PITR. I don't think that tool has been actively maintained, but at least part of the reason for that is probably that it's a pain to maintain when it has to understand the details of every WAL record type. It'd also be nice for contrib/pg_xlogdump and the backend code itself: the boilerplate code in all WAL redo routines, and for writing WAL records, could be simplified.

So, here's my proposal:

Insertion
---------

The big change in creating WAL records is that the buffers involved in the WAL-logged operation are explicitly registered, by calling a new XLogRegisterBuffer function. Currently, buffers that need full-page images are registered by including them in the XLogRecData chain, but with the new system, you call the XLogRegisterBuffer() function instead. And you call that function for every buffer involved, even if no full-page image needs to be taken, e.g. because the page is going to be recreated from scratch at replay.

It is no longer necessary to include the RelFileNode and BlockNumber of the modified pages in the WAL payload. That information is automatically included in the WAL record when XLogRegisterBuffer is called.
Currently, the backup blocks are implicitly numbered, in the order the buffers appear in the XLogRecData entries. With the new API, the blocks are numbered explicitly. This is more convenient when a WAL record sometimes modifies a buffer and sometimes not. For example, a B-tree split needs to modify four pages: the original page, the new page, the right sibling (unless it's the rightmost page), and, if it's an internal page, the page at the lower level whose split the insertion completes. So there are two pages that are sometimes missing from the record. With the new API, you can nevertheless always register e.g. the original page as buffer 0, the new page as 1, and the right sibling as 2, even if some of them are actually missing. SP-GiST contains even more complicated examples of that.

The new XLogRegisterBuffer would look like this:

void XLogRegisterBuffer(int blockref_id, Buffer buffer, bool buffer_std)

blockref_id: an arbitrary ID given to this block reference. It is used in the redo routine to open/restore the same block.
buffer: the buffer involved
buffer_std: is the page in "standard" page layout?

That's for the normal cases. We'll need a couple of variants for also registering buffers that don't need full-page images, and perhaps also a function for registering a page that *always* needs a full-page image, regardless of the LSN. A few existing WAL record types just WAL-log the whole page, so those ad-hoc full-page images could be replaced with this.

With these changes, a typical WAL insertion would look like this:

	/* register the buffer with the WAL record, with ID 0 */
	XLogRegisterBuffer(0, buf, true);

	rdata[0].data = (char *) &xlrec;
	rdata[0].len = sizeof(BlahRecord);
	rdata[0].buffer_id = -1;	/* -1 means the data is always included */
	rdata[0].next = &(rdata[1]);

	rdata[1].data = (char *) mydata;
	rdata[1].len = mydatalen;
	rdata[1].buffer_id = 0;		/* 0 here refers to the buffer registered above */
	rdata[1].next = NULL
	...
	recptr = XLogInsert(RM_BLAH_ID, xlinfo, rdata);

	PageSetLSN(buf, recptr);

(While we're at it, perhaps we should let XLogInsert set the LSN of all the registered buffers, to reduce the amount of boilerplate code.)

(Instead of using a new XLogRegisterBuffer() function to register the buffers, perhaps they should be passed to XLogInsert as a separate list or array. I'm not wedded on the details...)

Redo
----

There are four different states a block referenced by a typical WAL record can be in:

1. The old page does not exist at all (because the relation was truncated later)
2. The old page exists, but has an LSN higher than the current WAL record, so it doesn't need replaying.
3. The LSN is < current WAL record, so it needs to be replayed.
4. The WAL record contains a full-page image, which needs to be restored.

With the current API, that leads to a long boilerplate:

	/* If we have a full-page image, restore it and we're done */
	if (HasBackupBl
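The four block states enumerated above are what a common redo routine would centralize. As a rough, self-contained sketch of that dispatch logic (the type and function names here are illustrative stand-ins, not the actual backend API):

```c
#include <stdbool.h>

/* The four possible outcomes for one registered block during redo,
 * matching the states described above.  Illustrative only. */
typedef enum
{
	BLK_NOTFOUND,		/* 1. page gone: relation was truncated later */
	BLK_DONE,			/* 2. page LSN >= record LSN: already applied */
	BLK_NEEDS_REDO,		/* 3. page LSN < record LSN: apply the change */
	BLK_RESTORED		/* 4. full-page image restored from the record */
} BlockRedoResult;

/* Classify one block reference.  A full-page image, when present, is
 * always restored; otherwise the decision falls out of the page's
 * existence and its LSN relative to the record being replayed. */
BlockRedoResult
classify_block(bool page_exists, bool has_fpi,
			   unsigned long page_lsn, unsigned long record_lsn)
{
	if (has_fpi)
		return BLK_RESTORED;
	if (!page_exists)
		return BLK_NOTFOUND;
	if (page_lsn >= record_lsn)
		return BLK_DONE;
	return BLK_NEEDS_REDO;
}
```

With such a helper, each rmgr's redo routine would only need to handle the BLK_NEEDS_REDO case itself, rather than repeating the full restore-or-skip boilerplate.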
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
Heikki Linnakangas writes: > No-one's replied yet, but perhaps the worry is that after you've written > the commit record, you have to go ahead with removing/creating the init > fork, and that is seen as too risky. If a creat() or unlink() call > fails, that will have to be a PANIC, and crash recovery will likewise > have to PANIC if the forks still cannot be removed/created. > My first thought is that that seems ok. No, it isn't. No filesystem operation should *ever* be thought to be guaranteed to succeed. I also concur with Andres' complaint that this feature is not worth adding complication to the core transaction commit path for. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] quiet inline configure check misses a step for clang
On 2014-04-03 09:43:20 -0400, Tom Lane wrote: > Andres Freund writes: > > The current quiet inline test doesn't work for clang. As e.g. evidenced in > > http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gull&dt=2014-04-03%2007%3A49%3A26&stg=configure > > configure thinks it's not quiet. > > > Which means that postgres compiled with a recent clang will be noticably > > slower than it needs to be. > > > The reason for that is that clang is smart and warns about static inline > > if they are declared locally in the .c file, but not if they are > > declared in a #included file. That seems to be a reasonable > > behaviour... > > > I think that needs to be fixed. We either can make the configure test > > considerably more complex or simply drop the requirement for quiet > > inline. > > I object to the latter; you're proposing to greatly increase the warning > noise seen with any compiler that issues a warning for this without caring > about .h vs .c. For somebody who finds gcc -pedantic unusable, I would > think you'd have a bit more sympathy for people using other compilers. Yea, but which compilers are that? The only one in the buildfarm I could find a couple weeks back was acc, and there's a flag we could add to the relevant template that silences it. I also don't think that very old platforms won't usually be used for active development, so a louder build there doesn't really have the same impact as noisy builds for actively developed on platforms. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] quiet inline configure check misses a step for clang
Andres Freund writes: > The current quiet inline test doesn't work for clang. As e.g. evidenced in > http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gull&dt=2014-04-03%2007%3A49%3A26&stg=configure > configure thinks it's not quiet. > Which means that postgres compiled with a recent clang will be noticably > slower than it needs to be. > The reason for that is that clang is smart and warns about static inline > if they are declared locally in the .c file, but not if they are > declared in a #included file. That seems to be a reasonable > behaviour... > I think that needs to be fixed. We either can make the configure test > considerably more complex or simply drop the requirement for quiet > inline. I object to the latter; you're proposing to greatly increase the warning noise seen with any compiler that issues a warning for this without caring about .h vs .c. For somebody who finds gcc -pedantic unusable, I would think you'd have a bit more sympathy for people using other compilers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC 2014 proposal
On Wed, Apr 2, 2014 at 2:22 PM, Alexander Korotkov wrote: > On Tue, Apr 1, 2014 at 2:23 PM, Heikki Linnakangas < > hlinnakan...@vmware.com> wrote: > >> The BIRCH algorithm as described in the paper describes building a tree >> in memory. If I understood correctly, you're suggesting to use a pre-built >> GiST index instead. Interesting idea! >> >> There are a couple of signifcant differences between the CF tree >> described in the paper and GiST: >> >> 1. In GiST, a leaf item always represents one heap tuple. In the CF tree, >> a leaf item represents a cluster, which consists of one or more tuples. So >> the CF tree doesn't store an entry for every input tuple, which makes it >> possible to keep it in memory. >> >> 2. In the CF tree, "all entries in a leaf node must satisfy a threshold >> requirement, with respect to a threshold value T: the diameter (or radius) >> has to be less than T". GiST imposes no such restrictions. An item can >> legally be placed anywhere in the tree; placing it badly will just lead to >> degraded search performance, but it's still a legal GiST tree. >> >> 3. A GiST index, like any other index in PostgreSQL, holds entries also >> for deleted tuples, until the index is vacuumed. So you cannot just use >> information from a non-leaf node and use it in the result, as the >> information summarized at a non-leaf level includes noise from the dead >> tuples. >> >> Can you elaborate how you are planning to use a GiST index to implement >> BIRCH? You might also want to take a look at SP-GiST; SP-GiST is more >> strict in where in the tree an item can be stored, and lets the operator >> class to specify exactly when a node is split etc. >> > > Hmmm, it's likely I've imagined something quite outside of this paper, and > even already suggested it to Ivan... :) > I need a little time to rethink it. > Using GiST we can implement BIRCH-like clustering like so: 1) Build a CF tree as GiST index without restriction of T threshold value. 
2) Scan the CF tree with threshold T using some auxiliary operator. If the consistent method sees a CF entry whose diameter is greater than T, it returns true. Otherwise it returns false and puts that CF entry into an output area (which could be either in-memory or a temporary table).
3) Process the other steps of the algorithm as usual.

This modification would have the following advantages:
1) The user can build the GiST index once and then try clustering with different parameters. The initial GiST index build would be the slowest operation, while the other steps are expected to be fast.
2) It uses the GiST infrastructure and automatically gets the buffering build.

The drawback is that building a GiST index is more expensive than building an in-memory CF tree with a given threshold T (assuming T is well chosen).

Does it make any sense?

-- 
With best regards, Alexander Korotkov.
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 2014-04-03 15:02:27 +0300, Heikki Linnakangas wrote: > On 04/03/2014 02:41 PM, Andres Freund wrote: > >On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > >>On 04/01/2014 08:58 PM, Andres Freund wrote: > >>>On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > On 3/4/14, 8:50 AM, Andres Freund wrote: > >Can't that be solved by just creating the permanent relation in a new > >relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >for anything but wal_level=minimal anyway. > > Maybe I'm missing something, but doesn't this actually involve writing > the data twice? Once into WAL and again into the relation itself? > >>> > >>>Yes. But as I said, that's unavoidable for anything but > >>>wal_level=minimal. > >> > >>Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > >>SET LOGGED. There's no fundamental reason you need to rewrite the > >>heap, too. > > > >As another point: What's the advantage of that? The amount of writes > >will be the same, no? It doesn't seem to be all that interesting that > >a second filenode exists temporarily? > > Surely it's cheaper to read the whole relation and copy it to just WAL, than > to read the whole relation and write it both the WAL and another file. I have to admit I was thinking of the WAL replay case ;). But we'll actually have to write all dirty s_b, change the persistency tags and such anyway because there's no LSN interlock with checkpoints. That seems pretty ugly as well, and once again, avoidable by a rewrite. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json(b) equality rules
On 04/03/2014 04:32 AM, Oleg Bartunov wrote: > Hi there, > > I'm wondering if we should follow all js equility rules as > nicely visualized in > http://strilanc.com/visualization/2014/03/27/Better-JS-Equality-Table.html Probably not, as JSON is a general interchange format. If somebody wants JavaScript rules, they can use pl/v8. Any equality operations specific to JSON should be related to array and object/dictionary equality, not to the data stored inside the JSON. Cheers Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 04/03/2014 02:41 PM, Andres Freund wrote: On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: On 04/01/2014 08:58 PM, Andres Freund wrote: On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: On 3/4/14, 8:50 AM, Andres Freund wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into the relation itself? Yes. But as I said, that's unavoidable for anything but wal_level=minimal. Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. As another point: What's the advantage of that? The amount of writes will be the same, no? It doesn't seem to be all that interesting that a second filenode exists temporarily? Surely it's cheaper to read the whole relation and copy it to just WAL, than to read the whole relation and write it both the WAL and another file. (Maybe it's not worth the trouble to avoid it - but that depends on whether we come up with a good design..) - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] json(b) equality rules
On 04/03/2014 05:02 AM, Oleg Bartunov wrote: Well, we don't supported Infinity and NaN in json(b), as well as Json standard :) Now we need a script, which generated nice html table. (Oleg, please try not to top-post.) I don't think we should follow these rules at all. Json is not Javascript, and these are Javascript rules, not Json rules. I'm entirely opposed to treating 0, "0", false, and [0] as equal. The equality rule we actually have for jsonb is the correct one, I believe. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:58 PM, Andres Freund wrote: > >On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > >>On 3/4/14, 8:50 AM, Andres Freund wrote: > >>>Can't that be solved by just creating the permanent relation in a new > >>>relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >>>for anything but wal_level=minimal anyway. > >> > >>Maybe I'm missing something, but doesn't this actually involve writing the > >>data twice? Once into WAL and again into the relation itself? > > > >Yes. But as I said, that's unavoidable for anything but > >wal_level=minimal. > > Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > SET LOGGED. There's no fundamental reason you need to rewrite the > heap, too. As another point: What's the advantage of that? The amount of writes will be the same, no? It doesn't seem to be all that interesting that a second filenode exists temporarily? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 2014-04-03 14:26:50 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:39 PM, Heikki Linnakangas wrote: > >On 03/07/2014 05:36 AM, Tom Lane wrote: > >>=?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= writes: > >>>Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? > >>>Thinking in a scope of one GSoC, of course. > >> > >>I think it's basically the same thing. You might hope to optimize it; > >>but you have to create (rather than remove) an init fork, and there's > >>no way to do that in exact sync with the commit. > > > >You just have to include that information with the commit WAL record, no? > > No-one's replied yet That might be because it was a month after the initial discussion, and at least I'd temporarily lost track of the thread ;) > , but perhaps the worry is that after you've written the > commit record, you have to go ahead with removing/creating the init fork, > and that is seen as too risky. If a creat() or unlink() call fails, that > will have to be a PANIC, and crash recovery will likewise have to PANIC if > the forks still cannot be removed/created. That's part of the worry, yes. It's also creeping code dealing with unlogged relations into a fairly critical place (RecordTransactionCommit()) where it really doesn't seem to belong. > My first thought is that that seems ok. It's unlikely that an unlink() of a > small file in the data directory would fail. Creation could be done with a > temporary name first and renamed into place, to avoid running out of disk > space in the critical section. I continue to feel that that's far too much impact for a minor feature. Even if it could be made work reliably, it'll be a fair amount of seldomly used infrastructure. > If that's not acceptable, one idea off the top of my head is to somehow > stamp the init forks when making an unlogged table logged, with the XID of > the transcation. 
Crash recovery could then check the clog to see if the > transaction committed, and ignore any init fork files belonging to committed > transactions. (Same in reverse when making a logged table unlogged). I've thought about that - after all, the logical decoding stuff uses that trick in some places - but it has the grave disadvantage that it requires a full directory scan to fully remove a relation. That seems to be a heavy price. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 04/01/2014 08:39 PM, Heikki Linnakangas wrote: On 03/07/2014 05:36 AM, Tom Lane wrote: =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= writes: Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? Thinking in a scope of one GSoC, of course. I think it's basically the same thing. You might hope to optimize it; but you have to create (rather than remove) an init fork, and there's no way to do that in exact sync with the commit. You just have to include that information with the commit WAL record, no? No-one's replied yet, but perhaps the worry is that after you've written the commit record, you have to go ahead with removing/creating the init fork, and that is seen as too risky. If a creat() or unlink() call fails, that will have to be a PANIC, and crash recovery will likewise have to PANIC if the forks still cannot be removed/created. My first thought is that that seems ok. It's unlikely that an unlink() of a small file in the data directory would fail. Creation could be done with a temporary name first and renamed into place, to avoid running out of disk space in the critical section. If that's not acceptable, one idea off the top of my head is to somehow stamp the init forks when making an unlogged table logged, with the XID of the transcation. Crash recovery could then check the clog to see if the transaction committed, and ignore any init fork files belonging to committed transactions. (Same in reverse when making a logged table unlogged). Currently, we reset unlogged relations before replaying the WAL. That would have to be delayed until end of WAL replay, because otherwise we don't know if the transaction committed or not. Although if we go with the stamping approach, we could still reset unstamped files at the beginning of recovery. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
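The "create under a temporary name, then rename into place" idea mentioned above is the standard POSIX idiom for making file creation effectively atomic: rename(2) within one filesystem either fully succeeds or leaves the old state intact, so a crash can never expose a torn init fork. A minimal standalone sketch (the function name and error handling are assumptions for illustration, not the backend's actual fork-creation code):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Create an empty file at "path" atomically: write a temporary sibling
 * first, fsync it, then rename(2) it into place.  (A production version
 * would also fsync the containing directory to make the rename durable.)
 * Returns 0 on success, -1 on failure with the temp file cleaned up. */
int
create_fork_atomically(const char *path)
{
	char	tmppath[1024];
	int		fd;

	snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);

	fd = open(tmppath, O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fd < 0)
		return -1;
	if (fsync(fd) < 0 || close(fd) < 0)
	{
		unlink(tmppath);
		return -1;
	}
	if (rename(tmppath, path) < 0)
	{
		unlink(tmppath);
		return -1;
	}
	return 0;
}
```

The point of the idiom for this thread is that running out of disk space (or any other creation failure) is detected on the temporary file, before the critical section that must not fail.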
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 2014-04-01 20:39:35 +0300, Heikki Linnakangas wrote: > On 03/07/2014 05:36 AM, Tom Lane wrote: > >=?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= writes: > >>Do you think is difficult to implement "ALTER TABLE ... SET UNLOGGED" too? > >>Thinking in a scope of one GSoC, of course. > > > >I think it's basically the same thing. You might hope to optimize it; > >but you have to create (rather than remove) an init fork, and there's > >no way to do that in exact sync with the commit. > > You just have to include that information with the commit WAL record, no? Sure, it's possible to do that. But that seems like complicating generic paths more than I'd like for a minor feature. Especially as the unlinking of the files would need to happen somewhere in RecordTransactionCommit(). After the XLogFlush(), but before unsetting MyPgXact->delayChkpt. That's a crit section, right? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 04/03/2014 01:44 PM, Andres Freund wrote: On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: On 04/01/2014 08:58 PM, Andres Freund wrote: On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: On 3/4/14, 8:50 AM, Andres Freund wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into the relation itself? Yes. But as I said, that's unavoidable for anything but wal_level=minimal. Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. I understand that it might be difficult to do, because of the way the system catalogs work, but it's worthy goal. I don't think that's realistic to achieve due to the issues described in http://archives.postgresql.org/message-id/CA%2BTgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0%2BaLg%3Dx2Q%40mail.gmail.com To which I replied here: http://www.postgresql.org/message-id/533af9d7.7010...@vmware.com. Please reply to that sub-thread with any problems you see. I might be missing something, but I really don't see any insurmountable problem here. I don't think it's worthwile to make the feature much more complex, just to address this. perfect is the enemy of good and all that. We should do the trivial implementation first, sure. But that ought to be trivial. Now is the time to discuss how to do the more optimal thing. If we can come up with a feasible design on that, Fabrizio will have time to do that as part of the GSoC. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] quiet inline configure check misses a step for clang
Hi, The current quiet inline test doesn't work for clang. As e.g. evidenced in http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=gull&dt=2014-04-03%2007%3A49%3A26&stg=configure configure thinks it's not quiet. Which means that postgres compiled with a recent clang will be noticably slower than it needs to be. The reason for that is that clang is smart and warns about static inline if they are declared locally in the .c file, but not if they are declared in a #included file. That seems to be a reasonable behaviour... I think that needs to be fixed. We either can make the configure test considerably more complex or simply drop the requirement for quiet inline. Comments? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 2014-04-03 13:38:29 +0300, Heikki Linnakangas wrote: > On 04/01/2014 08:58 PM, Andres Freund wrote: > >On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: > >>On 3/4/14, 8:50 AM, Andres Freund wrote: > >>>Can't that be solved by just creating the permanent relation in a new > >>>relfilenode? That's equivalent to a rewrite, yes, but we need to do that > >>>for anything but wal_level=minimal anyway. > >> > >>Maybe I'm missing something, but doesn't this actually involve writing the > >>data twice? Once into WAL and again into the relation itself? > > > >Yes. But as I said, that's unavoidable for anything but > >wal_level=minimal. > > Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... > SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. > I understand that it might be difficult to do, because of the way the system > catalogs work, but it's worthy goal. I don't think that's realistic to achieve due to the issues described in http://archives.postgresql.org/message-id/CA%2BTgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0%2BaLg%3Dx2Q%40mail.gmail.com I don't think it's worthwile to make the feature much more complex, just to address this. perfect is the enemy of good and all that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC proposal - "make an unlogged table logged"
On 04/01/2014 08:58 PM, Andres Freund wrote: On 2014-04-01 12:56:04 -0500, Jim Nasby wrote: On 3/4/14, 8:50 AM, Andres Freund wrote: Can't that be solved by just creating the permanent relation in a new relfilenode? That's equivalent to a rewrite, yes, but we need to do that for anything but wal_level=minimal anyway. Maybe I'm missing something, but doesn't this actually involve writing the data twice? Once into WAL and again into the relation itself? Yes. But as I said, that's unavoidable for anything but wal_level=minimal. Ideally, you would *only* write the data to WAL, when you do ALTER TABLE ... SET LOGGED. There's no fundamental reason you need to rewrite the heap, too. I understand that it might be difficult to do, because of the way the system catalogs work, but it's a worthy goal. - Heikki
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On 2014-04-03 00:48:12 -0400, Greg Stark wrote: > On Wed, Apr 2, 2014 at 4:16 PM, Andres Freund wrote: > > PS: Could you please start to properly quote again? You seem to have > > stopped doing that entirely in the last few months. > > I've been responding a lot from the phone. Unfortunately the Gmail client > on the phone makes it nearly impossible to format messages well. I'm > beginning to think it would be better to just not quote at all any more. > I'm normally not doing a point-by-point response anyways. I really don't care where you're answering from TBH. It's unreadable, misses context and that's it. If $device doesn't work for you, don't use it. I don't mind an occasional quick answer that's badly formatted, but for other things it's really annoying. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] json(b) equality rules
Well, we don't support Infinity and NaN in json(b), just like the JSON standard :) Now we need a script which generates a nice html table. On Thu, Apr 3, 2014 at 12:40 PM, Yeb Havinga wrote: > On 2014-04-03 09:40, Oleg Bartunov wrote: >> >> Sure, we don't follow. I mean, should we add such >> matrices to the documentation. >> >> Oleg >> >> On Thu, Apr 3, 2014 at 11:32 AM, Oleg Bartunov >> wrote: >>> >>> Hi there, >>> >>> I'm wondering if we should follow all js equality rules as >>> nicely visualized in >>> >>> http://strilanc.com/visualization/2014/03/27/Better-JS-Equality-Table.html >>> >>> Oleg > > > > +1 > > I was a bit curious what the result would be. A quick inspection of the > query results below gave the impression that the matrix would probably show > a diagonal line. Even though the table is not necessary as a reference to > strange equality rules, a table of equality showing a diagonal will be easy > to remember. > > regards, > Yeb > > drop table testjsonb; > create table testjsonb(a jsonb); > insert into testjsonb (a) values ('true'); > insert into testjsonb (a) values ('false'); > insert into testjsonb (a) values ('1'); > insert into testjsonb (a) values ('0'); > insert into testjsonb (a) values ('-1'); > insert into testjsonb (a) values ('"true"'); > insert into testjsonb (a) values ('"false"'); > insert into testjsonb (a) values ('"1"'); > insert into testjsonb (a) values ('"0"'); > insert into testjsonb (a) values ('""'); > insert into testjsonb (a) values ('null'); > insert into testjsonb (a) values ('undefined'); > insert into testjsonb (a) values ('Infinity'); > insert into testjsonb (a) values ('-Infinity'); > insert into testjsonb (a) values ('[]'); > insert into testjsonb (a) values ('{}'); > insert into testjsonb (a) values ('[{}]'); > insert into testjsonb (a) values ('[0]'); > insert into testjsonb (a) values ('[1]'); > insert into testjsonb (a) values ('NaN'); > > select a.a, b.a, a.a = b.a > from testjsonb a, testjsonb b;
Re: [HACKERS] json(b) equality rules
On 2014-04-03 09:40, Oleg Bartunov wrote: Sure, we don't follow. I mean, should we add such matrices to the documentation. Oleg On Thu, Apr 3, 2014 at 11:32 AM, Oleg Bartunov wrote: Hi there, I'm wondering if we should follow all js equality rules as nicely visualized in http://strilanc.com/visualization/2014/03/27/Better-JS-Equality-Table.html Oleg +1 I was a bit curious what the result would be. A quick inspection of the query results below gave the impression that the matrix would probably show a diagonal line. Even though the table is not necessary as a reference to strange equality rules, a table of equality showing a diagonal will be easy to remember. regards, Yeb drop table testjsonb; create table testjsonb(a jsonb); insert into testjsonb (a) values ('true'); insert into testjsonb (a) values ('false'); insert into testjsonb (a) values ('1'); insert into testjsonb (a) values ('0'); insert into testjsonb (a) values ('-1'); insert into testjsonb (a) values ('"true"'); insert into testjsonb (a) values ('"false"'); insert into testjsonb (a) values ('"1"'); insert into testjsonb (a) values ('"0"'); insert into testjsonb (a) values ('""'); insert into testjsonb (a) values ('null'); insert into testjsonb (a) values ('undefined'); insert into testjsonb (a) values ('Infinity'); insert into testjsonb (a) values ('-Infinity'); insert into testjsonb (a) values ('[]'); insert into testjsonb (a) values ('{}'); insert into testjsonb (a) values ('[{}]'); insert into testjsonb (a) values ('[0]'); insert into testjsonb (a) values ('[1]'); insert into testjsonb (a) values ('NaN'); select a.a, b.a, a.a = b.a from testjsonb a, testjsonb b;
Re: [HACKERS] json(b) equality rules
Sure, we don't follow. I mean, should we add such matrices to the documentation. Oleg On Thu, Apr 3, 2014 at 11:32 AM, Oleg Bartunov wrote: > Hi there, > > I'm wondering if we should follow all js equality rules as > nicely visualized in > http://strilanc.com/visualization/2014/03/27/Better-JS-Equality-Table.html > > Oleg
[HACKERS] json(b) equality rules
Hi there, I'm wondering if we should follow all js equality rules as nicely visualized in http://strilanc.com/visualization/2014/03/27/Better-JS-Equality-Table.html Oleg
Re: [HACKERS] using arrays within structure in ECPG
Hi Michael, The problem of offsets seems to be universal. If there is a structure within a structure, the offset (the per-row stride) used for the members of the inner structure should be the size of the outer structure, not the size of the inner structure. Applying this rule recursively, the stride for a member of any nested structure, at whatever level of nesting, should be the size of the outermost structure. But the code as of now uses the size of the immediate parent. None of these problems are caught in the regression tests because, as far as I have seen, no test fetches more than one tuple into such a complex structure. On Tue, Apr 1, 2014 at 4:34 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > Hi Michael, > I tried to fix the offset problem. PFA the patch. It does solve the > problem of setting a wrong offset in the ECPGdo() call. > > But then there is the problem of interpreting the result from the server as an > array within an array of structures. The problem is in > ecpg_get_data(). This function cannot understand that the "field" is an > array of integers (or for that matter an array of anything) and store all the > values in contiguous memory at the given address. > > > > On Thu, Mar 27, 2014 at 11:05 PM, Michael Meskes wrote: >> On Mon, Mar 24, 2014 at 11:52:30AM +0530, Ashutosh Bapat wrote: >> > For all the members of struct employee, except arr_col, the size of the array >> > is set to 14 and the next member offset is set to sizeof(struct employee). But >> > for arr_col they are set to 3 and sizeof(int) resp. So, from the next row >> > onwards, the calculated offset of the arr_col member would not coincide with >> > the real arr_col member's address. >> > >> > Am I missing something here? >> >> No, this looks like a bug to me. I haven't had time to look into the >> source code but the offset definitely is off. 
>> >> Michael >> -- >> Michael Meskes >> Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) >> Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org >> Jabber: michael.meskes at gmail dot com >> VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL >> > > > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company