Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
On 23.04.2012 02:59, Fujii Masao wrote:
> On Thu, Apr 19, 2012 at 2:20 PM, Kyotaro HORIGUCHI wrote:
>> Hello, this is a new version of the standby checkpoint_segments patch.
>
> Thanks for the patch!

This still makes catching up in standby mode slower, as you get many more
restartpoints. The reason for ignoring checkpoint_segments during recovery
was to avoid that. Maybe it's still better than what we have currently, I'm
not sure, but at least it needs to be discussed. It would be good to do some
performance testing of recovery with various checkpoint_segments and
_timeout settings, with and without this patch.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Gsoc2012 idea, tablesample
On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:
> On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli wrote:
> > I'd love to see enhanced CTID operators, to fetch all visible tuples in a
> > page using a tidscan. Something like: WHERE ctid =~ '(501,*)' or a ctidrange.
>
> Among other things, this would enable a user-space implementation of
> tablesample. Given the operator =~(tid, int) that matches the page
> number, and planner/executor integration so that it results in a TID
> scan, you would need the following functions:
>
> random_pages(tbl regclass, samples int) returns int[]
> aggregate function:
> reservoir_sample(item anyelement, samples int) returns anyarray
>
> Implementations for both of the functions could be adapted from analyze.c.
>
> Then tablesample could be implemented with the following query:
> SELECT (SELECT reservoir_sample(some_table, 50) AS samples
>         FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
> FROM random_pages('some_table', 50) AS rnd_pgtids;
>
> Actually, now that I think about it, it could actually be implemented
> without any modifications to core at some cost to efficiency.
> random_pages would have to return tid[] that contains, for each
> generated page number, all possible tids on that page.

This is exactly what I'm after. I've actually started crafting such a
TableSample function, and I'm in the process of refining the signature, so
your suggested interface above is very useful, thanks!

But I don't understand the reservoir_sample call: what is it supposed to
do? And how flexible would the "anyarray" return value be? Could you return
arbitrarily typed rowtypes from it?

> By making the building blocks available users get more flexibility.
> The downside would be that we can't automatically make better sampling
> methods available.

One approach doesn't preclude the other. TABLESAMPLE will still be useful,
also for SQL compliance.

--strk;
Delivering high quality PostGIS 2.0!
http://strk.keybit.net - http://vizzuality.com
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Hello,

> > - xlog.c: Make StandbyMode shared.
> >
> > - checkpointer.c: Use IsStandbyMode() to check if postmaster is
> >   under standby mode.
>
> IsStandbyMode() looks like overkill to me. The standby mode flag is forcibly
> turned off at the end of recovery, but that change doesn't need to be shared
> with the checkpointer process; IOW, the shared flag doesn't need to change
> after startup, like XLogCtl->archiveCleanupCommand, I think. So we can
> simplify the code that shares the flag with the checkpointer. See the
> attached patch (though not tested yet).

Hmm. My understanding is that the spinlock and the volatile qualification
of the pointer used when reading shared memory serve two purposes: the
spinlock secures memory consistency on SMPs with weak memory ordering, and
the volatile pointer keeps the compiler from over-optimizing accesses
through a non-volatile pointer. You removed both of them in the patch.

If we are allowed to tolerate a temporary lack of coherence in shared
memory there, the spinlock could be removed. But the possibility of reading
garbage by using XLogCtl itself to access standbyMode does not seem
tolerable. What do you think about that?

> The comments in checkpointer.c seem to need more revision. For example,
>
> + * XLogInsert that actually triggers a checkpoint when
>
> Currently a checkpoint is triggered by XLogWrite (not XLogInsert), so the
> above needs to be corrected.

I will be careful about such outdated descriptions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.
[HACKERS] Usage of planner_ctx
PlannerInfo->planner_ctx is assigned either MessageContext or
PortalHeapMemory, depending on the path from which the planner is invoked.
As far as I can tell, during the whole planning process, until the plan is
stored in PlannedStmt, all memory comes from CurrentMemoryContext, which is
the same as PlannerInfo->planner_ctx. So what is the use of having
PlannerInfo->planner_ctx if it only ever holds CurrentMemoryContext?
Re: [HACKERS] Aggressive memory consumption in {ts,array}_typanalyze
On Wed, Apr 18, 2012 at 11:09 AM, Robert Haas wrote: > On Mon, Apr 16, 2012 at 4:58 PM, Noah Misch wrote: >> The size hint I chose is fairly arbitrary. Any suggestions for principled >> alternatives? > > Based on your test results, it doesn't seem like it matters very much > what you put in there, so I'm inclined to think that num_mcelem is > fine. I thought about maybe allowing for a little slop, like > num_mcelem * 10, but maybe the way you did it is better. It's > possible that people will set ridiculously overblown stats targets on > some columns, and that's certainly going to cost something no matter > what we do, but there's no point in making that worse than it has to > be without some clear reason for doing so. Hearing no further comments, I have committed your patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Patch: add timing of buffer I/O requests
On Sat, Apr 14, 2012 at 10:33 PM, Tom Lane wrote: > Robert Haas writes: >> The internal representation doesn't have to be (and certainly >> shouldn't be) numeric. But if you translate to numeric before >> returning the data to the user, then you have the freedom, in the >> future, to whack around the internal representation however you like, >> without breaking backward compatibility. Choosing float8 for the >> external representation is fine as long as we're sure we're not ever >> going to want more than 16 significant digits, but I see no particular >> value in baking in that assumption. But perhaps, as the saying goes, >> 16 digits ought to be enough for anyone. > > There's no particular reason to think that Moore's Law is going to > result in an increase in the fractional precision of timing data. > It hasn't done so in the past, for sure. Perhaps, but nobody's explained what we gain out of NOT using numeric. "It's slow" doesn't impress me; selecting from a system view doesn't need to be lightning-fast. However, the main thing here is that we need to do *something* here... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Gsoc2012 idea, tablesample
On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli wrote:
> I'd love to see enhanced CTID operators, to fetch all visible tuples in a page
> using a tidscan. Something like: WHERE ctid =~ '(501,*)' or a ctidrange.

Among other things, this would enable user-space implementation of
tablesample. Given the operator =~(tid, int) that matches the page number
and planner/executor integration so that it results in a TID scan, you
would need the following functions:

random_pages(tbl regclass, samples int) returns int[]
aggregate function:
reservoir_sample(item anyelement, samples int) returns anyarray

Implementations for both of the functions could be adapted from analyze.c.

Then tablesample could be implemented with the following query:

SELECT (SELECT reservoir_sample(some_table, 50) AS samples
        FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
FROM random_pages('some_table', 50) AS rnd_pgtids;

Actually, now that I think about it, it could actually be implemented
without any modifications to core at some cost to efficiency.
random_pages would have to return tid[] that contains, for each generated
page number, all possible tids on that page.

By making the building blocks available users get more flexibility. The
downside would be that we can't automatically make better sampling methods
available.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
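[Editor's note: the reservoir_sample aggregate proposed above is the classic
"Algorithm R". The following Python sketch is only an illustration of the
semantics such an aggregate would have; it is not part of the patch and the
function name is hypothetical.]

```python
import random

def reservoir_sample(items, k, rng=random.Random(42)):
    """Keep a uniform random sample of k items from a stream whose
    length is not known in advance (Algorithm R)."""
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(1000), 50)
print(len(sample))  # 50
```

If the stream has fewer than k items, the whole stream is returned, which
matches what a sampling aggregate over a small table would need to do.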
[HACKERS] psql omits row count under "\x auto"
I've been enjoying "\x auto" in .psqlrc, but I noticed the row count footer
missing when it chooses ordinary output:

[local] test=# \x off
Expanded display is off.
[local] test=# select 1;
 ?column? 
----------
        1
(1 row)

[local] test=# \x auto
Expanded display is used automatically.
[local] test=# select 1;
 ?column? 
----------
        1

[local] test=# 

Looks like the logic in printQuery() needs further treatment.

Thanks,
nm
Re: [HACKERS] B-tree page deletion boundary cases
On Sun, Apr 22, 2012 at 12:13:34AM +0530, Nikhil Sontakke wrote:
> Was wondering if there's a similar bug which gets triggered while using
> VACUUM FULL. See for instance this thread:
>
> http://postgresql.1045698.n5.nabble.com/index-corruption-in-PG-8-3-13-td4257589.html
>
> This issue has been reported on and off from time to time, and in most cases
> VACUUM or VACUUM FULL appears to be involved. We have usually attributed it
> to hardware issues, and reindex has been recommended by default as a
> solution/workaround.

I do not perceive much similarity. The bug I've raised can produce wrong
query results transiently. It might permit injecting a tuple into the wrong
spot in the tree, yielding persistently wrong results. It would not
introduce tree-structural anomalies like sibling pointers directed at
zeroed pages, or internal pages in a 1-level tree. Given the symptoms you
reported, I share Robert's suspicion of WAL replay in your scenario.
Re: [HACKERS] ECPG FETCH readahead
Hi,

On 2012-04-17 06:48, Michael Meskes wrote:
> On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote:
>> I listed two scenarios.
>> 1. occasional bump of the readahead window for large requests; for
>>    smaller requests it uses the originally set size
>> 2. permanent bump of the readahead window for large requests (larger
>>    than previously seen); all subsequent requests use the new size
>> Both can be implemented easily, which one do you prefer? If you always
>> use very large requests, 1) behaves like 2)
>
> I'd say let's go for #2. #1 is probably more efficient but not what the
> programmer asked us to do. After all, it's easy to increase the window
> size accordingly if you want to as a programmer.
>
> Michael

OK, I will implement #2.

Another question popped up: what to do with FETCH ALL? Either the current
readahead window size can be used, or it can be temporarily bumped to, say,
some tens of thousands. We may not know in advance how many rows "all
records" really is, so the former, although it lowers performance, saves
memory.

Please, don't apply this patch yet. I discovered a rather big hole that can
confuse the cursor position tracking if you do this:

DECLARE mycur;
MOVE ABSOLUTE n IN mycur;
MOVE BACKWARD m IN mycur;

If (n+m) is greater, but (n-m) is smaller, than the number of rows in the
cursor, the backend's and the caching code's ideas about where the cursor
is will differ. I need to fix this before it can be applied. That will also
need a new round of review. Sorry for that.

Best regards,
Zoltán Böszörményi

-- 
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
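[Editor's note: to make the divergence concrete, here is a toy Python model
of the two position counters. It is hypothetical, not the actual ECPG
caching code, and it assumes the server clamps MOVE ABSOLUTE at one past
the last row while a naive client-side counter just applies the arithmetic.]

```python
def backend_position(nrows, moves):
    """Model the server's cursor position: MOVE ABSOLUTE past the end
    stops just after the last row; MOVE BACKWARD stops at the start."""
    pos = 0
    for op, arg in moves:
        if op == "absolute":
            pos = min(arg, nrows + 1)  # server clamps at end-of-cursor
        elif op == "backward":
            pos = max(pos - arg, 0)
    return pos

def naive_position(nrows, moves):
    """Model a client-side counter that applies moves arithmetically,
    without knowing how many rows the cursor actually has."""
    pos = 0
    for op, arg in moves:
        pos = arg if op == "absolute" else pos - arg
    return pos

# 100-row cursor: MOVE ABSOLUTE 150 is clamped by the server, so after
# MOVE BACKWARD 60 the two ideas of the cursor position differ.
moves = [("absolute", 150), ("backward", 60)]
print(backend_position(100, moves), naive_position(100, moves))  # 41 90
```

As long as the MOVE ABSOLUTE target stays within the cursor, the two
counters agree; the bug only appears once the server has silently clamped.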
[HACKERS] Namespace of array of user defined types is confused by the parser in insert?
Hello,

Sorry for re-posting; I initially posted this in pgsql.sql, but this group
is probably more appropriate.

I have a bizarre problem that started to manifest itself after the addition
of a field that is an array of compound UDTs to a table declared in
multiple schemas. It is clearly related to how the type namespace is
resolved, and it shows up for the JDBC client (probably related to the
parameterized query, as the static query works without problems).

Given the types:

drop type if exists periodSearchResults cascade;
create type periodSearchResults as (
    period float8[],       -- [1] - 1st value, kth... value for the same ts
    periodError float8[],
    probability float8[],
    amplitude float8[]
);

drop type if exists periodTSResult cascade;
create type periodTSResult as (
    method text,
    periods periodSearchResults[]  -- 1st raw ts, 2...kth - residual values
);

and a table defined as:

CREATE TABLE timeseriesresult (
    id bigint NOT NULL,
    --- ...
    fperiodsearchresults periodTSResult[]
    --- ...
);

when the type periodTSResult is defined in more than one schema, I get an
error on insert using the JDBC client (or a prepared statement; it works
with a non-parameterized query) in any schema other than the first one the
type was defined in. In this case the first schema it was defined in is
cu7user_test, and the schema of the execution is cu7user_test_2:

ERROR: column "fperiodsearchresults" is of type periodtsresult[] but
expression is of type cu7user_test.periodtsresult[] at character 1416
HINT: You will need to rewrite or cast the expression.
STATEMENT: INSERT INTO cu7user_test_2.timeseriesresult (runid, catalogid,
sourceid, ftimeseriestype, fstate, fminimizedfunctionminval, freducedchi2,
fstopcriteria, fweightedmodeling, fhomoscedasticitytest, fkurtosis,
fnumpointsobstime, fljungboxrandomnesstest, fmedianabsolutedeviation, fmax,
fmeanobstime, fmean, fmeanerror, fmedian, fmedianerror, fmin, frange,
frobustweightedstddev, fskewness, fstddev, fsymmetrytest,
ftrimmedweightedmean, ftrimmedweightedrange, fvariabilityflag,
fstatvariabilityflag, fweightedkurtosis, fweightedmean,
fweightedmeanconfidenceinterval, fweightedmeanobstime,
fweightednormalizedp2pscatter, fweightedskewness, fweightedstddevdf,
fweightedstddevwdf, fabbe, fchi2, fiqr, foutliermedian, fpstetson, fpabbe,
fpchi2, fpiqr, fpkurtosis, fpoutliermedian, fpskew, fpweightedkurtosis,
fpweightedskew, fstetson, referencetime, cadencevalue, cadenceerror,
cadencename, fperiodsearchmethod, fweightedpercentileranks,
fweightedpercentiles, fotherparameters, ffundamentalfrequencies,
mapfperiodsearchresults, fperiodsearchresults, fpolynomialcoefficients,
ffouriercoefficients, derivedtschanges) VALUES ($1, $2, $3, $4, $5, $6, $7,
$8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22,
$23, $24, $25, $26, $27, $28, $29, $30, $31, $32, $33, $34, $35, $36, $37,
$38, $39, $40, $41, $42, $43, $44, $45, $46, $47, $48, $49, $50, $51, $52,
$53, $54, $55, $56, $57, $58, $59, $60, $61, $62, $63, $64, $65, $66)
DEBUG: parse S_10: ROLLBACK

I debugged a little to find out that indeed there are two types defined in
separate schemas:

SELECT t.oid, t.typname,
       (select nspname from pg_namespace n where n.oid = typnamespace) nspname,
       typarray
FROM pg_catalog.pg_type t
WHERE typname like 'periodts%';

gives:

  oid  |    typname     |    nspname     | typarray
-------+----------------+----------------+----------
 16646 | periodtsresult | cu7user_test   |    16645
 17123 | periodtsresult | cu7user_test_2 |    17122

but for user/schema cu7user_test_2, at parse_coerce.c:86:

if (!can_coerce_type(1, &exprtype, &targettype, ccontext))
    return NULL;

the function returns false, and subsequently coerce_to_target_type returns
NULL, resulting in the above error. This is caused by the expr pointer at
parse_coerce.c:421 returning the type_id of the type defined in the "1st"
schema:

type_id = exprType((Node *) expr); // <-- type_id returned is 16645, should be 17122

The expr pointer value is already wrong in the parsed list in the loop
starting at analyze.c:799:

foreach(lc, exprlist) // <-- wrong type_id already for the array UDT column

Is there any way to avoid this error other than having a single type
defined for all schemas? Any hints appreciated.

Postgres 9.1.3, latest JDBC driver.

Best regards,
Krzysztof
Re: [HACKERS] Gsoc2012 idea, tablesample
On Sat, Apr 21, 2012 at 02:28:52PM +0800, Qi Huang wrote:
> > Hi, Heikki
> > ...
> > Another idea that Robert Haas suggested was to add support for doing a
> > TID scan for a query like "WHERE ctid < '(501,1)'". That's not enough
> > work for a GSoC project on its own, but could certainly be a part of it.
>
> the first one and the last one are still not clear.

The last one was the TID scan on filters like ctid < '(501,1)'.

TID scans are the fastest access method, as they directly access explicitly
referenced addresses. Starting from this observation, a sampling function
may select random pages, and tuples within pages, and directly access them,
optimizing accesses by grouping tuples within the same page so as to fetch
them all together. This is what the ANALYZE command already does when
providing samples for the type analyzers.

Unfortunately, it looks like at the SQL level only the equality operator
triggers a TID scan, so things like "WHERE ctid < '(501,1)'" won't be as
fast as fetching all visible tuples in the first 501 pages. I think that's
what Heikki was referring to.

I'd love to see enhanced CTID operators, to fetch all visible tuples in a
page using a tidscan. Something like: WHERE ctid =~ '(501,*)' or a
ctidrange.

--strk;
Delivering high quality PostGIS 2.0!
http://strk.keybit.net - http://vizzuality.com
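[Editor's note: the page-grouped access pattern described above can be
sketched outside the backend. The following Python model is purely
illustrative (the function name and parameters are hypothetical): it picks
random page numbers and expands each into its candidate TIDs, grouped so
that every sampled page would be visited exactly once.]

```python
import random
from collections import defaultdict

def random_page_tids(npages, tuples_per_page, nsample_pages, seed=0):
    """Pick nsample_pages distinct random pages and list the candidate
    (page, offset) TIDs on each, grouped per page."""
    rng = random.Random(seed)
    pages = rng.sample(range(npages), nsample_pages)
    by_page = defaultdict(list)
    for page in sorted(pages):  # sorted: pages are visited sequentially
        for off in range(1, tuples_per_page + 1):  # offsets are 1-based
            by_page[page].append((page, off))
    return by_page

tids = random_page_tids(npages=1000, tuples_per_page=100, nsample_pages=50)
print(len(tids))  # 50 sampled pages, each expanded into its candidate TIDs
```

Grouping by page is what makes the strategy cheap: each sampled page incurs
one page fetch regardless of how many of its tuples turn out to be visible.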
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
> On 2012-04-10 09:02, Boszormenyi Zoltan wrote:
> > On 2012-04-06 14:47, Cousin Marc wrote:
> >> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
> >>> On 2012-04-04 21:30, Alvaro Herrera wrote:
> >>>> I think this patch is doing two things: first touching infrastructure
> >>>> stuff and then adding lock_timeout on top of that. Would it work to
> >>>> split the patch in two pieces?
> >>>
> >>> Sure. Attached is the split version.
> >>>
> >>> Best regards,
> >>> Zoltán Böszörményi
> >>>
> >> Hi,
> >>
> >> I've started looking at and testing both patches.
> >>
> >> Technically speaking, I think the source looks much better than the
> >> first version of lock timeout, and may help adding other timeouts in
> >> the future. I haven't tested it in depth though, because I encountered
> >> the following problem:
> >>
> >> While testing the patch, I found a way to crash PG. But what's weird is
> >> that it crashes also with an unpatched git version.
> >>
> >> Here is the way to reproduce it (I have done it with a pgbench schema):
> >>
> >> - Set a small statement_timeout (just to save time during the tests)
> >>
> >> Session 1:
> >> =# BEGIN;
> >> =# LOCK TABLE pgbench_accounts;
> >>
> >> Session 2:
> >> =# BEGIN;
> >> =# LOCK TABLE pgbench_accounts;
> >> ERROR: canceling statement due to statement timeout
> >> =# LOCK TABLE pgbench_accounts;
> >>
> >> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
> >> done with a rollback to savepoint, of course.
> >>
> >> Session 2 crashes with this: TRAP: FailedAssertion(«
> >> !(locallock->holdsStrongLockCount == 0) », fichier : « lock.c »,
> >> ligne : 749).
> >>
> >> It can also be done without a statement_timeout, with a control-C on
> >> the second lock table.
> >>
> >> I didn't touch anything but this. It occurs every time, when asserts
> >> are activated.
> >>
> >> I tried it on 9.1.3, and I couldn't make it crash with the same
> >> sequence of events. So maybe it's something introduced since? Or is
> >> the assert still valid?
> >>
> >> Cheers
> >>
> > Attached are the new patches. I rebased them to current GIT and
> > they are expected to be applied after Robert Haas' patch in the
> > "bug in fast-path locking" thread.
> >
> > Now it survives the above scenario.
> >
> > Best regards,
> > Zoltán Böszörményi
>
> New patch attached, rebased to today's GIT.
>
> Best regards,
> Zoltán Böszörményi

Ok, I've done what was missing from the review (from when I had a bug in
locking the other day), so here is the full review. By the way, this patch
doesn't belong to the current commitfest, but to the next one.

Is the patch in context diff format?
Yes.

Does it apply cleanly to the current git master?
Yes.

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:

Does the patch actually implement that?
Yes.

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc.
without worrying about bringing the application down because of lock
contention.

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout is
obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable.

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of functions managing all sigalarm code,
but it heavily touches this part.

Have all the bases been covered?
I tried all SQL statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding a constraint, lock).
I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.
Rollback to savepoint while holding locks doesn't crash PostgreSQL anymore.
Other timeouts such as archive_timeout and checkpoint_timeout still work.

Does the feature work as advertised?
Yes.

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No.

If it claims to improve performance, does it?
Not applicable.

Does it slow down other things?
No.

Does it follow the project coding guidelines?
I think so.

Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes.

Does it do what it says, correctly?
Yes.

Does it produce compiler warnings?
No.

Can you make it crash?
Not anymore.

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way o
[HACKERS] Patch: add conversion from pg_wchar to multibyte
Hackers,

the attached patch adds conversion from a pg_wchar string to a multibyte
string. This functionality is needed for my patch adding index support for
regular expression search:
http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php

Analyzing the conversion from multibyte to pg_wchar, I found the following
types of conversion:

1) Trivial conversion for single-byte encodings. It just adds leading
   zeros to each byte.
2) Conversion from UTF-8 to Unicode.
3) Conversions from euc* encodings. They write the bytes of a character to
   pg_wchar in inverse order, starting from the lower byte (this
   explanation assumes a little-endian system).
4) Conversion from the mule encoding. This conversion is unclear to me and
   also seems to be lossy.

It was easy to write the inverse conversion for 1-3. I've changed
conversion 4 to behave like 3. I'm not sure my change is OK, because I
didn't understand the original conversion.

-- 
With best regards,
Alexander Korotkov.

wchar2mb-0.1.patch
Description: Binary data
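[Editor's note: conversion type 1 and its inverse can be illustrated with a
toy Python model. This is not the backend code; it only demonstrates why
the single-byte case round-trips losslessly, with the "leading zeros"
living in the upper bytes of each 32-bit pg_wchar.]

```python
def sb_to_wchar(mb):
    """Type-1 conversion for a single-byte encoding: each byte becomes a
    wide value with leading zero bytes (toy model of pg_wchar widening)."""
    return [b for b in mb]  # each int fits in the low byte of a pg_wchar

def wchar_to_sb(wchars):
    """The inverse: keep only the low byte of each wchar. A value above
    0xFF cannot have come from a single-byte encoding."""
    out = bytearray()
    for w in wchars:
        if w > 0xFF:
            raise ValueError("not representable in a single-byte encoding")
        out.append(w)
    return bytes(out)

data = b"abc\xe9"
assert wchar_to_sb(sb_to_wchar(data)) == data  # round-trips losslessly
```

The euc* case (type 3) is the same idea with multi-byte characters packed
byte-by-byte, which is why its inverse was also straightforward; the mule
case resists this treatment precisely because the original mapping drops
information.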