Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.
On Sun, Jun 1, 2014 at 1:05 PM, Andres Freund wrote:
> On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
> > On Wed, May 28, 2014 at 8:53 PM, Andres Freund wrote:
> > > Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
> > > WaitLatch() to wait. That's fine in itself. But
> > > procsignal_sigusr1_handler, which is used e.g. when resolving recovery
> > > conflicts, doesn't unconditionally do a SetLatch().
> > > That means that we'll currently not be able to cancel conflicting
> > > backends during recovery for 10min. Now, I don't think that'll happen
> > > too often in practice, but it's still annoying.
> >
> > How will such a situation occur, aren't we using pg_usleep during
> > RecoveryConflict functions
> > (ex. in ResolveRecoveryConflictWithVirtualXIDs)?
>
> I am not sure what you mean. pg_sleep() is the SQL callable function, a
> different thing to pg_usleep().

I was not clear on how such a situation could occur, but looking at it a
bit more carefully, I now understand: any backend calling pg_sleep() can
face this situation during recovery conflict resolution.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] jsonb access operators inefficiency
On 05/30/2014 01:41 PM, Tom Lane wrote:
> Andrew Dunstan writes:
> > If we're going to construct varlena objects inside a StringInfo, maybe
> > we need a proper API for it. Isn't there a danger that the data member
> > of the StringInfo won't be properly aligned to allow us to do this? In
> > any case, we should get most of the benefit of your patch without this
> > "optimization".
>
> As noted, the data buffer is maxaligned; but nonetheless I agree that
> this is a serious stylistic abuse, and it's not buying much compared to
> the rest of the patch. I'd stick with the cstring_to_text_with_len
> coding.
>
> At the other end of the process, why are we using PG_GETARG_TEXT_P and
> not PG_GETARG_TEXT_PP to avoid a detoast cycle on short-header inputs?
> The function body is using VARDATA_ANY/VARSIZE_ANY_EXHDR, so it's
> already prepared for unaligned input.

Committed with these changes.

cheers

andrew
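For readers following along, here is a minimal sketch of the
argument-handling pattern discussed above (a hypothetical illustrative
function, not the committed jsonb code): PG_GETARG_TEXT_PP leaves
short-header inputs undetoasted, and the _ANY variants of the varlena
macros cope with either header form.

#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

/* Hypothetical example, not part of the patch: echo a text argument. */
Datum
echo_text(PG_FUNCTION_ARGS)
{
	/* _PP skips the detoast cycle that _P forces on short-header input */
	text	   *in = PG_GETARG_TEXT_PP(0);

	/* the _ANY macros work for both short- and full-header varlenas */
	char	   *data = VARDATA_ANY(in);
	int			len = VARSIZE_ANY_EXHDR(in);

	PG_RETURN_TEXT_P(cstring_to_text_with_len(data, len));
}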
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On 06/01/2014 03:44 AM, johnlumby wrote:
> If you look at the new patch, you'll see that for the different-pid
> case, I still call aio_suspend with a timeout. As you or Claudio pointed
> out earlier, it could just as well sleep for the same timeout, but the
> small advantage of calling aio_suspend is that it catches the case where
> the io completed just between aio_error returning EINPROGRESS and the
> aio_suspend call. Also it makes the code simpler. In fact this change is
> quite small, just a few lines in backend/storage/buffer/buf_async.c and
> backend/storage/file/fd.c. Based on this, I think it is not necessary to
> get rid of the polling altogether (and in any case, as far as I can see,
> very difficult).

That's still just as wrong as it always has been. Just get rid of it.
Don't put aiocb structs in shared memory at all. They don't belong there.

> Well, as mentioned earlier, it is not broken. Whether it is efficient I
> am not sure. I have looked at the mutex in aio_suspend that you
> mentioned, and I am not quite convinced that, if the caller is not the
> original aio_read process, it renders the suspend() into an instant
> timeout. I will see if I can verify that.

I don't see the point of pursuing this design further. Surely we don't
want to use polling here, and you're relying on undefined behavior
anyway. I'm pretty sure aio_return/aio_error won't work from a different
process on all platforms, even if it happens to work on Linux. Even on
Linux, it might stop working if the underlying implementation changes
from the glibc pthread emulation to something kernel-based.

> Good point. I have included the guts of your little test program
> (modified to do polling) in the existing autoconf test program that
> decides on the #define USE_AIO_ATOMIC_BUILTIN_COMP_SWAP. See
> config/c-library.m4. I hope this goes some way to answer your concern
> about robustness, as at least now, if the implementation changes in some
> way that renders the polling ineffective, it will be caught in
> configure.

No, that does not make it robust enough.

- Heikki
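For reference, this is the shape of the polling pattern under debate,
reduced to a minimal same-process sketch (assumptions: a single in-flight
request, plain POSIX AIO, none of the patch's shared-memory machinery).
The thread's objection is precisely that calling aio_error/aio_return on
an aiocb owned by a *different* process is undefined behavior.

#include <aio.h>
#include <errno.h>
#include <time.h>
#include <sys/types.h>

/* Wait up to 10ms for one in-flight request, then check its state. */
static ssize_t
poll_one(struct aiocb *cbp)
{
	if (aio_error(cbp) == EINPROGRESS)
	{
		const struct aiocb *list[1];
		struct timespec timeout = {0, 10 * 1000 * 1000};	/* 10ms */

		list[0] = cbp;
		(void) aio_suspend(list, 1, &timeout);	/* may just time out */
	}

	if (aio_error(cbp) == 0)
		return aio_return(cbp);	/* completed: fetch the result (once!) */

	return -1;					/* still in progress, or failed */
}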
[HACKERS] Documenting the Frontend/Backend Protocol update criteria
Hi,

Currently the criteria for updating the F/B protocol are undefined. This
makes it hard to update the protocol going forward. It also makes it hard
to write library/driver/application implementations that will be robust
against future server versions.

Ideally the documentation for 9.4 (backport?) would say what kinds of
changes are allowed within the v3 protocol, and thus imply what kinds of
changes need a new v4 protocol. Is there already a wishlist page of items
to do differently for v4?

I did find the following text in the documentation: "binary
representations for complex data types might change across server
versions". But having more specific rules would help, especially since
that text seems to be there just to scare: so far changes have been
strongly discouraged.

An example to consider: some binary formats have a flags (arrays) or
version (jsonb) field. We should explicitly say that clients must reject
any unknown bits/versions that they do not know about. This guarantees
they detect small format updates instead of silently accepting them and
possibly returning corrupt data.

My motivation: two years ago I accidentally opened a discussion on how to
do updates to the binary encoding of data in the protocol [1]. I would
like to reopen the discussion now, since the jsonb 'binary' encoding is
just a version '1' + text json. The result of the last discussion was
that having a version or flags field as part of the binary format is not
enough, since drivers/libraries (fixable) and applications (unfixable)
depend on the current encoding. And if we add a new bit to the flags or
bump the version number, we will break backward compatibility.

To summarize the previous discussion:
- there are currently no written rules for modifying the binary encoding formats
- the bytea modification was done with a GUC, but a GUC was seen as a bad solution in general
- my proposal to add a minor format version number was not good either, since any per-session state would be problematic for connection poolers

[1]: http://grokbase.com/t/postgresql/pgsql-hackers/11bwhv1esa/add-minor-version-to-v3-protocol-to-allow-changes-without-breaking-backwards-compatibility
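To make the reject-unknown-bits rule concrete, here is a hedged
client-side sketch. The mask value and helper name are invented for
illustration (for the array format, only the "has nulls" bit is defined
today); this is not taken from any existing driver.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical mask: the format-flag bits this client understands. */
#define KNOWN_FORMAT_FLAGS	0x00000001u

/* Refuse to decode a value whose flags word has bits we don't know. */
static void
check_format_flags(uint32_t flags)
{
	uint32_t unknown = flags & ~KNOWN_FORMAT_FLAGS;

	if (unknown != 0)
	{
		fprintf(stderr,
				"unknown binary-format flag bits 0x%x; refusing to decode\n",
				unknown);
		exit(EXIT_FAILURE);
	}
}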
Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)
On 06/01/2014 05:35 PM, Tom Lane wrote:
> I wrote:
> > 3. Try to select some "more portable" non-ASCII character, perhaps
> > U+00A0 (non-breaking space) or U+00E1 (a-acute). I think this would
> > probably work for most encodings but it might still fail in the Far
> > East. Another objection is that the expected/plpython_unicode.out file
> > would contain that character in UTF8 form. In principle that would
> > work, since the test sets client_encoding = utf8 explicitly, but I'm
> > worried about accidental corruption of the expected file by text
> > editors, file transfers, etc. (The current usage of U+0080 doesn't
> > suffer from this risk because psql special-cases printing of multibyte
> > UTF8 control characters, so that we get exactly "\u0080".)
>
> I did a little bit of experimentation and determined that none of the
> LATIN1 characters are significantly more portable than what we've got:
> for instance a-acute fails to convert into 16 of the 33 supported
> server-side encodings (versus 17 failures for U+0080). However,
> non-breaking space is significantly better: it converts into all our
> supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW. It
> seems likely that we won't do better than that except with a basic ASCII
> character.

Yeah, I just looked at the copyright symbol, with similar results. Let's
just stick to ASCII.

cheers

andrew
Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)
I wrote:
> 3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
> (non-breaking space) or U+00E1 (a-acute). I think this would probably
> work for most encodings but it might still fail in the Far East. Another
> objection is that the expected/plpython_unicode.out file would contain
> that character in UTF8 form. In principle that would work, since the test
> sets client_encoding = utf8 explicitly, but I'm worried about accidental
> corruption of the expected file by text editors, file transfers, etc.
> (The current usage of U+0080 doesn't suffer from this risk because psql
> special-cases printing of multibyte UTF8 control characters, so that we
> get exactly "\u0080".)

I did a little bit of experimentation and determined that none of the
LATIN1 characters are significantly more portable than what we've got:
for instance a-acute fails to convert into 16 of the 33 supported
server-side encodings (versus 17 failures for U+0080). However,
non-breaking space is significantly better: it converts into all our
supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW. It
seems likely that we won't do better than that except with a basic ASCII
character.

In principle we could make the test "pass" even in these encodings by
adding variant expected files, but I doubt it's worth it. I'd be inclined
to just add a comment to the regression test file indicating that that's
a known failure case, and move on.

regards, tom lane
plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)
Tomas Vondra writes:
> On 13.5.2014 20:58, Tom Lane wrote:
>> Tomas Vondra writes:
>>> Yeah, not really what we were shooting for. I've fixed this by
>>> defining the missing locales, and indeed - magpie now fails in
>>> plpython tests.
>>
>> I saw that earlier today (tho right now the buildfarm server seems to
>> not be responding :-(). Probably we should use some more-widely-used
>> character code in that specific test?
>
> Any idea what other character could be used in those tests? ISTM fixing
> this universally would mean using ASCII characters - the subset of UTF-8
> common to all the encodings. But I'm afraid that'd contradict the very
> purpose of those tests ...

We really ought to resolve this issue so that we can get rid of some of
the red in the buildfarm. ISTM there are three possible approaches:

1. Decide that we're not going to support running the plpython regression
tests under "weird" server encodings, in which case Tomas should just
remove cs_CZ.WIN-1250 from the set of encodings his buildfarm animals
test. Don't much care for this, but it has the attraction of being
minimal work.

2. Change the plpython_unicode test to use some ASCII character in place
of \u0080. We could keep on using the \u syntax to create the character,
but as stated above, this still seems like it's losing a significant
amount of test coverage.

3. Try to select some "more portable" non-ASCII character, perhaps U+00A0
(non-breaking space) or U+00E1 (a-acute). I think this would probably
work for most encodings but it might still fail in the Far East. Another
objection is that the expected/plpython_unicode.out file would contain
that character in UTF8 form. In principle that would work, since the test
sets client_encoding = utf8 explicitly, but I'm worried about accidental
corruption of the expected file by text editors, file transfers, etc.
(The current usage of U+0080 doesn't suffer from this risk because psql
special-cases printing of multibyte UTF8 control characters, so that we
get exactly "\u0080".)

Thoughts?

regards, tom lane
Re: [HACKERS] [GSoC] Clustering in MADlib - status update
Hi all!

I've pushed my report for this week to my repo [0]. Here is a copy!
Attached is the patch containing my work for this week.

Week 2 - 2014/06/01

This week, I have worked on the beginning of the kmedoids module.
Unfortunately, I was supposed to have something working for last
Wednesday, and it is still not ready, mostly because I lost time this
week by being sick, and by packing all my stuff in preparation for
relocation. The good news now: this week is my last school (exam) week,
which means full-time GSoC starting next Monday!

Also, I've studied the kmeans module quite thoroughly, and I can finally
understand how it all works, with the exception of one bit: the enormous
SQL request used to update the IterationController. For kmedoids, I've
abandoned the idea of writing the loop by myself and have decided instead
to stick to copying kmeans as much as possible, as it seems easier than
doing it all myself. The only part that remains to be adapted is that big
SQL query I haven't totally understood yet. I've asked Atri for help, but
surely the help of an experienced MADlib hacker would speed things up :)

Atri and I would also like to deal with this through a voip meeting, to
ease communication. If anyone wants to join, you're welcome! As for the
technology we'll use, I have a Mumble server running somewhere, if that
suits everyone. Otherwise, suggest something! I am available from Monday
2 at 3 p.m. (UTC) to Wednesday 4 at 10 a.m. (exam weeks are quite light).

This week, I have also faced the first design decisions I have had to
make. For kmedoids, the centroids are points of the dataset. So, if I
wanted to identify them precisely, I'd need to use their ids, but that
would mean having a prototype different from the kmeans one. So, for now,
I've decided to use the points' coordinates only, hoping I will not run
into trouble. If I ever do, switching to ids shouldn't be too hard. Also,
if the user wants to input initial medoids, they can input whatever
points they want, whether part of the dataset or not. After the first
iteration, the centroids will be points of the dataset anyway (maybe I
could just select the points nearest to the coordinates they input as
initial centroids).

Second, I'll need to refactor the code in kmeans and kmedoids, as these
two modules are very similar. There are several options for this:

1. One big "clustering" module containing everything clustering-related
(ugly but easy option);
2. A "clustering" module with "kmeans", "kmedoids", "optics", and "utils"
submodules (the best imo, but I'm not sure it's doable);
3. A "clustering_utils" module at the same level as the others (less ugly
than the first one, but easy too).

Any opinions?

Next week, I'll get a working kmedoids module, do some refactoring, and
then add the extra methods, similar to what's done in kmeans, for the
different seedings. Once that's done, I'll make it compatible with all
three ports (I'm currently producing Postgres-only code, as it's the
easiest for me to test), and write the tests and docs. The deadline for
this last step is in two weeks; I don't know yet if I'll be on time by
then or not. It will depend on how fast I can get kmedoids working, and
how fast I'll go once I'm full-time GSoC.
Finally, don't hesitate to tell me if you think my decisions are wrong;
I'm glad to learn :)

[0] http://git.viod.eu/viod/gsoc_2014/blob/master/reports.rst

--
Maxence Ahlouche
06 06 66 97 00

diff --git a/src/config/Modules.yml b/src/config/Modules.yml
index bf48d82..8f3431f 100644
--- a/src/config/Modules.yml
+++ b/src/config/Modules.yml
@@ -20,6 +20,7 @@ modules:
   depends: ['svec_util']
 - name: kmeans
   depends: ['array_ops', 'svec_util', 'sample']
+- name: kmedoids
 - name: lda
   depends: ['array_ops']
 - name: linalg
diff --git a/src/ports/postgres/modules/kmedoids/__init__.py_in b/src/ports/postgres/modules/kmedoids/__init__.py_in
new file mode 100644
index 000..e69de29
diff --git a/src/ports/postgres/modules/kmedoids/kmedoids.py_in b/src/ports/postgres/modules/kmedoids/kmedoids.py_in
new file mode 100644
index 000..e6e6167
--- /dev/null
+++ b/src/ports/postgres/modules/kmedoids/kmedoids.py_in
@@ -0,0 +1,38 @@
+import plpy
+
+from utilities.validate_args import table_exists
+from utilities.validate_args import table_is_empty
+
+# --
+
+
+# TODO: refactor (directly copied from kmeans module)
+def kmedoids_validate_src(schema_madlib, rel_source, **kwargs):
+    if rel_source is None or rel_source.strip().lower() in ('null', ''):
+        plpy.error("kmeans error: Invalid data table name!")
+    if not table_exists(rel_source):
+        plpy.error("kmeans error: Data table does not exist!")
+    if table_is_empty(rel_source):
+        plpy.error("kmeans error: Data table is empty!")
+
+# --
+
+
+def compute_kmedoids(schem
Re: [HACKERS] Changeset Extraction v7.6.1
On 6/1/14, 10:49 AM, Euler Taveira wrote:
> On 01-06-2014 02:57, Andres Freund wrote:
>> On 2014-06-01 00:50:58 -0500, Jim Nasby wrote:
>>> On 5/31/14, 9:11 AM, Andres Freund wrote:
>>>> On 2014-02-21 15:14:15 -0600, Jim Nasby wrote:
>>>>> On 2/17/14, 7:31 PM, Robert Haas wrote:
>>>>>> But do you really want to keep that snapshot around long enough to
>>>>>> copy the entire database? I bet you don't: if the database is big,
>>>>>> holding back xmin for long enough to copy the whole thing isn't
>>>>>> likely to be fun.
>>>>>
>>>>> I can confirm that this would be epic fail, at least for londiste. It
>>>>> takes about 3 weeks for a new copy of a ~2TB database. There's no way
>>>>> that'd work with one snapshot. (Granted, copy performance in londiste
>>>>> is rather lackluster, but still...)
>>>>
>>>> I'd marked this email as todo: If you have such a huge database you
>>>> can, with logical decoding at least, use a basebackup using
>>>> pg_basebackup or pg_start/stop_backup() and roll forward from that...
>>>> That'll hopefully make such huge copies much faster.
>>>
>>> Just keep in mind that one of the use cases for logical replication is
>>> upgrades.
>>
>> Should still be fine. Make a physical copy; pg_upgrade; catch up via
>> logical rep.
>
> Bear in mind that it is not an option if you want to copy *part* of the
> database(s) (unless you have space available and want to do the cleanup
> after the upgrade). In the near future, a (new) tool could do (a) copy
> schema, (b) accumulate modifications while copying data, (c) copy whole
> table and (d) apply modifications for selected table(s)/schema(s). Such a
> tool could even be an alternative to pg_upgrade.

There are also things that pg_upgrade doesn't handle, so it's not always
an option.

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.
Andres Freund writes:
> I am pretty sure by now that the sane fix for this is to add a
> SetLatch() call to RecoveryConflictInterrupt(). All the signal handlers
> that deal with query cancelation et al. do so, so it seems right that
> RecoveryConflictInterrupt() does so as well.

+1

regards, tom lane
Re: [HACKERS] teaching Var about NOT NULL
David Rowley writes:
> I quickly put together the attached patch which adds a "knownnotnull"
> bool field to Var which we can set to true when we're completely sure
> that the Var cannot contain any NULL values.

This is utterly the wrong way to go about it. How will you update views
containing such Vars, when someone does an ALTER TABLE SET/DROP NOT NULL?

The right thing is for the planner to look *at plan time* to see if the
column is marked NOT NULL. (This is safe against the ALTER TABLE problem
because an ALTER will result in a relcache invalidation signal, forcing
any plans referencing the table to be regenerated before next use.)

One way to go about it would be to add a bool array to RelOptInfo and
teach plancat.c to populate the array. However, that would only be a win
if we could expect the information to get used (preferably multiple
times) in most planning attempts. That does not seem to me to be likely
for this use case, so I'd go with just doing a pg_attribute catcache
lookup on-the-fly when necessary. I'd suggest (1) a new lsyscache.c
utility function taking a relation OID and an attnum; (2) a new function
somewhere in the planner that decides whether an expression is known
not-null. For a Var, it'd fetch the matching RTE, see if it's
RTE_RELATION, and if so call the lsyscache.c function. There are a lot of
other potential cases that such a function could be taught about later,
if it proves useful.

BTW, you'd need to be pretty careful about semantics here. Even if the
Var is known not-null at the point of scanning the relation, what if that
relation is nullable by some upper outer join? Perhaps the function (2)
would need to take an argument describing the join level at which we're
making the test.

regards, tom lane
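For what it's worth, a sketch of what (1) might look like, following the
existing lsyscache.c idioms (the function name is invented; this is not
an actual patch):

/* Hypothetical addition to src/backend/utils/cache/lsyscache.c */
#include "postgres.h"
#include "access/htup_details.h"
#include "catalog/pg_attribute.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"

/*
 * get_attnotnull -- does the given column have attnotnull set?
 *
 * Returns false if the pg_attribute row is not found.
 */
bool
get_attnotnull(Oid relid, AttrNumber attnum)
{
	HeapTuple	tp;
	bool		result = false;

	tp = SearchSysCache2(ATTNUM,
						 ObjectIdGetDatum(relid),
						 Int16GetDatum(attnum));
	if (HeapTupleIsValid(tp))
	{
		Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);

		result = att_tup->attnotnull;
		ReleaseSysCache(tp);
	}
	return result;
}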
Re: [HACKERS] Changeset Extraction v7.6.1
On 01-06-2014 02:57, Andres Freund wrote:
> On 2014-06-01 00:50:58 -0500, Jim Nasby wrote:
>> On 5/31/14, 9:11 AM, Andres Freund wrote:
>>> On 2014-02-21 15:14:15 -0600, Jim Nasby wrote:
>>>> On 2/17/14, 7:31 PM, Robert Haas wrote:
>>>>> But do you really want to keep that snapshot around long enough to
>>>>> copy the entire database? I bet you don't: if the database is big,
>>>>> holding back xmin for long enough to copy the whole thing isn't
>>>>> likely to be fun.
>>>>
>>>> I can confirm that this would be epic fail, at least for londiste. It
>>>> takes about 3 weeks for a new copy of a ~2TB database. There's no way
>>>> that'd work with one snapshot. (Granted, copy performance in londiste
>>>> is rather lackluster, but still...)
>>>
>>> I'd marked this email as todo: If you have such a huge database you
>>> can, with logical decoding at least, use a basebackup using
>>> pg_basebackup or pg_start/stop_backup() and roll forward from that...
>>> That'll hopefully make such huge copies much faster.
>>
>> Just keep in mind that one of the use cases for logical replication is
>> upgrades.
>
> Should still be fine. Make a physical copy; pg_upgrade; catch up via
> logical rep.

Bear in mind that it is not an option if you want to copy *part* of the
database(s) (unless you have space available and want to do the cleanup
after the upgrade). In the near future, a (new) tool could do (a) copy
schema, (b) accumulate modifications while copying data, (c) copy whole
table and (d) apply modifications for selected table(s)/schema(s). Such a
tool could even be an alternative to pg_upgrade.

--
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
[HACKERS] teaching Var about NOT NULL
Hi,

While hacking away at implementing join removal support for ANTI-JOINs, I
realised that I couldn't just replace the join with a WHERE false
condition... Let me explain...

With a query such as:

SELECT * FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);

where a.b_id has a foreign key on b(id), I'm working on a join removal
patch which will turn this into:

SELECT * FROM a WHERE b_id IS NULL;

This seemed like a bit of a shame, since with my test tables b_id is
defined NOT NULL, but there seemed to be no way to tell whether I needed
to add a WHERE FALSE or an IS NULL check to the WHERE clause.

I quickly put together the attached patch which adds a "knownnotnull"
bool field to Var which we can set to true when we're completely sure
that the Var cannot contain any NULL values. I'm populating this from
pg_attribute.attnotnull where I can, and setting it to false where I
can't see another way to tell for sure that nulls cannot exist.

The only use of knownnotnull that I've added to the patch is to turn a
query such as:

SELECT * FROM a WHERE b_id IS NULL;

into one that does not scan the table, since b_id is defined as NOT NULL.

postgres=# alter table a alter column b_id drop not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                     QUERY PLAN
---------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=11 width=4)
   Filter: (b_id IS NULL)
 Planning time: 0.340 ms
(3 rows)

postgres=# alter table a alter column b_id set not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                        QUERY PLAN
----------------------------------------------------------
 Result  (cost=0.00..31.40 rows=1 width=4)
   One-Time Filter: false
   ->  Seq Scan on a  (cost=0.00..31.40 rows=1 width=4)
 Planning time: 0.402 ms
(4 rows)

Having this extra flag could likely help optimise NOT IN(SELECT
notnullcol FROM table) to allow this to become an ANTI-JOIN. It will also
help optimise join removals a little more.

The patch is just a few minutes old and there are no regression tests
yet. I'd rather have some feedback before I proceed with it.

Regards

David Rowley

var_not_null_v0.1.patch
Description: Binary data
Re: [HACKERS] "pivot aggregation" with a patched intarray
>On Sat, May 31, 2014 at 12:31 AM, Marc Mamin wrote:
>> I have patched intarray with 3 additional functions in order to
>> count[distinct] event IDs into arrays, where the array positions
>> correspond to the integer values. (mimic column-oriented storage)
>
>I didn't look at the feature itself, but here are some comments about
>the format of the patch:
>- Be careful: the newlines in the file you posted use \r\n, which is
>purely Windows stuff... This will generate unnecessary diffs with the
>source code.

oops, will fix

>- Here are some guidelines about the code conventions:
>http://www.postgresql.org/docs/devel/static/source.html
>- And here are a couple of rules to respect when submitting a patch:
>https://wiki.postgresql.org/wiki/Submitting_a_Patch
>Following those rules will make patch review, as well as the life of
>reviewers, easier.

Thanks for your comments. I don't mean to suggest this directly as a
patch; I'm first interested to see whether there is any interest in such
an aggregation type.

regards,

Marc Mamin
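To illustrate the aggregation idea in isolation (counting events into an
array slot per integer ID, mimicking column-oriented storage), here is a
minimal standalone sketch. It is plain C rather than the intarray patch
itself, and the fixed MAX_EVENT_ID bound is an assumption made for the
example:

#include <stddef.h>
#include <stdio.h>

#define MAX_EVENT_ID 8			/* assumed upper bound, for illustration */

int
main(void)
{
	int		events[] = {1, 3, 3, 7, 1, 3};	/* incoming event IDs */
	int		counts[MAX_EVENT_ID + 1] = {0};	/* array position == event ID */
	size_t	nevents = sizeof(events) / sizeof(events[0]);

	/* the "pivot": each event bumps the counter at its own ID's slot */
	for (size_t i = 0; i < nevents; i++)
		counts[events[i]]++;

	for (int id = 0; id <= MAX_EVENT_ID; id++)
		printf("event %d: seen %d time(s)\n", id, counts[id]);
	return 0;
}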
Re: [HACKERS] "pivot aggregation" with a patched intarray
On Sat, May 31, 2014 at 12:31 AM, Marc Mamin wrote:
> I have patched intarray with 3 additional functions in order to
> count[distinct] event IDs into arrays, where the array positions
> correspond to the integer values. (mimic column-oriented storage)

I didn't look at the feature itself, but here are some comments about the
format of the patch:
- Be careful: the newlines in the file you posted use \r\n, which is
purely Windows stuff... This will generate unnecessary diffs with the
source code.
- Here are some guidelines about the code conventions:
http://www.postgresql.org/docs/devel/static/source.html
- And here are a couple of rules to respect when submitting a patch:
https://wiki.postgresql.org/wiki/Submitting_a_Patch

Following those rules will make patch review, as well as the life of
reviewers, easier.

Regards,
--
Michael
Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.
On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
> On Wed, May 28, 2014 at 8:53 PM, Andres Freund wrote:
> > Hi,
> >
> > Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
> > WaitLatch() to wait. That's fine in itself. But
> > procsignal_sigusr1_handler, which is used e.g. when resolving recovery
> > conflicts, doesn't unconditionally do a SetLatch().
> > That means that we'll currently not be able to cancel conflicting
> > backends during recovery for 10min. Now, I don't think that'll happen
> > too often in practice, but it's still annoying.
>
> How will such a situation occur, aren't we using pg_usleep during
> RecoveryConflict functions
> (ex. in ResolveRecoveryConflictWithVirtualXIDs)?

I am not sure what you mean. pg_sleep() is the SQL callable function, a
different thing to pg_usleep(). The latter isn't interruptible on all
platforms, but the sleep times should be short enough for that not to
matter.

I am pretty sure by now that the sane fix for this is to add a SetLatch()
call to RecoveryConflictInterrupt(). All the signal handlers that deal
with query cancelation et al. do so, so it seems right that
RecoveryConflictInterrupt() does so as well.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
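A minimal sketch of that proposed fix, assuming the 9.4-era spelling of
the process latch (&MyProc->procLatch); this illustrates the proposal and
is not the committed patch:

/* Sketch only: the relevant tail of RecoveryConflictInterrupt(). */
static void
RecoveryConflictInterrupt(ProcSignalReason reason)
{
	/* ... existing recovery-conflict bookkeeping elided ... */

	/*
	 * Wake anything blocked in WaitLatch(), e.g. a backend inside
	 * pg_sleep(), so the conflict is processed promptly instead of after
	 * the full sleep timeout, the same way the other cancel-related
	 * signal handlers do.
	 */
	SetLatch(&MyProc->procLatch);
}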
Re: [HACKERS] avoiding tuple copying in btree index builds
On Tue, May 6, 2014 at 12:04 AM, Robert Haas wrote:
> On Mon, May 5, 2014 at 2:13 PM, Andres Freund wrote:
> > On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
> > > Today, I discovered that when building a btree index, the btree code
> > > uses index_form_tuple() to create an index tuple from the heap tuple,
> > > calls tuplesort_putindextuple() to copy that tuple into the sort's
> > > memory context, and then frees the original one it built. This seemed
> > > inefficient, so I wrote a patch to eliminate the tuple copying. It
> > > works by adding a function tuplesort_putindextuplevalues(), which
> > > builds the tuple in the sort's memory context and thus avoids the
> > > need for a separate copy. I'm not sure if that's the best approach,
> > > but the optimization seems worthwhile.
> >
> > Hm. It looks like we could quite easily just get rid of
> > tuplesort_putindextuple(). The hash usage doesn't look hard to convert.
>
> I glanced at that, but it wasn't obvious to me how to convert the hash
> usage. If you have an idea, I'm all ears.

I also think it's possible to have a similar optimization for a hash
index, in case it has to spool the tuple for sorting. In the function
hashbuildCallback(), when buildstate->spool is true, we can avoid forming
the index tuple. To check for nulls before calling _h_spool(), we can
traverse the isnull array. It seems converting the hash index usage is
not as straightforward as the btree index, but it doesn't look too
complex either.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
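A hedged sketch of that idea follows. It assumes a hypothetical
_h_spool() variant taking the values/isnull arrays (mirroring
tuplesort_putindextuplevalues() on the btree side), and relies on hash
indexes being single-column, so checking isnull[0] suffices:

/* Sketch of a reworked hashbuildCallback(), src/backend/access/hash/hash.c */
static void
hashbuildCallback(Relation index, HeapTuple htup, Datum *values,
				  bool *isnull, bool tupleIsAlive, void *state)
{
	HashBuildState *buildstate = (HashBuildState *) state;

	/* hash indexes are single-column and don't index nulls */
	if (isnull[0])
		return;

	if (buildstate->spool)
	{
		/* hypothetical signature: spool values directly, no tuple copy */
		_h_spool(buildstate->spool, &htup->t_self, values, isnull);
	}
	else
	{
		/* non-sorting path still forms the index tuple as today */
		IndexTuple	itup = index_form_tuple(RelationGetDescr(index),
											values, isnull);

		itup->t_tid = htup->t_self;
		_hash_doinsert(index, itup);
		pfree(itup);
	}

	buildstate->indtuples += 1;
}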