Re: [HACKERS] max_prepared_transactions default ... why 5?
> > I'm writing up the new GUCs, and noticed that max_prepared_transactions
> > defaults to 5.  This is too many for most applications (which don't use
> > them at all) and far too few for applications which use them regularly.
>
> I think the intention was to have enough so you could test 'em (in
> particular, run the regression tests) without eating resources for
> the majority of installations that aren't using them.
>
> Certainly an installation that *is* using 'em would want a higher
> setting.

Can't we make the default 0, which is what the majority should want, and
have the regression test explicitly set it up on the command line?

/Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
Re: [HACKERS] Release notes introductory text
D'Arcy J.M. Cain wrote:
> On Thu, 11 Oct 2007 16:34:14 -0400 (EDT)
> Bruce Momjian <[EMAIL PROTECTED]> wrote:
> > Kevin Grittner wrote:
> > > > PostgreSQL.  Many complex ideas that normally take years
> > > > to implement were added rapidly to this release by our development
> > > > team.
> > >
> > > You do realize that this will make many managers very reluctant to
> > > adopt it before it has settled in for many months, right?
> > >
> > > If the goal is to provide fair warning of a higher-than-usual-risk
> > > release, you've got it covered.
> >
> > No, that was not the intent.  The intent was to say we got a lot done
> > in one year.  Do you have a suggestion?
>
> What if you changed "were added rapidly" to "were quickly brought to
> maturity" or something like that?

Updated text is:

    This release represents a major leap forward for PostgreSQL by adding
    significant new functionality and performance enhancements.  This was
    made possible by a growing community that has dramatically accelerated
    the pace of development.  This release adds the following major
    capabilities:

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
[HACKERS] Proposal: generate_iterator functions
Hello,

this function can help with iteration over arrays:

    create function generate_iterator(anyarray)
    returns setof integer as $$
      select i
        from generate_series(array_lower($1,1), array_upper($1,1)) g(i)
    $$ language sql;

    -- multidimensional
    create function generate_iterator(anyarray, integer)
    returns setof integer as $$
      select generate_series(array_lower($1,$2), array_upper($1,$2))
    $$ language sql;

It could be an internal function, not only a shortcut for generate_series.

Sample:

    create function array_sort(anyarray)
    returns anyarray as $$
      select array(select $1[i] from generate_iterator($1) g(i) order by 1)
    $$ language sql;
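For the curious, a quick sketch of how the proposed functions would be used, assuming the definitions above are loaded into a server (untested here; the outputs are what the SQL implies, not verified results):

```sql
-- Iterate over the subscripts of a one-dimensional array;
-- this should yield 1, 2, 3 for a three-element array:
SELECT i FROM generate_iterator(ARRAY[10, 20, 30]) g(i);

-- Sort an array via the iterator-based array_sort above;
-- this should yield {1,2,3}:
SELECT array_sort(ARRAY[3, 1, 2]);
```

The same effect is achievable today by spelling out generate_series(array_lower(...), array_upper(...)) at every call site; the proposal is essentially sugar for that idiom.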
Re: [HACKERS] max_prepared_transactions default ... why 5?
Josh Berkus wrote:
> On Wednesday 17 October 2007 21:35, Tom Lane wrote:
> > Josh Berkus <[EMAIL PROTECTED]> writes:
> > > I'm writing up the new GUCs, and noticed that max_prepared_transactions
> > > defaults to 5.  This is too many for most applications (which don't use
> > > them at all) and far too few for applications which use them regularly.
> >
> > I think the intention was to have enough so you could test 'em (in
> > particular, run the regression tests) without eating resources for
> > the majority of installations that aren't using them.
> >
> > Certainly an installation that *is* using 'em would want a higher
> > setting.
>
> Yeah, given the amount of memory per xact, I guess we can't actually set
> the default higher.  I just hate to see a setting that is liable to bite
> someone on the tuchas so easily.

They will see the failure at 5 faster and adjust it accordingly.  If it
were higher, they might hit the limit only under heavy load, and it would
surprise them.

--
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] max_prepared_transactions default ... why 5?
Josh Berkus <[EMAIL PROTECTED]> writes:
> Yeah, given the amount of memory per xact, I guess we can't actually set
> the default higher.  I just hate to see a setting that is liable to bite
> someone on the tuchas so easily.

I seem to recall thinking about replacing the setting with a
"prepared_transactions = on/off" boolean parameter, where "off" could be
defined as still allowing enough to run the regression tests.  The
problem is to choose the "on" setting --- it's not too hard to think of
application behaviors where you need *more* than max_connections entries.

			regards, tom lane
Re: [HACKERS] max_prepared_transactions default ... why 5?
On Wednesday 17 October 2007 21:35, Tom Lane wrote:
> Josh Berkus <[EMAIL PROTECTED]> writes:
> > I'm writing up the new GUCs, and noticed that max_prepared_transactions
> > defaults to 5.  This is too many for most applications (which don't use
> > them at all) and far too few for applications which use them regularly.
>
> I think the intention was to have enough so you could test 'em (in
> particular, run the regression tests) without eating resources for
> the majority of installations that aren't using them.
>
> Certainly an installation that *is* using 'em would want a higher
> setting.

Yeah, given the amount of memory per xact, I guess we can't actually set
the default higher.  I just hate to see a setting that is liable to bite
someone on the tuchas so easily.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] max_prepared_transactions default ... why 5?
Josh Berkus <[EMAIL PROTECTED]> writes:
> I'm writing up the new GUCs, and noticed that max_prepared_transactions
> defaults to 5.  This is too many for most applications (which don't use
> them at all) and far too few for applications which use them regularly.

I think the intention was to have enough so you could test 'em (in
particular, run the regression tests) without eating resources for
the majority of installations that aren't using them.

Certainly an installation that *is* using 'em would want a higher
setting.

			regards, tom lane
[HACKERS] max_prepared_transactions default ... why 5?
Folks,

I'm writing up the new GUCs, and noticed that max_prepared_transactions
defaults to 5.  This is too many for most applications (which don't use
them at all) and far too few for applications which use them regularly.

It seems like we should either set the value to 0, or to something
higher, like 50.  It would also be nice to be able to just set the value
to be equal to max_connections automatically, but I'm sure I brought
that point up too late.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
[HACKERS] ts_rewrite bug?
Hmm ... playing a bit more with the example I was just on about:

regression=# create table ts_subst(target tsquery, subst tsquery);
CREATE TABLE
regression=# insert into ts_subst values('a', 'foo|bar');
INSERT 0 1
regression=# insert into ts_subst values('bar', 'baz');
INSERT 0 1
regression=# select ts_rewrite('a & b'::tsquery, 'select target, subst from ts_subst');
       ts_rewrite
-------------------------
 'b' & ( 'foo' | 'bar' )
(1 row)

Shouldn't I have gotten  'b' & ( 'foo' | 'baz' )  ???

			regards, tom lane
[HACKERS] ts_rewrite aggregate API seems mighty ugly
While working on the documentation it's finally sunk into me what
ts_rewrite is all about, and I'm not very happy.

The simple three-argument form is fine; it's basically a
tsquery-specific version of replace():

regression=# select ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery);
       ts_rewrite
-------------------------
 'b' & ( 'foo' | 'bar' )
(1 row)

(BTW, would it be possible to get it to not randomly rearrange the order
of the clauses?  That makes it quite confusing to understand what it's
doing, and I don't offhand see a reason for that to happen.)

And the two-argument form is okay, though a bit hard to wrap your head
around --- the second argument is the text of a SQL query that returns
successive second and third arguments for successive ts_rewrite
operations on the first argument.  For instance,

regression=# create table ts_subst(target tsquery, subst tsquery);
CREATE TABLE
regression=# insert into ts_subst values('a', 'foo|bar');
INSERT 0 1
regression=# insert into ts_subst values('b', 'baz');
INSERT 0 1
regression=# select ts_rewrite('a & b'::tsquery, 'select target, subst from ts_subst');
        ts_rewrite
---------------------------
 'baz' & ( 'foo' | 'bar' )
(1 row)

The point of this of course is to be able to apply many different
substitutions stored in a table.

However, the aggregate form seems just plain weird:

regression=# select ts_rewrite(array['a & b'::tsquery, target, subst]) from ts_subst;
        ts_rewrite
---------------------------
 'baz' & ( 'baz' | 'foo' )
(1 row)

The assumption here is that target and substitute come from successive
rows of the table, and again we're trying to map an original tsquery
through a bunch of different replace() substitutions.

The array[] bit is ugly and inefficient.  I suppose it's a holdover from
days when aggregates could only take one argument.  Now that we have
multiple-argument aggregates it would seem to make more sense to express
the thing as a three-argument aggregate, except that then we'd have to
give it a different name to distinguish it from the basic three-argument
non-aggregate function.

Another thing that I find ugly about this is that the 'original' tsquery
is meaningful only on the first accumulation cycle; after that it's
ignored, but the second and third array elements do have meaning.
That's inconsistent.

But the killer problem is that the aggregate form actually does the
Wrong Thing if there are no rows to aggregate over --- it'll return
NULL, not the original tsquery as you'd want.  This is not too
improbable if you follow the documentation's recommendation to filter
the rows using an @> operator:

regression=# select ts_rewrite(array['a & b'::tsquery, target, subst]) from ts_subst where 'a & b'::tsquery @> target;
        ts_rewrite
---------------------------
 'baz' & ( 'bar' | 'foo' )
(1 row)

regression=# select ts_rewrite(array['quux'::tsquery, target, subst]) from ts_subst where 'quux'::tsquery @> target;
 ts_rewrite
------------

(1 row)

So it seems to me this is just wrong, and the only safe way to do it is
to use the two-argument form, with the selection from the table
happening as a SPI query.

Since we're already committed to an initdb for beta2, it's not quite too
late to reconsider the API here.  My feeling at the moment is that we
should just drop the aggregate form of ts_rewrite; it does nothing you
can't do better with the two-argument form, and it is just ugly to boot.
Also, if anyone does come up with a not-so-ugly design later, we can
always add things in 8.4 or beyond; but once it's in core it's going to
be a very hard sell to take it out.

Comments?

			regards, tom lane
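To make the recommendation concrete, here is a sketch (untested against a live server) of the safer pattern the message argues for, using the ts_subst table above.  The @> filter is pushed into the SQL text of the two-argument form, so an empty match set still returns the original tsquery rather than NULL:

```sql
-- Two-argument form with the @> filter inside the SPI query text.
-- When no rows of ts_subst match, ts_rewrite should simply return
-- the original tsquery, instead of the NULL the aggregate form produces.
SELECT ts_rewrite(
    'quux'::tsquery,
    'SELECT target, subst FROM ts_subst
      WHERE ''quux''::tsquery @> target'
);
```

Note the doubled single quotes, since the second argument is a SQL string containing its own literals.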
Re: [HACKERS] Hash index todo list item
Tom,

That is great.  I am looking forward to your patch.  After the issues
that you needed to address, I think that it would be reasonable to add a
few more user settings for the hash index.  Fill factor is too coarse a
knob.  The others that I have been considering are:

maxtuples - Not really the maximum, but a target value to use for
  setting up the initial buckets.  This would allow you to set it for
  data loads and avoid the "split-n-copy" trauma that you are trying to
  avoid with your new hash build process.

multiplicity - Try to capture use cases that would require many overflow
  pages.  In particular, if we discard the normal index page layout we
  can skip the space overhead of the page pointer and generate a more
  compact index.  Then you could use a few more hash bits to look up the
  index entry in the page.  How many bits would be determined by this
  factor.  8 bits would give you 256 sub-pieces that could each hold
  about 3 entries using the current 4-byte hash, or 2 entries using an
  8-byte hash.

What do you think?

Cheers,
Ken

On Wed, Oct 17, 2007 at 03:31:58PM -0700, Tom Raney wrote:
> Kenneth,
>
> Great!
>
> Yes, we did update the code to use the estimate.  I will post the patch
> with this update.  We only saw a very small difference in index build
> time, but you may when you add many columns to the base relation.
>
> With 1 billion tuples, you should start to see the hash index
> outperform the btree for some equality probes, I would imagine.  With a
> 90% fill factor, the btree would require 4 levels to index that many
> tuples.  If the top two were in memory, there would be 3 IOs needed.  I
> don't think PG supports index-only scans, so it will take the extra IO
> to probe the base relation.  The hash may take up to 2 IOs and maybe
> even less (or maybe more depending on how many overflow buckets there
> are).  It might be interesting to fiddle with the fill factors of each
> index - hash pages (buckets) default to 75% full.
>
> -Tom
>
>> Tom,
>>
>> I am getting ready to stitch in an updated, simplified version
>> of Neil Conway's hash-only hash index patch.  Did you have a
>> chance to update your sizing function to use the planner-like
>> estimate and not a full table scan?  I would like to be able
>> to use that when my test tables start to have 10^9 entries.
>> If you have not had a chance, I will try and add it myself.
>>
>> Regards,
>> Ken
Re: [HACKERS] tsearch2api project
"Pavel Stehule" <[EMAIL PROTECTED]> writes:
> but I would prefer
>
> #define ENVELOPE_FCE(name,dest) \
>     Datum name (PG_FUNCTION_ARGS); \
>     PG_FUNCTION_INFO_V1(name); \
>     Datum \
>     name (PG_FUNCTION_ARGS) \
>     { \
>         return (dest)(fcinfo); \
>     }

That seems perfectly legitimate to me --- I'm pretty sure there are
several instances of that in the core code already.  I'd be more
inclined to call the macro WRAPPER_FUNCTION, perhaps.

			regards, tom lane
Re: [HACKERS] About to remove current contents of contrib/tsearch2
Josh Berkus <[EMAIL PROTECTED]> writes:
>> Barring objection I'll remove the current files, add the removets2.pl
>> script that was batted around yesterday, and fix the Makefile to
>> install just that script (and later any migration docs or other stuff
>> we add).

> I forget, did we want Oleg to create a Tsearch2 archive on pgfoundry?

Archive for what?  The current tsearch2 sources are useless in the
context of 8.3 --- they conflict with the built-in types.  And the main
CVS server will still be a perfectly good source of the back-branch
sources.

			regards, tom lane
Re: [HACKERS] CVS HEAD dumps core on simple tsvector input example
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> BTW, why does the 'a':6 lexeme disappear?  To the extent that I
>> understand how this should work, I'd have expected 'a' and 'a':6
>> to merge into 'a':6 not plain 'a'.

> 'a':1,6 perhaps?

No, it would be inappropriate to add a '1' that wasn't specified.
My reasoning is that 'a':1 and 'a':6 are distinct bits of information,
hence their combination is 'a':1,6.  But 'a' doesn't give any more
information than 'a':6, so it should be dropped by the
duplicate-elimination code.  It's not clear to me whether that's what
Oleg and Teodor think, though.

Hm, just found a variant of the bug:

regression=# select 'a a:6'::tsvector;
 tsvector
----------
 'a'
(1 row)

regression=# select 'a a:6'::tsvector;
 tsvector
----------
 'a':6,16255C,0,12C,8C,2640,0,512,0,312,12C,400C,0,312,0,1,0,0,8448C,21,6
(1 row)

This makes it look even more like a memory-corruption issue.

			regards, tom lane
Re: [HACKERS] CVS HEAD dumps core on simple tsvector input example
Tom Lane wrote:
> BTW, why does the 'a':6 lexeme disappear?  To the extent that I
> understand how this should work, I'd have expected 'a' and 'a':6
> to merge into 'a':6 not plain 'a'.

'a':1,6 perhaps?

--
Alvaro Herrera               http://www.amazon.com/gp/registry/DXLWNGRJD34J
We take risks not to escape from life, but to prevent life escaping from us.
[HACKERS] CVS HEAD dumps core on simple tsvector input example
regression=# SELECT 'a very fat cat sat:4 on:5 a:6 mat:7'::tsvector;
                   tsvector
-----------------------------------------------
 'a' 'on':5 'cat' 'fat' 'mat':7 'sat':4 'very'
(1 row)

regression=# SELECT 'a very fat cat sat:4 on:5 a:6 mat:7'::tsvector;
server closed the connection unexpectedly

Notice it's the same input both times --- only the second one crashes.
The coredump happens inside repalloc, making me suspect a memory clobber
is involved.

BTW, why does the 'a':6 lexeme disappear?  To the extent that I
understand how this should work, I'd have expected 'a' and 'a':6
to merge into 'a':6 not plain 'a'.

			regards, tom lane
Re: [HACKERS] Windows and locales and UTF-8 (oh my)
Magnus Hagander wrote:
> Got some help on IRC to identify the characters as ç and Ç.

Exact.

> I can confirm that both work perfectly fine with UTF-8 and locale
> Swedish_Sweden.1252.  They sort correctly, and they work with both
> upper() and lower() correctly.

I don't remember what the locale is.  I'll check it.

> This test is with 8.3-HEAD and the patch to allow UTF-8.

I tested with 8.2.4 and my encoding is LATIN1 IIRC.  Didn't try UTF-8.
I'll give it a try when I have my dev environment.

--
Euler Taveira de Oliveira
http://www.timbira.com/
Re: [HACKERS] About to remove current contents of contrib/tsearch2
Tom,

> Barring objection I'll remove the current files, add the removets2.pl
> script that was batted around yesterday, and fix the Makefile to
> install just that script (and later any migration docs or other stuff
> we add).

I forget, did we want Oleg to create a Tsearch2 archive on pgfoundry?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] tsearch2api project
2007/10/17, Tom Lane <[EMAIL PROTECTED]>:
> "Pavel Stehule" <[EMAIL PROTECTED]> writes:
> >> Surely this shouldn't be creating its own tsvector datatype?
>
> > I would like to use only pg_catalog.tsvector.  But the dump contains
> > a CREATE TYPE statement, and custom functions which prefer it,
> > because the path is: public, pg_catalog :(.  I didn't find any other
> > solution with full compatibility.
>
> I think we are really going to have to tell people to use the removets2
> script on their dumps in any case.  I can't imagine that it'll be a
> good idea to have a bogus tsvector type in the system -- it'll mess
> things up going forward.  I think probably the sort of solution you
> should be after is
> 	1. load the compatibility module;
> 	2. load dump that's been stripped of old tsearch2 stuff;
> 	3. don't have to change anything else.

I agree.  Point 1 is 99% done.

I have a question.  Do I have to use the DirectFunctionCallN interface
for a wrapper function?  If I don't modify any parameter of fcinfo,
I can call the wrapped function directly.  I used

#define ENVELOPE_FCE1(name,dest) \
    Datum name (PG_FUNCTION_ARGS); \
    PG_FUNCTION_INFO_V1(name); \
    Datum \
    name (PG_FUNCTION_ARGS) \
    { \
        Datum arg0 = PG_GETARG_DATUM(0); \
        return DirectFunctionCall1(dest, arg0); \
    }

but I would prefer

#define ENVELOPE_FCE(name,dest) \
    Datum name (PG_FUNCTION_ARGS); \
    PG_FUNCTION_INFO_V1(name); \
    Datum \
    name (PG_FUNCTION_ARGS) \
    { \
        return (dest)(fcinfo); \
    }

Is it correct?  Or is it better to simply modify pg_proc?  There are
maybe 10-15 functions where this is necessary.

Pavel
Re: [HACKERS] Proposal: PL/pgSQL EXECUTE INTO USING (for 8.4)
2007/10/17, Brendan Jurd <[EMAIL PROTECTED]>:
> On 10/17/07, Pavel Stehule <[EMAIL PROTECTED]> wrote:
> > New syntax:
> >
> > a) EXECUTE stringexpr
> >      [INTO [STRICT] varlist]
> >      [USING exprlist]
> >
> > b) FOR varlist IN EXECUTE stringexpr USING exprlist LOOP
>
> Just chiming in with a +1.  I would find this feature very useful.
> Substitution of parameters is way more elegant than quoting, and the
> syntax looks nice.

I am doing some simple speed tests, and with USING dynamic queries run a
little bit faster (15%).  A prepared statement accepts params in binary
form, so we don't need to call output functions.

Pavel
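To illustrate the proposal, here is a hedged sketch of how the new syntax might look in practice.  The function and table names are made up for illustration, and this reflects the syntax as proposed in the thread, not necessarily the final implementation:

```sql
-- Hypothetical example of the proposed EXECUTE ... INTO ... USING form.
CREATE FUNCTION count_big_rows(tabname text, min_id integer)
RETURNS bigint AS $$
DECLARE
    result bigint;
BEGIN
    -- $1 in the query string is bound from the USING list,
    -- so no quote_literal() gymnastics are needed for the value;
    -- only the identifier still needs quote_ident().
    EXECUTE 'SELECT count(*) FROM ' || quote_ident(tabname)
            || ' WHERE id >= $1'
       INTO STRICT result
      USING min_id;
    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

Compare this with today's idiom, where min_id would have to be spliced into the string via quote_literal() or explicit casts.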
Re: [HACKERS] tsearch2api project
"Pavel Stehule" <[EMAIL PROTECTED]> writes:
>> Surely this shouldn't be creating its own tsvector datatype?

> I would like to use only pg_catalog.tsvector.  But the dump contains a
> CREATE TYPE statement, and custom functions which prefer it, because
> the path is: public, pg_catalog :(.  I didn't find any other solution
> with full compatibility.

I think we are really going to have to tell people to use the removets2
script on their dumps in any case.  I can't imagine that it'll be a good
idea to have a bogus tsvector type in the system -- it'll mess things
up going forward.  I think probably the sort of solution you should be
after is
	1. load the compatibility module;
	2. load dump that's been stripped of old tsearch2 stuff;
	3. don't have to change anything else.

			regards, tom lane
Re: [HACKERS] tsearch2api project
> Surely this shouldn't be creating its own tsvector datatype?  Having
> both public.tsvector and pg_catalog.tsvector seems like a seriously
> bad idea, if only because of confusion.  ISTM you should only be
> creating new public.foo objects for the functions whose names changed.

I would like to use only pg_catalog.tsvector.  But the dump contains a
CREATE TYPE statement, and custom functions which prefer it, because the
path is: public, pg_catalog :(.  I didn't find any other solution with
full compatibility.

Maybe we can have a second API, which isn't usable for loading a dump,
but can be used for API compatibility.  This API can be cleaner and can
be stored in contrib.  Otherwise there are two points:

* loading or transformation of the dump
* application modification for the new API

> Anyway, the picture that's starting to emerge for me is that we
> should repurpose contrib/tsearch2 as a repository for scripts
> and documentation to help people migrate from previous use of
> tsearch2 to use of the new core facilities; and for people who
> want to try to *not* migrate, but keep using the old API,
> a compatibility module on pgfoundry seems to make sense.

Migration is the one true way.  But I know lots of admins who are
unable to do it :(

Pavel

> We could alternatively put the migration support into a new
> subdirectory named contrib/tsearch_migrate or something like that.
> But I'm thinking tsearch2 is a good place because that's where
> users of the old tsearch2 are likely to look.
[HACKERS] About to remove current contents of contrib/tsearch2
I'm getting annoyed by glimpse hits on the old code when I'm looking for
bits of tsearch code...  Is there any reason to keep the current
contents of contrib/tsearch2 any longer?  It'll all still be obtainable
from CVS, of course, but it doesn't seem like it has any more purpose
in HEAD.

Barring objection I'll remove the current files, add the removets2.pl
script that was batted around yesterday, and fix the Makefile to install
just that script (and later any migration docs or other stuff we add).

			regards, tom lane
Re: [HACKERS] tsearch2api project
Tom Lane wrote:
> Anyway, the picture that's starting to emerge for me is that we
> should repurpose contrib/tsearch2 as a repository for scripts
> and documentation to help people migrate from previous use of
> tsearch2 to use of the new core facilities; and for people who
> want to try to *not* migrate, but keep using the old API,
> a compatibility module on pgfoundry seems to make sense.

+1

> Also, something that's not been addressed at all yet, AFAICS, is
> providing a way to migrate an existing tsearch2 *configuration* (as
> opposed to data or application code).  I'm not sure there can be any
> automatic way to do that, but at the very least we need some
> documentation about what corresponds to what.

I'm afraid the defaults have "just worked" for me, so I never played
with any of the config stuff (old or new)...  I don't know if I'll be
of any help here.

-Andy
Re: [HACKERS] tsearch2api project
"Pavel Stehule" <[EMAIL PROTECTED]> writes:
> I created a new project on pgfoundry.  It's a wrapper of the integrated
> full text search and it's binary compatible with the TSearch2 API.
> * it works (I am able to load an 8.2 dump without changes)
> * it is ugly :( .  I expected that this wrapper could be more elegant,
>   but no.  I had to create a full dual interface to full text search.

Surely this shouldn't be creating its own tsvector datatype?  Having
both public.tsvector and pg_catalog.tsvector seems like a seriously
bad idea, if only because of confusion.  ISTM you should only be
creating new public.foo objects for the functions whose names changed.

Anyway, the picture that's starting to emerge for me is that we
should repurpose contrib/tsearch2 as a repository for scripts
and documentation to help people migrate from previous use of
tsearch2 to use of the new core facilities; and for people who
want to try to *not* migrate, but keep using the old API,
a compatibility module on pgfoundry seems to make sense.

We could alternatively put the migration support into a new
subdirectory named contrib/tsearch_migrate or something like that.
But I'm thinking tsearch2 is a good place because that's where
users of the old tsearch2 are likely to look.

Also, something that's not been addressed at all yet, AFAICS, is
providing a way to migrate an existing tsearch2 *configuration* (as
opposed to data or application code).  I'm not sure there can be any
automatic way to do that, but at the very least we need some
documentation about what corresponds to what.

Comments?

			regards, tom lane
[HACKERS] Deprecating heap_formtuple/heap_modifytuple/heap_deformtuple
I think we should go ahead and kill the old 'n'/' ' API for heaptuple.c.
The code duplication here is really annoying, and it makes it confusing
for developers trying to read or write code where they have to keep
straight which interface they're using.

What I think we should do is just announce they're deprecated for 8.3
without changing anything, and then early in 8.4 remove them and convert
our own code to use the new API.  We could add wrappers prior to the 8.4
release which convert the isnull/replaces arrays for the benefit of
outside modules.

As an exercise I just went ahead and removed all of our calls to it, and
while it was quite annoying there weren't really any show-stoppers.  The
worst thing I find is that SPI uses a similar interface, which means
it'll be inconsistent with the underlying interface -- but there's no
direct binding between the two so it doesn't cause any actual breakage,
just potential confusion.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use walwhenWALarchivingis enabled
Simon Riggs wrote:
> On Wed, 2007-10-17 at 18:13 +0100, Heikki Linnakangas wrote:
> > > The test script you showed cheats six-ways-from-Sunday to cause an
> > > OID collision that would never happen in practice.  The only case
> > > where it would really happen is if a table that has existed for a
> > > long time (~ 2^32 OID creations) gets dropped and then you're
> > > unlucky enough to recycle that exact OID before the next checkpoint
> > > --- and then crash before the checkpoint.
> >
> > Yeah, it's unlikely to happen, but the consequences are horrible.
>
> When is this going to happen?
>
> We'd need to insert 2^32 toast chunks, which is >4 TB of data, or
> insert 2^32 large objects, or create 2^32 tables, or any combination
> of the above, all within one checkpoint duration *and* exactly hit the
> same relation.

The consumption of the OIDs doesn't need to happen within one checkpoint
duration.  As long as the DROP and the reuse happen in the same
checkpoint cycle, you're screwed.

Granted, you're not likely to ever experience OID wrap-around unless you
have a heavily used user table with OIDs.  Or create/drop temp tables a
lot.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal whenWALarchivingis enabled
On Wed, 2007-10-17 at 18:13 +0100, Heikki Linnakangas wrote:
> > The test script you showed cheats six-ways-from-Sunday to cause an
> > OID collision that would never happen in practice.  The only case
> > where it would really happen is if a table that has existed for a
> > long time (~ 2^32 OID creations) gets dropped and then you're unlucky
> > enough to recycle that exact OID before the next checkpoint --- and
> > then crash before the checkpoint.
>
> Yeah, it's unlikely to happen, but the consequences are horrible.

When is this going to happen?

We'd need to insert 2^32 toast chunks, which is >4 TB of data, or insert
2^32 large objects, or create 2^32 tables, or any combination of the
above, all within one checkpoint duration *and* exactly hit the same
relation.

That's a weird and huge application, a very fast server, and an unlucky
DBA who hits the exact OID to be reused and then has the server crash
before we'd ever notice.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] Why copy_relation_data only use walwhenWALarchivingis enabled
Simon Riggs wrote: > On Wed, 2007-10-17 at 17:36 +0100, Heikki Linnakangas wrote: >> Simon Riggs wrote: >>> On Wed, 2007-10-17 at 15:02 +0100, Heikki Linnakangas wrote: Simon Riggs wrote: > If you've got a better problem statement it would be good to get that > right first before we discuss solutions. Reusing a relfilenode of a deleted relation, before next checkpoint following the commit of the deleting transaction, for an operation that doesn't WAL log the contents of the new relation, leads to data loss on recovery. >>> OK, thanks. >>> >>> I wasn't aware we reused refilenode ids. The code in GetNewOid() doesn't >>> look deterministic to me, or at least isn't meant to be. >>> GetNewObjectId() should be cycling around, so although the oid index >>> scan using SnapshotDirty won't see committed deleted rows that shouldn't >>> matter for 2^32 oids. So what gives? >> I don't think you still quite understand what's happening. > > Clearly. It's not a problem to admit that. > >> GetNewOid() >> is not interesting here, look at GetNewRelFileNode() instead. And >> neither are snapshots or MVCC visibility rules. > > Which calls GetNewOid() in all cases, AFAICS. > > How does the reuse you say is happening come about? Seems like the bug > is in the reuse, not in how we cope with potential reuse. After a table is dropped, the dropping transaction has been committed, and the relation file has been deleted, there's nothing preventing the reuse. There's no trace of that relfilenode in the system (except in the WAL, which we never look into except on WAL replay). There's a dead row in pg_class with that relfilenode, but even that could be vacuumed away (not that it matters because we don't examine that). Now the problem is that there's a record in the WAL to delete a relation file with that relfilenode. If that relfilenode was reused, we delete the contents of the new relation file when we replay that WAL record. 
--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
On Wed, 2007-10-17 at 17:36 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> On Wed, 2007-10-17 at 15:02 +0100, Heikki Linnakangas wrote:
>>> Simon Riggs wrote:
>>>> If you've got a better problem statement it would be good to get that
>>>> right first before we discuss solutions.
>>> Reusing a relfilenode of a deleted relation, before next checkpoint
>>> following the commit of the deleting transaction, for an operation that
>>> doesn't WAL log the contents of the new relation, leads to data loss on
>>> recovery.
>>
>> OK, thanks.
>>
>> I wasn't aware we reused relfilenode ids. The code in GetNewOid() doesn't
>> look deterministic to me, or at least isn't meant to be.
>> GetNewObjectId() should be cycling around, so although the oid index
>> scan using SnapshotDirty won't see committed deleted rows that shouldn't
>> matter for 2^32 oids. So what gives?
>
> I don't think you still quite understand what's happening.

Clearly. It's not a problem to admit that.

> GetNewOid() is not interesting here, look at GetNewRelFileNode()
> instead. And neither are snapshots or MVCC visibility rules.

Which calls GetNewOid() in all cases, AFAICS.

How does the reuse you say is happening come about? Seems like the bug
is in the reuse, not in how we cope with potential reuse.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Tom Lane wrote:
> Simon has a legitimate objection; not that there's no bug, but that the
> probability of getting bitten is exceedingly small.

Oh, if that's what he meant, he's right.

> The test script you showed cheats six-ways-from-Sunday to cause an OID
> collision that would never happen in practice. The only case where it
> would really happen is if a table that has existed for a long time
> (~ 2^32 OID creations) gets dropped and then you're unlucky enough to
> recycle that exact OID before the next checkpoint --- and then crash
> before the checkpoint.

Yeah, it's unlikely to happen, but the consequences are horrible. Note
that it's not just DROP TABLE that's a problem, but anything that uses
smgrscheduleunlink, including CLUSTER and REINDEX.

> I tend to agree that truncating the file, and extending the fsync
> request mechanism to actually delete it after the next checkpoint,
> is the most reasonable route to a fix.

Ok, I'll write a patch to do that.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> I don't think you still quite understand what's happening. GetNewOid()
> is not interesting here, look at GetNewRelFileNode() instead. And
> neither are snapshots or MVCC visibility rules.

Simon has a legitimate objection; not that there's no bug, but that the
probability of getting bitten is exceedingly small. The test script you
showed cheats six-ways-from-Sunday to cause an OID collision that would
never happen in practice. The only case where it would really happen
is if a table that has existed for a long time (~ 2^32 OID creations)
gets dropped and then you're unlucky enough to recycle that exact OID
before the next checkpoint --- and then crash before the checkpoint.

I think we should think about ways to fix this, but I don't feel a need
to try to backpatch a solution.

I tend to agree that truncating the file, and extending the fsync
request mechanism to actually delete it after the next checkpoint,
is the most reasonable route to a fix. I think the objection about
leaking files on crash is wrong. We'd have the replay of the deletion
to fix things up --- it could probably delete the file immediately, and
if not could certainly put it back on the fsync request queue.

			regards, tom lane
Re: [HACKERS] rolcanlogin vs. the flat password file
Magnus Hagander wrote:
> On Wed, Oct 17, 2007 at 05:09:25PM +0100, Dave Page wrote:
>> Stephen Frost wrote:
>>> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>>>> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>>>>>> There's legitimate use for creating a role with NOLOGIN and a password.
>>>>> If we think that, then we shouldn't have a message at all.
>>> I'm not sure I agree with that. I don't agree that there's really a
>>> legitimate use for creating a role w/ NOLOGIN and a password either,
>>> for that matter.
>> Preparing a new user account prior to an employee starting? In my last
>> post we would do that regularly - set up all the accounts etc for the
>> new user, but disable them all until the start date.
>
> Yeah, but did you actually set a password for them?

Yeah, then have them change them all during day 1 IT induction training.
We had a much smaller team than I know you do, and the staff that would
do the account setup would often be busy first thing on Monday morning
when new starters might often arrive - so we would just 'flip the
switch' on the pre-configured accounts.

/D
Re: [HACKERS] rolcanlogin vs. the flat password file
On Wed, Oct 17, 2007 at 11:27:10AM -0400, Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> There's legitimate use for creating a role with NOLOGIN and a password.
>
> If we think that, then we shouldn't have a message at all.

At least if we think it's more than a very narrow legitimate use,
compared to the number of ppl making the mistake.

I agree with making it a NOTICE instead of WARNING though.

//Magnus
Re: [HACKERS] rolcanlogin vs. the flat password file
On Wed, Oct 17, 2007 at 05:09:25PM +0100, Dave Page wrote:
> Stephen Frost wrote:
>> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>>> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>>>>> There's legitimate use for creating a role with NOLOGIN and a password.
>>>> If we think that, then we shouldn't have a message at all.
>> I'm not sure I agree with that. I don't agree that there's really a
>> legitimate use for creating a role w/ NOLOGIN and a password either,
>> for that matter.
>
> Preparing a new user account prior to an employee starting? In my last
> post we would do that regularly - set up all the accounts etc for the
> new user, but disable them all until the start date.

Yeah, but did you actually set a password for them? We do that all the
time here, but we don't set the passwords until they show up.

//Magnus
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Simon Riggs wrote:
> On Wed, 2007-10-17 at 15:02 +0100, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> If you've got a better problem statement it would be good to get that
>>> right first before we discuss solutions.
>> Reusing a relfilenode of a deleted relation, before next checkpoint
>> following the commit of the deleting transaction, for an operation that
>> doesn't WAL log the contents of the new relation, leads to data loss on
>> recovery.
>
> OK, thanks.
>
> I wasn't aware we reused relfilenode ids. The code in GetNewOid() doesn't
> look deterministic to me, or at least isn't meant to be.
> GetNewObjectId() should be cycling around, so although the oid index
> scan using SnapshotDirty won't see committed deleted rows that shouldn't
> matter for 2^32 oids. So what gives?

I don't think you still quite understand what's happening. GetNewOid()
is not interesting here, look at GetNewRelFileNode() instead. And
neither are snapshots or MVCC visibility rules.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
On Wed, 2007-10-17 at 15:02 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> If you've got a better problem statement it would be good to get that
>> right first before we discuss solutions.
>
> Reusing a relfilenode of a deleted relation, before next checkpoint
> following the commit of the deleting transaction, for an operation that
> doesn't WAL log the contents of the new relation, leads to data loss on
> recovery.

OK, thanks.

I wasn't aware we reused relfilenode ids. The code in GetNewOid() doesn't
look deterministic to me, or at least isn't meant to be. GetNewObjectId()
should be cycling around, so although the oid index scan using
SnapshotDirty won't see committed deleted rows that shouldn't matter for
2^32 oids. So what gives?

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] rolcanlogin vs. the flat password file
Stephen Frost wrote:
> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>>>> There's legitimate use for creating a role with NOLOGIN and a password.
>>> If we think that, then we shouldn't have a message at all.
>
> I'm not sure I agree with that. I don't agree that there's really a
> legitimate use for creating a role w/ NOLOGIN and a password either,
> for that matter.

Preparing a new user account prior to an employee starting? In my last
post we would do that regularly - set up all the accounts etc for the
new user, but disable them all until the start date.

/D
Re: [HACKERS] rolcanlogin vs. the flat password file
* Tom Lane ([EMAIL PROTECTED]) wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
>> There's legitimate use for creating a role with NOLOGIN and a password.
>
> If we think that, then we shouldn't have a message at all.

I'm not sure I agree with that. I don't agree that there's really a
legitimate use for creating a role w/ NOLOGIN and a password either, for
that matter.

A 'NOTICE' level message would be fine with me. We have NOTICE messages
for when we create an index for a PK. I find a message about an entirely
unexpected and unworkable configuration a lot more useful than those.

	Thanks,

		Stephen
Re: [HACKERS] rolcanlogin vs. the flat password file
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> There's legitimate use for creating a role with NOLOGIN and a password.

If we think that, then we shouldn't have a message at all.

			regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Re-allow UTF8 encodings on win32.
"Pavel Stehule" <[EMAIL PROTECTED]> writes:
> could not determine encoding for locale "Czech_Czech Republic.1250":
> codeset is "CP1250"

Hm, we seem to have missed an entry for PG_WIN1250. Fixed.

			regards, tom lane
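The failure mode Tom describes amounts to a missing row in the codeset-to-encoding match table that initdb consults when resolving a locale. A hypothetical miniature of that mechanism is sketched below; the enum values, table contents, and function name here are invented for illustration and are not the actual PostgreSQL chklocale code:

```c
#include <strings.h>

/* Toy version of a codeset -> PG encoding match table.  The real table is
 * much larger; the point is that a locale whose codeset has no row (as
 * "CP1250" didn't) cannot be resolved, producing the error above. */
typedef enum
{
	PG_UTF8,
	PG_WIN1250,
	PG_WIN1252
} pg_enc;

struct encoding_match
{
	pg_enc		pg_enc_code;
	const char *system_enc_name;
};

static const struct encoding_match encoding_match_list[] = {
	{PG_UTF8, "UTF-8"},
	{PG_UTF8, "utf8"},
	{PG_WIN1250, "CP1250"},		/* the kind of entry that was missing */
	{PG_WIN1252, "CP1252"},
};

/* Return the matching encoding, or -1 if the codeset is unknown
 * (the "could not determine encoding for locale" case). */
static int
find_matching_encoding(const char *codeset)
{
	size_t		n = sizeof(encoding_match_list) / sizeof(encoding_match_list[0]);

	for (size_t i = 0; i < n; i++)
		if (strcasecmp(codeset, encoding_match_list[i].system_enc_name) == 0)
			return (int) encoding_match_list[i].pg_enc_code;
	return -1;
}
```

With the "CP1250" row present, the lookup succeeds regardless of case, which is why a one-row fix suffices for the report above.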
Re: [HACKERS] rolcanlogin vs. the flat password file
Magnus Hagander wrote:
> On Sun, Oct 14, 2007 at 06:16:04PM -0400, Stephen Frost wrote:
>> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>>> Stephen Frost <[EMAIL PROTECTED]> writes:
>>>>> I wonder if the OP was unhappy because he created a role w/ a pw and
>>>>> then couldn't figure out why the user couldn't log in?
>>>> Hm, maybe. In that case just not filtering the entry out of the flat
>>>> file would be good enough.
>>> I've confirmed the confusing behavior in CVS HEAD. With password auth
>>> selected in pg_hba.conf:
>> [...]
>>> Should we just do this, or is it worth working harder?
>> I certainly like this. Honestly, I'd also like the warning when doing a
>> 'create role'/'alter role' that sets/changes the pw on an account that
>> doesn't have 'rolcanlogin'. Much better to have me notice that I goof'd
>> the command and fix it before telling the user 'go ahead and log in'
>> than to have the user complain that it's not working. :)
>>
>> Just my 2c.
>
> I think that's a good idea. Attached is a patch that implements this (I
> think - haven't messed around in that area of the code before). Thoughts?

Is WARNING an appropriate level for this? I think NOTICE is enough, it's
not like something bad is going to happen if you do that, it just means
that you've likely screwed up. There's legitimate use for creating a role
with NOLOGIN and a password. Maybe you're going to give login privilege
later on. It wouldn't be nice to get WARNINGs in that case, even NOTICEs
would be slightly annoying.

Note that per-role guc variables will also have no effect on a role with
no login privilege. How about connection limit, is that inherited?

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] random dataset generator for SKYLINE operator
Hannes Eder wrote:
> We wrote a little contrib module, which we'd like to share. It can be
> used to generate random datasets as they have been used in
> [Borzsonyi2001] and related work.
[snip]

The module was moved to:

http://randdataset.projects.postgresql.org/

resp.

http://randdataset.projects.postgres.org/

Have fun using it,
-Hannes
Re: [HACKERS] rolcanlogin vs. the flat password file
* Magnus Hagander ([EMAIL PROTECTED]) wrote:
> I think that's a good idea. Attached is a patch that implements this (I
> think - haven't messed around in that area of the code before). Thoughts?

Cool, thanks! My only comment is that you should probably stick to one
'zero' convention - either '!canlogin' or 'canlogin == 0'. I prefer the
former, but the inconsistency in a single patch is kind of odd. I'm not
sure if there's an overall PG preference.

	Thanks,

		Stephen
Re: [HACKERS] [COMMITTERS] pgsql: Re-allow UTF8 encodings on win32.
I did some test, but without success,

Pavel

I have win2003 Server .. with czech locales support.

I:\PGSQL\BIN>initdb -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250":
codeset is "CP1250"
initdb: could not find suitable encoding for locale Czech_Czech Republic.1250
Rerun initdb with the -E option.
Try "initdb --help" for more information.

I:\PGSQL\BIN>initdb -E UTF-8 -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250":
codeset is "CP1250"
initdb: could not find suitable text search configuration for locale
Czech_Czech Republic.1250
The default text search configuration will be set to "simple".

fixing permissions on existing directory ../data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers/max_fsm_pages ... 400kB/2
creating configuration files ... ok
creating template1 database in ../data/base/1 ... FATAL:  could not select
a suitable default timezone
DETAIL:  It appears that your GMT time zone uses leap seconds. PostgreSQL
does not support leap seconds.
child process exited with exit code 1
initdb: removing contents of data directory "../data"

I:\PGSQL\BIN>initdb -E win1250 --locale="Czech_Czech Republic.1250" -D ../data -L i:\pgsql\share
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale Czech_Czech Republic.1250.
could not determine encoding for locale "Czech_Czech Republic.1250":
codeset is "CP1250"
initdb: could not find suitable text search configuration for locale
Czech_Czech Republic.1250
The default text search configuration will be set to "simple".

fixing permissions on existing directory ../data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers/max_fsm_pages ... 400kB/2
creating configuration files ... ok
creating template1 database in ../data/base/1 ... FATAL:  could not select
a suitable default timezone
DETAIL:  It appears that your GMT time zone uses leap seconds. PostgreSQL
does not support leap seconds.
child process exited with exit code 1
initdb: removing contents of data directory "../data"
Re: [HACKERS] rolcanlogin vs. the flat password file
On Sun, Oct 14, 2007 at 06:16:04PM -0400, Stephen Frost wrote:
> * Tom Lane ([EMAIL PROTECTED]) wrote:
>>> Stephen Frost <[EMAIL PROTECTED]> writes:
>>>> I wonder if the OP was unhappy because he created a role w/ a pw and
>>>> then couldn't figure out why the user couldn't log in?
>>> Hm, maybe. In that case just not filtering the entry out of the flat
>>> file would be good enough.
>> I've confirmed the confusing behavior in CVS HEAD. With password auth
>> selected in pg_hba.conf:
> [...]
>> Should we just do this, or is it worth working harder?
>
> I certainly like this. Honestly, I'd also like the warning when doing a
> 'create role'/'alter role' that sets/changes the pw on an account that
> doesn't have 'rolcanlogin'. Much better to have me notice that I goof'd
> the command and fix it before telling the user 'go ahead and log in'
> than to have the user complain that it's not working. :)
>
> Just my 2c.

I think that's a good idea. Attached is a patch that implements this (I
think - haven't messed around in that area of the code before). Thoughts?
//Magnus

Index: src/backend/commands/user.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/commands/user.c,v
retrieving revision 1.177
diff -c -r1.177 user.c
*** src/backend/commands/user.c	3 Sep 2007 18:46:30 -	1.177
--- src/backend/commands/user.c	17 Oct 2007 14:45:33 -
***************
*** 250,255 ****
--- 250,260 ----
  	if (dvalidUntil)
  		validUntil = strVal(dvalidUntil->arg);

+ 	/* Warn about combination that's likely incorrect */
+ 	if (password && !canlogin)
+ 		ereport(WARNING,
+ 				(errmsg("created user with password that cannot log in")));
+
  	/* Check some permissions first */
  	if (issuper)
  	{
***************
*** 649,654 ****
--- 654,664 ----
  				DirectFunctionCall1(textin, CStringGetDatum(encrypted_password));
  		}
  		new_record_repl[Anum_pg_authid_rolpassword - 1] = 'r';
+ 		/* Warn about combination that's likely incorrect */
+ 		if (canlogin == 0 ||
+ 			(((Form_pg_authid) GETSTRUCT(tuple))->rolcanlogin == 0 && canlogin != 1))
+ 			ereport(WARNING,
+ 					(errmsg("set password for a user that cannot log in")));
  	}
  	/* unset password */
Re: [HACKERS] DBLink's default user: postgres
Rodrigo Hjort wrote:
> Is this the regular behavior on DBLink?
>
> rot=> SELECT user, current_database();
>  current_user | current_database
> --------------+------------------
>  sa_rot       | rot
> (1 registro)
>
> rot=> SELECT *
> rot->   FROM dblink('dbname=escola',
> rot(>     'SELECT user, current_database()')
> rot->   AS (usr name, db name);
>    usr    |   db
> ----------+--------
>  postgres | escola
> (1 registro)
>
> This way, I fear DBLink functions should become a vulnerability issue
> on my database. Is there any way to protect or override this setting?
> Or it should be done on pg_hba.conf only?

This issue has been thoroughly discussed before. You can read more about
it in f.ex these threads:

http://archives.postgresql.org/pgsql-hackers/2007-06/msg00678.php
http://archives.postgresql.org/pgsql-patches/2007-07/msg0.php

--
Tommy Gildseth
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Florian G. Pflug wrote:
> Heikki Linnakangas wrote:
>> I wrote:
>>> Unfortunately I don't see any easy way to fix it. One approach would be
>>> to avoid reusing the relfilenodes until next checkpoint, but I don't
>>> see any nice place to keep track of OIDs that have been dropped since
>>> last checkpoint.
>>
>> Ok, here's one idea:
>>
>> Instead of deleting the file immediately on commit of DROP TABLE, the
>> file is truncated to release the space, but not unlink()ed, to avoid
>> reusing that relfilenode. The truncated file can be deleted after next
>> checkpoint.
>>
>> Now, how does checkpoint know what to delete? We can use the fsync
>> request mechanism for that. When a file is truncated, a new kind of
>> fsync request, a "deletion request", is sent to the bgwriter, which
>> collects all such requests to a list. Before checkpoint calculates new
>> RedoRecPtr, the list is swapped with an empty one, and after writing the
>> new checkpoint record, all the files that were in the list are deleted.
>>
>> We would leak empty files on crashes, but we leak files on crashes
>> anyway, so that shouldn't be an issue. This scheme wouldn't require
>> catalog changes, so it would be suitable for backpatching.
>>
>> Any better ideas?
>
> Couldn't we fix this by forcing a checkpoint before we commit the
> transaction that created the new pg_class entry for the clustered table?
> Or rather, more generally, before committing a transaction that created
> a new non-temporary relfilenode but didn't WAL-log any subsequent inserts.

Yes, that would work. As a small optimization, you could set a flag in
shared mem whenever you delete a rel file, and skip the checkpoint when
that flag isn't set.

> That's of course a rather sledgehammer-like approach to this problem -
> but at least for the backbranches the fix would be less intrusive...

Too much of a sledgehammer IMHO. BTW, CREATE INDEX is also vulnerable.
And in 8.3, COPY to a table created/truncated in the same transaction.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Simon Riggs wrote:
> If you've got a better problem statement it would be good to get that
> right first before we discuss solutions.

Reusing a relfilenode of a deleted relation, before next checkpoint
following the commit of the deleting transaction, for an operation that
doesn't WAL log the contents of the new relation, leads to data loss on
recovery.

Or

Performing non-WAL logged operations on a relation file leads to a
truncated file on recovery, if the relfilenode of that file used to
belong to a relation that was dropped after the last checkpoint.

Happy?

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Simon Riggs wrote:
> On Wed, 2007-10-17 at 12:11 +0100, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
>>>> Second, suppose that no checkpoint has occurred during the upper
>>>> series--although not quite possible;
>>> That part is irrelevant. It's forced out to disk and doesn't need
>>> recovery, with or without the checkpoint.
>>>
>>> There's no hole that I can see.
>> No, Jacky is right. The same problem exists at least with CLUSTER, and
>> I think there's other commands that rely on immediate fsync as well.
>>
>> Attached is a shell script that demonstrates the problem on CVS HEAD
>> with CLUSTER. It creates two tables, T1 and T2, both with one row. Then
>> T1 is dropped, and T2 is CLUSTERed, so that the new T2 relation file
>> happens to get the same relfilenode that T1 had. Then we crash the
>> server, forcing a WAL replay. After that, T2 is empty. Oops.
>>
>> Unfortunately I don't see any easy way to fix it.
> So, what you are saying is that re-using relfilenodes can cause problems
> during recovery in any command that alters the relfilenode of a relation?

For what I understand, I'd say that creating a relfilenode *and*
subsequently inserting data without WAL-logging causes the problem. If
the relfilenode was recently deleted, the inserts might be effectively
undone upon recovery (because we first replay the delete), but later
*not* redone (because we didn't WAL-log the inserts).

That brings me to another idea for a fix that is less heavyweight than
my previous checkpoint-before-commit suggestion. We could make
relfilenodes globally unique if we added the xid and epoch of the
creating transaction to the filename. Those are 64 bits, so if we encode
them in base 36 (using A-Z,0-9), that'd increase the length of the
filenames by 13.

regards, Florian Pflug
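Florian's arithmetic checks out: 36^13 ≈ 1.7e20 exceeds 2^64 ≈ 1.8e19, so thirteen base-36 digits always suffice for a 64-bit (epoch, xid) pair. A minimal sketch of such an encoder follows; the function name and layout are invented for illustration and nothing like this exists in the PostgreSQL source:

```c
#include <stdint.h>

/* Hypothetical helper: render a 64-bit value (e.g. epoch << 32 | xid) as
 * exactly 13 base-36 digits, zero-padded.  Since 36^13 > 2^64, any
 * uint64_t fits -- this is the "+13 characters" filename cost Florian
 * mentions. */
static void
encode_base36(uint64_t v, char out[14])
{
	static const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

	out[13] = '\0';
	for (int i = 12; i >= 0; i--)
	{
		out[i] = digits[v % 36];
		v /= 36;
	}
}
```

A relfilenode suffixed this way could never collide with one created by a different transaction, so replaying an old unlink record could not hit a reused file.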
Re: [HACKERS] CVS and Eclipse
>>> On Wed, Oct 17, 2007 at 1:40 AM, in message <[EMAIL PROTECTED]>,
"Kuriakose, Cinu Cheriyamoozhiyil" <[EMAIL PROTECTED]> wrote:
> I am trying to configure CVS through Eclipse, in fact I was able to do
> that, but when I map the postgreSQL code into CVS through Eclipse, it
> is displaying the folders but the files in those folders are not
> getting displayed.

If you look at the subdirectories in your workspace, are the files there?

-Kevin
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
On Wed, 2007-10-17 at 12:11 +0100, Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
>>> Second, suppose that no checkpoint has occurred during the upper
>>> series--although not quite possible;
>>
>> That part is irrelevant. It's forced out to disk and doesn't need
>> recovery, with or without the checkpoint.
>>
>> There's no hole that I can see.
>
> No, Jacky is right. The same problem exists at least with CLUSTER, and I
> think there's other commands that rely on immediate fsync as well.
>
> Attached is a shell script that demonstrates the problem on CVS HEAD
> with CLUSTER. It creates two tables, T1 and T2, both with one row. Then
> T1 is dropped, and T2 is CLUSTERed, so that the new T2 relation file
> happens to get the same relfilenode that T1 had. Then we crash the
> server, forcing a WAL replay. After that, T2 is empty. Oops.
>
> Unfortunately I don't see any easy way to fix it.

So, what you are saying is that re-using relfilenodes can cause problems
during recovery in any command that alters the relfilenode of a relation?

If you've got a better problem statement it would be good to get that
right first before we discuss solutions.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
[HACKERS] DBLink's default user: postgres
Is this the regular behavior on DBLink?

rot=> SELECT user, current_database();
 current_user | current_database
--------------+------------------
 sa_rot       | rot
(1 registro)

rot=> SELECT *
rot->   FROM dblink('dbname=escola',
rot(>     'SELECT user, current_database()')
rot->   AS (usr name, db name);
   usr    |   db
----------+--------
 postgres | escola
(1 registro)

This way, I fear DBLink functions should become a vulnerability issue on
my database. Is there any way to protect or override this setting? Or it
should be done on pg_hba.conf only?

--
Regards,
Rodrigo Hjort
http://icewall.org/~hjort
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Heikki Linnakangas wrote:
> I wrote:
>> Unfortunately I don't see any easy way to fix it. One approach would be
>> to avoid reusing the relfilenodes until next checkpoint, but I don't
>> see any nice place to keep track of OIDs that have been dropped since
>> last checkpoint.
>
> Ok, here's one idea:
>
> Instead of deleting the file immediately on commit of DROP TABLE, the
> file is truncated to release the space, but not unlink()ed, to avoid
> reusing that relfilenode. The truncated file can be deleted after next
> checkpoint.
>
> Now, how does checkpoint know what to delete? We can use the fsync
> request mechanism for that. When a file is truncated, a new kind of
> fsync request, a "deletion request", is sent to the bgwriter, which
> collects all such requests to a list. Before checkpoint calculates new
> RedoRecPtr, the list is swapped with an empty one, and after writing the
> new checkpoint record, all the files that were in the list are deleted.
>
> We would leak empty files on crashes, but we leak files on crashes
> anyway, so that shouldn't be an issue. This scheme wouldn't require
> catalog changes, so it would be suitable for backpatching.
>
> Any better ideas?

Couldn't we fix this by forcing a checkpoint before we commit the
transaction that created the new pg_class entry for the clustered table?
Or rather, more generally, before committing a transaction that created
a new non-temporary relfilenode but didn't WAL-log any subsequent inserts.

That's of course a rather sledgehammer-like approach to this problem -
but at least for the backbranches the fix would be less intrusive...

regards, Florian Pflug
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
I wrote:
> Unfortunately I don't see any easy way to fix it. One approach would be
> to avoid reusing the relfilenodes until next checkpoint, but I don't see
> any nice place to keep track of OIDs that have been dropped since last
> checkpoint.

Ok, here's one idea:

Instead of deleting the file immediately on commit of DROP TABLE, the
file is truncated to release the space, but not unlink()ed, to avoid
reusing that relfilenode. The truncated file can be deleted after next
checkpoint.

Now, how does checkpoint know what to delete? We can use the fsync
request mechanism for that. When a file is truncated, a new kind of
fsync request, a "deletion request", is sent to the bgwriter, which
collects all such requests to a list. Before checkpoint calculates new
RedoRecPtr, the list is swapped with an empty one, and after writing the
new checkpoint record, all the files that were in the list are deleted.

We would leak empty files on crashes, but we leak files on crashes
anyway, so that shouldn't be an issue. This scheme wouldn't require
catalog changes, so it would be suitable for backpatching.

Any better ideas? Do we care enough about this to fix this? Enough to
backpatch? The probability of this happening is pretty small, but the
consequences are really bad, so my vote is "yes" and "yes".

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
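The bookkeeping described above is small. A toy model of the proposed deferred-unlink list can sketch the ordering that matters (this is illustrative only, not the actual bgwriter fsync-request code; all names are invented):

```c
#define MAX_PENDING_UNLINKS 64

/* Toy model of the proposed "deletion request" queue: DROP TABLE
 * truncates the file and queues the relfilenode here instead of
 * unlink()ing it, so the relfilenode stays occupied on disk. */
typedef struct
{
	unsigned	relfilenodes[MAX_PENDING_UNLINKS];
	int			count;
} DeletionList;

static DeletionList pending_unlinks;

/* At commit of DROP TABLE: truncate the file (not shown), then remember
 * the relfilenode so it cannot be reused before the next checkpoint. */
static void
schedule_unlink(unsigned relfilenode)
{
	if (pending_unlinks.count < MAX_PENDING_UNLINKS)
		pending_unlinks.relfilenodes[pending_unlinks.count++] = relfilenode;
}

/* At checkpoint: swap the list with an empty one *before* computing the
 * new RedoRecPtr, write the checkpoint record, and only then unlink the
 * captured files -- replay from the new RedoRecPtr can no longer touch
 * them.  Returns how many files would be unlinked. */
static int
checkpoint_and_unlink(void)
{
	DeletionList doomed = pending_unlinks;	/* swap out the list */

	pending_unlinks.count = 0;

	/* ... compute RedoRecPtr and write the checkpoint record here ... */

	/* now unlink doomed.relfilenodes[0 .. doomed.count - 1] */
	return doomed.count;
}
```

The crucial design point is the swap happening before the RedoRecPtr is fixed: any truncation requested after the swap lands in the fresh list and waits for the *following* checkpoint, so no replay can ever see a WAL deletion record for a relfilenode that has already been handed out again.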
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Forgot to attach the script I promised.. You need to set $PGDATA before
running the script. And psql, pg_ctl and pg_resetxlog need to be in
$PATH. After running the script, restart postmaster and run "SELECT *
FROM t2". There should be one row in the table, but it's empty.

Heikki Linnakangas wrote:
> Simon Riggs wrote:
>> On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
>>> Second, suppose that no checkpoint has occured during the upper
>>> series--authough not quite possible;
>> That part is irrelevant. It's forced out to disk and doesn't need
>> recovery, with or without the checkpoint.
>>
>> There's no hole that I can see.
>
> No, Jacky is right. The same problem exists at least with CLUSTER, and I
> think there's other commands that rely on immediate fsync as well.
>
> Attached is a shell script that demonstrates the problem on CVS HEAD
> with CLUSTER. It creates two tables, T1 and T2, both with one row. Then
> T1 is dropped, and T2 is CLUSTERed, so that the new T2 relation file
> happens to get the same relfilenode that T1 had. Then we crash the
> server, forcing a WAL replay. After that, T2 is empty. Oops.
>
> Unfortunately I don't see any easy way to fix it. One approach would be
> to avoid reusing the relfilenodes until next checkpoint, but I don't see
> any nice place to keep track of OIDs that have been dropped since last
> checkpoint.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

cluster-relfilenode-clash.sh.gz
Description: GNU Zip compressed data
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
> On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
>> Second, suppose that no checkpoint has occurred during the series
>> above--although not very likely;
>
> That part is irrelevant. It's forced out to disk and doesn't need
> recovery, with or without the checkpoint.
>
> There's no hole that I can see.

Yes, it's really forced out. But if there's no checkpoint, the recovery process will begin from a point in time before T1 was created, and since T1 was dropped, replay will remove T2's file!

> -- Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
Simon Riggs wrote:
> On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
>> Second, suppose that no checkpoint has occurred during the series
>> above--although not very likely;
>
> That part is irrelevant. It's forced out to disk and doesn't need
> recovery, with or without the checkpoint.
>
> There's no hole that I can see.

No, Jacky is right. The same problem exists at least with CLUSTER, and I think there are other commands that rely on immediate fsync as well.

Attached is a shell script that demonstrates the problem on CVS HEAD with CLUSTER. It creates two tables, T1 and T2, both with one row. Then T1 is dropped, and T2 is CLUSTERed, so that the new T2 relation file happens to get the same relfilenode that T1 had. Then we crash the server, forcing a WAL replay. After that, T2 is empty. Oops.

Unfortunately I don't see any easy way to fix it. One approach would be to avoid reusing the relfilenodes until the next checkpoint, but I don't see any nice place to keep track of OIDs that have been dropped since the last checkpoint.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
On Wed, 2007-10-17 at 17:18 +0800, Jacky Leng wrote:
> Second, suppose that no checkpoint has occurred during the series
> above--although not very likely;

That part is irrelevant. It's forced out to disk and doesn't need recovery, with or without the checkpoint.

There's no hole that I can see.

-- Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
[HACKERS] CVS Commands
Hi All,

The ls and rls commands are used to list files and directories in the repository, but we have to be logged in to the repository to get such a listing. When I use the command "cvs login" to connect, instead of connecting to the local repository it connects to the PostgreSQL repository. When I instead specify the command

cvs -d :pserver:[EMAIL PROTECTED]:/u01/Installs/cvsrepository/cvsroot login

where /u01/Installs/cvsrepository/cvsroot is the repository path, root is the username and inblr-kuriakcc.eu.uis.unisys.com is the hostname, it asks for a password, and I am not sure what the password is. Can anyone please tell me how to change the CVS password, so that I can change it and get connected to the local repository?

Thanks in Advance
Cinu

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
Re: [HACKERS] Why copy_relation_data only use wal when WAL archiving is enabled
> Jacky Leng wrote:
>> If I run the database under non-archiving mode, and execute the
>> following command:
>> alter table t set tablespace tblspc1;
>> Isn't it possible that the "new t" can't be recovered?
>
> No. At the end of copy_relation_data we call smgrimmedsync, which fsyncs
> the new relation file.

Usually that's true, but how about this situation?

* First, do the following series of steps:
  * Create two tablespaces, SPC1 and SPC2;
  * Create table T1 in SPC1 and insert some values into it; suppose T1's
    oid/relfilenode is OID1;
  * Drop table T1; -- OID1 is released in pg_class and can be reused.
  * Do anything that will make the next oid allocated from pg_class be
    OID1, e.g. insert many, many tuples into a relation with OIDs;
  * Create table T2 in SPC2 and insert some values into it; its
    oid/relfilenode is OID1;
  * Alter table T2 set tablespace SPC1; -- T2 goes to SPC1 and uses the
    same file name the old T1 had.
* Second, suppose that no checkpoint has occurred during the series
  above -- although that's not very likely;
* Kill the database abnormally;
* Restart the database.

Let's analyze what will happen during the recovery process:

* When T1 is re-created, replay finds that its file is already
  there -- but actually this file is T2's;
* "T1"'s file (actually T2's) is re-dropped;
* T2 is re-created, and replay finds that its file has disappeared, so it
  re-creates one;
* As copy_relation_data didn't write any xlog for T2's ALTER TABLE SET
  TABLESPACE operation, after recovery we'll find that T2 is empty!

> -- Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
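The replay sequence above can be modeled with a toy simulation (Python, hypothetical names, not real recovery code). Files are keyed by (tablespace, relfilenode); the physical data copy done by ALTER TABLE ... SET TABLESPACE was fsync'd but never WAL-logged, so nothing in the WAL can reproduce it:

```python
# Toy model of the replay hazard. The names and the WAL record shapes are
# illustrative only; the point is that replaying the DROP of the old T1
# unlinks the file that now belongs to T2.

OID1 = 1234
# On-disk state at crash: T2's data sits in SPC1 under T1's old relfilenode.
fs = {("SPC1", OID1): b"t2 data copied by SET TABLESPACE"}

wal = [
    ("create", "SPC1", OID1),  # CREATE TABLE T1
    ("drop",   "SPC1", OID1),  # DROP TABLE T1
    ("create", "SPC2", OID1),  # CREATE TABLE T2 (oid/relfilenode reused)
    # ALTER TABLE T2 SET TABLESPACE SPC1: no WAL record for the data copy!
]

for op, spc, oid in wal:
    if op == "create":
        fs.setdefault((spc, oid), b"")  # create the file if missing
    elif op == "drop":
        fs.pop((spc, oid), None)        # removes T2's file, not T1's!

assert ("SPC1", OID1) not in fs  # T2's data is gone after the DROP replay

# Replaying T2's relocation to SPC1 can only recreate an empty file,
# since the tuple data was never logged:
fs[("SPC1", OID1)] = b""
assert fs[("SPC1", OID1)] == b""  # T2 is now empty
```

The same model covers the CLUSTER variant elsewhere in the thread: any operation that assigns a recycled relfilenode and skips WAL for the data copy is exposed to the replayed DROP.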
Re: [HACKERS] [COMMITTERS] pgsql: Re-allow UTF8 encodings on win32.
Hiroshi Saito wrote:
> Hi.
>
> From: "Magnus Hagander" <[EMAIL PROTECTED]>
>
>>> But, Please see.
>>> http://winpg.jp/~saito/pg83/pg83b1-err3.txt
>>> Japanese_Japan.65001 is error...
>>> Japanese_Japan is true.
>>
>> Yes, that is expected. If you explicitly ask for the .65001 locale it
>> will try the one that doesn't have the proper NLS files, and that
>> shouldn't work. If you just put in Japanese_Japan, it will use the
>> UTF16 locale.
>
> Umm, As for result ... initdb -E UTF8 --locale=Japanese_Japan -D../data
> http://winpg.jp/~saito/pg83/pg83b1-err4.txt
> It seems that it is only complemented.

Yes, that is expected, though not entirely to my tastes. The cluster should still actually be in utf-8, however.

/D

---(end of broadcast)---
TIP 6: explain analyze is your friend
[HACKERS] tsearch2api project
Hello

I created a new project on pgFoundry. It's a wrapper around the integrated full-text search, binary compatible with the TSearch2 API.

* It works (I am able to load an 8.2 dump without changes).
* It is ugly :(. I expected this wrapper could be more elegant, but no: I had to create a full dual interface to the full-text machinery.

Pavel

I'd appreciate comments.