Re: [HACKERS] Insufficient attention to security in contrib (mostly)
On Tuesday 28 August 2007 15:38, Tom Lane wrote: Therefore, I propose the same choices as before for table-size (no restriction) and database-size (must have CONNECT priv), and this for tablespace-size: you must have the ability to create tables in the target tablespace. This could be either an explicit CREATE grant, or implicit because you are in a DB that has it as the default tablespace. +1 -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] reviving dead buildfarm animals
Tom, I notice that five different buildfarm members are about to slide off the HEAD list for not having reported in within a month. Do we have any process for pestering their owners to revive them? If the hardware went south, or there was some other deliberate decision to retire them, that's fine --- I'm just wondering if the owners don't realize they're busted. Actually, most of these seem to belong to Stefan. Stefan? -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] reviving dead buildfarm animals
Tom Lane wrote: I notice that five different buildfarm members are about to slide off the HEAD list for not having reported in within a month. Do we have any process for pestering their owners to revive them? If the hardware went south, or there was some other deliberate decision to retire them, that's fine --- I'm just wondering if the owners don't realize they're busted. owners ae usually aware since the buildfarm has a feature to notify owners about animals that have not reported in a while. As in the case of zebra/impala/shad and clownfish - I'm fully aware that they are offline and I intend to bring them back soon but some of these animals(which are in fact all on the same VMware host) there have reported rather strange failures in the weeks before like: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=zebradt=2007-07-21%2022:25:05 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=zebradt=2007-07-23%2010:25:04 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=zebradt=2007-08-04%2006:25:05 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clownfishdt=2007-08-04%2005:05:34 which I had not yet time to investigate. Maybe we should have a way for buildfarm owners to be able to tag animals and/or builds with a bit of status information in such cases ? Stefan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [GENERAL] Undetected corruption of table files
* Alban Hertroys: If you have a proper production database server, your memory has error checking, and your RAID controller has something of the kind as well. To my knowledge, no readily available controller performs validation on reads (not even for RAID-1 or RAID-10, where it would be pretty straightforward). Something like an Adler32 checksum (not a full CRC) on each page might be helpful. However, what I'd really like to see is something that catches missed writes, but this is very difficult to implement AFAICT. -- Florian Weimer[EMAIL PROTECTED] BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] initdb failed on Windows 2000
Hi, From: Yoshiyuki Asaba [EMAIL PROTECTED] Subject: [HACKERS] initdb failed on Windows 2000 Date: Mon, 27 Aug 2007 20:46:35 +0900 (JST) I have compiled PostgreSQL 8.2.4 with MinGW on Windows 2000. Then I have executed initdb as Administrator. However initdb failed with the following message. The program postgres is needed by initdb but was not found in the same directory as C:\msys\1.0\local\pgsql\bin/initdb. Check your installation. So, I have debugged initdb.exe. I found that CreatePipe() was failed with ERROR_ACCESS_DENIED in exec.c:pipe_read_line(). The attached files are test programs. % gcc -o child.exe child.c % gcc -o parent.exe parent.c When parent.exe is executed by Power Users or Users, the result is good. However, CreatePipe() is failed when Administrator do. % ./parent.exe CreatePipe() failed: 5 Regards, -- Yoshiyuki Asaba [EMAIL PROTECTED] #include stdio.h #include windows.h typedef BOOL(WINAPI * __CreateRestrictedToken) (HANDLE, DWORD, DWORD, PSID_AND_ATTRIBUTES, DWORD, PLUID_AND_ATTRIBUTES, DWORD, PSID_AND_ATTRIBUTES, PHANDLE); #define DISABLE_MAX_PRIVILEGE 0x1 /* * Create a restricted token and execute the specified process with it. * * Returns 0 on failure, non-zero on success, same as CreateProcess(). * * On NT4, or any other system not containing the required functions, will * NOT execute anything. */ static int CreateRestrictedProcess(char *cmd) { BOOLb; STARTUPINFO si; HANDLE origToken; HANDLE restrictedToken; SID_IDENTIFIER_AUTHORITY NtAuthority = {SECURITY_NT_AUTHORITY}; SID_AND_ATTRIBUTES dropSids[2]; __CreateRestrictedToken _CreateRestrictedToken = NULL; HANDLE Advapi32Handle; PROCESS_INFORMATION pi; ZeroMemory(pi, sizeof(pi)); ZeroMemory(si, sizeof(si)); si.cb = sizeof(si); Advapi32Handle = LoadLibrary(ADVAPI32.DLL); if (Advapi32Handle != NULL) { _CreateRestrictedToken = (__CreateRestrictedToken) GetProcAddress(Advapi32Handle, CreateRestrictedToken); } if (_CreateRestrictedToken == NULL) { fprintf(stderr, WARNING: Unable to create restricted tokens on this platform\n); if (Advapi32Handle != NULL) FreeLibrary(Advapi32Handle); return 0; } /* Open the current token to use as a base for the restricted one */ if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ALL_ACCESS, origToken)) { fprintf(stderr, Failed to open process token: %lu\n, GetLastError()); return 0; } /* Allocate list of SIDs to remove */ ZeroMemory(dropSids, sizeof(dropSids)); if (!AllocateAndInitializeSid(NtAuthority, 2, SECURITY_BUILTIN_DOMAIN_RID, DOMAIN_ALIAS_RID_ADMINS, 0, 0, 0, 0, 0, 0, dropSids[0].Sid) || !AllocateAndInitializeSid(NtAuthority, 2, SECURITY_BUILTIN_DOMAIN_RID, DOMAIN_ALIAS_RID_POWER_USERS, 0, 0, 0, 0, 0, 0, dropSids[1].Sid)) { fprintf(stderr, Failed to allocate SIDs: %lu\n, GetLastError()); return 0; } b = _CreateRestrictedToken(origToken, DISABLE_MAX_PRIVILEGE, sizeof(dropSids) / sizeof(dropSids[0]), dropSids, 0, NULL, 0, NULL, restrictedToken); FreeSid(dropSids[1].Sid); FreeSid(dropSids[0].Sid); CloseHandle(origToken); FreeLibrary(Advapi32Handle); if (!b) { fprintf(stderr, Failed to create restricted token: %lu\n, GetLastError()); return 0; } CreateProcessAsUser(restrictedToken, NULL, cmd, NULL, NULL, TRUE, 0, NULL, NULL, si, pi); WaitForSingleObject(pi.hProcess, INFINITE); CloseHandle(pi.hThread); CloseHandle(pi.hProcess); return 0; } int main(void) { CreateRestrictedProcess(child.exe); return 0; } #include stdio.h #include windows.h int main(void) { SECURITY_ATTRIBUTES sattr; HANDLE childstdoutrd, childstdoutwr, childstdoutrddup, file, pipe; PROCESS_INFORMATION pi; STARTUPINFO si; sattr.nLength = sizeof(SECURITY_ATTRIBUTES); sattr.bInheritHandle = TRUE; sattr.lpSecurityDescriptor = NULL;
Re: [HACKERS] Contrib modules documentation online
There is a problem with line feeds for contrib/xml2: http://www.nan-tic.com/ftp/pgdoc/xml2.html As for idea itself, I find it very useful (besides usability improvements, it would help to promote Postgres advanced features). On 8/29/07, Albert Cervera i Areny [EMAIL PROTECTED] wrote: I've been working on converting the current README files for all contrib modules into sgml and add it to the documentation. There are still some fixes to do but i'd like to have some feedback. Indeed, it wasn't agreed to have all if any of the modules together with the core documentation. You can see the docs on [1] in chapter VIII. If you think these could be a good addition, please fill free to comment on how you think sections should be organized to be consistent and easy to read. [1] http://www.nan-tic.com/ftp/pgdoc ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- Best regards, Nikolay ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] initdb failed on Windows 2000
Yoshiyuki Asaba wrote: I have compiled PostgreSQL 8.2.4 with MinGW on Windows 2000. Then I have executed initdb as Administrator. However initdb failed with the following message. The program postgres is needed by initdb but was not found in the same directory as C:\msys\1.0\local\pgsql\bin/initdb. Check your installation. So, I have debugged initdb.exe. I found that CreatePipe() was failed with ERROR_ACCESS_DENIED in exec.c:pipe_read_line(). The attached files are test programs. % gcc -o child.exe child.c % gcc -o parent.exe parent.c When parent.exe is executed by Power Users or Users, the result is good. However, CreatePipe() is failed when Administrator do. % ./parent.exe CreatePipe() failed: 5 What do you want us to do about it? Isn't this a case of don't do that, then? Run initdb as some other user. We don't let you run initdb as root on Unix, so we're just being consistent. In fact, we're being more liberal on Windows than on Unix because we make some provision for the restricted token gadget. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] initdb failed on Windows 2000
Hi, From: Andrew Dunstan [EMAIL PROTECTED] Subject: Re: [HACKERS] initdb failed on Windows 2000 Date: Wed, 29 Aug 2007 08:57:55 -0400 What do you want us to do about it? Isn't this a case of don't do that, then? Run initdb as some other user. We don't let you run initdb as root on Unix, so we're just being consistent. In fact, we're being more liberal on Windows than on Unix because we make some provision for the restricted token gadget. Administrator can run initdb on Windows XP, Server 2003 and Vista. Is this right? -- Yoshiyuki Asaba [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] initdb failed on Windows 2000
Yoshiyuki Asaba wrote: Administrator can run initdb on Windows XP, Server 2003 and Vista. Is this right? Well, I gave up on trying to get Vista to work, and I found I needed a non-Administrator user to run my new buildfarm member on XP-Pro, so I am means sure it is right. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] MSVC build system
Magnus Hagander wrote: David Boreham wrote: To add my 2d worth to this: after working on a few very large projects that built on both Unix and Windows my preference is to use a single autotools-based build for both, with a script called cccl that translates cc-style arguments for Microsoft's cl compiler/linker tool chain (plus Cygwin for the command line utilities, gmake etc). We have a locally-enhanced version of cccl that's a bit more capable than the latest public version, I seem to remember. But that still requires you to have a full set of unix style commandline tools on your windows box in order to build, no? And if it doesn't generate project files and such, it won't be usable in Visual Studio, just the commandline compiler... Correct on both counts, and for me neither is a problem. I can't imagine using a Windows box for software development without Unix tools installed, and I've never found building huge pieces of software using an IDE to be a useful thing to do. As I said, just my 2d worth... ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Dictionary chaining and stop words
It's nice to be able to chain tsearch dictionaries, but I find that it's not as flexible as it should be. Currently we have these dictionaries built-in: dict_simple - lowercases and checks against stop word list, accepts everything not in stop word list dict_synonym - replaces with synonym, if found dict_thesaurus - similar to synonym, but can recognize phrases dict_ispell - lowercases, checks dictionary, then checks stop words dict_snowball - lowercases, checks stop words, then stems The way things are at the moment, you can't for example use any of the built-in dictionaries in case-sensitive mode, without writing custom C code. Or check against stop words before going through an ispell dictionary (dict_simple accepts everything, so you can't put it in front of dict_ispell). Or use ispell dictionary first, then replace synonyms with dict_synonym, and so forth. To make the chaining more useful, I'm proposing some changes to dictionary API and the set of built-in dictionaries. Currently, a dictionary can either: - Accept the word (and possibly replace it with something else) - Reject the word - Do nothing There's clearly need for transforming a word and passing on the transformed version to the next dictionary. dict_thesaurus does exactly that by supporting a subdictionary which is called before invoking the thesaurus, but it should be generic capability not specific to any dictionary. Let's modify the lexize API so that a dictionary can: - Accept the word (and possibly input with something else) - Reject the word - Transform word into another (or pass on as is) If we do that, and modularize the lowercasing and stopwords functionality into separate dictionaries, we end up with this nice, orthogonal set of dictionaries that you can use as building blocks for a wide range of more complex rules: dict_lowercase - lowercases, doesn't accept or reject anything dict_simple - accepts or rejects (depending on dict option) words in list, passes on others. This can be used for stop words functionality, or to accept words found in a simple list of words dict_accept - accepts everything (for use as a terminator in the chain, if you want to accept everything not accepted or rejected by other dictionaries) dict_synonym- replaces input with synonym, passes on or accepts matches depending on dict option dict_thesaurus - replaces input with preferred term, passes on or accepts matches depending on dict option dict_ispell - replaces input with basic form from dictionary, passes on or accepts matches depending on dict option dict_snowball - replaces input with stem, passes on I don't know what the current plan for beta is, but it would be nice to get the API right even though there is some work to do. I can write a patch if no-one objects. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] initdb failed on Windows 2000
I wrote: so I am means sure it is right. I meant, of course, so I am by no means sure it is right That's what I get for multitasking. cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Dictionary chaining and stop words
Heikki Linnakangas [EMAIL PROTECTED] writes: There's clearly need for transforming a word and passing on the transformed version to the next dictionary. dict_thesaurus does exactly that by supporting a subdictionary which is called before invoking the thesaurus, but it should be generic capability not specific to any dictionary. Let's modify the lexize API so that a dictionary can: - Accept the word (and possibly input with something else) - Reject the word - Transform word into another (or pass on as is) This doesn't seem to be enough to solve thesaurus' problem though. The difficulty there is that (1) it wants to look at several words at once, (2) it wants to know which words were rejected as stopwords. If filtering happens before it then how can it do that? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Dictionary chaining and stop words
Heikki, we know about this ( I call it filtering), but we leave it for the future after we'll have everything in core. The more demonstrative example is well-known accent-removal problem. I used to recommend to preprocess string before tsearch2, but there is a problem with headline() when this will not work, so, clearly, we need accent removal in dictionary chain using simple pg_unaccent dictionary, which should return an original word without accent and then pass it to the next dictionary. Currently, this is impossible. But, it's not obvious in the general case, when dictionary return array of lexems. So, we decide to leave it for future. I'm very pleased, that we have now many developers interested in the text search development ! We have many interesting todo like 'phrase search'. Oleg On Wed, 29 Aug 2007, Heikki Linnakangas wrote: It's nice to be able to chain tsearch dictionaries, but I find that it's not as flexible as it should be. Currently we have these dictionaries built-in: dict_simple - lowercases and checks against stop word list, accepts everything not in stop word list dict_synonym - replaces with synonym, if found dict_thesaurus - similar to synonym, but can recognize phrases dict_ispell - lowercases, checks dictionary, then checks stop words dict_snowball - lowercases, checks stop words, then stems The way things are at the moment, you can't for example use any of the built-in dictionaries in case-sensitive mode, without writing custom C code. Or check against stop words before going through an ispell dictionary (dict_simple accepts everything, so you can't put it in front of dict_ispell). Or use ispell dictionary first, then replace synonyms with dict_synonym, and so forth. To make the chaining more useful, I'm proposing some changes to dictionary API and the set of built-in dictionaries. Currently, a dictionary can either: - Accept the word (and possibly replace it with something else) - Reject the word - Do nothing There's clearly need for transforming a word and passing on the transformed version to the next dictionary. dict_thesaurus does exactly that by supporting a subdictionary which is called before invoking the thesaurus, but it should be generic capability not specific to any dictionary. Let's modify the lexize API so that a dictionary can: - Accept the word (and possibly input with something else) - Reject the word - Transform word into another (or pass on as is) If we do that, and modularize the lowercasing and stopwords functionality into separate dictionaries, we end up with this nice, orthogonal set of dictionaries that you can use as building blocks for a wide range of more complex rules: dict_lowercase - lowercases, doesn't accept or reject anything dict_simple - accepts or rejects (depending on dict option) words in list, passes on others. This can be used for stop words functionality, or to accept words found in a simple list of words dict_accept - accepts everything (for use as a terminator in the chain, if you want to accept everything not accepted or rejected by other dictionaries) dict_synonym- replaces input with synonym, passes on or accepts matches depending on dict option dict_thesaurus - replaces input with preferred term, passes on or accepts matches depending on dict option dict_ispell - replaces input with basic form from dictionary, passes on or accepts matches depending on dict option dict_snowball - replaces input with stem, passes on I don't know what the current plan for beta is, but it would be nice to get the API right even though there is some work to do. I can write a patch if no-one objects. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] StringInfo misc. issues
Hi, I palloc0'ed a variable of type StringInfo and without doing an initStringInfo() (forgot to do it i.e.) tried to append some stuff to it using appendStringInfo(). It went into a tight loop within the function enlargeStringInfo() at: while (needed newlen) Must be a common enough case for a palloc0'ed field right? The attached patch should fix this. *** 226,232 ! if (needed 0 || ((Size) needed) = (MaxAllocSize - (Size) str-len)) elog(ERROR, invalid string enlargement request size %d, needed); --- 226,232 ! if (needed = 0 || ((Size) needed) = (MaxAllocSize - (Size) str-len)) elog(ERROR, invalid string enlargement request size %d, needed); I also found the absence of a function like resetStringInfo() a bit puzzling. A found a lot of places where the code was resetting the len field to 0 and assigning '\0' to the data field to reset the variable. This seems to be the only missing API which will be needed while working with the StringInfo type. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com Index: src/backend/lib/stringinfo.c === RCS file: /repositories/edbhome/cvs/EDBAS82/edb/edb-postgres/src/backend/lib/stringinfo.c,v retrieving revision 1.3 diff -c -r1.3 stringinfo.c *** src/backend/lib/stringinfo.c 9 Nov 2006 11:09:09 - 1.3 --- src/backend/lib/stringinfo.c 29 Aug 2007 14:37:58 - *** *** 226,232 * bogus data. Without this, we can get an overflow or infinite loop in * the following. */ ! if (needed 0 || ((Size) needed) = (MaxAllocSize - (Size) str-len)) elog(ERROR, invalid string enlargement request size %d, needed); --- 226,232 * bogus data. Without this, we can get an overflow or infinite loop in * the following. */ ! if (needed = 0 || ((Size) needed) = (MaxAllocSize - (Size) str-len)) elog(ERROR, invalid string enlargement request size %d, needed); *** *** 259,261 --- 259,272 str-maxlen = newlen; } + + /* + * resetStringInfo + * Reset the len field and the data field contents for a fresh start + */ + void + resetStringInfo(StringInfo str) + { + str-len = 0; + str-data[0] = '\0'; + } Index: src/include/lib/stringinfo.h === RCS file: /repositories/edbhome/cvs/EDBAS82/edb/edb-postgres/src/include/lib/stringinfo.h,v retrieving revision 1.3 diff -c -r1.3 stringinfo.h *** src/include/lib/stringinfo.h 9 Nov 2006 11:09:17 - 1.3 --- src/include/lib/stringinfo.h 29 Aug 2007 14:37:58 - *** *** 138,141 --- 138,146 */ extern void enlargeStringInfo(StringInfo str, int needed); + /* + * resetStringInfo + * Reset the len field and the data field contents for a fresh start + */ + extern void resetStringInfo(StringInfo str); #endif /* STRINGINFO_H */ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] reviving dead buildfarm animals
Stefan Kaltenbrunner [EMAIL PROTECTED] writes: Maybe we should have a way for buildfarm owners to be able to tag animals and/or builds with a bit of status information in such cases ? +1 regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] StringInfo misc. issues
NikhilS wrote: I also found the absence of a function like resetStringInfo() a bit puzzling. A found a lot of places where the code was resetting the len field to 0 and assigning '\0' to the data field to reset the variable. This seems to be the only missing API which will be needed while working with the StringInfo type. er, what? stringinfo.h has: /* * resetStringInfo * Clears the current content of the StringInfo, if any. The * StringInfo remains valid. */ extern void resetStringInfo(StringInfo str); cheers andrew cheers andrew ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] StringInfo misc. issues
Andrew Dunstan escribió: NikhilS wrote: I also found the absence of a function like resetStringInfo() a bit puzzling. A found a lot of places where the code was resetting the len field to 0 and assigning '\0' to the data field to reset the variable. This seems to be the only missing API which will be needed while working with the StringInfo type. er, what? stringinfo.h has: /* * resetStringInfo * Clears the current content of the StringInfo, if any. The * StringInfo remains valid. */ extern void resetStringInfo(StringInfo str); I think Neil added this recently. Maybe NikhilS is looking at 8.2 or something. -- Alvaro Herrera Developer, http://www.PostgreSQL.org/ The Gord often wonders why people threaten never to come back after they've been told never to return (www.actsofgord.com) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] StringInfo misc. issues
NikhilS [EMAIL PROTECTED] writes: The attached patch should fix this. And break other things, no doubt. needed = 0 is a perfectly valid edge case and mustn't be rejected here. (In fact, I doubt you'd even get through the regression tests with this patch ... how much did you test it?) The real problem with what you describe is that you should have used makeStringInfo(). I also found the absence of a function like resetStringInfo() a bit puzzling. CVS HEAD is way ahead of you. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] correct behavior of ANALYZE ...
i came across some interesting behavior of pg_stats and i am not sure if this is something we should treat the way we do it. consider: test_hans=# select * from pg_stats where attname = 'id' and tablename = 't_testhugo'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ++-+---+--- ++- +--- +--+- public | t_testhugo | id | 0 | 8 | 15 | {18,17,16,19,20,15} | {0.20,0.186333,0.155333,0.148667,0.095,0.090} | {11,13,14,14,14,21,21,22,25} |0.557774 (1 row) test_hans=# alter TABLE t_testhugo alter column id set statistics 2; ALTER TABLE test_hans=# ANALYZE t_testhugo ; ANALYZE test_hans=# select * from pg_stats where attname = 'id' and tablename = 't_testhugo'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ++-+---+--- ++--+--- +--+- public | t_testhugo | id | 0 | 8 | 12 | {18,17} | {0.21,0.19} | {12,19,23} | 0.597255 (1 row) test_hans=# alter TABLE t_testhugo alter column id set statistics 0; ALTER TABLE i expected the histogram to be gone her and stats should be disabled. instead, we keep the old histogram here. test_hans=# ANALYZE t_testhugo ; ANALYZE test_hans=# select * from pg_stats where attname = 'id' and tablename = 't_testhugo'; schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation ++-+---+--- ++--+--- +--+- public | t_testhugo | id | 0 | 8 | 12 | {18,17} | {0.21,0.19} | {12,19,23} | 0.597255 (1 row) is that what we expect? if no, i will go and fit it ... hans -- Cybertec Geschwinde Schönig GmbH Gröhrmühlgasse 26, 2700 Wiener Neustadt Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at
Re: [HACKERS] correct behavior of ANALYZE ...
Hans-Juergen Schoenig [EMAIL PROTECTED] writes: i came across some interesting behavior of pg_stats and i am not sure if this is something we should treat the way we do it. Setting target zero means expend no work on this column. In my book that includes not doing anything to any pre-existing pg_stats entry. What you propose would defeat the ability to analyze an unchanging column once and then make ANALYZE skip over it henceforth. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Contrib modules documentation online
Albert, (crossed over to -docs, where it really belongs) I've been working on converting the current README files for all contrib modules into sgml and add it to the documentation. There are still some fixes to do but i'd like to have some feedback. Indeed, it wasn't agreed to have all if any of the modules together with the core documentation. You can see the docs on [1] in chapter VIII. If you think these could be a good addition, please fill free to comment on how you think sections should be organized to be consistent and easy to read. [1] http://www.nan-tic.com/ftp/pgdoc Wow, this is really, really cool! You're my hero. I'm very strongly in favor of having this documentation. However, I think it might make sense to put Contrib Modules as a section under either Reference or Appendices. Also, I don't think it's necessary to make each command option a separate subchapter, but I can see how that would be hard to avoid in an automated system. Guys, would it be out of the question to do this in 8.3? Please please? If we go ahead with this, I'll commit to doing a contrib README cleanup so the doc system works better. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Hi When we lazily assign XIDs, we gain another flag beside the existing MyXactMadeTempRelUpdates, MyXactMadeXLogEntry, MyLastRecPtr and smgr's pendingDeletes to tell what kind of actions a transaction performed. Adding TransactionIsIsValid(GetCurrentTransactionIdIfAny()) on top of that makes things quite impenetrable - at least for me. So I'm trying to wrap my head around that logic, and simplify it a bit if possible. (Nowadays, async commit even adds a bit more complexity) Currently, we write out a COMMIT record if a transaction either created any transaction-controlled XLOG entries (MyLastRecPtr.xrecoff != 0), or scheduled the deletion of files on commit. Afterwards, the xlog is flushed to the end last record created by the session (ProcLastRecEnd) if the transaction created any xlog record at all. If we either made transaction-controlled XLOG entries, or temporary relation updates, we update the XID status in the CLOG. An ABORT record is currently created if a transaction either created transaction-controlled XLOG entries or scheduled the deletion of files on abort. If we schedules file deletions, we flush the XLOG up to the ABORT record. If we either made transaction-controlled XLOG entries, updated temporary relations, or scheduled deletions we update the XID status in the CLOG. For subtransaction commit, a COMMIT record is emitted if we either made transaction-controlled XLOG entries, or updated temporary relations. No XLOG flush is performed. Subtransaction ABORTS are handled the same way as regular transaction aborts. For toplevel transaction commits, we defer flushing the xlog if synchronous_commit = off, and we didn't schedule any file deletions. Now, with lazy XID assignment I believe the following holds true .) If we didn't assign a valid XID, we cannot have made transaction-controlled XLOG entries (Can be checked by asserting that the current transaction id is valid if XLOG_NO_TRAN isn't set in XLogInsert). .) We cannot have scheduled files for deletion (on either COMMIT or ABORT) if we don't have a valid XID, since any such deletion will happen together with a catalog update. Note that this is already assumed to be true for subtransactions, since they only call RecordSubTransaction{Commit|Abort} if they have an XID assigned. I propose to do the following in my lazy XID assignment patch - can anyone see a hole in that? .) Get rid of MyLastRecPtr and MyXactMadeTempRelUpdates. Those are superseeded by TransactionIdIsValid(GetCurrentTransactionIdIfAny()). .) Get rid of MyXactMadeXLogEntry. Instead, just reset ProcLast .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting a new toplevel transaction. Transaction COMMIT: Write an COMMIT record if and only if we have a valid XID. Then, flush the XLOG to XactLastRecEnd if that is set, and synchronous_commit=on. Aferwards, update the CLOG if and only if we have a valid XID. Transaction ABORT: Write an ABORT record if and only if we have a valid XID. Then, flush the XLOG to XactLastRecEnd if that is set, and we scheduled on-abort deletions. Update the CLOG if and only if we have a valid XID. Subtransaction COMMIT: Update the CLOG if and only if we have a valid XID. Subtransaction ABORT: Write an ABORT record if and only if we have a valid XID. Then, flush the XLOG to XactLastRecEnd if that is set, and we scheduled on-abort deletions. Update the CLOG if and only if we have a valid XID. I think we might go even further, and *never* flush the XLOG on abort, since if we crash just before the abort won't log anything either. But if we leak the leftover files in such a case, that's probably a bad idea. greetings, Florian Pflug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Contrib modules documentation online
On Wed, Aug 29, 2007 at 10:09:07AM -0700, Josh Berkus wrote: Albert, (crossed over to -docs, where it really belongs) I've been working on converting the current README files for all contrib modules into sgml and add it to the documentation. There are still some fixes to do but i'd like to have some feedback. Indeed, it wasn't agreed to have all if any of the modules together with the core documentation. You can see the docs on [1] in chapter VIII. If you think these could be a good addition, please fill free to comment on how you think sections should be organized to be consistent and easy to read. [1] http://www.nan-tic.com/ftp/pgdoc Wow, this is really, really cool! You're my hero. I'm very strongly in favor of having this documentation. However, I think it might make sense to put Contrib Modules as a section under either Reference or Appendices. Also, I don't think it's necessary to make each command option a separate subchapter, but I can see how that would be hard to avoid in an automated system. Guys, would it be out of the question to do this in 8.3? Please please? If we go ahead with this, I'll commit to doing a contrib README cleanup so the doc system works better. One question... would there still be a README in each contrib directory? I think getting this stuff in the docs is great, but the README in the source is also very valuable and I'd hate to lose it. -- Decibel!, aka Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) pgpbJLvWh95Cg.pgp Description: PGP signature
Re: [DOCS] [HACKERS] Contrib modules documentation online
Josh Berkus [EMAIL PROTECTED] writes: If we go ahead with this, I'll commit to doing a contrib README cleanup so the doc system works better. Why wouldn't we just remove the README files altogether? I can't see maintaining duplicate sets of documentation. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [DOCS] [HACKERS] Contrib modules documentation online
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: If we go ahead with this, I'll commit to doing a contrib README cleanup so the doc system works better. Why wouldn't we just remove the README files altogether? I can't see maintaining duplicate sets of documentation. +1 Athough I could see a README at the top of contrib that says, contrib documentation is now here link / directory etc... Joshua D. Drake regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings - -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFG1bQQATb/zqfZUUQRAuYoAJ95zQkchY8pSq2BCyiy62ZAbA0hGgCdEHKt SXzpwREgcgVNXjolnQh927o= =mawb -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [DOCS] [HACKERS] Contrib modules documentation online
On Wed, 2007-08-29 at 13:53 -0400, Tom Lane wrote: Why wouldn't we just remove the README files altogether? I can't see maintaining duplicate sets of documentation. I agree that duplication is bad, but I think README files in the individual contrib directories is useful and worth keeping: if I'm about to install a contrib module and want to learn how to install and use it, this change would only make that information *more* difficult to find. I wonder if it would be possible to keep the master version of the contrib docs as SGML, and generate plaintext READMEs from it during the documentation build. -Neil ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [DOCS] [HACKERS] Contrib modules documentation online
Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: If we go ahead with this, I'll commit to doing a contrib README cleanup so the doc system works better. Why wouldn't we just remove the README files altogether? I can't see maintaining duplicate sets of documentation. Right. Also, let's recall what has previously been discussed for contrib, namely that we break it out into standard modules (think Perl standard modules) and other tools, and that we abandon the wholly misleading contrib name altogether. I really want to see that happen next release. Getting the modules properly documented is a very important milestone along the way to getting that done. Maybe then the modules will be considered more first class citizens (until the buildfarm came along they were often hardly tested at all). cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [DOCS] [HACKERS] Contrib modules documentation online
On 29/08/2007, Neil Conway [EMAIL PROTECTED] wrote: I wonder if it would be possible to keep the master version of the contrib docs as SGML, and generate plaintext READMEs from it during the documentation build. Hello Neil, I think I'm doing something similar but not with README files. Currently I'm writing the FAQ into Docbook XML, that's why we can build the HTML and plain text at one. I'm going to finish this week then I'll show the results. -- http://www.advogato.org/person/mgonzalez/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Florian G. Pflug [EMAIL PROTECTED] writes: I propose to do the following in my lazy XID assignment patch - can anyone see a hole in that? Cleaning up this area seems like a good idea. Just FYI, one reason why there are so many LastRec pointer variables is that the WAL record format used to include a back-link to the previous record of the same transaction, so we needed to track that location. Since that's gone, simplification is definitely possible. A lot of the other behavior you're looking at just grew as incremental optimizations added over time. One comment is that at the time we make an entry into smgr's pending-deletes list, I think we might not have acquired an XID yet --- if I understand your patch correctly, a CREATE TABLE would acquire an XID when it makes its first catalog insertion, and that happens after creating the on-disk table file. So it seems like a good idea for smgr itself to trigger acquisition of an XID before it makes a pending-deletes entry. This ensures that you can't have a situation where you have deletes to record and no XID; otherwise, an elog between smgr insertion and catalog insertion would lead to just that. .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting a new toplevel transaction. I'm not very happy with that name for the variable, because it looks like it might refer to the last transaction-controlled record we emitted, rather than the last record of any type. Don't have a really good suggestion though --- CurXactLastRecEnd is the best I can do. One thought here is that it's not clear that we really need a concept of transaction-controlled vs not-transaction-controlled xlog records anymore. In CVS HEAD, the *only* difference no_tran makes is whether to set MyLastRecPtr, and you propose removing that variable. This seems sane to me --- the reason for having the distinction at all was Vadim's plan to implement transaction UNDO by scanning its xlog records backwards, and that idea is as dead as a doornail. So we could simplify matters conceptually if we got rid of any reference to such a distinction. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Hi Since generating transient XIDs (named ResourceOwnerIDs in my patch, since their lifetime is coupled to the lifetime of a transaction's toplevel resource owner) seems to be to way to go for lazx xid assignment, I need to find a way to represent them in the pg_locks view. ResourceOwnerIds are a structure composed of two uint32s, a processID (could be the PID of the backend, but to make sure that it isn't reused too quickly, it's actually a synthentic ID generated at backend start), and localTransactionId which is just incremented whenever a new transaction is started in a backend. This design was the result of my discussion with Tom - it's main advantage is that it needs no lock to generate a new ResourceOwnerId. I see 3 possibilities to represent this in system views A) Make ResourceOwnerID a full-blown type, with in and out methods, very similar to tids. processId/localTransactionId would be a natural string representation. B) Just convert the ResourceOwnerId into a string in pg_lock_status. Looks quite similar to (A) from a user's point of view, but the implementation is much shorter. C) Combine the two uint32 fields of ResourceOwnerId into a int8. Might be more efficient than (B). The main disadvantage is that some ResourceOwnerIds will be represented by *negative* integers, which is pretty ugly. D) Just make them two int4 fields. This has the same negativity issue that (C) has, and might cause confusion if users don't read the docs carefully. I'm leaning towards (A), but it adds a lot new code (although most if it would be copied nearly 1-to-1 from tid.c) for maybe too little gain. If (A) is deemed not appropriate, doing (C) and restricting processIds to = 0x8000 might be an option. greetings, Florian Pflug ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Contrib modules documentation online
On Wed, 29 Aug 2007, Josh Berkus wrote: Guys, would it be out of the question to do this in 8.3? Please please? Are you suggesting to add an additional piece of work to the already behind schedule 8.3 timeline when there's already this idea floating around to overhaul the entire contrib structure in 8.4, which may very well make much of that work redundant? Albert's work is cool and all, but from from back here where I sit I'd expect anyone in a position to integrate it into 8.3 properly should be working on something that's already on the to-do list instead. I know I'm about to dump a big stack of 8.3 data onto the list I'd appreciate some attention from you on, rather than having you distracted cleaning up documentation that's perfectly functional for now. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: I propose to do the following in my lazy XID assignment patch - can anyone see a hole in that? One comment is that at the time we make an entry into smgr's pending-deletes list, I think we might not have acquired an XID yet --- if I understand your patch correctly, a CREATE TABLE would acquire an XID when it makes its first catalog insertion, and that happens after creating the on-disk table file. So it seems like a good idea for smgr itself to trigger acquisition of an XID before it makes a pending-deletes entry. This ensures that you can't have a situation where you have deletes to record and no XID; otherwise, an elog between smgr insertion and catalog insertion would lead to just that. I wonder a bit about the whole special-casing of COMMITs/ABORTs with pending delete, though. A crash might always leave stray file around, so there ought to be a way to clean them up anyway. Still, for now I'll go with your suggestion, and force XID assignment in the smgr. .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting a new toplevel transaction. I'm not very happy with that name for the variable, because it looks like it might refer to the last transaction-controlled record we emitted, rather than the last record of any type. Don't have a really good suggestion though --- CurXactLastRecEnd is the best I can do. Hm.. don't have a good suggestion, either - the reason I want to rename it is that ProcLastRecEnd doesn't sound like it's be reset at transaction start. One thought here is that it's not clear that we really need a concept of transaction-controlled vs not-transaction-controlled xlog records anymore. In CVS HEAD, the *only* difference no_tran makes is whether to set MyLastRecPtr, and you propose removing that variable. This seems sane to me --- the reason for having the distinction at all was Vadim's plan to implement transaction UNDO by scanning its xlog records backwards, and that idea is as dead as a doornail. So we could simplify matters conceptually if we got rid of any reference to such a distinction. I've thinking about keeping XLOG_NO_TRAN, and doing if (!no_tran) Assert(TransactionIdIsValid(GetCurrentTransactionIdIfAny()) in xlog.c as a safety measure. We can't make that assertion unconditionally, I think, because nextval() won't force XID assigment, but might do XLogInsert. greetings, Florian Pflug ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [DOCS] [HACKERS] Contrib modules documentation online
On 8/29/07, Mario Gonzalez [EMAIL PROTECTED] wrote: On 29/08/2007, Neil Conway [EMAIL PROTECTED] wrote: I wonder if it would be possible to keep the master version of the contrib docs as SGML, and generate plaintext READMEs from it during the documentation build. Hello Neil, I think I'm doing something similar but not with README files. Currently I'm writing the FAQ into Docbook XML, that's why we can build the HTML and plain text at one. While I like the idea of the READMEs from contrib being in the docs, I can't tell you the number of times I've installed a contrib module in a dark ops center at 2am with no html browser handy (or at best a text based one) or with no access to external internet etc... and just needed a line or two from the README file that came with the contrib module. Could the contrib README files couldn't be generated from the same source as the docs (i.e. sgml) and then put into the appropriate contrib/module/ directory. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Florian G. Pflug [EMAIL PROTECTED] writes: Since generating transient XIDs (named ResourceOwnerIDs in my patch, since their lifetime is coupled to the lifetime of a transaction's toplevel resource owner) seems to be to way to go for lazx xid assignment, I need to find a way to represent them in the pg_locks view. This is going very far towards gilding the lily. Try to avoid loading the patch down with a new datatype. I'm inclined to think that it'd be sufficient to show the high half of the ID (that is, the session number) in pg_locks, because there will never be cases where there are concurrently existing locks on different localTransactionIds. This could probably be displayed in the transactionID columns, which would mean we're abusing the user-visible xid datatype, but I don't see much harm in it. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Florian G. Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: One thought here is that it's not clear that we really need a concept of transaction-controlled vs not-transaction-controlled xlog records anymore. I've thinking about keeping XLOG_NO_TRAN, and doing if (!no_tran) Assert(TransactionIdIsValid(GetCurrentTransactionIdIfAny()) in xlog.c as a safety measure. Why do you think this is a safety measure? All that it is checking is whether the caller has preserved an entirely useless distinction. The real correctness property is that you can't write your XID into a heap tuple or XLOG record if you haven't acquired an XID, but that seems nearly tautological. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [DOCS] [HACKERS] Contrib modules documentation online
On Aug 29, 2007, at 13:27 , Andrew Dunstan wrote: Also, let's recall what has previously been discussed for contrib, namely that we break it out into standard modules (think Perl standard modules) and other tools, and that we abandon the wholly misleading contrib name altogether. I really want to see that happen next release. +1 Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Contrib modules documentation online
Greg, Are you suggesting to add an additional piece of work to the already behind schedule 8.3 timeline when there's already this idea floating around to overhaul the entire contrib structure in 8.4, which may very well make much of that work redundant? Albert's work is cool and all, but from from back here where I sit I'd expect anyone in a position to integrate it into 8.3 properly should be working on something that's already on the to-do list instead. Or the contrib overhaul may *not* get into 8.4 (ala updatable views). Having the contrib stuff in the main docs would remove one of the largest barriers to people knowing about the contrib features. Further, you know we don't finish the docs until beta. Ever. I know I'm about to dump a big stack of 8.3 data onto the list I'd appreciate some attention from you on, rather than having you distracted cleaning up documentation that's perfectly functional for now. What kind of data? On bgwriter_lru autotuning? -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: Since generating transient XIDs (named ResourceOwnerIDs in my patch, since their lifetime is coupled to the lifetime of a transaction's toplevel resource owner) seems to be to way to go for lazx xid assignment, I need to find a way to represent them in the pg_locks view. This is going very far towards gilding the lily. Try to avoid loading the patch down with a new datatype. I'm inclined to think that it'd be sufficient to show the high half of the ID (that is, the session number) in pg_locks, because there will never be cases where there are concurrently existing locks on different localTransactionIds. Hm.. I'm not too happy with that. I you for example join pg_locks to pg_stat_activity (which would need to show the RID too), than you *might* get a bogus result if a transaction ends and a new one starts on the same backend between the time pg_lock_status is called, and the time the proc array is read. This could probably be displayed in the transactionID columns, which would mean we're abusing the user-visible xid datatype, but I don't see much harm in it. I'm even more unhappy with that, because the session id of a RID might coincide with a currently in-use XID. What about the following. .) Remove the right-hand side XID from pg_locks (The one holder or waiter of the lock). It seems to make more sense to store a RID here, and let the user fetch the XID via a join to pg_stat_activity. We could also show both the XID (if set) and the RID, but that might lead people to believe that their old views or scripts on top of pg_locks still work correctly when they actually do not. .) On the left-hand side (The locked object), add a RID column of type int8, containing (2^32)*sessionID + localTransactionId. .) To prevent the int8 from being negative, we limit the sessionID to 31 bytes - which is still more then enough. greetings, Florian Pflug ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: Tom Lane wrote: One thought here is that it's not clear that we really need a concept of transaction-controlled vs not-transaction-controlled xlog records anymore. I've thinking about keeping XLOG_NO_TRAN, and doing if (!no_tran) Assert(TransactionIdIsValid(GetCurrentTransactionIdIfAny()) in xlog.c as a safety measure. Why do you think this is a safety measure? All that it is checking is whether the caller has preserved an entirely useless distinction. The real correctness property is that you can't write your XID into a heap tuple or XLOG record if you haven't acquired an XID, but that seems nearly tautological. I was confused. I wanted to protect against the case the an XID hits the disk, but doesn't show up in any xl_xid field, and therefore might be reused after crash recovery. But of course, to make that happen you'd have to actually *store* the XID into the data you pass to XLogInsert, which is kind of hard if you haven't asked for it first. So, I now agree, XLOG_NO_TRAN should be buried. greetings, Florian Pflug ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Contrib modules documentation online
Josh Berkus wrote: Greg, Are you suggesting to add an additional piece of work to the already behind schedule 8.3 timeline when there's already this idea floating around to overhaul the entire contrib structure in 8.4, which may very well make much of that work redundant? Albert's work is cool and all, but from from back here where I sit I'd expect anyone in a position to integrate it into 8.3 properly should be working on something that's already on the to-do list instead. Or the contrib overhaul may *not* get into 8.4 (ala updatable views). Having the contrib stuff in the main docs would remove one of the largest barriers to people knowing about the contrib features. I don't agree with Greg that we shouldn't make this docs improvement. I do think we should do it in such a way that it will fit with our plans for the future. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] Why is there a tsquery data type?
Why does text search need a tsquery data type? I realize it needs tsvector so it can create indexes and updated trigger columns, but it seems tsquery could instead just be a simple text string. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Contrib modules documentation online
Josh Berkus wrote: Having the contrib stuff in the main docs would remove one of the largest barriers to people knowing about the contrib features. Using PostgreSQL since Version 7.1.3 and reading this List since - I dont't know exactly but my current archives start in 2003 which was the last time I crashed my system - Albert's work is actually the first piece of contrib documentation I ever read. I always knew about contrib but as I haven't been missing any feature in the core distribution (well - actually I've been missing PL/R until it came up - but as you know it's not in contrib ;-) I never found time to dig into the contrib directories and README's. Now that I did I'm pretty amazed about the nice features I've never thought about. Further, you know we don't finish the docs until beta. Ever. Whenever you want - but do it. I also highly appreciate Albert's Idea/Work. Regards, Brar ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [DOCS] [HACKERS] Contrib modules documentation online
Scott Marlowe escribió: Could the contrib README files couldn't be generated from the same source as the docs (i.e. sgml) and then put into the appropriate contrib/module/ directory. Sure they can. We already do that for INSTALL for example. -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ ¡Ja ja ja! ¡Sólo hablaba en serio! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] int8 INT64_IS_BUSTED
Hi I'm confused about whether int8s work on a machine on which INT64_IS_BUSTED. My reading of the code suggests that int8 will be available, but be, well, busted in such a machine. For example, int8mul seems as if I'd just return the wrong answer on such a machine. Or are platforms with INT64_IS_BUSTED no longer supported, and are all those #ifdefs only legacy code? Please enlighten a poor linux+gcc user who can't remember ever using a compiler without a long long datatype after leaving TurboC under DOS. greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] text search function renaming
Hi there, I notice, that not all functions were renamed, for example, strip(), setweight(), numnode() Is this intentional ? Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Contrib modules documentation online
On Wed, 29 Aug 2007, Josh Berkus wrote: Further, you know we don't finish the docs until beta. Ever. In that context, as long as the documentation cleanup doesn't slow the schedule for when beta starts I think it would be a great thing to slip into 8.3. In fact, if those are going higher-profile, I may slip an update into the docs for pg_buffercache with the queries I keep recommending people look at. What kind of data? On bgwriter_lru autotuning? That would be my data; HOT is in a similar category. I have many test results to pass along, and I'd like to know that the people who might help confirm/deny what I've discovered are as focused as I've been on trying to wrap all this up already, before wandering into new tangents that aren't already blocking the schedule. That's all I'm saying. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [DOCS] [HACKERS] Contrib modules documentation online
On 8/29/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Scott Marlowe escribió: Could the contrib README files couldn't be generated from the same source as the docs (i.e. sgml) and then put into the appropriate contrib/module/ directory. Sure they can. We already do that for INSTALL for example. OK, s/Could/May/ up there. :) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Re: Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
On Wed, 2007-08-29 at 19:32 +0200, Florian G. Pflug wrote: I propose to do the following in my lazy XID assignment patch The lazy XID assignment seems to be the key to unlocking this whole area. - can anyone see a hole in that? .) Get rid of MyLastRecPtr and MyXactMadeTempRelUpdates. Those are superseeded by TransactionIdIsValid(GetCurrentTransactionIdIfAny()). .) Get rid of MyXactMadeXLogEntry. Instead, just reset ProcLast .) Rename ProcLastRecEnd to XactLastRecEnd, and reset when starting a new toplevel transaction. I followed you up to this point. Nothing bad springs immediately to mind, but please can you explain the proposals rather than just assert them and ask us to find the holes? I think we might go even further, and *never* flush the XLOG on abort, since if we crash just before the abort won't log anything either. But if we leak the leftover files in such a case, that's probably a bad idea. That doesn't gain us much, yet we lose the ability to tell the difference between an abort and a crash. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Florian G. Pflug [EMAIL PROTECTED] writes: What about the following. .) Remove the right-hand side XID from pg_locks (The one holder or waiter of the lock). It seems to make more sense to store a RID here, Yeah, we have to do that since there might not *be* an XID holding the lock. But I still think the session ID would be sufficient here. (Perhaps we don't need the PID either, although then we'd need to change pg_stat_activity to provide session id as a join key...) .) On the left-hand side (The locked object), add a RID column of type int8, containing (2^32)*sessionID + localTransactionId. I'm a bit uncomfortable with that since it renders the view completely useless if you don't have a working int8 type. .) To prevent the int8 from being negative, we limit the sessionID to 31 bytes - which is still more then enough. Hmm ... actually, that just begs the question of how many bits we need at all. Could we display, say, 24 bits of sessionID and 8 bits of localXID merged into a column of nominal XID type? There's a theoretical risk of false join matches but it seems pretty theoretical, and a chance match would not break any system functionality anyway since all internal operations would be working with full-width counters. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [ADMIN] reindexdb hangs
Tom Lane wrote: I examined cluster.c and it does seem to be missing a check too. I'm not sure where to add one though; the best choice would be the place where the list of rels is built, but that scans only pg_index, so it doesn't have access to the namespace of each rel. So one idea would be to get the pg_class row for each candidate, but that seems slow. Another idea would be to just add all the candidates and silently skip the temp indexes in cluster_rel. Yeah, an extra fetch of the pg_class row doesn't seem all that nice. I think you'd want to check it in approximately the same two places where pg_class_ownercheck() is applied (one for the 1-xact and one for the multi-xact path). Actually, the 1-xact path does not need it, because the check is already elsewhere. We only need logic enough to skip temp tables silently while building the list in the multi-xact path. What this means is that very few people, if any, clusters temp tables; because as soon as you do, a plain CLUSTER command in another session errors out with alvherre=# cluster; ERROR: cannot cluster temporary tables of other sessions So, patch attached. Are there any other commands to be worried about? I can't see any besides VACUUM/ANALYZE, and those seem covered. I can't think of any. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support Index: src/backend/commands/cluster.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/commands/cluster.c,v retrieving revision 1.162 diff -c -p -r1.162 cluster.c *** src/backend/commands/cluster.c 19 May 2007 01:02:34 - 1.162 --- src/backend/commands/cluster.c 29 Aug 2007 22:07:04 - *** cluster_rel(RelToCluster *rvtc, bool rec *** 276,281 --- 276,287 } /* + * Note: we don't need to check whether the table is a temp for a + * remote session here, because it will be checked in + * check_index_is_clusterable, below. + */ + + /* * Check that the index still exists */ if (!SearchSysCacheExists(RELOID, *** swap_relation_files(Oid r1, Oid r2, Tran *** 995,1004 } /* ! * Get a list of tables that the current user owns and ! * have indisclustered set. Return the list in a List * of rvsToCluster ! * with the tableOid and the indexOid on which the table is already ! * clustered. */ static List * get_tables_to_cluster(MemoryContext cluster_context) --- 1001,1037 } /* ! * Returns whether a pg_class tuple belongs to a temp namespace which is not ! * our backend's. ! */ ! static bool ! relid_is_other_temp(Oid class_oid) ! { ! HeapTuple tuple; ! Form_pg_class classForm; ! bool istemp; ! ! tuple = SearchSysCache(RELOID, ! ObjectIdGetDatum(class_oid), ! 0, 0, 0); ! if (!HeapTupleIsValid(tuple)) ! ereport(ERROR, ! (errcode(ERRCODE_UNDEFINED_TABLE), ! errmsg(relation with OID %u does not exist, class_oid))); ! ! classForm = (Form_pg_class) GETSTRUCT(tuple); ! istemp = isOtherTempNamespace(classForm-relnamespace); ! ! ReleaseSysCache(tuple); ! ! return istemp; ! } ! ! /* ! * Get a list of tables that the current user owns, have indisclustered set, ! * and are not temp tables of remote backends. Return the list in a List * of ! * rvsToCluster with the tableOid and the indexOid on which the table is ! * already clustered. */ static List * get_tables_to_cluster(MemoryContext cluster_context) *** get_tables_to_cluster(MemoryContext clus *** 1031,1036 --- 1064,1072 if (!pg_class_ownercheck(index-indrelid, GetUserId())) continue; + if (relid_is_other_temp(index-indrelid)) + continue; + /* * We have to build the list in a different memory context so it will * survive the cross-transaction processing Index: src/backend/commands/indexcmds.c === RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/commands/indexcmds.c,v retrieving revision 1.162 diff -c -p -r1.162 indexcmds.c *** src/backend/commands/indexcmds.c 25 Aug 2007 19:08:19 - 1.162 --- src/backend/commands/indexcmds.c 25 Aug 2007 22:11:18 - *** ReindexDatabase(const char *databaseName *** 1292,1297 --- 1292,1301 if (classtuple-relkind != RELKIND_RELATION) continue; + /* Skip temp tables of other backends; we can't reindex them at all */ + if (isOtherTempNamespace(classtuple-relnamespace)) + continue; + /* Check user/system classification, and optionally skip */ if (IsSystemClass(classtuple)) { ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] int8 INT64_IS_BUSTED
On Wed, 2007-08-29 at 22:41 +0200, Florian G. Pflug wrote: Or are platforms with INT64_IS_BUSTED no longer supported, and are all those #ifdefs only legacy code? Personally I think we should head in that direction: if we enable integer datetimes by default in 8.4 (per earlier discussion), such machines will be more broken still. We could fallback to using FP datetimes on INT64_IS_BUSTED machines, but IMHO it is just fundamentally unwise to have the behavior of a builtin data type dependent on this sort of thing. Please enlighten a poor linux+gcc user who can't remember ever using a compiler without a long long datatype after leaving TurboC under DOS. I'm not aware of any platform we might conceivably care about that doesn't have a 64-bit integral type. To verify this, Peter E. suggested that we emit a build-time warning if compiling on such a platform for 8.3, which I think would be worth doing. -Neil ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [DOCS] [HACKERS] Contrib modules documentation online
Josh Berkus [EMAIL PROTECTED] writes: Further, you know we don't finish the docs until beta. Ever. Right, working on docs is a standard beta-period activity. I think Greg is suggesting that right now is not the time to think about improving contrib docs --- right now is the time to keep our eyes on the ball and *get* to beta. If you've got time to worry about it afterward, do so then. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: What about the following. .) Remove the right-hand side XID from pg_locks (The one holder or waiter of the lock). It seems to make more sense to store a RID here, Yeah, we have to do that since there might not *be* an XID holding the lock. But I still think the session ID would be sufficient here. (Perhaps we don't need the PID either, although then we'd need to change pg_stat_activity to provide session id as a join key...) Yeah, the PID seems to be redundant if we add the RID. But OTOH it does no harm to leave it there - other than the xid, which gives a false sense of security. Don't know what our policy for system-catalog backwards-compatibility is, though... .) On the left-hand side (The locked object), add a RID column of type int8, containing (2^32)*sessionID + localTransactionId. I'm a bit uncomfortable with that since it renders the view completely useless if you don't have a working int8 type. Yeah, I only now realized that int8 really *is* busted if INT64_IS_BUSTED is defined. I always thought that there is some kind of emulation code in place, but apparently there isn't. :-( So there goes this idea .) To prevent the int8 from being negative, we limit the sessionID to 31 bytes - which is still more then enough. Hmm ... actually, that just begs the question of how many bits we need at all. Could we display, say, 24 bits of sessionID and 8 bits of localXID merged into a column of nominal XID type? There's a theoretical risk of false join matches but it seems pretty theoretical, and a chance match would not break any system functionality anyway since all internal operations would be working with full-width counters. Hm.. If we go down that router, we could just calculate some hash value from sessionID and localTransactionId that fits into 31 bits, and use an int4. Or 32 bits, and use xid. I am, however a bit reluctant to do this. I'd really hate to spend a few hours tracking down some locking problem, only to find out that I'd been looking at the wrong place because of some id aliasing... I know it's only a 1-in-4-billion chance, but still it gives me an uneasy feeling. What about a string representation? Something like sessionId/localTransactionId? Should we ever decide that indeed this *should* get it's own datatype, a string representation would allow for a very painless transition... greetings, Florian Pflug ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Why is there a tsquery data type?
Bruce Momjian [EMAIL PROTECTED] writes: Why does text search need a tsquery data type? I realize it needs tsvector so it can create indexes and updated trigger columns, but it seems tsquery could instead just be a simple text string. By that logic, we don't need any data types other than text. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Contrib modules documentation online
I'm very strongly in favor of having this documentation. However, I think it might make sense to put Contrib Modules as a section under either Reference or Appendices. Also, I don't think it's necessary to make each command option a separate subchapter, but I can see how that would be hard to avoid in an automated system. It's not an automated system, README files have different structures so it's all manual work. That's why I asked how you think it should be organized. Anyone else thinks we should put it in Reference or Appendixes? About command options if done different things, it depends on the module I need to revisit this. I also think one command per subchapter isn't very handy. There's also the install issue. By now it's on the introduction of the chapter. And I've repeated it in some of the modules, not all. Do you think it be better put the exact instructions for compiling and installing for each one? What about 'extra' notes, such us some performance tests, and so one. Some of the notes should probably stay in the README files, just like the README files that can be found in some dirs of core. So I'd keep information targeted to developers into the README's and general info into the main doc. Guys, would it be out of the question to do this in 8.3? Please please? I will try to have everything before 8.3. I'd like it gave very little or no work to core developers. If so many people is interested you can help me revise it before the final version. By the way, if somebody has updated any of the contrib README files recently, please send me an e-mail and I'll check if I have the last changes in. -- Albert Cervera i Areny http://www.NaN-tic.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] int8 INT64_IS_BUSTED
Florian G. Pflug [EMAIL PROTECTED] writes: I'm confused about whether int8s work on a machine on which INT64_IS_BUSTED. My reading of the code suggests that int8 will be available, but be, well, busted in such a machine. The datatype exists, but it's really only int32. For example, int8mul seems as if I'd just return the wrong answer on such a machine. Well, obviously it's gonna overflow sooner than you'd think, but it will give valid answers as long as you never try to compute a value that doesn't fit in int32; and it will correctly complain if you do. Or are platforms with INT64_IS_BUSTED no longer supported, and are all those #ifdefs only legacy code? There are people around here who think it's all useless legacy code, but I'm not prepared to agree quite yet. My position is that all the core functionality should still work if INT64_IS_BUSTED. You'll see a surprisingly limited range for bigint, and pgstat counters will overflow sooner than they otherwise would, and some other noncritical problems. But the database still works. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] text search function renaming
Oleg Bartunov [EMAIL PROTECTED] writes: I notice, that not all functions were renamed, for example, strip(), setweight(), numnode() Is this intentional ? Yeah, I thought those were OK as-is, since they only apply to tsearch-specific data types. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Contrib modules documentation online
Albert Cervera i Areny wrote: I'm very strongly in favor of having this documentation. However, I think it might make sense to put Contrib Modules as a section under either Reference or Appendices. Also, I don't think it's necessary to make each command option a separate subchapter, but I can see how that would be hard to avoid in an automated system. It's not an automated system, README files have different structures so it's all manual work. That's why I asked how you think it should be organized. Anyone else thinks we should put it in Reference or Appendixes? I would far rather have a new top level heading. Something like Standard Modules and Tools. (Please avoid the use of the word contrib). If not, than as a sub-chapter of References. I don't think it belongs in the Appendixes. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [ADMIN] reindexdb hangs
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: Yeah, an extra fetch of the pg_class row doesn't seem all that nice. I think you'd want to check it in approximately the same two places where pg_class_ownercheck() is applied (one for the 1-xact and one for the multi-xact path). Actually, the 1-xact path does not need it, because the check is already elsewhere. Yeah, but if you do it there it's one added comparison (isOtherTempNamespace is very cheap, and you can get the namespace cheaply from the already-open rel). This way you need an extra syscache lookup because you are insisting on doing the check in a place where you don't have easy access to the pg_class row. Doesn't seem better. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] int8 INT64_IS_BUSTED
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: I'm confused about whether int8s work on a machine on which INT64_IS_BUSTED. My reading of the code suggests that int8 will be available, but be, well, busted in such a machine. The datatype exists, but it's really only int32. For example, int8mul seems as if I'd just return the wrong answer on such a machine. Well, obviously it's gonna overflow sooner than you'd think, but it will give valid answers as long as you never try to compute a value that doesn't fit in int32; and it will correctly complain if you do. I still think int8mul is buggy. It calculates result as arg1 * arg2, and then checks for an overflow by dividing again, and seeing if the right answer comes out. Which sounds good. But it *skips* that check if both arguments fit into an int32 - check is (arg1 == (int64) ((int32) arg1) arg2 == (int64) ((int32) arg2)). Which for INT64_IS_BUSTED seems to be equivalent to (arg1 == arg1 arg2 == arg2), and thus the check will never fire in that case. I didn't test this though - so maybe I'm just reading it wrong. greetings, Florian Pflug ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Why is there a tsquery data type?
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Why does text search need a tsquery data type? I realize it needs tsvector so it can create indexes and updated trigger columns, but it seems tsquery could instead just be a simple text string. By that logic, we don't need any data types other than text. What is tsquery giving us that text would not? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] int8 INT64_IS_BUSTED
Florian G. Pflug [EMAIL PROTECTED] writes: I still think int8mul is buggy. It calculates result as arg1 * arg2, and then checks for an overflow by dividing again, and seeing if the right answer comes out. Which sounds good. But it *skips* that check if both arguments fit into an int32 - check is (arg1 == (int64) ((int32) arg1) arg2 == (int64) ((int32) arg2)). Good point --- we should probably #ifdef out that part for INT64_IS_BUSTED. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Representation of ResourceOwnerIds (transient XIDs) in system views (lazy xid assignment)
Florian G. Pflug [EMAIL PROTECTED] writes: What about a string representation? Something like sessionId/localTransactionId? Should we ever decide that indeed this *should* get it's own datatype, a string representation would allow for a very painless transition... Yeah, that's probably the best way. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Why is there a tsquery data type?
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: By that logic, we don't need any data types other than text. What is tsquery giving us that text would not? A preprocessed representation that can be compared to tsvector efficiently. Now, if you'd asked whether we need *both* tsvector and tsquery, that'd be a fair question. I'm not 100% clear on what the differences are, but they seem pretty subtle. Do you think that having only one datatype would be clearer? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Tom Lane wrote: One comment is that at the time we make an entry into smgr's pending-deletes list, I think we might not have acquired an XID yet --- if I understand your patch correctly, a CREATE TABLE would acquire an XID when it makes its first catalog insertion, and that happens after creating the on-disk table file. So it seems like a good idea for smgr itself to trigger acquisition of an XID before it makes a pending-deletes entry. This ensures that you can't have a situation where you have deletes to record and no XID; otherwise, an elog between smgr insertion and catalog insertion would lead to just that. Hm.. I was just going to implement this, but I'm now wondering if thats really worth it. For smgrcreate, this would catch the following case: .) CREATE something .) smgrcreate: Creates file, and puts it onto the delete-on-abort list .) We elog() *before* acquiring an XID .) RecordTransactionAbort or RecordSubTransactionAbort: We don't write an ABORT record. .) We crash *before* actually deleting the file Compare the probability of that happening (The elog *and* the crash) with the probability of .) CREATE something .) smgrcreate: Creates the file .) We crash *before* we have to chance to commit or abort. The window in which a crash causes us to leak the file seems to be much wider in the second case, yet forcing XID assignment will not help to preven it, unless I'm overlooking something. In the smgrunlink case, there is no reason at all to force XID assignment, because if we abort or crash, we don't want to unlink anyway, and if we survive until we commit, we'll assign the XID during the inevitable catalog update. The only thing the forced XID assignment would buy is to be able to stick if (TransactionIdIsValid(GetCurrentTransactionIdIfAny())) Assert(nrels == 0); into the various Record{Sub|}Transction{Commit|Abort} functions So unless I'm overlooking something, I believe for now it's best to ignore this issued, and to do a proper fix in the long run that removes *all* possible leakages. greetings, Florian Pflug ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Florian G. Pflug [EMAIL PROTECTED] writes: One comment is that at the time we make an entry into smgr's pending-deletes list, I think we might not have acquired an XID yet Hm.. I was just going to implement this, but I'm now wondering if thats really worth it. Basically what you'd give up is the ability to Assert() that there are no deletable files if there's no XID, which seems to me to be an important cross-check ... although maybe making smgr do that turns this cross-check into a tautology ... hmm. I guess the case that's bothering me is where we reach commit with deletable files and no XID. But that should probably be an error condition anyway, ie, we should error out and turn it into an abort. On the abort side we'd consider it OK to have files and no XID. Seems reasonable to me. The only way we could make this more robust is if we could have WAL-before-data rule for file *creation*, but I think that's not possible given that we don't know what relfilenode number we will use until we've successfully created a file. So there will always be windows where a crash leaks unreferenced files. There's been some debate about having crash recovery search for and delete such files, but so far I've resisted it on the grounds that it sounds like data loss waiting to happen --- someday it'll delete a file you wished it'd kept. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: One comment is that at the time we make an entry into smgr's pending-deletes list, I think we might not have acquired an XID yet Hm.. I was just going to implement this, but I'm now wondering if thats really worth it. Basically what you'd give up is the ability to Assert() that there are no deletable files if there's no XID, which seems to me to be an important cross-check ... although maybe making smgr do that turns this cross-check into a tautology ... hmm. I guess the case that's bothering me is where we reach commit with deletable files and no XID. But that should probably be an error condition anyway, ie, we should error out and turn it into an abort. On the abort side we'd consider it OK to have files and no XID. Seems reasonable to me. I've done that now, and it turned out nicely. There is an Assertion on (nrels == 0) || xid assigned in the COMMIT path, but not in the ABORT path. Seems reasonable and safe. And I'm quite tempted to not flush the XLOG at all during ABORT, and to only force synchronous commits if one of the to-be-deleted files is non-temporary. The last idea widens the leakage window quite a bit though, so I maybe I should rather resist that temptation... OTOH, it'd allow aynchronous commits for transactions that created temporary tables. The only way we could make this more robust is if we could have WAL-before-data rule for file *creation*, but I think that's not possible given that we don't know what relfilenode number we will use until we've successfully created a file. So there will always be windows where a crash leaks unreferenced files. There's been some debate about having crash recovery search for and delete such files, but so far I've resisted it on the grounds that it sounds like data loss waiting to happen --- someday it'll delete a file you wished it'd kept. It seems doable, but it's not pretty. One possible scheme would be to emit a record *after* chosing a name but *before* creating the file, and then a second record when the file is actually created successfully. Then, during replay we could remember a list of xids and filenames, and remove those files for which we either haven't seen a created successfully record, or no COMMIT record for the creating xid. With this scheme, I'd be natural to force XID assignment in smgrcreate, because we'd actually depend on logging the xid there. greetings, Florian Pflug ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Why is there a tsquery data type?
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: By that logic, we don't need any data types other than text. What is tsquery giving us that text would not? A preprocessed representation that can be compared to tsvector efficiently. Now, if you'd asked whether we need *both* tsvector and tsquery, that'd be a fair question. I'm not 100% clear on what the differences are, but they seem pretty subtle. Do you think that having only one datatype would be clearer? There is no question things would be clearer with only one text search data type. The only value I can see to having a tsquery data type is that you can store a tsquery value in a column, but why would that be much better than just storing it in a TEXT field? Internally I assume you would have to generate a tsquery structure from a TEXT string, so the idea of a query representation wouldn't go away; it would just be internal. The one thing we would lose is the ability to process the query string with a named configuration. If we always cast to TEXT, I assume we would always be using default_text_search_config, and I am a little worried about queries in triggers that have to wire-down the configuration name. As I understand it the tsquery goes through the configuration just like the tsvector. Right now you can already do: 'query' @@ 'boy girl'::tsvector and the system casts your text string to tsquery automatically. Perhaps we just need to minimize tsquery in the documentation and mention its special purpose. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Some more msvc++ 8.2.4 build feedback
In addition to the encoding issues I have with the 8.2.4 MSVC++ build (see previous email), there are some other issues that I've run into: 1. The custom build step that generates DEF files for DLLS using a perl script does not work. The perl script runs, and a slew of symbol information is output to the MSVC++ console. And a def file is created, but it always empty. If you run the perl script (gendef.pl) by hand from a DOS prompt, the DEF file is created correctly. That of course is a bit of pain since you have to do it for a number of the MSVC++ projects. 2. The build has hard-coded paths (as far as I can see) to zlib and openssl. Is there a way to a) remove them or b) change them ahead of time so you don't have to manually change 20+ project files? 3. If you have a contrib module that is not known to the build system it blows up. For example, I use PostGIS. When running build.bat it stops at postgis (sorry, I didn't write down the error message but can easily get it if needed). Could unknown contrib modules just be skipped instead? 4. I was surprised to find out that a MSVC++ build cannot open a database cluster created by a MSYS build (not sure about vice versa since I didn't checked). When running pg_ctl start a checksum error is returned. Is this expected? I got around it by doing a dump from the MSYS database and a restore into the MSVC++ database, which worked fine. Hope these comments help - and thanks for providing the build scripts. They make it *much* easier to debug postgresql and contrib modules on Windows. Out of curiosity, will the MSVC++ build become the default for 8.3 on Windows or will it remain MSYS for a bit longer? Thanks, Charlie ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] msvc++ build of 8.2.4 and encodings
Hope this is the right place for this post... I'm been trying out the msvc++ build scripts for postgresql 8.2.4 on my development laptop (using window xp pro). I noticed the sort orders of queries changed. Investigating more, encodings don't seem to be working as expected. Using a MSVC++ build: CREATE DATABASE test1 WITH ENCODING = 'utf8'; show all lc_collate;English_United States.1252 lc_ctype;English_United States.1252 lc_messages;C lc_monetary;C lc_numeric;C lc_time;C Using a MSYS build: CREATE DATABASE test1 WITH ENCODING = 'utf8'; show all lc_collate;en_US.UTF-8 lc_ctype;en_US.UTF-8 lc_messages;C lc_monetary;C lc_numeric;C lc_time;C In both cases, the database clusters were created like this: initdb ---locale=c --encoding=utf8; Note that I successfully built all the various encoding projects for the MSVC++ build and have installed them. I'd be happy to debug this a bit more if would be helpful. Thanks, Charlie smime.p7s Description: S/MIME Cryptographic Signature
Re: [HACKERS] Some more msvc++ 8.2.4 build feedback
Charlie Savage wrote: Out of curiosity, will the MSVC++ build become the default for 8.3 on Windows or will it remain MSYS for a bit longer? There is no such thing as a default build. As far as I am concerned we will support MSys into the indefinite future. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] msvc++ build of 8.2.4 and encodings
Using a MSYS build: CREATE DATABASE test1 WITH ENCODING = 'utf8'; show all lc_collate;en_US.UTF-8 lc_ctype;en_US.UTF-8 lc_messages;C lc_monetary;C lc_numeric;C lc_time;C Sorry, the above output is for Linux (Fedora Core 6). With an MSYS build on my XP laptop its: lc_collate;C lc_ctype;C lc_messages;C lc_monetary;C lc_numeric;C lc_time;C Still different than the MSVC++ build. Thanks, Charlie smime.p7s Description: S/MIME Cryptographic Signature
Re: [HACKERS] msvc++ build of 8.2.4 and encodings
Charlie Savage wrote: Hope this is the right place for this post... I'm been trying out the msvc++ build scripts for postgresql 8.2.4 on my development laptop (using window xp pro). I noticed the sort orders of queries changed. Investigating more, encodings don't seem to be working as expected. Using a MSVC++ build: CREATE DATABASE test1 WITH ENCODING = 'utf8'; show all lc_collate;English_United States.1252 lc_ctype;English_United States.1252 lc_messages;C lc_monetary;C lc_numeric;C lc_time;C Using a MSYS build: CREATE DATABASE test1 WITH ENCODING = 'utf8'; show all lc_collate;en_US.UTF-8 lc_ctype;en_US.UTF-8 lc_messages;C lc_monetary;C lc_numeric;C lc_time;C In both cases, the database clusters were created like this: initdb ---locale=c --encoding=utf8; That seems most unlikely - without the superfluous dash it should set both lc_collate and lc_ctype to C. Please try the following in both cases: initdb --no-locale --encoding=utf8 data pg_controldata data | grep LC_ If it doesn't show this: LC_COLLATE: C LC_CTYPE: C then that's a bug. Or if after that you connect to the instance and show lc_collate or show lc_ctype don't likewise show C then that's a bug. Are you by any chance loading a library that calls setlocale() ? cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Some more msvc++ 8.2.4 build feedback
Joshua D. Drake wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Andrew Dunstan wrote: Charlie Savage wrote: Out of curiosity, will the MSVC++ build become the default for 8.3 on Windows or will it remain MSYS for a bit longer? There is no such thing as a default build. As far as I am concerned we will support MSys into the indefinite future. My immediate guess is he is asking, When I download the installer. What build am I getting?. My guess is that with 8.3 it will be MSVC++. That doesn't make it the default any more than the fact that Fedora's package is built with gcc makes it a default. But I guess I see what you mean. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] msvc++ build of 8.2.4 and encodings
Hi Andrew, Thank for the reply. In both cases, the database clusters were created like this: initdb ---locale=c --encoding=utf8; That seems most unlikely - without the superfluous dash it should set both lc_collate and lc_ctype to C. Ah, sorry, that was a typo. If you actually try it: C:\WINDOWS\system32initdb ---locale=C --encoding=utf8 c:\data_msvcc3 initdb: illegal option -- -locale=C Please try the following in both cases: initdb --no-locale --encoding=utf8 data pg_controldata data | grep LC_ If it doesn't show this: LC_COLLATE: C LC_CTYPE: C then that's a bug. With MSYS build: initdb --no-locale --encoding=utf8 c:\data_msys C:\WINDOWS\system32pg_controldata c:\data_msys | grep LC_ LC_COLLATE: C LC_CTYPE: C [connect to postgres database] show lc_collate C show lc_ctype C create database test with encoding='utf8' [switch to postgres database] show lc_collate C show lc_ctype C With VC++ build: initdb --no-locale --encoding=utf8 c:\data_msvcc C:\WINDOWS\system32pg_controldata c:\data_msvcc | grep LC_ LC_COLLATE: C LC_CTYPE: C show lc_collate C show lc_ctype C create database test with encoding='utf8' [switch to postgres database] show lc_collate C show lc_ctype C Ok, so this works. And if I use --locale=C for initdb it gives the same answers. Are you by any chance loading a library that calls setlocale() ? Hmm. Its postgresql 8.2.4 + tsearch2 + tree + postgis. postgis in turn loads proj4 and geos. I grepped through those 3 libraries source code and did not find any calls to setlocale. So I don't think so. So now I'm confused - if I go back to my other cluster that I originally wrote about (created with the MSVC++ build also) and create a database it has a different lc_collate (English_United States.1252). Could this be from the dump/reload? Charlie smime.p7s Description: S/MIME Cryptographic Signature
Re: [HACKERS] Advice on MyXactMade* flags, MyLastRecPtr, pendingDeletes and lazy XID assignment
Florian G. Pflug [EMAIL PROTECTED] writes: And I'm quite tempted to not flush the XLOG at all during ABORT, and to only force synchronous commits if one of the to-be-deleted files is non-temporary. +1 on the first, but -1 on the second, because we'd have to track whether deleted files are temp or not ... it's very unclear that it'd be worth the trouble. OTOH, it'd allow aynchronous commits for transactions that created temporary tables. It'd be for xacts that *dropped* temp tables, no? I'm not sure that is a performance-critical path --- probably it more usually gets done after the client's already disconnected. The only way we could make this more robust is if we could have WAL-before-data rule for file *creation*, but I think that's not possible given that we don't know what relfilenode number we will use until we've successfully created a file. It seems doable, but it's not pretty. One possible scheme would be to emit a record *after* chosing a name but *before* creating the file, No, because the way you know the name is good is a successful open(O_CREAT). regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate