Re: [HACKERS] Patch queue triage
On 5/2/07, Tom Lane [EMAIL PROTECTED] wrote:
> * [PATCHES] HOT Patch - Ready for review /Pavan Deolasee/
> This needs a *lot* of review. Can we break it down into more manageable
> chunks? I'm not sure that anyone's got a full grasp of the implications
> of this patch, and that's a scary thought.

Sure, we can do that. I actually did that when I posted the incremental versions of the HOT patch, each version implementing the next big chunk of the code. I can reverse-engineer that again. When I do, should I just break the patch into logical pieces without worrying about whether each piece alone builds/works correctly? Or should I try to make each piece complete? I know the second would be the preferred way, but it would be more work. If it would considerably ease the review process, though, I will do it by all means. In any case, there will be dependencies among the patches. I am on leave today, so I will start on this tomorrow.

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
On Tue, 1 May 2007, Tom Lane wrote:
> * [HACKERS] tsearch_core patch for inclusion /Teodor Sigaev/
> Have we resolved whether the API and the dump/restore strategy are
> acceptable? The code needs review too, but not till we've settled on
> the basic question whether we like the feature set.

There were several discussions, and we tried to address the concerns raised. We added the ability to create FTS with one command (just create a GIN index on a text attribute), which many people like a lot, and we changed the SQL syntax to be more SQL-ish, which we like. We discussed the new FTS at two conferences in Moscow and got positive opinions. Also, we wrote the FTSBOOK ( http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ ) in English and an FTS introduction in Russian ( http://www.sai.msu.su/~megera/postgres/talks/fts_pgsql_intro.html ). In the period since the patch was announced, 3100 unique IPs have downloaded the FTSBOOK.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Heap page diagnostic functions
On 5/2/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
> I'm going to go with pgdiagnostics. We could shorten it to just pgdiag,
> but that feels too short :). We could make it pgdiagfuncs, but that's
> not much shorter than pgdiagnostics.

Just to add more confusion :-), how about pginspect?

Thanks,
Pavan

--
EnterpriseDB http://www.enterprisedb.com
[HACKERS] Patch queue triage
Hello Tom,

All that you wrote is true. plpgpsm copies-and-pastes about 30% of the plpgsql code, and that is terrible in the long run. But when can I change it? Writing a different runtime is nonsense; the better way is to refactor plpgsql and then share code with it. That is not probable in 8.4: there will be a lot of important patches carried over from 8.3, which means the interpreter will not be in core for another two years. All your recent patches I merged within one day, and in the coming months plpgpsm can keep following plpgsql. plpgpsm can remain experimental and be used to drive integration into core and the creation of a SQL procedural language API in 8.5 (with plpgsql unchanged in 8.4 and 8.5), and in 8.6 plpgsql would be modified to use this API. This road expects a stable plpgsql for the next two or three years. plpgpsm can also settle some questions about the future of plpgsql: it contains some constructs that are foreign to plpgsql, so plpgsql can stay Oracle-style forever (with David's patch, Oracle collections are possible).

The bigger problem for plpgpsm isn't the runtime but users. It needs a bigger discussion about integration into core, and that isn't possible without integration into core. The current API could later be superseded by another API. With a variable API we could drop variable substitution in SQL, and a fast SQL call could become part of SPI. But all of this needs time. From the plpgsql point of view, even a simple change of caching was a big patch. I will be happy if 8.4 contains true session variables; they can be used in the SQL languages later. I am afraid that all these steps will need a long time.

plpgpsm is ready. It is a patch without dependencies and has no influence on other parts of PostgreSQL. I am working on the documentation now; the Czech version is complete and waiting for translation to English.

Regards
Pavel Stehule

> * [PATCHES] plpgpsm /Pavel Stehule/
> I think this has to be held for 8.4: it was submitted too late for the
> 8.3 feature deadline, and in fact I don't recall that there was any
> community discussion/consensus on accepting it into core at all.
> Another big problem is that it largely copies-and-pastes the plpgsql
> code, which I think is an unacceptable maintenance burden in the long
> run. We need to consider whether we can't refactor to avoid code
> duplication.
Re: [HACKERS] Feature freeze progress report
On Tue, May 01, 2007 at 07:09:24PM -0400, Bruce Momjian wrote:
> > The current patch-queue process is failing to scale with the project:
> > every release it gets to be more work for you Tom to integrate the
> > patches. We need to think of new approaches to make the review
> > process scale. As a pointed example, you're about to go on tour for 2
> > weeks and patch review will stall while you're gone. That's not
> > sustainable.
>
> I am traveling --- I left on Friday. I am in Sydney now. As far as
> scaling, patch information isn't our problem right now. If someone
> wants to help we can give them up-to-date information on exactly how
> to deal with the patch. It is people doing the work that is the
> bottleneck.
>
> [snip]
>
> As an example, how is patch information going to help us review HOT or
> group-item-index? There is frankly more information about these in the
> archives than someone could reasonably read. What someone needs is a
> summary of where we are now on the patches, and lots of time. FYI,
> Tom, Heikki, I need one of you to post the list of patches and where
> we think we are on each one, even if the list is imperfect.

I think you just contradicted yourself. Information is not the problem, but you need more information... I think this is a fairly clear indication that we do need a better way to track this information.

//Magnus
[HACKERS] Getting parse Error at or near character and
Hi,

I am getting a parse error while I try to execute a simple SQL query in Postgres:

java.sql.SQLException: ERROR: parser: parse error at or near "and" at character 58

The query has been changed, but it is very much like the one below:

select * from emp where empName like 'X' and empId=206

When I try to execute the same query in the postgresql console it works fine, but when I try to execute it as a prepared statement from my Java code it throws the java.sql.SQLException above. Does anyone have an answer for this problem? Expecting a quick response.

Here is my code:

String getEmpId = SQLQueriesHandler.getQuery("eis.toGetEmpId");
psmt = conn.prepareStatement(getEmpId);
psmt.setString(1, empForm.getEmpName());
psmt.setInt(2, empForm.getDeptId());
System.out.println("The Query to empId with Parameters is : " + psmt.toString());
rs = psmt.executeQuery(getEmpId);

where the query will be like

select empId from empTab where empName like 'XXX' and deptId=13

In the above code, empForm is the Struts ActionForm object and the getters are the form beans. If I execute that as a plain Statement it runs well, and if I build the same query as a prepared statement in the code itself it also runs well. The only problem is when I get the query from the properties file, e.g. query.properties. The variable names have been changed, keeping the code as it is.

Please reply back soon. Thanks in advance.

Regards,
urmyfriend
Re: [HACKERS] Fwd: [PATCHES] Preliminary GSSAPI Patches
Magnus Hagander wrote:
> > > > Also, last I checked OpenSSL didn't ship with Windows and
> > > > Kerberos encryption did.
> > >
> > > How long ago did you check? I've been using OpenSSL on windows for
> > > many years. Actually, it was supported just fine on Windows back
> > > when it was added to PostgreSQL *at least*.
> >
> > I didn't say *available for download*, I said *ship with*. That is,
> > does a Windows Vista Pro box from the factory come with OpenSSL on
> > it? It does come with Microsoft SSPI, although I don't know the
> > compatibility issues.
>
> No, of course not. Microsoft OSes don't ship with *any* third party
> software. So yeah, I didn't get what you meant, and you do have a
> point there. Provided the SSPI stuff actually does gssapi encryption -
> but I'll trust the people who say it does. I've only ever used the
> authentication parts myself.

I think it's largely irrelevant though - I can't see us shipping a non-ssl enabled build now (that's not to say we wouldn't add SSPI support, of course).

/D
Re: [HACKERS] Getting parse Error at or near character and
friend 4ever wrote:
> Hi,
> I am getting a parse error while I try to execute a simple SQL query
> in Postgres.

This isn't a question for the hackers list. Try the general or jdbc lists.

--
Richard Huxton
Archonet Ltd
Re: [HACKERS] Getting parse Error at or near character and
friend 4ever wrote:
> Hi,
> I am getting a parse error while I try to execute a simple SQL query
> in Postgres:
>
> java.sql.SQLException: ERROR: parser: parse error at or near "and" at
> character 58
>
> The query has been changed, but it is very much like the one below

Oh, and the quickest way to find out what the error is would be to provide the *actual* query, not something very much like it. Perhaps turn statement logging on in your postgresql.conf if it isn't already.

--
Richard Huxton
Archonet Ltd
Re: [HACKERS] Patch queue triage
> * [PATCHES] Updateable cursors patch /FAST PostgreSQL/
> This is incomplete, and I fear at this point has to be held over to 8.4.

It is true that my original patch post said that I need to modify the patch to work with tidscan. Since then I have realized that this modification is not needed, as it would have the same result as the 'branching out from sequential scan' solution currently implemented. I was hoping that I could discuss this with whoever picks up the patch for review before doing any modifications that are needed. So in my humble opinion, it would be great if this could be considered for 8.3, as there are not many modifications needed.

P.S. Only Simon commented on my original patch. Simon, do you have time to revisit the patch so that we could discuss this?

Rgds,
Arul Shaji
Re: [HACKERS] Feature freeze progress report
Tom Lane wrote:
> Josh Berkus [EMAIL PROTECTED] writes:
> > Actually, that can happen with the current system. The real blocker
> > there is that some people, particularly Tom, work so fast that
> > there's no chance for a new reviewer to tackle the easy stuff. Maybe
> > the real solution is to encourage some of our other contributors to
> > get their feet wet with easy patches so that they can help with the
> > big ones later on?
>
> Yeah, I hear what you say. This is particularly a problem for small
> bug fixes: I tend to zing small bugs quickly, first because I enjoy
> finding/fixing them and second because I worry that they'll fall off
> the radar screen if not fixed. But I am well aware that fixing those
> sorts of issues is a great way to learn your way around the code (I
> think that's largely how I learned whatever I know about Postgres).
> I'd be more willing to stand aside and let someone else do it if I had
> confidence that issues wouldn't get forgotten. So in a roundabout way
> we come back to the idea that we need a bug tracker (NOT a patch
> tracker), plus people putting in the effort to make sure it stays a
> valid source of up-to-date info. Without the latter it won't really be
> useful.

A great way to learn would be to look at the patches in the queue and find bugs in them. There are a lot more bugs to be found in submitted patches than in PostgreSQL itself. A patch tracker would help with that.

I'm in favor of some kind of a patch tracker. It doesn't need to be too fancy, but for each patch we could have:

Patch name: Kitchen sink addition to planner
Latest patch: kitchen-sink-v123.patch, click to download
Summary: Adds a kitchen-sink node type to the planner to enable liquid queries.
Status: Will be rejected unless race conditions are fixed. Needs performance testing.
Discussions: links to mail threads, like in the current patch queue

That wouldn't necessarily help committers directly, but it'd give more visibility to the patches. That would encourage more people to review and test patches. And it'd make it clear to anyone who's interested what the status of all the patches is.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
Pavan Deolasee [EMAIL PROTECTED] writes:
> On 5/2/07, Tom Lane [EMAIL PROTECTED] wrote:
> > This needs a *lot* of review. Can we break it down into more
> > manageable chunks?
>
> Sure, we can do that. I actually did that when I posted the
> incremental versions of the HOT patch, each version implementing the
> next big chunk of the code. I can reverse-engineer that again.

Can we? I mean, sure you can break the patch up into chunks which might make it easier to read, but are any of the chunks useful alone? I suppose inserting HOT tuples without index maintenance is useful even if no changes to the space allocation are made. It won't get the space usage back, but it would save on index thrashing. But that still implies all the code to handle scans, updates, index builds, etc. Those chunks could be separated out, but you can't commit one without the others.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
Tom Lane [EMAIL PROTECTED] writes:
> Bruce Momjian [EMAIL PROTECTED] writes:
> > FYI, Tom, Heikki, I need one of you to post the list of patches and
> > where we think we are on each one, even if the list is imperfect.
>
> This message is an attempt to sort out which patch queue entries have
> no hope of getting into 8.3 (and so we shouldn't spend more time on
> them right now), which ones can get in but are waiting on their
> authors for more work, and which ones are ready for immediate review.

Thanks for this. This is exactly what we've been missing recently, I think.

> * [PATCHES] non-recursive WITH clause support /Gregory Stark/
> I think the consensus is that we should wait for a more complete WITH
> implementation to be submitted, since this one creates compatibility
> issues without any real return in the form of added functionality.

I submitted it because it filled in a missing ANSI feature, but nobody's piped up saying they missed that feature, so, sure.

> * [PATCHES] Concurrent psql v4 [WIP] /stark/
> This is waiting on the author to change the API per list discussions;
> we can't apply something that is about to have some commands removed ...

I did finish the API changes -- though I'm not too happy with the names. I was expecting the list to play the name game, so I just put in placeholder names originally. I'm adding documentation and example regression tests now. Also I'm trying to fix the cursor-mode FETCH_COUNT support, which it breaks. I'm thinking of, once the first batch of rows arrives, just going into a synchronous function to fetch the rest of the result sets.

> * Re: [HACKERS] Modifying TOAST thresholds /Tom Lane/
> At this point it seems nothing will be done about this issue for 8.3.

I'm not sure anyone has an idea how to test it. TPCC isn't really useful because it has a fixed size (500 byte) string buffer. Perhaps if we modified it to have a random string length uniformly distributed between 0-2k? But even then it never does any scans based on that buffer. And the problem with going with something more natural is that it'll be harder to tell exactly what it's testing.

> * [HACKERS] tsearch_core patch for inclusion /Teodor Sigaev/
> Have we resolved whether the API and the dump/restore strategy are
> acceptable? The code needs review too, but not till we've settled on
> the basic question whether we like the feature set.

Personally, I actually do like the idea of promoting tsearch to first-class citizen by giving it keywords and a place in the grammar. I think it's a more fully integrated interface than the function-based one. The only reason I might think otherwise would be if it were just a crutch for missing features it had exposed that would be better implemented more generically. But I don't think that's the case.

> * Re: [PATCHES] [Fwd: Deferred Transactions, Transaction Guarantee and
> COMMIT without waiting] /Simon Riggs/
> Simon is on the hook to submit an updated patch. I hope this one makes
> it in, as it looks like a really nice performance improvement for
> those who can use it; but the original patch seems overcomplicated.

I know Simon's really busy. I might be able to help with it if he wants.

> * Re: [PATCHES] LIMIT/SORT optimization /Gregory Stark/
> * [PATCHES] updated SORT/LIMIT patch /Gregory Stark/
> I will look at this. I recall being concerned about whether there
> wasn't a cleaner way to introduce the required inter-node
> communication.

The next big thing to keep in mind in this area is a unique sort, which would need to know if there's a Unique node above it with the same key. If the resulting inter-node communication arrangement has your blessing and can handle that as well, then I'll probably do that for 8.4.

Incidentally, I would prefer it if, when you want to make changes, you explain the changes to me and let me make them. It gives me a better chance to understand what the changes really are, and the motivation, than trying to read a patch later and understand why you made the changes you did. I understand sometimes it's easier to just write the code than to explain the idea to someone else and then review the resulting code, though, and there's already enough work on your plate, so if that's the case then so be it.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Heap page diagnostic functions
> > Any suggestions? pgdiagnostics?
>
> Yes, I like diagnostics, or internals. I just think forensics isn't
> going to be understood by the average native English speaker, let
> alone non-English speakers.

I think forensics is ok. The world is currently being swamped with related tv shows :-)

Andreas
Re: [HACKERS] Fwd: [PATCHES] Preliminary GSSAPI Patches
On Tue, May 01, 2007 at 04:26:13PM -0700, Henry B. Hotz wrote:
> On May 1, 2007, at 3:11 PM, Magnus Hagander wrote:
> > > > > Also, last I checked OpenSSL didn't ship with Windows and
> > > > > Kerberos encryption did.
> > > >
> > > > How long ago did you check? I've been using OpenSSL on windows
> > > > for many years. Actually, it was supported just fine on Windows
> > > > back when it was added to PostgreSQL *at least*.
> > >
> > > I didn't say *available for download*, I said *ship with*. That
> > > is, does a Windows Vista Pro box from the factory come with
> > > OpenSSL on it? It does come with Microsoft SSPI, although I don't
> > > know the compatibility issues.
> >
> > No, of course not. Microsoft OSes don't ship with *any* third party
> > software. So yeah, I didn't get what you meant, and you do have a
> > point there. Provided the SSPI stuff actually does gssapi encryption
> > - but I'll trust the people who say it does. I've only ever used the
> > authentication parts myself.
>
> The SSPI has encryption and integrity functions, just like the GSSAPI.
> I don't remember Jeffrey Altman's interop example code well enough to
> say if he demonstrates that they interoperate as well. Spending 5
> seconds looking at it, the SSPI appears to make a distinction between
> message and stream encryption that the GSSAPI does not make, so there
> is at least some profiling needed to identify what's common. I suspect
> that interoperability was intended. If we find bugs and tell the right
> people, Microsoft might even fix them someday.

Ok. Well, that's for later.

> As to the question of GSSAPI vs SSL, I would never argue we don't want
> both. Part of what made the GSSAPI encryption mods difficult was my
> intent to insert them above the SSL encryption/buffering layer. That
> way you could double-encrypt the channel. Since GSSAPI and SSL are
> (probably, not necessarily) referenced to completely different ID
> infrastructures, there are scenarios where that's beneficial.

We might want to consider restructuring how SSL works when we do that; it might make it easier. The way it is now, with #ifdefs all around, can lead to a horrible mess if there are too many different things to choose from. Something like transport filters or whatever might be a way to do it. I recall having looked at that at some point, but it was too long ago to remember any details.

//Magnus
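For what it's worth, "transport filters" could look roughly like the sketch below: a stack of read/write hooks that secure_read/secure_write walk, instead of hard-wired #ifdef USE_SSL paths. Every name in this sketch is hypothetical; nothing like it is claimed to exist in the tree.

/*
 * Illustrative sketch of stackable transport filters; all names are
 * invented.  Each layer (plain socket, SSL, GSSAPI wrap) registers its
 * own read/write functions, and the layers compose instead of being
 * selected by #ifdef.
 */
#include <stddef.h>
#include <sys/types.h>

typedef struct TransportFilter TransportFilter;

struct TransportFilter
{
	/* read/write through this layer, delegating to "below" as needed */
	ssize_t		(*tf_read) (TransportFilter *self, void *buf, size_t len);
	ssize_t		(*tf_write) (TransportFilter *self, const void *buf, size_t len);
	void		(*tf_close) (TransportFilter *self);

	TransportFilter *below;		/* next layer down; NULL = raw socket */
	void	   *private_data;	/* e.g. an SSL * or a gss_ctx_id_t */
};

/* The backend would hold the top of the stack for each connection. */
static ssize_t
secure_read(TransportFilter *top, void *buf, size_t len)
{
	return top->tf_read(top, buf, len);
}

static ssize_t
secure_write(TransportFilter *top, const void *buf, size_t len)
{
	return top->tf_write(top, buf, len);
}

A GSSAPI layer's tf_write would wrap the data and then call below->tf_write, so inserting GSSAPI encryption above the SSL layer, as described upthread, falls out naturally from the stacking.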
Re: [HACKERS] Patch queue triage
On Tue, May 01, 2007 at 10:44:38PM -0400, Tom Lane wrote:
> * [PATCHES] Preliminary GSSAPI Patches /Henry B. Hotz/
> Magnus is reviewing this one.

Check.

> * [PATCHES] Clear up strxfrm() in UTF-8 with locale on Windows
> /ITAGAKI Takahiro/
> Someone else needs to look at this; I can't test it. Magnus?

Yup, it's on my list.

//Magnus
Re: [HACKERS] Feature freeze progress report
Magnus Hagander wrote:
> > As an example, how is patch information going to help us review HOT
> > or group-item-index? There is frankly more information about these
> > in the archives than someone could reasonably read. What someone
> > needs is a summary of where we are now on the patches, and lots of
> > time. FYI, Tom, Heikki, I need one of you to post the list of
> > patches and where we think we are on each one, even if the list is
> > imperfect.
>
> I think you just contradicted yourself. Information is not the
> problem, but you need more information... I think this is a fairly
> clear indication that we do need a better way to track this
> information.

No, my point is that 100% of the information is already available by looking at the email archives. What we need is a short description of where we are on each patch --- that is a manual process, not something that can be automated. Tom has posted it --- tell me how we would get such a list in an automated manner.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Feature freeze progress report
Josh Berkus wrote:
> Bruce,
> > As an example, how is patch information going to help us review HOT
> > or group-item-index? There is frankly more information about these
> > in the archives than someone could reasonably read. What someone
> > needs is a summary of where we are now on the patches, and lots of
> > time.
>
> The idea is to provide ways for other people to help where they can
> and to provide better feedback to patch submitters so that they fix
> their own issues faster. Also, lesser PostgreSQL hackers than you
> could take on reviewing the small patches, leaving you to devote all
> of your attention to the big patches. Actually, that can happen with
> the current system. The real blocker there is that some people,
> particularly Tom, work so fast that there's no chance for a new
> reviewer to tackle the easy stuff. Maybe the real solution is to
> encourage some of our other contributors to get their feet wet with
> easy patches so that they can help with the big ones later on? That
> is, if the problem is people and not tools, then what are we doing to
> train up the people we need?

We seem to handle trivial patches just fine. The current problem is that the remaining patches require domain- or subsystem-specific knowledge to apply, e.g. XML or WAL, and those skills are available in a limited number of people. If I had the expertise in those areas, I would have applied the patches already.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Getting parse Error at or near character and
> Expecting a Quick response.

Don't hold your breath. Please post questions like this to the JDBC and/or the 'general' mailing list.

Yours,
Laurenz Albe
Re: [HACKERS] Patch queue triage
Gregory Stark wrote:
> Tom Lane [EMAIL PROTECTED] writes:
> > Bruce Momjian [EMAIL PROTECTED] writes:
> > > FYI, Tom, Heikki, I need one of you to post the list of patches
> > > and where we think we are on each one, even if the list is
> > > imperfect.
> >
> > This message is an attempt to sort out which patch queue entries
> > have no hope of getting into 8.3 (and so we shouldn't spend more
> > time on them right now), which ones can get in but are waiting on
> > their authors for more work, and which ones are ready for immediate
> > review.
>
> Thanks for this. This is exactly what we've been missing recently, I
> think.

100% agree.

> > * Re: [HACKERS] Modifying TOAST thresholds /Tom Lane/
> > At this point it seems nothing will be done about this issue for 8.3.
>
> I'm not sure anyone has an idea how to test it. TPCC isn't really
> useful because it has a fixed size (500 byte) string buffer. Perhaps
> if we modified it to have a random string length uniformly distributed
> between 0-2k? But even then it never does any scans based on that
> buffer. But the problem with going with something more natural is that
> it'll be harder to tell exactly what it's testing.

My idea on this was to create two backends, one with the default TOAST value and a second with a value of 50 bytes. Create a table with one TEXT field and several other columns, each column 50 bytes. Then fill the table with random data (a script that might help is attached), trying 2000, 1500, 1000, etc. bytes in the TEXT column for each row (use random data so the compression code doesn't shrink it). Then run a test with both backends accessing the TEXT column and a non-TEXT column, and measure the difference between the two backends; i.e. the backend with a TOAST value of 50 should show faster access on the non-TEXT field but slower access on the TEXT field.

Then, figure out where the gains on the non-TEXT field seem to diminish in usefulness. Basically, with a lower TOAST value we are going to spend more time accessing the TEXT field, but the speedup for the non-TEXT field should be a large enough win that we don't care. As the TEXT column becomes shorter, it has less effect on the non-TEXT access.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

:
ROWS=10
COLSIZE=2500

echo "
DROP TABLE test;
CREATE TABLE test (i SERIAL, t TEXT);

INSERT INTO test (t)
	SELECT array_to_string(ARRAY(
		SELECT chr(32 + (95 * random())::integer)
		FROM generate_series(1, $COLSIZE)), '')
	FROM generate_series(1, $ROWS);

SELECT pg_relation_size('test') AS heap,
       pg_total_relation_size('test') - pg_relation_size('test') AS toast;

SET log_min_duration_statement = 0;
SET client_min_messages = 'log';

SELECT t FROM test WHERE i = 234329823;
" | sql test
Re: [HACKERS] Heap page diagnostic functions
Pavan Deolasee wrote:
> On 5/2/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
> > I'm going to go with pgdiagnostics. We could shorten it to just
> > pgdiag, but that feels too short :). We could make it pgdiagfuncs,
> > but that's not much shorter than pgdiagnostics.
>
> Just to add more confusion :-), how about pginspect?

I like it.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Heap page diagnostic functions
Heikki Linnakangas wrote:
> I'm going to go with pgdiagnostics. We could shorten it to just
> pgdiag, but that feels too short :). We could make it pgdiagfuncs, but
> that's not much shorter than pgdiagnostics.

pgdiagfn? It is only 8 chars long ;).

Zdenek
Re: [HACKERS] Patch queue triage
Bruce Momjian [EMAIL PROTECTED] writes:
> Then, figure out where the gains on the non-TEXT field seem to
> diminish in usefulness. Basically, with a lower TOAST value we are
> going to spend more time accessing the TEXT field, but the speedup for
> the non-TEXT field should be a large enough win that we don't care. As
> the TEXT column becomes shorter, it has less effect on the non-TEXT
> access.

I guess the key is to break down what it is we want to measure into several parts. These can each be measured precisely for various sized TOASTed data.

Costs:

1) cost of retrieving data from a TOAST pointer versus retrieving data
   from the inline tuple. We just want the absolute time difference
   between the two operations, not the percentage difference.

2) cost of creating a TOAST pointer (ie, inserting a new tuple with a
   TOAST pointer or updating a previously inlined tuple to have a
   TOASTed column).

3) cost of deleting a TOAST pointer (ie, deleting a tuple or updating a
   tuple to no longer have a TOASTed column).

4) cost of deleting a tuple with an existing TOAST pointer (or updating
   a tuple to be all inlined) versus deleting or updating a plain
   tuple.

Savings:

1) time saved accessing a tuple without retrieving the TOAST pointer
   versus having to access the tuple with the data inlined.

2) time saved updating a tuple without modifying TOASTed data versus
   updating the same tuple with the data inlined in both versions.

The plan you described would be testing cost 1 and saving 1, but I think we need to continue to the others as well. Then the trick is to somehow make some argument about the frequency of the various operations and the acceptable tradeoff. I think you're right that the time spent accessing the data would be the most important metric.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
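Measuring cost 1 and saving 1 from the client side could be as simple as the rough libpq harness below. It is only a sketch: it assumes the test table created by Bruce's script and a database named test, and the query shapes are made up; the timings include network and parse overhead, so treat the numbers as relative only.

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <libpq-fe.h>

/* Time how long it takes to run "query" n times, returning seconds. */
static double
time_query(PGconn *conn, const char *query, int n)
{
	struct timeval start, end;
	int		i;

	gettimeofday(&start, NULL);
	for (i = 0; i < n; i++)
	{
		PGresult   *res = PQexec(conn, query);

		if (PQresultStatus(res) != PGRES_TUPLES_OK)
		{
			fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
			PQclear(res);
			exit(1);
		}
		PQclear(res);
	}
	gettimeofday(&end, NULL);
	return (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
}

int
main(void)
{
	/* connection string and table layout are assumptions, not fixed */
	PGconn	   *conn = PQconnectdb("dbname=test");

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	/* cost 1: fetch the possibly-TOASTed column */
	printf("TEXT column:     %.3f s\n",
		   time_query(conn, "SELECT t FROM test WHERE i = 1", 1000));
	/* saving 1: fetch a tuple without touching the TOAST pointer */
	printf("non-TEXT column: %.3f s\n",
		   time_query(conn, "SELECT i FROM test WHERE i = 1", 1000));

	PQfinish(conn);
	return 0;
}

Running the same binary against the two differently configured backends would give the relative numbers Bruce described.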
Re: [HACKERS] Feature freeze progress report
Bruce Momjian [EMAIL PROTECTED] writes:
> We seem to handle trivial patches just fine.

You keep saying that, but I think it's wrong. There are trivial patches that were submitted last year that are still sitting in the queue. In fact, I claim we handle complex patches better than trivial ones. HOT, LDC, DSM etc. receive tons of feedback and acquire a momentum of their own. Admittedly GII is a counter-example, though. On the other hand, trivial patches tend to interest relatively few people and have little urgency.

> The current problem is that the remaining patches require domain- or
> subsystem-specific knowledge to apply, e.g. XML or WAL, and those
> skills are available in a limited number of people. If I had the
> expertise in those areas, I would have applied the patches already.

Well, I claim it's often the trivial patches that require the domain-specific knowledge you describe. If they were major patches they would touch more parts of the system. But that means they should be easy to commit if you could just fill in the missing knowledge. Could you pick a non-committer with the domain-specific knowledge you think a patch needs and ask for their analysis of the patch, then commit it yourself? You can still review it for general code quality and trust the non-committer's review of whether the domain-specific change is correct.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Heap page diagnostic functions
Tom Lane [EMAIL PROTECTED] writes:
> Hmm ... the Oxford English Dictionary defines forensic as "pertaining
> to, connected with, or used in courts of law". There are also some
> senses related to argumentation, but nothing specifically about
> evidence analysis, whether after-the-fact or not. So yeah, it doesn't
> seem like a good name for these functions anyhow.

Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
> I think forensics is ok. The world is currently being swamped with
> related tv shows :-)

Indeed, it's the only proposal so far with a connotation of the right level of detail that the functions give about the data. Police forensics laboratories spend their time picking up fibres and laboriously fingerprinting every surface, which is about what it feels like to dig through every tuple of a heap page.

Incidentally, Tom, were you consulting the dead-tree edition of the OED? In that case you would naturally not have seen this in the 1993 additions:

  ADDITIONS SERIES 1993
  forensic, a. and n. Add: [B.] 2. ellipt. use of the adj. A forensic
  science department, laboratory, etc. colloq.
  1963 Guardian 2 Sept. 8/5 When a police officer hisses in my ear in
  court, `Are you from forensic?' I no longer protest. I just weakly
  nod my head.
  1971 W. J. BURLEY Guilt Edged iv. 74 A breakdown truck is taking it
  to Division. Forensic can look at it in their garage.
  1991 J. NEEL Death of Partner vi. 83 `Forensic rang,' Bruce reported
  dourly. `The autopsy report is on its way.'

Though it seems like a poor definition. My English teachers would have chided me for using the word in its own definition...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Feature freeze progress report
Gregory Stark wrote:
> Bruce Momjian [EMAIL PROTECTED] writes:
> > We seem to handle trivial patches just fine.
>
> You keep saying that, but I think it's wrong. There are trivial
> patches that were submitted last year that are still sitting in the
> queue.

You seem to be looking at something different than me. Which patches?

> In fact, I claim we handle complex patches better than trivial ones.
> HOT, LDC, DSM etc. receive tons of feedback and acquire a momentum of
> their own. Admittedly GII is a counter-example, though.
>
> Well, I claim it's often the trivial patches that require the
> domain-specific knowledge you describe. If they were major patches
> they would touch more parts of the system. But that means they should
> be easy to commit if you could just fill in the missing knowledge.
> Could you pick a non-committer with the domain-specific knowledge you
> think a patch needs and ask for their analysis of the patch, then
> commit it yourself? You can still review it for general code quality
> and trust the non-committer's review of whether the domain-specific
> change is correct.

We are already pushing out patches to people with domain-specific knowledge. Tom posted that summary today.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Patch queue triage
Another complexity is testing cases where the table fits in shared memory/RAM and those where it doesn't, and testing cases where the heap fits in RAM only because we pushed things to TOAST, and cases where we have to do lots of TOAST access that doesn't fit in RAM. I wonder if it is even worth testing, because pushing more to TOAST probably means the more frequently accessed data is in RAM. Anyway, who is going to run these tests?

---------------------------------------------------------------------------

Gregory Stark wrote:
> I guess the key is to break down what it is we want to measure into
> several parts. These can each be measured precisely for various sized
> TOASTed data.
>
> [costs and savings list snipped; see upthread]
>
> The plan you described would be testing cost 1 and saving 1, but I
> think we need to continue to the others as well. Then the trick is to
> somehow make some argument about the frequency of the various
> operations and the acceptable tradeoff. I think you're right that the
> time spent accessing the data would be the most important metric.

--
Bruce Momjian [EMAIL PROTECTED] http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Patch queue triage
Tom Lane wrote:
> Bruce Momjian [EMAIL PROTECTED] writes:
> > FYI, Tom, Heikki, I need one of you to post the list of patches and
> > where we think we are on each one, even if the list is imperfect.
>
> This message is an attempt to sort out which patch queue entries have
> no hope of getting into 8.3 (and so we shouldn't spend more time on
> them right now), which ones can get in but are waiting on their
> authors for more work, and which ones are ready for immediate review.

Excellent list. I have a little time available now, so I'll work on some, starting with trying to complete arrays of complex types.

cheers

andrew
Re: [HACKERS] Feature freeze progress report
Naz Gassiep wrote:
> Andrew Dunstan wrote:
> > Naz Gassiep wrote:
> > > I believe the suggestion was to have an automated process that
> > > only ran on known, sane patches.
> >
> > How do we know in advance of reviewing them that they are sane?
>
> Same way as happens now.

The question was rhetorical ... there is no list of certified-sane but unapplied patches. You are proceeding on the basis of a faulty understanding of how our processes work.

cheers

andrew
Re: [HACKERS] Feature freeze progress report
Tom Lane wrote:
> So in a roundabout way we come back to the idea that we need a bug
> tracker (NOT a patch tracker), plus people putting in the effort to
> make sure it stays a valid source of up-to-date info. Without the
> latter it won't really be useful.

Hallelujah, Brother!

BTW, a bug tracker can be used as a patch tracker, although the reverse isn't true. For example, the BZ people use BZ that way; in fact, most patches arrive as attachments to bugs. And trackers can be used just as well for tracking features as for bugs.

cheers

andrew
Re: [HACKERS] strange buildfarm failures
Tom Lane wrote:
> I wrote:
> > Hmm ... I was about to say that the postmaster never sets
> > PG_exception_stack, but maybe an error out of a PG_TRY/PG_RE_THROW
> > could do it? Does the postmaster ever execute PG_TRY?
>
> Doh, I bet that's it, and it's not the postmaster that's at issue but
> PG_TRY blocks executed during subprocess startup. Inheritance of a
> PG_exception_stack setting from the postmaster could only happen if
> the postmaster were to fork() within a PG_TRY block, which I think we
> can safely say it doesn't. But suppose we get an elog(ERROR) inside a
> PG_TRY block when there is no outermost longjmp catcher. elog.c will
> think it should longjmp, and that will eventually lead to executing
>
> #define PG_RE_THROW() \
> 	siglongjmp(*PG_exception_stack, 1)
>
> with PG_exception_stack = NULL, which seems entirely likely to cause a
> stack smash of gruesome dimensions. What's more, nothing would have
> been printed to the postmaster log beforehand, agreeing with
> observation.

I agree that that would be a bug and we should fix it, but I don't think it explains the problem we're seeing, because there is no PG_TRY block in the autovac startup code that I can see :-(

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
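For reference, one shape the fix could take is sketched below: route PG_RE_THROW through an out-of-line function that refuses to longjmp through a NULL pointer. This is only an assumed sketch, not the committed change; the function name is invented, and the errordata bookkeeping assumes the code lives in elog.c, where errordata and errordata_stack_depth are visible.

/* in elog.h -- replace the bare siglongjmp with a checked call */
#define PG_RE_THROW() \
	pg_re_throw()

/* in elog.c */
void
pg_re_throw(void)
{
	/* If possible, throw the error to the next outer setjmp handler */
	if (PG_exception_stack != NULL)
		siglongjmp(*PG_exception_stack, 1);
	else
	{
		/*
		 * No handler: an elog(ERROR) escaped from inside a PG_TRY block
		 * with no outer sigsetjmp.  Promote it to FATAL so that it gets
		 * reported and the process exits cleanly, instead of
		 * longjmp'ing through a NULL pointer and smashing the stack.
		 */
		ErrorData  *edata = &errordata[errordata_stack_depth];

		edata->elevel = FATAL;
		errfinish(0);
	}

	/* Doesn't return, but tell the compiler that */
	abort();
}

With something like this in place, the failure mode Tom describes would at least leave a readable FATAL message in the log instead of a silent crash.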
[HACKERS] Sequential scans
Hi,

I'm starting to review the synchronized scans and scan-resistant buffer cache patches. The patches have complex interactions, so I'm taking a holistic approach. There are four outstanding issues with the sync scans in particular:

1. The simplistic hash approach. While it's nice to not have a lock, I'm worried about collisions. If you had a collision every now and then, it wouldn't be that bad, but because the hash value is computed from the oid, a collision would be persistent. If you create a database and happen to have two frequently seqscanned tables that collide, the only way to get rid of the collision is to drop and recreate a table. Granted, that'd probably be very rare in practice, but when it happens it would be next to impossible to figure out what's going on. Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash. (A sketch of this appears after this mail.)

2. Under what circumstances does the patch help, and when does it hurt? I think the patch is safe in that it should never be any worse than what we have now. But when does it help? That needs to be looked at together with the other patch. I need to dig through the archives for the performance test results you posted earlier and try to understand them. There are six distinct scenarios I've come up with so far that need to be looked at:

A. A seq scan on a small table
B. A seq scan on a table that's 110% the size of shared_buffers, but smaller than RAM
C. A seq scan on a table that's 110% the size of RAM
D. A seq scan on a huge table
E. Two simultaneous seq scans on a large table starting at the same time
F. Two simultaneous seq scans on a large table, the 2nd one starting when the 1st one is halfway through

Also, does it change things if you have a bulk update instead of a read-only query? How about bitmap heap scans and large index scans? And vacuums? And the above scenarios need to be considered both alone and in the presence of other OLTP kinds of workload. I realize that we can't have everything, and as long as we get some significant benefit in some scenarios and don't hurt others, the patch is worthwhile. But let's try to cover as much as we reasonably can.

One random idea I had to cover B and C without having the offset variable: start scanning *backwards* from the page that's in the shared hash table, until you hit a page that's not in the buffer cache. Then continue scanning forwards from the page you started from. This needs more thought, but I think we can come up with a pretty simple solution that covers the most common cases.

3. By having different backends doing the reads, are we destroying OS readahead, as Tom suggested? I remember you performed some tests on that, and it was a problem on some systems but not on others. This needs some thought; there may be some simple way to address it.

4. It fails regression tests. You get an assertion failure on the portal test. I believe that changing the direction of a scan isn't handled properly; it's probably pretty easy to fix.

Jeff, could you please fix 1 and 4? I'll give 2 and 3 some more thought, and take a closer look at the scan-resistant scans patch.

Any comments and ideas are welcome, of course.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
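A minimal sketch of the locking hash-table idea from point 1, using the existing shmem hash utilities. All names here (ScanLocEntry, ss_report_location, SyncScanLock) are invented for illustration and make no claim to match the actual patch; SyncScanLock is assumed to be an lwlock added for this purpose.

/*
 * Illustrative sketch of a shared scan-location table keyed by the
 * full relfilenode, protected by a lock.
 */
#include "postgres.h"
#include "storage/block.h"
#include "storage/lwlock.h"
#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "utils/hsearch.h"

typedef struct ScanLocEntry
{
	RelFileNode key;			/* hash key: which relation */
	BlockNumber location;		/* last block reported by a scanner */
} ScanLocEntry;

static HTAB *ScanLocTable;

/* Called once from shared-memory initialization. */
void
SyncScanShmemInit(void)
{
	HASHCTL		info;

	memset(&info, 0, sizeof(info));
	info.keysize = sizeof(RelFileNode);
	info.entrysize = sizeof(ScanLocEntry);
	info.hash = tag_hash;

	ScanLocTable = ShmemInitHash("Sync Scan Locations",
								 64, 64,	/* assumed init and max sizes */
								 &info,
								 HASH_ELEM | HASH_FUNCTION);
}

/*
 * Report scan progress.  Call this only every k pages, or only when
 * the ReadBuffer call actually performed physical I/O, so that lock
 * traffic stays negligible.
 */
void
ss_report_location(RelFileNode rnode, BlockNumber blk)
{
	ScanLocEntry *entry;

	LWLockAcquire(SyncScanLock, LW_EXCLUSIVE);
	entry = (ScanLocEntry *) hash_search(ScanLocTable, &rnode,
										 HASH_ENTER, NULL);
	entry->location = blk;
	LWLockRelease(SyncScanLock);
}

Because the table is a real hash keyed on the full RelFileNode, two tables can never permanently shadow each other the way the lockless modulo-on-oid scheme allows.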
Re: [HACKERS] Feature freeze progress report
Andrew Dunstan wrote:
> Tom Lane wrote:
> > So in a roundabout way we come back to the idea that we need a bug
> > tracker (NOT a patch tracker), plus people putting in the effort to
> > make sure it stays a valid source of up-to-date info. Without the
> > latter it won't really be useful.
>
> Hallelujah, Brother!

Amen.

> BTW, a bug tracker can be used as a patch tracker, although the
> reverse isn't true. For example, the BZ people use BZ that way; in
> fact, most patches arrive as attachments to bugs. And trackers can be
> used just as well for tracking features as for bugs.

The pidgin (previously known as Gaim) guys also use it that way. They add a bug for each thing they want to change, even new features, and track the patches in there. Then they have a list of issues that should be solved for each release, so it's easy to see which ones are still missing using their Trac interface: http://developer.pidgin.im/roadmap

So the status email that Tom sent yesterday would be a very simple thing to generate, just by looking at the bugs-to-fix page. I'm not saying we should use Trac, mainly because I hate how it (doesn't) interact with email. But it does show that a bug tracker can be useful to us.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera [EMAIL PROTECTED] writes:
> I agree that that would be a bug and we should fix it, but I don't
> think it explains the problem we're seeing, because there is no PG_TRY
> block in the autovac startup code that I can see :-(

I'm wondering if there is some code path that invokes a PG_TRY deep in the bowels of the system. Anyway, I'll go fix this, and we should know soon enough if it changes the buildfarm behavior.

			regards, tom lane
Re: [HACKERS] Feature freeze progress report
[EMAIL PROTECTED] (Andrew Dunstan) writes:
> Tom Lane wrote:
> > So in a roundabout way we come back to the idea that we need a bug
> > tracker (NOT a patch tracker), plus people putting in the effort to
> > make sure it stays a valid source of up-to-date info. Without the
> > latter it won't really be useful.
>
> Hallelujah, Brother!
>
> BTW, a bug tracker can be used as a patch tracker, although the
> reverse isn't true. For example, the BZ people use BZ that way; in
> fact, most patches arrive as attachments to bugs. And trackers can be
> used just as well for tracking features as for bugs.

Well, Command Prompt set up a Bugzilla instance specifically so people could track PG bugs. If only someone took an interest and started using it...

--
cbbrowne,@,linuxfinances.info http://cbbrowne.com/info/lisp.html
Do Roman paramedics refer to IV's as "4's"?
Re: [HACKERS] strange buildfarm failures
Tom Lane wrote:
> Alvaro Herrera [EMAIL PROTECTED] writes:
> > I agree that that would be a bug and we should fix it, but I don't
> > think it explains the problem we're seeing, because there is no
> > PG_TRY block in the autovac startup code that I can see :-(
>
> I'm wondering if there is some code path that invokes a PG_TRY deep in
> the bowels of the system.

Well, I checked all the bowels involved in autovacuum startup.

> Anyway, I'll go fix this, and we should know soon enough if it changes
> the buildfarm behavior.

Agreed.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Patch queue triage
Tom Lane wrote:
> * [pgsql-patches] Ctid chain following enhancement /Pavan Deolasee/
> I'm not very excited about this --- it seems to me to complicate the
> code in some places that are not in fact performance-critical. While
> it doesn't seem likely to break things, I'm not in favor of reducing
> code readability unless a measurable performance improvement can be
> shown. Can we have some tests showing this is worthwhile?

IIRC this patch was originally part of an old HOT patch, and it was submitted as a separate patch because it has some benefit on its own but, more importantly, because getting it applied first would make the HOT patch slightly smaller. I'm not sure the latest HOT patch still requires or includes this change. If not, we should drop this. If it does, then let's deal with this before attacking the hard core of HOT.

> * [HACKERS] Grouped Index Tuples /Heikki Linnakangas/
> * [HACKERS] Grouped Index Tuples / Clustered Indexes /Heikki Linnakangas/
> Needs review. I'm not sure how many people besides Heikki have really
> looked at this (I know I haven't).

The patch is ugly as it is. We need API changes to make it less ugly; I had hoped to discuss and reach consensus on them well before feature freeze, and that's what the indexam API proposal and Stream bitmaps threads in the patch queue are all about. But those discussions and patches stalled, so the clustered index patch is still in the same ugly state. I'm afraid we're getting past due on clustered indexes. The patch isn't ready for committing as it is, and we still don't have agreement on the API changes, or even on the design in general. :(

> * [HACKERS] Stream bitmaps /Heikki Linnakangas/
> I think this is on hold unless a finished bitmap-index patch shows up;
> however there was some discussion of using this in support of
> clustered indexes, so maybe it's live anyway? Heikki?

This particular thread is closely related to bitmap indexes. But see the next item:

> * Re: [HACKERS] [PATCHES] Bitmapscan changes /Heikki Linnakangas/
> I had objected to this on the grounds that it seemed to be covering
> only a narrow territory between HOT and bitmap indexes, but given the
> probability that one or even both of those won't make it, we need to
> give this one a second look. So: live, needs review.

Are you talking about the patch I submitted at the beginning of that thread? Because the mail in the patch queue is actually about whether or not we want clustered indexes. I think the original bitmapscan changes patch I submitted is live and needs review, even if clustered indexes and bitmap indexes are rejected. It should give some performance benefit when you do bitmap ANDs with partially lossy bitmaps, and from setting bits directly in the bitmap in the indexam in one call, instead of calling amgetmulti many times. Though I never measured that. (A sketch of what such an interface might look like follows this mail.)

> * [HACKERS] Indexam interface proposal /Heikki Linnakangas/
> AFAICS this discussion died out with no actual patch submitted.

This is part of clustered indexes. It was my proposal of what the indexam API changes would be like. This patch is either live or dead together with clustered indexes.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
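To make the "one call" point concrete, the interface could look something like the sketch below. index_getbitmap and index_next_tid are invented names (the submitted patch may define things quite differently); tbm_add_tuples is the existing nodes/tidbitmap.c entry point.

/*
 * Illustrative sketch only: a hypothetical indexam entry point that
 * fills a TIDBitmap in one call, instead of the amgetmulti loop that
 * hands back batches of ItemPointers for the executor to insert.
 */
#include "postgres.h"
#include "access/relscan.h"
#include "nodes/tidbitmap.h"
#include "storage/itemptr.h"

/* hypothetical AM-internal "give me the next matching TID" step */
extern bool index_next_tid(IndexScanDesc scan, ItemPointer tid);

int64
index_getbitmap(IndexScanDesc scan, TIDBitmap *tbm)
{
	int64		ntids = 0;

	/*
	 * Walk the index, adding matching heap TIDs straight into the
	 * caller's bitmap; no per-batch copying through the executor.
	 */
	for (;;)
	{
		ItemPointerData tid;

		if (!index_next_tid(scan, &tid))
			break;

		tbm_add_tuples(tbm, &tid, 1);
		ntids++;
	}

	return ntids;
}

The win over amgetmulti is that the AM controls the loop, so it can insert whole lossy page ranges or pre-sorted TID runs into the bitmap without materializing them as ItemPointer arrays first.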
Re: [HACKERS] Patch queue triage
http://www.sigaev.ru/misc/tsearch_core-0.46.gz

The patch is synced with current CVS HEAD and with the bugfixes in contrib/tsearch2.

--
Teodor Sigaev E-mail: [EMAIL PROTECTED]
WWW: http://www.sigaev.ru/
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera wrote:
> Tom Lane wrote:
> > Alvaro Herrera [EMAIL PROTECTED] writes:
> > > I agree that that would be a bug and we should fix it, but I don't
> > > think it explains the problem we're seeing, because there is no
> > > PG_TRY block in the autovac startup code that I can see :-(
> >
> > I'm wondering if there is some code path that invokes a PG_TRY deep
> > in the bowels of the system.
>
> Well, I checked all the bowels involved in autovacuum startup.

Huh, hang on ... there is one caller, which is to set client_encoding (call_string_assign_hook uses a PG_TRY block), but it is called *after* the sigsetjmp block is set up -- in InitPostgres.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > I'm wondering if there is some code path that invokes a PG_TRY deep
> > in the bowels of the system.
>
> Huh, hang on ... there is one caller, which is to set client_encoding
> (call_string_assign_hook uses a PG_TRY block), but it is called
> *after* the sigsetjmp block is set up -- in InitPostgres.

While testing the PG_RE_THROW problem I noted that what I get here is a SIGSEGV crash, rather than the SIGABRT seen on Stefan's machines, so that's another hint that this may be unrelated. Still, it's clearly at risk of causing a problem as more PG_TRYs get added to the code, so I'm going to fix it anyway.

			regards, tom lane
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera wrote:
> Stefan Kaltenbrunner wrote:
> > well - i now have a core file, but it does not seem to be worth
> > much, except to prove that autovacuum seems to be the culprit:
> >
> > Core was generated by `postgres: autovacuum worker process '.
> > Program terminated with signal 6, Aborted.
> > [...]
> > #0 0x0ed9 in ?? ()
> > warning: GDB can't find the start of the function at 0xed9.

I just noticed an ugly bug in the worker code which I'm fixing. I think this one would also throw SIGSEGV, not SIGABRT.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Stefan Kaltenbrunner wrote:
> > > well - i now have a core file, but it does not seem to be worth
> > > much, except to prove that autovacuum seems to be the culprit:
> > >
> > > Core was generated by `postgres: autovacuum worker process '.
> > > Program terminated with signal 6, Aborted.
> > > [...]
> > > #0 0x0ed9 in ?? ()
> > > warning: GDB can't find the start of the function at 0xed9.
>
> I just noticed an ugly bug in the worker code which I'm fixing. I
> think this one would also throw SIGSEGV, not SIGABRT.

Nailed it -- this is the actual bug that causes the abort. But I am surprised that it doesn't print the error message on Stefan's machine; here it outputs:

TRAP: FailedAssertion("!(((unsigned long) (elem)) > ((unsigned long) ShmemBase))", File: "/pgsql/source/00head/src/backend/storage/ipc/shmqueue.c", Line: 107)
16496 2007-05-02 11:30:31 CLT DEBUG:  server process (PID 16540) was terminated by signal 6: Aborted
16496 2007-05-02 11:30:31 CLT LOG:  server process (PID 16540) was terminated by signal 6: Aborted
16496 2007-05-02 11:30:31 CLT LOG:  terminating any other active server processes
16496 2007-05-02 11:30:31 CLT DEBUG:  sending SIGQUIT to process 16541
16496 2007-05-02 11:30:31 CLT DEBUG:  sending SIGQUIT to process 16498
16496 2007-05-02 11:30:31 CLT DEBUG:  sending SIGQUIT to process 16500
16496 2007-05-02 11:30:31 CLT DEBUG:  sending SIGQUIT to process 16499
16541 2007-05-02 11:30:33 CLT WARNING:  terminating connection because of crash of another server process

Maybe stderr is going somewhere else? That would be strange, I think. I'll commit the fix shortly; attached.

--
Alvaro Herrera                  http://www.flickr.com/photos/alvherre/
"The first law of live demonstrations is: don't try to use the system.
Write a script that touches nothing, so as to cause no damage." (Jakob Nielsen)

Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.42
diff -c -p -r1.42 autovacuum.c
*** src/backend/postmaster/autovacuum.c	18 Apr 2007 16:44:18 -0000	1.42
--- src/backend/postmaster/autovacuum.c	2 May 2007 15:25:27 -0000
*************** AutoVacWorkerMain(int argc, char *argv[]
*** 1407,1431 ****
  	 * Get the info about the database we're going to work on.
  	 */
  	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
! 	MyWorkerInfo = (WorkerInfo) MAKE_PTR(AutoVacuumShmem->av_startingWorker);
! 	dbid = MyWorkerInfo->wi_dboid;
! 	MyWorkerInfo->wi_workerpid = MyProcPid;
!
! 	/* insert into the running list */
! 	SHMQueueInsertBefore(&AutoVacuumShmem->av_runningWorkers,
! 						 &MyWorkerInfo->wi_links);
  	/*
! 	 * remove from the starting pointer, so that the launcher can start a new
! 	 * worker if required
  	 */
! 	AutoVacuumShmem->av_startingWorker = INVALID_OFFSET;
! 	LWLockRelease(AutovacuumLock);
! 	on_shmem_exit(FreeWorkerInfo, 0);
! 	/* wake up the launcher */
! 	if (AutoVacuumShmem->av_launcherpid != 0)
! 		kill(AutoVacuumShmem->av_launcherpid, SIGUSR1);

  	if (OidIsValid(dbid))
  	{
--- 1407,1442 ----
  	 * Get the info about the database we're going to work on.
  	 */
  	LWLockAcquire(AutovacuumLock, LW_EXCLUSIVE);
! 	/*
! 	 * beware of startingWorker being INVALID; this could happen if the
! 	 * launcher thinks we're taking too long to start.
! 	 */
! 	if (AutoVacuumShmem->av_startingWorker != INVALID_OFFSET)
! 	{
! 		MyWorkerInfo = (WorkerInfo) MAKE_PTR(AutoVacuumShmem->av_startingWorker);
! 		dbid = MyWorkerInfo->wi_dboid;
! 		MyWorkerInfo->wi_workerpid = MyProcPid;
!
! 		/* insert into the running list */
! 		SHMQueueInsertBefore(&AutoVacuumShmem->av_runningWorkers,
! 							 &MyWorkerInfo->wi_links);
! 		/*
! 		 * remove from the starting pointer, so that the launcher can start
! 		 * a new worker if required
! 		 */
! 		AutoVacuumShmem->av_startingWorker = INVALID_OFFSET;
! 		LWLockRelease(AutovacuumLock);
! 		on_shmem_exit(FreeWorkerInfo, 0);
! 		/* wake up the launcher */
! 		if (AutoVacuumShmem->av_launcherpid != 0)
! 			kill(AutoVacuumShmem->av_launcherpid, SIGUSR1);
! 	}
! 	else
! 		/* no worker entry for me, go away */
! 		LWLockRelease(AutovacuumLock);

  	if (OidIsValid(dbid))
  	{
*************** AutoVacWorkerMain(int argc, char *argv[]
*** 1466,1473 ****
  	}

  	/*
! 	 * FIXME -- we need to notify the launcher when we are gone. But this
! 	 * should be done after our PGPROC is released, in ProcKill.
  	 */

  	/* All done, go away */
--- 1477,1484 ----
  	}

  	/*
! 	 * The launcher will be notified of my death in ProcKill, *if* we
! 	 * managed to get a worker slot at all.
  	 */

  	/* All done, go away */
Re: [HACKERS] strange buildfarm failures
Alvaro Herrera [EMAIL PROTECTED] writes: Nailed it -- this is the actual bug that causes the abort. But I am surprised that it doesn't print the error message on Stefan's machine.

Hm, maybe we need an fflush(stderr) in ExceptionalCondition?

regards, tom lane
Re: [HACKERS] Fwd: [PATCHES] Preliminary GSSAPI Patches
On May 2, 2007, at 3:11 AM, Magnus Hagander wrote:

As to the question of GSSAPI vs SSL, I would never argue we don't want both. Part of what made the GSSAPI encryption mods difficult was my intent to insert them above the SSL encryption/buffering layer. That way you could double-encrypt the channel. Since GSSAPI and SSL are (probably, though not necessarily) tied to completely different identity infrastructures, there are scenarios where that's beneficial.

We might want to consider restructuring how SSL works when we do; that might make it easier. The way it is now, with #ifdefs all around, can lead to a horrible mess if there are too many different things to choose from. Something like transport filters or whatever might be a way to do it. I recall having looked at that at some point, but it was too long ago to remember any details.

//Magnus

If someone wants to make it easier, that would be nice. I'm not up for it, I don't think.

The opinions expressed in this message are mine, not those of Caltech, JPL, NASA, or the US Government. [EMAIL PROTECTED], or [EMAIL PROTECTED]
[HACKERS] Creating new system catalog problems.
Hello. I wanted to create a new system catalog in Postgres, so I changed the source code. Everything seemed to be OK and it compiled, but now initdb fails:

$ initdb -D /usr/home/postgres/post1
The files belonging to this database system will be owned by user postgres.
This user must also own the server process.
The database cluster will be initialized with locale ru_RU.KOI8-R.
The default database encoding has accordingly been set to KOI8.
creating directory /usr/home/postgres/post1 ... ok
creating subdirectories ... ok
selecting default max_connections ... 40
selecting default shared_buffers/max_fsm_pages ... 28MB/179200
creating configuration files ... ok
creating template1 database in /usr/home/postgres/post1/base/1 ... FATAL: could not open relation with OID 2617
child process exited with exit code 1
initdb: removing data directory /usr/home/postgres/post1

I can't understand the reason. Can somebody help me?

Akmal.
Re: [HACKERS] Feature freeze progress report
On Wed, 2007-02-05 at 08:27 -0400, Andrew Dunstan wrote:
Naz Gassiep wrote:
Andrew Dunstan wrote:
Naz Gassiep wrote: I believe the suggestion was to have an automated process that only ran on known, sane patches.
How do we know in advance of reviewing them that they are sane?
Same way as happens now.
The question was rhetorical ... there is no list of certified sane but unapplied patches. You are proceeding on the basis of a faulty understanding of how our processes work.

Why do we need to know the patch is sane? If it does not apply cleanly or causes regression tests to fail, the process would figure that out quickly and cheaply. There is little cost in attempting to apply a non-sane patch.

I am not sure that I have explained exactly what I was suggesting. Some people seem to grok it; others seem to be talking about something slightly different. To clarify, here it is in pseudo-code:

for each patch in the queue
    regression_success := false
    patch_success := attempt to apply patch to head
    if patch_success
        regression_success := attempt to run regression tests
        -- (on one machine only, not on the buildfarm)
    end if
    if this is a new patch
        maybe mail the author and tell them patch_success and regression_success
    else if status is different from last time
        mail the author and tell them their patch has changed status
    end
    record the status for this patch
end loop

__ Marc
[HACKERS] autovacuum starvation
Hi,

The recently discovered autovacuum bug made me notice something that is possibly critical. The current autovacuum code makes an effort not to leave workers in a starting state for too long, lest there be failure to timely tend all databases needing vacuum. This is how the launching of workers works:

1) the launcher puts a pointer to a WorkerInfo entry in shared memory, called the starting worker pointer
2) the launcher sends a signal to the postmaster
3) the postmaster forks a worker
4) the new worker checks the starting worker pointer
5) the new worker resets the starting worker pointer
6) the new worker connects to the given database and vacuums it

The problem is this: I originally added some code in the autovacuum launcher to check that a worker does not take too long to start. This is autovacuum_naptime seconds. If this happens, the launcher resets the starting worker pointer, which means that the newly starting worker will not see anything that needs to be done and exit quickly. The problem with this is that on a high-load machine, for example lionfish during buildfarm runs, this would cause autovacuum starvation for the period in which the high load is sustained. This could prove dangerous.

The problem is that things like fork() failure cannot be communicated back to the launcher. So when the postmaster tries to start a process and it fails for some reason (failure to fork, or out of memory), we need a way to re-initiate the worker that failed. The current code resets the starting worker pointer, and leaves the slot free for another worker, maybe in another database, to start.

I recently added code to resend the postmaster signal when the launcher sees that the starting worker pointer is not invalid -- step 2 above. I think this is fine, but:

1) we should remove the logic that resets the starting worker pointer. It is not needed, because database-local failures will be handled by subsequent checks
2) we should keep the logic that resends the postmaster signal, but we should make an effort to avoid sending it too frequently (see the sketch below)

Opinions? If I haven't stated the problem clearly, please let me know and I'll try to rephrase.

-- Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
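For illustration, a minimal sketch of what a rate-limited resend could look like on the launcher side. This is not Alvaro's code: the helper name, the interval constant, and the last_signal_time state are all assumptions made up for this example; only the SIGUSR1-to-postmaster convention comes from the thread.

/*
 * Hypothetical launcher-side helper: if a starting worker still hasn't
 * claimed its WorkerInfo slot, ask the postmaster to fork one again,
 * but never more than once per AUTOVAC_RESEND_INTERVAL seconds.
 */
#include <sys/types.h>
#include <signal.h>
#include <stdbool.h>
#include <time.h>

#define AUTOVAC_RESEND_INTERVAL 10      /* seconds; arbitrary for this sketch */

static time_t last_signal_time = 0;

static void
maybe_resend_worker_signal(pid_t postmaster_pid, bool starting_worker_pending)
{
    time_t      now = time(NULL);

    if (!starting_worker_pending)
        return;                 /* nothing pending, nothing to resend */

    if (now - last_signal_time < AUTOVAC_RESEND_INTERVAL)
        return;                 /* signalled recently; avoid a signal storm */

    kill(postmaster_pid, SIGUSR1);      /* re-request the worker fork */
    last_signal_time = now;
}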
[HACKERS] conversion_procs makefiles
I noticed that building conversion_procs is sadly single-tasked. I am wondering if it would be acceptable to rework Makefile.shlib to have an option that allows building multiple libs, by creating a rule to collect the libraries to build and having each conversion_proc Makefile add a target to it. Then the whole lot of libraries would be built in a single non-recursive make pass, instead of the current recursive approach. I don't have a patch yet, but I've been playing with it a bit. I don't think I'd have time for 8.3 though. Objections?

-- Alvaro Herrera  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Feature freeze progress report
On Wed, May 02, 2007 at 06:44:12AM -0400, Bruce Momjian wrote:
Magnus Hagander wrote: As an example, how is patch information going to help us review HOT or group-item-index? There is frankly more information about these in the archives than someone could reasonably read. What someone needs is a summary of where we are now on the patches, and lots of time. FYI, Tom, Heikki, I need one of you to post the list of patches and where we think we are on each one, even if the list is imperfect.

I think you just contradicted yourself. Information is not the problem, but you need more information... I think this is a fairly clear indication that we do need a better way to track this information.

No, my point is that 100% of the information is already available by looking at email archives.

Yes, and hard to find.

What we need is a short description of where we are on each patch --- that is a manual process, not something that can be automated.

Right. But it can be presented in a central way and incrementally updated.

Tom has posted it --- tell me how we will get such a list in an automated manner.

There's no way to get it automated, but you can get it incrementally updated.

//Magnus
Re: [HACKERS] strange buildfarm failures
Tom Lane [EMAIL PROTECTED] writes:
Alvaro Herrera [EMAIL PROTECTED] writes: Nailed it -- this is the actual bug that causes the abort. But I am surprised that it doesn't print the error message on Stefan's machine.
Hm, maybe we need an fflush(stderr) in ExceptionalCondition?

stderr is supposed to be line-buffered by default. Couldn't hurt, I suppose.

-- Gregory Stark  EnterpriseDB  http://www.enterprisedb.com
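For reference, the change being floated is a one-liner at the end of the assertion-failure path. The sketch below paraphrases the shape of ExceptionalCondition from memory rather than quoting the tree; only the fflush() call is the proposed addition.

/*
 * Paraphrased sketch of an assertion-failure reporter.  If stderr has
 * been redirected to a file it becomes fully buffered rather than
 * line-buffered, so without an explicit flush the TRAP line can be
 * lost when abort() kills the process -- which would explain the
 * missing message on Stefan's box.
 */
#include <stdio.h>
#include <stdlib.h>

static void
exceptional_condition_sketch(const char *conditionName,
                             const char *fileName, int lineNumber)
{
    fprintf(stderr, "TRAP: FailedAssertion(\"%s\", File: \"%s\", Line: %d)\n",
            conditionName, fileName, lineNumber);
    fflush(stderr);             /* the addition under discussion */
    abort();                    /* matches the SIGABRT in the core file */
}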
Re: [HACKERS] Sequential scans
On Wed, 2007-05-02 at 14:26 +0100, Heikki Linnakangas wrote:

Hi, I'm starting to review the synchronized scans and scan-resistant buffer cache patches. The patches have complex interactions so I'm taking a holistic approach. There are four outstanding issues with the sync scans in particular:

1. The simplistic hash approach. While it's nice to not have a lock, I'm worried about collisions. If you had a collision every now and then, it wouldn't be that bad, but because the hash value is computed from the oid, a collision would be persistent. If you create a database and happen to have two frequently seqscanned tables that collide, the only way to get rid of the collision is to drop and recreate a table. Granted, that'd probably be very rare in practice, but when it happens it would be next to impossible to figure out what's going on. Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.

What should be the maximum size of this hash table? Is there already-existing hash table code that I should use to be consistent with the rest of the code?

I'm still trying to understand the effect of using the full relfilenode. Do you mean using the entire relation _segment_ as the key? That doesn't make sense to me. Or do you just mean using the relfilenode (without the segment) as the key?

3. By having different backends doing the reads, are we destroying OS readahead as Tom suggested? I remember you performed some tests on that, and it was a problem on some systems but not on others. This needs some thought; there may be some simple way to address that.

Linux with the CFQ I/O scheduler performs very poorly and inconsistently with concurrent sequential scans, regardless of whether the scans are synchronized or not. I suspect the reason for this is that CFQ is designed to care more about the process issuing the request than any other factor. Every other I/O system performed either ideally (no interference between scans) or had some interference but still much better than current behavior. Of course, my tests are limited and there are many possible combinations of I/O systems that I did not try.

4. It fails regression tests. You get an assertion failure on the portal test. I believe that changing the direction of a scan isn't handled properly; it's probably pretty easy to fix.

I will examine the code more carefully. As a first guess, is it possible that the test is failing because of the non-deterministic order in which tuples are returned?

Jeff, could you please fix 1 and 4? I'll give 2 and 3 some more thought, and take a closer look at the scan-resistant scans patch. Any comments and ideas are welcome, of course.

Yes. I'll also try to address the other issues in your email. Thanks for your comments.

Regards, Jeff Davis
Re: [HACKERS] Sequential scans
Heikki Linnakangas [EMAIL PROTECTED] writes: Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.

It's probably fine to just do that. But if we find it's a performance bottleneck we could probably still manage to avoid the lock except when actually inserting a new hash element. If you just store in the hash an index into an array stored in global memory, then you could get away without a lock on the element in the array. It starts to get to be a fair amount of code when you think about how you would reuse elements of the array. That's why I suggest only looking at this if down the road we find that it's a bottleneck.

-- Gregory Stark  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Feature freeze progress report
Bruce, all,

No, my point is that 100% of the information is already available by looking at email archives. What we need is a short description of where we are on each patch --- that is a manual process, not something that can be automated. Tom has posted it --- tell me how we will get such a list in an automated manner.

Several of us have already suggested a method. If we want the information to be up-to-date, then the patch manager, or bug tracker, needs to be a required part of the approval and application process, NOT an optional accessory. That is, if patches and bug fixes can come in, get modified, and get approved and applied entirely on pgsql-patches or pgsql-bugs without ever touching the tracker tool, then the tracker tool will be permanently out of date and useless.

It's going to require the people who are doing the majority of the bug hunting and patch review to change the way they work, with the idea that any extra time associated with the new tool will be offset by being able to spread the work more and having information easy to find later, for you as well as others. Tom seems to be willing; are you?

Status: Will be rejected unless race conditions are fixed. Needs performance testing.
Discussions: links to mail threads, like in the current patch queue

... this brings up another reason we could use a tracker. I now have access to a performance testing lab and staff. However, these people are NOT going to follow 3 different high-traffic mailing lists in order to keep up with which patches to test. As a result, they haven't done much testing of 8.3 patches; they're dependent on me to keep them updated on new patch versions and known issues, and I'm on the road a lot. If I had a web tool I could point them to where they could simply download the current version of the patch, test, and comment a report, we'd get a LOT more useful performance feedback from Sun. I suspect the same is true of Unisys.

-- Josh Berkus  PostgreSQL @ Sun  San Francisco
Re: [HACKERS] Feature freeze progress report
Josh Berkus wrote:
Bruce, all, No, my point is that 100% of the information is already available by looking at email archives. What we need is a short description of where we are on each patch --- that is a manual process, not something that can be automated. Tom has posted it --- tell me how we will get such a list in an automated manner.
Several of us have already suggested a method. If we want the information to be up-to-date, then the patch manager, or bug tracker, needs to be a required part of the approval and application process, NOT an optional accessory. That is, if patches and bug fixes can come in, get modified, and get approved and applied entirely on pgsql-patches or pgsql-bugs without ever touching the tracker tool, then the tracker tool will be permanently out of date and useless. It's going to require the people who are doing the majority of the bug hunting and patch review to change the way they work, with the idea that any extra time associated with the new tool will be offset by being able to spread the work more and having information easy to find later, for you as well as others. Tom seems to be willing; are you?

Hello,

Well, according to him the last time this came up: http://archives.postgresql.org/pgsql-hackers/2006-08/msg01253.php

No, he isn't.

Sincerely,
Joshua D. Drake

-- The PostgreSQL Company: Command Prompt, Inc.
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] autovacuum does not start in HEAD
ITAGAKI Takahiro wrote:
I wrote: I found that the autovacuum launcher does not launch any workers in HEAD. The attached autovacuum-fix.patch could fix the problem. I changed it to use 'greater or equal' instead of 'greater' in the decision of the next autovacuum target.

I have committed a patch which might fix this issue in autovacuum.c rev 1.44. Please retest.

-- Alvaro Herrera  http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Sequential scans
Gregory Stark wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes: Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.
It's probably fine to just do that. But if we find it's a performance bottleneck we could probably still manage to avoid the lock except when actually inserting a new hash element. If you just store in the hash an index into an array stored in global memory, then you could get away without a lock on the element in the array. It starts to get to be a fair amount of code when you think about how you would reuse elements of the array. That's why I suggest only looking at this if down the road we find that it's a bottleneck.

Another trick you could do is to acquire the lock conditionally when updating it. But I doubt it's a problem anyhow, if we put some sane lower limit in there so that it's not used at all for small tables.

-- Heikki Linnakangas  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] temporal variants of generate_series()
Here's a shorter version. On the date variant, I wasn't sure how to handle intervals with parts smaller than days: floor, ceiling, round, or error out. To get round, the last parameter of generate_series would be:

extract('epoch' FROM '1 day'::interval)::bigint * round(extract('epoch' FROM $3) / extract('epoch' FROM '1 day'::interval))::bigint

CREATE OR REPLACE FUNCTION generate_series (
    start_ts timestamptz,
    end_ts timestamptz,
    step interval
) RETURNS SETOF timestamptz STRICT LANGUAGE sql AS $$
SELECT 'epoch'::timestamptz + s.i * '1 second'::interval AS generate_series
FROM generate_series(
    extract('epoch' FROM $1)::bigint,
    extract('epoch' FROM $2)::bigint,
    extract('epoch' FROM $3)::bigint
) s(i);
$$;

CREATE OR REPLACE FUNCTION generate_series (
    start_ts date,
    end_ts date,
    step interval
) RETURNS SETOF date STRICT LANGUAGE sql AS $$
SELECT ('epoch'::date + s.i * '1 second'::interval)::date AS generate_series
FROM generate_series(
    extract('epoch' FROM $1)::bigint,
    extract('epoch' FROM $2)::bigint,
    extract('epoch' FROM date_trunc('day', $3))::bigint  -- does a floor
) s(i);
$$;

Jean-Pierre Pelletier
e-djuster
Re: [HACKERS] Sequential scans
Jeff Davis wrote:
On Wed, 2007-05-02 at 14:26 +0100, Heikki Linnakangas wrote: Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.
What should be the maximum size of this hash table?

Good question. And also, how do you remove entries from it? I guess the size should somehow be related to the number of backends. Each backend will realistically be doing just one, or at most two, seq scans at a time. It also depends on the number of large tables in the databases, but we don't have that information easily available. How about using just NBackends? That should be plenty, but wasting a few hundred bytes of memory won't hurt anyone.

I think you're going to need an LRU list and a counter of used entries in addition to the hash table, and when all entries are in use, remove the least recently used one. The thing to keep an eye on is that it doesn't add too much overhead or lock contention in the typical case when there are no concurrent scans. For the locking, use an LWLock.

Is there already-existing hash table code that I should use to be consistent with the rest of the code?

Yes, see utils/hash/dynahash.c, and ShmemInitHash (in storage/ipc/shmem.c) since it's in shared memory. There are plenty of examples that use hash tables; see for example storage/freespace/freespace.c.

I'm still trying to understand the effect of using the full relfilenode. Do you mean using the entire relation _segment_ as the key? That doesn't make sense to me. Or do you just mean using the relfilenode (without the segment) as the key?

No, not the segment. RelFileNode consists of the tablespace oid, database oid and relation oid. You can find it in scan->rs_rd->rd_node. The segmentation works at a lower level.

Linux with the CFQ I/O scheduler performs very poorly and inconsistently with concurrent sequential scans, regardless of whether the scans are synchronized or not. I suspect the reason for this is that CFQ is designed to care more about the process issuing the request than any other factor. Every other I/O system performed either ideally (no interference between scans) or had some interference but still much better than current behavior.

Hmm. Should we care then? CFQ is the default on Linux, and an average sysadmin is unlikely to change it. What we could do quite easily is:

- when ReadBuffer is called, let the caller know if the read did physical I/O.
- when the previous ReadBuffer didn't result in physical I/O, assume that we're not the pack leader. If the next buffer isn't already in cache, wait a few milliseconds before initiating the read, giving the pack leader a chance to do it instead.

Needs testing, of course.

4. It fails regression tests. You get an assertion failure on the portal test. I believe that changing the direction of a scan isn't handled properly; it's probably pretty easy to fix.
I will examine the code more carefully. As a first guess, is it possible that the test is failing because of the non-deterministic order in which tuples are returned?

No, it's an assertion failure, not just different output than expected. But it's probably quite simple to fix.

-- Heikki Linnakangas  EnterpriseDB  http://www.enterprisedb.com
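To make the dynahash suggestion concrete, here is a rough sketch of how such a shared table might be created with ShmemInitHash, keyed by RelFileNode as suggested above. The entry layout, table name, and size cap are assumptions for illustration, not the eventual patch.

/*
 * Sketch of a shared scan-position table keyed by RelFileNode.
 * SS_MAX_ENTRIES and the struct layout are invented for the example.
 */
#include "postgres.h"
#include "storage/block.h"
#include "storage/relfilenode.h"
#include "storage/shmem.h"
#include "utils/hsearch.h"

#define SS_MAX_ENTRIES  100     /* arbitrary cap for the sketch */

typedef struct ss_scan_location
{
    RelFileNode relfilenode;    /* hash key: spcNode, dbNode, relNode */
    BlockNumber location;       /* last block number reported by a scan */
} ss_scan_location;

static HTAB *ScanLocationTable;

static void
sync_scan_shmem_init_sketch(void)
{
    HASHCTL     info;

    MemSet(&info, 0, sizeof(info));
    info.keysize = sizeof(RelFileNode);
    info.entrysize = sizeof(ss_scan_location);
    info.hash = tag_hash;       /* generic hash on the binary key */

    ScanLocationTable = ShmemInitHash("Sync Scan Locations",
                                      SS_MAX_ENTRIES, SS_MAX_ENTRIES,
                                      &info, HASH_ELEM | HASH_FUNCTION);
}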
Re: [HACKERS] Feature freeze progress report
Marc Munro wrote:
On Wed, 2007-02-05 at 08:27 -0400, Andrew Dunstan wrote:
Naz Gassiep wrote:
Andrew Dunstan wrote:
Naz Gassiep wrote: I believe the suggestion was to have an automated process that only ran on known, sane patches.
How do we know in advance of reviewing them that they are sane?
Same way as happens now.
The question was rhetorical ... there is no list of certified sane but unapplied patches. You are proceeding on the basis of a faulty understanding of how our processes work.
Why do we need to know the patch is sane?

Because not doing so is dangerous and a major security hole. I won't run arbitrary code on my machine, and I won't create infrastructure (e.g. buildfarm) to get others to do it either. You are also conveniently ignoring all the other reasons why this won't help anyone much (e.g. see Bruce's comments upthread).

cheers
andrew
[HACKERS] Optimization in convert_string_datum?
I'm reviewing the strxfrm patch. While comparing that code to the code in varstr_cmp (which uses the same UTF8/UTF16 workaround, but for strcoll instead), I noticed that varstr_cmp has an optimization to use a stack-based buffer instead of palloc if the string is short enough. Is convert_string_datum performance-critical enough to make it worthwhile to put a similar optimization there?

//Magnus
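For readers who haven't looked at varstr_cmp, the optimization being referred to is roughly the pattern below — a simplified sketch, not the actual function; the 1024-byte threshold and the helper name are illustrative.

/*
 * Sketch of the small-string fast path: build NUL-terminated copies on
 * the stack when they fit, and fall back to palloc only for long
 * inputs, saving an allocation per comparison in the common case.
 */
#include "postgres.h"
#include <string.h>

#define STACKBUFLEN 1024

static int
compare_with_stack_buffer(const char *arg1, int len1,
                          const char *arg2, int len2)
{
    char        a1buf[STACKBUFLEN];
    char        a2buf[STACKBUFLEN];
    char       *a1p;
    char       *a2p;
    int         result;

    a1p = (len1 >= STACKBUFLEN) ? palloc(len1 + 1) : a1buf;
    a2p = (len2 >= STACKBUFLEN) ? palloc(len2 + 1) : a2buf;

    memcpy(a1p, arg1, len1);
    a1p[len1] = '\0';
    memcpy(a2p, arg2, len2);
    a2p[len2] = '\0';

    result = strcoll(a1p, a2p); /* locale-aware comparison */

    if (a1p != a1buf)
        pfree(a1p);
    if (a2p != a2buf)
        pfree(a2p);

    return result;
}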
Re: [HACKERS] Feature freeze progress report
Marc Munro [EMAIL PROTECTED] writes:
On Wed, 2007-02-05 at 08:27 -0400, Andrew Dunstan wrote: The question was rhetorical ... there is no list of certified sane but unapplied patches. You are proceeding on the basis of a faulty understanding of how our processes work.
Why do we need to know the patch is sane? If it does not apply cleanly or causes regression tests to fail, the process would figure that out quickly and cheaply. There is little cost in attempting to apply a non-sane patch.

Unless it contains a trojan horse. I don't think many buildfarm owners are running the tests inside a sandbox so tight that they don't care how nasty the code that runs there might be.

regards, tom lane
Re: [HACKERS] Feature freeze progress report
Joshua D. Drake wrote: Well according to himself the last time this came up: http://archives.postgresql.org/pgsql-hackers/2006-08/msg01253.php No, he isn't.

The last paragraph of http://archives.postgresql.org/pgsql-hackers/2007-04/msg01258.php is somewhat more positive regarding a patch tracker.

Regards, Dave
Re: [HACKERS] Feature freeze progress report
Chris Browne wrote:
[EMAIL PROTECTED] (Andrew Dunstan) writes:
Tom Lane wrote: So in a roundabout way we come back to the idea that we need a bug tracker (NOT a patch tracker), plus people putting in the effort to make sure it stays a valid source of up-to-date info. Without the latter it won't really be useful.
Hallelujah Brother! BTW, a bug tracker can be used as a patch tracker, although the reverse isn't true. For example, the BZ people use BZ that way; in fact, most patches arrive as attachments to bugs. And trackers can be used just as well for tracking features as well as bugs.
Well, Command Prompt set up a Bugzilla instance specifically so people could track PG bugs. If only someone took interest and started using it...

I lost interest last time around when it seemed clear to me that there was not enough traction. Maybe there is this time around. The effort Tom mentions above is crucial to success.

cheers
andrew
Re: [HACKERS] Sequential scans
On Wed, 2007-05-02 at 20:02 +0100, Heikki Linnakangas wrote:
Gregory Stark wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes: Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.
It's probably fine to just do that. But if we find it's a performance bottleneck we could probably still manage to avoid the lock except when actually inserting a new hash element. If you just store in the hash an index into an array stored in global memory, then you could get away without a lock on the element in the array. It starts to get to be a fair amount of code when you think about how you would reuse elements of the array. That's why I suggest only looking at this if down the road we find that it's a bottleneck.
Another trick you could do is to acquire the lock conditionally when updating it. But I doubt it's a problem anyhow, if we put some sane lower limit in there so that it's not used at all for small tables.

The more sophisticated the data structure, the less able we are to avoid locking, correct? For instance, if we have an LRU list it might be tricky or impossible to avoid locking even on just reads.

Regards, Jeff Davis
Re: [HACKERS] pg_index updates and SI invalidation
Pavan Deolasee [EMAIL PROTECTED] writes: Please see the attached updated patch, based on Tom's comments. An attempt to reload index information for system indexes such as pg_class_oid_index can cause infinite recursion. But I realized that we don't need to reload system index information because we allow neither CREATE INDEX nor CIC on system relations. Only REINDEX is allowed, which does not need any reload. So we skip the index information reload for system relations.

Applied with revisions --- mostly, trying to keep the comments in sync with the code. I also added a forced relcache inval on the index's parent table at the end of CREATE INDEX CONCURRENTLY; this is to flush cached plans and allow the newly valid index to be considered in replanning. (The relcache inval on the index won't do it, since by definition the index is not mentioned in any such plan...)

regards, tom lane
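To illustrate what that forced invalidation amounts to, a hedged sketch using the existing inval API (CacheInvalidateRelcache is real; the wrapper name and exact placement are assumptions, and the actual commit may differ in detail):

/*
 * Sketch: called at the end of CREATE INDEX CONCURRENTLY, after the
 * index has been marked valid.  We invalidate the *table's* relcache
 * entry, not the index's: cached plans reference the table, so this is
 * what forces them to be rebuilt and lets replanning see the index.
 */
#include "postgres.h"
#include "utils/inval.h"
#include "utils/rel.h"

static void
signal_index_now_valid(Relation heapRelation)
{
    CacheInvalidateRelcache(heapRelation);
}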
Re: [HACKERS] Optimization in convert_string_datum?
Magnus Hagander [EMAIL PROTECTED] writes: I'm reviewing the strxfrm patch. While comparing that code to the code in varstr_cmp (which uses the same UTF8/UTF16 workaround, but for strcoll instead), I noticed that varstr_cmp has an optimization to use a stack-based buffer instead of palloc if the string is short enough. Is convert_string_datum performance-critical enough to make it worthwhile to put a similar optimization there?

No, I don't believe so. It should only get invoked a few times per query at most, since only the planner uses it. It would be far more useful to figure out a way to make that code actually do something sane with multibyte encodings than to micro-optimize what's there.

regards, tom lane
Re: [HACKERS] Sequential scans
Jeff Davis wrote:
On Wed, 2007-05-02 at 20:02 +0100, Heikki Linnakangas wrote:
Gregory Stark wrote:
Heikki Linnakangas [EMAIL PROTECTED] writes: Let's use a normal hash table instead, and use a lock to protect it. If we only update it every 10 pages or so, the overhead should be negligible. To further reduce contention, we could modify ReadBuffer to let the caller know if the read resulted in a physical read or not, and only update the entry when a page is physically read in. That way all the synchronized scanners wouldn't be updating the same value, just the one performing the I/O. And while we're at it, let's use the full relfilenode instead of just the table oid in the hash.
It's probably fine to just do that. But if we find it's a performance bottleneck we could probably still manage to avoid the lock except when actually inserting a new hash element. If you just store in the hash an index into an array stored in global memory, then you could get away without a lock on the element in the array. It starts to get to be a fair amount of code when you think about how you would reuse elements of the array. That's why I suggest only looking at this if down the road we find that it's a bottleneck.
Another trick you could do is to acquire the lock conditionally when updating it. But I doubt it's a problem anyhow, if we put some sane lower limit in there so that it's not used at all for small tables.
The more sophisticated the data structure, the less able we are to avoid locking, correct? For instance, if we have an LRU list it might be tricky or impossible to avoid locking even on just reads.

Agreed. I'm not concerned about reads, though. You only need to read from the structure once, when you start a scan. It's the updates that cause most of the traffic.

-- Heikki Linnakangas  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Patch queue triage
On Tue, 2007-05-01 at 22:44 -0400, Tom Lane wrote:

Nice summary, Tom.

* Re: [PATCHES] [Fwd: Deferred Transactions, Transaction Guarantee and COMMIT without waiting] /Simon Riggs/ Simon is on the hook to submit an updated patch. I hope this one makes it in, as it looks like a really nice performance improvement for those who can use it; but the original patch seems overcomplicated.

It will, but it will be next week now. Changes agreed though.

* [PATCHES] Minor recovery changes /Simon Riggs/ The submission mentions open issues --- when will those be resolved? Should we apply the patch-so-far anyway?

Patch is OK to go in as-is. There are open issues, but they need more thought and discussion and, I regret, that won't happen in a reasonable timescale due to other work.

-- Simon Riggs  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] Creating new system catalog problems.
Akmal Akmalhojaev [EMAIL PROTECTED] writes: I wanted to create a new system catalog in Postgres. So I changed the source code. Everything seems to be OK. I compiled it, but now after initdb I receive: creating template1 database in /usr/home/postgres/post1/base/1 ... FATAL: could not open relation with OID 2617 I can't understand the reason. Can somebody help me.

Not when you didn't show us what you changed ... 2617 is pg_operator, so a possible theory is that you did something that tries to access pg_operator before it's been created. But just adding a new table declaration in include/catalog shouldn't have that effect. I suspect you added some code you haven't mentioned.

regards, tom lane
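For context, the minimal version of "just adding a new table declaration in include/catalog" looks roughly like the sketch below. Everything here — the catalog name, OID, and columns — is invented for illustration; a real addition must also pick a non-conflicting OID and hook the header into the catalog build lists.

/*
 * Hypothetical include/catalog/pg_example.h: a minimal new system
 * catalog declaration in the genbki style of this era.
 */
#ifndef PG_EXAMPLE_H
#define PG_EXAMPLE_H

#define ExampleRelationId  9999 /* made-up OID; must not collide */

CATALOG(pg_example,9999)
{
    NameData    exname;         /* name of the example object */
    Oid         exowner;        /* OID of the owning role */
    int4        expriority;     /* arbitrary integer attribute */
} FormData_pg_example;

typedef FormData_pg_example *Form_pg_example;

#define Natts_pg_example            3
#define Anum_pg_example_exname      1
#define Anum_pg_example_exowner     2
#define Anum_pg_example_expriority  3

#endif   /* PG_EXAMPLE_H */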
Re: [HACKERS] [ADMIN] reindexdb hangs
dx k9 [EMAIL PROTECTED] writes: [ stuck reindex ] It turns out it was a temporary database and temporary table that just wasn't there; maybe it thought it was there from some type of snapshot, and then the next minute it was gone.

Hmm, there is no filter in ReindexDatabase() to exclude temp tables of other backends, but it sure seems like there needs to be. CLUSTER might have the same issue. I think we fixed this in VACUUM long ago, but we need to check the other commands that grovel over all of a database.

regards, tom lane
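The filter being described would presumably mirror what VACUUM already does. A hedged sketch of the check, as a predicate applied while ReindexDatabase() scans pg_class (isOtherTempNamespace is the existing namespace.c helper; the wrapper and its placement are assumptions):

/*
 * Sketch: skip relations that live in another backend's temporary
 * namespace while building the list of tables to reindex -- we cannot
 * safely touch them, and they may vanish under us, as in the report.
 */
#include "postgres.h"
#include "access/htup.h"
#include "catalog/namespace.h"
#include "catalog/pg_class.h"

static bool
reindex_should_skip(HeapTuple tuple)
{
    Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);

    return isOtherTempNamespace(classForm->relnamespace);
}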
Re: [HACKERS] Sequential scans
On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote: What should be the maximum size of this hash table?
Good question. And also, how do you remove entries from it? I guess the size should somehow be related to the number of backends. Each backend will realistically be doing just one, or at most two, seq scans at a time. It also depends on the number of large tables in the databases, but we don't have that information easily available. How about using just NBackends? That should be plenty, but wasting a few hundred bytes of memory won't hurt anyone.

One entry per relation, not per backend, is my current design.

I think you're going to need an LRU list and a counter of used entries in addition to the hash table, and when all entries are in use, remove the least recently used one. The thing to keep an eye on is that it doesn't add too much overhead or lock contention in the typical case when there are no concurrent scans. For the locking, use an LWLock.

Ok. What would be the potential lock contention in the case of no concurrent scans?

Also, is it easy to determine the space used by a dynahash with N entries? I haven't looked at the dynahash code yet, so perhaps this will be obvious.

No, not the segment. RelFileNode consists of the tablespace oid, database oid and relation oid. You can find it in scan->rs_rd->rd_node. The segmentation works at a lower level.

Ok, will do.

Hmm. Should we care then? CFQ is the default on Linux, and an average sysadmin is unlikely to change it.

Keep in mind that concurrent sequential scans with CFQ are *already* very poor. I think that alone is an interesting fact that's somewhat independent of Sync Scans.

- when ReadBuffer is called, let the caller know if the read did physical I/O.
- when the previous ReadBuffer didn't result in physical I/O, assume that we're not the pack leader. If the next buffer isn't already in cache, wait a few milliseconds before initiating the read, giving the pack leader a chance to do it instead. Needs testing, of course.

An interesting idea. I like that one the most out of the ideas for maintaining a pack leader. That's very similar to what the Linux anticipatory scheduler does for us.

4. It fails regression tests. You get an assertion failure on the portal test. I believe that changing the direction of a scan isn't handled properly; it's probably pretty easy to fix.
No, it's an assertion failure, not just different output than expected. But it's probably quite simple to fix.

Ok, I'll find and correct it then.

Regards, Jeff Davis
Re: [HACKERS] Sequential scans
Jeff Davis wrote:
On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote: What should be the maximum size of this hash table?
Good question. And also, how do you remove entries from it? I guess the size should somehow be related to the number of backends. Each backend will realistically be doing just one, or at most two, seq scans at a time. It also depends on the number of large tables in the databases, but we don't have that information easily available. How about using just NBackends? That should be plenty, but wasting a few hundred bytes of memory won't hurt anyone.
One entry per relation, not per backend, is my current design.

Umm, you naturally have just one entry per relation, but we were talking about how many entries the table needs to hold. Your patch had a hard-coded value of 1000, which is quite arbitrary.

I think you're going to need an LRU list and a counter of used entries in addition to the hash table, and when all entries are in use, remove the least recently used one. The thing to keep an eye on is that it doesn't add too much overhead or lock contention in the typical case when there are no concurrent scans. For the locking, use an LWLock.
Ok. What would be the potential lock contention in the case of no concurrent scans?

If you only have one seq scan in the system, there's no contention. If you have more, they will all need to acquire the lock to update the page number in the hash table, regardless of the table they're scanning, so there's potential for contention.

To put things into perspective, though, any time you need to evict a buffer from the buffer cache to read in a new buffer, you need to acquire the BufFreelistLock. If we only update the page number say every 10 pages, the overhead and lock contention of that lock would be roughly 1/10th of that arising from BufFreelistLock. And we can make it a conditional acquire, in which case the backends won't need to slow down their scans, but the updates of the entries in the hash table will get less frequent instead. We could also take just a shared lock to update the counter, and rely on the atomicity of the write like your patch did. You'd only need to take an exclusive lock to move entries in the LRU list or to add or remove an entry. But let's keep it simple at first, and do some testing. (A rough sketch of the reporting path follows below.)

Also, is it easy to determine the space used by a dynahash with N entries? I haven't looked at the dynahash code yet, so perhaps this will be obvious.

Yes, see hash_estimate_size.

BTW: Thanks for the patch!

-- Heikki Linnakangas  EnterpriseDB  http://www.enterprisedb.com
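Putting the pieces of this subthread together, the per-scan reporting path could look something like the sketch below: report only every tenth page, and use a conditional acquire so a busy lock never stalls the scan. It builds on the ScanLocationTable sketch shown earlier in the thread; SyncScanLock and the interval are assumptions, not settled design.

/*
 * Sketch: called from the seqscan loop with the block just read.
 * Assumes ScanLocationTable and ss_scan_location from the earlier
 * sketch, plus a dedicated LWLock named SyncScanLock (an assumption).
 */
#define SS_REPORT_INTERVAL 10   /* only report every 10th page */

static void
ss_report_location_sketch(RelFileNode rnode, BlockNumber blocknum)
{
    ss_scan_location *entry;
    bool        found;

    if (blocknum % SS_REPORT_INTERVAL != 0)
        return;                 /* cheap early exit in the common case */

    /* conditional acquire: if the lock is busy, just skip this report */
    if (!LWLockConditionalAcquire(SyncScanLock, LW_EXCLUSIVE))
        return;

    entry = (ss_scan_location *) hash_search(ScanLocationTable,
                                             (void *) &rnode,
                                             HASH_ENTER, &found);
    if (entry != NULL)
        entry->location = blocknum;     /* hash_search fills in the key */

    LWLockRelease(SyncScanLock);
}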
Re: [HACKERS] Avoiding unnecessary reads in recovery
Heikki Linnakangas [EMAIL PROTECTED] writes: What we could have is the semantics of "return a buffer, with either correct contents or completely zeroed out". It would act just like ReadBuffer if the buffer was already in memory, and zero out the page otherwise. That's a bit strange semantics to have, but it is simple to implement and works for the use-cases we've been talking about. Patch implementing that attached. I named the function ReadOrZeroBuffer.

Applied.

BTW, I realized that there is a potential issue created by this, which is that the smgr level might see a write for a page that it never saw a read for. I don't think there are any bad consequences of this ATM, but it is skating around the edges of some bugs we've had previously with relation extension. In particular, ReadOrZeroBuffer avoids the error that would normally occur if one tries to read a page that's beyond the logical EOF; and if the page is subsequently modified and written, md.c is likely to get confused/unhappy, particularly if the page is beyond the next segment boundary. This isn't a problem in XLogReadBuffer's usage, because it carefully checks the EOF position before trying to use ReadOrZeroBuffer, but it's a limitation other callers will need to think about.

regards, tom lane
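To spell out the safe calling pattern being described, here is a sketch paraphrasing what XLogReadBuffer does (not a copy of it): check the relation's physical size first, and only zero-or-read pages below that bound, extending block by block otherwise.

/*
 * Sketch: fetch blkno for modification during redo.  Pages below the
 * current EOF may be read with ReadOrZeroBuffer; pages past EOF must
 * be created by extending, so the smgr/md layer sees the file grow in
 * order and segment boundaries stay consistent.
 */
static Buffer
read_page_for_redo_sketch(Relation rel, BlockNumber blkno)
{
    BlockNumber lastblock = RelationGetNumberOfBlocks(rel);

    if (blkno < lastblock)
        return ReadOrZeroBuffer(rel, blkno);    /* within bounds: safe */

    /* past EOF: extend one block at a time up to and including blkno */
    for (;;)
    {
        Buffer      buffer = ReadBuffer(rel, P_NEW);

        if (lastblock == blkno)
            return buffer;      /* this new block is the one we wanted */
        ReleaseBuffer(buffer);
        lastblock++;
    }
}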
Re: [HACKERS] Sequential scans
On Wed, 2007-05-02 at 23:59 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote:
On Wed, 2007-05-02 at 20:58 +0100, Heikki Linnakangas wrote:
Jeff Davis wrote: What should be the maximum size of this hash table?
Good question. And also, how do you remove entries from it? I guess the size should somehow be related to the number of backends. Each backend will realistically be doing just one, or at most two, seq scans at a time. It also depends on the number of large tables in the databases, but we don't have that information easily available. How about using just NBackends? That should be plenty, but wasting a few hundred bytes of memory won't hurt anyone.
One entry per relation, not per backend, is my current design.
Umm, you naturally have just one entry per relation, but we were talking about how many entries the table needs to hold. Your patch had a hard-coded value of 1000, which is quite arbitrary.

I think I'm missing something from your statement "I guess the size should somehow be related to the number of backends". If there is only one entry per relation, why do more backends require a larger hash table?

If you only have one seq scan in the system, there's no contention. If you have more, they will all need to acquire the lock to update the page number in the hash table, regardless of the table they're scanning, so there's potential for contention. To put things into perspective, though, any time you need to evict a buffer from the buffer cache to read in a new buffer, you need to acquire the BufFreelistLock. If we only update the page number say every 10 pages, the overhead and lock contention of that lock would be roughly 1/10th of that arising from BufFreelistLock. And we can make it a conditional acquire, in which case the backends won't need to slow down their scans, but the updates of the entries in the hash table will get less frequent instead. We could also take just a shared lock to update the counter, and rely on the atomicity of the write like your patch did. You'd only need to take an exclusive lock to move entries in the LRU list or to add or remove an entry. But let's keep it simple at first, and do some testing.

If I just step back for a second to consider a simpler design: we will most likely have no more than a few relations larger than the minimum threshold for Sync Scanning in any database being scanned concurrently. What if I just make an LRU that occupies a fixed size? Any reads or updates can start at the MRU item and work backward, and any evictions can happen at the LRU item.

Does a hash table really save us cycles if we keep the list small, say, 100 items? Looking at all 100 items would only be required perhaps on a scan startup. I don't have a good sense of scale here; is following 50 pointers while holding a lock a significant source of contention? 100 is of course arbitrary, and that could grow or shrink automatically at runtime.

Yes, see hash_estimate_size.

Ok.

Regards, Jeff Davis
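As a thought experiment on that question, a standalone sketch of the fixed-size structure: a small array kept in MRU order, scanned linearly, with eviction at the tail. Plain C, no locking shown; the key type and sizes are placeholders.

/*
 * Fixed-size MRU-ordered array.  Lookups walk from slot 0 (most
 * recent); a hit is moved to the front, a miss evicts the tail.  With
 * ~100 entries the walk touches only a handful of cache lines, which
 * is the intuition behind "is following 50 pointers expensive?" above
 * -- an array avoids even the pointer-chasing of a linked list.
 */
#include <string.h>

#define SS_SLOTS 100

typedef struct
{
    unsigned int relid;         /* stand-in for the real RelFileNode key */
    unsigned int location;      /* last reported block number */
} ss_slot;

static ss_slot slots[SS_SLOTS];
static int  nslots = 0;

static void
ss_touch(unsigned int relid, unsigned int location)
{
    int         i;
    ss_slot     tmp;

    for (i = 0; i < nslots; i++)
    {
        if (slots[i].relid == relid)
            break;
    }

    if (i == nslots)            /* miss: claim a new slot or evict the LRU */
    {
        if (nslots < SS_SLOTS)
            nslots++;
        else
            i = SS_SLOTS - 1;   /* overwrite the least recently used */
        slots[i].relid = relid;
    }

    slots[i].location = location;

    /* rotate the touched slot up to the MRU position */
    tmp = slots[i];
    memmove(&slots[1], &slots[0], i * sizeof(ss_slot));
    slots[0] = tmp;
}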
Re: [HACKERS] Feature freeze progress report
Gregory Stark [EMAIL PROTECTED] writes: You keep saying that but I think it's wrong. There are trivial patches that were submitted last year that are still sitting in the queue.

Um ... which ones exactly? I don't see *anything* in the current queue that is utterly without issues, other than Heikki's ReadOrZeroBuffer patch, which certainly doesn't date from last year (and besides, has now been applied). I'm a bit more worried about stuff that may have slid through the cracks, like the Darwin SysV-vs-POSIX-semaphores patch that I complained of in my triage listing. But the stuff that is listed on Bruce's patch queue page is not trivial: it's either large or has got problems.

regards, tom lane
Re: [HACKERS] Heap page diagnostic functions
Gregory Stark [EMAIL PROTECTED] writes: Incidentally, Tom, were you consulting the dead-tree edition of the OED?

I plead guilty to having blown some dust off it before opening it ... but when looking for an authoritative reference, I like things that have got heft to them ...

regards, tom lane
Re: [HACKERS] Optimization in convert_string_datum?
I'm reviewing the strxfrm patch. While comparing that code to the code in varstr_cmp (which uses the same UTF8/UTF16 workaround, but for strcoll instead), I noticed that varstr_cmp has an optimization to use a stack-based buffer instead of palloc if the string is short enough. Is convert_string_datum performance-critical enough to make it worthwhile to put a similar optimization there?

No, I don't believe so. It should only get invoked a few times per query at most, since only the planner uses it.

ok.

It would be far more useful to figure out a way to make that code actually do something sane with multibyte encodings than to micro-optimize what's there.

I'm not volunteering to do that - at least not now ;)

/Magnus