Re: [HACKERS] Where to hook my custom access control module?
On Wed, 9 May 2007, Andrew Dunstan wrote:
> Karel Gardas wrote:
>> [...] I'd like to look at a possibility of integrating OpenPMF
>> (http://www.openpmf.org) with the PostgreSQL. [...]
> Exactly what is open about this product other than the name? It looks
> closed and proprietary to me.

It was free software (GPL) at the beginning, but we were not able to attract a community around it; it seems it was put to the public too early, so we decided to put it back to closed source and especially improve its platform support. Hence I'm asking here for any authorization hook inside PGSQL.

Thanks,
Karel

--
Karel Gardas [EMAIL PROTECTED]
ObjectSecurity Ltd. http://www.objectsecurity.com

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Tom Lane wrote:
> After some more study of the SQL spec, the distinction between GENERATED
> ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY is not what I
> thought it was.
>
> * As far as I can find from the spec, there is *no* difference between
> the two cases for INSERT commands. The rule is that you ignore any
> user-supplied data and use the default (ie, nextval()) unless OVERRIDING
> SYSTEM VALUE is specified. It is not an error to try to insert data into
> an identity column; it's just ignored unless OVERRIDING SYSTEM VALUE.
>
> * The difference for UPDATE commands is that you can update a BY DEFAULT
> identity column to anything you want, whereas for an ALWAYS identity it's
> an error to update to anything but DEFAULT (which causes a fresh
> nextval() to be assigned). Both behaviors are different from a generated
> column, which is updated whether you mention it or not.

The quoted SIGMOD paper mentioned that specifying a value for a generated column should raise an error in INSERT, but this behaviour is not mentioned by the standard.

BTW, do you know what a self-referencing column is? I haven't found a definition of it, and there are places where the standard uses this term for behaviour that would be natural for generated columns. E.g. page 860 in the latest drafts, section 10.14, or SQL:2003, section 14.8, about the INSERT statement: the value the user specified should be stored if some underlying column of Ci is a self-referencing column and OVERRIDING SYSTEM VALUE is specified.

> This means that GENERATED BY DEFAULT AS IDENTITY is not at all equivalent
> to our historical behavior for SERIAL columns and hence we cannot merge
> the two cases.

Yes, they are equivalent if you read 5IWD2-02-Foundation-2006-04.pdf or 5CD2-02-Foundation-2006-01.pdf, i.e. the latest two drafts. (The latter seems to be misnamed considering that www.wiscorp.com refreshed sql200n.zip on 2007-03-11.) Page 860, section 14.10, INSERT.
The value the user provides should be accepted for storage if:
- the column is an identity column and you specify OVERRIDING SYSTEM VALUE, or
- the column is a GENERATED BY DEFAULT AS IDENTITY column and you specify neither OVERRIDING USER VALUE nor the DEFAULT specification for the column.

I think the babble about OVERRIDING USER VALUE is somewhat controversial. Why would you want to do e.g. INSERT INTO tabname (id, ...) OVERRIDING USER VALUE VALUES (N, ...); where N is an explicit constant? And I haven't even implemented handling it. Anyway, without specifying OVERRIDING USER VALUE, GENERATED BY DEFAULT AS IDENTITY is equivalent to the traditional SERIAL in PostgreSQL. Implementing the OVERRIDING USER VALUE behaviour means that GENERATED BY DEFAULT AS IDENTITY (or SERIAL) would have to be marked as an identity as well, not as a column simply having a DEFAULT clause. Otherwise OVERRIDING USER VALUE would override every user-specified value for regular columns having a DEFAULT expression.

> The lack of any behavioral difference for INSERT seems surprising and
> counterintuitive; have I just missed something in the spec?

No, I was just ahead of the times and read newer drafts than SQL:2003.

> BTW, I found what they did about the problem that generated columns are
> out of sync with their underlying columns during BEFORE-trigger
> execution: in 11.39 12) If BEFORE is specified, then: ... c) The
> triggered action shall not contain a field reference that references a
> field in the new transition variable corresponding to a generated column
> of T.

I vaguely remember reading it, although the idea seems to have remained in my mind. :-)

> IOW they just pretend you can't look. So I think we need not worry about
> leaving the values out-of-date until after the triggers fire.
>
> regards, tom lane

--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
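The rules under discussion can be sketched in SQL (a hypothetical example: it assumes an implementation of the draft's identity columns and OVERRIDING clauses, which PostgreSQL did not ship at the time):

```sql
CREATE TABLE t (
    id  integer GENERATED ALWAYS AS IDENTITY,
    val text
);

-- Without an OVERRIDING clause, the user-supplied value for an ALWAYS
-- identity column is not stored; the generated default (nextval()) is
-- used instead (or, per some readings of the spec, an error is raised).
INSERT INTO t (id, val) VALUES (42, 'a');

-- With OVERRIDING SYSTEM VALUE the user-supplied value is accepted.
INSERT INTO t (id, val) OVERRIDING SYSTEM VALUE VALUES (42, 'a');

-- For an ALWAYS identity column, UPDATE may only set it to DEFAULT,
-- which assigns a fresh nextval(); any other value is an error.
UPDATE t SET id = DEFAULT WHERE val = 'a';
```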
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Zoltan Boszormenyi wrote:
> The quoted SIGMOD paper mentioned that specifying a value for a generated
> column should raise an error in INSERT but this behaviour is not
> mentioned by the standard.

I've found it now; I hadn't read hard enough before. SQL:2003, section 14.8, syntax rules:

10) If contextually typed table value constructor CTTVC is specified, then every contextually typed row value constructor element simply contained in CTTVC whose positionally corresponding column name in insert column list references a column of which some underlying column is a generated column shall be a default specification.

So, I can only omit the generated column or specify DEFAULT. Anything else should raise an error. Should it be done in analyze.c or in rewriteHandler.c?

--
Zoltán Böszörményi
Cybertec Geschwinde Schönig GmbH
http://www.postgresql.at/
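Syntax rule 10) boils down to the following (a hypothetical example; generated-column syntax of this form was not committed to PostgreSQL at the time):

```sql
CREATE TABLE items (
    price numeric,
    tax   numeric GENERATED ALWAYS AS (price * 0.2)
);

INSERT INTO items (price) VALUES (100);               -- OK: generated column omitted
INSERT INTO items (price, tax) VALUES (100, DEFAULT); -- OK: DEFAULT specified
INSERT INTO items (price, tax) VALUES (100, 20);      -- should raise an error
```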
Re: [HACKERS] [PERFORM] Cannot make GIN intarray index be used by the planner
Hello again,

I got the opclass for the index and it looks like it is a default one:

myvideoindex=# select pg_opclass.*, pg_type.typname
myvideoindex-# from pg_index, pg_opclass, pg_type
myvideoindex-# where pg_index.indexrelid = 'idx_nonnulls_myintarray_int4_gin'::regclass
myvideoindex-#   and pg_opclass.oid = any (pg_index.indclass::oid[])
myvideoindex-#   and pg_type.oid = pg_opclass.opcintype;
 opcamid |  opcname  | opcnamespace | opcowner | opcintype | opcdefault | opckeytype | typname
---------+-----------+--------------+----------+-----------+------------+------------+---------
    2742 | _int4_ops |           11 |       10 |      1007 | t          |         23 | _int4
(1 row)

The search_path is set to the following:

myvideoindex=# show search_path;
   search_path
------------------
 versionA, public
(1 row)

With best regards,

-- Valentine

On 5/9/07, Tom Lane [EMAIL PROTECTED] wrote:
> [cc'ing to pgsql-hackers since this is looking like a contrib/intarray bug]
>
> Valentine Gogichashvili [EMAIL PROTECTED] writes:
>> here is the DT
>
> That works fine for me in 8.2:
>
> regression=# explain SELECT id, (myintarray_int4) FROM myintarray_table_nonulls WHERE ARRAY[8] @ myintarray_int4;
>  Index Scan using idx_nonnulls_myintarray_int4_gin on myintarray_table_nonulls (cost=0.00..8.27 rows=1 width=36)
>    Index Cond: ('{8}'::integer[] @ myintarray_int4)
> (2 rows)
>
> What I am betting is that you've installed contrib/intarray in this database and that's bollixed things up somehow. In particular, intarray tries to take over the position of default gin opclass for int4[], and the opclass that it installs as default has operators named just like the built-in ones. If somehow your query is using pg_catalog.@ instead of intarray's public.@, then the planner wouldn't think the index is relevant.
>
> In a quick test your example still works with intarray installed, because what it's really created is public.@ (integer[], integer[]) which is an exact match and therefore takes precedence over the built-in pg_catalog.@ (anyarray, anyarray). But if for example you don't have public in your search_path then the wrong operator would be chosen.
> Please look at the pg_index entry for your index, eg
>
> select * from pg_index
> where indexrelid = 'versionA.idx_nonnulls_myintarray_int4_gin'::regclass;
>
> and see whether the index opclass is the built-in one or not.
>
> Note to hackers: we've already discussed that intarray shouldn't be trying to take over the default gin opclass, but I am beginning to wonder if it still has a reason to live at all. We should at least consider removing the redundant operators to avoid risks like this one.
>
> regards, tom lane

--
ვალენტინ გოგიჩაშვილი
Valentine Gogichashvili
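One way to test Tom's theory is to schema-qualify the operator and see which choice makes the planner pick the index (a sketch reusing the table and index names from this thread; OPERATOR() selects an operator from a specific schema):

```sql
-- Explicitly use intarray's operator from the public schema:
EXPLAIN SELECT id, myintarray_int4
FROM myintarray_table_nonulls
WHERE ARRAY[8] OPERATOR(public.@) myintarray_int4;

-- ...and compare against the built-in anyarray operator:
EXPLAIN SELECT id, myintarray_int4
FROM myintarray_table_nonulls
WHERE ARRAY[8] OPERATOR(pg_catalog.@) myintarray_int4;
```

If the first plan uses the GIN index and the second does not, the query was resolving to the built-in pg_catalog.@ rather than intarray's public.@.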
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
On Wed, May 09, 2007 at 12:46:52PM +0100, Dave Page wrote:
> Oh, hang on... Vista's new 'security' features include popups that ask
> permission from the user before running any installers. One of the more
> basic checks they use is the filename - *anything* called setup.exe will
> cause user confirmation to be required before it will run. I believe for
> non-interactive sessions it'll just refuse to run.

I just tried running update.exe myself, and yes, you guessed it, a user confirmation dialog popped up :-( Seems to be a little bit braindead to me.

But anyway, I renamed it and just committed the changes. Let's see if this works.

Michael

P.S.: More on the other problem later.

--
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Michael Meskes wrote:
> On Wed, May 09, 2007 at 12:46:52PM +0100, Dave Page wrote:
>> Oh, hang on... Vista's new 'security' features include popups that ask
>> permission from the user before running any installers. One of the more
>> basic checks they use is the filename - *anything* called setup.exe will
>> cause user confirmation to be required before it will run. I believe for
>> non-interactive sessions it'll just refuse to run.
> I just tried running update.exe myself, and yes, you guessed it, a user
> confirmation dialog popped up :-( Seems to be a little bit braindead to me.

Yeah - according to Microsoft we should include a manifest with the executable these days that can prevent the check by specifying that administrative privileges won't be needed by the executable - but that involves us adding version resources to the exe and generating the manifest during build, which seems somewhat over the top for a quick regression test.

> But anyway, I renamed it and just committed the changes. Let's see if
> this works.

Thanks.

> P.S.: More on the other problem later.

OK.

Regards, Dave.
Re: [HACKERS] Seq scans roadmap
> In reference to the seq scans roadmap, I have just submitted a patch that
> addresses some of the concerns. The patch does this:
> 1. for small relations (smaller than 60% of the bufferpool), use the current logic
> 2. for big relations:
>    - use a ring buffer in heap scan
>    - pin the first 12 pages when the scan starts
>    - on consumption of every 4 pages, read and pin the next 4 pages
>    - invalidate used pages in the scan so they do not force out other useful pages

A few comments regarding the effects: I do not see how this speedup could be caused by readahead, so what are the effects? (It should make no difference to do the CPU work for count(*) in between reading each block when the pages are not dirtied.) Is the improvement solely reduced CPU because no search for a free buffer is needed, and/or L2 cache locality? What effect does the advance pinning have? Avoiding vacuum?

A 16 x 8k page ring is too small to allow the needed IO blocksize of 256k. The readahead is done 4 x one page at a time (=32k). What is the reasoning behind 1/4 of the ring for readahead (why not 1/2)? Is 3/4 the trail for followers and bgwriter?

I think in anticipation of doing a single IO call for more than one page, the KillAndReadBuffer function should be split into two parts: one that does the killing for n pages, and one that does the reading for n pages. Killing n before reading n would also have the positive effect of grouping perhaps-needed writes (not interleaving them with the reads).

I think the 60% of Nbuffers is a very good starting point. I would only introduce a GUC when we see evidence that it is needed (I agree with Simon's partitioning comments, but I'd still wait and see).

Andreas
Re: [HACKERS] Seq scans roadmap
Zeugswetter Andreas ADI SD wrote:
>> In reference to the seq scans roadmap, I have just submitted a patch that
>> addresses some of the concerns. The patch does this:
>> 1. for small relations (smaller than 60% of the bufferpool), use the current logic
>> 2. for big relations:
>>    - use a ring buffer in heap scan
>>    - pin the first 12 pages when the scan starts
>>    - on consumption of every 4 pages, read and pin the next 4 pages
>>    - invalidate used pages in the scan so they do not force out other useful pages
> A few comments regarding the effects: I do not see how this speedup could
> be caused by readahead, so what are the effects?

I was wondering that as well. We'd really need to test all the changes separately to see where the improvements are really coming from.

Also, that patch doesn't address the VACUUM issue at all. And using a small fixed-size ring with scans that do updates can be devastating. I'm experimenting with different ring sizes for COPY at the moment. Too small a ring leads to a lot of WAL flushes; it's basically the same problem we have with VACUUM in CVS HEAD.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Seq scans roadmap
> Also, that patch doesn't address the VACUUM issue at all. And using a
> small fixed-size ring with scans that do updates can be devastating. I'm
> experimenting with different ring sizes for COPY at the moment. Too small
> a ring leads to a lot of WAL flushes; it's basically the same problem we
> have with VACUUM in CVS HEAD.

My first take on that would be to simply abandon any dirty (and actually also any still-pinned) buffer from the ring and replace the ring slot with a buffer from the freelist. If the freelist is empty and the LSN allows writing the buffer, write it (and maybe try to group these). If the LSN does not allow the write, replace the slot with a buffer from the LRU.

Andreas
Re: [HACKERS] Seq scans roadmap
Zeugswetter Andreas ADI SD wrote:
>> Also, that patch doesn't address the VACUUM issue at all. And using a
>> small fixed-size ring with scans that do updates can be devastating. I'm
>> experimenting with different ring sizes for COPY at the moment. Too small
>> a ring leads to a lot of WAL flushes; it's basically the same problem we
>> have with VACUUM in CVS HEAD.
> My first take on that would be to simply abandon any dirty (and actually
> also any still-pinned) buffer from the ring and replace the ring slot
> with a buffer from the freelist. If the freelist is empty and the LSN
> allows writing the buffer, write it (and maybe try to group these). If
> the LSN does not allow the write, replace the slot with a buffer from
> the LRU.

That would effectively disable the ring for COPY and the 2nd phase of VACUUM.

One problem with looking at the LSN is that you need the content lock to read it, and I wouldn't want to add any new locking. It could be done inside FlushBuffer when we hold the lock anyway, but I'm afraid the changes would be pretty invasive.

I'm struggling to get a grip on what the optimal ring size is under various circumstances. Some thoughts I have so far:
- a small ring gives better L2 cache behavior
- for read-only queries, and for queries that just set hint bits, 1 buffer is enough
- a small ring with a query that writes WAL (COPY, mass updates, 2nd phase of VACUUM) leads to a lot of WAL flushes, which can become the bottleneck

But all these assumptions need to be validated. I'm setting up tests with different ring sizes and queries to get a clear picture of this:
- VACUUM on a clean table
- VACUUM on a table with 1 dead tuple per page
- read-only scan, large table
- read-only scan, table fits in OS cache
- COPY

In addition, I'm going to run VACUUM in a DBT-2 test to see the effect on other queries running concurrently. I think a ring that grows when WAL flushes occur covers all the use cases reasonably well, but I need to do the testing...
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Hiroshi Inoue wrote:
> Alvaro Herrera wrote:
>> Robert Treat wrote:
>>> On Monday 07 May 2007 15:52, Joshua D. Drake wrote:
>>>> Andrew Dunstan wrote:
>>>>> Hiroshi Inoue wrote:
>>>>>> Maybe it's BSD which is different from the license of psqlodbc
>>>>>> (LGPL). Is there no problem with their coexistence? Or is it
>>>>>> possible for psqlodbc to be LGPL entirely?
>>>>> I am having difficulty in understanding what the problem is.
>>>> My understanding is that using BSD licensed code is ok in an LGPL
>>>> project, but (probably) not vice versa.
>>> To my knowledge you can do it either way, as long as you remember that
>>> any changes to the LGPL code have to be released. It's generally a very
>>> bad idea for a BSD-licensed project to include LGPL-licensed code,
>>> because people who try to use your work in their own projects, under
>>> the assumption that it really is BSD licensed, get bitten when they
>>> find out that they have now illegally included code that is licensed
>>> via some other license.
>> Of course, the developer who owns the LGPL-licensed copyright is free to
>> relicense his work under a different license, so if the ODBC developers
>> want to contribute code to Postgres they can give their work under the
>> Postgres license. (They must obtain permission from all the involved
>> developers, obviously.)
> There are no original developers in the project now and I don't know
> where or how they are now. I personally am not so eager to change the
> license to BSD because it has been LGPL too long.

Yes, that is a problem for releasing old code whose developers are long gone. (What I was thinking was copying *new* code from psqlodbc into Postgres.)

> Oppositely I thought we can implement the BSD licensed autoconf macros by
> ourselves but I'm not sure how it can be considered as *not derived*.
ISTM it would be necessary to get legal advice to be sure that it would be considered not derived, but one would think that that's too much hassle for something that can be done much more simply by including the differently-licensed files in the first place, which is legal anyway.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Tom Lane wrote:
> Hiroshi Inoue [EMAIL PROTECTED] writes:
>> Robert Treat wrote:
>>> It's generally a very bad idea for a BSD licensed project to include
>>> lgpl licensed code
>> The psqlodbc package is LGPL licensed and seems to have little problem
>> including a copy of BSD licensed code as a part of it.
> Right, that direction is fine; it's the other way around that's
> problematic (because adding some BSD code adds no new restrictions on
> what users can do with an overall-LGPL project, whereas having some LGPL
> components in a supposedly BSD project does limit what they can do with
> it). I don't see any reason why you shouldn't include those PG autoconf
> macros in psqlodbc. You just need to document that they have a BSD
> license, in case someone wants to use them by themselves.

Yes, I am going in that direction. You may see some confusing words in my other posts, but they are all things I once thought. Thanks for your clarification.

regards, Hiroshi Inoue
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Zoltan Boszormenyi [EMAIL PROTECTED] writes:
> Tom Lane wrote:
>> This means that GENERATED BY DEFAULT AS IDENTITY is not at all
>> equivalent to our historical behavior for SERIAL columns and hence we
>> cannot merge the two cases.
> Yes, they are equivalent if you read 5IWD2-02-Foundation-2006-04.pdf or
> 5CD2-02-Foundation-2006-01.pdf, i.e. the latest two drafts.

Hm. So what we've got here is that the committee has decided the 2003 spec is broken, and they may someday come out with a revised definition that might, or might not, bear any resemblance to the current 200n working papers. According to some off-list discussion, nobody is entirely sure what the current draft is trying to say anyway.

That about tears it for me: I think we should reject at least the IDENTITY parts of this patch, and very likely the entire thing. I've spent more than three days now trying to get it into a committable form, time I can't really afford to spend right now on a patch that adds such a marginal feature.

AFAICS the only actual new feature that IDENTITY adds is the ability to make the default expression silently override user-specified insertion data, as in fact was the use-case argued by you here: http://archives.postgresql.org/pgsql-hackers/2006-08/msg00038.php

Now we find that as the spec is actually written, it doesn't work that way: if you try to specify a value other than DEFAULT for an inserted column, you get a syntax error, unless you specify OVERRIDING SYSTEM VALUE or OVERRIDING USER VALUE, neither of which are going to be found in legacy apps ported from other DBMSes, and neither of which add any actual new functionality (if you have to write OVERRIDING USER VALUE, you could just as well not specify the column). So I'm seeing a lot of complexity and a lot of confusion added for not much, not even considering the risks of trying to track a moving standards target.
As for GENERATED ALWAYS AS (expr), now that we understand that it's not supposed to define a virtual column, what's the point? You can get the same behavior with a trivial BEFORE INSERT/UPDATE trigger that recomputes the derived value, and you don't have to buy into the rather ill-defined spec behavior (in particular, the point that the generated column is effectively undefined during trigger firing seems really poorly done). In fact, given that the only plausible use-cases involve expressions that are expensive to compute, a trigger can probably do *better* than the built-in feature, since it can make use of application knowledge about when a recomputation is really necessary. The current patch recomputes the expression on every UPDATE, and would have a hard time being any brighter than that, given that we don't know what BEFORE triggers might do to the row.

So at this point I'm feeling that we've poured a lot of effort down a hole :-(. We are not really close to having a patch that implements the current 200n draft (in particular, note that OVERRIDING USER VALUE is not a no-op according to this morning's understanding of the draft). Even if we were, I'd vote against implementing a part of the draft that's clearly still in flux --- if they change it again, we'd be stuck. But the real bottom line is that I don't see enough use-case for these features to justify the work done already, let alone a bunch more work.

regards, tom lane
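The trigger-based alternative described above could look something like this (a minimal sketch; the table, columns, and function name are made up for illustration):

```sql
CREATE FUNCTION recompute_derived() RETURNS trigger AS $$
BEGIN
    -- Recompute the derived value on insert/update; an application-aware
    -- trigger could skip this when the inputs it depends on are unchanged.
    NEW.total := NEW.price * NEW.quantity;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_recompute_derived
    BEFORE INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE PROCEDURE recompute_derived();
```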
Re: [HACKERS] Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory
Alvaro Herrera wrote:
> Hiroshi Inoue wrote:
>> Alvaro Herrera wrote:
>>> Robert Treat wrote:
>>>> On Monday 07 May 2007 15:52, Joshua D. Drake wrote:
>>>>> Andrew Dunstan wrote:
>>>>>> Hiroshi Inoue wrote:
> [snip]
>>> Of course, the developer who owns the LGPL-licensed copyright is free
>>> to relicense his work under a different license, so if the ODBC
>>> developers want to contribute code to Postgres they can give their work
>>> under the Postgres license. (They must obtain permission from all the
>>> involved developers, obviously.)
>> There are no original developers in the project now and I don't know
>> where or how they are now. I personally am not so eager to change the
>> license to BSD because it has been LGPL too long.
> Yes, that is a problem for releasing old code whose developers are long
> gone. (What I was thinking was copying *new* code from psqlodbc into
> Postgres.)

What do you mean by *new* code? New lines? Words? Or some other kind of classification?

>> Oppositely I thought we can implement the BSD licensed autoconf macros
>> by ourselves but I'm not sure how it can be considered as *not derived*.
> ISTM it would be necessary to get legal advice to be sure that it would
> be considered not derived, but one would think that that's too much
> hassle for something that can be done much more simply by including the
> differently-licensed files in the first place, which is legal anyway.

OK, I understand. Thanks.

regards, Hiroshi Inoue
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
On Thu, May 10, 2007 at 11:05:39AM +0100, Dave Page wrote:
>>> P.S.: More on the other problem later.
>> OK.

Dave, I just committed some small changes to get additional error logging. Hopefully this enables me to find out where exactly the error is coming from. If possible, could you please restart a run on Vista?

Thanks

Michael

--
Michael Meskes
Email: Michael at Fam-Meskes dot De
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
Re: [HACKERS] [pgsql-es-ayuda] Error SSL en Postgresql
Just got the report below in pgsql-es-ayuda. He says his log files are filled with 50 MB of the error message

LOG: SSL SYSCALL error: Unknown winsock error 10004

I looked up 10004 and apparently it's WSAEINTR, according to http://msdn2.microsoft.com/en-us/library/aa924071.aspx

"Interrupted function call. This error is returned when a socket is closed or a process is terminated, on a pending Winsock operation for that socket."

I wonder if the SSL code should rather be interpreting this error code as one of those for which it must retry. It seems strange that it be named EINTR (which sounds to me like "operation interrupted") and then talk about closed sockets and terminated processes. This is Windows XP, Postgres 8.2.3.

Henry wrote:
> Hi there, I'm on a test database with PostgreSQL 8.2.3 and WinXP, and I
> have generated an SSL certificate with the names server.crt and
> server.key in the Data folder, and set the option SSL=True. Out of
> nowhere the postgres functions started giving me bad results, I mean the
> return values. When I take a look at the logs I'm surprised to find
> about 5 .txt files of 10 MB each, and every line in them says:
> LOG: SSL SYSCALL error: Unknown winsock error 10004
> LOG: SSL SYSCALL error: Unknown winsock error 10004
> LOG: SSL SYSCALL error: Unknown winsock error 10004
> LOG: SSL SYSCALL error: Unknown winsock error 10004

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Michael Meskes wrote:
> Dave, I just committed some small changes to get additional error
> logging. Hopefully this enables me to find out where exactly the error is
> coming from. If possible, could you please restart a run on Vista?

Running now - I won't have access to the machine again until Monday, though.

Regards, Dave
Re: [HACKERS] Seq scans roadmap
Heikki Linnakangas wrote:
> But all these assumptions need to be validated. I'm setting up tests
> with different ring sizes and queries to get a clear picture of this:
> - VACUUM on a clean table
> - VACUUM on a table with 1 dead tuple per page
> - read-only scan, large table
> - read-only scan, table fits in OS cache
> - COPY

Just to keep you guys informed, here are my results for a read-only scan on a table bigger than shared_buffers but smaller than RAM:

 select-1    | 00:00:10.853831
 select-1    | 00:00:10.380667
 select-1    | 00:00:11.530528
 select-2    | 00:00:08.634105
 select-2    | 00:00:02.674084
 select-4    | 00:00:02.65664
 select-8    | 00:00:02.662922
 select-16   | 00:00:02.682475
 select-32   | 00:00:02.693163
 select-64   | 00:00:02.722031
 select-128  | 00:00:02.873645
 select-256  | 00:00:03.185586
 select-512  | 00:00:03.534285
 select-1024 | 00:00:03.741867

The lshw utility tells me that this server has 32KB of L1 cache and 4MB of L2 cache. The performance starts to drop between 64-128 buffers, which is 512-1024 KB, so I'm not sure how it's related to cache size, but using a small number of buffers is clearly better than using a large number.

However, it caught me by total surprise that the performance with 1 buffer is so horrible. Using 2 buffers is enough to avoid whatever the issue is with just 1 buffer. I have no idea what's causing that; there must be some interaction that I don't understand. All the numbers are quite repeatable; I ran the same test script many times. The runtime of the first select-2 test however varied between 3-10 seconds: somehow the bad karma from using just 1 buffer in the earlier test carries over to the next test.

I'm not sure what to think about this, but I'll set up more test scenarios with VACUUM and COPY.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
[HACKERS] Question concerning failed installation of Postgres 8.2.3 on Windows XP
Dear mailing list,

I have got a problem installing Postgres 8.2.3 on Windows XP Pro with all the latest updates. All Postgres installations, folders and the service account had been removed beforehand, so it was a clean installation. The following steps lead to a reproducible result on a few machines, but not on all:

I chose English as the installer language and deselected all database drivers (JDBC, NPSQL, ...) for installation. Then I just clicked Next on the following installer pages, leaving all options as is. I entered passwords for both the superuser and the service account. I selected German (Germany) as the locale and UTF8 as the encoding.

Near the end of the installation a message box appears reporting the following error: "Failed to connect to the database. Procedural languages files are installed, but are not activated in any databases."

These are the critical lines written by msiexec to its log file when encountering the error:

MSI (s) (C0:E0) [15:17:00:531]: Executing op: ActionStart(Name=InstallPl,Description=Activating procedural languages...,)
Action 15:17:00: InstallPl. Activating procedural languages...
MSI (s) (C0:E0) [15:17:00:531]: Executing op: CustomActionSchedule(Action=InstallPl,ActionType=1025,Source=BinaryData,[EMAIL PROTECTED],CustomActionData=1033;postgres;kdsahp;5432;C:\Programme\PostgreSQL\8.2\share\;1;;;)
MSI (s) (C0:48) [15:17:00:593]: Invoking remote custom action. DLL: C:\WINDOWS\Installer\MSI77.tmp, Entrypoint: [EMAIL PROTECTED]
MSI (c) (D8:DC) [15:17:15:095]: Note: 1: 2205 2: 3: Error
MSI (c) (D8:DC) [15:17:15:095]: Note: 1: 2228 2: 3: Error 4: SELECT `Message` FROM `Error` WHERE `Error` = 2867
DEBUG: Error 2867: The error dialog property is not set
The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2867. The arguments are: , ,
MSI (c) (D8:DC) [15:17:15:095]: Font created.
Charset: Req=0, Ret=0, Font: Req=MS Shell Dlg, Ret=MS Shell Dlg MSI (c) (D8:DC) [15:18:24:198]: Note: 1: 2205 2: 3: Error MSI (c) (D8:DC) [15:18:24:214]: Note: 1: 2228 2: 3: Error 4: SELECT `Message` FROM `Error` WHERE `Error` = 2888 DEBUG: Error 2888: Executing the TextStyle view failed The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2888. The arguments are: TextStyle, , MSI (c) (D8:DC) [15:18:38:028]: Note: 1: 2205 2: 3: Error MSI (c) (D8:DC) [15:18:38:028]: Note: 1: 2228 2: 3: Error 4: SELECT `Message` FROM `Error` WHERE `Error` = 2888 DEBUG: Error 2888: Executing the TextStyle view failed The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2888. The arguments are: TextStyle, , MSI (c) (D8:DC) [15:18:51:531]: Note: 1: 2205 2: 3: Error MSI (c) (D8:DC) [15:18:51:531]: Note: 1: 2228 2: 3: Error 4: SELECT `Message` FROM `Error` WHERE `Error` = 2888 DEBUG: Error 2888: Executing the TextStyle view failed The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2888. The arguments are: TextStyle, , MSI (c) (D8:DC) [15:21:42:346]: Note: 1: 2205 2: 3: Error MSI (c) (D8:DC) [15:21:42:346]: Note: 1: 2228 2: 3: Error 4: SELECT `Message` FROM `Error` WHERE `Error` = 2888 DEBUG: Error 2888: Executing the TextStyle view failed The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2888. The arguments are: TextStyle, , Failed to connect to the database. Procedural languages files are installed, but are not activated in any databases. -- Could you please tell me how to work around or avoid this error? Thank you very much in advance for your help! Best regards, Sven ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Seq scans roadmap
Heikki Linnakangas wrote:
> However, it caught me by total surprise that the performance with 1 buffer is so horrible. Using 2 buffers is enough to avoid whatever the issue is with just 1 buffer. I have no idea what's causing that. There must be some interaction that I don't understand.

Ok, I found the reason for that. I was using this query for the selects:

SELECT COUNT(*) FROM (SELECT 1 FROM stock_copytest LIMIT 1000) AS a;

stock_copytest is larger than RAM, which is why I used the LIMIT to keep the result set memory-resident. That had the side effect that the Limit node apparently kept the single buffer pinned, which defeated the buffer ring completely. To avoid issues like that, we apparently want to use 2-4 buffers instead of just 1.

I'll review my test methodology and keep testing...

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Seq scans roadmap
The patch has no effect on scans that do updates. The KillAndReadBuffer routine does not force out a buffer if the dirty bit is set, so updated pages revert to the current performance characteristics.

-cktan
Greenplum, Inc.

On May 10, 2007, at 5:22 AM, Heikki Linnakangas wrote:
> Zeugswetter Andreas ADI SD wrote:
>>> In reference to the seq scans roadmap, I have just submitted a patch that addresses some of the concerns. The patch does this:
>>> 1. for small relations (smaller than 60% of the buffer pool), use the current logic
>>> 2. for big relations:
>>>    - use a ring buffer in the heap scan
>>>    - pin the first 12 pages when the scan starts
>>>    - on consumption of every 4 pages, read and pin the next 4 pages
>>>    - invalidate used pages in the scan so they do not force out other useful pages
>> A few comments regarding the effects: I do not see how this speedup could be caused by readahead, so what are the effects?
> I was wondering that as well. We'd really need to test all the changes separately to see where the improvements are really coming from. Also, that patch doesn't address the VACUUM issue at all. And using a small fixed-size ring with scans that do updates can be devastating. I'm experimenting with different ring sizes for COPY at the moment. Too small a ring leads to a lot of WAL flushes; it's basically the same problem we have with VACUUM in CVS HEAD.
> -- 
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
[HACKERS] Planning large IN lists
When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole.

This is quite inefficient when the IN list is large. In a test case that someone sent me privately, a simple query involving two cheap joins and a ~1800-element IN list in the WHERE clause requires about 100 ms to plan but only ~10 ms to execute -- about 85% of the total runtime is spent in scalararraysel(). (I'd include the profiling data, but KCacheGrind seems stubbornly opposed to providing a textual summary of its results...)

Clearly, the current approach is fine when the array is small. Perhaps for arrays above a certain number of elements, we could switch to randomly sampling array elements, estimating their selectivities, and then using that information to infer the estimated selectivity of the entire array expression. That seems fairly crude, though: does anyone have any better ideas?

-Neil
Re: [HACKERS] Seq scans roadmap
Sorry, a 16 x 8K page ring is too small indeed. The reason we selected 16 is that Greenplum DB runs on a 32K page size, so we are indeed reading 128K at a time. The number of pages in the ring should be made relative to the page size, so you achieve 128K per read.

Also agreed that KillAndReadBuffer could be split into KillPinDontRead() and ReadThesePinnedPages() functions. However, we are thinking of AIO and would rather see a ReadNPagesAsync() function.

-cktan
Greenplum, Inc.

On May 10, 2007, at 3:14 AM, Zeugswetter Andreas ADI SD wrote:
>> In reference to the seq scans roadmap, I have just submitted a patch that addresses some of the concerns. The patch does this:
>> 1. for small relations (smaller than 60% of the buffer pool), use the current logic
>> 2. for big relations:
>>    - use a ring buffer in the heap scan
>>    - pin the first 12 pages when the scan starts
>>    - on consumption of every 4 pages, read and pin the next 4 pages
>>    - invalidate used pages in the scan so they do not force out other useful pages
> A few comments regarding the effects:
> I do not see how this speedup could be caused by readahead, so what are the effects? (It should make no difference to do the CPU work for count(*) in between reading each block when the pages are not dirtied.) Is the improvement solely reduced CPU because no search for a free buffer is needed, and/or L2 cache locality? What effect does the advance pinning have -- avoiding vacuum?
> A 16 x 8k page ring is too small to allow the needed IO block size of 256k. The readahead is done 4 x one page at a time (= 32k). What is the reasoning behind 1/4 of the ring for readahead (why not 1/2)? Is 3/4 the trail for followers and bgwriter?
> I think in anticipation of doing a single IO call for more than one page, the KillAndReadBuffer function should be split into two parts: one that does the killing for n pages, and one that does the reading for n pages. Killing n before reading n would also have the positive effect of grouping perhaps-needed writes (not interleaving them with the reads).
> I think the 60% of NBuffers is a very good starting point. I would only introduce a GUC when we see evidence that it is needed (I agree with Simon's partitioning comments, but I'd still wait and see).
> Andreas
Re: [HACKERS] Planning large IN lists
Neil Conway wrote:
> Clearly, the current approach is fine when the array is small -- perhaps for arrays above a certain number of elements, we could switch to randomly sampling array elements, estimating their selectivities, and then using that information to infer the estimated selectivity of the entire array expression. That seems fairly crude, though: does anyone have any better ideas?

Optimizer hints in SQL

/me ducks and runs for cover.

regards, Lukas
Re: [HACKERS] Planning large IN lists
Neil Conway [EMAIL PROTECTED] writes:
> When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole. This is quite inefficient when the IN list is large.

That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since repeated indexscans often aren't the best implementation.

regards, tom lane
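To spell out the transformation Tom is suggesting (table and column names are made up for illustration):

```sql
-- Literal IN list: each element's selectivity is estimated separately,
-- and execution may probe the index once per element
SELECT * FROM t WHERE id IN (1, 2, 3);

-- Equivalent IN (VALUES(...)) form: the list becomes a relation, so
-- the planner can consider a hash or merge join against it instead
-- of repeated index probes
SELECT * FROM t WHERE id IN (VALUES (1), (2), (3));
```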
Re: [HACKERS] Planning large IN lists
On Thursday, May 10, 2007 11:53 AM, Tom Lane wrote:
> Neil Conway [EMAIL PROTECTED] writes:
>> When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole. This is quite inefficient when the IN list is large.
> That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation.

It seems to me that if you have a unique index on the IN-list column, the problem is simplified. In that case, you just have to estimate whether that many index seeks cost more than a table scan. Usually the crossover is around 5-10% of the table size for the average database; I'm not sure how it works out in PostgreSQL. So in the special case of an IN list on a uniquely indexed column, compare the cardinality of the table with the number of IN-list items and decide to table scan or index seek based on that.

For arbitrary queries, it seems it would be necessary to keep histograms for the columns in question. Perhaps they could be collected with an advanced ANALYZE option.
Re: [HACKERS] Windows Vista support (Buildfarm Vaquita)
Dave Page wrote:
> Michael Meskes wrote:
>> On Thu, May 10, 2007 at 11:05:39AM +0100, Dave Page wrote:
>>> P.S.: More on the other problem later.
>> OK.
>> Dave, I just committed some small changes to get additional error logging. Hopefully this enables me to find out where exactly the error is coming up. If possible, could you please restart a run on Vista?
> Running now - I won't have access to the machine again until Monday now though.

It's passing what was the update test now, which is good - thanks :-). Here are the results of the run, showing just the dt_test failure:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=vaquita&dt=2007-05-10%2015:55:16

Regards, Dave.
Re: [HACKERS] Feature lists for 8.3 and 8.4
On May 8, 7:16 am, [EMAIL PROTECTED] (Abraham, Danny) wrote:
> Hi, I am migrating from Sybase to Postgres.
> 3. Who is the guy to ask about a feature like startup migrate in ORACLE?

You could check out EnterpriseDB, which is based on Postgres and provides an Oracle compatibility layer. That gives you room to breathe while you migrate to a PostgreSQL-native implementation. I've never used it, but I hear good things.

Andrew
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
On Thu, 2007-05-10 at 10:11 -0400, Tom Lane wrote:
> As for GENERATED ALWAYS AS (expr), now that we understand that it's not supposed to define a virtual column, what's the point? You can get the same behavior with a trivial BEFORE INSERT/UPDATE trigger that recomputes the derived value, and you don't have to buy into the rather ill-defined spec behavior (in particular, the point that the generated column is effectively undefined during trigger firing seems really poorly done). In fact, given that the only plausible use-cases involve expressions that are expensive to compute, a trigger can probably do *better* than the built-in feature, since it can make use of application knowledge about when a recomputation is really necessary. The current patch recomputes the expression on every UPDATE, and would have a hard time being any brighter than that, given that we don't know what BEFORE triggers might do to the row.

We do need virtual columns, whether the spec requires them or not. They would allow us to completely remove the column value when using value-list-based partitioning, giving considerable space savings for VLDBs.

ISTM that we should interpret this as a requirement for a virtual column. We can always move from that to a stored column if the spec becomes more specific, though it would be harder to move the other way. And as you point out, storing the value would make no sense.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
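For reference, the trigger-based emulation Tom describes in the quoted text might look roughly like this (table, column, and function names are hypothetical, not from the patch under discussion):

```sql
-- Sketch: emulating GENERATED ALWAYS AS (qty * unit_price) with a
-- BEFORE trigger (made-up schema, for illustration only)
CREATE TABLE orders (
    qty        int,
    unit_price numeric,
    total      numeric          -- the "generated" column
);

CREATE FUNCTION orders_compute_total() RETURNS trigger AS $$
BEGIN
    -- Recompute unconditionally; a smarter trigger could skip this
    -- when qty and unit_price are unchanged, which is the
    -- application-knowledge advantage Tom mentions.
    NEW.total := NEW.qty * NEW.unit_price;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_total BEFORE INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE PROCEDURE orders_compute_total();
```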
Re: [HACKERS] Feature lists for 8.3 and 8.4
On Tue, 2007-05-08 at 17:16 +0300, Abraham, Danny wrote:
> I am migrating from Sybase to Postgres, trying to get a hold on the issue of future releases' feature lists.
> 1. Where can I find the feature list for 8.3? When is it expected?
> 2. And for 8.4?
> 3. Who is the guy to ask about a feature like startup migrate in ORACLE?

There is some planning done for each release, though reacting to requirements is also a large part of how things work here. What do you need?

In-place upgrades are possible, but only when the on-disk format stays the same. It is set to change for 8.3.

-- 
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Simon Riggs [EMAIL PROTECTED] writes:
> On Thu, 2007-05-10 at 10:11 -0400, Tom Lane wrote:
>> As for GENERATED ALWAYS AS (expr), now that we understand that it's not supposed to define a virtual column, what's the point?
> We do need virtual columns, whether the spec requires them or not.

Agreed, they seem more useful than what the spec's got in mind. You can fake a virtual column using a view, but it's a whole lot more painful than faking a GENERATED column using a trigger (at least if you wish the view to be updatable).

> ISTM that we should interpret this as a requirement for a virtual column. We can always move from that to a stored column if the spec becomes more specific, though it would be harder to move the other way.

If you're suggesting commandeering the spec's GENERATED ALWAYS syntax to represent virtual columns, when the committee has made it clear that that's not what they intend, I say that's sheer folly. What will you do when they tweak the spec to the point where a virtual column clearly doesn't satisfy it? If we want a nonstandard feature we should use a nonstandard syntax for it.

regards, tom lane
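The view-based fake mentioned above might look roughly like this (all names hypothetical; the rewrite rule for inserts is the painful part that a trigger-based GENERATED emulation avoids):

```sql
-- Sketch: faking a virtual column with a view (made-up schema)
CREATE TABLE items_base (
    qty        int,
    unit_price numeric
);

-- "total" is computed on the fly and never stored
CREATE VIEW items AS
    SELECT qty, unit_price, qty * unit_price AS total
    FROM items_base;

-- Making the view updatable requires explicit rules like this one
-- (and similar ones for UPDATE and DELETE)
CREATE RULE items_ins AS ON INSERT TO items
    DO INSTEAD INSERT INTO items_base VALUES (NEW.qty, NEW.unit_price);
```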