Re: [HACKERS] An idea for parallelizing COPY within one backend
Hi, On Wednesday, 27 February 2008, Florian G. Pflug wrote: Upon reception of a COPY INTO command, a backend would .) Fork off a dealer and N worker processes that take over the client connection. The dealer distributes lines received from the client to the N workers, while the original backend receives them as tuples back from the workers. This looks so much like what pgloader does now (version 2.3.0~dev2, release candidate) at the client side, when configured for it, that I can't help answering the mail :) http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading section_threads = N split_file_reading = False Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! Regards, -- dim
Re: [HACKERS] Two Coverity Scan volunteers needed
On Tue, Feb 26, 2008 at 02:57:12PM -0800, Joshua D. Drake wrote: If we get volunteers set up, they will start running it daily. Would there be a way to script the responses to flag us for things that are important? There was (briefly) a way for them to send emails whenever something new was detected. That was kinda useful. However, the number of false positives is quite large. Maybe it got better, but last time I checked (a while back, admittedly) it didn't notice that ereport(ERROR, ...) never returns. It is possible to export results, and I did that once for all the ECPG errors so the developers could fix them. Looking at the latest results, it has a lot of warnings about dead code in libstemmer, which is not entirely surprising given that it's generated code. Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Those who make peaceful revolution impossible will make violent revolution inevitable. -- John F Kennedy
Re: [HACKERS] pg_dump additional options for performance
On Tue, 2008-02-26 at 20:14 +, Gregory Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Simon Riggs [EMAIL PROTECTED] writes: I've not been advocating improving pg_restore, which is where the -Fc tricks come in. ... I see you thought I meant pg_restore. I don't think extending pg_restore in that way is of sufficiently generic use to make it worthwhile. Extending psql would be worth it, since not all psql scripts come from pg_dump. OK, the reason I didn't grasp what you are proposing is that it's insane. We can easily, and backwards-compatibly, improve pg_restore to do concurrent restores. Trying to make psql do something like this will require a complete rewrite, and there is no prospect that it will work for any input that didn't come from (an updated version of) pg_dump anyway. I didn't read everything in the thread previously so I'm not sure if this is what Simon had in mind. But I think one thing that could be done in parallel even in psql scripts is index builds. That doesn't help speed up COPY, but it does speed up a case where we currently are limited by only occupying a single CPU. And I would expect it to play well with synchronized scans too. The complete rewrite in this case would be the concurrent psql patch I submitted a while back. I think it's a bit of a mess right now because I was trying to chase down some bugs with sigint handling, so I've been thinking of rewriting it. I think this is a low-hanging fruit which would help a lot of users. The ability to load multiple COPY dumps would be the other piece of the puzzle, but personally I'm not sure how to tackle that. The current design for concurrent psql includes commands that say which session a command should be run on. Switches between sessions are explicit. That is good, but prevents us from easily saying use N sessions to make it go faster, because we already hardwired the commands to the sessions. If we were able to express dependency info, we would be able to alter the amount of parallelism. 
That would require us to * identify each command * identify its dependents, possibly like this: <psql id=5 absolute-dependents=3,4> Some SQL... </psql> The current default behaviour is this: <psql id=5 relative-dependents=-1> ... That's a leap ahead of concurrent psql. I'd rather we had concurrent psql as it is now than attempt to leap ahead too far, but the dependent-definition approach seems likely to yield benefits in the long run. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] pg_dump additional options for performance
On Tuesday, 26 February 2008, Tom Lane wrote: Or in more practical terms in this case, we have to balance speed against potentially-large costs in maintainability, datatype extensibility, and suchlike issues if we are going to try to get more than percentage points out of straight COPY. Could COPY begin with checking the table type involved and use some internal knowledge about -core types to avoid extensibility costs, if any? OK, that sounds like a maintainability cost :) Or maybe just provide an option to pg_dump to force usage of binary COPY format, which would then allow pg_restore to skip the data parsing altogether. If that's not the case (no data parsing), maybe it's time for another COPY format to be invented? On the binary compatibility between architectures, I'm wondering whether using pg_dump in binary format from the new architecture couldn't be a solution. Of course, when you only have the binary archives, lost server A, and need to get the data to server B which does not share A's architecture, you're not in a comfortable situation. But a pg_dump binary option would make clear you don't want to use it for your regular backups... And it wouldn't help the case when data is not coming from PostgreSQL. It could still be a common enough use case to bother? Just trying to put some ideas in the game, hoping this is more helpful than not, -- dim They did not know it was impossible, so they did it! -- Mark Twain
Re: [HACKERS] Proposed changes to DTrace probe implementation
On Tue, Feb 26, 2008 at 03:48:28PM -0600, Robert Lor wrote: Gregory Stark wrote: I think both types of probes are useful to different people. I think certain higher level probes can be really useful to DBAs. Perhaps looking at the standard database SNMP MIB counters would give us a place to start for upward facing events people want to trace for databases in general. Great idea. I found this link for public RDBMS MIB http://www.mnlab.cs.depaul.edu/cgi-bin/sbrowser.cgi?HOST=OID=RDBMS-MIB!rdbmsMIB The stats in rdbmsSrvInfoTable is quite useful, and it's one of the tables that Oracle implements in their SNMP support. http://download-east.oracle.com/docs/cd/B14099_19/manage.1012/b16244/appdx_d_rdbms.htm Incidentally, most of that's already supported by the pg snmp provider, through the stats system. //Magnus
Re: [HACKERS] win32 build problem (cvs, msvc 2005 express)
On Wed, Feb 27, 2008 at 02:02:29AM -0800, craigp wrote: I'm having trouble compiling the current cvs version on windows xp (msvc 2005 express). Compile errors below. Did you by any chance use a tree that's been sitting around for a long time? Like sometime earlier in the 8.3 series. We had a problem like that which was fixed, but there might be an old file sitting around since then. Try deleting parse.h in include/parser - it's not supposed to be there. //Magnus
[HACKERS] win32 build problem (cvs, msvc 2005 express)
I'm having trouble compiling the current cvs version on windows xp (msvc 2005 express). Compile errors below. I have bison 1.875 (I can't find 2.2+ for windows) and flex 2.5.4. These tools seem to generate correct outputs. It looks like it might be including parse.h from include/parser/parse.h instead of src/backend/parser/parse.h. Are these two files supposed to be identical? NULLS_P, etc. is defined in the latter, but not the former. If I manually copy the file over, compilation succeeds (still struggling with linking). Here are the compile errors:

Build started: Project: postgres, Configuration: Release|Win32
Running bison on src\backend\parser\gram.y
1 file(s) copied.
Running flex on src\backend\parser\scan.l
Compiling...
parser.c
.\src\backend\parser\parser.c(100): error C2065: 'NULLS_P' : undeclared identifier
.\src\backend\parser\parser.c(100): error C2051: case expression not constant
.\src\backend\parser\parser.c(111): error C2065: 'NULLS_FIRST' : undeclared identifier
.\src\backend\parser\parser.c(114): error C2065: 'NULLS_LAST' : undeclared identifier
keywords.c
.\src\backend\parser\keywords.c(45): error C2065: 'ALWAYS' : undeclared identifier
.\src\backend\parser\keywords.c(45): error C2099: initializer is not a constant
.\src\backend\parser\keywords.c(90): error C2065: 'CONFIGURATION' : undeclared identifier
.\src\backend\parser\keywords.c(90): error C2099: initializer is not a constant
.\src\backend\parser\keywords.c(94): error C2065: 'CONTENT_P' : undeclared identifier
.\src\backend\parser\keywords.c(94): error C2099: initializer is not a constant

thanks! --craig
Re: [HACKERS] pg_dump additional options for performance
On Tuesday, 26 February 2008, Joshua D. Drake wrote: Think 100GB+ of data that's in a CSV or delimited file. Right now the best import path is with COPY, but it won't execute very fast as a single process. Splitting the file manually will take a long time (time that could be spent loading instead) and substantially increase disk usage, so the ideal approach would figure out how to load in parallel across all available CPUs against that single file. You mean load from position? That would be very, very cool. Did I mention pgloader now does exactly this when configured like this: http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading section_threads = N split_file_reading = True IIRC, Simon and Greg Smith asked for pgloader to get those parallel loading features in order to have some first results and ideas about the performance gain, as a first step in the parallel COPY backend implementation design. Hope this helps, -- dim
Re: [HACKERS] An idea for parallelizing COPY within one backend
On Wed, 2008-02-27 at 09:09 +0100, Dimitri Fontaine wrote: Hi, On Wednesday, 27 February 2008, Florian G. Pflug wrote: Upon reception of a COPY INTO command, a backend would .) Fork off a dealer and N worker processes that take over the client connection. The dealer distributes lines received from the client to the N workers, while the original backend receives them as tuples back from the workers. This looks so much like what pgloader does now (version 2.3.0~dev2, release candidate) at the client side, when configured for it, that I can't help answering the mail :) http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading section_threads = N split_file_reading = False Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! ISTM the external parallelization approach is more likely to help us avoid bottlenecks, so I support Dimitri's approach. We also need error handling, which pgloader also has. Writing error handling and parallelization into COPY isn't going to be easy, and not very justifiable either if we already have both. There might be a reason to re-write it in C one day, but that will be a fairly easy task if we ever need to do it. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
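The dealer/worker split being discussed can be sketched with ordinary OS processes and queues. The following Python sketch is purely illustrative: the function names, the tab-separated "line format", and the integer "parsing" step are all stand-ins (a real implementation would invoke datatype input functions), not PostgreSQL internals.

```python
# Sketch of the dealer/worker architecture described above: a dealer
# distributes raw input lines to N workers, which parse them into tuples
# and send the results back to a collector (the role the original backend
# would play). All names and the "parsing" step are hypothetical.
from multiprocessing import Process, Queue

N_WORKERS = 2
SENTINEL = None

def worker(inq, outq):
    while True:
        line = inq.get()
        if line is SENTINEL:
            outq.put(SENTINEL)   # tell the collector this worker is done
            break
        # stand-in for datatype input functions: split and convert fields
        outq.put(tuple(int(f) for f in line.split("\t")))

def parallel_copy(lines):
    inq, outq = Queue(), Queue()
    procs = [Process(target=worker, args=(inq, outq)) for _ in range(N_WORKERS)]
    for p in procs:
        p.start()
    for line in lines:           # the dealer: feed lines via a shared queue
        inq.put(line)
    for _ in procs:              # one sentinel per worker
        inq.put(SENTINEL)
    tuples, done = [], 0
    while done < N_WORKERS:      # the collector: receive parsed tuples
        item = outq.get()
        if item is SENTINEL:
            done += 1
        else:
            tuples.append(item)
    for p in procs:
        p.join()
    return tuples

if __name__ == "__main__":
    rows = parallel_copy(["1\t2", "3\t4", "5\t6"])
    print(sorted(rows))
```

Note that, as in the proposal, tuples can arrive at the collector out of order, which is fine for a bulk load; the test below sorts before comparing for exactly that reason.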
Re: [HACKERS] OSSP can be used in the windows environment now!
What exactly is needed for building the required libuuid files? From what I can tell, the author has no binaries available, correct? It builds with mingw only? Or with msvc? Does the mingw build generate all the required libraries for the msvc build as well? (sorry, I'm on a win64 box right now, so mingw doesn't work at all, so I can't test that right now) Also, documentation updates will be needed, but I can handle those. //Magnus On Mon, Feb 25, 2008 at 06:44:07PM +0900, Hiroshi Saito wrote: Hi. Please check it. The build is successful for it in my environment. Thanks! Regards, Hiroshi Saito - Original Message - From: Hiroshi Saito [EMAIL PROTECTED] Hi. From: Magnus Hagander [EMAIL PROTECTED] we can include in the next release.:-) Thanks! Good news. Can you provide a patch for the msvc build system to build with it? We can't really ship it in the next release if we can't build with it ;-) Ahh OK, I'll try it first; I need to check a clean environment. Probably Monday, for the reason that I'm very busy. Thanks! Regards, Hiroshi Saito
[HACKERS] Full text search - altering the default parser
The default parser doesn't allow commas in numbers (I can see why, I think). SELECT ts_parse('default', '123,000'); ts_parse -- (22,123) (12,,) (22,000) One option of course is to pre-process the text, but since we can support custom parsers I thought I'd take a look at the code to teach it some flexibility on numbers. I'm guessing this would be of interest to anyone wanting to support the European-style ',' decimal indicator too. My C is horribly rusty, so can I check I've got this right? Before I start exploring compiler errors I've not seen for decades ;-) The parser functions (prsd_xxx) are all defined in backend/tsearch/wparser_def.c. The state machine is driven through the TParserStateActionItem definitions on lines 644-1263. Changing one of these will change the definition of the corresponding token-type. To add a new token-type, I'd add it to the various lists on lines 30-194, then add the relevant TParserStateActionItems. Thanks -- Richard Huxton Archonet Ltd
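To make the state-machine change concrete, here is a hedged sketch — in Python, not the C action tables of wparser_def.c — of a toy tokenizer whose number state accepts a comma only when more digits follow it, so '123,000' comes out as one number token instead of three. Everything here is illustrative; the real change would go into the TParserStateActionItem tables.

```python
# Toy tokenizer sketching the discussed change: treat "digits,digits" as a
# single number token. Illustration only; the real parser is the state
# machine in backend/tsearch/wparser_def.c.
def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        if text[i].isdigit():
            j = i
            while j < len(text):
                if text[j].isdigit():
                    j += 1
                # accept a comma only when a digit follows (e.g. 123,000)
                elif text[j] == "," and j + 1 < len(text) and text[j + 1].isdigit():
                    j += 1
                else:
                    break
            tokens.append(("number", text[i:j]))
            i = j
        elif text[i].isalpha():
            j = i
            while j < len(text) and text[j].isalpha():
                j += 1
            tokens.append(("word", text[i:j]))
            i = j
        else:
            i += 1  # skip separators
    return tokens

print(tokenize("cost 123,000 euro"))
# [('word', 'cost'), ('number', '123,000'), ('word', 'euro')]
```

The lookahead on the comma is the key point: it is what a new TParserStateActionItem entry for the in-number state would have to express.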
[HACKERS] Proposal: wildcards in pg_service.conf
I'd like to extend the libpq service file by allowing wildcards, e.g. like this: [%] host=dbhost.mycompany.com dbname=% Such an entry would match all service parameters, and all occurrences of the wildcard to the right of a '=' would be replaced with the service parameter. That implies that a [%] entry is only useful as the last entry in pg_service.conf. I'd like to keep it simple and only accept standalone wildcards (not things like 'prefix%suffix'), but that's debatable (at the cost of more work). As a special case, I would want to allow wildcards in LDAP URLs like this: [%] ldap://ldap.mycompany.com/cn=%,cn=databases?connectstring?one?objectclass=* This also explains my preference of '%' over '*' as a wildcard, to avoid problems with * in LDAP filter expressions. The added value is that you don't need one line per database in the configuration file. The two examples above illustrate two cases where I think that such a setup could be useful: a) You have most of your databases in one cluster. b) You use LDAP to store your connection parameters centrally. What do you think? Yours, Laurenz Albe
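The proposed lookup semantics can be sketched in a few lines: concrete [name] sections win, a trailing [%] section is the fallback, and a standalone '%' in a value is replaced by the requested service name. This Python sketch mirrors the proposal only — it is not how libpq actually parses pg_service.conf.

```python
# Sketch of the proposed wildcard lookup for the libpq service file.
# Concrete sections are tried first; [%] is the fallback, with standalone
# '%' values substituted by the service name. Illustrative only.
def lookup_service(conf_text, service):
    sections, current = {}, None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, {})
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            sections[current][key.strip()] = value.strip()
    params = sections.get(service) or sections.get("%")
    if params is None:
        return None
    # only standalone wildcards are substituted, as proposed
    return {k: (service if v == "%" else v) for k, v in params.items()}

conf = """
[payroll]
host=special.mycompany.com
dbname=payroll_prod

[%]
host=dbhost.mycompany.com
dbname=%
"""
print(lookup_service(conf, "sales"))
# {'host': 'dbhost.mycompany.com', 'dbname': 'sales'}
```

Looking up 'payroll' would still hit the concrete section, which is why the [%] entry is only useful as the last entry in the file.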
Re: [HACKERS] An idea for parallelizing COPY within one backend
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: ... Neither the dealer, nor the workers would need access to the either the shared memory or the disk, thereby not messing with the one backend is one transaction is one session dogma. ... Unfortunately, this idea has far too narrow a view of what a datatype input function might do. Just for starters, consider enum input, which certainly requires catalog access. We have also explicitly acknowledged the idea that datatype I/O functions might try to store typmod-related data in some special catalog somewhere. Hm... how many in-core datatypes are there which need catalog access in their input or output functions? Maybe we could change the API for i/o functions in a way that allows us to request all needed information to be cached? regards, Florian Pflug
Re: [HACKERS] OSSP can be used in the windows environment now!
Hi. - Original Message - From: Magnus Hagander [EMAIL PROTECTED] What exactly is needed for building the required libuuid files? From what I can tell, the author has no binaries available, correct? Yes, both can be built. However, the msvc build is not official, regrettably, but it can be built without changing the source code. It is created as our module. http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/ I have checked it by examination. It builds with mingw only? Or with msvc? Does the mingw build generate all the required libraries for the msvc build as well? (sorry, I'm on a win64 box right now, so mingw doesn't work at all, so I can't test that right now) It has now been built in both environments. Also, documentation updates will be needed, but I can handle those. Yeah :-) Thanks. Regards, Hiroshi Saito
Re: [HACKERS] Required make version
Peter Eisentraut wrote: Using the order-only prerequisites feature, which is what is failing with the old make version, solves item 1). The alternative is your suggestion If the dependencies need to stay as they are, maybe we could avoid the annoyance by having make not print the echo command. but I'm not a friend of hiding commands, because you don't know what is going on and it will come back to bite you. How about we use the order-only prerequisite only if present, and use the ugly or undesirable way as fallback? I see that you can find out if your Make version supports it by checking .FEATURES. -- Alvaro Herrera    http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] An idea for parallelizing COPY within one backend
Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! As far as I understand, pgloader starts N threads or processes that open up N individual connections to the server. In that case, moving the text-to-binary conversion from the backend into the loader won't give any additional performance, I'd say. The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... regards, Florian Pflug
Re: [HACKERS] Required make version
Alvaro Herrera wrote: How about we use the order-only prerequisite only if present, and use the ugly or undesirable way as fallback? I see that you can find out if your Make version supports it by checking .FEATURES. I think this can be used with a conditional like

ifneq (,$(findstring order-only,$(.FEATURES)))
...
endif

I am not sure I understand the problem being complained about to propose a more specific patch. -- Alvaro Herrera    http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
[HACKERS] select avanced
I have the following table:

Objeto  Valor
ob1     10
ob1     20
ob2     50
ob2     10
ob3     50

With the following command: select distinct Objeto, sum(valor) from tb group by Objeto; I get:

Objeto  Valor
ob1     30
ob2     60
ob3     50

What I need to do is combine ob2 and ob3 into one field, giving me the following table:

Objeto  Valor
ob1     30
ob2e3   110

Can this be done with SELECT alone?
Re: [HACKERS] An idea for parallelizing COPY within one backend
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: ... Neither the dealer, nor the workers would need access to the either the shared memory or the disk, thereby not messing with the one backend is one transaction is one session dogma. ... Unfortunately, this idea has far too narrow a view of what a datatype input function might do. Just for starters, consider enum input, which certainly requires catalog access. We have also explicitly acknowledged the idea that datatype I/O functions might try to store typmod-related data in some special catalog somewhere. regards, tom lane Would it be possible to determine, when the copy is starting, whether this case holds, and not use the parallel parsing idea in those cases? I'm a big user of copy, generally into very simple tables - few indexes, simple column types (numeric, varchar, and int almost exclusively), no fancy features. A parallel copy input in the simple cases would be of great advantage to me, even if it doesn't parallelize complicated cases. Brian
Re: [HACKERS] An idea for parallelizing COPY within one backend
A.M. wrote: On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... It seems like multiple backends should be able to take advantage of 2PC for transaction safety. Yeah, but it wouldn't take advantage of, say, the hack to disable WAL when the table was created in the same transaction. -- Alvaro Herrera    http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] select avanced
[EMAIL PROTECTED] wrote: I have the following table: The hackers list is for development of the PostgreSQL database itself. Please try reposting on the general or sql mailing lists. -- Richard Huxton Archonet Ltd
Re: [HACKERS] idea: simple variadic functions in SQL and PL/pgSQL
Hello On 26/02/2008, Andrew Dunstan [EMAIL PROTECTED] wrote: Pavel Stehule wrote: Hello, I found an easy implementation of variadic functions. It's based on an adaptation of FuncnameGetCandidates. When I find a variadic function, then I should create the accurate number of last arguments (the diff between pronargs and nargs). A variadic function can be signed via a flag or via some pseudotype. A flag is better - it allows variadic arguments of any type. In static languages (like SQL or PL/pgSQL), variadic variables can be accessed via an array (variadic arguments can only be non-array). This isn't a problem in the C language; there the arguments are available directly. There are a whole slew of questions around this, ISTM. sure. It's time to think about it. For example: What will be the type inferred for the array of variadic args in plpgsql? SQL and PL/pgSQL should be limited. In my prototype I use this fragment:

/* replace ANYNONARRAYS argument with ARRAY OF some */
argtype = get_array_type(argtype);
if (!OidIsValid(argtype))
    ereport(ERROR,
            (errcode(ERRCODE_DATATYPE_MISMATCH),
             errmsg("could not determine actual array type for variadic argument")));

SPI is unlimited (C language). I can use a list of arrays in perl or python - I don't see complications there either. Domains and arrays can be a problem. Using variadic array variables isn't typical. So I will have to solve only domains. But it's a more general problem. Maybe 8.4 can support arrays of domains. If we are going to do this I think we need some discussion on design before we rush into it. yes, I agree. Variadic functions are simple. And I believe they can simplify life for some programmers who have to manage functions like: foo(int), foo(int, int), foo(int, int, int)... regards Pavel cheers andrew
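The effect of the proposal — one definition replacing the whole foo(int), foo(int, int), foo(int, int, int) family, with the trailing arguments visible inside the function as an array — can be illustrated with Python's own variadic syntax, where *args plays the role of the proposed VARIADIC array parameter. This only illustrates the calling convention, not the PostgreSQL resolution machinery.

```python
# Illustration of the variadic idea: trailing arguments are collapsed into
# one array-valued parameter, so a single definition replaces the
# foo(int), foo(int, int), foo(int, int, int) family.
def foo(first, *rest):
    # inside the function, the variadic part is visible as an array
    return first + sum(rest)

print(foo(1))        # 1
print(foo(1, 2))     # 3
print(foo(1, 2, 3))  # 6
```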
[HACKERS] proposal: plpgsql return execute ...
Hello I think RETURN QUERY is a successful idea. It should be completed with support for dynamic SQL. Syntax:

RETURN EXECUTE sqlstring [USING];

This is a shortcut for

FOR r IN EXECUTE sqlstring USING LOOP
    RETURN NEXT r;
END LOOP;

Regards Pavel Stehule
Re: [HACKERS] OSSP can be used in the windows environment now!
On Wed, Feb 27, 2008 at 09:46:14PM +0900, Hiroshi Saito wrote: What exactly is needed for building the required libuuid files? From what I can tell, the author has no binaries available, correct? Yes, both can be built. However, the msvc build is not official, regrettably, but it can be built without changing the source code. It is created as our module. http://winpg.jp/~saito/pg_work/OSSP_win32/msvc/ I have checked it by examination. Ok. Do you know if there are any plans to include this in the distribution? It would make life a whole lot easier. If not, perhaps we should include the win32.mak file in a subdir of our uuid module? I found a couple of other problems with your patch, but I've been able to fix those. Building a test with msvc now, and will write documentation. //Magnus
Re: [HACKERS] An idea for parallelizing COPY within one backend
On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! As far as I understand, pgloader starts N threads or processes that open up N individual connections to the server. In that case, moving the text-to-binary conversion from the backend into the loader won't give any additional performance, I'd say. The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... It seems like multiple backends should be able to take advantage of 2PC for transaction safety. Cheers, M
Re: [HACKERS] An idea for parallelizing COPY within one backend
A.M. wrote: On Feb 27, 2008, at 9:11 AM, Florian G. Pflug wrote: Dimitri Fontaine wrote: Of course, the backends still have to parse the input given by pgloader, which only pre-processes data. I'm not sure having the client prepare the data some more (binary format or whatever) is a wise idea, as you mentioned and wrt Tom's follow-up. But maybe I'm all wrong, so I'm all ears! As far as I understand, pgloader starts N threads or processes that open up N individual connections to the server. In that case, moving the text-to-binary conversion from the backend into the loader won't give any additional performance, I'd say. The reason that I'd love some within-one-backend solution is that it'd allow you to utilize more than one CPU for a restore within a *single* transaction. This is something that a client-side solution won't be able to deliver, unless major changes to the architecture of postgres happen first... It seems like multiple backends should be able to take advantage of 2PC for transaction safety. Yes, whatever is coordinating the multiple backends (a master backend? I haven't followed this thread closely) would then have to have logic to finish the prepared transactions if you crash after you've committed one but not all of them. IOW, it would need a mini transaction log of its own. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
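The coordination logic Heikki describes — prepare on every backend, durably record the commit decision, then finish each prepared transaction, with recovery replaying unfinished decisions — can be sketched as follows. Participants are simulated in-memory and every name is hypothetical; nothing here is the actual PostgreSQL two-phase-commit API.

```python
# Sketch of a 2PC coordinator with a "mini transaction log": after all
# participants have prepared, the commit decision is logged; recovery
# re-commits any decided-but-unfinished transaction. Illustrative only.
class Participant:
    def __init__(self):
        self.prepared, self.committed = set(), set()
    def prepare(self, gid):
        self.prepared.add(gid)
    def commit_prepared(self, gid):
        if gid in self.prepared:
            self.prepared.remove(gid)
            self.committed.add(gid)

class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = []                             # the mini transaction log
    def run(self, gid, crash_after_decision=False):
        for p in self.participants:               # phase 1: prepare everywhere
            p.prepare(gid)
        self.log.append(("decided-commit", gid))  # durable decision point
        if crash_after_decision:
            return                                # simulate a crash here
        self._finish(gid)
    def _finish(self, gid):
        for p in self.participants:               # phase 2: commit prepared
            p.commit_prepared(gid)
        self.log.append(("done", gid))
    def recover(self):
        done = {g for op, g in self.log if op == "done"}
        for op, gid in list(self.log):
            if op == "decided-commit" and gid not in done:
                self._finish(gid)

parts = [Participant(), Participant()]
coord = Coordinator(parts)
coord.run("load-chunk-1", crash_after_decision=True)
coord.recover()                                   # finishes the commit
print(all("load-chunk-1" in p.committed for p in parts))  # True
```

The log is exactly the piece Heikki points out the coordinator cannot do without: without the "decided-commit" record, recovery could not tell whether to commit or roll back the transactions left prepared on the participants.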
Re: [HACKERS] Proposal: wildcards in pg_service.conf
Albe Laurenz [EMAIL PROTECTED] writes: I'd like to extend the libpq service file by allowing wildcards, e.g. like this: [%] host=dbhost.mycompany.com dbname=% Such an entry would match all service parameters, and all occurrences of the wildcard to the right of a '=' would be replaced with the service parameter. This seems poorly thought out. How could you have any other service entries besides this one? What is the point of doing it like this and not simply overriding the service's database selection? The special case for LDAP makes it even more obvious that this is a kluge. regards, tom lane
Re: [HACKERS] An idea for parallelizing COPY within one backend
Brian Hurt wrote: Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: ... Neither the dealer, nor the workers would need access to the either the shared memory or the disk, thereby not messing with the one backend is one transaction is one session dogma. ... Unfortunately, this idea has far too narrow a view of what a datatype input function might do. Just for starters, consider enum input, which certainly requires catalog access. We have also explicitly acknowledged the idea that datatype I/O functions might try to store typmod-related data in some special catalog somewhere. Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY performance after adding an ENUM column to his table. I was thinking more along the line of letting a datatype specify a function void* ioprepare(typmod) which returns some opaque object specifying all that the input and output function needs to know. We could then establish the rule that input/output functions may not access the catalog, and instead pass them a pointer to that opaque object. All pretty pie-in-the-sky at the moment, though... regards, Florian Pflug
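The ioprepare() idea above can be sketched concretely: all catalog information an input function needs is gathered once, up front, into an opaque object, and the input function itself then runs with no catalog access at all — which is what would make it safe to run in a worker process. In this sketch the "catalog" is a plain dict standing in for a syscache lookup, and every name is hypothetical.

```python
# Sketch of ioprepare(): one-time catalog access builds an opaque object;
# the input function uses only that object, never the catalog. The enum
# case Tom raised is used as the example. Illustrative only.
CATALOG = {"mood": ["sad", "ok", "happy"]}  # enum labels, in definition order

def ioprepare_enum(typename):
    # one-time catalog access, done in the main backend before parsing starts
    labels = CATALOG[typename]
    return {label: ordinal for ordinal, label in enumerate(labels)}

def enum_in(opaque, text):
    # no catalog access: everything needed is in the opaque object
    try:
        return opaque[text]
    except KeyError:
        raise ValueError(f"invalid input value for enum: {text!r}")

mood_io = ioprepare_enum("mood")
print([enum_in(mood_io, v) for v in ["happy", "sad"]])  # [2, 0]
```

The open question Florian notes remains: this only works for types whose catalog needs can be enumerated up front from the type and typmod, which is why it is "pie-in-the-sky" for arbitrary user-defined types.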
Re: [HACKERS] Proposed changes to DTrace probe implementation
but putting these and other counters in context is what could be missing. Correlating a given (set of) stats with others (possibly outside of the application domain) is one of the assets offered by DTrace. Besides the generic transaction begin/start/end, it could also be helpful to see the number of parses, binds, executes per transaction, user, connection etc. And yes, I feel Tom is right in fearing that these things can be used in creative ways. However, is this not true for most benchmarks/results when one's objective is to show how perfect/better/whatever product/platform A behaves/is compared to B, C, etc.? One benefit of generalizing a subset of the DTrace probes is the possibility of creating generic DTrace scripts that can be used for many database installations. DTrace is great, programming DTrace scripts is fun (my view, and sure I am biased as a Sun employee :-), but it is not likely to be something a DBA would like to master. The availability of generic scripts does add value. BTW I wonder if we could somehow combine DTrace as a contextual tool with the counters provided through the stats interface. Any insight/ideas? --Paul. On 27-feb-2008, at 10:28, Magnus Hagander wrote: On Tue, Feb 26, 2008 at 03:48:28PM -0600, Robert Lor wrote: Gregory Stark wrote: I think both types of probes are useful to different people. I think certain higher level probes can be really useful to DBAs. Perhaps looking at the standard database SNMP MIB counters would give us a place to start for upward facing events people want to trace for databases in general. Great idea. I found this link for public RDBMS MIB http://www.mnlab.cs.depaul.edu/cgi-bin/sbrowser.cgi?HOST=OID=RDBMS-MIB!rdbmsMIB The stats in rdbmsSrvInfoTable is quite useful, and it's one of the tables that Oracle implements in their SNMP support. 
http://download-east.oracle.com/docs/cd/B14099_19/manage.1012/b16244/appdx_d_rdbms.htm Incidentally, most of that's already supported by the pg snmp provider, through the stats system. //Magnus - Paul van den Bogaard [EMAIL PROTECTED] ISV-E -- ISV Engineering, Opensource Engineering group Sun Microsystems, Inc phone: +31 334 515 918 Saturnus 1 extension: x (70)15918 3824 ME Amersfoort mobile: +31 651 913 354 The Netherlands fax: +31 334 515 001
Re: [HACKERS] An idea for parallelizing COPY within one backend
On Wed, Feb 27, 2008 at 9:26 PM, Florian G. Pflug [EMAIL PROTECTED] wrote: I was thinking more along the lines of letting a datatype specify a function void* ioprepare(typmod) which returns some opaque object specifying all that the input and output functions need to know. We could then establish the rule that input/output functions may not access the catalog, and instead pass them a pointer to that opaque object. Callers of IO functions don't always know which type they're dealing with - we had to go to some lengths to pass type information along with the enum value itself so that it could be looked up in the syscache in the output function. I think the main culprits are the P/Ls, but I think there was a security-related concern about passing the type through to the IO function as well. If you want to do something like this, it would certainly be possible to cache the enum info for a particular type, but you might want to have a separate set of io functions just for this case. On the plus side, if such a cache were to be used by IO generally, we could reimplement enums to just store the ordinal on disk and save a couple of bytes, like I wanted but was unable to do the first time around. :) Enums are an easy case, though, as there's very little data to deal with. I don't know about other UDTs out there - do any require more extensive catalog access? Cheers Tom
Re: [HACKERS] OSSP can be used in the windows environment now!
Hi. - Original Message - From: Magnus Hagander [EMAIL PROTECTED] Ok. Do you know if there are any plans to include this in the distribution? It would make life a whole lot easier. If not, perhaps we should include the win32.mak file in a subdir to our uuid module? Ahh, I don't have a good idea... a build with MinGW is required before win32.mak. Probably, it needs to be written to a document. I found a couple of other problems with your patch, but I've been able to fix those. Building a test with msvc now, and will write documentation. Thanks! Regards, Hiroshi Saito
Re: [HACKERS] An idea for parallelizing COPY within one backend
Alvaro Herrera [EMAIL PROTECTED] writes: Yeah, but it wouldn't take advantage of, say, the hack to disable WAL when the table was created in the same transaction. In the context of a parallelizing pg_restore this would be fairly easy to get around. We could probably teach the thing to combine table creation and loading steps into one action (transaction) in most cases. If that couldn't work because of some weird dependency chain, the data loading transaction could be done as BEGIN; TRUNCATE table; COPY table FROM stdin; ... COMMIT; which I believe already invokes the no-WAL optimization, and could certainly be made to do so if not. Obviously, pg_restore would have to be aware of whether or not it had created that table in the current run (else it mustn't TRUNCATE), but it would be tracking more or less exactly that info anyway to handle dependency ordering. regards, tom lane
Re: [HACKERS] An idea for parallelizing COPY within one backend
Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY performance after adding an ENUM column to his table. I am yet to be convinced that this is even theoretically a good path to follow. Any sufficiently large table could probably be partitioned and then we could use the parallelism that is being discussed for pg_restore without any modification to the backend at all. Similar tricks could be played by an external bulk loader for third party data sources. cheers andrew
Re: [HACKERS] An idea for parallelizing COPY within one backend
Andrew Dunstan wrote: Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY performance after adding an ENUM column to his table. I am yet to be convinced that this is even theoretically a good path to follow. Any sufficiently large table could probably be partitioned and then we could use the parallelism that is being discussed for pg_restore without any modification to the backend at all. Similar tricks could be played by an external bulk loader for third party data sources. I was just floating this as an idea; I don't know enough about the backend to know if it was a good idea or not. It sounds like not. Brian
[HACKERS] ResourceOwners for Snapshots? holdable portals
Hi, I'm toying around with the idea of tracking snapshots more accurately to be able to advance Xmin for read committed transactions. I think it's relatively easy to do it in the straightforward way, which is to just destroy snapshots in the spots where a snapshot variable goes out of scope. However, I've been thinking of doing it in a little more elaborate (and, AFAICS, better) way: having the ResourceOwner code be responsible for keeping track of snapshots. Offhand I don't see any big problem with that, although I admit I haven't yet tried any code. One thing that jumps out at me, however, is the handling of holdable portals. We currently just copy the portal's content into a Materialize node, and let the snapshot go away at transaction's end. This works, but ISTM we could improve that by keeping track of the portal's snapshot separately from the transaction -- that is to say, to hang it from the portal's ResourceOwner. This would allow us to avoid the Materialize node altogether, and just keep the xmin back until the portal's gone. Vacuum would, of course, not be able to clean up rows needed by the portal. I don't see this as a problem, but rather as an improvement. Thoughts? Also, is there anything else on the whole Snapshots-on-ResourceOwners idea that could be a showstopper? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] An idea for parallelizing COPY within one backend
Andrew Dunstan wrote: Florian G. Pflug wrote: Would it be possible to determine when the copy is starting that this case holds, and not use the parallel parsing idea in those cases? In theory, yes. In practice, I don't want to be the one who has to answer to an angry user who just suffered a major drop in COPY performance after adding an ENUM column to his table. I am yet to be convinced that this is even theoretically a good path to follow. Any sufficiently large table could probably be partitioned and then we could use the parallelism that is being discussed for pg_restore without any modification to the backend at all. Similar tricks could be played by an external bulk loader for third party data sources. That assumes that some specific bulkloader like pg_restore, pgloader or similar is used to perform the load. Plain libpq users would either need to duplicate the logic these loaders contain, or wouldn't be able to take advantage of fast loads. Plus, I'd see this as a kind of testbed for gently introducing parallelism into postgres backends (especially thinking about sorting here). CPUs gain more and more cores, so in the long run I fear that we will have to find ways to utilize more than one of those to execute a single query. But of course the architectural details need to be sorted out before any credible judgement about the feasibility of this idea can be made... regards, Florian Pflug
Re: [HACKERS] An idea for parallelizing COPY within one backend
Florian G. Pflug [EMAIL PROTECTED] writes: Plus, I'd see this as a kind of testbed for gently introducing parallelism into postgres backends (especially thinking about sorting here). This thinking is exactly what makes me scream loudly and run in the other direction. I don't want threads introduced into the backend, whether gently or otherwise. The portability and reliability hits that we'll take are too daunting. Threads that invoke user-defined code (as anything involved with datatype-specific operations must) are especially fearsome, as there is precisely 0 chance of that code being thread-safe. regards, tom lane
Re: [HACKERS] An idea for parallelizing COPY within one backend
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: Plus, I'd see this as a kind of testbed for gently introducing parallelism into postgres backends (especially thinking about sorting here). This thinking is exactly what makes me scream loudly and run in the other direction. I don't want threads introduced into the backend, whether gently or otherwise. The portability and reliability hits that we'll take are too daunting. Threads that invoke user-defined code (as anything involved with datatype-specific operations must) are especially fearsome, as there is precisely 0 chance of that code being thread-safe. Exactly my thinking. That is why I was looking for a way to introduce parallelism *without* threading. Though it's not so much the user-defined code that scares me, but rather the portability issues. The differences between NPTL and non-NPTL threads on linux alone make me shudder... What I was saying is that there might be a chance to get some parallelism without threading, by executing well-defined pieces of code with controlled dependencies in separate processes. COPY seemed like an ideal testbed for that idea, since the conversion of received lines into tuples seemed reasonably self-contained, and with few outside dependencies. If the idea can't be made to work there, it probably won't work anywhere. If it turns out that it does (with an API change for input/output functions) however, then it *might* be possible to apply it to other relatively self-contained parts in the future... To restate, I don't want threaded backends. Not in the foreseeable future at least. But I'd still love to see a single transaction using more than one core. regards, Florian Pflug
Re: [HACKERS] ResourceOwners for Snapshots? holdable portals
Alvaro Herrera [EMAIL PROTECTED] writes: We currently just copy the portal's content into a Materialize node, and let the snapshot go away at transaction's end. This works, but ISTM we could improve that by keeping track of the portal's snapshot separately from the transaction -- that is to say, to hang it from the portal's ResourceOwner. This would allow us to avoid the Materialize node altogether, and just keep the xmin back until the portal's gone. That's a pretty horrid idea: what if the query being executed by the portal has side-effects? You can't get away with not executing it to completion before you close the transaction. Not to mention that it depends on locks not just snapshots. As far as the general point goes, I had been thinking of managing snapshots in a central cache, because if you want to advance xmin intratransaction then some piece of code has to be aware of *all* the open snapshots in the backend; and the ResourceOwners can't do that conveniently because they're fairly independent. Or were you meaning that you would do that and on top of it have the ResourceOwners track references into the cache? regards, tom lane
Re: [HACKERS] proposal: plpgsql return execute ...
On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote: I think RETURN QUERY is a successful idea. It should be completed with support of dynamic SQL. Yeah, I can see that being useful. RETURN EXECUTE sqlstring [USING]; What is the USING clause for? -Neil
Re: [HACKERS] ResourceOwners for Snapshots? holdable portals
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: We currently just copy the portal's content into a Materialize node, and let the snapshot go away at transaction's end. This works, but ISTM we could improve that by keeping track of the portal's snapshot separately from the transaction -- that is to say, to hang it from the portal's ResourceOwner. This would allow us to avoid the Materialize node altogether, and just keep the xmin back until the portal's gone. That's a pretty horrid idea: what if the query being executed by the portal has side-effects? You can't get away with not executing it to completion before you close the transaction. Ah, excellent point -- I guess that's what I was missing. As far as the general point goes, I had been thinking of managing snapshots in a central cache, because if you want to advance xmin intratransaction then some piece of code has to be aware of *all* the open snapshots in the backend; and the ResourceOwners can't do that conveniently because they're fairly independent. Or were you meaning that you would do that and on top of it have the ResourceOwners track references into the cache? Yeah, I think there needs to be a separate list either way, but having references to it from ResourceOwners means there's no need to have extra cleanup calls at (sub)transaction commit/abort. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Required make version
Alvaro Herrera wrote: I think this can be used with a conditional like ifneq (,$(findstring order-only,$(.FEATURES))) ... endif Yes, that was my thought. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] OSSP can be used in the windows environment now!
Hiroshi Saito wrote: Hi. Ok. Do you know if there are any plans to include this in the distribution? It would make life a whole lot easier. If not, perhaps we should include the win32.mak file in a subdir to our uuid module? Ahh, I don't have a good idea... a build with MinGW is required before win32.mak. Probably, it needs to be written to a document. I take it you are in contact with them, since you helped them with the port? Can you ask them if they are interested in distributing that file? If not, what do other people think about adding this Makefile and a README file to our contrib directory? //Magnus
Re: [HACKERS] proposal: plpgsql return execute ...
On Wed, Feb 27, 2008 at 1:58 PM, Neil Conway [EMAIL PROTECTED] wrote: On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote: I think RETURN QUERY is a successful idea. It should be completed with support of dynamic SQL. Yeah, I can see that being useful. RETURN EXECUTE sqlstring [USING]; What is the USING clause for? parameter binding. see: http://archives.postgresql.org/pgsql-patches/2006-08/msg00267.php merlin
[HACKERS] Idea for minor tstore optimization
I notice that several of the call sites of tuplestore_puttuple() start with arrays of datums and nulls, call heap_form_tuple(), and then switch into the tstore's context and call tuplestore_puttuple(), which deep-copies the HeapTuple into the tstore. ISTM it would be faster and simpler to provide a tuplestore_putvalues(), which just takes the datum + nulls arrays and avoids the additional copy. -Neil
Re: [HACKERS] [PATCHES] 2WRS [WIP]
Referring to tuplesort.c and tuplestore.c BACKGROUND: Starting from dumptuples() [ tuplesort.c ] write functions move the tuple from one buffer to another in order to finally write it to a logical tape. Is there a way (even the most inefficient way) to use the current read/write functions provided by PostgreSQL in order to retrieve the first tuple of a certain run while performing External Sorting? NOTE: I need the first tuple in order to manipulate the whole corresponding run, tuple by tuple, since they are written sequentially in a run. Thanks for your attention. Regards, Manolo. -- From: [EMAIL PROTECTED] Sent: Tuesday, February 26, 2008 4:10 PM To: Jaime Casanova [EMAIL PROTECTED]; [EMAIL PROTECTED] Cc: Decibel! [EMAIL PROTECTED]; David Fetter [EMAIL PROTECTED]; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] For the joy of all of you: that's the correct WIP patch. At the moment it only tries to create runs using two heaps. Hope you can help me with writing those runs on tapes. I'd be very pleased to give you more details. Thanks for your time. Regards, Manolo. -- From: Jaime Casanova [EMAIL PROTECTED] Sent: Friday, February 22, 2008 5:30 AM To: [EMAIL PROTECTED] Cc: Decibel! [EMAIL PROTECTED]; Manolo _ [EMAIL PROTECTED]; David Fetter [EMAIL PROTECTED]; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] [PATCHES] 2WRS [WIP] On Thu, Feb 21, 2008 at 6:44 AM, [EMAIL PROTECTED] wrote: Hi. That's the latest release and refers to 8.3.0 and not to 8.2.5 as before. Hope you can tell me if I created it correctly please. no, it doesn't... ! 
/* GUC variables */ #ifdef TRACE_SORT bool trace_sort = false; #endif - #ifdef DEBUG_BOUNDED_SORT - bool optimize_bounded_sort = true; - #endif it seems you're removing something added in 8.3 -- regards, Jaime Casanova Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning. Richard Cook
Re: [HACKERS] proposal: plpgsql return execute ...
On 27/02/2008, Merlin Moncure [EMAIL PROTECTED] wrote: On Wed, Feb 27, 2008 at 1:58 PM, Neil Conway [EMAIL PROTECTED] wrote: On Wed, 2008-02-27 at 15:24 +0100, Pavel Stehule wrote: I think RETURN QUERY is a successful idea. It should be completed with support of dynamic SQL. Yeah, I can see that being useful. RETURN EXECUTE sqlstring [USING]; What is the USING clause for? parameter binding. see: http://archives.postgresql.org/pgsql-patches/2006-08/msg00267.php merlin I sent a modernized version: http://archives.postgresql.org/pgsql-patches/2007-10/msg00161.php Pavel
[HACKERS] new warning message
On IRC today someone brought up a problem in which users were still able to connect to a database after a REVOKE CONNECT ... FROM theuser. The reason theuser is still able to connect is because PUBLIC still has privileges to connect by default (AndrewSN was the one who answered this). Would it be reasonable to throw a warning if you revoke a privilege from some role, and that role inherits the privilege from some other role (or PUBLIC)? Regards, Jeff Davis
Re: [HACKERS] Varlena Type Creation
On Tue, Feb 26, 2008 at 06:19:48PM +0100, Dimitri Fontaine wrote: So... where do I start to create a varlena datatype which has to store the 3 following values: text prefix, char start, char end. It's not clear for me whether this is what I need to provide: typedef struct I see no-one responded to this: a varlena has no fixed header size, so you can't fit it in a structure anyway. Once you're passed a pointer you use the LEN/PTR macros to extract what you want. Not sure what the chars are for, but perhaps it would be easiest to treat it as a single text object with the two leading characters signifying something? Have a nice day, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] new warning message
Jeff Davis [EMAIL PROTECTED] writes: Would it be reasonable to throw a warning if you revoke a privilege from some role, and that role inherits the privilege from some other role (or PUBLIC)? This has been suggested and rejected before --- the consensus is it'd be too noisy. Possibly the REVOKE manual page could be modified to throw more stress on the point. regards, tom lane
[HACKERS] Snapshot Reuse
In Read Committed transactions we take snapshots much more frequently than transactions begin and commit. It would help scalability if we didn't need to re-take a snapshot. That's only helpful if the chance of seeing the same snapshot is relatively high. Now that we have virtual transactions we may more frequently find ourselves taking identical snapshots. If we had a counter that incremented each time the main snapshot altered in a meaningful way, we could set that atomically. We could then read this when we take a snapshot to see if it matches our existing snapshot; if so, then drop the lock quickly and continue with what we already have. I can see some downsides to this as well as potential benefits: * we ping the counter across CPUs - yes, we will, but that's probably better than pinging the whole procarray * this relies upon the rate of change of snapshots - need to do the math to see how often this might apply Not sure yet myself, but it seems worth recording in case it spurs an idea from someone else. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] Idea for minor tstore optimization
Neil Conway [EMAIL PROTECTED] writes: I notice that several of the call sites of tuplestore_puttuple() start with arrays of datums and nulls, call heap_form_tuple(), and then switch into the tstore's context and call tuplestore_puttuple(), which deep-copies the HeapTuple into the tstore. ISTM it would be faster and simpler to provide a tuplestore_putvalues(), which just takes the datum + nulls arrays and avoids the additional copy. Seems reasonable. Check whether tuplesort should offer the same, while you are at it. regards, tom lane
[HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
Hello. I am currently playing with the UUID data type and trying to use it to store data provided by a third-party (Hewlett-Packard) application. The problem is they format UUIDs as xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxxx, so I have to replace(text,'-','')::uuid for this kind of data. Nooow, the case is quite simple and it might be that there are other applications formatting UUIDs too liberally. I am working on a patch to support this format (yes, it is a simple modification). And in the meanwhile I would like to ask you what you think about it? Cons: Such a format is not standard. Pros: This will help UUID data type adoption. [1] While good applications format their data well, there are others which don't follow standards. Also I think it is easier for a human being to enter a UUID as 8 groups of 4 digits. Your thoughts? Should I submit a patch? Regards, Dawid [1]: My first thought when I received the error message was hey! this is not a UUID, it is too long/too short!, only later did I check that they just don't format it too well.
Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
Dawid, I am working on a patch to support this format (yes, it is a simple modification). I'd suggest writing a formatting function for UUIDs instead. Not sure what it should be called, though. to_char is pretty overloaded right now. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
I am working on a patch to support this format (yes, it is a simple modification). There was a proposal and a discussion regarding how this datatype should look before I started developing it. We decided to go with the format proposed in the RFC. Unless there is a strong case, I doubt any non-standard formatting will be accepted into core. IIRC we were also opposed to supporting Java-like formatted UUIDs back then. This is no different. Regards, Gevik.
Re: [HACKERS] UUID data format 4x-4x-4x-4x-4x-4x-4x-4x
Josh Berkus [EMAIL PROTECTED] writes: I am working on a patch to support this format (yes, it is a simple modification). I'd suggest writing a formatting function for UUIDs instead. That seems like overkill, if not outright encouragement of people to come up with yet other nonstandard formats for UUIDs. I think the question we have to answer is whether we want to be complicit in the spreading of a nonstandard UUID format. Even if we answer yes for this HP case, it doesn't follow that we should create a mechanism for anybody to do anything with 'em. That way lies the madness people already have to cope with for datetime data :-( regards, tom lane
Re: [HACKERS] OSSP can be used in the windows environment now!
Hi. - Original Message - From: Magnus Hagander [EMAIL PROTECTED] I take it you are in contact with them, since you helped them with the port? Can you ask them if they are interested in distributing that file? Yes. However, MSVC has not been discussed with them yet, because it needs to be considered together with the generation process (xx.in) of the source code. Furthermore, uuid_cli had one more problem... Although I don't have spare time now, it is taken as my TODO. If not, what do other people think about adding this Makefile and a README file to our contrib directory? If there is no contrary opinion, I believe that it will be helpful to many people. Regards, Hiroshi Saito
[HACKERS] Buildfarm member gypsy_moth seems not to like alignment patch
It looks like gypsy_moth has been failing like this: creating directory /export/home/tmp/pg-test/build-suncc/HEAD/pgsql.21325/src/test/regress/./tmp_check/data ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers/max_fsm_pages ... 32MB/204800 creating configuration files ... ok creating template1 database in /export/home/tmp/pg-test/build-suncc/HEAD/pgsql.21325/src/test/regress/./tmp_check/data/base/1 ... ok initializing pg_authid ... ok initializing dependencies ... ok creating system views ... Bus Error - core dumped child process exited with exit code 138 initdb: data directory /export/home/tmp/pg-test/build-suncc/HEAD/pgsql.21325/src/test/regress/./tmp_check/data not removed at user's request since I put in this patch: http://archives.postgresql.org/pgsql-committers/2008-02/msg00270.php This is unfortunate and surprising, since that patch was intended to prevent compilers from making unsafe alignment assumptions, but it sure looks like this compiler has instead added a new one. Could you poke into it --- at least get a stack trace from the core dump? regards, tom lane
Re: [HACKERS] Producer/Consumer Issues in the COPY across network
On Tue, 2008-02-26 at 12:29 +0100, Martijn van Oosterhout wrote: When we're running a COPY over a high latency link then network time is going to become dominant, so potentially, running COPY asynchronously might help performance for loads or initial Slony configuration. This is potentially more important on Slony where we do both a PQgetCopyData() and PQputCopyData() in a tight loop. When you check the packets being sent, are you showing only one record being sent per packet? If so, there's your problem. I've not inspected the packet flow. It seemed easier to ask. I also note that PQgetCopyData always returns just one row. Is there an underlying buffering between the protocol (which always sends one message per row) and libpq (which is one call per row)? It seems possible for us to request a number of rows from the server up to a preferred total transfer size. AIUI the server merely streams the rows to you, the client doesn't get to say how many :) Right, but presumably we generate a new message per PQgetCopyData() request? So my presumption is we need to wait for that to be generated each time? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] Batch update of indexes on data loading
Simon Riggs [EMAIL PROTECTED] wrote: The LOCK is only required because we defer the inserts into unique indexes, yes? No, not in the present pg_bulkload. It creates a new relfilenode like REINDEX, therefore an access exclusive lock is needed. When there are violations of unique constraints, all of the loading is rolled back at the end of loading. BTW, why does REINDEX require an access exclusive lock? Read-only queries are forbidden during the operation now, but I feel they are ok because REINDEX only reads existing tuples. Can we do REINDEX holding only a shared lock on the index? I very much like the idea of index merging, or put another way: batch index inserts. How big does the batch of index inserts have to be for us to gain benefit from this technique? Hmm, we might need to know *why* COPY with indexes is slow. If the major cause is searching for the position to insert, batch inserts will work well. However, if the cause is index splitting and the following random I/O, batch insertion cannot solve the problem; a rebuild is still required. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
[HACKERS] ES7000 Windows 2003 server 64bit processor
Hello All, We are facing some problems while downloading the PostgreSQL 8.2.4 version for 64-bit processors, for both the Windows and SUSE Linux ES-7000 partitions. All of the download URLs for the PostgreSQL 8.2.4 version are displaying the error message The webpage can't be found. Can you please let me know any archives for the PostgreSQL 8.2.4 version for 64-bit processors. PostgreSQL 8.2.4 Win32 Application I am trying to install postgresql-8.2.4 on an ES7000 with Windows Server 2003 (Service Pack 1). I tried to install it on the ES-7000 using the postgresql-8.2.msi installer: I executed the installation as Administrator (user group Administrators). I got the error Failed to run initdb: 128! with an error log that is empty. Though PostgreSQL claims that the PostgreSQL 8.2.4 win32 application supports 64-bit Windows processors, we are assuming the PostgreSQL 8.2.4 win32 application does not support the ES7000 Windows 2003 server 64-bit processor. Can you please help me with this. Regards Suresh