Re: [HACKERS] [GENERAL] Physical Database Configuration
> DB2 looks good. I have horrid, horrid memories of wrestling with the
> Oracle "extent" madness.

I think Oracle's extents came from their fixed-size data file legacy. In 9i the extent limits appear to be completely overridable and sometimes even ignored, such as the next extent size. I agree that the 128-extent limit was a pain, and the default for each new extent to be larger than the previous one created many problems.

Oracle also took physical abstraction one level beyond 'tablespaces'. I think if each tablespace pointed to a specific directory, that'd be sufficient for me. And since I envision the tablespace as an attribute of the table, that should take care of the 1GB file rollover issue, as the rollover would occur in the same directory as the first file.

Without having delved into the code yet, setting up entries for user default tablespaces and system information is probably at least as much work as getting a tablespace to point to a specific directory for the purposes of opening or creating files for an object.

My personal preference would be to have four tablespaces predefined as part of a new database, though initially they could all point to the same place:

SYSTEM
USER
TEMP
INDEXES

What about the concepts of a 'read-only' tablespace, or taking tablespaces offline?
--
Mike Nolan
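For illustration, a minimal sketch of what directory-backed tablespaces could look like at the SQL level. The syntax and names here are hypothetical for the time of this thread (PostgreSQL 8.0 later adopted something very close to it):

CREATE TABLESPACE indexes LOCATION '/disk2/pgsql/indexes';

CREATE TABLE mytable (id integer PRIMARY KEY, val text);

-- the tablespace is an attribute of the object, not part of its name
CREATE INDEX mytable_val_idx ON mytable (val) TABLESPACE indexes;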
Re: [HACKERS] Updating psql for features of new FE/BE protocol
Is it too late to suggest that there be a way to have output displayed on screen AND output to a file?

I've got my Oracle systems set up so that all sqlplus sessions do this, complete with using the process or session number as part of the output file name so each is unique. This gives me a running record of what I did when, which saves me a LOT of time if I want to view the results of some query I ran last week. I can delete or zip up files if I get short on disk space.
--
Mike Nolan
Re: [HACKERS] Updating psql for features of new FE/BE protocol
> [EMAIL PROTECTED] writes:
> > Is it too late to suggest that there be a way to have output displayed
> > on screen AND output to a file?
>
> tee perhaps?

Tee ALMOST does it. Try doing a \d while tee'ing the output, for example. I don't quite get everything back before it asks for the next input line; sometimes all that is missing is the prompt itself.

I haven't set up a 7.4 test system yet, but I've been looking into it in 7.3.3. It gives me something fairly harmless to work on as I learn more C.

I think tee may write straight to stdout, so it is probably intermingling with the writes from within psql. I'm not sure why sometimes it is only missing a line or two and other times it is missing several lines. There doesn't appear to be a way to set the popen on the \o command to unbuffered mode or to force a flush on the pipe (the equivalent of fflush).

I have also noticed that if I have timing on, the timing stats do not get sent to the output file, just to the screen. (That doesn't concern me at this point, it was just a side comment on screen vs file output.)

> This is irrelevant to what I'm doing, in any case, and it's not an itch
> I feel personally. Work on it yourself if you want it ...

I'm trying to, now I really feel like a rookie! :-)
--
Mike Nolan
Re: [HACKERS] Updating psql for features of new FE/BE protocol
> This is irrelevant to what I'm doing, in any case, and it's not an itch
> I feel personally. Work on it yourself if you want it ...

OK, I figured it out. :-) It's a fairly short patch in 7.3.3; what do I need to do to submit it for 7.4?

I also made a minor functional change that may need to be turned into an additional variant on echo: if ECHO is set to 'queries' and output is being sent other than to stdout (\o), it echoes the query to the output file or pipe. If that's too much of a change and this needs to be a separate option, I'd suggest something like setting ECHO to 'queryout'. To be complete, would it need to have a command line equivalent, too?

BTW, I figured out how to direct the output to a separate file each time. I put the following in .psqlrc:

\o |tee `echo psql_${$}.txt`
--
Mike Nolan
Re: [HACKERS] [GENERAL] Physical Database Configuration
> That should be
>
> Tablespaces
>   databases
>     schemas
>       objects
>
> with each of them implemented as a directory and data files under it. If we
> could get a quota check propagated in both directions, that would be pretty
> good, maybe with a warning when things start getting close to the limit.

I disagree. Just as you can have multiple schemas within one database, you can have multiple tablespaces within one database. And the tablespace is irrelevant as far as specifying an object is concerned. A fully qualified object would be database.schema.object, not tablespace.database.schema.object or database.tablespace.schema.object.
--
Mike Nolan
Re: [HACKERS] [GENERAL] Physical Database Configuration
> > Well, with above proposal, drop database should be as simple. It's just that
> > it would be more than one `rm -rf` rather than just one.
>
> Right, there would be potentially one per tablespace. The key point
> here is that the tablespace definitions are known cluster-wide, so a
> "DROP DATABASE x" command running in database y would still be able
> to figure out which subdirectories it needs to zap.

It sounds like you envision tablespaces (or at least the default tablespace) as being above databases in some contexts. Reducing other tablespaces to mere subdirectories under the 'base' directory for a database sounds like a practical implementation measure.

I presume that the 'global' directory is for stuff that is not specific to any one database within a database cluster. If so, there is an inconsistency in the current directory structure in that SOME global information is in the main /usr/local/pgsql/data directory (or equivalent) while other global information is in the global subdirectory.

Being able to zap a database with one or more 'rm -rf' commands assumes that there will be files from just ONE database permitted in any given tablespace, and ONLY files from that database. The former is probably a good thing to require and enforce; the latter is unenforceable but probably still advisable.

Is this doable within the time frame for the 7.4 feature freeze?
--
Mike Nolan
Re: [HACKERS] [GENERAL] Physical Database Configuration
> Shridhar Daithankar <[EMAIL PROTECTED]> writes:
> > On Thursday 26 June 2003 21:56, [EMAIL PROTECTED] wrote:
> >> Is this doable within the time frame for the 7.4 feature freeze?
>
> > Good question.
>
> ROTFL... the answer is no. Feature freeze is Tuesday, people. In
> practice, the time to start coding new stuff is already long past.
> Especially major new stuff.
>
> If you start now you might have something done for 7.5.

Forgive us, Tom, we are still learning how the cycle works. (Or at least I am.) I am also probably grossly underestimating the pervasiveness of implementing tablespaces.
--
Mike Nolan
[HACKERS] A portable code question
In the little fix I came up with for psql last night, I need to be able to ensure that something sent to a pipe (and then to stdout) completes before issuing the prompt directly to stdout.

I did this with "system('sleep 1');", but I'm fairly sure that is not portable, nor does it ENSURE completion.

What's the proper way to do this? And what's a good book on writing portable code?
--
Mike Nolan
Re: [HACKERS] A portable code question
> Assuming you're using file streams to write to the pipe, fflush() will do
> the trick.

The problem is that the pipe (from \o |tee) is intermingling writes to stdout by tee with direct writes to stdout from within psql. I do issue an fflush, because that's necessary to make the pipe do its thing, but the next line of code also does a write to stdout, and the pipe generally doesn't have time to complete its write to stdout, resulting in intermingled output. (fflush makes sure the pipe GETS the stream; it doesn't wait around to make sure it's DONE with it, probably because there's no way for whatever the pipe calls to report back when it is done.)

This is a bit of a hack, but adding an option to the \o code so that it writes simultaneously to the pipe and to stdout instead of using tee looks like a lot more work, especially since the code appears to have a couple of other places where intermingling to stdout is possible, especially if readline is used.

Throwing in "system('sleep 1');" was the way I resolved the timing question here, but that may not be portable enough for inclusion into the code base.
--
Mike Nolan
Re: Tablespaces (was Re: [HACKERS] [GENERAL] Physical Database Configuration)
> I thought Tablespaces were already implemented. Are they not?

Apparently not. A group has been formed to work on it, though.
--
Mike Nolan
Re: [HACKERS] [GENERAL] Physical Database Configuration
> Well, correct solution is to implement tablespaces on which objects like
> databases, tables and indexes can be put.

I've not looked at the SQL standard, but it seems to me like the order should be:

Databases
Tablespaces
Schemas
Objects (tables, indexes, functions, etc.)

And it really isn't hierarchical. As I understand them (based on my Oracle background), tablespaces, unlike schemas, do NOT create a layer of data abstraction. That is to say, while the same table name can exist in multiple schemas, only one instance of a given table name can exist within a given schema, regardless of what tablespace it is in. That makes the tablespace a property of an object.

Whether or not two databases can share tablespaces isn't clear to me. As a DBA I can think of good reasons why they probably shouldn't do so, but I'm not sure that is an absolute.

> I have no idea what is the status of that effort right now. You can search the
> archives or I hope this kicks a fresh discussion..:-)

I'm game, though I'm also not ready to lead such a project, probably not even the discussion on it.
--
Mike Nolan
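A sketch of the abstraction point, using hypothetical tablespace syntax and assuming schemas alpha and beta and tablespaces ts1 and ts2 already exist:

CREATE TABLE alpha.accounts (id integer) TABLESPACE ts1;
CREATE TABLE beta.accounts  (id integer) TABLESPACE ts1;  -- OK: different schemas
CREATE TABLE alpha.accounts (id integer) TABLESPACE ts2;  -- fails: "accounts"
                                                          -- already exists in
                                                          -- schema alpha
SELECT * FROM alpha.accounts;  -- the tablespace never appears in the name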
[HACKERS] initcap incompatibility issue
The initcap function is not completely consistent with Oracle's initcap function:

SELECT initcap('alex hyde-whyte');

In Oracle 9.2i this will return 'Alex Hyde-White', in PostgreSQL 7.3.3 it returns 'Alex Hyde-white'.

It looks like a relatively simple change to oracle_compat.c in backend/utils/adt, but is this a bugfix that can be made during the pre-beta period for 7.4 or does it need to wait?
--
Mike Nolan
Re: [HACKERS] initcap incompatibility issue
> The initcap function is not completely consistent with Oracle's initcap
> function:
>
> SELECT initcap('alex hyde-whyte');
>
> In Oracle 9.2i this will return 'Alex Hyde-White', in PostgreSQL 7.3.3
> it returns 'Alex Hyde-white'.

No, it doesn't change the 'y' to an 'i'; that's a typo in my earlier note.

As far as I can tell, not capitalizing the first letter after a dash is the only inconsistency with Oracle's implementation of this function.

If a patch is in order at this time, I will try to produce it. I need to learn to set up CVS and how to do that anyway. :-)
--
Mike Nolan
Re: [HACKERS] initcap incompatibility issue
> As far as I can tell, not capitalizing the first letter after a dash
> is the only inconsistency with Oracle's implementation of this function.

Wrong again. Oracle also capitalizes the first letter after a comma, semicolon, colon, period, and both a single and a double quote. (And that's all I've tested so far.) So I guess I need to write a program to test all possible combinations to see how incompatible the function is.

Making this change will be a larger patch than I had initially anticipated. That also brings into question whether this is really a bugfix or a specification change, a question which is relevant since we're in the feature freeze for 7.4.
--
Mike Nolan
Re: [HACKERS] initcap incompatibility issue
> > It sounds like Oracle is simply regexing for anything that ISN'T a letter
> > to initcap right after it. If that's the case, you could just regex too.
>
> Or more likely, use the appropriate ctype.h function (isalpha, probably).

Having tested it, Oracle capitalizes after all non-alphanumeric characters, so !isalnum() is the appropriate test. (That makes it a one-line patch on 7.3.3, which I've already tested.)

> AFAIK, our specification for this function is "be like Oracle", so it's
> a bug fix and fair game for 7.4. Of course, the sooner you get it in
> the more likely we'll see it that way ;-). Later in beta, only critical
> bugfixes will be accepted, and this one surely ain't very critical.

Now if I can just get CVS working on Red Hat 8 and remember how to build a patch, even a one-liner. :-)
--
Mike Nolan
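A quick way to exercise that rule in SQL once the patch is in; the expected values here simply follow the capitalize-after-any-non-alphanumeric behavior described above, and the test strings are illustrative:

SELECT initcap('alex hyde-whyte');       -- expected: Alex Hyde-Whyte
SELECT initcap('o''brien,smith;jones');  -- expected: O'Brien,Smith;Jones
SELECT initcap('e.e. "cummings"');       -- expected: E.E. "Cummings"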
Re: [HACKERS] did you read my mails ?
> This allows functions to read/write operating system files. Is this of
> interest to anyone?

Is this something that requires untrusted status, as it would if I wrote a function in perl to do the same thing?
--
Mike Nolan
[HACKERS] make clean fails (more info)
Sorry if I didn't mention it before, but this was on FreeBSD.
--
Mike Nolan
[HACKERS] Make clean fails
Make clean fails on an up-to-date CVS (as of 2:40 PM CDT on Sunday). Here's the final part of the output:

make[2]: Leaving directory `/home/nolan/beta/pgsql/contrib/tablefunc'
make[2]: Entering directory `/home/nolan/beta/pgsql/contrib/tips'
make[2]: Nothing to be done for `clean'.
make[2]: Leaving directory `/home/nolan/beta/pgsql/contrib/tips'
make[2]: Entering directory `/home/nolan/beta/pgsql/contrib/tsearch'
rm -f libtsearch.a
rm -f libtsearch.so libtsearch.so.0 libtsearch.so.0.0
rm -f tsearch.sql
rm -f crc32.o morph.o txtidx.o query.o gistidx.o rewrite.o
rm -f parser.c
rm -rf results tmp_check log
rm -f regression.diffs regression.out regress.out run_check.out
make[2]: Leaving directory `/home/nolan/beta/pgsql/contrib/tsearch'
make: *** tsearch2: No such file or directory. Stop.
make: Entering an unknown directory
make: Leaving an unknown directory
make[1]: *** [clean] Error 2
make[1]: Leaving directory `/home/n
--
Mike Nolan
[HACKERS] make check createdb fails
More problems with the latest CVS on FreeBSD:

make[3]: Leaving directory `/home/nolan/beta/pgsql/contrib/spi'
/bin/sh ./pg_regress --temp-install --top-builddir=../../.. --schedule=./parallel_schedule --multibyte=SQL_ASCII
== removing existing temp installation ==
== creating temporary installation ==
== initializing database system ==
== starting postmaster ==
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
--
Mike Nolan
Re: [HACKERS] Make clean fails
> [EMAIL PROTECTED] writes:
> > Make clean fails on an up-to-date CVS (as of 2:40 PM CDT on Sunday).
>
> > make: *** tsearch2: No such file or directory. Stop.
>
> I suspect you forgot "-d" in your cvs update commands. You really need
> both -d and -P to make cvs update behave reasonably ... I have no idea
> why they are not the default behavior.

Yep, that solved both the make clean and the coredump problems. Is that piece of information in the developers FAQ anywhere?
--
Mike Nolan
Re: [HACKERS] Make clean fails
> > Yep, that solved both the make clean and the coredump problems.
> > Is that piece of information in the developers FAQ anywhere?
>
> It's in the "how to use CVS" instructions ...

So it is. I probably read that before I got CVS working here, and it isn't mentioned (nor is that section of the docs referenced) in the CVS section of the Developer's FAQ.

I found both sections insufficient for me to get CVS working here, and since I am thinking about using it for another project, I picked up a copy of 'Essential CVS' to fill in some of the gaps in my knowledge.

Just part of the baptism of fire for a newbie, I guess. :-)
--
Mike Nolan
[HACKERS] Who is maintaining the dbf2pg contrib utility?
Is anyone actively maintaining the dbase/dbf2pg utility in the contrib library?

I have a couple of patches to fix bugs and one feature enhancement to suggest, but I'm not sure of the proper procedure for a contrib package. I've tried contacting the author at the address given in the package ([EMAIL PROTECTED]) with no response, but maybe that's an old address.
--
Mike Nolan
Re: [HACKERS] status of dbf2pg
I received the following note from the original author of dbf2pg:

> Date: Tue, 05 Aug 2003 18:43:22 +0400
> From: Maarten Boekhold <[EMAIL PROTECTED]>
> Subject: Re: status of dbf2pg
> To: [EMAIL PROTECTED]
>
> On 08/03/2003 06:55:01 AM nolan wrote:
> > What is the status of dbf2pg. There do not appear to have been any
> > updates to it in quite some time.
>
> No status. I created this around 1995/96 and haven't looked at it since.

There is a dbf2pg package on Debian that appears to have a higher version number, but I don't know what to do with a .deb file extension, and I'm getting ready to go out of town, so I won't have time to look into it until mid-August at the earliest.
--
Mike Nolan
[HACKERS] 9.3 RC1 psql encoding reporting inconsistently?
This is 9.3 RC1 on a Fedora 7 system. Why does \l report the encoding as SQL_ASCII and \set report it as UTF8?

psql (9.3rc1)
Type "help" for help.

postgres=# \l
                              List of databases
   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges
-----------+----------+-----------+---------+-------+-----------------------
 postgres  | postgres | SQL_ASCII | C       | C     |
 template0 | postgres | SQL_ASCII | C       | C     | =c/postgres          +
           |          |           |         |       | postgres=CTc/postgres
 template1 | postgres | SQL_ASCII | C       | C     | =c/postgres          +
           |          |           |         |       | postgres=CTc/postgres
(3 rows)

postgres=# \set
AUTOCOMMIT = 'on'
ECHO = 'queries'
PROMPT1 = '%/%R%# '
PROMPT2 = '%/%R%# '
PROMPT3 = '>> '
VERBOSITY = 'default'
VERSION = 'PostgreSQL 9.3rc1 on i686-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27), 32-bit'
DBNAME = 'postgres'
USER = 'postgres'
PORT = '5432'
ENCODING = 'UTF8'
postgres=#
--
Mike Nolan
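If the distinction matters here: \l shows each database's server-side encoding, while psql's ENCODING variable reflects the client encoding (client_encoding, typically derived from the locale). One way to compare the two directly:

SHOW server_encoding;  -- per-database encoding, what \l displays
SHOW client_encoding;  -- what psql's ENCODING variable reports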
Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, 乔志强 wrote:
>
>> Yes, increase wal_keep_segments. Even if you set wal_keep_segments to 64,
>> the amount of disk space for WAL files is only 1GB, so there is no need to
>> worry so much, I think. No?
>
> But when a transaction larger than 1GB...

Then you may need WAL space larger than 1GB as well. For replication to work, it seems likely that you may need to have sufficient WAL space to handle a row, possibly the entire transaction. But since a single statement can update thousands or millions of rows, do you always need enough WAL space to hold the entire transaction?

> So in sync streaming replication, if master delete WAL before sent to the
> only standby, all transaction will fail forever,
> "the master tries to avoid a PANIC error rather than termination of
> replication." but in sync replication, termination of replication is THE
> bigger PANIC error.

That's somewhat debatable. Would I rather have a master that PANICked or a slave that lost replication? I would choose the latter.

A third option, which may not even be feasible, would be to have the master fail the transaction if synchronous replication cannot be achieved, although that might have negative consequences as well.

> Another question:
> Does master send WAL to standby before the transaction commit ?

That's another question for the core team, I suspect. A related question is what happens if there is a rollback?
--
Mike Nolan
Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, Kevin Grittner wrote:
> Michael Nolan wrote:
>> On 4/11/12, 乔志强 wrote:
>
>>> But when a transaction larger than 1GB...
>>
>> Then you may need WAL space larger than 1GB as well. For
>> replication to work, it seems likely that you may need to have
>> sufficient WAL space to handle a row, possibly the entire
>> transaction. But since a single statement can update thousands
>> or millions of rows, do you always need enough WAL space to hold
>> the entire transaction?
>
> No.
>
>>> Does master send WAL to standby before the transaction commit ?
>
> Yes.
>
>> A related question is what happens if there is a rollback?
>
> PostgreSQL doesn't use a rollback log; WAL files can be reclaimed as
> soon as the work they represent has been persisted to the database
> by a CHECKPOINT, even if it is not committed. Because there can be
> multiple versions of each row in the base table, each with its own
> xmin (telling which transaction committed it) and xmax (telling
> which transaction expired it), visibility checking can handle the
> commits and rollbacks correctly. It also uses a commit log (CLOG),
> hint bits, and other structures to help resolve visibility. It is a
> complex topic, but it does work.

Thanks, Kevin. That does lead to a question about the problem that started this thread, though. How does one determine how big the WAL space needs to be to not cause streaming replication to fail? Or maybe this is a bug after all?
--
Mike Nolan
Re: [HACKERS] [GENERAL] [streaming replication] 9.1.3 streaming replication bug ?
On 4/11/12, Fujii Masao wrote:
> On Wed, Apr 11, 2012 at 3:31 PM, 乔志强 wrote:
>> So in sync streaming replication, if master delete WAL before sent to the
>> only standby, all transaction will fail forever,
>> "the master tries to avoid a PANIC error rather than termination of
>> replication." but in sync replication, termination of replication is THE
>> bigger PANIC error.
>
> I see your point. When there are backends waiting for replication, the WAL
> files which the standby might not have received yet must not be removed. If
> they are removed, replication keeps failing forever because required WAL
> files don't exist in the master, and then waiting backends will never be
> released unless replication mode is changed to async. This should be avoided.
>
> To fix this issue, we should prevent the master from deleting the WAL files
> including the minimum waiting LSN or bigger ones. I'll think more and
> implement the patch.

With asynchronous replication, does the master even know if a slave fails because of a WAL problem? And does/should it care?

Isn't there a separate issue with synchronous replication? If it fails, what's the appropriate action to take on the master? PANICking it seems to be a bad idea, but having transactions never complete because they never hear back from the synchronous slave (for whatever reason) seems bad too.
--
Mike Nolan
Re: [HACKERS] Deprecating RULES
On 10/12/12, Josh Berkus wrote:
> I realize you weren't around when we removed row OIDs, but I was *still*
> getting flack from that in 2008. And we lost entire OSS projects to
> other databases because of removing row OIDs. And those were marked
> deprecated for 3 years before we removed them.

FWIW, the documentation for 9.2 still mentions OIDs and the 'default_with_oids' parameter. In what release was it announced they would be removed, and in what release were they removed?
--
Mike Nolan
Re: [HACKERS] Slow temporary tables when using sync rep
On Mon, Apr 16, 2012 at 6:27 PM, Thom Brown wrote:
> Hi,
>
> I've noticed that when using synchronous replication (on 9.2devel at
> least), temporary tables become really slow:

Since temporary tables are only present until the session ends (or possibly only until a commit), why are they replicated at all?

BTW, should we have an entry in the index for 'temporary tables'?
--
Mike Nolan
Re: [HACKERS] Temporary tables under hot standby
What is the use case for temporary tables on a hot standby server?

Perhaps this is a noobie question, but it seems to me that a hot standby server's use by *applications* or *users* should be limited to transactions that don't alter the database in any form.

However, I can see where temporary tables might be needed at the system level (if not already available) in order to prepare more efficient plans for some complex read-only queries.
--
Mike Nolan
[HACKERS] problem/bug in drop tablespace?
While researching a problem reported on the -general list by a user who lost a disk containing his index tablespace, I ran into something, but I'm not sure if it is a serious bug or just an inconsistency in how \d shows tables. Here are the steps I took:

1. Create a new database 'MYDB' and connect to it.
2. Create a new tablespace 'MYTBLSP'.
3. Create a table 'MYTABLE' and populate it.
4. Create an index 'MYIND' on that table, with the index in the new tablespace, MYTBLSP.

Now, exit psql and delete the files in the tablespace directory created in step 2, simulating the problem the user had. Trying to execute an SQL command on the table MYTABLE will, as expected, generate an error.

Now, drop tablespace MYTBLSP. This will produce an error, but it will delete the tablespace according to \db.

Recreate tablespace MYTBLSP. Regenerate the index on MYTABLE. Queries will work on this table again, as expected.

Now, here's the problem I ran into: the index will be rebuilt in tablespace MYTBLSP, but \d on table MYTABLE will not show the index as being in that tablespace.
--
Mike Nolan
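A minimal SQL transcription of those steps (object names and the location path are illustrative):

CREATE DATABASE mydb;
\c mydb
CREATE TABLESPACE mytblsp LOCATION '/home/postgres/mytb';
CREATE TABLE mytable (id integer PRIMARY KEY, val text);
INSERT INTO mytable SELECT g, 'row ' || g FROM generate_series(1, 1000) g;
CREATE INDEX myind ON mytable (val) TABLESPACE mytblsp;
-- then, outside psql: remove the files under the tablespace directory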
Re: [HACKERS] problem/bug in drop tablespace?
The last portion of my original post got edited out by mistake. The tests I ran were on version 9.1.3, running Fedora 14, kernel 2.6.35.14-106.fc14-i686.

It seems to me that DROP TABLESPACE should check to see if there are references in the system catalog to the tablespace before dropping it, not just that the tablespace itself is empty. That way it would have thrown an error when I tried to drop the tablespace.

A somewhat separate issue is what to do when a tablespace is inaccessible, such as due to a disk failure. The thread on -general that prompted my tests was a relatively easy one to suggest how to repair, because the lost tablespace only had indexes in it. But that's not exactly a -hackers issue, more a question of better backup protocols.
--
Mike Nolan
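For reference, a hand-rolled version of the catalog check being suggested (a sketch; it only covers the current database, so every database in the cluster would need the same check):

SELECT c.relname, c.relkind
  FROM pg_class c
  JOIN pg_tablespace t ON c.reltablespace = t.oid
 WHERE t.spcname = 'mytblsp';
-- any rows returned mean objects here still reference the tablespace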
Re: [HACKERS] problem/bug in drop tablespace?
On 5/9/12, Albe Laurenz wrote:
> I cannot reproduce this on 9.1.3:

Odd, I've tried it another two times, with similar results to my initial post. Here's what I get, starting with the point where I deleted the files in the tablespace:

mytest=# select * from mytable;
ERROR:  could not open file "pg_tblspc/289477766/PG_9.1_201105231/289477763/289477785": No such file or directory

mytest=# \d mytable
    Table "public.mytable"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val    | text    |
Indexes:
    "mytable_pkey" PRIMARY KEY, btree (id), tablespace "mytblspc"

mytest=# drop tablespace mytblspc;
WARNING:  could not open directory "pg_tblspc/289477766/PG_9.1_201105231": No such file or directory
DROP TABLESPACE
Time: 16.460 ms

mytest=# \d mytable
    Table "public.mytable"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val    | text    |
Indexes:
    "mytable_pkey" PRIMARY KEY, btree (id)

mytest=# create tablespace mytblspc location '/home/postgres/mytb';
CREATE TABLESPACE
Time: 42.396 ms

mytest=# \d mytable
    Table "public.mytable"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val    | text    |
Indexes:
    "mytable_pkey" PRIMARY KEY, btree (id)

mytest=# reindex table mytable;
REINDEX
Time: 112.981 ms

mytest=# \d mytable
    Table "public.mytable"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val    | text    |
Indexes:
    "mytable_pkey" PRIMARY KEY, btree (id)

Here's what's in the mytb directory now:

[postgres@romaine PG_9.1_201105231]$ ls -lR
.:
total 4
drwx------. 2 postgres postgres 4096 May  9 13:22 289477763

./289477763:
total 16
-rw-------. 1 postgres postgres 16384 May  9 13:22 289477790

It appears that the index has been rebuilt in the mytblspc tablespace, though \d mytable does not show that. I get the same results whether I rebuild the specific index as you did or reindex the table, as I did.

I'm running 9.1.3 built from the source code, not a distribution.
--
Mike Nolan
Re: [HACKERS] problem/bug in drop tablespace?
I see one potential difference between your results and mine.

When I rebuild the tablespace, I wind up with the same filename/OID as before; I'm not sure you do.
--
Mike Nolan
Re: [HACKERS] problem/bug in drop tablespace?
On 5/11/12, Albe Laurenz wrote:
> Michael Nolan wrote:
>> I see one potential difference between your results and mine.
>>
>> When I rebuild the tablespace, I wind up with the same filename/OID as
>> before, I'm not sure you do.
>
> Right. That's strange.
> Usually OIDs get incremented, so you shouldn't end up with the same
> OID for the new tablespace.
>
> Can you provide a complete testcase?

I thought I had, until you were unable to reproduce it. :-)
--
Mike Nolan
Re: [HACKERS] problem/bug in drop tablespace?
On 5/11/12, Michael Nolan wrote:
> On 5/11/12, Albe Laurenz wrote:
>> Michael Nolan wrote:
>>> I see one potential difference between your results and mine.
>>>
>>> When I rebuild the tablespace, I wind up with the same filename/OID as
>>> before, I'm not sure you do.
>>
>> Right. That's strange.
>> Usually OIDs get incremented, so you shouldn't end up with the same
>> OID for the new tablespace.
>>
>> Can you provide a complete testcase?
>
> I thought I had, until you were unable to reproduce it. :-)

My plan at this point is to wait until beta 1 of 9.2 is out, then see if I can reproduce the problem there.
--
Mike Nolan
Re: [HACKERS] problem/bug in drop tablespace?
On Fri, May 11, 2012 at 10:03 PM, Tom Lane wrote:
> Well, the question to me is exactly how much good it will do to stop
> deletion of the pg_tablespace entry, if the underlying files are gone.
> I'm having a hard time getting excited about expending cycles on that.

There could be multiple reasons why the underlying files are not there, such as a filesystem that isn't currently mounted for some reason.

It seems prudent to throw an error on drop tablespace if there are references to that tablespace in the catalog, or perhaps to require a 'force' clause to override any errors, but it probably isn't something most DBAs would run into very often.

Thanks for figuring it out, Tom.
--
Mike Nolan
Re: [HACKERS] Re: [COMMITTERS] pgsql: Send new protocol keepalive messages to standby servers.
On 6/2/12, Tom Lane wrote:
> Robert Haas writes:
>> On the other hand, if we simply say "PostgreSQL computes the
>> replication delay by subtracting the time at which the WAL was
>> generated, as recorded on the master, from the time at which it is
>> replayed by the slave" then, hey, we still have a wart, but it's
>> pretty clear what the wart is and how to fix it, and we can easily
>> document that. Again, if we could get rid of the failure modes and
>> make this really water-tight, I think I'd be in favor of that, but it
>> seems to me that we are in the process of expending a lot of energy
>> and an even larger amount of calendar time to create a system that
>> will misbehave in numerous subtle ways instead of one straightforward
>> one. I don't see that as a good trade.
>
> Well, okay, but let's document "if you use this feature, it's incumbent
> on you to make sure the master and slave clocks are synced. We
> recommend running NTP." or words to that effect.

What if the two servers are in different time zones?
--
Mike Nolan
Re: [HACKERS] Ability to listen on two unix sockets
On Mon, Jun 11, 2012 at 4:47 PM, Peter Eisentraut wrote:
> On sön, 2012-06-10 at 17:24 -0400, Robert Haas wrote:
>>>> and also affects the naming of any UNIX sockets created.
>>>
>>> Why would that matter? If you configure M ports and N Unix socket
>>> locations, you get M*N actual sockets created.
>>
>> ...I *seriously* doubt that this is the behavior anyone wants.
>> Creating M sockets per directory seems patently silly.
>
> How else would it work?
>
> If I say, syntax aside, listen on "ports" 5432 and 5433, and use socket
> directories /tmp and /var/run/postgresql, then a libpq-using client
> would expect to be able to connect using
>
> -h /tmp -p 5432
> -h /tmp -p 5433
> -h /var/run/postgresql -p 5432
> -h /var/run/postgresql -p 5433
>
> So you do need to create M*N sockets.
>
> I don't really see a problem with that.

What about entries in pg_hba.conf? Will they need to be able to specify both the directory and the port number?
--
Mike Nolan
[HACKERS] New/Revised TODO? Gathering actual read performance data for use by planner
In the TODO list is this item:

* Modify the planner to better estimate caching effects

Tom mentioned this in his presentation at PGCon, and I also chatted with Tom about it briefly afterwards.

Based on last year's discussion of this TODO item, it seems thoughts have been focused on estimating how much data is being satisfied from PG's shared buffers. However, I think that's only part of the problem. Specifically, read performance is going to be affected by:

1. Reads fulfilled from shared buffers.
2. Reads fulfilled from system cache.
3. Reads fulfilled from disk controller cache.
4. Reads from physical media.

#4 is further complicated by the type of physical media for that specific block. For example, reads that can be fulfilled from an SSD are going to be much faster than ones that access hard drives (or even slower types of media). System load is going to impact all of these as well.

Therefore, I suggest that an alternative to the above TODO may be to gather performance data without knowing (or, more importantly, without needing to know) which of the above sources fulfilled the read.

This data would probably need to be kept separately for each table or index, as some tables or indexes may be mostly or fully in cache or on faster physical media than others, although in the absence of other data about a specific table or index, data about other relations in the same tablespace might be of some use.

Tom mentioned that the cost of doing multiple system time-of-day calls for each block read might be prohibitive; it may also be that the data is too coarse on some systems to be truly useful (e.g., the epoch time in seconds).

If this data were available, successive plans for the same query could be significantly different (and thus have different actual performance), based on what has happened recently, so these statistics would have to be relatively short term and updated frequently, but without becoming computational bottlenecks.

The problem is one I'm interested in working on.
--
Mike Nolan
Re: [HACKERS] New/Revised TODO? Gathering actual read performance data for use by planner
On Wed, May 25, 2011 at 11:18 AM, Robert Haas wrote:
> I basically agree. There have been several recent discussions of this
> topic on both -hackers and -performance; it is likely that the TODO
> needs to be updated with some more recent links.

Anything to help the NKOTB get up to speed would be appreciated, though I still think it is not just a 'caching' issue.

The question I hesitated to ask in Ottawa was: So, what information would you like, and what would you do with it?
--
Mike Nolan
Re: [HACKERS] generate_series() Interpretation
On Mon, Jun 27, 2011 at 1:38 PM, David E. Wheeler wrote:
> Yeah, which is why I said it was subject to interpretation. Of course
> there's no way to tell generate_series() which to use, which is what I
> figured.

generate_series() is doing exactly what it was designed to do; the imprecision in adding '1 month' to something that may or may not have been intended to be 'last day of the month' is a limitation in the interval code.

One way to change this would be to implement another interval type, such as 'full_month', which would take a date that is known to be the last day of the month and make it the last day of the appropriate month. If the starting date is NOT the last day of a month, the existing logic would suffice.

Or you can do as I have done and create your own last_day() function that takes any date and makes it the last day of that month, and apply it to the output of generate_series().
--
Mike Nolan
no...@tssi.com
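A minimal sketch of such a last_day() function (one possible implementation, not necessarily the one described above):

CREATE FUNCTION last_day(date) RETURNS date AS $$
    -- first day of the given month, plus one month, minus one day
    SELECT (date_trunc('month', $1) + interval '1 month' - interval '1 day')::date;
$$ LANGUAGE sql IMMUTABLE;

-- '2011-01-31' plus n months drifts to the 28th after February,
-- but last_day() snaps each result back to the end of its month:
SELECT last_day(d::date)
  FROM generate_series('2011-01-31'::date, '2011-12-31', interval '1 month') AS g(d);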
Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas wrote:
> But if that's what you want, just don't put your data in different
> databases in the first place. That's what schemas are for.

Sadly, DBAs don't always have the ability to put all their data in one database, even if that is what schemas are for. The ability to do cross-database (most likely cross-server as well) queries would address a lot of real-world problems.
--
Mike Nolan
no...@tssi.com
Re: [HACKERS] What Would You Like To Do?
The lists all seem to be focusing on the things that the developers would like to add to PostgreSQL. What about some things that users or ISPs might like to have, and thus perhaps something that companies might actually see as worth funding? For example:

A fully integrated ability to query across multiple databases, possibly on multiple servers, something Oracle has had for nearly two decades.

Complete isolation at the user level, allowing an ISP to support multiple independent customers on a server without having to fiddle with multiple back ends each running on a separate port, a feature that MySQL has had for as far back as I can recall, and one of the reasons ISPs are more likely to offer MySQL than PostgreSQL.

The ability to restore a table from a backup file to a different table name in the same database and schema.

A built-in report writer, capable of things like column totals. (SQL*Plus has this, even though it isn't very pretty.)
--
Mike Nolan
Re: [HACKERS] What Would You Like To Do?
On Tue, Sep 13, 2011 at 12:26 PM, Joshua D. Drake wrote:
> On 09/13/2011 10:13 AM, Michael Nolan wrote:
>> The lists all seem to be focusing on the things that the developers
>> would like to add to PostgreSQL, what about some things that users or
>> ISPs might like to have, and thus perhaps something that companies might
>> actually see as worth funding?
>
> Well just my own two cents ... but it all depends on who is doing the
> funding. At this point 80% of the work CMD codes for Pg (or tertiary
> projects and modules) is funded by companies. So let's not assume that
> companies aren't funding things. They are.

But perhaps if a few 'commercial' features were on the wish list, there would be more companies willing to fund development? The developers get a bit of what they want to work on, the production users get a bit of what they need, everybody's happy.

>> For example:
>>
>> A fully integrated ability to query across multiple databases, possibly
>> on multiple servers, something Oracle has had for nearly two decades.
>
> That isn't the approach to take. The fact that Oracle has it is not a
> guarantee that it is useful or good. If you need to query across databases
> (assuming within the same cluster) then you designed your database wrong and
> should have used our SCHEMA support (what Oracle calls Namespaces) instead.

This is the difference between developers and real-world users. Real-world users may not have the ability, time or resources to redesign their databases just because that's the 'best' way to do something. Will it be the most efficient way to do it? Almost certainly not.

I've been involved in a few corporate mergers, and there was a short-term need to do queries on the combined databases while the tiger team handling the IT restructuring figured out how (or whether) to merge the databases together. (One of these happened to be an Oracle/Oracle situation; it was a piece of cake even though the two data centers were 750 miles apart and the table structures had almost nothing in common. Another was a two-week headache, the third was even worse!)

In a perfect world, it would be nice if one could do combined queries linking a PostgreSQL database with an Oracle one, or a MySQL one, too. Because sometimes, that's what you gotta do. Even something that is several hundred times slower is going to be faster than merging the databases together. When I do this today, I have to write a program (in perl or php) that accesses both databases and merges the data by hand.

>> The ability to restore a table from a backup file to a different table
>> name in the same database and schema.
>
> This can be done but agreed it is not intuitive.

Can you elaborate on that a bit, please? The only way I've been able to do it is to edit the dump file to change the table name. That's not very practical with a several-gigabyte dump file, even less so with one that is much larger. If this capability already exists, is it documented?

>> A built-in report writer, capable of things like column totals.
>> (SqlPlus has this, even though it isn't very pretty.)
>
> There are a billion and one tools that do this without us having to
> reinvent the wheel. Why would we support that?

There are other databases out there, too; why reinvent the wheel by working on PostgreSQL? :-) The question should be, would this be USEFUL?
--
Mike Nolan
Re: [HACKERS] What Would You Like To Do?
On Tue, Sep 13, 2011 at 2:55 PM, Joshua D. Drake wrote:
> On 09/13/2011 11:51 AM, Michael Nolan wrote:
>
>>>> The ability to restore a table from a backup file to a different
>>>> table name in the same database and schema.
>>>
>>> This can be done but agreed it is not intuitive.
>>
>> Can you elaborate on that a bit, please? The only way I've been able to
>> do it is to edit the dump file to change the table name. That's not
>> very practical with a several gigabyte dump file, even less so with one
>> that is much larger. If this capability already exists, is it documented?
>
> You use the -Fc method, extract the TOC and edit just the TOC (so you don't
> have to edit a multi-gig file)

That is, at best, a bit obscure. I've wondered at times if the tar format (-Ft) option would have any benefits here, though it appears to have significant downsides.

A downside of either method may be that I can't predict in advance when I will want to do a restore of a single table from a backup file, so I'd have to always use that method of generating the file.

I did propose an extension to pg_restore a couple of months ago to add an option to rename a table as it is restored, but that seemed to have generated no interest.

Maybe an external tool that reads a pg_dump file looking for a specific table and writes that portion of the dump file to a separate file, changing the table name, would be easier? It'd probably have to handle most of or all of the different pg_dump formats, but that doesn't sound like an unachievable goal.
--
Mike Nolan
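For readers who want the recipe being alluded to, a sketch of the custom-format/TOC approach (database, file, and table names are illustrative; this selects which objects get restored, while the renaming still has to happen separately):

pg_dump -Fc mydb > mydb.dump          # custom-format archive
pg_restore -l mydb.dump > mydb.toc    # extract the table of contents
# edit mydb.toc: comment out (prefix with ';') every entry except
# the ones for the table you want
pg_restore -L mydb.toc -d mydb mydb.dump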
Re: [HACKERS] pg_xlog -> pg_xjournal?
Why not take a simpler approach and create a zero-length file, in directories that should not be fiddled with by non-experts, using a file name something like "DO.NOT.DELETE.THESE.FILES"?

No, it won't prevent the incredibly stupid from doing incredibly stupid things; nothing will.
--
Mike Nolan
Re: [HACKERS] On columnar storage
On Thu, Jun 11, 2015 at 7:03 PM, Alvaro Herrera wrote:
> We hope to have a chance to discuss this during the upcoming developer
> unconference in Ottawa. Here are some preliminary ideas to shed some
> light on what we're trying to do.
>
> I've been trying to figure out a plan to enable native column stores
> (CS or "colstore") for Postgres. Motivations:
>
> * avoid the 32 TB limit for tables
> * avoid the 1600 column limit for tables
> * increased performance

Are you looking to avoid all hardware-based limits, or would using a 64-bit row pointer be possible? That would give you 2^64, or about 1.8e19, unique rows over whatever granularity/uniqueness you use (per table, per database, etc.).
--
Mike Nolan
Re: [HACKERS] On columnar storage
On Sun, Jun 14, 2015 at 10:30 AM, Tomas Vondra wrote:
>> Are you looking to avoid all hardware-based limits, or would using a 64
>> bit row pointer be possible? That would give you 2^64 or 1.8 E19 unique
>> rows over whatever granularity/uniqueness you use (per table, per
>> database, etc.)
>> --
>> Mike Nolan.
>
> I don't think the number of tuples is the main problem here, it's the
> number of pages a single relation can have. Looking at the number of rows
> as a direct function of TID size is misleading, because the TID is split
> into two fixed parts - page number (32b) and tuple number (16b).
>
> For the record, 2^48 is 281,474,976,710,656, which ought to be enough for
> anybody, but we waste a large part of that because we assume there might
> be up to 2^16 tuples per page, although the actual limit is way lower
> (~290 for 8kB pages, and ~1200 for 32kB pages).
>
> So we can only have ~4 billion pages, which is where the 32TB limit comes
> from (with 32kB pages it's 128TB).
>
> Longer TIDs are one straightforward way to work around this limit,
> assuming you add the bits to the 'page number' field. Adding 16 bits (thus
> using 64-bit pointers) would increase the limit 2^16 times, to about 2048
> petabytes (with 8kB pages). But that of course comes with a cost, because
> you have to keep those larger TIDs in indexes etc.
>
> Another option might be to split the 48 bits differently, by moving 5 bits
> to the page number part of the TID (so that we expect ~2048 tuples per
> page at most). That'd increase the limit to 1PB (4PB with 32kB pages).
>
> The column store approach is somewhat orthogonal to this, because it
> splits the table vertically into multiple pieces, each stored in a
> separate relfilenode and thus using a separate sequence of page numbers.
>
> And of course, the usual 'horizontal' partitioning has a very similar
> effect (separate filenodes).
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Thanks for the reply. It's been a while since my last data structures course (1971), but I do remember a few things.

I have never personally needed more than 1500 columns in a table, but can see how some might. Likewise, the 32TB limit hasn't affected me yet, either. I doubt either ever will.

Solving either or both of those seems like it may at some point require a larger bit space for (at least some) TIDs, which is why I was wondering if a goal here is to eliminate all (practical) limits. It probably doesn't make sense to force all users to use that large bit space (with the associated space and performance penalties).

If there's a way to do this, then you are all truly wizards.

(This all reminds me of how the IPv4 bit space was parcelled up into Class A, B, C and D addresses, at a time when people thought 32 bits would last us forever. Maybe 128 bits actually will.)
--
Mike Nolan
Re: [HACKERS] Syntax bug? Group by?
Mark Woodward wrote:

> select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;
>
> I still assert that there will always only be one row to this query. This
> is an aggregate query, so all the rows with ycis_id = 15 will be
> aggregated. Since ycis_id is the identifying part of the query, it should
> not need to be grouped.
>
> My question, is it a syntactic technicality that PostgreSQL asks for a
> "group by," or a bug in the parser?

I think your point is that every non-aggregate column in the results of the query also appears in the where clause and is given a single value there, so conceivably, an all-knowing, all-powerful postgres could recognize this and do the implied GROUP BY on these columns.

I'm not in a position to give a definitive answer on this, but I suspect that adjusting the query parser/planner to allow an implied GROUP BY either gets prohibitively complicated, or fits too much of a special case to be worth implementing.

select ycis_id, some_other_id, min(tindex), avg(tindex)
from y
where ycis_id = 15
group by some_other_id;

Here, postgres would have to use the group by you specified, and also recognize the single-valued constant assigned to ycis_id. Maybe not too bad, but:

select ycis_id, some_other_id, min(tindex), avg(tindex)
from y
where ycis_id = some_single_valued_constant(foo, bar)
group by some_other_id;

In this case, postgres doesn't know whether some_single_valued_constant() will really return the same single value for every tuple.

Ultimately, as more complex queries are introduced, it would become a lot simpler for the query writer to just specify the GROUP BY columns instead of trying to guess them from the where clause.

Final note: I could also see situations where an implied GROUP BY would silently allow a poorly written query to execute, instead of throwing an error that suggests to the query writer that they did something wrong.
--
Nolan Cafferky
Software Developer, IT Department
RBS Interactive
[EMAIL PROTECTED]
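For concreteness, assuming a table y(ycis_id integer, tindex integer), the rejected and accepted forms look like this (a sketch):

-- rejected: ycis_id is neither aggregated nor grouped
select ycis_id, min(tindex), avg(tindex) from y where ycis_id = 15;

-- accepted: grouping on the constrained column still yields the single expected row
select ycis_id, min(tindex), avg(tindex)
from y
where ycis_id = 15
group by ycis_id;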
Re: [HACKERS] Coding style question
I think Tom stated it pretty well:

> When the variable is going to be set anyway in straight-line code at the
> top of the function, then it's mostly a matter of taste whether you set
> it with an initializer or an assignment.

The key phrase is "set anyway in straight-line code at the top of the function".

> (I don't go so far as to introduce artificial scopes just for the sake
> of nesting variable declarations).
>
> I don't introduce artificial scopes either. However, I do try to declare
> variables in the most-tightly-enclosing scope. For example, if a variable
> is only used in one branch of an if statement, declare the variable inside
> that block, not in the enclosing scope.

Good...

This may not inform the current conversation at all, but a while back I went on a cross-compiler compatibility binge for all of my active projects, and I found that some compilers (*cough* Borland *cough*) had some very strange compiler/run time errors unless all variables were declared at the top of the function, before any other code gets executed. For better or for worse, I started strictly declaring all variables in this manner, with initialization happening afterward, and the behavior has stuck with me. I don't know whether any compilers used for postgres builds still have this issue - it's been a few years.

> I also find that if you're declaring a lot of variables in a single
> block, that's usually a sign that the block is too large and should be
> refactored (e.g. by moving some code into separate functions). If you
> keep your functions manageably small (which is not always the case in
> the Postgres code, unfortunately), the declarations are usually pretty
> clearly visible.

I couldn't agree more. Insert emphatic agreement here. Refactoring into smaller functions or doing a bit of object orientation almost always solves that readability problem for me.
--
Nolan Cafferky
Software Developer, IT Department
RBS Interactive
[EMAIL PROTECTED]