Re: [HACKERS] Views, views, views! (long)
On Friday the 6th of May 2005, Mr. Treat opined:

> I also don't think it is any harder to learn to query the
> system tables than it would be to learn to query these new
> views (with a few caveats that I will come back to) and it
> might actually be better.

Admin tools are in a sense already a GUI "view" into the database, so whether they elect to use these new views seems rather immaterial to this discussion, to me at least.

The more important aspect to me is as a user. As someone pointed out last week (but it seems to have been lost in the tool discussion), these views give the user easy access to system information from the command line. This is where typing a multi-join query against the system catalog is error-prone and a PITA.

An even bigger point is that these views can be used in scripts or procedures, where the results can be fed into another query or put into a table. The '\d' command might show me a list of tables, but I for one find the Oracle-style views such as ALL_TABLES, etc. more useful. I've always found the slash-style commands useful for showing what's there, but using them to build reports or extract data for use in other queries is painful compared to having queriable views of the same data.

Darren

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
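A concrete illustration of the difference (a sketch using the stock pg_tables view; any new views would read similarly):

```sql
-- By hand: join the system catalogs directly, and get the join right.
SELECT n.nspname AS schemaname, c.relname AS tablename
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema');

-- Through a view: the same information, trivially reusable in scripts,
-- joins, or INSERT ... SELECT into a reporting table.
SELECT schemaname, tablename
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');
```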
Re: [HACKERS] Minor TODO list changes
You are correct. I would envision being able to alter a table "read-write" at any point. If the index(es) on the table are completely filled from being created in read-only mode, then the affected pages should be split with the default fillfactor when/if a row is inserted or updated. Altering the table back to read-only would simply leave the index as-is, with a few pages not filled but still fully functional. The insert-only index is intriguing as well, though.

Darren

-----Original Message-----
From: Simon Riggs [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 04, 2004 1:50 PM
To: Rod Taylor
Cc: Darren King; Bruce Momjian; PostgreSQL Development
Subject: Re: [HACKERS] Minor TODO list changes

On Thu, 2004-11-04 at 18:15, Rod Taylor wrote:
> > At some point it would also be nice to be able to mark tables as
> > read-only and then any indexes created on that table after that
> > would have a fillfactor of 100%. Then I'd be able to load the
> > table, alter it to be read-only, then add the appropriate indexes
> > that are automatically compacted.
>
> If it's read-only, you might as well remove a number of the internal
> fields used for visibility as well.

Yes, we should be able to save 16 bytes/row for an INSERT-only table that would still allow multiple simultaneous COPY jobs against it, with no more than 1 statement per transaction. I'd like to create an additional tuple layout using the tuple version bits, so you'd be able to set a flag at CREATE TABLE time to use that as an optional alternative to the standard one. UPDATEs and DELETEs would be permanently disallowed against such tables, just as if privileges had not been granted. TRUNCATE would still work, however. Call it something like NOMODIFY?

You could then alter VACUUM to skip such tables, so you'd be able to do a VACUUM of a database without scanning all of the largest tables in your system. Darren's first idea would then be interpreted as automatically setting FILLFACTOR=100 on indexes of NOMODIFY tables.
Darren's second idea was dynamic: i.e. an ALTER TABLE READONLY after loading, rather than using a different tuple layout, which would need to be chosen before loading, probably at CREATE TABLE time. Darren's second idea of READONLY tables is related to, but not necessarily the same as, the NOMODIFY concept that Rod brings up.

--
Best Regards, Simon Riggs
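The workflow under discussion might look something like the sketch below. None of this syntax exists; the READONLY/READWRITE keywords are purely placeholders for the proposal:

```sql
-- Hypothetical sketch of the proposed READONLY workflow:
CREATE TABLE big_fact (id int4, payload text);
COPY big_fact FROM '/data/load.dat';          -- bulk load first
ALTER TABLE big_fact SET READONLY;            -- proposed: freeze the table
CREATE INDEX big_fact_id ON big_fact (id);    -- would then pack pages 100% full
ALTER TABLE big_fact SET READWRITE;           -- proposed: back to normal; later
                                              -- inserts split pages with the
                                              -- default fillfactor
```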
Re: [HACKERS] Minor TODO list changes
In my data warehousing situation, I'd like to be able to specify that the indexes be as compact as possible (fillfactor = 100%) in order to hit as few index pages as possible. For summary tables there will not be any more inserts or deletions, so the index will not change either. In that case, there's no point in leaving any extra room for page splitting.

At some point it would also be nice to be able to mark tables as read-only, and then any indexes created on that table after that would have a fillfactor of 100%. Then I'd be able to load the table, alter it to be read-only, then add the appropriate indexes, which would be automatically compacted.

Darren

-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 04, 2004 12:19 PM
To: Simon Riggs
Cc: [EMAIL PROTECTED]
Subject: Re: [HACKERS] Minor TODO list changes

Simon Riggs wrote:
> On Thu, 2004-11-04 at 16:51, Bruce Momjian wrote:
> > OK, I updated all your items.
>
> Thanks
>
> > I removed fillfactor because I thought I was the only one who
> > thought it was valuable, and as I remember it was mostly useful for
> > ISAM, which we don't support. Can you think of a use for a non-100%
> > fillfactor?
>
> I was under the impression the factor was 67% for data loaded on the
> leading edge of an index, and 50% for other INSERTs.
> (backend/access/nbtree/nbtinsert.c)
>
> Not sure, without checking, what CREATE INDEX and COPY do, but I'm
> guessing it is similar?
>
> Other RDBMSs use a higher leading-edge/standard fill factor.
>
> There are situations where I'd want to set it at 90%, or even 100%. If
> I know the update rate is likely to be zero, then I'd like my indexes
> to fit in 10-30% less memory and disk, please.
>
> Or am I missing something?

Oh, good point. I was thinking of just the leaf pages, which I think are 100% filled.
--
  Bruce Momjian                  |  http://candle.pha.pa.us
  [EMAIL PROTECTED]              |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
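The compact-index half of this request did eventually become expressible directly, via the fillfactor index storage parameter added in PostgreSQL 8.2 (table and column names below are invented for illustration):

```sql
-- fillfactor as an index storage parameter (PostgreSQL 8.2 and later);
-- 100 packs leaf pages completely, leaving no free space for future splits.
CREATE INDEX summary_idx ON summary_table (month, area)
    WITH (fillfactor = 100);
```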
Re: [HACKERS] contrib vs. gborg/pgfoundry for replication solutions
> I agree with the notion that "contrib" be removed
> from the main distribution. There is, however, a
> disconnect between supporting projects and the main system.
>
> Take a look at the www.postgresql.org web site.
> Most people visually filter out the side bars. I've
> been looking over the effectiveness of paid advertisements
> in the various search engines, and the numbers are clear:
> the paid links at the top of the page get the hits
> while the side links get almost completely ignored. What I
> think this means is that people looking for projects will
> simply filter out any reference to gborg.
>
> What would be good is to bring gborg a little closer to
> PostgreSQL by putting a link in the main site menu.

Rather than that, what would be even better IMHO is to bring "gborg" a little closer to English, or to a word that makes sense to people visiting the PostgreSQL site. Is it "g-b-org" or "g-borg" or what? Is it a carry-over from the Great Bridge days, or some lame hacker reference to the Borg? I understand the reference to Great Bridge, but do newbies?

The same goes for "pgfoundry". It might be obvious to the people already using PostgreSQL, but those aren't the people who need help finding something at the web site. What does a foundry have to do with computer projects? We're not associated with an iron-works or Bethlehem Steel; we're a database project trying to attract and keep visitors. If users can't find what they want at our site, they'll find it somewhere else or move on to another product that helps them do what they want.

Instead of "gborg projects", why not just call it what it is: "supported projects"?

Darren
Re: [HACKERS] Dreaming About Redesigning SQL
> "Bob" <[EMAIL PROTECTED]> spewed forth...
>
> We achieved 8 times the performance with exactly the same
> hardware. What the hell is this idiot talking about us
> relying on hardware? He is a moron. You will do everyone
> a favour if you just bounce him off the bottom of your
> killfile.
> ...
> [profanity-laced responses deleted]
> ...

It's about time to take this discussion private, guys. It has long since stopped being on-topic to hacking PostgreSQL internals.

Darren
Re: [HACKERS] suggestions to improve postgresql suitability for data-mining
> You want to process all invoices to count them
> and to sum up the amounts on a per month/area/type
> basis. The initial data size is in GB, but the
> size of the expected result is in KB (namely 2 data
> for each 100 areas * 12 months * 4 types).

The key to handling large datasets for data mining is pre-aggregation based on the smallest time frame needed for details. I'd suggest running these large queries once and storing the results in other tables, then writing a set of functions to work with those aggregate tables. There is no sense in summing up the same set of static data more than once if you can help it.

Darren
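A sketch of what that looks like in practice (the invoices table and its columns are invented for illustration):

```sql
-- Aggregate the detail rows once, at the finest granularity ever needed:
CREATE TABLE invoice_summary AS
SELECT date_trunc('month', invoice_date) AS month,
       area,
       type,
       count(*)    AS invoice_count,
       sum(amount) AS total_amount
FROM   invoices
GROUP  BY 1, 2, 3;

-- Reports then read kilobytes from the summary instead of
-- gigabytes of detail:
SELECT month, sum(total_amount) AS monthly_total
FROM   invoice_summary
GROUP  BY month;
```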
Re: [HACKERS] Index location patch for review
> > Attached is a patch that adds support for specifying a location for
> > indexes via the "create database" command.
> >
> > I believe this patch is complete, but it is my first.
>
> This patch allows index locations to be specified as
> different from data locations. Is this a feature direction
> we want to go in? Comments?

Having the table and index on separate drives can do wonders for i/o performance. :)

darrenk
RE: [HACKERS] Performance TODO items
> 3) I am reading the Solaris Internals book and there is mention of a
> "free behind" capability with large sequential scans. When a large
> sequential scan happens that would wipe out all the old cache entries,
> the kernel detects this and places its previous pages first
> on the free list. For our code, if we do a sequential scan of a table
> that is larger than our buffer cache size, I think we should detect
> this and do the same. See http://techdocs.postgresql.org for my
> performance paper for an example.
>
> New TODO entries are:
>
> * Order duplicate index entries by tid
> * Add queue of backends waiting for spinlock
> * Add free-behind capability for large sequential scans

So why do we cache sequentially-read pages at all? Or at least, why not have an option to control it? Oracle (to the best of my knowledge) does NOT cache pages read by a sequential scan, for at least two reasons/assumptions (two being all that I can recall):

1. Caching pages for sequential scans over sufficiently large tables will just cycle the cache. The pages cached at the end of the query will be the last N pages of the table, so when the same sequential query is run again, the scan from the beginning of the table will start flushing the oldest cached pages, which are more than likely going to be the ones needed at the end of the scan, etc., etc. In a multi-user environment, the effect is worse.

2. Concurrent or consecutive queries in a dynamic database will not generate plans that use the same sequential scans, so they will tend to thrash the cache.

Now there are some databases where the same general queries are run time after time, and caching the pages from sequential scans does make sense; but in larger, enterprise-type systems, indices are created to speed up the most-used queries, and the sequential cache entries only serve to clutter the cache and flush the useful pages.
Is there any way that caching pages read in by a sequential scan could be made a configurable option? Any chance someone could run pgbench on a test system set up to not cache sequential reads?

Darren
RE: [HACKERS] What I do with PostgreSQL
>> I am pumping about 200gb a week through the pg database,
>> and our estimated database size is something like 4tb by
>> the end of the year.
>
> Can anyone say 'Woof!'?

Amen, Lamar. I was trying to think of something myself besides 'Wow!'...

As a side note, there's a blurb in the July 16, 2001 Interactive Week about the MySQL AB vs. NuSphere spat, and the last paragraph of the article gives a very favorable nod towards PostgreSQL. I quote (any typos are mine):

"Analysts said MySQL must find a way to generate a development community and support if it wants to compete with another open source database, PostgreSQL, distributed by Red Hat and Great Bridge."

The article doesn't say who the "analysts" are, but the implication that MySQL isn't up to competing with PostgreSQL was interesting to my eyes! :)

Darren
RE: [HACKERS] Re: Re: "--tuning" compile and runtime option (?)
> I can't see how any configure option would be faster or
> better than the existing command line / config file parameters
> -- it would only serve to make things harder to deal with IMHO.
> "Tuning" PostgreSQL is pretty simple, and is explained pretty
> well throughout the manual (especially in the section titled
> "Understanding Performance"). We have -S, -B and the fsync
> options, and that's about it, right?
> All are explained in the manual and are as easy to use as anyone could
> ask... Any OS tuning should be left up to the administrator,
> as that's what administrators are for :-)

Someday, maybe other parameters (such as some of the optimizer's default values) will be configurable/changeable on the fly. It would seem silly to have to stop/start the server to change those. That's just an example; I'm sure there are other internal values that admins would like to tweak without recompiling or restarting. It would be much nicer to log in as the superuser to do it rather than shut down my site.

darrenk
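This did in fact come to pass: later releases expose the optimizer's default values as run-time parameters that any session can change with plain SQL, no restart required. For example:

```sql
-- Optimizer parameters are plain run-time settings in later releases:
SET random_page_cost = 2.0;   -- change for this session only
SHOW random_page_cost;        -- inspect the current value
RESET random_page_cost;       -- back to the configured default
```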
RE: [HACKERS] 7.1 pg_dump fails for user-defined types (release stopper?)
> A more promising idea is to hack function creation
> so that the OID assigned to the function is lower
> than the OIDs assigned to any shell types created
> when the function is defined. Or we could try to
> hack pg_dump to fix this, but that doesn't seem
> appetizing.

Requiring OID ordering would open up a new can of worms. What happens if the user does a drop/create on the function after creating it? The function could potentially be recreated with a higher OID, and then the user would be in the same situation. If the system requires (or works around) creation ordering when creating functions and types, ISTM that pg_dump should have to do the same.

darrenk
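The drop/create hazard looks roughly like this (names are illustrative; "complex" stands in for any user-defined type whose I/O functions are declared before the type itself):

```sql
-- Declaring an I/O function for a not-yet-defined type creates a shell type:
CREATE FUNCTION complex_in(opaque) RETURNS complex
    AS 'complex.so' LANGUAGE 'c';       -- shell type "complex" made here

-- If the user later drops and re-creates the function...
DROP FUNCTION complex_in(opaque);
CREATE FUNCTION complex_in(opaque) RETURNS complex
    AS 'complex.so' LANGUAGE 'c';
-- ...the function's OID is now higher than the shell type's, and any
-- fix based on OID ordering is defeated again.
```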
RE: [HACKERS] Bug in CREATE OPERATOR
> > CREATE OPERATOR testbit (
> >     leftarg = bitset,
> >     rightarg = int4,
> >     procedure = testbit,
> >     commutator = testbit
> > );
>
> Now we have a big problem, as the DROP OPERATOR command
> cannot delete the illegally named operator.

Have you tried deleting it directly from pg_operator instead of using DROP OPERATOR?

Darren
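Something along these lines should work; this is a raw catalog hack, superuser only, and it assumes the bogus operator is the only pg_operator row with that name, so inspect before deleting:

```sql
-- Look before you leap:
SELECT oid, oprname FROM pg_operator WHERE oprname = 'testbit';

-- Then remove the offending row directly:
DELETE FROM pg_operator WHERE oprname = 'testbit';
```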
RE: [HACKERS] Hash index on macaddr -> crash
> We could fix this either by adding a new hash function to support
> macaddr, or by removing the pg_amXXX entries that claim macaddr is
> hashable. Either change will not take effect without an initdb,
> however, and I'm loath to force one now that we've started beta.

How about creating an SQL statement that will make the change and putting a blurb about it in the README, INSTALL and/or FAQ? This wouldn't require an initdb and would let people have the fix. For things like this that update existing fields (vs. adding/deleting fields hard-wired for use in the backend), it should work, no?

Darren