Re: [PERFORM] [SQL] sql performance and cache
Chris Faulkner [EMAIL PROTECTED] writes: I am seeing this message in my logs. bt_fixroot: not valid old root page That's not good. I'd suggest reindexing that index. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[PERFORM] Large Text Search Help
Hi, I am trying to design a large text search database. It will have upwards of 6 million documents, along with meta data on each. I am currently looking at tsearch2 to provide fast text searching and also playing around with different hardware configurations. 1. With tsearch2 I get very good query times up until I insert more records. For example with 100,000 records tsearch2 returns in around 6 seconds, with 200,000 records tsearch2 returns in just under a minute. Is this due to the indices fitting entirely in memory with 100,000 records? 2. As well as whole word matching i also need to be able to do substring matching. Is the FTI module the way to approach this? 3. I have just begun to look into distibuted queries. Is there an existing solution for distibuting a postgresql database amongst multiple servers, so each has the same schema but only a subset of the total data? Any other helpful comments or sugestions on how to improve query times using different hardware or software techniques would be appreciated. Thanks, Mat ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL
Date: Sun, 12 Oct 2003 13:30:45 -0700 From: Josh Berkus [EMAIL PROTECTED] To: Nick Barr [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: go for a script! / ex: PostgreSQL vs. MySQL Message-ID: [EMAIL PROTECTED] This would be parameters such as the block size and a few other compile time parameters. If we can get to some of these read-only parameters than that would make this step easier, certainly for the new recruits amongst us. Actually, from my perspective, we shouldn't bother with this; if an admin knows enough to set an alternate blaock size for PG, then they know enough to tweak the Conf file by hand. I think we should just issue a warning that this script: 1) does not work for anyone who is using non-default block sizes, There was some talk, either on this list or freebsd-performance about setting the default block size for PostgreSQL running on FreeBSD to be 16k because of performance reasons. That is: *default* for the port, user is not asked. So an automagical method to scale non-default block sizes is a very needed thing. 2) may not work well for anyone using unusual locales, optimization flags, or other non-default compile options except for language interfaces. Depends on what you consider 'unusual'? I hope not things like iso8859-x (or, to be exact, European languages) :) -- Logic is a systematic method of coming to the wrong conclusion with confidence. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Sun performance - Major discovery!
Jeff, My first concern with the -fast option is that it makes an executable that is specific for the platform on which the compilation is run unless other flags are given. My second concern is the effect it has on IEEE floating point behavior w.r.t. rounding, error handling, And my third concern is that if you use -fast, all other code must be compiled and linked with the -fast option for correct operation, this includes any functional languages such as perl, python, R,... That is a pretty big requirement for a default compilation flag. Ken Marshall On Thu, Oct 09, 2003 at 12:07:20PM -0400, Jeff wrote: On Thu, 9 Oct 2003, Bruce Momjian wrote: 52 seconds to 19-20 seconds Wow, that's dramatic. Do you want to propose some flags for non-gcc Solaris? Is -fast the only one? Is there one that suppresses those warnings or are they OK? Well. As I said, I didn't see an obvious way to hide those warnings. I'd love to make those warnings go away. That is why I suggested perhaps printing a message to ensure the user knows that warnings may be printed when using sunsoft. -fast should be all you need - it picks the best settings to use for the platform that is doing the compile. -- Jeff Trout [EMAIL PROTECTED] http://www.jefftrout.com/ http://www.stuarthamm.net/ ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] One or more processor ?
Do you know of any RDBMS that actually will execute a single query on multiple processors? SQL Server does in a sense. It can split a query onto multiple threads (which could possible use multiple processors) and then brings the results from the threads into one and then sends the results to the client. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL
Ivan, There was some talk, either on this list or freebsd-performance about setting the default block size for PostgreSQL running on FreeBSD to be 16k because of performance reasons. That is: *default* for the port, user is not asked. So an automagical method to scale non-default block sizes is a very needed thing. Hmmm ... possibly. My concern is that if someone uses a very non-default value, such as 256K, then they are probably better off doing their own tuning because they've got an unusual system. However, we could easily limit it to the range of 4K to 32K. Of course, since there's no GUC var, we'd have to ask the user to confirm their block size. I'm reluctant to take this approach because if the user gets it wrong, then the settings will be *way* off ... and possibly cause PostgreSQL to be unrunnable or have out of memory crashes. Unless there's a way to find it in the compiled source? 2) may not work well for anyone using unusual locales, optimization flags, or other non-default compile options except for language interfaces. Depends on what you consider 'unusual'? I hope not things like iso8859-x (or, to be exact, European languages) :) On second thought, I'm not sure what an unusual locale would be. Scratch that. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Large Text Search Help
Mat, 1. With tsearch2 I get very good query times up until I insert more records. For example with 100,000 records tsearch2 returns in around 6 seconds, with 200,000 records tsearch2 returns in just under a minute. Is this due to the indices fitting entirely in memory with 100,000 records? Maybe, maybe not. If you want a difinitive answer, post your EXPLAIN ANALYZE results with the original query. I assume that you have run VACUUM ANALYZE, first? Don't bother to respond until you have. 2. As well as whole word matching i also need to be able to do substring matching. Is the FTI module the way to approach this? Yes. 3. I have just begun to look into distibuted queries. Is there an existing solution for distibuting a postgresql database amongst multiple servers, so each has the same schema but only a subset of the total data? No, it would be ad-hoc. So far, Moore's law has prevented us from needing to devote serious effort to the above approach. Any other helpful comments or sugestions on how to improve query times using different hardware or software techniques would be appreciated. Read the archives of this list. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] further testing on IDE drives
On Tue, 14 Oct 2003, Tom Lane wrote: scott.marlowe [EMAIL PROTECTED] writes: open_sync was WAY faster at this than the other two methods. Do you not have open_datasync? That's the preferred method if available. Nope, when I try to start postgresql with it set to that, I get this error message: FATAL: invalid value for wal_sync_method: open_datasync This is on RedHat 9, but I have the same problem on a RH 7.2 box as well. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL
Josh Berkus [EMAIL PROTECTED] writes: Unless there's a way to find it in the compiled source? See pg_controldata. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
[PERFORM] free space map usage
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Is there any way to determine how much of the free space map is currently in use?(ie. where and what it is tracking?) I vacuum on a regular basis but I never hold in terms of disk space usage. I jacked up the free space map pages but this doesn't appear to be working. shared_buffers = 29400# 2*max_connections, min 16 max_fsm_relations = 1000# min 10, fsm is free space map max_fsm_pages = 1000 # min 1000, fsm is free space map - -- Jeremy M. Guthrie Systems Engineer Berbee 5520 Research Park Dr. Madison, WI 53711 Phone: 608-298-1061 Berbee...Decade 1. 1993-2003 -BEGIN PGP SIGNATURE- Version: GnuPG v1.2.2 (GNU/Linux) iD8DBQE/jCo4qtjaBHGZBeURAj9EAKCL+tiioPO5K1YM1sn62yS0L1Ry5QCfVifq 22s22gFNFHAHquS+iiUZO6s= =AQ2Y -END PGP SIGNATURE- attachment: archive.png ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL
This would be parameters such as the block size and a few other compile time parameters. If we can get to some of these read-only parameters than that would make this step easier, certainly for the new recruits amongst us. Actually, from my perspective, we shouldn't bother with this; if an admin knows enough to set an alternate blaock size for PG, then they know enough to tweak the Conf file by hand. I think we should just issue a warning that this script: 1) does not work for anyone who is using non-default block sizes, There was some talk, either on this list or freebsd-performance about setting the default block size for PostgreSQL running on FreeBSD to be 16k because of performance reasons. That is: *default* for the port, user is not asked. Real quick, this isn't true, the block size is tunable, but does not change the default. You can set PGBLOCKSIZE to the values 16K or 32K to change the block size, but the default remains 8K. http://lists.freebsd.org/pipermail/freebsd-database/2003-October/000111.html -sc -- Sean Chittenden ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] free space map usage
Jeremy M. Guthrie [EMAIL PROTECTED] writes: Is there any way to determine how much of the free space map is currently i= n=20 use?(ie. where and what it is tracking?) I vacuum on a regular basis but I= =20 never hold in terms of disk space usage. Not in 7.3 AFAIR. In 7.4 a full-database VACUUM VERBOSE will end with the info you want: regression=# vacuum verbose; ... much cruft ... INFO: free space map: 11 relations, 144 pages stored; 272 total pages needed DETAIL: Allocated FSM size: 1000 relations + 2 pages = 178 kB shared memory. VACUUM regression=# This tells me I'm only using about 1% of the FSM space (272 out of 2 page slots). I jacked up the free space map=20 pages but this doesn't appear to be working. You know you have to restart the postmaster to make those changes take effect, right? regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Any issues with my tuning...
On Mon, 2003-10-13 at 15:43, David Griffiths wrote: Here are part of the contents of my sysctl.conf file (note that I've played with values as low as 60 with no difference) kernel.shmmax=14 kernel.shmall=14 This is only a system-wide limit -- it either allows the shared memory allocation to proceed, or it does not. Changing it will have no other effect on the performance of PostgreSQL. - Index Scan using comm_ent_usr_acc_id_i on commercial_entity ce (cost=0.00..4787.69 rows=78834 width=24) (actual time=0.02..55.64 rows=7991 loops=1) Interesting that we get this row count estimate so completely wrong (although it may or may not have anything to do with the actual performance problem you're running into). Have you run ANALYZE on this table recently? If so, does increasing this column's statistics target (using ALTER TABLE ... ALTER COLUMN ... SET STATISTICS) improve the row count estimate? -Neil ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [SQL] [PERFORM] sql performance and cache
On Tue, 14 Oct 2003, Wei Weng wrote: On Sat, 11 Oct 2003, Christopher Kings-Lynne wrote: I have two very similar queries which I need to execute. They both have exactly the same from / where conditions. When I execute the first, it takes about 16 seconds. The second is executed almost immediately after, it takes 13 seconds. In short, I'd like to know why the query result isn't being cached and any ideas on how to improve the execution. snip OK - so I could execute the query once, and get the maximum size of the array and the result set in one. I know what I am doing is less than optimal but I had expected the query results to be cached. So the second execution would be very quick. So why aren't they ? I have increased my cache size - shared_buffers is 2000 and I have doubled the default max_fsm... settings (although I am not sure what they do). sort_mem is 8192. PostgreSQL does not have, and has never had a query cache - so nothing you do is going to make that second query faster. Perhaps you are confusing it with the MySQL query cache? Chris Is there plan on developing one (query cache)? Not really, Postgresql's design makes it a bit of a non-winner. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] sql performance and cache
Perhaps you are confusing it with the MySQL query cache? Is there plan on developing one (query cache)? For the most part, prepared queries and cursors give you a greater advantage due to their versatility -- both of which we do have. In the cases where an actual cache is useful, the client application could do it just as easily or temp tables can be used. I suspect it would be implemented more as a caching proxy than as an actual part of PostgreSQL, should someone really want this feature. signature.asc Description: This is a digitally signed message part
Re: [PERFORM] backup/restore - another area.
Jeff, I'm curious to what kind of testing you've done with LVM. I'm not currently trying any backup/restore stuff, but I'm running our DBT-2 workload using LVM. I've started collecting vmstat, iostat, and readprofile data, initially running disktest to gauge the performance. For anyone curious, I have some data on a 14-disk volume here: http://developer.osdl.org/markw/lvm/results.4/log/ and a 52-disk volume here: http://developer.osdl.org/markw/lvm/results.5/data/ Mark Jeff [EMAIL PROTECTED] writes: Idea #1: Use an LVM and take a snapshop - archive that. From the way I see it. the downside is the LVM will use a lot of space until the snapshot is removed. Also PG may be in a slightly inconsistant state - but this should appear to PG the same as if the power went out. For restore, simply unarchive this snapshot and point postgres at it. Let it recover and you are good to go. Little overhead from what I see... I'm leaning towards this method the more I think of it. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] Sun performance - Major discovery!
Marko Karppinen writes: GCC sets __FAST_MATH__ even if you counter a -ffast-math with the negating flags above. This means that it is not currently possible to use the -fast flag when compiling PostgreSQL at all. Instead, you have to go through all the flags Apple is setting and only pass on those that don't break pg. That sounds perfectly reasonable to me. Why should we develop elaborate workarounds for compiler flags that are known to create broken code? I also want to point out that I'm getting kind of tired of developing more and more workarounds for sloppy Apple engineering. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] [PERFORM] Sun performance - Major discovery!
On 8.10.2003, at 21:31, Bruce Momjian wrote: Well, this is really embarassing. I can't imagine why we would not set at least -O on all platforms. Looking at the template files, I see these have no optimization set: darwin Regarding Darwin optimizations, Apple has introduced a -fast flag in their GCC 3.3 version that they recommend when compiling code for their new G5 systems. Because of this, I foresee a lot of people defining CFLAGS=-fast on their systems. This is problematic for PostgreSQL, however, since the -fast flag is the equivalent of: -O3 -falign-loops-max-skip=15 -falign-jumps-max-skip=15 -falign-loops=16 -falign-jumps=16 -falign-functions=16 -malign-natural -ffast-math -fstrict-aliasing -frelax-aliasing -fgcse-mem-alias -funroll-loops -floop-transpose -floop-to-memset -finline-floor -mcpu=G5 -mpowerpc64 -mpowerpc-gpopt -mtune=G5 -fsched-interblock -fload-after-store --param max-gcse-passes=3 -fno-gcse-sm -fgcse-loop-depth -funit-at-a-time -fcallgraph-inlining -fdisable-typechecking-for-spec At least the --fast-math part causes problems, seeing that PostgreSQL actually checks for the __FAST_MATH__ macro to make sure that it isn't turned on. There might be other problems with Apple's flags, but I think that the __FAST_MATH__ check should be altered. As you know, setting --fast-math in GCC is the equivalent of setting -fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math, -ffinite-math-only and -fno-signaling-nans. What really should be done, I think, is adding the opposites of these flags (-fmath-errno, -fno-unsafe-math-optimizations, -ftrapping_math, -fno-finite-math-only and -fsignaling-nans) to the command line if __FAST_MATH__ is detected. This would allow people to use CFLAGS=-fast on their G5s, beat some Xeon speed records, and not worry about esoteric IEEE math standards. What do you guys think? GCC sets __FAST_MATH__ even if you counter a -ffast-math with the negating flags above. This means that it is not currently possible to use the -fast flag when compiling PostgreSQL at all. Instead, you have to go through all the flags Apple is setting and only pass on those that don't break pg. mk ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster