Re: [PERFORM] Arguments Pro/Contra Software Raid

2006-05-09 Thread William Yu

William Yu wrote:
We upgraded our disk system for our main data processing server earlier 
this year. After pricing out all the components, basically we had the 
choice of:


LSI MegaRaid 320-2 w/ 1GB RAM+BBU + 8 15K 150GB SCSI

  or

Areca 1124 w/ 1GB RAM+BBU + 24 7200RPM 250GB SATA


My mistake -- I keep doing calculations and they don't add up. So I 
looked again on pricewatch and it turns out the actual comparison was 
for 4 SCSI drives, not 8! ($600 for a 15K 145GB versus $90 for a 7200 
250GB.) No wonder our decision seemed so much more decisive back then.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Worsening performance with 7.4 on flash-based system

2006-04-29 Thread William Yu
Usually when simple queries take a long time to run, it's the system 
tables (pg_*) that have become bloated and need vacuuming. But that's 
just a random guess on my part w/o more detailed info.



Greg Stumph wrote:
Well, since I got no response at all to this message, I can only assume that 
I've asked the question in an insufficient way, or else that no one has 
anything to offer on our problem.


This was my first post to the list, so if there's a better way I should be 
asking this, or different data I should provide, hopefully someone will let 
me know...


Thanks,
Greg

Greg Stumph [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
We are experiencing gradually worsening performance in PostgreSQL 7.4.7, 
on a system with the following specs:

Linux OS (Fedora Core 1, 2.4 kernel)
Flash file system (2 Gig, about 80% full)
256 Meg RAM
566 MHz Celeron CPU

We use Orbit 2.9.8 to access Postgres. The database contains 62 tables.

When the system is running with a fresh copy of the database, performance 
is fine. At its worst, we are seeing fairly simple SELECT queries taking 
up to 1 second to execute. When these queries are run in a loop, the loop 
can take up to 30 seconds to execute, instead of the 2 seconds or so that 
we would expect.


VACUUM FULL ANALYZE helps, but doesn't seem to eliminate the problem.

The following table shows average execution time in bad-performance mode 
in the first column, execution time after VACUUM ANALYZE in the second 
column, and % improvement (or degradation?) in the third. The fourth 
column shows the query that was executed.


741.831|582.038|-21.5| DECLARE table_cursor
170.065|73.032|-57.1| FETCH ALL in table_cursor
41.953|45.513|8.5| CLOSE table_cursor
61.504|47.374|-23.0| SELECT last_value FROM pm_id_seq
39.651|46.454|17.2| select id from la_looprunner
1202.170|265.316|-77.9| select id from rt_tran
700.669|660.746|-5.7| DECLARE my_tran_load_cursor
1192.182|47.258|-96.0| FETCH ALL in my_tran_load_cursor
181.934|89.752|-50.7| CLOSE my_tran_load_cursor
487.285|873.474|79.3| DECLARE my_get_router_cursor
51.543|69.950|35.7| FETCH ALL in my_get_router_cursor
48.312|74.061|53.3| CLOSE my_get_router_cursor
814.051|1016.219|24.8| SELECT   $1  = 'INSERT'
57.452|78.863|37.3| select id from op_sched
48.010|117.409|144.6| select short_name, long_name from la_loopapp
54.425|58.352|7.2| select id from cd_range
45.289|52.330|15.5| SELECT last_value FROM rt_tran_id_seq
39.658|82.949|109.2| SELECT last_value FROM op_sched_id_seq
42.158|68.189|61.7| select card_id,router_id from rt_valid
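The third column is just the percent change from the first column to the second; as a quick sanity check against the first two rows of the table:

```python
def pct_change(before_ms: float, after_ms: float) -> float:
    """Percent change in execution time; negative means faster after vacuuming."""
    return (after_ms - before_ms) / before_ms * 100

# First two rows of the table above
print(round(pct_change(741.831, 582.038), 1))  # DECLARE table_cursor
print(round(pct_change(170.065, 73.032), 1))   # FETCH ALL in table_cursor
```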


Has anyone else seen gradual performance degradation like this? Would 
upgrading to Postgres 8 help? Any other thoughts on directions for 
troubleshooting this?


Thanks...






---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread William Yu

[EMAIL PROTECTED] wrote:


I have an Intel Pentium D 920, and an AMD X2 3800+. These are very
close in performance. The retail price difference is:

Intel Pentium D 920 is selling for $310 CDN
AMD X2 3800+ is selling for $347 CDN

Anybody who claims that Intel is 2X more expensive for the same
performance isn't considering all factors. No question at all - the
Opteron is good, and the Xeon isn't - but the original poster didn't
ask about Opteron or Xeon, did he? For the desktop lines - X2 is not
double Pentium D. Maybe 10%. Maybe not at all. Especially now that
Intel is dropping its prices due to overstock.


There's part of the equation you are missing here. This is a PostgreSQL 
mailing list which means we're usually talking about performance of just 
this specific server app. While in general there may not be that much of 
a % difference between the 2 chips, there's a huge gap in Postgres. For 
whatever reason, Postgres likes Opterons. Way more than Intel 
P4-architecture chips. (And it appears way more than IBM Power4 chips 
and a host of other chips also.)


Here's one of the many discussions we had about this issue last year:

http://qaix.com/postgresql-database-development/337-670-re-opteron-vs-xeon-was-what-to-do-with-6-disks-read.shtml

The exact reasons why Opteron runs PostgreSQL so much better than P4s, 
we're not 100% sure of. We have guesses -- lower memory latency, lack of 
shared FSB, better 64-bit, 64-bit IOMMU, context-switch storms on P4, 
better dualcore implementation and so on. Perhaps it's a combination of 
all the above factors but somehow, the general experience people have 
had is that equivalently priced Opteron servers run PostgreSQL 2X 
faster than P4 servers as the baseline, and the gap increases as you add 
more sockets and more cores.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread William Yu

David Boreham wrote:

It isn't only Postgres. I work on a number of other server applications
that also run much faster on Opterons than the published benchmark
figures would suggest they should. They're all compiled with gcc4,
so possibly there's a compiler issue. I don't run Windows on any
of our Opteron boxes so I can't easily compare using the MS compiler.



Maybe it's just a fact that the majority of x86 64-bit development for 
open source software happens on Opteron/A64 machines. 64-bit AMD 
machines were selling a good year before 64-bit Intel machines were 
available. And even after Intel EM64T was available, anybody in their 
right mind would have picked AMD machines over Intel due to 
cost/heat/performance. So you end up with 64-bit OSS being 
developed/optimized for Opterons, with the 10% running Intel EM64T 
handling compatibility issues.


Would be interesting to see a survey of what machines OSS developers use 
to write/test/optimize their code.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] 3WARE Card performance boost?

2006-01-18 Thread William Yu

Benjamin Arai wrote:
Obviously, I have done this to improve write performance for the update 
each week.  My question is if I install a 3ware or similar card to 
replace my current software RAID 1 configuration, am I going to see a 
very large improvement?  If so, what would be a ball park figure?


The key is getting a card with the ability to upgrade the onboard ram.

Our previous setup was a LSI MegaRAID 320-1 (128MB), 4xRAID10, 
fsync=off. Replaced it with a ARC-1170 (1GB) w/ 24x7200RPM SATA2 drives 
(split into 3 8-drive RAID6 arrays) and performance for us is through 
the ceiling.


For OLTP type updates, we've gotten about +80% increase. For massive 
1-statement updates, performance increase is in the +triple digits.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] 3WARE Card performance boost?

2006-01-18 Thread William Yu

Steinar H. Gunderson wrote:

On Wed, Jan 18, 2006 at 01:58:09PM -0800, William Yu wrote:

The key is getting a card with the ability to upgrade the onboard ram.

Our previous setup was a LSI MegaRAID 320-1 (128MB), 4xRAID10, 
fsync=off. Replaced it with a ARC-1170 (1GB) w/ 24x7200RPM SATA2 drives 
(split into 3 8-drive RAID6 arrays) and performance for us is through 
the ceiling.


Well, the fact that you went from four to 24 disks would perhaps be a bigger
factor than the amount of RAM...

/* Steinar */


Actually no. Our 2xOpteron 244 server is NOT fast enough to drive an 
array this large. That's why we had to split it up into 3 different 
arrays. I tried all different RAID configs and once past about 8 drives, 
I got the same performance no matter what because the CPU was pegged at 
100%. Right now, 2 of the arrays are just mirroring each other because 
we can't seem to utilize the performance right now. (Also protects against 
cabling/power supply issues as we're using 3 separate external enclosures.)


The 1GB of RAM is the bigger factor because it almost completely hides 
the write activity. Looking at iostat while all our jobs are running, 
there's almost no disk activity. If I manually type sync, I see 1 big 
250MB-500MB write storm for 2 seconds but otherwise, writes just slowly 
dribble out to disk.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?

2005-12-24 Thread William Yu

David Lang wrote:
raid 5 is bad for random writes as you state, but how does it do for 
sequential writes (for example data mining where you do a large import 
at one time, but seldom do other updates). I'm assuming a controller 
with a reasonable amount of battery-backed cache.


Random write performance (small block that only writes to 1 drive):
1 write requires N-1 reads + N writes -- efficiency 1/(2N-1)

Sequential write performance (write a big enough block to use all N drives):
N-1 writes require N writes -- efficiency (N-1)/N

Assuming enough cache so all reads/writes are done in 1 transaction, plus an 
onboard processor that calcs RAID parity fast enough to not cause an extra delay.
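Under the same assumptions, those two efficiency figures can be sketched numerically (N = number of drives in the RAID5 set, using the operation counts above):

```python
def raid5_random_write_efficiency(n: int) -> float:
    """Small-block write: 1 logical write costs N-1 reads + N writes,
    i.e. 2N-1 disk operations for 1 useful write."""
    return 1 / (2 * n - 1)

def raid5_sequential_write_efficiency(n: int) -> float:
    """Full-stripe write: N-1 data blocks cost N disk writes
    (the extra write is the parity block)."""
    return (n - 1) / n

# Example: an 8-drive RAID5 set
print(round(raid5_random_write_efficiency(8), 3))      # small random writes
print(round(raid5_sequential_write_efficiency(8), 3))  # full-stripe sequential writes
```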


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?

2005-12-24 Thread William Yu

Luke Lonergan wrote:

Note that host-based SCSI raid cards from LSI, Adaptec, Intel, Dell, HP
and others have proven to have worse performance than a single disk
drive in many cases, whether for RAID0 or RAID5.  In most circumstances


This is my own experience. Running a LSI MegaRAID in pure passthrough 
mode + Linux software RAID10 is a ton faster than configuring the RAID 
via the LSI card. One of the things I've noticed is that the card does 
not seem to be able to do parallel reads on mirrors. Looking at iostat 
under Linux, I can see software RAID1 reading all drives, with the MD 
number adding up to the sum of all drives.


The ARECA SATA controller I just got, though, doesn't seem to exhibit 
these problems. Performance is a few % points above Linux software RAID 
at lower CPU usage. In fact, I'm getting better single-threaded 
bandwidth on a 4x7200RPM SATA config versus a 6x15K SCSI config on the 
LSI. The drives are bigger for the SATA config (300GB versus 36GB for 
the SCSI), which means the heads don't have to move anywhere near as much 
and can stay on the fast portion of the disk. Haven't had a chance to 
test multi-user DB between the two setups though.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] What's the best hardver for PostgreSQL 8.1?

2005-12-21 Thread William Yu

Juan Casero wrote:
Can you elaborate on the reasons the opteron is better than the Xeon when it 
comes to disk io?   I have a PostgreSQL 7.4.8 box running a DSS.   One of our 


Opterons have a 64-bit IOMMU -- Xeons don't. That means in 64-bit mode, 
for transfers to memory > 4GB, the OS must allocate memory < 4GB, DMA to 
that block and then the CPU must do the extra work of copying the memory 
to > 4GB. Versus on the Opteron, it's done by the IO adaptor using DMA 
in the background.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] 15,000 tables - next step

2005-12-04 Thread William Yu

Michael Riess wrote:

Well, I'd think that's were your problem is.  Not only you have a
(relatively speaking) small server -- you also share it with other
very-memory-hungry services!  That's not a situation I'd like to be in.
Try putting Apache and Tomcat elsewhere, and leave the bulk of the 1GB
to Postgres. 



No can do. I can try to switch to a 2GB machine, but I will not use 
several machines. Not for a 5GB database. ;-)



With 1500 shared buffers you are not really going
anywhere -- you should have ten times that at the very least.



Like I said - I tried to double the buffers and the performance did not 
improve in the least. And I also tried this on a 2GB machine, and 
swapping was not a problem. If I used 10x more buffers, I would in 
essence remove the OS buffers.


Increasing buffers does improve performance -- if you have enough memory. 
You just don't have enough memory to play with. My servers run w/ 10K 
buffers (128MB on 64-bit FC4) and they definitely run better w/ 10K 
versus 1500.


With that many tables, your system catalogs are probably huge. To keep 
your system catalog from continually cycling in-out of buffers/OS 
cache/disk, you need a lot more memory. Ordinarily, I'd say the 500MB 
you have available for Postgres to cache 5GB is a workable ratio. My 
servers all have similar ratios of ~1:10 and they perform pretty good -- 
*except* when the system catalogs bloated due to lack of vacuuming on 
system tables. My app regularly creates & drops thousands of temporary 
tables, leaving a lot of dead rows in the system catalogs. (Nearly the 
same situation as you -- instead of 15K live tables, I had 200 live 
tables and tens of thousands of dead table records.) Even with almost 
8GB of RAM dedicated to Postgres, performance on every single query -- 
no matter how small the table was -- took forever because the query 
planner had to spend a significant period of time scanning through my 
huge system catalogs to build the execution plan.


While my situation was fixable by scheduling a nightly vacuum/analyze 
on the system catalogs to get rid of the bazillion dead table/index 
entries, you have no choice but to get more memory so you can stuff your 
entire system catalog into buffers/OS cache. Personally, w/ 1GB of ECC 
RAM at ~$85, it's a no-brainer. Get as much memory as your server can 
support.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-20 Thread William Yu

Alan Stange wrote:

Luke Lonergan wrote:

The "idle" aka iowait is the problem here - iowait is not idle (otherwise it
would be in the idle column).

Iowait is time spent waiting on blocking io calls.  As another poster
pointed out, you have a two CPU system, and during your scan, as 


iowait time is idle time. Period.   This point has been debated 
endlessly for Solaris and other OS's as well.


I'm sure the theory is nice but here's my experience with iowait 
just a minute ago. I run Linux/XFce as my desktop -- decided I wanted to 
lookup some stuff in Wikipedia under Mozilla and my computer system 
became completely unusable for nearly a minute while who knows what 
Mozilla was doing. (Probably loading all the language packs.) I could 
not even switch to IRC (already loaded) to chat with other people while 
Mozilla was chewing up all my disk I/O.


So I went to another computer, connected to mine remotely (slow...) and 
checked top. 90% in the wa column, which I assume is the iowait column. 
It may be idle in theory but it's not a very useful idle -- I wasn't able 
to switch to any programs already running, and couldn't click on the XFce 
launchbar to run any new programs.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-17 Thread William Yu

Welty, Richard wrote:

David Boreham wrote:


I guess I've never bought into the vendor story that there are
two reliability grades. Why would they bother making two
different kinds of bearing, motor etc ? Seems like it's more
likely an excuse to justify higher prices.



then how to account for the fact that bleeding edge SCSI drives
turn at twice the rpms of bleeding edge consumer drives?


The motors spin twice as fast?

I'm pretty sure the original comment was based on drives w/ similar 
specs. E.g. 7200RPM enterprise drives versus 7200RPM consumer drives.


Next time one of my 7200RPM SCSIs fail, I'll take it apart and compare 
the insides to an older 7200RPM IDE from roughly the same era.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-17 Thread William Yu

Alex Turner wrote:

Opteron 242 - $178.00
Opteron 242 - $178.00
Tyan S2882 - $377.50
Total: $733.50

Opteron 265 - $719.00
Tyan K8E - $169.00
Total: $888.00


You're comparing the wrong CPUs. The 265 is the dual-core counterpart of 
the 244, so you'll have to bump up the price more, although not enough to 
make a difference.


Looks like the price of the 2P MBs has dropped since I last looked at 
it. Just a few months back, Tyan duals were $450-$500, which is what I 
was basing my "priced less" statement on.


Tyan K8E - doesn't have any PCI-X, so it's also not apples to apples. 
In fact, I couldn't find a single CPU board that did, so you pretty 
much have to buy a dual CPU board to get PCI-X.


You can get single CPU boards w/ PCIe and use PCIe controller cards. 
Probably expensive right now because they're so bleeding-edge new but 
definitely on the downswing.



a) OLTP - probably IO bound, large number of queries/sec updating info
on _disks_, not requiring much CPU activity except to retrieve item
infomration which is well indexed and normalized.


Not in my experience. I find on our OLTP servers, we run 98% in RAM and 
hence are 100% CPU-bound. Our DB is about 50GB in size now; servers run 
w/ 8GB of RAM. We were *very* CPU limited running 2x244. During busy 
hours of the day, our avg user transaction times were jumping from 
0.8sec to 1.3+sec. Did the 2x265 and now we're always in the 0.7sec to 
0.9sec range.



DC also gives you a better upgrade path. Let's say you do testing and
figure 2x246 is the right setup to handle the load. Well instead of
getting 2x1P, use the same 2P motherboard but only populate 1 CPU w/ a
DC/270. Now you have a server that can be upgraded to +80% more CPU by
popping in another DC/270 versus throwing out the entire thing to get a
4x1P setup.



No argument there.  But it's pointless if you are IO bound.


Why would you just accept "we're IO bound, nothing we can do"? I'd do 
everything in my power to make my app go from IO bound to CPU bound -- 
whether by optimizing my code or buying more hardware. I can tell you if 
our OLTP servers were IO bound, they would run like crap. Instead of < 1 
sec, we'd be looking at 5-10 seconds per user transaction and our 
users would be screaming bloody murder.


In theory, you can always convert your IO bound DB to CPU bound by 
stuffing more and more RAM into your server. (Or partitioning the DB 
across multiple servers.) Whether it's cost effective depends on the DB 
and how much your users are paying you -- and that's a case-by-case 
analysis. Not a global statement of "IO bound, pointless."



(2) Does a DC system perform better than its Nx1P cousin? My experience
is yes. Did some rough tests in a drop-in-replacement 1x265 versus 2x244
and saw about +10% for DC. All the official benchmarks (Spec, Java, SAP,
etc) from AMD/Sun/HP/IBM show DCs outperforming the Nx1P setups.



Maybe true, but the 265 does have a 25% faster FSB than the 244, which
might perhaps play a role.


Nope. There's no such thing as an FSB on Opterons. The on-die memory 
controller runs @ CPU speed and hence connects at whatever the memory 
runs at (rounded off to some multiplier math). There's the HT speed that 
controls the max IO bandwidth but that's based on the motherboard, not 
the CPU. Plus the 265 and 244 both run at 1.8GHz so the memory 
multiplier & HT IO are both the same.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-17 Thread William Yu

Joshua Marsh wrote:


On 11/17/05, William Yu [EMAIL PROTECTED] wrote:

  No argument there.  But it's pointless if you are IO bound.

Why would you just accept we're IO bound, nothing we can do? I'd do
everything in my power to make my app go from IO bound to CPU bound --
whether by optimizing my code or buying more hardware. I can tell you if
our OLTP servers were IO bound, they would run like crap. Instead of < 1
sec, we'd be looking at 5-10 seconds per user transaction and our
users would be screaming bloody murder.

In theory, you can always convert your IO bound DB to CPU bound by
stuffing more and more RAM into your server. (Or partitioning the DB
across multiple servers.) Whether it's cost effective depends on the DB
and how much your users are paying you -- and that's a case-by-case
analysis. Not a global statement of IO-bound, pointless.


We all want our systems to be CPU bound, but it's not always possible. 
Remember, he is managing a 5 TB database. That's quite a bit different 
than a 100 GB or even 500 GB database.


I did say "in theory". :) I'm pretty sure Google is more CPU bound than 
IO bound -- they just spread their DB over 50K servers or whatever. Not 
everybody is willing to pay for that but it's always in the realm of 
plausibility.


Plus we have to go back to the statement I was replying to, which was "I 
have yet to come across a DB system that wasn't IO bound."



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-16 Thread William Yu

Alex Turner wrote:

Not at random access in RAID 10 they aren't, and anyone with their
head screwed on right is using RAID 10.  The 9500S will still beat the
Areca cards at RAID 10 database access patern.


The max 256MB onboard for 3ware cards is disappointing though. While 
good enough for 95% of cases, there's that 5% that could use a gig or 
two of onboard ram for ultrafast updates. For example, I'm specing out 
an upgrade to our current data processing server. Instead of the 
traditional 6xFast-Server-HDs, we're gonna go for broke and do 
32xConsumer-HDs. This will give us mega I/O bandwidth but we're 
vulnerable to random access since consumer-grade HDs don't have the RPMs 
or the queueing-smarts. This means we're very dependent on the 
controller using onboard RAM to do I/O scheduling. 256MB divided over 
4/6/8 drives -- OK. 256MB divided over 32 drives -- ugh, each HD's 
buffer is bigger than the RAM allotted to it.
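Back-of-the-envelope version of that cache math (the 8MB consumer-drive buffer is an assumption, typical for drives of that era):

```python
CONTROLLER_CACHE_MB = 256  # max onboard RAM on the 3ware cards discussed
DRIVE_BUFFER_MB = 8        # assumed per-drive buffer on a consumer HD

def cache_per_drive(cache_mb: float, drives: int) -> float:
    """Controller cache available to each drive if split evenly."""
    return cache_mb / drives

for drives in (4, 6, 8, 32):
    share = cache_per_drive(CONTROLLER_CACHE_MB, drives)
    verdict = "OK" if share > DRIVE_BUFFER_MB else "no more than the drive's own buffer"
    print(f"{drives:2d} drives -> {share:5.1f} MB/drive ({verdict})")
```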


At least this is how it seems it would work from thinking through all 
the factors. Unfortunately, I haven't found anybody else who has gone 
this route and reported their results so I guess we're the guinea pig.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Hardware/OS recommendations for large databases ( 5TB)

2005-11-16 Thread William Yu

James Mello wrote:

Unless there was a way to guarantee consistency, it would be hard at
best to make this work. Convergence on large data sets across boxes is
non-trivial, and diffing databases is difficult at best. Unless there
was some form of automated way to ensure consistency, going 8 ways into
separate boxes is *very* hard. I do suppose that if you have fancy
storage (EMC, Hitachi) you could do BCV or Shadow copies. But in terms
of commodity stuff, I'd have to agree with Merlin. 


If you're talking about data consistency, I don't see why that's an 
issue in a bulk-load/read-only setup. Either bulk load on 1 server and 
then do a file copy to all the others -- or simultaneously bulk load on 
all servers.


If you're talking about consistency in directing queries to the 
appropriate servers, I agree that's a more complicated issue but not 
insurmountable. If you don't use persistent connections, you can 
probably get pretty good routing using DNS -- monitor servers by looking 
at top/iostat/memory info/etc and continually change the DNS zonemaps to 
direct traffic to less busy servers. (I use this method for our global 
load balancers -- pretty easy to script via Perl/Python/etc.) Mind you 
load balancers -- pretty easy to script via Perl/Python/etc.) Mind you 
since you need a Dual Processor motherboard anyways to get PCI-X, that 
means every machine would be a 2xDual Core so there's enough CPU power 
to handle the cases where 2 or 3 queries get sent to the same server 
back-to-back. Of course, I/O would take a hit in this case -- but I/O 
would take a hit in every case on a single 16-core mega system.
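A minimal sketch of that "route to the least busy server" idea (the host names and load numbers are made up; a real version would sample top/iostat on each box and rewrite the DNS zonemap accordingly):

```python
# Hypothetical 1-minute load averages sampled from each backend.
server_load = {
    "db1.example.com": 3.2,
    "db2.example.com": 0.8,
    "db3.example.com": 1.5,
}

def least_busy(loads):
    """Pick the host to publish in the zonemap (or hand the next query to)."""
    return min(loads, key=loads.get)

print(least_busy(server_load))  # the host with the lowest sampled load
```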


If you use persistent connections, it'll definitely require extra 
programming beyond simple scripting. Take one of the open source projects 
like PgPool or SQLRelay and alter it so it monitors all servers to see 
which server is least busy before passing a query on.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-16 Thread William Yu

Alex Stapleton wrote:
You're going to have to factor in the increased failure rate in your cost 
measurements, including any downtime or performance degradation whilst 
rebuilding parts of your RAID array. It depends on how long you're 
planning for this system to be operational as well, of course.


If we go 32xRAID10, rebuild time should be the same as rebuild time in a 
4xRAID10 system. Only the hard drive that was replaced needs rebuild -- 
not the entire array.


And yes, definitely need a bunch of drives lying around as spares.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-16 Thread William Yu

Alex Turner wrote:

Spend a fortune on dual core CPUs and then buy crappy disks...  I bet
for most applications this system will be IO bound, and you will see a
nice lot of drive failures in the first year of operation with
consumer grade drives.

Spend your money on better Disks, and don't bother with Dual Core IMHO
unless you can prove the need for it.


I would say the opposite -- you always want Dual Core nowadays. DC 
Opterons simply give you better bang for the buck than single core 
Opterons. Price out a 1xDC system against a 2x1P system -- the 1xDC will 
be cheaper. Do the same for 2xDC versus 4x1P, 4xDC versus 8x1P, 8xDC 
versus 16x1P, etc. -- DC gets cheaper by wider and wider margins because 
those mega-CPU motherboards are astronomically expensive.


DC also gives you a better upgrade path. Let's say you do testing and 
figure 2x246 is the right setup to handle the load. Well instead of 
getting 2x1P, use the same 2P motherboard but only populate 1 CPU w/ a 
DC/270. Now you have a server that can be upgraded to +80% more CPU by 
popping in another DC/270 versus throwing out the entire thing to get a 
4x1P setup.


The only questions would be:
(1) Do you need a SMP server at all? I'd claim yes -- you always need 2+ 
cores whether it's DC or 2P to avoid IO interrupts blocking other 
processes from running.


(2) Does a DC system perform better than its Nx1P cousin? My experience 
is yes. Did some rough tests in a drop-in-replacement 1x265 versus 2x244 
and saw about +10% for DC. All the official benchmarks (Spec, Java, SAP, 
etc) from AMD/Sun/HP/IBM show DCs outperforming the Nx1P setups.


(3) Do you need an insane amount of memory? Well here's the case where 
the more expensive motherboard will serve you better since each CPU slot 
has its own bank of memory. Spend more money on memory, get cheaper 
single-core CPUs.


Of course, this doesn't apply if you are an Intel/Dell-only shop. Xeon 
DCs, while cheaper than their corresponding single-core SMPs, don't have 
the same performance profile of Opteron DCs. Basically, you're paying a 
bit extra so your server can generate a ton more heat.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-16 Thread William Yu

David Boreham wrote:

 Spend a fortune on dual core CPUs and then buy crappy disks...  I bet
 for most applications this system will be IO bound, and you will see a
 nice lot of drive failures in the first year of operation with
 consumer grade drives.

I guess I've never bought into the vendor story that there are
two reliability grades. Why would they bother making two
different kinds of bearing, motor etc ? Seems like it's more
likely an excuse to justify higher prices. In my experience the
expensive SCSI drives I own break frequently while the cheapo
desktop drives just keep chunking along (modulo certain products
that have a specific known reliability problem).

I'd expect that a larger number of hotter drives will give a less reliable
system than a smaller number of cooler ones.


Our SCSI drives have failed maybe a little less often than our IDE drives. 
Hell, some of the SCSIs even arrived bad when we bought them. Of course, 
the IDE drive failure % is inflated by all the IBM Deathstars we got -- ugh.


Basically, I've found it's cooling that's most important. Packing the 
drives together into really small rackmounts? Good for your density, not 
good for the drives. Now we do larger rackmounts -- drives have more 
space in between each other plus fans in front and back of the drives.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Hardware/OS recommendations for large databases ( 5TB)

2005-11-15 Thread William Yu

Merlin Moncure wrote:

You could instead buy 8 machines that total 16 cores, 128GB RAM and
 
It's hard to say what would be better.  My gut says the 5u box would be

a lot better at handling high cpu/high concurrency problems...like your
typical business erp backend.  This is pure speculation of course...I'll
defer to the experts here.


In this specific case (data warehouse app), multiple machines is the 
better bet. Load data on 1 machine, copy to other servers and then use a 
middleman to spread out SQL statements to each machine.


I was going to suggest pgpool as the middleman but I believe it's 
limited to 2 machines max at this time. I suppose you could daisy chain 
pgpools running on every machine.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] shared buffers

2005-08-30 Thread William Yu

Carlos Henrique Reimer wrote:

I forgot to say that it's a 12GB database...

Ok, I'll set shared buffers to 30,000 pages, but even so, shouldn't 
meminfo and top show some shared pages?

I heard something about Redhat 9 not being able to handle RAM higher 
than 2GB very well. Is that right?

Thanks in advance!


RH9, like any 32-bit OS, is limited to a 2GB address space w/o special 
tricks. However, it can access > 2GB for the OS disk cache using PAE if 
you are running the bigmem kernel.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Caching by Postgres

2005-08-24 Thread William Yu

Donald Courtney wrote:

I built PostgreSQL 8.1 64-bit on Solaris 10 a few months ago 
and side by side with the 32-bit PostgreSQL build saw no improvement. In 
fact the 64-bit result was slightly lower.


I'm not surprised 32-bit binaries running on a 64-bit OS would be faster 
than 64-bit/64-bit. 64-bit isn't some magical wand you wave and it's all 
ok. Programs compiled as 64-bit will only run faster if (1) you need 
64-bit address space and you've been using ugly hacks like PAE to get 
access to memory > 2GB, or (2) you need native 64-bit data types and 
you've been using ugly hacks to piece 32-bit ints together (for example, 
encryption/compression). In most cases, 64-bit will run slightly slower 
due to the extra overhead of using larger datatypes.


Since PostgreSQL hands off the majority of memory management/data 
caching to the OS, only the OS needs to be 64-bit to reap the benefits 
of better memory management. Since Postgres *ALREADY* reaps the 64-bit 
benefit, I'm not sure how the argument for moving caching/mm/fs into 
Postgres would apply. Yes, there's the point about possibly implementing 
better/smarter/more appropriate caching algorithms, but that has nothing 
to do with 64-bit.
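The "larger datatypes" overhead is easy to see directly: pointers (and, on LP64 platforms, longs) double in width under a 64-bit build. A small illustration; the printed sizes depend on the platform it runs on, so none are asserted as universal:

```python
import ctypes

# Pointer and long widths differ between 32-bit (ILP32) and 64-bit (LP64)
# builds; wider types mean more cache and memory traffic, which is the
# "slight overhead" of 64-bit binaries mentioned above.
ptr_bytes = ctypes.sizeof(ctypes.c_void_p)
long_bytes = ctypes.sizeof(ctypes.c_long)
print(f"pointer: {ptr_bytes} bytes, long: {long_bytes} bytes")
```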




Re: [PERFORM] Caching by Postgres

2005-08-23 Thread William Yu

Donald Courtney wrote:

in that even if you ran postgreSQL on a 64 bit address space
with a larger number of CPUs you won't see much of a scale up
and possibly even a drop.   I am not alone in having the *expectation*


What's your basis for believing this is the case? Why would PostgreSQL's 
dependence on the OS's caching/filesystem limit scalability? I know when 
I went from 32bit to 64bit Linux, I got *HUGE* increases in performance 
using the same amount of memory. And when I went from 2x1P to 2xDC, my 
average cpu usage % dropped almost in half.



that a database should have some cache size parameter and
the option to skip the file system.   If I use oracle, sybase, mysql
and maxdb they all have the ability to size a data cache and move
to 64 bits.


Josh Berkus has already mentioned this as conventional wisdom as written 
by Oracle. This may also be legacy wisdom. Oracle/Sybase/etc have been 
around for a long time; it was probably a clear performance win way back 
when. Nowadays with how far open-source OS's have advanced, I'd take it 
with a grain of salt and do my own performance analysis. I suspect the 
big vendors wouldn't change their stance even if they knew it was no 
longer true due to the support hassles.


My personal experience with PostgreSQL. Dropping shared buffers from 2GB 
to 750MB improved performance on my OLTP DB a good 25%. Going down from 
750MB to 150MB was another +10%.




Re: [PERFORM] extremly low memory usage

2005-08-22 Thread William Yu

Ron wrote:

PERC4eDC-PCI Express, 128MB Cache, 2-External Channels


Looks like they are using the LSI Logic MegaRAID SCSI 320-2E 
controller.  IIUC, you have 2 of these, each with 2 external channels?


A lot of people have mentioned that Dell's versions of the LSI cards can 
be WAY slower than the ones you buy from LSI. Why is this the case? 
Nobody knows for sure.


Here's a guess on my part. A while back, I was doing some googling. And 
instead of typing "LSI MegaRAID xxx", I just typed "MegaRAID xxx". Going 
beyond the initial pages, I saw Tekram -- a company that supposedly 
produces their own controllers -- listing products with the exact model 
numbers and photos as cards from LSI and Areca. Seemed puzzling until I 
read a review about SATA RAID cards which mentioned that Tekram produces 
the Areca cards under their own name but using slower components to 
avoid competing with Areca at the high end.


So what may be happening is that while the logic circuitry on the Dell 
PERCs is the same as on the source LSI cards, the RAID 
processor/RAM/internal buffers/etc are not as fast, so Dell can shave 
off a few bucks on every server. That would mean that while a true LSI 
card has the processing power to do the RAID calculations for X drives, 
the Dell version can probably only handle X*0.6 drives or so.



The 128MB buffer also looks suspiciously small, and I do not see any 
upgrade path for it on LSI Logic's site.  Serious RAID controllers 
from companies like Xyratex, Engino, and Dot-hill can have up to 1-2GB 


The card is upgradable. If you look at the pic of the card, it shows an 
SDRAM DIMM versus integrated RAM chips. I've also read reviews a while 
back comparing benchmarks of the 320-2 w/ 128MB versus 512MB onboard 
RAM. Their product literature is just nebulous on the RAM upgrade part. 
I'm sure if you opened up the PDF manuals, you could find the exact info.



That 128MB of buffer cache may very well be too small to keep the IO 
rate up, and/or there may be a more subtle problem with the LSI card, 
and/or you may have a configuration problem, but _something(s)_ need 
fixing since you are only getting raw sequential IO of ~100-150MB/s when 
it should be above 500MB/s.


I think it just might be the Dell hardware or the lack of a 64-bit IOMMU 
on Xeons. Here's my numbers on a 320-1 w/ 128MB paired up with Opterons 
compared to Jeremiah's.


 # time dd if=/dev/zero of=testfile bs=1024 count=1000000
 1000000+0 records in
 1000000+0 records out

 real    0m8.885s
 user    0m0.299s
 sys     0m6.998s

2x15K RAID1
real    0m14.493s
user    0m0.255s
sys     0m11.712s

6x15K RAID10 (2x 320-1)
real    0m9.986s
user    0m0.200s
sys     0m8.634s


 # time dd of=/dev/null if=testfile bs=1024 count=1000000
 1000000+0 records in
 1000000+0 records out

 real    0m1.654s
 user    0m0.232s
 sys     0m1.415s

2x15K RAID1
real    0m3.383s
user    0m0.176s
sys     0m3.207s

6x15K RAID10 (2x 320-1)
real    0m2.427s
user    0m0.178s
sys     0m2.250s

If all 14 HDs are arranged in a RAID10 array, I'd say there's definitely 
something wrong with Jeremiah's hardware.
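Assuming each dd run moved roughly 1 GB (bs=1024 with about a million blocks; the block counts in the quoted output look truncated), the timings translate into MB/s like so:

```python
# Rough sequential throughput from the dd timings above. The ~1 GB
# transfer size is an assumption reconstructed from the elapsed times
# and the ~100-150MB/s figure quoted in the thread.
def throughput_mb_s(bytes_moved: int, seconds: float) -> float:
    return bytes_moved / seconds / 1e6

GB = 1024 * 1_000_000  # bs=1024 bytes x 1,000,000 blocks

for label, secs in [("quoted write run", 8.885),    # ~115 MB/s
                    ("2x15K RAID1 write", 14.493),  # ~71 MB/s
                    ("6x15K RAID10 write", 9.986)]: # ~103 MB/s
    print(f"{label}: {throughput_mb_s(GB, secs):.0f} MB/s")
```

Note the read timings (1.6-3.4s) would imply 300-600 MB/s, i.e. the reads were largely served from the OS cache rather than the disks.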






Re: [PERFORM] Performance problems on 4/8way Opteron (dualcore)

2005-07-30 Thread William Yu
I've been running 2x265's on FC4 64-bit (2.6.11-1+) and it's been 
running perfectly. With NUMA enabled, it runs incrementally faster than 
NUMA off. Performance is definitely better than the 2x244s they replaced 
-- how much faster, I can't measure since I don't have the transaction 
volume to compare to previous benchmarks. I do see more consistently low 
response times though, can also run apache on the same server for faster 
HTML generation times, and top seems to show in general twice as much 
CPU power idle on average (25% per 265 core versus 50% per 244.)


I haven't investigated the 2.6.12+ kernel updates yet -- I probably will 
do our development servers first to give it a test.




The problem as I remember it boiled down to the Linux kernel handling
memory/process management very badly on large dual core systems --
pathological NUMA behavior.  However, this problem has apparently been fixed
in Linux v2.6.12+, and using the more recent kernel on large dual core
systems generated *massive* performance improvements on these systems for
the individuals with this issue.  Using the patched kernel, one gets the
performance most people were expecting.




[PERFORM] Trying to figure out pgbench

2005-06-21 Thread William Yu
My Dual Core Opteron server came in last week. I tried to do some 
benchmarks with pgbench to get some numbers on the difference between 
1x1 -> 2x1 -> 2x2 but no matter what I did, I kept getting the same TPS 
on all systems. Any hints on what pgbench parameters I should be using?


In terms of production use, it definitely can handle more load. 
Previously, Apache/Perl had to run on a separate server to avoid a ~50% 
penalty. Now, the numbers are +15% performance even with Apache/Perl 
running on the same box as PostgreSQL. How much more load of course is 
what I'd like to quantify.




Re: [PERFORM] Help specifying new web server/database machine

2005-06-09 Thread William Yu

Rory Campbell-Lange wrote:

Processor:

First of all I noted that we were intending to use Opteron processors. I
guess this isn't a straightforward choice because I believe Debian (our
Linux of choice) doesn't have a stable AMD64 port. However some users on
this list suggest that Opterons work very well even in a 32 bit
environment. Some have suggested that a single dual core processor is
the way to go. The RAM needs to fit the CPU arrangement too; William
points out that one needs 2 DIMMS per CPU.



Your summary here just pointed out the obvious to me. Start with a 2P MB 
but only populate a single DC Opteron. That'll give you a 2P system with 
room to expand to 4P in the future. Plus you only need to populate 1 
memory bank, so you can do 2x1GB.



Disks:

I'm somewhat confused here. I've followed the various notes about SATA 
vs SCSI and it seems that SCSI is the way to go. On a four-slot 1U 
server, would one do a single RAID10 over four 10k rpm U320 disks? 
I would run the database in its own partition, separate from the rest of 
the OS, possibly on LVM. An LSI-Megaraid-2 appears to be the card of 
choice.


With only 4 disks, a MegaRAID U320-1 is good enough. It's quite a 
premium to go to the 2x channel MegaRAID. With 4 drives, I'd still do 2 
big drives mirrored for the DB partition and 2 small drives for OS+WAL.




Re: [PERFORM] Help specifying new web server/database machine

2005-06-08 Thread William Yu

We are considering two RAID1 system disks, and two RAID1 data disks.
We've avoided buying Xeons. The machine we are looking at looks like
this:

Rackmount Chassis - 500W PSU / 4 x SATA Disk Drive Bays
S2882-D - Dual Opteron / AMD 8111 Chipset / 5 x PCI Slots
2x - (Dual) AMD Opteron 246 Processors (2.0GHz) - 1MB L2 Cache/core (single 
core)


For about $1500 more, you could go 2x270 (dual core 2ghz) and get a 4X 
SMP system. (My DC 2x265 system just arrived -- can't wait to start 
testing it!!!)



2GB (2x 1024MB) DDR-400 (PC3200) ECC Registered SDRAM (single rank)


This is a weird configuration. For a 2x Opteron server to operate at max 
performance, it needs 4 DIMMs minimum. Opterons use a 128-bit memory 
interface and hence require 2 DIMMs per CPU to run at full speed. With 
only 2 DIMMs, you either have both CPUs run @ 64-bit (this may not even 
be possible) or populate only 1 CPU bank -- the other CPU must then 
request all memory access through the first CPU, which is a significant 
penalty. If you went 4x512MB, you'd limit your future upgrade options by 
having fewer slots available to add more memory. I'd definitely get 
4x1GB right out of the chute.



4 Port AMCC/3Ware 9500-4LP PCI SATA RAID Controller
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
80GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache
250GB SATA-150 7200RPM Hard Disk / 8MB Cache


Now this is comes to the interesting part. We've had huge, gigantic 
threads (check archives for the $7K server threads) about SCSI versus 
SATA in the past. 7200 SATAs just aren't fast/smart enough to cut it for 
most production uses in regular configs. If you are set on SATA, you 
will have to consider the following options: (1) use 10K Raptors for TCQ 
goodness, (2) put a huge amount of memory onto the SATA RAID card -- 1GB 
minimum, (3) use a ton of SATA drives to make a RAID10 array -- 8 drives 
minimum.


Or you could go SCSI. SCSI is cost prohibitive though at the larger disk 
sizes -- this is why I'm considering option #3 for my data processing 
server.




Re: [PERFORM] Forcing use of specific index

2005-06-03 Thread William Yu
A pretty awful way is to mangle the SQL statement so that the logical 
tests on the other fields look like so:


select * from mytable where 0+field = 100
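A throwaway helper for building such mangled predicates (the table and column names are illustrative only; note the 0+ trick only applies to numeric columns -- for text columns the usual equivalent is appending an empty string):

```python
# Sketch of the "0+field" trick: wrapping a column in a no-op expression
# keeps the planner from matching it against an index on that column.
def defeat_index(column: str) -> str:
    return f"0+{column}"

sql = f"SELECT * FROM mytable WHERE {defeat_index('field')} = 100"
print(sql)  # SELECT * FROM mytable WHERE 0+field = 100
```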




Tobias Brox wrote:

Is there any way to attempt to force the planner to use some specific index
while creating the plan?  Other than eventually dropping all the other
indices (which is obviously not a solution in a production setting anyway)?

I have one case where I have added 16 indices to a table, many of them
being partial indices.  The table itself has only 50k rows, but is
frequently used in heavy joins.  I imagine there can be exponential growth
in the number of alternative paths the planner must examine as a function
of the number of indices?

It seems to me that the planner is quite often not choosing the best
index, so I wonder if there is any easy way for me to check out what the
planner thinks about a specific index :-)





Re: [PERFORM] Adaptec/LSI/?? RAID

2005-06-01 Thread William Yu
I've used LSI MegaRAIDs successfully in the following systems with both 
Redhat 9 and FC3 64bit.


Arima HDAMA/8GB RAM
Tyan S2850/4GB RAM
Tyan S2881/4GB RAM

I've previously stayed away from Adaptec because we used to run Solaris 
x86 and the driver was somewhat buggy. For Linux and FreeBSD, I'd be 
less worried as open source development of drivers usually leads to 
better testing & bug-fixing.



Stacy White wrote:

We're in the process of buying another Opteron server to run Postgres, and
based on the suggestions in this list I've asked our IT director to get an
LSI MegaRaid controller rather than one of the Adaptecs.

But when we tried to place our order, our vendor (Penguin Computing) advised
us:

we find LSI does not work well with 4GB of RAM. Our engineering find that
LSI card could cause system crashes. One of our customer ... has found that
Adaptec cards works well on PostGres SQL -- they're using it as a preforce
server with xfs and post-gress.

Any comments?  Suggestions for other RAID controllers?




Re: [PERFORM] ok you all win what is best opteron (I dont want a

2005-05-15 Thread William Yu
I say "most apps" because it's true. :) I would suggest that pretty much 
every app (other than video/audio streaming) people think are 
bandwidth-limited are actually latency-limited. Take the SpecFoo tests. 
Sure I would have rather seen SAP/TPC/etc that would be more relevant to 
Postgres but there aren't any apples-to-apples comparisons available 
yet. But there's something to consider here. What people in the past 
have believed is that memory bandwidth is the key to Spec numbers -- 
SpecFP isn't a test of floating point performance, it's a test of memory 
bandwidth. Or is it? Numbers for DC Opterons show lower latency/lower 
bandwidth beating higher latency/higher bandwidth in what was supposedly 
a bandwidth-limited test. What may actually be happening is that the 
extra bandwidth isn't used directly by the app itself -- instead the CPU 
uses it for prefetching to hide latency.

Scrounging around for more numbers, I've found benchmarks at Anandtech 
that relate better to Postgres. He has an "Order Entry" OLTP app running 
on MS-SQL. 1xDC beats 2x1 -- 2xDC beats 4x1.

order entry reads
2x248 - 235113
1x175 - 257192
4x848 - 360014
2x275 - 392643
order entry writes
2x248 - 235107
1x175 - 257184
4x848 - 360008
2x275 - 392634
order entry stored procedures
2x248 - 2939
1x175 - 3215
4x848 - 4500
2x275 - 4908


Greg Stark wrote:
William Yu [EMAIL PROTECTED] writes:
 

It turns out the latency in a 2xDC setup is just so much lower and most apps
like lower latency than higher bandwidth.
   

You haven't tested anything about most apps. You tested what the SpecFoo
apps prefer. If you're curious about which Postgres prefers you'll have to
test with Postgres.
I'm not sure whether it will change the conclusion but I expect Postgres will
like bandwidth better than random benchmarks do.
 




Re: [PERFORM] Postgresql Performance via the LSI MegaRAID 2x Card

2005-05-15 Thread William Yu
I'm sure there's some corner case where more memory helps. If you 
consider that 1GB of RAM is about $100, I'd max out memory on the 
controller just for the hell of it.

Josh Berkus wrote:
Steve,

Past recommendations for a good RAID card (for SCSI) have been the LSI
MegaRAID 2x. This unit comes with 128MB of RAM on-board. Has anyone
found by increasing the on-board RAM, did Postgresql performed better?

My informal tests showed no difference between 64MB and 256MB.


Re: [PERFORM] Whence the Opterons?

2005-05-09 Thread William Yu
Unfortunately, Anandtech used Postgres only a single time in his 
benchmarks. And what it did show back then was a huge performance 
advantage for the Opteron architecture over Xeon in this case. Where the 
fastest Opterons were just 15% faster in MySQL/MSSQL/DB2 than the 
fastest Xeons, they were 100%+ faster in Postgres. He probably dropped 
Postgres from his benchmark suite since it favors Opteron too much. As a 
general hardware review site, it makes sense that he needs more neutral 
apps in order to get free systems to review and (ahem) ad dollars.

That being said, I wouldn't get a quad Opteron system anyways now that 
the dual core Opterons are available. A DP+DC system would be faster and 
cheaper than a pure quad system. Unless of course, I needed a QP+DC for 
8-way SMP.



Anjan Dave wrote:
Wasn't the context switching issue occurring in specific cases only?
I haven't seen any benchmarks for a 50% performance difference. Neither
have I seen any benchmarks of pure disk IO performance of specific
models of Dell vs HP or Sun Opterons.
Thanks,
Anjan
EMC you can file an RPQ via your sales contacts to get it approved,
though not sure how lengthy/painful that process might be, or if it's
gonna be worth it.
Read the article devoted to the v40z on anandtech.com.
I am also trying to get a quad-Opteron versus the latest quad-XEON
from
Dell (6850), but it's hard to justify a difference between a 15K dell
versus a 30k v40z for a 5-8% performance gain (read the XEON Vs.
Opteron
Database comparo on anandtech.com)...
Thanks,
Anjan

15k vs 30k is indeed a big difference. But also realize that Postgres
has a specific benefit to Opterons versus Xeons. The context switching
storm happens less on an Opteron for some reason.
I would venture a much greater benefit than 5-8%, more like 10-50%.


Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-20 Thread William Yu
I posted this link a few months ago and there was some surprise over the 
difference in postgresql compared to other DBs. (Not much surprise in 
Opteron stomping on Xeon in pgsql as most people here have had that 
experience -- the surprise was in how much smaller the difference was in 
other DBs.) If it was across the board +100% in MS-SQL, MySQL, etc -- 
you could chalk it up to overall better CPU architecture. Most of the 
time though, the numbers I've seen show +0-30% for [insert DB here] and 
a huge whopping gain for pgsql. Why the preference for postgresql is so 
pronounced was never fully explained.

BTW, the Anandtech test compares single CPU systems w/ 1GB of RAM. Go to 
dual/quad and SMP Xeon will suffer even more since it has to share a 
fixed amount of FSB/memory bandwidth amongst all CPUs. Xeons also seem 
to suffer more from context-switch storms. Go > 4GB of RAM and the Xeon 
suffers another hit due to the lack of a 64-bit IOMMU. Devices cannot 
map to addresses > 4GB, which means the OS has to do extra work in 
copying data from/to > 4GB anytime you have IO. (Although this penalty 
might exist all the time in 64-bit mode for Xeon if Linux/Windows took 
the expedient and less-buggy route of using a single method versus 
checking whether target addresses are above or below 4GB.)


Jeff Frost wrote:
On Tue, 19 Apr 2005, J. Andrew Rogers wrote:
I don't know about 2.5x faster (perhaps on specific types of loads), 
but the reason Opterons rock for database applications is their 
insanely good memory bandwidth and latency that scales much better 
than the Xeon.  Opterons also have a ccNUMA-esque I/O fabric and two 
dedicated on-die memory channels *per processor* -- no shared bus 
there, closer to real UNIX server iron than a glorified PC.

Thanks J!  That's exactly what I was suspecting it might be.  Actually, 
I found an anandtech benchmark that shows the Opteron coming in at close 
to 2.0x performance:

http://www.anandtech.com/linux/showdoc.aspx?i=2163&p=2
It's an Opteron 150 (2.4ghz) vs. Xeon 3.6ghz from August.  I wonder if 
the differences are more pronounced with the newer Opterons.

-Jeff


Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread William Yu
The Linux kernel is definitely headed this way. The 2.6 allows for 
several different I/O scheduling algorithms. A brief overview of the 
different modes:

http://nwc.serverpipeline.com/highend/60400768
Although a much older article from the beta-2.5 days, here's more 
in-depth info from one of the programmers who developed the AS scheduler 
and worked on the deadline scheduler:

http://kerneltrap.org/node/657
I think I'm going to start testing the deadline scheduler for our data 
processing server for a few weeks before trying it on our production 
servers.


Alex Turner wrote:
Whilst I admire your purist approach, I would say that if it is
beneficial to performance that a kernel understand drive geometry,
then it is worth investigating teaching it how to deal with that!
I was referring less to the kernel than to the controller.
Lets say we invented a new protocol that included the drive telling
the controller how it was laid out at initialization time so that the
controller could make better decisions about re-ordering seeks.  It
would be more cost effective to have that set of electronics just once
in the controller, than 8 times on each drive in an array, which would
yield a better performance to cost ratio.  Therefore I would suggest it
is something that should be investigated.  After all, why implement
TCQ on each drive, if it can be handled more efficiently at the other
end by the controller for less money?!
Alex Turner
netEconomist
On 4/19/05, Dave Held [EMAIL PROTECTED] wrote:
-Original Message-
From: Alex Turner [mailto:[EMAIL PROTECTED]
Sent: Monday, April 18, 2005 5:50 PM
To: Bruce Momjian
Cc: Kevin Brown; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?
Does it really matter at which end of the cable the queueing is done
(Assuming both ends know as much about drive geometry etc..)?
[...]
The parenthetical is an assumption I'd rather not make.  If my
performance depends on my kernel knowing how my drive is laid
out, I would always be wondering if a new drive is going to
break any of the kernel's geometry assumptions.  Drive geometry
doesn't seem like a kernel's business any more than a kernel
should be able to decode the ccd signal of an optical mouse.
The kernel should queue requests at a level of abstraction that
doesn't depend on intimate knowledge of drive geometry, and the
drive should queue requests on the concrete level where geometry
matters.  A drive shouldn't guess whether a process is trying to
read a file sequentially, and a kernel shouldn't guess whether
sector 30 is contiguous with sector 31 or not.
__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129


Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread William Yu
My experience:
1xRAID10 for postgres
1xRAID1 for OS + WAL
Jeff Frost wrote:
Now that we've hashed out which drives are quicker and more money equals 
faster...

Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid 
option would you use for a standalone postgres server?

a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
b) 1xRAID1 for OS/xlog, 1xRAID5 for data
c) 1xRAID10 for OS/xlog/data
d) 1xRAID1 for OS, 1xRAID10 for data
e) .
I was initially leaning towards b, but after talking to Josh a bit, I 
suspect that with only 4 disks the raid5 might be a performance 
detriment vs 3 raid 1s or some sort of split raid10 setup.



Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread William Yu
Problem with this strategy. You want battery-backed write caching for 
best performance  safety. (I've tried IDE for WAL before w/ write 
caching off -- the DB got crippled whenever I had to copy files from/to 
the drive on the WAL partition -- ended up just moving WAL back on the 
same SCSI drive as the main DB.) That means in addition to a $$$ SCSI 
caching controller, you also need a $$$ SATA caching controller. From my 
glance at prices, advanced SATA controllers seem to cost nearly as much 
as their SCSI counterparts.

This also looks to be the case for the drives themselves. Sure you can 
get super cheap 7200RPM SATA drives but they absolutely suck for 
database work. Believe me, I gave it a try once -- ugh. The highend WD 
10K Raptors look pretty good though -- the benchmarks @ storagereview 
seem to put these drives at about 90% of SCSI 10Ks for both single-user 
and multi-user. However, they're also priced like SCSIs -- here's what I 
found @ Mwave (going through pricewatch to find WD740GDs):

Seagate 7200 SATA -- 80GB  $59
WD 10K SATA       -- 72GB  $182
Seagate 10K U320  -- 72GB  $289
Using the above prices for a fixed budget for RAID-10, you could get:
SATA 7200 -- 680GB per $1000
SATA 10K  -- 200GB per $1000
SCSI 10K  -- 125GB per $1000
For a 99% read-only DB that requires lots of disk space (say something 
like Wikipedia or a blog host), using consumer level SATA is probably 
ok. For anything else, I'd consider SATA 10K if (1) I do not need 15K 
RPM and (2) I don't have SCSI infrastructure already.
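The capacity-per-$1000 figures above work out if you allow fractional drives per $1000 and halve the raw capacity for RAID-10 mirroring. A quick sketch reproducing the arithmetic, using the drive prices quoted in the list:

```python
# Usable RAID-10 capacity per $1000: (drives you can afford) x (GB per
# drive) / 2, since RAID-10 mirrors halve usable space. Fractional
# drives are allowed here to match the thread's rounded figures.
def raid10_gb_per_1000(price_per_drive: float, gb_per_drive: float) -> float:
    drives = 1000 / price_per_drive
    return drives * gb_per_drive / 2

print(round(raid10_gb_per_1000(59, 80)))   # ~678 (quoted as 680)
print(round(raid10_gb_per_1000(182, 72)))  # ~198 (quoted as 200)
print(round(raid10_gb_per_1000(289, 72)))  # ~125
```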

Steve Poe wrote:
If SATA drives don't have the ability to replace SCSI for a multi-user
Postgres apps, but you needed to save on cost (ALWAYS an issue), 
could/would you implement SATA for your logs (pg_xlog) and keep the rest 
on SCSI?

Steve Poe
Mohan, Ross wrote:
I've been doing some reading up on this, trying to keep up here, and 
have found out that (experts, just yawn and cover your ears)

1) some SATA drives (just type II, I think?) have a Phase Zero
   implementation of Tagged Command Queueing (the special sauce
   for SCSI).
2) This SATA TCQ is called NCQ and I believe it basically
   allows the disk software itself to do the reordering
   (this is called simple in TCQ terminology) It does not
   yet allow the TCQ head of queue command, allowing the
   current tagged request to go to head of queue, which is
   a simple way of manifesting a high priority request.
3) SATA drives are not yet multi-initiator?
Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives
are likely to whomp SATA II drives for a while yet (read: a
year or two) in multiuser PostGres applications.
-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 2:04 PM
To: Kevin Brown
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Kevin Brown [EMAIL PROTECTED] writes:
 

Greg Stark wrote:
  

I think you're being misled by analyzing the write case.
Consider the read case. When a user process requests a block and 
that read makes its way down to the driver level, the driver can't 
just put it aside and wait until it's convenient. It has to go ahead 
and issue the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a 
couple of milliseconds to see if other requests come in, and then 
issue the read if none do.  If there are already other requests being 
fulfilled, then it'll schedule the request in question just like the 
rest.
  

But then the cure is worse than the disease. You're basically 
describing exactly what does happen anyways, only you're delaying more 
requests than necessary. That intervening time isn't really idle, it's 
filled with all the requests that were delayed during the previous 
large seek...

 

Once the first request has been fulfilled, the driver can now 
schedule the rest of the queued-up requests in disk-layout order.

I really don't see how this is any different between a system that 
has tagged queueing to the disks and one that doesn't.  The only 
difference is where the queueing happens.
  

And *when* it happens. Instead of being able to issue requests while a 
large seek is happening and having some of them satisfied they have to 
wait until that seek is finished and get acted on during the next 
large seek.

If my theory is correct then I would expect bandwidth to be 
essentially equivalent but the latency on SATA drives to be increased 
by about 50% of the average seek time. Ie, while a busy SCSI drive can 
satisfy most requests in about 10ms a busy SATA drive would satisfy 
most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI 
drives have even lower seek times and no such IDE/SATA drives exist...)

In reality higher latency feeds into a system feedback loop causing 
your application to run slower causing bandwidth demands to be lower 
as well. It's often hard to distinguish root causes from symptoms when 
optimizing 

Re: [PERFORM] How to improve db performance with $7K?

2005-04-06 Thread William Yu
Alex Turner wrote:
I'm no drive expert, but it seems to me that our write performance is
excellent.  I think what most are concerned about is OLTP where you
are doing heavy write _and_ heavy read performance at the same time.
Our system is mostly read during the day, but we do a full system
update every night that is all writes, and it's very fast compared to
the smaller SCSI system we moved off of.  Nearly a 6x speed
improvement, as fast as 900 rows/sec with a 48 byte record, one row
per transaction.
I've started with SATA in a multi-read/multi-write environment. While it 
ran pretty good with 1 thread writing, the addition of a 2nd thread 
(whether reading or writing) would cause exponential slowdowns.

I suffered through this for a week and then switched to SCSI. Single 
threaded performance was pretty similar but with the advanced command 
queueing SCSI has, I was able to do multiple reads/writes simultaneously 
with only a small performance hit for each thread.

Perhaps having a SATA caching raid controller might help this situation. 
I don't know. It's pretty hard to justify buying a $$$ 3ware controller 
just to test it when you could spend the same money on SCSI and have a 
guarantee it'll work well under multi-IO scenarios.



Re: [PERFORM] How to improve db performance with $7K?

2005-04-06 Thread William Yu
It's the same money if you factor in the 3ware controller. Even without 
a caching controller, SCSI works well in multi-threaded IO 
(notwithstanding crappy shit from Dell or Compaq). You can get such cards 
from LSI for $75. And of course, many server MBs come with LSI 
controllers built-in. Our older 32-bit production servers all use Linux 
software RAID w/ SCSI and there's no issues when multiple 
users/processes hit the DB.

*Maybe* a 3ware controller w/ onboard cache + battery backup might do 
much better for multi-threaded IO than just plain-jane SATA. 
Unfortunately, I have not been able to find anything online that can 
confirm or deny this. Hence, the choice is spend $$$ on the 3ware 
controller and hope it meets your needs -- or spend $$$ on SCSI drives 
and be sure.

Now if you want to run such tests, we'd all be delighted to see the 
results so we have another option for building servers.

Alex Turner wrote:
It's hardly the same money, the drives are twice as much.
It's all about the controller baby with any kind of drive.  A bad SCSI
controller will give sucky performance too, believe me.  We had a
Compaq Smart Array 5304, and its performance was _very_ sub par.
If someone has a simple benchmark test database to run, I would be
happy to run it on our hardware here.
Alex Turner
On Apr 6, 2005 3:30 AM, William Yu [EMAIL PROTECTED] wrote:
Alex Turner wrote:
I'm no drive expert, but it seems to me that our write performance is
excellent.  I think what most are concerned about is OLTP where you
are doing heavy write _and_ heavy read performance at the same time.
Our system is mostly read during the day, but we do a full system
update every night that is all writes, and it's very fast compared to
the smaller SCSI system we moved off of. Nearly a 6x speed
improvement, as fast as 900 rows/sec with a 48 byte record, one row
per transaction.
I've started with SATA in a multi-read/multi-write environment. While it
ran pretty well with 1 thread writing, the addition of a 2nd thread
(whether reading or writing) would cause exponential slowdowns.
I suffered through this for a week and then switched to SCSI. Single
threaded performance was pretty similar but with the advanced command
queueing SCSI has, I was able to do multiple reads/writes simultaneously
with only a small performance hit for each thread.
Perhaps having a SATA caching RAID controller might help this situation.
I don't know. It's pretty hard to justify buying a $$$ 3ware controller
just to test it when you could spend the same money on SCSI and have a
guarantee it'll work well under multi-IO scenarios.


Re: [PERFORM] name search query speed

2005-03-03 Thread William Yu
Jeremiah Jahn wrote:
I have about 5M names stored on my DB. Currently the searches are very
quick unless they are on a very common last name, i.e. SMITH. The index
is always used, but I still hit 10-20 seconds on a SMITH or Jones
search, and I average about 6 searches a second and max out at about
30/s. Any suggestions on how I could arrange things to make this search
quicker? I have 4gb of mem on a raid 5 w/ 3 drives. I'm hoping that I
can increase this speed w/o a HW upgrade.
If it's just SMITH, the only fix is to throw more hardware at the 
problem. I've got my own database of medical providers & facilities in 
the millions and anytime somebody tries to search for MEDICAL FACILITY, 
it takes forever. I've tried every optimization possible but when you 
have 500K records with the word MEDICAL in it, what can you do? You've 
got to check all 500K records to see if it matches your criteria.

For multi-word searches, what I've found does work is to periodically 
generate stats on word frequencies and use those stats to search the 
least common words first. For example, if somebody enters ALTABATES 
MEDICAL HOSPITAL, I can get the ~50 providers with ALTABATES in the 
name and then do a 2nd and 3rd pass to filter against MEDICAL and HOSPITAL.
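The least-common-word-first strategy described above can be sketched in a few lines. The frequency table and record list below are hypothetical stand-ins; in practice the stats would come from the periodically generated word-frequency table:

```python
# Sketch of "search the least common words first". Word frequencies would
# normally come from periodically generated stats; these are made up.
word_freq = {"ALTABATES": 50, "MEDICAL": 500_000, "HOSPITAL": 120_000}

records = [
    "ALTABATES MEDICAL HOSPITAL",
    "ALTABATES SUMMIT MEDICAL CENTER",
    "GENERAL MEDICAL HOSPITAL",
]

def search(query, rows):
    # Order query terms by ascending frequency so the first pass produces
    # the smallest candidate set; later passes only filter that small set.
    terms = sorted(query.split(), key=lambda w: word_freq.get(w, 0))
    candidates = [r for r in rows if terms[0] in r]   # narrow first pass
    for t in terms[1:]:                               # cheap filter passes
        candidates = [r for r in candidates if t in r]
    return candidates

print(search("ALTABATES MEDICAL HOSPITAL", records))
# ['ALTABATES MEDICAL HOSPITAL']
```

The point of the sort is that filtering ~50 ALTABATES rows against MEDICAL is vastly cheaper than scanning 500K MEDICAL rows first.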



Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread William Yu
You can get 64-bit Xeons also but it takes a hit in the I/O department due 
to the lack of a hardware I/O MMU, which limits DMA transfers to 
addresses below 4GB. This has a two-fold impact:

1) Transferring data to addresses above 4GB requires first a DMA transfer 
to below 4GB and then a copy to the final destination.

2) You must allocate real memory 2X the address space of the devices to 
act as bounce buffers. This is especially problematic for workstations 
because if you put a 512MB Nvidia card in your computer for graphics 
work -- you've just lost 1GB of memory. (I dunno how much the typical 
SCSI/NIC/etc take up.)

I thought Intel was copying AMD's 64-bit API.  Is Intel's
implementation as poor as your description?  Does Intel have any better
64-bit offering other than the Itanium/Itanic?
Unfortunately, there's no easy way for Intel to have implemented a 
64-bit IOMMU under their current restrictions. The memory controller 
resides on the chipset and to upgrade the functionality significantly, 
it would probably require changing the bus protocol. It's not that they 
couldn't do it -- it would just require all Intel chipset/MB 
vendors/partners to go through the process of creating & validating 
totally new products. A way lengthier process than just producing 64-bit 
CPUs that drop into current motherboards.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-02 Thread William Yu
Bruce Momjian wrote:
William Yu wrote:
You can get 64-bit Xeons also but it takes a hit in the I/O department due 
to the lack of a hardware I/O MMU, which limits DMA transfers to 
addresses below 4GB. This has a two-fold impact:

1) Transferring data to addresses above 4GB requires first a DMA transfer 
to below 4GB and then a copy to the final destination.

2) You must allocate real memory 2X the address space of the devices to 
act as bounce buffers. This is especially problematic for workstations 
because if you put a 512MB Nvidia card in your computer for graphics 
work -- you've just lost 1GB of memory. (I dunno how much the typical 
SCSI/NIC/etc take up.)

When you say allocate real memory 2X are you saying that if you have
16GB of RAM only 8GB is actually usable and the other 8GB is for
bounce buffers, or is it just address space being used up?
It's 2x the memory space of the devices. E.g. a Nvidia graphics card w/ 
512MB of RAM would require 1GB of memory to act as bounce buffers. And 
it has to be real chunks of memory in 64-bit mode since the DMA transfer 
must drop data into memory below 4GB before it can be copied above 4GB.



Re: [PERFORM] High end server and storage for a PostgreSQL OLTP system

2005-02-01 Thread William Yu
Jim C. Nasby wrote:
On Tue, Feb 01, 2005 at 07:35:35AM +0100, Cosimo Streppone wrote:
You might look at Opteron's, which theoretically have a higher data
bandwidth. If you're doing anything data intensive, like a sort in
memory, this could make a difference.
Would Opteron systems need 64-bit postgresql (and os, gcc, ...)
build to have that advantage?
 
Well, that would give you the most benefit, but the memory bandwidth is
still greater than on a Xeon. There's really no issue with 64 bit if
you're using open source software; it all compiles for 64 bits and
you're good to go. http://stats.distributed.net runs on a dual opteron
box running FreeBSD and I've had no issues.
You can get 64-bit Xeons also but it takes a hit in the I/O department due 
to the lack of a hardware I/O MMU, which limits DMA transfers to 
addresses below 4GB. This has a two-fold impact:

1) Transferring data to addresses above 4GB requires first a DMA transfer 
to below 4GB and then a copy to the final destination.

2) You must allocate real memory 2X the address space of the devices to 
act as bounce buffers. This is especially problematic for workstations 
because if you put a 512MB Nvidia card in your computer for graphics 
work -- you've just lost 1GB of memory. (I dunno how much the typical 
SCSI/NIC/etc take up.)



Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-28 Thread William Yu
Hervé Piedvache wrote:
My point being is that there is no free solution.  There simply isn't.
I don't know why you insist on keeping all your data in RAM, but the
mysql cluster requires that ALL data MUST fit in RAM all the time.

I don't insist on having all the data in RAM, but when you use PostgreSQL with a 
big database, you know that for quick access, just for reading the index files 
for example, it's better to have as much RAM as possible ... I just want to be 
able to get quick access with a growing and growing database ...
If it's an issue of RAM and not CPU power, think about this scenario. 
Let's just say you *COULD* partition your DB over multiple servers. What 
are your plans then? Are you going to buy 4 Dual Xeon servers? Ok, 
let's price that out.

For a full-blown rackmount server w/ RAID, 6+ SCSI drives and so on, you 
are looking at roughly $4000 per machine. So now you have 4 machines -- 
total of 16GB of RAM over the 4 machines.

On the other hand, let's say you spent that money on a Quad Opteron 
instead. 4x850 will cost you roughly $8000. 16GB of RAM using 1GB DIMMs 
is $3000. If you went with 2GB DIMMs, you could stuff 32GB of RAM onto 
that machine for $7500.

Let's review the math:
4X server cluster, total 16GB RAM = $16K
1 beefy server w/ 16GB RAM = $11K
1 beefy server w/ 32GB RAM = $16K
I know what I would choose. I'd get the mega server w/ a ton of RAM and 
skip all the trickiness of partitioning a DB over multiple servers. Yes, 
your data will grow to a point where even the XXGB can't cache 
everything. On the other hand, memory prices drop just as fast. By that 
time, you can ebay your original 16/32GB and get 64/128GB.
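The price math above can be re-checked directly (all figures as quoted in the thread; the 32GB total comes to $15.5K, rounded to $16K above):

```python
# Sanity check on the cluster-vs-big-server pricing in the message.
servers, per_server = 4, 4000
cluster_total = servers * per_server          # 4X server cluster, 16GB total

quad_opteron = 8000                           # 4x Opteron 850
ram_16gb_1gb_dimms = 3000
ram_32gb_2gb_dimms = 7500

beefy_16gb = quad_opteron + ram_16gb_1gb_dimms
beefy_32gb = quad_opteron + ram_32gb_2gb_dimms
print(cluster_total, beefy_16gb, beefy_32gb)  # 16000 11000 15500
```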



Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-20 Thread William Yu
Hervé Piedvache wrote:
Sorry but I don't agree with this ... Slony is a replication solution ... I 
don't need replication ... what will I do when my database grows to 50 
GB ... will I need more than 50 GB of RAM on each server ???
This solution is not very realistic for me ...
Have you confirmed you need a 1:1 RAM:data ratio? Of course more memory 
gets more speed, but often at a diminishing rate of return. Unless every 
record of your 50GB is used in every query, only the most commonly used 
elements of your DB need to be in RAM. This is the very idea of caching.



Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql

2005-01-17 Thread William Yu
I inferred this from reading up on the compressed vm project. It can be 
higher or lower depending on what devices you have in your system -- 
however, I've read messages from kernel hackers saying Linux is very 
aggressive in reserving memory space for devices because it must be 
allocated at boottime.


Josh Berkus wrote:
William,

The threshold for using PAE is actually far lower than 4GB. 4GB is the
total memory address space -- split that in half for 2GB for userspace,
2GB for kernel. The OS cache resides in kernel space -- after you take
away the memory allocation for devices, you're left with a window of
roughly 900MB.

I'm curious, how do you get 1.1GB for memory allocation for devices?


Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql

2005-01-17 Thread William Yu
[EMAIL PROTECTED] wrote:
Since the optimal state is to allocate a small amount of memory to
Postgres and leave a huge chunk to the OS cache, this means you are
already hitting the PAE penalty at 1.5GB of memory.
How can I avoid hitting this penalty?
Upgrade to 64-bit processors + 64-bit linux.


Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql

2005-01-17 Thread William Yu
My experience is RH9 auto-detects machines with >= 2GB of RAM and installs 
the PAE bigmem kernel by default. I'm pretty sure the FC2/3 installer 
will do the same.


[EMAIL PROTECTED] wrote:
I understand that the 2.6.* kernels are much better at large memory
support (with respect to performance issues), so unless you have a
64-bit machine lying around - this is probably worth a try.
You may need to build a new kernel with the various parameters :
CONFIG_NOHIGHMEM
CONFIG_HIGHMEM4G
CONFIG_HIGHMEM64G
set appropriately (or even upgrade to the latest 2.6.10). I would expect
that some research and experimentation will be required to get the best
out of it - (e.g. the 'bounce buffers' issue).

In the standard rpm FC 2-3 with a newly installed server, would it automatically
detect and config it if I use the machine with > 4 GB [6 GB], or should I
config it manually?
Amrit
Thailand


Re: [PERFORM] Increasing RAM for more than 4 Gb. using postgresql

2005-01-13 Thread William Yu
Gavin Sherry wrote:
There is no problem with free Linux distros handling > 4 GB of memory. The
problem is that 32-bit hardware must make use of some less than efficient
mechanisms to be able to address the memory.
The threshold for using PAE is actually far lower than 4GB. 4GB is the 
total memory address space -- split that in half for 2GB for userspace, 
2GB for kernel. The OS cache resides in kernel space -- after you take 
away the memory allocation for devices, you're left with a window of 
roughly 900MB.

Since the optimal state is to allocate a small amount of memory to 
Postgres and leave a huge chunk to the OS cache, this means you are 
already hitting the PAE penalty at 1.5GB of memory.



Re: [PERFORM] Low Performance for big hospital server ..

2005-01-05 Thread William Yu
[EMAIL PROTECTED] wrote:
Now I turned hyperthreading off and readjusted the conf. I found the problem query,
which was:
update one flag of the table [8 million records, which I think is not too much].
When I turned this query off, everything went fine.
I don't know whether updating the data is much slower than inserting [PostgreSQL
7.3.2] and how we could improve the update method?
UPDATE is expensive. Under an MVCC setup, it's roughly the equivalent of 
DELETE + INSERT (i.e., the old record is deprecated and a new version of 
the record is inserted). Updating 8 million records would be very I/O 
intensive and probably flushes your OS cache so all other queries hit 
disk versus superfast memory. And if this operation is run multiple times 
during the day, you may end up with a lot of dead tuples in the table 
which makes querying it deadly slow.

If it's a dead tuples issue, you probably have to increase your 
freespace map and vacuum analyze that specific table more often. If it's 
an I/O hit issue, a lazy updating procedure would help if the operation 
is not time critical (eg. load the record keys that need updating and 
loop through the records with a time delay.)
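The lazy-updating idea above can be sketched as follows. Table name, column, and batch size are hypothetical; in real code the generated statements would be executed against the database instead of collected:

```python
import time

# Sketch of "lazy updating": instead of one huge UPDATE touching 8M rows
# (and flushing the OS cache), load the keys that need updating and work
# through them in small batches with a time delay between batches.
def lazy_update(keys, batch_size=1000, delay=0.0):
    statements = []
    keys = list(keys)
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        ids = ",".join(str(k) for k in batch)
        # In real code: cursor.execute(...) here; we just collect the SQL.
        statements.append(f"UPDATE t SET flag = 1 WHERE id IN ({ids});")
        time.sleep(delay)  # throttle so other queries keep their cache warm
    return statements

stmts = lazy_update(range(2500), batch_size=1000)
print(len(stmts))  # 3 batches for 2500 keys
```

Small batches with a delay trade total runtime for a much gentler I/O profile, which is the point when the operation is not time critical.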



Re: [PERFORM] Low Performance for big hospital server ..

2005-01-03 Thread William Yu
[EMAIL PROTECTED] wrote:
I will try to reduce shared buffer to 1536 [1.87 Mb].
1536 is probably too low. I've tested a bunch of different settings on my 
8GB Opteron server and 10K seems to be the best setting.


also effective cache is the sum of kernel buffers + shared_buffers so it
should be bigger than shared buffers.
also make the effective cache to 2097152 [2 Gb].
I will give you the result , because tomorrow [4/12/05] will be the official day
of my hospital [which have more than 1700 OPD patient/day].
To figure out your effective cache size, run top and add free+cached.
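The same arithmetic can be done from the output of `free` instead of reading it off `top`; a small sketch (the sample numbers below are made up):

```python
# "Add free + cached" to estimate effective cache size, from `free` output.
# Canned sample output for illustration; values are in kB.
sample = """\
             total       used       free     shared    buffers     cached
Mem:       4040188    3800120     240068          0      85320    2910456
"""

def effective_cache_kb(free_output):
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            fields = line.split()
            free_kb, cached_kb = int(fields[3]), int(fields[6])
            return free_kb + cached_kb
    raise ValueError("no Mem: line found")

print(effective_cache_kb(sample))  # 240068 + 2910456 = 3150524 kB
```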

Also turning hyperthreading off may help, it is unlikely it is doing any
good unless you are running a relatively new (2.6.x) kernel.
Why , could you give me the reason?
Pre 2.6, the kernel does not know the difference between logical and 
physical CPUs. Hence, in a dual processor system with hyperthreading, it 
actually sees 4 CPUs. And when assigning processes to CPUs, it may 
assign to 2 logical CPUs in the same physical CPU.



I presume you are vacuuming on a regular basis?
Yes , vacuumdb daily.
Do you vacuum table by table or the entire DB? I find over time, the 
system tables can get very bloated and cause a lot of slowdowns just due 
to schema queries/updates. You might want to try a VACUUM FULL ANALYZE 
just on the system tables.



Re: [PERFORM] Low Performance for big hospital server ..

2005-01-03 Thread William Yu
Dave Cramer wrote:

William Yu wrote:
[EMAIL PROTECTED] wrote:
I will try to reduce shared buffer to 1536 [1.87 Mb].

1536 is probably too low. I've tested a bunch of different settings on 
my 8GB Opteron server and 10K seems to be the best setting.

Be careful here, he is not using Opterons, which can access physical 
memory above 4G efficiently. Also, he only has 4G; the 6-10% rule still 
applies.
10% of 4GB is 400MB. 10K buffers is 80MB. Easily less than the 6-10% rule.

To figure out your effective cache size, run top and add free+cached.

My understanding is that effective cache is the sum of shared buffers, 
plus kernel buffers, not sure what free + cached gives you?
Not true. Effective cache size is the free memory available that the OS 
can use for caching for Postgres. In a system that runs nothing but 
Postgres, it's free + cached.



Re: [PERFORM] Some Performance Advice Needed

2004-12-23 Thread William Yu
Alex wrote:
Hi,
I recently ran pgbench against different servers and got some results I 
don't quite understand.

A) EV1: Dual Xeon, 2GHz, 1GB Memory, SCSI 10Krpm, RHE3
B) Dual Pentium3 1.4ghz (Blade), SCSI Disk 10Krmp, 1GB Memory, Redhat 8
C) P4 3.2GHz, IDE 7.2Krpm, 1GBMem, Fedora Core2

Runnig PGbench reported
A) 220 tps
B) 240 tps
C) 510 tps
Running hdparm reported
A) 920mb/s   (SCSI 10k)
B) 270mb/s   (SCSI 10k)
C) 1750mb/s  (IDE  7.2k)
What I don't quite understand is why a P4 3.2 is twice as fast as a Dual 
Xeon with SCSI disks, a Dual Xeon 2GHz is not faster than a dual P3 
1.4GHz, and the hdparm results also don't make much sense.
A few things to clear up about the P3/P4/Xeons.
Xeons are P4s. Hence, a P4 2ghz will run the same speed as a Xeon 2ghz 
assuming all other variables are the same. Of course they aren't because 
your P4 is probably running unregistered memory, uses either a 533MHz or 
800MHz FSB compared to the Xeon's 400MHz FSB shared among 2 CPUs, and runs 
a faster non-SMP kernel. Add all those variables up and it's definitely 
possible for a P4 3.2ghz to run twice as fast as a Dual Xeon 2ghz on a 
single-thread benchmark. (The corollary here is that in a multi-thread 
benchmark, the 2X Xeon can only hope to equal your P4 3.2.)

P3s are faster than P4s at the same clock rate. By a lot. It's not 
really that surprising that a P3 1.4 is faster than a P4/Xeon 2.0. I've 
seen results like this many times over a wide range of applications.

The only variable that is throwing off your comparisons are the hard 
drives. IDE drives have write caching on by default -- SCSI drives have 
it off. Use: hdparm -W0 /dev/hda to turn it off on the P4 system and 
rerun the tests then.



Re: [PERFORM] Some Performance Advice Needed

2004-12-23 Thread William Yu
IDE disks lie about write completion (This can be disabled on some 
drives) whereas SCSI drives wait for the data to actually be written 
before they report success.  It is quite
easy to corrupt a PG (Or most any db really) on an IDE drive.  Check 
the archives for more info.

Do we have any real info on this? Specifically which drives? Is SATA the 
same way? What about SATA-II?
I am not saying it isn't true (I know it is) but this is a blanket 
statement that may or may not be
true with newer tech.
From my experience with SATA controllers, write caching is controlled 
via the BIOS.



Re: [PERFORM] Some quick Opteron 32-bit/64-bit results

2004-11-15 Thread William Yu
Greg Stark wrote:
William Yu [EMAIL PROTECTED] writes:

Biggest speedup I've found yet is the backup process (pg_dump piped to gzip). 100%
faster in 64-bit mode. This drastic speedup might be more the result of 64-bit
gzip though, as I've seen benchmarks in the past showing encryption/compression
running 2 or 3 times faster in 64-bit mode versus 32-bit.

Isn't this a major kernel bump too? So a different scheduler, different IO
scheduler, etc?
I'm sure there's some speedup due to the kernel bump. I really didn't 
have the patience to even burn the FC2 32-bit CDs, much less install both 
32-bit & 64-bit FC2, in order to have a more accurate baseline comparison.

However, that being said -- when you see huge speed increases like 50%-100% 
for dump+gzip, it's doubtful the kernel/process scheduler/IO 
scheduler could have made that drastic of a difference. Maybe somebody 
else who has done a 2.4 to 2.6 upgrade can give us a baseline to 
subtract from my numbers.



[PERFORM] Some quick Opteron 32-bit/64-bit results

2004-11-13 Thread William Yu
I just finished upgrading the OS on our Opteron 148 from Redhat9 to 
Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc.

The verdict: a definite performance improvement. I tested just a few CPU 
intensive queries and many of them are a good 30%-50% faster. 
Transactional/batch jobs involving client machines (i.e. include fixed 
client/networking/odbc overhead) seem to be about 10%-20% faster 
although I will need to run more data through the system to get a better 
feel of the numbers.



Re: [PERFORM] Some quick Opteron 32-bit/64-bit results

2004-11-13 Thread William Yu
Biggest speedup I've found yet is the backup process (pg_dump piped to gzip). 
100% faster in 64-bit mode. This drastic speedup might be more the result 
of 64-bit gzip though, as I've seen benchmarks in the past showing 
encryption/compression running 2 or 3 times faster in 64-bit mode versus 
32-bit.


William Yu wrote:
I just finished upgrading the OS on our Opteron 148 from Redhat9 to 
Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc.

The verdict: a definite performance improvement. I tested just a few CPU 
intensive queries and many of them are a good 30%-50% faster. 
Transactional/batch jobs involving client machines (i.e. include fixed 
client/networking/odbc overhead) seem to be about 10%-20% faster 
although I will need to run more data through the system to get a better 
feel of the numbers.


Re: [PERFORM] Some quick Opteron 32-bit/64-bit results

2004-11-13 Thread William Yu
I gave -O3 a try with -funroll-loops, -fomit-frame-pointer and a few 
others. Seemed to perform about the same as the default -O2 so I just 
left it as -O2.

Gustavo Franklin Nóbrega wrote:
Hi Willian,
Which are the GCC flags that you it used to compile PostgreSQL?
Best regards,
Gustavo Franklin Nóbrega
Infraestrutura e Banco de Dados
Planae Tecnologia da Informação
(+55) 14 3224-3066 Ramal 209
www.planae.com.br

I just finished upgrading the OS on our Opteron 148 from Redhat9 to
Fedora FC2 X86_64 with full recompiles of Postgres/Apache/Perl/Samba/etc.
The verdict: a definite performance improvement. I tested just a few CPU
intensive queries and many of them are a good 30%-50% faster.
Transactional/batch jobs involving client machines (i.e. include fixed
client/networking/odbc overhead) seem to be about 10%-20% faster
although I will need to run more data through the system to get a better
feel of the numbers.


Re: [PERFORM] Caching of Queries

2004-10-02 Thread William Yu
Josh Berkus wrote:
1) Query caching is not  a single problem, but rather several different 
problems requiring several different solutions.

2) Of these several different solutions, any particular query result caching 
implementation (but particularly MySQL's) is rather limited in its 
applicability, partly due to the tradeoffs required.Per your explanation, 
Oracle has improved this by offering a number of configurable options.

3) Certain other caching problems would be solved in part by the ability to 
construct in-memory tables which would be non-durable and protected from 
cache-flushing.  This is what I'm interested in chatting about.
Just my 2 cents on this whole issue. I would lean towards having result 
caching in pgpool versus the main backend. I want every ounce of memory 
on a database server devoted to the database. Caching results would 
double the effect of cache flushing ... ie, now both the results and the 
pages used to build the results are in memory pushing out other stuff to 
disk that may be just as important.

If it was in pgpool or something similar, I could devote a separate 
machine just for caching results leaving the db server untouched.



Re: [PERFORM] Table UPDATE is too slow

2004-08-31 Thread William Yu
Ron St-Pierre wrote:
Yes, I know that it's not a very good idea; however, queries are allowed 
against all of those columns. One option is to disable some or all of the
indexes when we update, run the update, and recreate the indexes, 
however it may slow down user queries. Because there are so many indexes,
it is time consuming to recreate them after the update.
Just because a query can run against any column does not mean all 
columns should be indexed. Take a good look at the column types and 
their value distribution.

Let's say I have a table of addresses but every address I collect is in 
the 94116 zip code. That would mean indexes on city, state and zip are 
not only useless but could decrease performance.

Also, if a search always includes a unique key (or a column with highly 
unique values), eliminating the other indexes would force the planner to 
always use that index first.
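The selectivity argument above can be made concrete with a tiny sketch. The sample rows are made up to mirror the 94116 example; in a real system you would look at `pg_stats` instead:

```python
# An index only pays off when a column's values actually discriminate
# between rows. Hypothetical sample rows: (zip, city).
rows = [("94116", "SF"), ("94116", "SF"), ("94116", "SF"), ("94117", "SF")]

def selectivity(rows, col):
    # Fraction of distinct values; near 0 means almost every row shares
    # one value and an index on the column is close to useless.
    values = [r[col] for r in rows]
    return len(set(values)) / len(values)

zip_sel = selectivity(rows, 0)    # 2 distinct / 4 rows = 0.5
city_sel = selectivity(rows, 1)   # 1 distinct / 4 rows = 0.25
print(zip_sel, city_sel)
```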



Re: [PERFORM] Help specifying new machine

2004-08-15 Thread William Yu
You're not getting much of a bump with this server. The CPU is
incrementally faster -- in the absolutely best case scenario where your
queries are 100% cpu-bound, that's about ~25%-30% faster.

What about using Dual Athlon MP instead of a Xeon? Would be much less expensive,
but have higher performance (I think).
You're not going to be able to get a Dual Athlon MP for the same price 
as a single Xeon. A few years back, this was the case because Xeon CPUs 
& MBs had a huge premium over Athlon. This is no longer true, mainly 
because the number of people carrying Athlon MP motherboards has dropped 
drastically. Go to pricewatch.com and do a search for 760MPX -- you 
get a mere 8 entries. Not surprising, because who would not want to 
spend a few pennies more for a much superior Dual Opteron? The few 
sellers you see now just keep stuff in inventory for people who need 
replacement parts for emergencies and are willing to pay through the nose 
because it is an emergency.



Re: [PERFORM] Scaling further up

2004-03-02 Thread William Yu
Anjan Dave wrote:
We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running RH9, 
PG 7.4.0. There's an internal U320, 10K RPM RAID-10 setup on 4 drives.
 
We are expecting a pretty high load, a few thousands of 'concurrent' 
users executing either select, insert, update, statments.
The quick and dirty method would be to upgrade to the recently announced 
3GHz Xeon MPs with 4MB of L3. My semi-educated guess is that you'd get 
another +60% there due to the huge L3 hiding the Xeon's shared bus penalty.



Re: [PERFORM] cache whole data in RAM

2004-02-04 Thread William Yu
David Teran wrote:
Hi,

we are trying to speed up a database which has about 3 GB of data. The 
server has 8 GB RAM and we wonder how we can ensure that the whole DB is 
read into RAM. We hope that this will speed up some queries.

regards David


Upon bootup, automatically run SELECT * FROM xyz on every table in 
your database.
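A hedged sketch of that cache-warming suggestion: build one SELECT per table so a startup script can read every table into the OS cache. Table names are hypothetical; in practice the list would come from `pg_tables`:

```python
# Generate one warm-up statement per table. count(*) forces a full
# sequential scan (pulling the table into the OS cache) without shipping
# every row back to the client the way SELECT * would.
tables = ["patients", "visits", "billing"]  # hypothetical names

def warmup_statements(tables):
    return [f"SELECT count(*) FROM {t};" for t in tables]

for stmt in warmup_statements(tables):
    print(stmt)
```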



[PERFORM] Update on putting WAL on ramdisk/

2003-12-12 Thread William Yu
Some arbitrary data processing job

WAL on single drive: 7.990 rec/s
WAL on 2nd IDE drive: 8.329 rec/s
WAL on tmpfs: 13.172 rec/s
A huge jump in performance but a bit scary having a WAL that can 
disappear at any time. I'm gonna work up an rsync script and do some 
power-off experiments to see how badly it gets mangled.

This could be a good method though when you're dumping and restoring an 
entire DB. Make a tmpfs mount, restore, shut down the DB and then copy 
the WAL back to the HD.

I checked out the SanDisk IDE FlashDrives. They have a write cycle life 
of 2 million. I'll explore more expensive solid state drives.
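A back-of-envelope check on that 2-million write-cycle figure. Only the cycle count comes from the message; the drive capacity matches the 2GB FlashDrive mentioned, and the daily WAL volume is an assumed number for illustration, with perfect wear leveling assumed:

```python
# Rough flash lifetime estimate for a WAL device. Only `cycles` is from
# the message; capacity and daily WAL traffic are assumptions.
cycles = 2_000_000                 # rated erase/write cycles per block
capacity_mb = 2 * 1024             # 2GB drive, assume perfect wear leveling
wal_write_mb_per_day = 50 * 1024   # assume 50GB of WAL traffic per day

total_writable_mb = cycles * capacity_mb
lifetime_days = total_writable_mb / wal_write_mb_per_day
print(round(lifetime_days))  # ~80000 days under these assumptions
```

Under these (generous) assumptions the wear limit is nowhere near the binding constraint; real drives without ideal wear leveling would fare much worse.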



Re: [PERFORM] Update on putting WAL on ramdisk/

2003-12-12 Thread William Yu
Russell Garrett wrote:
WAL on single drive: 7.990 rec/s
WAL on 2nd IDE drive: 8.329 rec/s
WAL on tmpfs: 13.172 rec/s
A huge jump in performance but a bit scary having a WAL that can
disappear at any time. I'm gonna workup a rsync script and do some
power-off experiments to see how badly it gets mangled.


Surely this is just equivalent to disabling fsync? If you put a WAL on a
volatile file system, there's not a whole lot of point in having one at all.
These tests were all with fsync off.

And no, it's not equivalent to fsync off since the WAL is always written 
immediately regardless of fsync setting.



Re: [PERFORM] Hardware suggestions for Linux/PGSQL server

2003-12-11 Thread William Yu
Jeff Bohmer wrote:
We're willing to shell out extra bucks to get something that will 
undoubtedly handle the projected peak load in 12 months with excellent 
performance.  But we're not familiar with PG's performance on Linux and 
don't like to waste money.
Properly tuned, PG on Linux runs really nice. A few people have 
mentioned the VM swapping algorithm on Linux is semi-dumb. I get around 
that problem by having a ton of memory and almost no swap.

I've been thinking of this (overkill? not enough?):
2 Intel 32-bit CPUs
Lowest clock speed chip for the fastest available memory bus
4 GB RAM (maybe we only need 3 GB to start with?)
SCSI RAID 1 for OS
For PostgreSQL data and logs ...
15k rpm SCSI disks
RAID 5, 7 disks, 256MB battery-backed write cache
(Should we save $ and get a 4-disk RAID 10 array?)
I wonder about the 32bit+bigmem vs. 64bit question.  At what database 
size will we need more than 4GB RAM?
With 4GB of RAM, you're already running into bigmem. By default, Linux 
gives 2GB of address space to programs and 2GB to kernel. I usually see 
people quote 5%-15% penalty in general for using PAE versus a flat 
address space. I've seen simple MySQL benchmarks where 64-bit versions 
run 35%+ faster versus 32-bit+PAE but how that translates to PG, I dunno 
yet.

We'd like to always have enough RAM to cache the entire database. While 
64bit is in our long-term future, we're willing to stick with 32bit 
Linux until 64bit Linux on Itanium/Opteron and 64bit PostgreSQL settle 
in to proven production-quality.
Well if this is the case, you probably should get an Opteron server 
*now* and just run 32-bit Linux on it until you're sure about the 
software. No point in buying a Xeon and then throwing the machine away 
in a year when you decide you need 64-bit for more speed.



Re: [PERFORM] Hardware suggestions for Linux/PGSQL server

2003-12-11 Thread William Yu
Jeff Bohmer wrote:
It seems I don't fully understand the bigmem situation.  I've searched 
the archives, googled, checked RedHat's docs, etc.  But I'm getting 
conflicting, incomplete and/or out of date information.  Does anyone 
have pointers to bigmem info or configuration for the 2.4 kernel?
Bigmem is the name for Linux's PAE support.

If Linux is set up with 2GB for kernel and 2GB for user, would that be OK 
with a DB size of 2-2.5 GB?  I'm figuring the kernel will cache most/all 
of the DB in its 2GB and there's 2GB left for PG processes. Where do 
PG's SHM buffers live, kernel or user?  (I don't plan on going crazy 
with buffers, but will guess we'd need about 128MB, 256MB at most.)
PG's SHM buffers live in user space. As for whether Linux's OS cache lives 
in user or kernel space, I think it's in kernel, and I remember reading a max 
of ~950MB w/o bigmem, which means your 3.5GB of available OS memory will 
definitely have to be swapped in and out of kernel space using PAE.

Well if this is the case, you probably should get an Opteron server 
*now* and just run 32-bit Linux on it until you're sure about the 
software. No point in buying a Xeon and then throwing the machine away 
in a year when you decide you need 64-bit for more speed.
That's a good point.  I had forgotten about the option to run 32bit on 
an Operton.  If we had 3GB or 4GB initially on an Opteron, we'd need 
bigmem for 32bit Linux, right?

This might work nicely since we'd factor in the penalty from PAE for now 
and have the performance boost from moving to 64bit available on 
demand.  Not having to build another DB server in a year would also be 
nice.

FYI, we need stability first and performance second.
We ordered a 2x Opteron server the moment the CPU was released and it's 
been perfect -- except for one incident where the PCI riser card had 
drifted out of the PCI slot due to the heavy SCSI cables connected to 
the card.

I think most of the Opteron server MBs are pretty solid, but if you want 
extra peace-of-mind, you could get a server from Newisys as they pack in 
a cartload of extra monitoring features.



Re: [PERFORM] Has anyone run on the new G5 yet

2003-12-04 Thread William Yu
Sean Shanny wrote:
First question is do we gain anything by moving the RH Enterprise 
version of Linux in terms of performance, mainly in the IO realm as we 
are not CPU bound at all?  Second and more radical, has anyone run 
postgreSQL on the new Apple G5 with an XRaid system?  This seems like a 
great value combination.  Fast CPU, wide bus, Fibre Channel IO, 2.5TB 
all for ~17k.
Seems like a great value but until Apple produces a G5 that supports 
ECC, I'd pass on them.



Re: [PERFORM] Slow UPADTE, compared to INSERT

2003-12-04 Thread William Yu
Ivar Zarans wrote:
I am experiencing strange behaviour, where simple UPDATE of one field is
very slow, compared to INSERT into table with multiple indexes. I have
two tables - one with raw data records (about 24000), where one field
In Postgres and any other DB that uses MVCC (multi-version concurrency), 
UPDATEs will always be slower than INSERTs. With MVCC, what the DB does 
is make a copy of the record, update that copy and then invalidate 
the previous version. This maintains a consistent view for anybody 
who's reading the DB and also avoids the need for row locks.

If you have to use UPDATE, make sure (1) your UPDATE WHERE clause is 
properly indexed and (2) you are running ANALYZE/VACUUM periodically so 
the query planner can optimize for your UPDATE statements.
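In concrete terms, the maintenance looks like the following -- the table and column names ("rawdata", "recid", "status") are invented for illustration, so substitute your own:

```shell
# Print the tuning statements described above, ready to pipe into psql.
update_tuning_sql() {
  cat <<'SQL'
-- (1) index the column used in the UPDATE's WHERE clause
CREATE INDEX rawdata_recid_idx ON rawdata (recid);
-- (2) refresh planner statistics and reclaim the dead row versions
--     that MVCC leaves behind after each UPDATE
VACUUM ANALYZE rawdata;
-- the UPDATE itself can now find its target row via the index
UPDATE rawdata SET status = 'done' WHERE recid = 12345;
SQL
}

# Hypothetical usage: update_tuning_sql | psql mydb
```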



Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-26 Thread William Yu
Tom Lane wrote:
William Yu [EMAIL PROTECTED] writes:

I then tried to put the WAL directory onto a ramdisk. I turned off 
swapping, created a tmpfs mount point and copied the pg_xlog directory 
over. Everything looked fine as far as I could tell but Postgres just 
panic'd with a file permissions error. Anybody have thoughts to why 
tmpfs would not work?


I'd say you got the file or directory ownership or permissions wrong.
I did a mv instead of a cp, which duplicates ownership & permissions exactly.



Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
This is an intriguing thought which leads me to think about a similar 
solution for even a production server and that's a solid state drive for 
just the WAL. What's the max disk space the WAL would ever take up? 
There's quite a few 512MB/1GB/2GB solid state drives available now in 
the ~$200-$500 range and if you never hit those limits...

When my current job batch is done, I'll save a copy of the dir and give 
the WAL on ramdrive a test. And perhaps even buy a SanDisk at the local 
store and run that through the hopper.

Shridhar Daithankar wrote:
Mount WAL on RAM disk. WAL is the most often hit area for heavy 
updates/inserts. If you speed that up, things should be a lot faster.




Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
Josh Berkus wrote:
William,


When my current job batch is done, I'll save a copy of the dir and give
the WAL on ramdrive a test. And perhaps even buy a Sandisk at the local
store and run that through the hooper.


We'll be interested in the results.   The Sandisk won't be much of a 
performance test; last I checked, their access speed was about 1/2 that of a 
fast SCSI drive.   But it could be a feasibility test for the more expensive 
RAMdrive approach.



The SanDisks do seem a bit pokey at 16MBps. On the other hand, you could 
get 4 of these suckers and put them in a mega-RAID-0 stripe for 64MBps. You 
shouldn't need to do mirroring with a solid state drive.

Time to Google up some more solid state drive vendors.



Re: [PERFORM] Maximum Possible Insert Performance?

2003-11-24 Thread William Yu
Josh Berkus wrote:
William,


The SanDisks do seem a bit pokey at 16MBps. On the otherhand, you could
get 4 of these suckers, put them in a mega-RAID-0 stripe for 64MBps. You
shouldn't need to do mirroring with a solid state drive.


I wouldn't count on RAID0 improving the speed of SanDisks much.  How are you 
connecting to them?  USB?  USB doesn't support fast parallel data access.
You can get ATA SanDisks up to 2GB. Another vendor I checked out -- 
BitMicro -- has solid state drives for SATA, SCSI and FiberChannel. I'd 
definitely would not use USB SSDs -- USB performance would be so pokey 
to be useless.

Now, if it turns out that 256MB ramdisks are less than 1/5 the cost of 1GB 
ramdisks, then that's worth considering.
Looks like they're linear with size. SanDisk Flashdrive 1GB is about 
$1000 while 256MB is $250.

You're right, though, mirroring a solid state drive is pretty pointless; if 
power fails, both mirrors are dead.  
Actually no. Solid state memory is non-volatile. They retain data even 
without power.



Re: [PERFORM] Pg+Linux swap use

2003-11-01 Thread William Yu
Rob Sell wrote:
Not being one to hijack threads, but I haven't heard of this performance hit
when using HT, I have what should all rights be a pretty fast server, dual
2.4 Xeons with HT 205gb raid 5 array, 1 gig of memory. And it is only 50% as
fast as my old server which was a dual AMD MP 1400's with a 45gb raid 5
array and 1gb of ram. I have read everything I could find on Pg performance
tweaked all the variables that were suggested and nothing. Which is why I
subscribed to this list, just been lurking so far but this caught my eye. 
Not to get into a big Intel vs AMD argument but 50% sounds about right. 
Let's first assume that the QS rating for the MP1400 is relatively 
accurate and convert that to a 1.4GHz Xeon. 2.4/1.4 = +71%. Since 
processor performance does not increase linearly with clockspeed, 50% is 
in line with expectations. Then you throw in the fact that (1) QS 
ratings for slower AMD chips are understated (but overstated for the 
fastest chips), (2) AMD uses a point-to-point CPU/memory interface (much 
better for SMP) versus the P4/Xeon's shared bus, (3) Athlon architecture 
is more suited for DB work compared to the P4, I'd say you're lucky to 
see 50% more performance from a Xeon 2.4.
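Spelling out that clockspeed ratio with quick shell arithmetic (MHz values, integer math):

```shell
# 2.4GHz Xeon vs a hypothetical 1.4GHz Xeon (the MP1400's QS rating),
# expressed as a percentage clockspeed advantage
echo $(( (2400 - 1400) * 100 / 1400 ))   # prints: 71
```

Real-world scaling lands well under that 71%, which is why ~50% is in line with expectations.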

As for HT, I've seen quite a few benchmarks where HT hurts performance. 
The problem is it's not only app and workload specific but also system 
and usage specific. As it involves the internal rescheduling of 
processes, adding more simultaneous processes could help to a point and 
then start hurting or vice-versa.



Re: [PERFORM] Tuning for mid-size server

2003-10-24 Thread William Yu
So what is the ceiling on 32-bit processors for RAM? Most of the 64-bit 
vendors are pushing Athlon64 and G5 as breaking the 4GB barrier, and even 
I can do the math on 2^32.   All these 64-bit vendors, then, are talking 
about the limit on RAM *per application* and not per machine?
64-bit CPU on 64-bit OS. Up to physical address limit for anything and 
everything.

64-bit CPU on 32-bit OS. Up to 4GB minus the kernel allocation -- which 
is usually 2GB on Windows and Linux. On Windows, you can up this to 3GB 
by using the /3GB switch. Linux requires a kernel recompile. PAE is then 
used to move the memory window to point to different areas of the 
physical memory.
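Doing the math on 2^32 in the shell, with the default split and the /3GB variant mentioned above:

```shell
# 32-bit virtual address space, in MB
total_mb=$(( (1 << 32) / (1024 * 1024) ))   # 4096 MB of address space
default_user_mb=$(( total_mb / 2 ))         # 2048 MB for programs (2GB/2GB split)
boot_3gb_user_mb=3072                       # with the Windows /3GB switch
echo "$total_mb $default_user_mb $boot_3gb_user_mb"   # prints: 4096 2048 3072
```

Anything beyond that per-process window has to be reached by remapping through PAE.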



Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-24 Thread William Yu
I have never worked with a XEON CPU before. Does anyone know how it performs
running PostgreSQL 7.3.4 / 7.4 on RedHat 9 ? Is it faster than a Pentium 4?
I believe the main difference is cache memory, right? Aside from cache mem,
it's basically a Pentium 4, or am I wrong?
Well, the problem is, of course, that there are so many flavors of P4s and 
Xeons that it's hard to tell which is faster unless you specify the 
exact model. And even then, it would depend on the workload. Would a 
Xeon/3GHz/2MB L3/400FSB be faster than a P4C/3GHz/800FSB? No idea, as no 
one has complete number breakdowns on these comparisons. Oh yeah, you 
could get a big round number from SPEC or something saying one CPU is 
faster than the other, but whether that's faster for Postgres and your PG 
app is a totally different story.

That in mind, I wouldn't worry about it. The CPU is probably plenty fast 
for what you need to do. I'd look into two things in the server: memory 
and CPU expandability. I know you already plan on 4GB but you may need 
even more in the future. Few things can dramatically improve performance 
more than moving disk access to disk cache. And if there's a 2nd socket 
where you can pop another CPU in, that would leave you extra room if 
your server becomes CPU limited.



Re: [PERFORM] Reading data in bulk - help?

2003-09-10 Thread William Yu
1) Memory - clumsily adjusted shared_buffer - tried three values: 64, 
128, 256 with no discernible change in performance. Also adjusted, 
clumsily, effective_cache_size to 1000, 2000, 4000 - with no discernible 
change in performance. I looked at the Admin manual and googled around 
for how to set these values and I confess I'm clueless here. I have no 
idea how many kernel disk page buffers are used nor do I understand what 
the shared memory buffers are used for (although the postgresql.conf 
file hints that it's for communication between multiple connections). 
Any advice or pointers to articles/docs is appreciated.
The standard procedure is 1/4 of your memory for shared_buffers. Easiest 
way to calculate would be ###MB / 32 * 1000. E.g. if you have 256MB of 
memory, your shared_buffers should be 256 / 32 * 1000 = 8000.

The remaining memory you have leftover should be marked as OS cache 
via the effective_cache_size setting. I usually just multiply the 
shared_buffers value by 3 on systems with a lot of memory. With less 
memory, OS/Postgres/etc takes up a larger percentage of memory so values 
of 2 or 2.5 would be more accurate.
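That rule of thumb as a tiny calculator. The /32*1000 shortcut is the post's approximation of RAM/4 expressed in Postgres's 8KB buffer pages (the exact figure would be MB*32, e.g. 8192 rather than 8000 for 256MB):

```shell
# Approximate shared_buffers (in 8KB pages) for a box with $1 MB of RAM.
shared_buffers_guess() {
  echo $(( $1 / 32 * 1000 ))
}

# effective_cache_size: ~3x shared_buffers on big-memory boxes,
# closer to 2x-2.5x when the OS and Postgres eat a larger share.
effective_cache_guess() {
  echo $(( $(shared_buffers_guess "$1") * 3 ))
}

shared_buffers_guess 256     # prints: 8000
effective_cache_guess 256    # prints: 24000
```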



Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-04 Thread William Yu
Relaxin wrote:
I have a table with 102,384 records in it, each record is 934 bytes.

Using the follow select statement:
  SELECT * from table
PG Info: version 7.3.4 under cygwin on Windows 2000
ODBC: version 7.3.100
Machine: 500 Mhz/ 512MB RAM / IDE HDD

Under PG:  Data is returned in 26 secs!!
Under SQL Server:  Data is returned in 5 secs.
Under SQLBase: Data is returned in 6 secs.
Under SAPDB:Data is returned in 7 secs.
I created a similar table (934 bytes, 102K records) on a slightly faster 
machine: P3/800 + 512MB RAM + IDE HD. The server OS is Solaris 8 x86 and 
the version is 7.3.3.

On the server (via PSQL client) : 7.5 seconds
Using ODBC under VFPW: 10.5 seconds
How that translates to what you should see, I'm not sure. Assuming it 
was just the CPU difference, you should see numbers of roughly 13 
seconds. But the documentation says PG under CYGWIN is significantly 
slower than PG under UNIX so your mileage may vary...

Have you changed any of the settings yet in postgresql.conf, 
specifically the shared_buffers setting?



Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-29 Thread William Yu
Shridhar Daithankar wrote:
Be careful here, we've seen that with the P4 Xeon's that are
hyper-threaded and a system that has very high disk I/O causes the
system to be sluggish and slow. But after disabling the hyper-threading
itself, our system flew..
Anybody has opteron working? Hows' the performance?
Yes. I'm using an 2x 1.8GHz Opteron system w/ 8GB of RAM. Right now, I'm 
still using 32-bit Linux -- I'm letting others be the 64-bit guinea 
pigs. :) I probably will get a cheapie 1x Opteron machine first and test 
the 64-bit kernel/libraries thoroughly before rolling it out to production.

As for performance, the scaling is magnificient -- even when just using 
PAE instead of 64-bit addressing. At low transaction counts, it's only 
~75% faster than the 2x Athlon 1800+ MP it replaced. But once the 
transactions start coming in, the gap is as high as 5x. My w-a-g: since 
each CPU has an integrated memory controller, you avoid memory bus 
contention which is probably the major bottleneck as transaction load 
increases. (I've seen Opteron several vs Xeon comparisons where 
single-connection tests are par for both CPUs but heavy-load tests favor 
the Opteron by a wide margin.) I suspect the 4X comparisons would tilt 
even more towards AMD's favor.

We should see a boost when we move to 64-bit Linux and hopefully another 
one when NUMA for Linux is production-stable.



Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-29 Thread William Yu
Shridhar Daithankar wrote:
Just a guess here, but does a precompiled postgresql for x86 vs. an x86-64 
optimized one make a difference?

 Opteron is one place on earth you can watch difference between 32/64
 bit on same machine. Can be handy at times..
I don't know yet. I tried building a 64-bit kernel and my eyes glazed 
over trying to figure out how to create the cross-platform GCC compiler 
that's first needed to build the kernel. Then I read that all the libraries & 
drivers also needed to be 64-bit compiled and at that point gave up the 
ghost. I'll wait until a 64-bit Redhat distro is available before I test 
the 64-bit capabilities.

The preview SuSE 64-bit Linux used in most of the Opteron rollout tests 
has MySQL precompiled as 64-bit and under that DB, 64-bit added an extra 
~25% performance (compared to a 32-bit SuSE install). My guess is half 
of the performance comes from eliminating the PAE swapping.

I am sure. But in a 64-bit environment, Xeon is not the competition. It's 
PA-RISC 8700, UltraSparcs, Power series and if possible Itanium.
Well, just because the Opteron is 64-bit doesn't mean it's direct 
competition for the high-end RISC chips. Yes, if you're looking at the 
discrete CPU itself, it appears they could compete -- the SpecINT scores 
places the Opteron near the top of the list. But big companies also need 
the infrastructure, management tools and top-end scalability. If you 
just have to have the million dollar machines (128x Itanium2 servers or 
whatever), AMD is nowhere close to competing unless Beowulf clusters fit 
your needs.

In terms of infrastructure, scalability, mindshare and pricing, Xeon is 
most certainly Opteron's main competition. We're talking $10K servers 
versus $50K+ servers (assuming you actually want performance instead of 
having a single pokey UltraSparc CPU in a box). And yes, just because 
Opteron is a better performing server platform than Xeon doesn't mean a 
corporate fuddy-duddy still won't buy Xeon due to the $1B spent by Intel 
on marketing.

We should see a boost when we move to 64-bit Linux and hopefully another 
one when NUMA for Linux is production-stable.
Getting a 2.6 running now is the answer to make it stable fast..:-) Of course 
if you have spare hardware..
My office is a pigsty of spare hardware lying around. :) We're like pigs 
rolling around in the mud.
