Re: [PERFORM] Projecting currentdb to more users

2005-07-12 Thread Mohan, Ross
From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues. 

...125. Intel has designed its compiler purposely to degrade performance when 
a program
is run on an AMD platform. To achieve this, Intel designed the compiler to 
compile code
along several alternate code paths. Some paths are executed when the program 
runs on an Intel
platform and others are executed when the program is operated on a computer 
with an AMD
microprocessor. (The choice of code path is determined when the program is 
started, using a
feature known as CPUID which identifies the computer's microprocessor.) By 
design, the
code paths were not created equally. If the program detects a Genuine Intel 
microprocessor,
it executes a fully optimized code path and operates with the maximum 
efficiency. However,
if the program detects an Authentic AMD microprocessor, it executes a 
different code path
that will degrade the program's performance or cause it to crash...
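The dispatch mechanism the complaint describes can be sketched roughly as follows. This is an illustrative Python sketch only (a real compiler emits this test in machine code at program startup); the vendor strings are the actual 12-byte CPUID identification strings, but the function names are hypothetical:

```python
def get_vendor_string():
    """Stand-in for reading the CPUID instruction's vendor ID string."""
    return "AuthenticAMD"  # or "GenuineIntel"

def fast_path(data):
    # fully optimized implementation (e.g. vectorized), per the complaint
    return sum(data)

def slow_path(data):
    # generic, unoptimized fallback implementation
    total = 0
    for x in data:
        total += x
    return total

def dispatch(data):
    # Path choice made once, at startup, based on the CPUID vendor string.
    if get_vendor_string() == "GenuineIntel":
        return fast_path(data)
    return slow_path(data)  # non-Intel CPUs get the slower path
```

Both paths compute the same answer here; the complaint's point is that the selection is keyed to the vendor string rather than to the CPU's actual capabilities.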


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Trying to figure out pgbench

2005-06-21 Thread Mohan, Ross
I had a similar experience. 

regardless of scaling, etc., I got the same results; almost as if the flags
were not active. 

did

pgbench -i template1
and
pgbench -c 10 -t 50 -v -d 1 

and played around from there

This is on IBM pSeries, AIX5.3, PG8.0.2
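One frequent cause of flat, identical pgbench numbers (a guess, not a diagnosis of this case): the default scaling factor of 1 leaves a single row in the branches table, so every client serializes on that row's lock and the run measures contention rather than hardware. The flag names below are real pgbench options; the helper itself is a hypothetical illustration:

```python
def pgbench_cmds(clients, transactions, scale):
    """Build the init and run command lines for a pgbench test,
    enforcing the usual rule of thumb that the scaling factor (-s)
    should be at least the client count (-c)."""
    if scale < clients:
        raise ValueError(
            "scaling factor (-s) should be >= client count (-c); "
            "otherwise all clients contend on the same branches row")
    init = ["pgbench", "-i", "-s", str(scale), "template1"]
    run = ["pgbench", "-c", str(clients), "-t", str(transactions),
           "template1"]
    return init, run
```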

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of William Yu
Sent: Tuesday, June 21, 2005 12:05 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Trying to figure out pgbench


My Dual Core Opteron server came in last week. I tried to do some 
benchmarks with pgbench to get some numbers on the difference between 
1x1 - 2x1 - 2x2 but no matter what I did, I kept getting the same TPS 
on all systems. Any hints on what pgbench parameters I should be using?

In terms of production use, it definitely can handle more load. 
Previously, Apache/Perl had to run on a separate server to avoid a ~50% 
penalty. Now, the numbers are +15% performance even with Apache/Perl 
running on the same box as PostgreSQL. How much more load of course is 
what I'd like to quantify.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match



Re: [PERFORM] Whence the Opterons?

2005-06-10 Thread Mohan, Ross
Richard, 

thanks for info. 

...the RH supplied Postgres binary has issues...

Would you have the time to provide a bit more info?
Version of PG? Nature of issues? Methods that resolved?

Thanks again, 

-- Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Rowell
Sent: Friday, June 10, 2005 8:34 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Whence the Opterons?


I will second the nod to Penguin computing.  We have a bit of Penguin hardware 
here (though the majority is Dell).  We did have issues with one machine a 
couple of years ago, but Penguin was very pro-active in addressing that.

We recently picked up a Dual Opteron system from them and have been very 
pleased with it so far.  

I would be careful of the RHES that it ships with though.  We had machine 
lockups immediately after the suggested kernel update (had to downgrade 
manually).  Also, the RH supplied Postgres binary has issues, so you would need 
to compile Postgres yourself until the next RH update.

On Fri, 2005-05-06 at 14:39 -0700, Mischa Sandberg wrote:
 After reading the comparisons between Opteron and Xeon processors for 
 Linux, I'd like to add an Opteron box to our stable of Dells and 
 Sparcs, for comparison.
 
 IBM, Sun and HP have their fairly pricey Opteron systems.
 The IT people are not swell about unsupported purchases off ebay. 
 Anyone care to suggest any other vendors/distributors? Looking for 
 names with national support, so that we can recommend as much to our 
 customers.
 
 Many thanks in advance.
-- 
--
Richard Rowell
[EMAIL PROTECTED]
Bowman Systems
(318) 213-8780


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [PERFORM] [PORTS] Which library has these symbols? -- Eureka

2005-05-25 Thread Mohan, Ross
64-bit PG 8.0.2 is up and running on AIX 5.3/POWER5

YES! ! !

The major thing: setting some quirky LDFLAGS. 

Anyone interested in details, please ping. 

Thanks to Nick Addington, Vincent Vanwynsberghe, my SA, Sergey, 
and Tom Lane (for good-natured nudging).

My Next Task: Finding a Stress Test Harness to Load, and Query Data. 

Anyone have ideas? 

I am eagerly awaiting the DESTRUCTION of Oracle around here, and
yes I am an Oracle DBA and think it's very good technology. 

Smiling, 

Ross Mohan


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mohan, Ross
Sent: Wednesday, May 25, 2005 1:11 PM
To: [EMAIL PROTECTED]
Subject: Re: [PORTS] Which library has these symbols? 

Tom, 

they're all over the place, repeated in different
libraries, kind of a pain. Didn't realize that. I'll
just give the linker a bunch of LIBPATH and LIBNAME directives
and have it run around. 

# ar -t ./postgresql-8.0.2/src/interfaces/ecpg/ecpglib/libecpg.a | egrep 'dirmod|path|pgstr|pgsleep'
path.o

# ar -t ./postgresql-8.0.2/src/interfaces/ecpg/pgtypeslib/libpgtypes.a | egrep 'dirmod|path|pgstr|pgsleep'
pgstrcasecmp.o

# ar -t ./postgresql-8.0.2/src/interfaces/libpq/libpq.a | egrep 'dirmod|path|pgstr|pgsleep'
pgstrcasecmp.o

# ar -t ./postgresql-8.0.2/src/port/libpgport.a | egrep 'dirmod|path|pgstr|pgsleep'
dirmod.o
path.o
pgsleep.o
pgstrcasecmp.o

# ar -t ./postgresql-8.0.2/src/port/libpgport_srv.a | egrep 'dirmod|path|pgstr|pgsleep'
dirmod_srv.o
path.o
pgsleep.o
pgstrcasecmp.o



I **really** want this in 64-bit... funny this problem only shows up in 64-bit, not 32-bit mode. sigh 

Thanks for commenting --- That's ALWAYS welcome!

-- Ross


-Original Message-
From: Tom Lane [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 24, 2005 10:53 PM
To: Mohan, Ross
Cc: [EMAIL PROTECTED]
Subject: Re: [PORTS] Which library has these symbols? 

Mohan, Ross [EMAIL PROTECTED] writes:
 So Close, Yet So Far!

The specific symbols being complained of should be in libpgport_srv (see src/port). Dunno why your platform is ignoring that library. When you find out, let us know ;-)

   regards, tom lane


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

 http://www.postgresql.org/docs/faq


[PERFORM] test - pls delete and ignore

2005-05-25 Thread Mohan, Ross
You just couldn't help yourself, could you? :-)


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PERFORM] Prefetch - OffTopic

2005-05-10 Thread Mohan, Ross
for time-series and insane fast, nothing beats kdB, I believe

www.kx.com

Not trying to Quisling-out PG here, just hoping to respond to Mr. Olson





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
Sent: Tuesday, May 10, 2005 2:54 PM
To: Greg Stark
Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Prefetch 


Greg Stark [EMAIL PROTECTED] writes:
 Actually forcing things to use indexes is the wrong direction to go if 
 you're trying to process lots of data and want to stream it off disk 
 as rapidly as possible. I would think about whether you can structure 
 your data such that you can use sequential scans.

Agreed.

 In your application that might be hard. It sounds like you would need 
 more or less one table per stock ticker which would really be hard to 
 manage.

Actually, in a previous lifetime I used to do pretty much the same stuff Matt 
is working on.  The reason I suggested parallelizing is that what you want is 
usually not so much the 200day moving average of FOO, as the 200day moving 
averages of a whole bunch of things.  If your input table contains time-ordered 
data for all those things, then a seqscan works out pretty well.
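The single-pass shape Tom describes can be sketched like this (illustrative Python, not the application's actual code): one time-ordered scan yields running averages for every symbol at once, with no index needed.

```python
from collections import defaultdict, deque

def moving_averages(rows, window):
    """One sequential pass over time-ordered (symbol, price) rows --
    the seqscan-friendly shape.  Maintains a bounded window per symbol
    and emits the running average alongside each input row."""
    windows = defaultdict(deque)
    sums = defaultdict(float)
    out = []
    for symbol, price in rows:  # rows must arrive in time order
        w = windows[symbol]
        w.append(price)
        sums[symbol] += price
        if len(w) > window:
            sums[symbol] -= w.popleft()  # slide the window
        out.append((symbol, sums[symbol] / len(w)))
    return out
```

The key property is that one scan serves every symbol, which is exactly why time-ordered storage plus a seqscan beats per-symbol index probes here.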

 One thing you might look into is using the CLUSTER command. But 
 postgres doesn't maintain the cluster ordering so it would require 
 periodically rerunning it.

If the desired sort order is time-based, it falls out pretty much for free in 
this application, because historical data doesn't change -- you are only 
interested in appending at the right.

In said previous lifetime, we used Postgres for tracking our actual 
transactions, but we built a custom file format for storing the individual tick 
data.  That's not stuff you need transactional semantics for; the historical 
data is what it is.  Besides, you need to compress it as much as you can 
because there's always too much of it.  Machines are faster and disk space 
cheaper than they were at the time, but I'd still question the wisdom of using 
a Postgres row for each daily bar, let alone finer-grain data.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings



Re: [PERFORM] Prefetch - OffTopic

2005-05-10 Thread Mohan, Ross
Yes, that would be a sufficient (although not necessary) condition for being 
well and fine with kdB. 
Last time I used APL was... pre-Gregorian, so yea, that's scary to me, too. 

( Of course, one can use C/ODBC or Java/JDBC to reach kdB; once there, one uses 
SQL92 or proprietary kSQL. )



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Browne
Sent: Tuesday, May 10, 2005 4:14 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Prefetch - OffTopic


[EMAIL PROTECTED] (Mohan, Ross) writes:
 for time-series and insane fast, nothing beats kdB, I believe

 www.kx.com

... Which is well and fine if you're prepared to require that all of the staff 
that interact with data are skilled APL hackers.  Skilled enough that they're 
all ready to leap into Whitney's ASCII-based variant, K.
-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org) 
http://www.ntlug.org/~cbbrowne/functional.html
Rules of the Evil Overlord #78.  I will not tell my Legions of Terror And he 
must  be taken alive! The command will be:  ``And try to take him alive if it 
is reasonably practical.'' http://www.eviloverlord.com/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [PERFORM] Table Partitioning: Will it be supported in Future?

2005-04-26 Thread Mohan, Ross
Maybe he needs to spend $7K on performance improvements? 

;-)



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus
Sent: Tuesday, April 26, 2005 8:00 PM
To: Richard Huxton
Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Table Partitioning: Will it be supported in Future?


Richard,

 I believe these are being worked on at the moment. You might want to 
 search the archives of the hackers mailing list to see if the plans 
 will suit your needs.

Actually, this is being discussed through the Bizgres project: 
www.bizgres.org.

However, I agree that a 1GB table is not in need of partitioning.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross
FWIW, ODBC has variables to tweak, as well: fetch/buffer sizes, and the like. 

Maybe one of the ODBC cognoscenti here can chime in more concretely



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joel Fradkin
Sent: Thursday, April 21, 2005 10:36 AM
To: 'Tom Lane'; 'John A Meinel'
Cc: 'Postgresql Performance'
Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon



I suspect he's using pgadmin.  
Yup I was, but I did try running on the linux box in psql, but it was running 
to the screen and took forever because of that.

The real issue is returning to my app using ODBC is very slow (Have not tested 
the ODBC for MYSQL, MSSQL is ok (the two proc dell is running out of steam but 
been good until this year when we about doubled our demand by adding sears as a 
client).

Using odbc to postgres on some of the views (Josh from Command is having me do 
some very specific testing) is timing out with a 10 minute time limit. These 
are pages that still respond using MSSQL (this is where production is using 
the dual proc and the test is using the 4 proc).

I have a tool that hooks to all three databases so I can try it with that and 
see if I get different responses.

Joel


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly



Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-21 Thread Mohan, Ross
Joel, thanks. A couple of things jump out there for
me, not a problem for a routine ODBC connection, but
perhaps in the lotsa stuff context of your current
explorations, it might be relevant?

I am completely shooting from the hip, here, but...if
it were my goose to cook, I'd be investigating

Session(StringConn) = 
DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;
PWD=;ReadOnly=0;Protocol=6.4;

|| Protocol? Is this related to version? Is the driver way old?


FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
ShowSystemTables=0;ConnSettings=;Fetch=100;

||  Fetch great for OLTP, lousy for batch?


Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;

||  what ARE the datatypes and sizes in your particular case? 

Debug=0;

||  a run with debug=1 probably would spit up something interesting

CommLog=0;Optimizer=1;

||  Optimizer? that's a new one on me

Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0

||  that's about all I can see, prima facie.  I'll be very curious to know if 
ODBC is any part of your performance equation. 
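For auditing a string like this, it helps to split it into keyword/value pairs first. A small illustrative parser (a hypothetical helper, not part of the ODBC driver):

```python
def parse_dsn(dsn):
    """Split an ODBC-style keyword=value connect string into a dict.
    Values may be empty (e.g. UID=;) or brace-quoted ({PostgreSQL})."""
    params = {}
    for part in dsn.split(";"):
        if not part:
            continue  # skip the empty piece after a trailing ';'
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip("{}")
    return params
```

With the pairs in a dict, it's one comparison to check the settings called out above (Protocol, Fetch, Debug, and so on) against what the driver documentation recommends.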


HTH, 

Ross

-Original Message-
From: Joel Fradkin [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 21, 2005 10:54 AM
To: Mohan, Ross
Cc: [EMAIL PROTECTED]; PostgreSQL Perform
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon


Here is the connect string I am using.
It could be horrid as I cut it from ODBC program.

Session(StringConn) = 
DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;
PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;
ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVar
charSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;Us
eDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Pa
rse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableC
ursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseS
erverSidePrepare=0

Joel Fradkin
 

-Original Message-
From: Mohan, Ross [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 21, 2005 9:42 AM
To: [EMAIL PROTECTED]
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. 

Maybe one of the ODBC cognoscenti here can chime in more concretely






---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross
kewl. 

Well, an 8k request out of the PG kernel might turn into an XKb request at the
disk/OS level, but duly noted. 

Did you scan the code for this, or are you pulling this recollection from
the cognitive archives? :-)



-Original Message-
From: Jim C. Nasby [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 8:12 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote:
 Don't you think optimal stripe width would be
 a good question to research the binaries for? I'd
 think that drives the answer, largely.  (uh oh, pun alert)
 
 EG, oracle issues IO requests (this may have changed _just_
 recently) in 64KB chunks, regardless of what you ask for. 
 So when I did my striping (many moons ago, when the Earth 
 was young...) I did it in 128KB widths, and set the oracle 
 multiblock read count according. For oracle, any stripe size
 under 64KB=stupid, anything much over 128K/256K=wasteful. 
 
 I am eager to find out how PG handles all this.

AFAIK PostgreSQL requests data one database page at a time (normally 8k). Of 
course the OS might do something different.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross
right, the oracle system uses a second low latency bus to 
manage locking information (at the block level) via a
distributed lock manager.  (but this is slightly different
albeit related to a clustered file system and OS-managed
locking, eg) 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dawid Kuroczko
Sent: Wednesday, April 20, 2005 4:56 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


On 4/19/05, Mohan, Ross [EMAIL PROTECTED] wrote:
 Clustered file systems is the first/best example that
 comes to mind. Host A and Host B can both request from diskfarm, eg.

Something like a Global File System?

http://www.redhat.com/software/rha/gfs/

(I believe some other company did develop it some time in the past; hmm, 
probably the guys doing LVM stuff?).

Anyway the idea is that two machines have same filesystem mounted and they 
share it. The locking I believe is handled by communication between computers 
using host to host SCSI commands.

I never used it, I've only heard about it from a friend who used to work with 
it in CERN.

   Regards,
   Dawid

---(end of broadcast)---
TIP 8: explain analyze is your friend



Re: [PERFORM] How to improve db performance with $7K?

2005-04-20 Thread Mohan, Ross
Alex et al., 


I wonder if that's something to think about adding to PostgreSQL? A setting for 
multiblock read count like Oracle (Although 

||  I would think so, yea. GMTA: I was just having this micro-chat with Mr. Jim 
Nasby. 

having said that I believe that Oracle natively caches pages much more 
aggressively than PostgreSQL, which allows the OS to do the file caching).

||  Yea... and it can rely on what is likely a lot more robust and nuanced 
caching algorithm, but... I don't know enough (read: anything) about PG's to 
back that comment up. 


Alex Turner
netEconomist

P.S. Oracle changed this with 9i, you can change the database block size on a 
tablespace by tablespace basis making it smaller for OLTP tablespaces and 
larger for Warehousing tablespaces (at least I think it's on a tablespace, 
might be on a whole DB).

|| Yes, it's tablespace level. 



On 4/19/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
 On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote:
  Don't you think optimal stripe width would be
  a good question to research the binaries for? I'd
  think that drives the answer, largely.  (uh oh, pun alert)
 
  EG, oracle issues IO requests (this may have changed _just_
  recently) in 64KB chunks, regardless of what you ask for. So when I 
  did my striping (many moons ago, when the Earth was young...) I did 
  it in 128KB widths, and set the oracle multiblock read count 
  according. For oracle, any stripe size under 64KB=stupid, anything 
  much over 128K/256K=wasteful.
 
  I am eager to find out how PG handles all this.
 
 AFAIK PostgreSQL requests data one database page at a time (normally 
 8k). Of course the OS might do something different.
 --
 Jim C. Nasby, Database Consultant   [EMAIL PROTECTED]
 Give your computer some brain candy! www.distributed.net Team #1828
 
 Windows: Where do you want to go today?
 Linux: Where do you want to go tomorrow?
 FreeBSD: Are you guys coming, or what?
 
 ---(end of 
 broadcast)---
 TIP 4: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Good question.  If the SCSI system was moving the head from track 1 to 10, and 
a request then came in for track 5, could the system make the head stop at 
track 5 on its way to track 10?  That is something that only the controller 
could do.  However, I have no idea if SCSI does that.

||  SCSI, AFAIK, does NOT do this. What SCSI can do is allow next-request 
insertion at the head of the request queue (queue-jumping), and/or defer request 
ordering to be done by the drive per se (queue re-ordering). I have looked, in 
vain, for evidence that SCSI somehow magically stops in the middle of a request 
to pick up data (my words, not yours). 

The only part I am pretty sure about is that real-world experience shows SCSI 
is better for a mixed I/O environment.  Not sure why, exactly, but the command 
queueing obviously helps, and I am not sure what else does.

||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of per se 
drive request reordering) 
   should go a looong way (but not all the way) toward making SATA 'enterprise 
acceptable'. Multiple 
   initiators (e.g. more than one host being able to talk to a drive) is a 
biggie, too. AFAIK only SCSI
   drives/controllers do that for now. 
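As a toy model of what that drive-side re-ordering (TCQ/NCQ) buys: instead of servicing queued requests in strict arrival order, the drive sweeps them in head-position order. This is an illustrative sketch only; real firmware schedulers also account for rotational position and are considerably more sophisticated.

```python
def service_order(head_pos, pending_tracks):
    """Toy elevator-style re-ordering of a tagged queue: service all
    requests at or ahead of the current head position in ascending
    order, then sweep back through the ones behind it."""
    ahead = sorted(t for t in pending_tracks if t >= head_pos)
    behind = sorted((t for t in pending_tracks if t < head_pos),
                    reverse=True)
    return ahead + behind
```

With a FIFO queue the head would seek back and forth between distant tracks; the sweep order keeps total head travel close to one pass across the platter, which is roughly why queueing helps so much under mixed I/O.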




-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Clustered file systems is the first/best example that
comes to mind. Host A and Host B can both request from diskfarm, eg. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:10 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
 The only part I am pretty sure about is that real-world experience 
 shows SCSI is better for a mixed I/O environment.  Not sure why, 
 exactly, but the command queueing obviously helps, and I am not sure 
 what else does.
 
 ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version 
 || of per se drive request reordering)
should go a looong way (but not all the way) toward making SATA 
 'enterprise acceptable'. Multiple 
initiators (e.g. more than one host being able to talk to a drive) is a 
 biggie, too. AFAIK only SCSI
drives/controllers do that for now.

What is 'multiple initiators' used for in the real world?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Well, more like they both are allowed to issue disk
requests and the magical clustered file system manages
locking, etc. 

In reality, any disk is only reading/writing to one part of
the disk at any given time, of course, but in the multiple
initiator deal, multiple streams of requests from multiple hosts
can be queued. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:16 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
 Clustered file systems is the first/best example that
 comes to mind. Host A and Host B can both request from diskfarm, eg.

So one host writes to part of the disk and another host writes to a different 
part?

---

 -Original Message-
 From: Bruce Momjian [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, April 19, 2005 12:10 PM
 To: Mohan, Ross
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] How to improve db performance with $7K?
 
 
 Mohan, Ross wrote:
  The only part I am pretty sure about is that real-world experience
  shows SCSI is better for a mixed I/O environment.  Not sure why, 
  exactly, but the command queueing obviously helps, and I am not sure 
  what else does.
  
  ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version
  || of per se drive request reordering)
 should go a looong way (but not all the way) toward making SATA 
  'enterprise acceptable'. Multiple 
 initiators (e.g. more than one host being able to talk to a drive) is a 
  biggie, too. AFAIK only SCSI
 drives/controllers do that for now.
 
 What is 'multiple initiators' used for in the real world?
 
 -- 
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
 
 ---(end of 
 broadcast)---
 TIP 6: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] How to improve db performance with $7K?

2005-04-18 Thread Mohan, Ross
Don't you think optimal stripe width would be
a good question to research the binaries for? I'd
think that drives the answer, largely.  (uh oh, pun alert)

EG, oracle issues IO requests (this may have changed _just_ 
recently) in 64KB chunks, regardless of what you ask for. 
So when I did my striping (many moons ago, when the Earth 
was young...) I did it in 128KB widths, and set the oracle 
multiblock read count according. For oracle, any stripe size
under 64KB=stupid, anything much over 128K/256K=wasteful. 

I am eager to find out how PG handles all this. 


- Ross



p.s. Brooklyn thug accent 'You want a database record? I 
  gotcher record right here' http://en.wikipedia.org/wiki/Akashic_Records
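The stripe arithmetic above reduces to counting how many stripe units a request's byte range crosses. A hypothetical sketch (function and parameter names are illustrative):

```python
def drives_touched(offset_kb, request_kb, stripe_kb, ndrives):
    """How many member drives a single I/O hits in a striped array:
    the number of stripe-unit boundaries the byte range spans,
    capped at the number of drives in the array."""
    first = offset_kb // stripe_kb
    last = (offset_kb + request_kb - 1) // stripe_kb
    return min(last - first + 1, ndrives)
```

So a 64 KB request against a 128 KB stripe usually stays on one drive (two if it straddles a boundary), while a 256 KB request against a 64 KB stripe fans out across four; that is the sense in which stripe width should be sized against the database's actual I/O request size.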



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Turner
Sent: Monday, April 18, 2005 2:21 PM
To: Jacques Caron
Cc: Greg Stark; William Yu; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


So I wonder if one could take this stripe size thing further and say that a 
larger stripe size is more likely to result in requests getting served 
parallized across disks which would lead to increased performance?

Again, thanks to all people on this list, I know that I have learnt a _hell_ of 
alot since subscribing.

Alex Turner
netEconomist

On 4/18/05, Alex Turner [EMAIL PROTECTED] wrote:
 Ok - well - I am partially wrong...
 
 If your stripe size is 64Kb, and you are reading 256k worth of data, 
 it will be spread across four drives, so you will need to read from 
 four devices to get your 256k of data (RAID 0 or 5 or 10), but if you 
 are only reading 64kb of data, I guess you would only need to read 
 from one disk.
 
 So my assertion that adding more drives doesn't help is pretty 
 wrong... particularly with OLTP because it's always dealing with 
 blocks that are smaller than the stripe size.
 
 Alex Turner
 netEconomist
 
 On 4/18/05, Jacques Caron [EMAIL PROTECTED] wrote:
  Hi,
 
  At 18:56 18/04/2005, Alex Turner wrote:
  All drives are required to fill every request in all RAID levels
 
  No, this is definitely wrong. In many cases, most drives don't 
  actually have the data requested, how could they handle the request?
 
  When reading one random sector, only *one* drive out of N is ever 
  used to service any given request, be it RAID 0, 1, 0+1, 1+0 or 5.
 
  When writing:
  - in RAID 0, 1 drive
  - in RAID 1, RAID 0+1 or 1+0, 2 drives
  - in RAID 5, you need to read on all drives and write on 2.
 
  Otherwise, what would be the point of RAID 0, 0+1 or 1+0?
 
  Jacques.
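Jacques's per-operation counts can be written down as a rule table. This is a sketch of exactly the statements above, for one single-sector operation (RAID 5 involves all members because parity must be recomputed):

```python
def drives_used(level, op, ndrives):
    """Drives involved in one single-sector operation, per the rules
    stated above: a random read always lands on one drive; writes
    depend on the RAID level."""
    if op == "read":
        return 1
    writes = {"0": 1, "1": 2, "0+1": 2, "1+0": 2}
    if level in writes:
        return writes[level]
    if level == "5":
        # read all members to recompute parity, then write data + parity
        return ndrives
    raise ValueError("unknown RAID level: %s" % level)
```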
 
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org



[PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?

2005-04-15 Thread Mohan, Ross

Sorry to blend threads, but in my kinda longish, somewhat thankless, 
essentially anonymous, and quite average career as a DBA, I have 
found that the 7K would be best spent on a definitive end-to-end
application critical-path test (pretty easy to instrument apps
and lash on test harnesses these days). 


If it's the disk subsystem, then by all means, spend the 7K there. 

If the 7K$ is for hardware only, then disk is always a good choice. For
a really small shop, maybe it's an upgrade to a dual CPU opteron MOBO, eg. 
dunno.

If, however, in the far-more-likely case that the application code
or system/business process is the throttle point, it'd be a great
use of money to have a test report showing that to the higher ups. 
That's where the best scalability bang-for-buck can be made. 
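Even a crude harness can produce that kind of report. A hypothetical Python sketch of the instrumentation side (names are illustrative; the point is per-stage wall-clock totals across the critical path):

```python
import time
from functools import wraps

TIMINGS = {}  # accumulated wall-clock seconds per instrumented stage

def timed(fn):
    """Decorator: accumulate wall-clock time per function, so a test
    run shows which stage of the end-to-end path actually dominates."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            TIMINGS[fn.__name__] = (TIMINGS.get(fn.__name__, 0.0)
                                    + time.perf_counter() - start)
    return wrapper
```

Wrap the app's query, render, and business-logic stages with it, run a representative workload, and the TIMINGS table is the evidence for the higher-ups on where the 7K should go.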


- Ross

p.s. having said this, and as has already been noted, 7K ain't
 going to buy that much... maybe the ability to go RAID 10?
 
p.p.s  Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-)


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Friday, April 15, 2005 11:20 AM
To: Alex Turner
Cc: Dave Held; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Intel SRCS16 SATA raid?


This is a different thread than the $7k server thread.
Greg Stark started it and wrote:

 I'm also wondering about whether I'm better off with one of these 
 SATA raid  
 controllers or just going with SCSI drives.   



Rick

[EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM:

 The original thread was how much can I get for $7k

 You can't fit a 15k RPM SCSI solution into $7K ;)  Some of us are on a
budget!

 10k RPM SATA drives give acceptable performance at a good price; that's 
 really the point here.

 I have never really argued that SATA is going to match SCSI 
 performance on multidrive arrays for IO/sec.  But it's all about the 
 benjamins baby.  If I told my boss we need $25k for a database 
 machine, he'd tell me that was impossible, and I have $5k to do it. If 
 I tell him $7k - he will swallow that.  We don't _need_ the amazing 
 performance of a 15k RPM drive config.  Our biggest hit is reads, so 
 we can buy 3xSATA machines and load balance.  It's all about the 
 application, and buying what is appropriate.  I don't buy a Corvette 
 if all I need is a malibu.

 Alex Turner
 netEconomist

 On 4/15/05, Dave Held [EMAIL PROTECTED] wrote:
   -Original Message-
   From: Alex Turner [mailto:[EMAIL PROTECTED]
   Sent: Thursday, April 14, 2005 6:15 PM
   To: Dave Held
   Cc: pgsql-performance@postgresql.org
   Subject: Re: [PERFORM] Intel SRCS16 SATA raid?
  
   Looking at the numbers, the raptor with TCQ enabled was close or 
   beat the Atlas III 10k drive on most benchmarks.
 
  And I would be willing to bet that the Atlas 10k is not using the 
  same generation of technology as the Raptors.
 
    Naturally a 15k drive is going to be faster in many areas, but it 
   is also much more expensive.  It was only 44% better on the server 
   tests than the raptor with TCQ, but it costs nearly 300% more 
   ($538 cdw.com, $180 newegg.com).
 
  State that in terms of cars.  Would you be willing to pay 300% more 
  for a car that is 44% faster than your competitor's?  Of course you 
  would, because we all recognize that the cost of speed/performance 
  does not scale linearly.  Naturally, you buy the best speed that you 
  can afford, but when it comes to hard drives, the only major feature 
  whose price tends to scale anywhere close to linearly is capacity.
 
   Note also that the 15k drive was the only drive that kept up with 
   the raptor on raw transfer speed, which is going to matter for 
   WAL.
 
  So get a Raptor for your WAL partition. ;)
 
   [...]
   The Raptor drives can be had for as little as $180/ea, which is 
   quite a good price point considering they can keep up with their 
   SCSI 10k RPM counterparts on almost all tests with NCQ enabled 
   (Note that 3ware controllers _don't_ support NCQ, although they 
   claim their HBA based queueing is 95% as good as NCQ on the 
   drive).
 
  Just keep in mind the points made by the Seagate article.  You're 
  buying much more than just performance for that $500+.  You're also 
  buying vibrational tolerance, high MTBF, better internal 
  environmental controls, and a pretty significant margin on seek 
  time, which is probably your most important feature for disks 
  storing tables. An interesting test would be to stick several drives 
  in a cabinet and graph how performance is affected at the different 
  price points/ technologies/number of drives.
 
  __
  David B. Held
  Software 

Re: [PERFORM] How to improve db performance with $7K?

2005-04-15 Thread Mohan, Ross
Greg, et al. 

I never found any evidence of a "stop and get an intermediate request"
functionality in the TCQ protocol. 

IIRC, what is there is

1) Ordered
2) Head First
3) Simple

implemented as choices. *VERY* roughly: (1) the disk subsystem
satisfies requests in the order submitted, (2) lets this request be
put at the very head of the disk's own queue, right after the
currently-running disk request completes, and (3) lets the disk
itself and its software reorder the requests on hand per its onboard
logic.  (N.B. in the last, it's the DISK, not the controller, making
those decisions. N.B. too, that this last is essentially what NCQ
(cf. TCQ) is doing.)
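
Those three tag types can be sketched as a toy scheduler (the names and the scheduling policy here are illustrative stand-ins, not the SCSI spec; a real drive reorders by seek and rotational cost, not a plain sort by block number):

```python
# Toy model of the three TCQ tag types described above.
SIMPLE, ORDERED, HEAD = "simple", "ordered", "head"

def service_order(requests):
    """requests: list of (block, tag_type). Returns blocks in service order.
    SIMPLE tags may be freely reordered (sorting by block number stands in
    for the drive's seek optimizer); ORDERED tags act as barriers; HEAD
    jumps ahead of everything currently pending."""
    out, batch = [], []
    for block, tag in requests:
        if tag == HEAD:
            out.append(block)          # serviced ahead of the pending batch
        elif tag == ORDERED:
            out.extend(sorted(batch))  # drain reorderable requests first
            batch = []
            out.append(block)          # then the barrier request itself
        else:
            batch.append(block)        # SIMPLE: free to reorder
    out.extend(sorted(batch))
    return out

assert service_order([(50, SIMPLE), (10, SIMPLE), (30, ORDERED),
                      (20, SIMPLE)]) == [10, 50, 30, 20]
```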

I know we've been batting around a hypothetical case of SCSI
where it stops and gets something on the way, but I can find
no proof (yet) that this is done, pro forma, by SCSI drives. 

In other words, SCSI is a necessary, but not sufficient cause
for intermediate reading. 

FWIW

- Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Friday, April 15, 2005 2:02 PM
To: Tom Lane
Cc: Kevin Brown; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Tom Lane [EMAIL PROTECTED] writes:

 Yes, you can probably assume that blocks with far-apart numbers are 
 going to require a big seek, and you might even be right in supposing 
 that a block with an intermediate number should be read on the way. 
 But you have no hope at all of making the right decisions at a more 
 local level --- say, reading various sectors within the same cylinder 
 in an optimal fashion.  You don't know where the track boundaries are, 
 so you can't schedule in a way that minimizes rotational latency. 
 You're best off to throw all the requests at the drive together and 
 let the drive sort it out.

Consider for example three reads, one at the beginning of the disk, one at the 
very end, and one in the middle. If the three are performed in the logical 
order (assuming the head starts at the beginning), then the drive has to seek, 
say, 4ms to get to the middle and 4ms to get to the end.

But if the middle block requires a full rotation to reach it from when the head 
arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive).

Whereas the drive could have seeked over to the last block, then seeked back in 
8ms and gotten there just in time to perform the read for free.
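
Greg's arithmetic, spelled out (using his rough numbers: ~4 ms per half-stroke seek, and a 7200 RPM spindle, where one revolution costs 60,000 / 7200 ms):

```python
SEEK_MS = 4.0                 # half-stroke seek, per the example above
ROT_MS = 60_000 / 7200        # one revolution at 7200 RPM, ~8.33 ms

# Logical (block) order: start -> middle -> end.  The middle sector just
# missed the head, so we eat a full rotation waiting for it.
logical = SEEK_MS + ROT_MS + SEEK_MS

# Reordered: start -> end -> middle.  Per Greg's rough numbers, the seek
# back from the end lands about as the middle sector comes around, so the
# rotational delay is hidden under useful work.
reordered = 2 * SEEK_MS + SEEK_MS

print(f"logical order: {logical:.2f} ms")   # ~16.3 ms
print(f"reordered:     {reordered:.2f} ms") # 12.0 ms
```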


I'm not entirely convinced this explains all of the SCSI drives' superior 
performance though. The above is a worst-case scenario; it should really 
only have a small effect, and it's not like the drive firmware can really 
schedule things perfectly either.


I think most of the difference is that the drive manufacturers just don't 
package their high end drives with ATA interfaces. So there are no 10k RPM ATA 
drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but 
most of the manufacturers aren't even doing that.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] How to improve db performance with $7K?

2005-04-14 Thread Mohan, Ross
Imagine a system in furious activity with two (2) processes regularly occurring:

Process One:  Long read (or write). Takes 20ms to do seek, latency, and 
stream off. Runs over and over. 
Process Two:  Single block read (or write). Typical database row access. 
Optimally, could be submillisecond. Happens more or less 
randomly. 


Let's say process one starts, and then process two. Assume, for sake of this 
discussion, 
that P2's block lies w/in P1's swath. (But doesn't have to...)

Now, every time, process two has to wait at LEAST 20ms to complete. In a 
queue-reordering system, it could be a lot faster. And I, looking at disk 
service times on P2, keep wondering: why does a single disk-block read 
keep taking 20ms?


So it doesn't need to be a read or a write. It doesn't need to be 
furious activity (two processes is not furious, even for a single-user 
desktop). This is not a corner case, and while it doesn't take into 
account kernel/drive-cache/UBC buffering issues, I think it shines a 
light on why command re-ordering might be useful. shrug 

YMMV. 



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Greg Stark wrote:


 I think you're being misled by analyzing the write case.
 
 Consider the read case. When a user process requests a block and that 
 read makes its way down to the driver level, the driver can't just put 
 it aside and wait until it's convenient. It has to go ahead and issue 
 the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a couple of 
milliseconds to see if other requests come in, and then issue the read if none 
do.  If there are already other requests being fulfilled, then it'll schedule 
the request in question just like the rest.

 In the 10ms or so that it takes to seek to perform that read
 *nothing* gets done. If the driver receives more read or write 
 requests it just has to sit on them and wait. 10ms is a lifetime for a 
 computer. In that time dozens of other processes could have been 
 scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes 
from an essentially idle state to one of furious activity.  In other words, 
it's a corner case that I strongly suspect isn't typical in situations where 
SCSI has historically made a big difference.

Once the first request has been fulfilled, the driver can now schedule the rest 
of the queued-up requests in disk-layout order.


I really don't see how this is any different between a system that has tagged 
queueing to the disks and one that doesn't.  The only difference is where the 
queueing happens.  In the case of SCSI, the queueing happens on the disks (or 
at least on the controller).  In the case of SATA, the queueing happens in the 
kernel.
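
The kernel-side version of that queueing is essentially an elevator (SCAN) scheduler; a minimal sketch, with block numbers standing in for disk layout:

```python
# Toy elevator scheduler: the disk-layout-order scheduling a kernel can
# do for a non-TCQ drive once requests are queued up.
def elevator_order(pending, head_pos):
    """Service pending block requests sweeping up from head_pos,
    then the remainder sweeping back down."""
    ahead = sorted(b for b in pending if b >= head_pos)
    behind = sorted((b for b in pending if b < head_pos), reverse=True)
    return ahead + behind

assert elevator_order([90, 10, 60, 40], head_pos=50) == [60, 90, 40, 10]
```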

I suppose the tagged queueing setup could begin the head movement and, if 
another request comes in that requests a block on a cylinder between where the 
head currently is and where it's going, go ahead and read the block in 
question.  But is that *really* what happens in a tagged queueing system?  It's 
the only major advantage I can see it having.


 The same thing would happen if you had lots of processes issuing lots 
 of small fsynced writes all over the place. Postgres doesn't really do 
 that though. It sort of does with the WAL logs, but that shouldn't 
 cause a lot of seeking.  Perhaps it would mean that having your WAL 
 share a spindle with other parts of the OS would have a bigger penalty 
 on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any.  When a process 
issues an fsync (or even a sync), the kernel doesn't *have* to drop everything 
it's doing and get to work on it immediately.  It could easily gather a few 
more requests, bundle them up, and then issue them.  If there's a lot of disk 
activity, it's probably smart to do just that.  All fsync and sync require is 
that the caller block until the data hits the disk (from the point of view of 
the kernel). The specification doesn't require that the kernel act on the calls 
immediately or write only the blocks referred to by the call in question.
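
A user-space analogue of that batching, using only standard `os` calls: many small buffered writes covered by a single fsync, rather than a flush per write (the path and record format here are made up for illustration):

```python
import os
import tempfile

# Many small writes land in the page cache; one fsync flushes them all.
path = os.path.join(tempfile.mkdtemp(), "wal")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
for i in range(100):
    os.write(fd, f"record {i}\n".encode())  # buffered, not yet durable
os.fsync(fd)                                # one flush covers all 100 records
os.close(fd)

# All records made it to the file.
assert os.path.getsize(path) == sum(len(f"record {i}\n") for i in range(100))
```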


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Intel SRCS16 SATA raid?

2005-04-14 Thread Mohan, Ross
Sorry, don't remember whether it's SCSI or SATA II, but IIRC
the Areca controllers are just stellar for these things. 

If you do get SATA for db stuff... especially multiuser... I still
haven't seen anything to indicate an across-the-board primacy
for SATA over SCSI. I'd go w/SCSI, or if SATA for $$$ reasons, I'd
be sure to have many spindles and RAID 10. 

my 0.02. I'm surely not an expert of any kind. 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark
Sent: Thursday, April 14, 2005 10:55 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Intel SRCS16 SATA raid?



Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead 
of the 3ware one.

Poking around it seems this does come with Linux drivers and there is a battery 
backup option. So it doesn't seem to be completely insane.

Anyone have any experience with these controllers?

I'm also wondering about whether I'm better off with one of these SATA raid 
controllers or just going with SCSI drives.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend



[PERFORM] Building postmaster with Profiling Support WAS Tweaking a C Function I wrote

2005-04-07 Thread Mohan, Ross
Adam - 

Is compiling postmaster with profiling support just a flag
in the build/make? Or is there something more involved? 

I'd like to be able to do this in the future and so am
curious about means/methods. 

If this is a RTFM, just let me know that (am currently 
Reading The F Manual), but if you have any special sauce
here, that'd be of great interest. 

Thanks

-Ross

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Palmblad
Sent: Wednesday, April 06, 2005 7:23 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Tweaking a C Function I wrote


I wanted to see if I could squeeze any more performance out of a C set 
returning function I wrote.  As such, I looked to a profiler.  Is it 
possible to get profile information on the function I wrote?  I've got 
postmaster and my function compiled with profiling support, and can find 
the gmon.out files... can I actually look at the call tree that occurs 
when my function is being executed or will I be limited to viewing calls 
to functions in the postmaster binary?

-Adam


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

2005-04-06 Thread Mohan, Ross
How close to this is PG's COPY?  I get surprisingly good results using COPY 
with jdbc on smallish systems (now if that patch would make it into the 
mainstream PG jdbc support!)  I think COPY has a bit more overhead than what 
a Bulkload feature may have, but I suspect it's not that much more.

||  Steve, I do not know. But am reading the docs now, and should figure it 
out. Ask
me later if you remember. Oracle's direct path is a way of just slamming 
blocks
filled with rows into the table, above the high water mark. It sidesteps 
freelist
management and all manner of intrablock issues. There is a payback, but 
the benefits
far far outweigh the costs. 

 Now... if you ask me can this work without Power5 and Hitachi SAN? My 
 answer is... you give me a top end Dell and SCSI III on 15K disks and 
 I'll likely easily match it, yea.
 
 I'd love to see PG get into this range..i am a big fan of PG (just a 
 rank newbie) but I gotta think the underlying code to do this has to 
 be not-too-complex.

It may not be that far off if you can use COPY instead of INSERT. But comparing 
Bulkload to INSERT is a bit apples-orangish.

||  Oh! I see! I had no idea I was doing that!  Thanks for pointing it out 
clearly to me. Yea, I would
say a full transactional INSERT of 5K rows/sec into an indexed-table is a 
near-mythology without significant
caveats (parallelized, deferred buffering, etc.) 



-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?

2005-04-06 Thread Mohan, Ross
snip good stuff...

31Million tuples were loaded in approx 279 seconds, or approx 112k rows per 
second.

 I'd love to see PG get into this range..i am a big fan of PG (just a 
 rank newbie) but I gotta think the underlying code to do this has to 
 be not-too-complex.

I'd say we're there.


||  CLAPPING!!  Yes! PG is there, assuredly!   So VERY cool!  I made a newbie
error of conflating COPY with INSERT. I don't know if I could get
Oracle to do much more than about 500-1500 rows/sec... PG is quite 
impressive.

Makes one wonder why corporations positively insist on giving oracle
 yearly. shrug

-Original Message-
From: Rod Taylor [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 06, 2005 12:41 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?


On Wed, 2005-04-06 at 16:12 +, Mohan, Ross wrote:
 I wish I had a Dell system and run case to show you Alex, but I 
 don't... however...using Oracle's direct path feature, it's pretty 
 straightforward.
 
 We've done 110,000 rows per second into index-less tables on a big 
 system (IBM Power5 chips, Hitachi SAN). ( Yes, I am sure: over 100K a 
 second. Sustained for almost 9 minutes. )

Just for kicks I did a local test on a desktop machine (single CPU, single IDE 
drive) using COPY from STDIN for a set of integers in via a single transaction, 
no indexes.

1572864 tuples were loaded in 13715.613ms, which is approx 115k rows per second.

Okay, no checkpoints and I didn't cross an index boundary, but I also haven't 
tuned the config file beyond bumping up the buffers.

Lets try again with more data this time.

31Million tuples were loaded in approx 279 seconds, or approx 112k rows per 
second.

 I'd love to see PG get into this range..i am a big fan of PG (just a 
 rank newbie) but I gotta think the underlying code to do this has to 
 be not-too-complex.

I'd say we're there.
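
The throughput arithmetic from Rod's two runs, spelled out (illustrative Python, just checking the numbers quoted above):

```python
# (name, tuples loaded, elapsed seconds) from the two COPY runs above.
runs = [
    ("1.5M run", 1_572_864, 13.715613),   # "13715.613ms"
    ("31M run", 31_000_000, 279.0),       # "approx 279 seconds"
]
rates = {name: tuples / secs for name, tuples, secs in runs}
for name, rate in rates.items():
    print(f"{name}: {rate / 1000:.1f}k rows/s")
# -> 1.5M run: 114.7k rows/s
# -> 31M run: 111.1k rows/s
```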

 -Original Message-
 From: Alex Turner [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, April 06, 2005 11:38 AM
 To: [EMAIL PROTECTED]
 Cc: pgsql-performance@postgresql.org; Mohan, Ross
 Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?
 
 
 I think everyone was scared off by the 5000 inserts per second number.
 
 I've never seen even Oracle do this on a top end Dell system with 
 copious SCSI attached storage.
 
 Alex Turner
 netEconomist
 
 On Apr 6, 2005 3:17 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
   
  Unfortunately.
   
  But we are in the the process to choose Postgresql with pgcluster. 
  I'm
  currently running some tests (performance, stability...) Save the 
  money on the license fees, you get it for your hardware ;-)
   
  I still welcome any advices or comments and I'll let you know how 
  the
  project is going on.
   
  Benjamin.
   
   
   
   Mohan, Ross [EMAIL PROTECTED]
  
  05/04/2005 20:48
   
  Pour :[EMAIL PROTECTED] 
  cc : 
  Objet :RE: [PERFORM] Postgresql vs SQLserver for this
  application ?
   
   
  You never got answers on this? Apologies, I don't have one, but'd be
  curious to hear about any you did get

  thx

  Ross
  
  -Original Message-
   From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of 
  [EMAIL PROTECTED]
   Sent: Monday, April 04, 2005 4:02 AM
   To: pgsql-performance@postgresql.org
   Subject: [PERFORM] Postgresql vs SQLserver for this application ?
   
  
   hi all.
   
   We are designing a quite big application that requires a
  high-performance database backend.  The rates we need to obtain are at 
  least  5000 inserts per second and 15 selects per second for one 
  connection. There should only be 3 or 4 simultaneous connections.
   I think our main concern is to deal with the constant flow of data coming
  from the inserts that must be available for selection as fast as possible.
  (kind of real time access ...) 
   
   As a consequence, the database should rapidly increase up to more
  than one hundred gigs. We still have to determine how and when we 
  shoud backup old data to prevent the application from a performance 
   drop. We intend to develop some kind of real-time partitioning on our 
   main table to keep the flows up.
   
   At first, we were planning to use SQL Server as it has features 
  that
  in my opinion could help us a lot :
  - replication 
  - clustering
   
   Recently we started to study Postgresql as a solution for our project : 
  - it also has replication 
  - Postgis module can handle geographic datatypes (which 
  would
  facilitate our developments)
  - We do have a strong knowledge on Postgresql administration 
  (we use it for production processes)
  - it is free (!) and we could save money for hardware 
  purchase.
   
   Is SQL server clustering a real asset ? How reliable are Postgresql
  replication tools  ? Should I

Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross
VOIP over BitTorrent?  

;-)


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, March 30, 2005 11:27 AM
To: Michael Fuhr
Cc: Marc Burgauer; pgsql-performance@postgresql.org; [EMAIL PROTECTED]
Subject: Re: [PERFORM] Reading recommendations




[EMAIL PROTECTED] wrote on 03/30/2005 10:58:21 AM:


 Allow telecommute from across the pond and I might be interested :-)

Please post phone bills to this list.


 --
 Michael Fuhr

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross
Yea, the upside is that you get better than the 1 byte/hour rate for 
pigeon-net. 

Downside is that simply because you use BiTorrent, the RIAA accuses you of 
everything from CD piracy to shipping pr*n to cyberterrorism, and you spend 
the next four years in Gitmo, comparing notes with your cellmates in Camp 
X-Ray, 
and watching pigeons fly overhead. 


-Original Message-
From: Steve Wampler [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 30, 2005 11:52 AM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Reading recommendations


Mohan, Ross wrote:
 VOIP over BitTorrent?

Now *that* I want to see.  Ought to be at least as interesting as the TCP/IP 
over carrier pigeon experiment - and more challenging to boot!


-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Reading recommendations

2005-03-30 Thread Mohan, Ross
I can see that PG'ers have a wicked sense of humor. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steve Wampler
Sent: Wednesday, March 30, 2005 3:58 PM
To: [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Reading recommendations


[EMAIL PROTECTED] wrote:

Mohan, Ross wrote:

VOIP over BitTorrent?

Now *that* I want to see.  Ought to be at least as interesting as the 
TCP/IP over carrier pigeon experiment - and more challenging to 
boot!

 
 
 It was very challenging.  I worked on the credit window sizing and 
 retransmission timer estimation algorithms.  We took into account 
 weather patterns, size and age of the bird, feeding times, and the 
 average number of times a bird circles before determining magnetic 
 north.  Interestingly, packet size had little effect in the final 
 algorithms.
 
 I would love to share them with all of you, but they're classified.

Ah, but VOIPOBT requires many people all saying the same thing at the same 
time.  The synchronization alone (since you need to distribute these people 
adequately to avoid overloading a trunk line...) is probably sufficiently hard 
to make it interesting.  Then there are the problems of different accents, 
dialects, and languages ;)

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster