Re: [PERFORM] Analysis Function

2010-06-16 Thread Magnus Hagander
On Mon, Jun 14, 2010 at 15:59, Tom Lane t...@sss.pgh.pa.us wrote:
> Magnus Hagander mag...@hagander.net writes:
>> On Sun, Jun 13, 2010 at 21:19, David Jarvis thanga...@gmail.com wrote:
>>> I prefer to_timestamp and to_date over the more verbose construct_timestamp.
>>
>> Yeah, I agree with that.
>
> Those names are already taken.  It will cause confusion (of both people
> and machines) if you try to overload them with this.

Fair enough. How about something like make_timestamp? It's at least
shorter and easier than construct :-)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net
On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote:
> On 6/15/10 10:37 AM, Chris Browne wrote:
>
> I'd like to see some figures about WAL on RAMfs vs. simply turning off
> fsync and full_page_writes.  Per Gavin's tests, PostgreSQL is already
> close to TokyoCabinet/MongoDB performance just with those turned off; I
> wonder if actually having the WAL on a memory partition would make any
> real difference in throughput.
>
> I've seen a lot of call for this recently, especially since PostgreSQL
> seems to be increasingly in use as a reporting server for Hadoop.  Might
> be worth experimenting with just making wal writing a no-op.  We'd also
> want to disable checkpointing, of course.


My back-of-the-envelope experiment: Inserting single integers into a
table without indexes using a prepared query via psycopg2.

Python Script:
import psycopg2
from time import time
conn = psycopg2.connect(database='jgardner')
cursor = conn.cursor()
cursor.execute("CREATE TABLE test (data int not null)")
conn.commit()
cursor.execute("PREPARE ins AS INSERT INTO test VALUES ($1)")
conn.commit()
start = time()
tx = 0
# run for one second, committing after every single-row insert
while time() - start < 1.0:
    cursor.execute("EXECUTE ins(%s)", (tx,))
    conn.commit()
    tx += 1
print tx
cursor.execute("DROP TABLE test")
conn.commit()

Local disk, WAL on same FS (transactions per second):
* Default config = 90
* full_page_writes=off = 90
* synchronous_commit=off = 4,500
* fsync=off = 5,100
* fsync=off and synchronous_commit=off = 5,500
* fsync=off and full_page_writes=off = 5,150
* fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

tmpfs, WAL on same tmpfs (transactions per second):
* Default config: 5,200
* full_page_writes=off = 5,200
* fsync=off = 5,250
* synchronous_commit=off = 5,200
* fsync=off and synchronous_commit=off = 5,450
* fsync=off and full_page_writes=off = 5,250
* fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

NOTE: If I do one giant commit instead of lots of littler ones, I get
much better speeds for the slower cases, but I never exceed 5,500
which appears to be some kind of wall I can't break through.

If there's anything else I should tinker with, I'm all ears.



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Mark Kirkwood

On 16/06/10 18:30, jgard...@jonathangardner.net wrote:
> [...]
>
> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.
>
> If there's anything else I should tinker with, I'm all ears.

Seeing some profiler output (e.g. oprofile) for the fastest case (and 
maybe 'em all later) might be informative about what limit is being hit 
here.


regards

Mark



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C



> Have you tried connecting using a UNIX socket instead of a TCP socket on
> localhost ? On such very short queries, the TCP overhead is significant.


Actually UNIX sockets are the default for psycopg2, had forgotten that.

I get 7400 using UNIX sockets and 3000 using TCP (host=localhost)



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Greg Smith

jgard...@jonathangardner.net wrote:

> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.


That's usually about where I run into the upper limit on how many 
statements Python can execute against the database per second.  Between 
that and the GIL preventing better multi-core use, once you pull the 
disk out and get CPU bound it's hard to use Python for load testing of 
small statements and bottleneck anywhere except in Python itself.


I normally just write little performance test cases in the pgbench 
scripting language, then I get multiple clients and (in 9.0) multiple 
driver threads all for free.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [PERFORM] Analysis Function

2010-06-16 Thread David Jarvis
> Fair enough. How about something like make_timestamp? It's at least
> shorter and easier than construct :-)


Agreed.

Dave


Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C


FYI I've tweaked this program a bit :

import psycopg2
from time import time
conn = psycopg2.connect(database='peufeu')
cursor = conn.cursor()
cursor.execute("CREATE TEMPORARY TABLE test (data int not null)")
conn.commit()
cursor.execute("PREPARE ins AS INSERT INTO test VALUES ($1)")
cursor.execute("PREPARE sel AS SELECT 1")
conn.commit()
start = time()
tx = 0
N = 100
d = 0
# run for ~10 seconds, committing once every N inserts
while d < 10:
    for n in xrange( N ):
        cursor.execute("EXECUTE ins(%s)", (tx,))
        #~ conn.commit()
        #~ cursor.execute("EXECUTE sel")
    conn.commit()
    d = time() - start
    tx += N
print "result : %d tps" % (tx / d)
cursor.execute("DROP TABLE test")
conn.commit()

Results (Core 2 quad, ubuntu 10.04 64 bits) :

SELECT 1 : 21000 queries/s (I'd say 50 us per query isn't bad !)
INSERT with commit every 100 inserts : 17800 inserts/s
INSERT with commit every INSERT : 7650 tps

fsync is on, but synchronous_commit is off.




Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Alvaro Herrera
Excerpts from jgard...@jonathangardner.net's message of Wed Jun 16 02:30:30 -0400 2010:

> NOTE: If I do one giant commit instead of lots of littler ones, I get
> much better speeds for the slower cases, but I never exceed 5,500
> which appears to be some kind of wall I can't break through.
>
> If there's anything else I should tinker with, I'm all ears.

increase wal_buffers?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Josh Berkus

> * fsync=off = 5,100
> * fsync=off and synchronous_commit=off = 5,500

Now, this *is* interesting ... why should synch_commit make a difference
if fsync is off?

Anyone have any ideas?

> tmpfs, WAL on same tmpfs:
> * Default config: 5,200
> * full_page_writes=off = 5,200
> * fsync=off = 5,250
> * synchronous_commit=off = 5,200
> * fsync=off and synchronous_commit=off = 5,450
> * fsync=off and full_page_writes=off = 5,250
> * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

So, in this test, it seems like having WAL on tmpfs doesn't make a
significant difference for everything == off.

I'll try running some tests on Amazon when I have a chance.  It would be
worthwhile to get figures without Python's ceiling.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



[PERFORM] Confirm calculus

2010-06-16 Thread Juan Pablo Sandoval Rivera
Good morning, list.

In relation to the process of tuning the PostgreSQL database engine, especially
version 7.3.7 which is being used currently, I would appreciate clarification of
a procedure.

If I have a server with 2 GB of RAM, it is said that the shared memory segment
for the database engine on a dedicated server should be between 25% and at most
33% of the RAM.

My calculation would be the following; I would appreciate confirmation that it
is correct:

1 GB = 1,048,576 kB
1,048,576 * 2 = 2,097,152 kB (2 GB)
25% would be 2,097,152 * 25 / 100 = 524,288 kB
Then shared_buffers should be 524,288 / 8 = 65,536 for 25% of 2 GB,
or should it be 524,288?
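
(For reference, the same arithmetic as a quick sanity-check script - a sketch only,
assuming the 8 kB buffer pages that shared_buffers is counted in:)

ram_kb = 2 * 1024 * 1024            # 2 GB of RAM expressed in kB = 2,097,152
quarter_kb = ram_kb * 25 // 100     # 25% of RAM = 524,288 kB
buffers = quarter_kb // 8           # one buffer = 8 kB, so 65,536 buffers
print(buffers)                      # shared_buffers is a count of buffers, not kB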


Sincerely.

Juan Pablo Sandoval Rivera
Tecnologo Prof. en Ing. de Sistemas

Linux User : 322765 
msn: juan_pab...@hotmail.com
yahoo : juan_pab...@rocketmail.com (juan_pablos.rm)
UIN : 276125187 (ICQ)
Jabber : juan_pab...@www.jabberes.org
Skype  : juan.pablo.sandoval.rivera

SUPPORT ECOSEARCH.COM - Help save the planet.






Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner
On Wed, Jun 16, 2010 at 1:27 AM, Greg Smith g...@2ndquadrant.com wrote:

> I normally just write little performance test cases in the pgbench scripting
> language, then I get multiple clients and (in 9.0) multiple driver threads
> all for free.


See, this is why I love these mailing lists. I totally forgot about
pgbench. I'm going to dump my cheesy python script and play with that
for a while.

-- 
Jonathan Gardner
jgard...@jonathangardner.net



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner
On Wed, Jun 16, 2010 at 4:22 AM, Pierre C li...@peufeu.com wrote:

> import psycopg2
> from time import time
> conn = psycopg2.connect(database='peufeu')
> [...]
> print "result : %d tps" % (tx / d)


I'm not surprised that Python add is so slow, but I am surprised that
I didn't remember it was... ;-)

-- 
Jonathan Gardner
jgard...@jonathangardner.net



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner
On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus j...@agliodbs.com wrote:

>> * fsync=off = 5,100
>> * fsync=off and synchronous_commit=off = 5,500
>
> Now, this *is* interesting ... why should synch_commit make a difference
> if fsync is off?
>
> Anyone have any ideas?


I may have stumbled upon this by my ignorance, but I thought I read
that synchronous_commit controlled whether it tries to line up commits
or has a more free-for-all that may cause some intermediate weirdness.

-- 
Jonathan Gardner
jgard...@jonathangardner.net



Re: [PERFORM] Confirm calculus

2010-06-16 Thread Kevin Grittner
Juan Pablo Sandoval Rivera juapab...@tutopia.com wrote:

> In relation to the process of tuning the PostgreSQL database engine,
> especially version 7.3.7 which is being used currently,
Do you really mean PostgreSQL version 7.3.7 (not 8.3.7)?  If so, you
should really consider upgrading.  Performance is going to be much
better, not to mention all the new features and bug fixes.
 
> it is said that the shared memory segment for the engine of the
> database on a dedicated server should be between 25% and maximum
> 33% of the RAM.
 
I think that started being a recommended practice with 8.1 or 8.2;
in my limited experience with older versions, it didn't pay to go
beyond somewhere in the 100MB to 200MB range.
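
(In 7.3, shared_buffers is given as a number of 8 kB buffers in postgresql.conf,
so a value in that range would look something like this - the exact number is just
an example:)

shared_buffers = 16384          # 16384 buffers * 8 kB = 128 MB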
 
-Kevin



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Balkrishna Sharma

http://www.postgresql.org/docs/current/static/wal-async-commit.html
"... the server waits for the transaction's WAL records to be flushed to permanent
storage before returning a success indication to the client."

I think that with fsync=off, whether WAL gets written to disk or not is still
controlled by the synchronous_commit parameter. Guessing here...

> Date: Wed, 16 Jun 2010 12:19:20 -0700
> Subject: Re: [PERFORM] PostgreSQL as a local in-memory cache
> From: jgard...@jonathangardner.net
>
> I may have stumbled upon this by my ignorance, but I thought I read
> that synchronous_commit controlled whether it tries to line up commits
> or has a more free-for-all that may cause some intermediate weirdness.
_
The New Busy is not the old busy. Search, chat and e-mail from your inbox.
http://www.windowslive.com/campaign/thenewbusy?ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_3

Re: [PERFORM] Confirm calculus

2010-06-16 Thread Scott Marlowe
On Wed, Jun 16, 2010 at 7:46 AM, Juan Pablo Sandoval Rivera
juapab...@tutopia.com wrote:
> Good morning, list.
>
> In relation to the process of tuning the PostgreSQL database engine, especially
> version 7.3.7 which is being used currently, I would appreciate clarification
> of a procedure.

I concur with the other poster on keeping shared_mem lower on this
older version.  Also, you should be running at the very least the
latest version of 7.3, which is 7.3.21 and available here:

ftp://ftp-archives.postgresql.org/pub/source/

Tuning 7.3 is more an effort in futility at this point, considering
that a non-tuned 8.3 or 8.4 install will still be much much faster.
If the lack of auto-casts in 8.3 impacts you negatively then at least
look at 8.2.latest.
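
(For example, the kind of implicit text cast that 8.3 removed - a sketch of the
behavior change being referred to:)

SELECT 1 WHERE 123 LIKE '12%';          -- 8.2: works via implicit cast to text
                                        -- 8.3: "operator does not exist" error
SELECT 1 WHERE 123::text LIKE '12%';    -- works on both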



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner
On Wed, Jun 16, 2010 at 12:51 AM, Pierre C li...@peufeu.com wrote:

> Have you tried connecting using a UNIX socket instead of a TCP socket on
> localhost ? On such very short queries, the TCP overhead is significant.


Unfortunately, this isn't an option for my use case. Carbonado only
supports TCP connections.


-- 
Jonathan Gardner
jgard...@jonathangardner.net



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Craig James

On 6/16/10 12:00 PM, Josh Berkus wrote:



>> * fsync=off = 5,100
>> * fsync=off and synchronous_commit=off = 5,500
>
> Now, this *is* interesting ... why should synch_commit make a difference
> if fsync is off?
>
> Anyone have any ideas?


I found that pgbench has noise of about 20% (I posted about this a couple 
days ago using data from 1000 identical pgbench runs).  Unless you make a bunch of runs 
and average them, a difference of 5,100 to 5,500 appears to be meaningless.
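
(For illustration, one way to make a bunch of runs and average them, assuming
pgbench's usual "tps = ..." summary line; the script file and database name are
examples only:)

for i in $(seq 1 10); do
    pgbench -n -f insert_test.sql -c 8 -T 30 jgardner | grep "tps = .* (including"
done | awk '{ sum += $3; n++ } END { print "mean tps:", sum / n }'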

Craig






Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C



> I'm not surprised that Python add is so slow, but I am surprised that
> I didn't remember it was... ;-)


it's not the add(), it's the time.time()...




Re: [PERFORM] Analysis Function

2010-06-16 Thread Tom Lane
David Jarvis thanga...@gmail.com writes:
>> Fair enough. How about something like make_timestamp? It's at least
>> shorter and easier than construct :-)

> Agreed.

No objection here either.

regards, tom lane



[PERFORM] Parallel queries for a web-application |performance testing

2010-06-16 Thread Balkrishna Sharma

Hello,

I will have a web application with postgres 8.4+ as the backend. At any given
time, there will be a maximum of 1000 parallel web users interacting with the
database (read/write). I wish to do performance testing of 1000 simultaneous
reads/writes to the database.

I can run a simple unix script on the postgres server and have parallel updates
fired, for example with an ampersand at the end:

echo '\timing \\update DAPP.emp_data set f1 = 123 where emp_id = 0;' | psql test1 postgres | grep Time: | cut -d' ' -f2- >> /home/user/Documents/temp/logs/$NUM.txt &
pid1=$!
echo '\timing \\update DAPP.emp_data set f1 = 123 where emp_id = 2;' | psql test1 postgres | grep Time: | cut -d' ' -f2- >> /home/user/Documents/temp/logs/$NUM.txt &
pid2=$!
echo '\timing \\update DAPP.emp_data set f1 = 123 where emp_id = 4;' | psql test1 postgres | grep Time: | cut -d' ' -f2- >> /home/user/Documents/temp/logs/$NUM.txt &
pid3=$!
...

My question is: am I losing something by firing these queries directly off the
server, and should I look at firing the queries from different IP addresses (as
would happen in a web application)? Would the way postgres opens sockets and
allocates buffers etc. change between the two approaches, so that I get
unrealistic results from a unix script on the server? It would be a very tedious
exercise to have 1000 different machines (IP addresses) each firing a query at
the same time, but I want to be absolutely sure my test would give the same
result in production (the latency requirements for read/write are very, very
low). I am not interested in the network time; just the database read/write time.

Thanks for any tips!
-Bala
  
_
The New Busy think 9 to 5 is a cute idea. Combine multiple calendars with 
Hotmail. 
http://www.windowslive.com/campaign/thenewbusy?tile=multicalendarocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_5

Re: [PERFORM] requested shared memory size overflows size_t

2010-06-16 Thread Alvaro Herrera
Excerpts from Tom Lane's message of Mon Jun 14 23:57:11 -0400 2010:
> Scott Carey sc...@richrelevance.com writes:
>> Great points.  There is one other option that is decent for the WAL:
>> If splitting out a volume is not acceptable for the OS and WAL --
>> absolutely split those two out into their own partitions.  It is most
>> important to make sure that WAL and data are not on the same filesystem,
>> especially if ext3 is involved.
>
> Uh, no, WAL really needs to be on its own *spindle*.  The whole point
> here is to have one disk head sitting on the WAL and not doing anything
> else except writing to that file.

However, there's another point here -- probably what Scott is on about:
on Linux (at least ext3), an fsync of any file does not limit to
flushing that file's blocks -- it flushes *ALL* blocks on *ALL* files in
the filesystem.  This is particularly problematic if you have pgsql_tmp
in the same filesystem and do lots of disk-based sorts.

So if you have it in the same spindle but on a different filesystem, at
least you'll avoid that extra fsync work, even if you have to live with
the extra seeking.
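
(A sketch of what that kind of split can look like - device names and mount points
are examples only, and the server must be stopped before moving pg_xlog:)

pg_ctl -D "$PGDATA" stop
mkfs.ext2 /dev/sdb1                    # a small filesystem used only for WAL
mount /dev/sdb1 /mnt/pg_wal
mv "$PGDATA/pg_xlog" /mnt/pg_wal/pg_xlog
ln -s /mnt/pg_wal/pg_xlog "$PGDATA/pg_xlog"
pg_ctl -D "$PGDATA" start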

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net
We have a fairly unique need for a local, in-memory cache. This will
store data aggregated from other sources. Generating the data only
takes a few minutes, and it is updated often. There will be some
fairly expensive queries of arbitrary complexity run at a fairly high
rate. We're looking for high concurrency and reasonable performance
throughout.

The entire data set is roughly 20 MB in size. We've tried Carbonado in
front of SleepycatJE only to discover that it chokes at a fairly low
concurrency and that Carbonado's rule-based optimizer is wholly
insufficient for our needs. We've also tried Carbonado's Map
Repository which suffers the same problems.

I've since moved the backend database to a local PostgreSQL instance
hoping to take advantage of PostgreSQL's superior performance at high
concurrency. Of course, at the default settings, it performs quite
poorly compared to the Map Repository and Sleepycat JE.

My question is how can I configure the database to run as quickly as
possible if I don't care about data consistency or durability? That
is, the data is updated so often and it can be reproduced fairly
rapidly so that if there is a server crash or random particles from
space mess up memory we'd just restart the machine and move on.

I've never configured PostgreSQL to work like this and I thought maybe
someone here had some ideas on a good approach to this.



Re: [PERFORM] requested shared memory size overflows size_t

2010-06-16 Thread Tom Wilcox
Thanks. I will try with a more sensible value of wal_buffers.. I was 
hoping to keep more in memory and therefore reduce the frequency of disk 
IOs..


Any suggestions for good monitoring software for linux?

On 15/06/2010 00:08, Robert Haas wrote:
> On Mon, Jun 14, 2010 at 2:53 PM, Tom Wilcox hungry...@gmail.com wrote:
>> maintenance_work_mem=4GB
>> work_mem=4GB
>> shared_buffers=4GB
>> effective_cache_size=4GB
>> wal_buffers=1GB
>
> It's pretty easy to drive your system into swap with such a large
> value for work_mem - you'd better monitor that carefully.
>
> The default value for wal_buffers is 64kB.  I can't imagine why you'd
> need to increase that by four orders of magnitude.  I'm not sure
> whether it will cause you a problem or not, but you're allocating
> quite a lot of shared memory that way that you might not really need.





Re: [PERFORM] Parallel queries for a web-application |performance testing

2010-06-16 Thread Kevin Grittner
Balkrishna Sharma b...@hotmail.com wrote:

> I wish to do performance testing of 1000 simultaneous read/write
> to the database.
You should definitely be using a connection pool of some sort.  Both
your throughput and response time will be better that way.  You'll
want to test with different pool sizes, but I've found that a size
which allows the number of active queries in PostgreSQL to be
somewhere around (number_of_cores * 2) + effective_spindle_count to
be near the optimal size.
 
> My question is: Am I losing something by firing these queries
> directly off the server and should I look at firing the queries
> from different IP address (as it would happen in a web application).
 
If you run the client side of your test on the database server, the
CPU time used by the client will probably distort your results.  I
would try using one separate machine to generate the requests, but
monitor to make sure that the client machine isn't hitting some
bottleneck (like CPU time).  If the client is the limiting factor,
you may need to use more than one client machine.  No need to use
1000 different client machines.  :-)
 
-Kevin



Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net
On Jun 14, 7:14 pm, jgard...@jonathangardner.net wrote:
> We have a fairly unique need for a local, in-memory cache. This will
> store data aggregated from other sources. Generating the data only
> takes a few minutes, and it is updated often.
> [...]
> My question is how can I configure the database to run as quickly as
> possible if I don't care about data consistency or durability?

Just to summarize what I've been able to accomplish so far. By turning
fsync and synchronous_commit off, and moving the data dir to tmpfs,
I've been able to run the expensive queries much faster than BDB or
the MapRepository that comes with Carbonado. This is because
PostgreSQL's planner is so much faster and better than whatever
Carbonado has. Tweaking indexes has only made things run faster.
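
(A sketch of that kind of throwaway setup, with an example mount point and the
settings discussed in this thread; mount as root, then run initdb/pg_ctl as the
postgres user:)

mount -t tmpfs -o size=512M tmpfs /mnt/pgram
initdb -D /mnt/pgram/data
pg_ctl -D /mnt/pgram/data \
    -o "-c fsync=off -c synchronous_commit=off -c full_page_writes=off" start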

Right now I'm wrapping up the project so that we can do some serious
performance benchmarks. I'll let you all know how it goes.

Also, just a note that setting up PostgreSQL for these weird scenarios
turned out to be just a tiny bit harder than setting up SQLite. I
remember several years ago when there was a push to simplify the
configuration and installation of PostgreSQL, and I believe that that
has borne fruit.
