Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-05 Thread Pierre C
On Thu, 04 Nov 2010 15:42:08 +0100, Nick Matheson  
nick.d.mathe...@noaa.gov wrote:
I think your comments really get at what our working hypothesis was, but  
given that our experience is limited compared to you all here on the  
mailing lists we really wanted to make sure we weren't missing any  
alternatives. Also the writing of custom aggregators will likely  
leverage any improvements we make to our storage throughput.


Quick test: SELECT sum(x) FROM a table with 1 INT column, 3M rows, cached
= 244 MB/s
= 6.7 M rows/s

Same on MySQL :

           size     SELECT sum(x) (cached)
postgres   107 MB   0.44 s
myisam     20 MB    0.42 s
innodb     88 MB    1.98 s

As you can see, even though myisam is much smaller (no transaction data to
store!), the aggregate performance isn't any better, and for innodb it is
much worse.


Even though pg's per-row header is large, seq scan / aggregate performance  
is very good.


You can get performance in this ballpark by writing a custom aggregate in
C; it isn't very difficult, and the pg source code is clean and full of
insightful comments.


- take a look at how contrib/intagg works
- http://www.postgresql.org/files/documentation/books/aw_pgsql/node168.html
- and the pg manual of course



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-05 Thread Robert Klemme

On 11/03/2010 04:52 PM, Nick Matheson wrote:


We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table). We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is
due to the storage overhead we have observed in Postgres. In the
example below, it takes 1 GB to store 350 MB of nominal data. However
that suggests we would expect to get 35 MB/s bulk read rates.



Our business logic does operations on the resulting data such that
the output is several orders of magnitude smaller than the input. So
we had hoped that by putting our business logic into stored procedures
(and thus drastically reducing the amount of data flowing to the
client) our throughput would go way up. This did not happen.


Can you disclose what kinds of manipulations you want to do on the data? 
 I am asking because maybe there is a fancy query (possibly using 
windowing functions and / or aggregation functions) that gets you the 
speed that you need without transferring the whole data set to the client.



So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads? Or is this the price we have to pay for
using SQL instead of some NoSQL solution. (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).


Kind regards

robert




Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-05 Thread Samuel Gendler
On Fri, Nov 5, 2010 at 12:23 PM, Samuel Gendler
sgend...@ideasculptor.com wrote:

 On Thu, Nov 4, 2010 at 8:07 AM, Vitalii Tymchyshyn tiv...@gmail.com wrote:

 04.11.10 16:31, Nick Matheson wrote:

  Heikki-


 Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

  Thanks for the suggestion. A preliminary test shows an improvement
 closer to our expected 35 MB/s.

 Are you familiar with any Java libraries for decoding the COPY format?
 The spec is clear and we could clearly write our own, but figured I would
 ask. ;)

 JDBC driver has some COPY support, but I don't remember details. You'd
 better ask in JDBC list.



 The JDBC driver support works fine.  You can pass a Reader or InputStream
 (if I recall correctly, the InputStream path is more efficient, or maybe
 the Reader path was buggy).  Regardless, I wound up using an InputStream in
 the driver, which I then wrap in a Reader in order to get it line-by-line.

 You can write a COPY statement to send standard CSV format - take a look at
 the postgres docs for the COPY statement to see the full syntax.  I then
 have a subclass of BufferedReader which parses each line of CSV and does
 something interesting with it.  I've had it working very reliably for many
 months now, processing about 500 million rows per day (I'm actually COPYing
 out, rather than in, but the concept is the same, regardless - my
 output stream is wrapped in a writer, which reformats data on the fly).



I should mention that I found basically no documentation of the copy api in
the jdbc driver in 8.4.  I have no idea if that has changed with 9.x. I had
to figure it out by reading the source code.  Fortunately, it is very
simple:

return ((PGConnection) con).getCopyAPI().copyIn(sql, this.fis);


Where this.fis is an InputStream. There's an alternative copyIn
implementation that takes a Reader instead.  I'm sure the copyOut methods
are the same.


Note: my earlier email was confusing.  copyIn copies into the db and
receives an InputStream that will deliver data when it is read.  copyOut
copies data from the db and receives an OutputStream which will receive the
data.  I inverted those in my earlier email.


You can look at the source code to the CopyAPI to learn more about the
mechanism.
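
For anyone following along, a minimal sketch of the CSV-over-COPY-OUT path
through CopyManager might look like the following.  The connection URL,
credentials, and the in-memory StringWriter are placeholders for
illustration only; the setup described above streams through custom
Reader/Writer wrappers rather than buffering everything in memory.

import java.io.StringWriter;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class CopyCsvSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- adjust for your environment.
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");

        // The driver's entry point to the COPY protocol.
        CopyManager copyApi = ((PGConnection) con).getCopyAPI();

        // copyOut blocks until the COPY finishes, writing CSV text into the
        // supplied Writer and returning the number of rows handled.
        StringWriter out = new StringWriter();
        long rows = copyApi.copyOut(
                "COPY bulk_performance.counts TO STDOUT WITH CSV", out);

        System.out.println(rows + " rows copied, "
                + out.getBuffer().length() + " characters of CSV");
        con.close();
    }
}

Whether the Writer variant or the raw OutputStream variant is faster is
something worth measuring in your own setup.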


Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-05 Thread Samuel Gendler
On Thu, Nov 4, 2010 at 8:07 AM, Vitalii Tymchyshyn tiv...@gmail.com wrote:

 04.11.10 16:31, Nick Matheson wrote:

  Heikki-


 Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

  Thanks for the suggestion. A preliminary test shows an improvement
 closer to our expected 35 MB/s.

 Are you familiar with any Java libraries for decoding the COPY format? The
 spec is clear and we could clearly write our own, but figured I would ask.
 ;)

 JDBC driver has some COPY support, but I don't remember details. You'd
 better ask in JDBC list.



The JDBC driver support works fine.  You can pass a Reader or InputStream
(if I recall correctly, the InputStream path is more efficient, or maybe
the Reader path was buggy).  Regardless, I wound up using an InputStream in
the driver, which I then wrap in a Reader in order to get it line-by-line.

You can write a COPY statement to send standard CSV format - take a look at
the postgres docs for the COPY statement to see the full syntax.  I then
have a subclass of BufferedReader which parses each line of CSV and does
something interesting with it.  I've had it working very reliably for many
months now, processing about 500 million rows per day (I'm actually COPYing
out, rather than in, but the concept is the same, regardless - my
output stream is wrapped in a writer, which reformats data on the fly).


Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Pierre C



Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads?  Or is this the price we have to pay for
using SQL instead of some NoSQL solution.  (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).


Reading from the tables is very fast, what bites you is that postgres has  
to convert the data to wire format, send it to the client, and the client  
has to decode it and convert it to a format usable by your application.  
Writing a custom aggregate in C should be a lot faster since it has direct  
access to the data itself. The code path from actual table data to an  
aggregate is much shorter than from table data to the client...




Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson

Heikki-


Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

Thanks for the suggestion. A preliminary test shows an improvement 
closer to our expected 35 MB/s.


Are you familiar with any Java libraries for decoding the COPY format? 
The spec is clear and we could clearly write our own, but figured I 
would ask. ;)


Nick



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson

Marti-

Just some ideas that went through my mind when reading your post
PostgreSQL 8.3 and later have 22 bytes of overhead per row, plus
page-level overhead and internal fragmentation. You can't do anything
about row overheads, but you can recompile the server with larger
pages to reduce page overhead.

  

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads?



Perhaps a simpler alternative would be writing your own aggregate
function with four arguments.

If you write this aggregate function in C, it should have similar
performance as the sum() query.
  
Your comments seem to confirm some of our foggy understanding of the 
storage 'overhead' and nudge us in the direction of C stored procedures.


Do you have any results or personal experiences from moving calculations 
in this way? I think we are trying to get an understanding of how much 
we might stand to gain by the added investment.


Thanks,

Nick


Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson

Andy-
I have no idea if this would be helpful or not, never tried it, but 
when you fire off select * from bigtable pg will create the entire 
resultset in memory (and maybe swap?) and then send it all to the 
client in one big lump.  You might try a cursor and fetch 100-1000 at 
a time from the cursor.  No idea if it would be faster or slower.
I am pretty sure we have tried paged datasets and didn't see any 
improvement, but we will put this on our list of things to double-check; 
better safe than sorry, you know.


Thanks,

Nick



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson

Pierre-

Reading from the tables is very fast, what bites you is that postgres 
has to convert the data to wire format, send it to the client, and the 
client has to decode it and convert it to a format usable by your 
application. Writing a custom aggregate in C should be a lot faster 
since it has direct access to the
data itself. The code path from actual table data to an aggregate is 
much shorter than from table data to the client...



I think your comments really get at what our working hypothesis was, but 
given that our experience is limited compared to you all here on the 
mailing lists we really wanted to make sure we weren't missing any 
alternatives. Also the writing of custom aggregators will likely 
leverage any improvements we make to our storage throughput.


Thanks,

Nick




Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Vitalii Tymchyshyn

04.11.10 16:31, Nick Matheson wrote:

Heikki-


Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

Thanks for the suggestion. A preliminary test shows an improvement 
closer to our expected 35 MB/s.


Are you familiar with any Java libraries for decoding the COPY format? 
The spec is clear and we could clearly write our own, but figured I 
would ask. ;)
JDBC driver has some COPY support, but I don't remember details. You'd 
better ask in JDBC list.


Best regards, Vitalii Tymchyshyn



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson
-of be on postgres and use all cores... but that
introduces a whole host of other issues to solve. 
  
Interesting. I had heard of Greenplum, but thought it was more about 
scaling to clusters rather than single node improvements. We will have 
to look into that.


Thanks again for all the ideas, questions and things to look into I 
think you have opened a number of new possibilities.


Cheers,

Nick


-Original Message-
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Nick Matheson
Sent: Wednesday, November 03, 2010 9:53 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Simple (hopefully) throughput question?

Hello

We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table).  We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s).  Part of this is
due to the storage overhead we have observed in Postgres.  In the
example below, it takes 1 GB to store 350 MB of nominal data.  However
that suggests we would expect to get 35 MB/s bulk read rates.

Observations using iostat and top during these bulk reads suggest
that the queries are CPU bound, not I/O bound.  In fact, repeating the
queries yields similar response times.  Presumably if it were an I/O
issue the response times would be much shorter the second time through
with the benefit of caching.

We have tried these simple queries using psql, JDBC, pl/java stored
procedures, and libpq.  In all cases the client code ran on the same
box as the server. We have experimented with Postgres 8.1, 8.3 and 9.0.

We also tried playing around with some of the server tuning parameters such
as shared_buffers to no avail.

Here is uname -a for a machine we have tested on:

Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20 07:12:06
EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

A sample dataset that reproduces these results looks like the following
(there are no indexes):

Table bulk_performance.counts
 Column |  Type   | Modifiers
--------+---------+-----------
 i1     | integer |
 i2     | integer |
 i3     | integer |
 i4     | integer |

There are 22 million rows in this case.

We HAVE observed that summation queries run considerably faster.  In this
case,

select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts

runs at 35 MB/s.

Our business logic does operations on the resulting data such that
the output is several orders of magnitude smaller than the input.  So
we had hoped that by putting our business logic into stored procedures
(and thus drastically reducing the amount of data flowing to the
client) our throughput would go way up.  This did not happen.

So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads?  Or is this the price we have to pay for
using SQL instead of some NoSQL solution.  (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).

Thanks

Dan Schaffer
Paul Hamer
Nick Matheson


  





Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Maciek Sakrejda
 JDBC driver has some COPY support, but I don't remember details. You'd
 better ask in JDBC list.

As long as we're here: yes, the JDBC driver has COPY support as of
8.4(?) via the CopyManager PostgreSQL-specific API. You can call
((PGConnection)conn).getCopyManager() and do either push- or
pull-based COPY IN or OUT. We've been using it for several years and
it works like a charm. For more details, ask the JDBC list or check
out the docs: http://jdbc.postgresql.org/documentation/publicapi/index.html
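
A rough sketch of the pull-based COPY OUT path follows (connection details
are placeholders, and the loop just counts bytes): readFromCopy() hands back
the raw COPY data chunk by chunk, which is where a decoder for the binary
format mentioned earlier in the thread would plug in.

import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;
import org.postgresql.copy.CopyOut;

public class PullCopySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");

        CopyManager copyApi = ((PGConnection) conn).getCopyAPI();

        // Pull-based COPY OUT: the client asks for the next chunk when ready.
        CopyOut op = copyApi.copyOut(
                "COPY bulk_performance.counts TO STDOUT BINARY");

        long total = 0;
        byte[] chunk;
        while ((chunk = op.readFromCopy()) != null) {
            // Each chunk is raw COPY data straight off the wire; a decoder
            // for the binary COPY format would consume it here.
            total += chunk.length;
        }
        System.out.println(total + " bytes received");
        conn.close();
    }
}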

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-04 Thread Nick Matheson

Maciek/Vitalii-

Thanks for the pointers to the JDBC work. 

Luckily, we had already found the COPY support in the pg driver, but 
were wondering if anyone had already written the complementary unpacking 
code for the raw data returned from the copy.


Again the spec is clear enough that we could write it, but we just 
didn't want to re-invent the wheel if it wasn't necessary.


Cheers,

Nick



[PERFORM] Simple (hopefully) throughput question?

2010-11-03 Thread Nick Matheson

Hello

We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table).  We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s).  Part of this is
due to the storage overhead we have observed in Postgres.  In the
example below, it takes 1 GB to store 350 MB of nominal data.  However
that suggests we would expect to get 35 MB/s bulk read rates.

Observations using iostat and top during these bulk reads suggest
that the queries are CPU bound, not I/O bound.  In fact, repeating the
queries yields similar response times.  Presumably if it were an I/O
issue the response times would be much shorter the second time through
with the benefit of caching.

We have tried these simple queries using psql, JDBC, pl/java stored
procedures, and libpq.  In all cases the client code ran on the same
box as the server. We have experimented with Postgres 8.1, 8.3 and 9.0.

We also tried playing around with some of the server tuning parameters such as 
shared_buffers to no avail.

Here is uname -a for a machine we have tested on:

Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20 07:12:06 EDT 
2010 x86_64 x86_64 x86_64 GNU/Linux

A sample dataset that reproduces these results looks like the following (there 
are no indexes):

Table bulk_performance.counts
 Column |  Type   | Modifiers
--------+---------+-----------
 i1     | integer |
 i2     | integer |
 i3     | integer |
 i4     | integer |

There are 22 million rows in this case.

We HAVE observed that summation queries run considerably faster.  In this case,

select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts

runs at 35 MB/s.

Our business logic does operations on the resulting data such that
the output is several orders of magnitude smaller than the input.  So
we had hoped that by putting our business logic into stored procedures
(and thus drastically reducing the amount of data flowing to the
client) our throughput would go way up.  This did not happen.

So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads?  Or is this the price we have to pay for
using SQL instead of some NoSQL solution.  (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).

Thanks

Dan Schaffer
Paul Hamer
Nick Matheson




Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-03 Thread Heikki Linnakangas

On 03.11.2010 17:52, Nick Matheson wrote:

We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table). We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is
due to the storage overhead we have observed in Postgres. In the
example below, it takes 1 GB to store 350 MB of nominal data. However
that suggests we would expect to get 35 MB/s bulk read rates.

Observations using iostat and top during these bulk reads suggest
that the queries are CPU bound, not I/O bound. In fact, repeating the
queries yields similar response times. Presumably if it were an I/O
issue the response times would be much shorter the second time through
with the benefit of caching.

We have tried these simple queries using psql, JDBC, pl/java stored
procedures, and libpq. In all cases the client code ran on the same
box as the server. We have experimented with Postgres 8.1, 8.3 and 9.0.


Try COPY, ie. COPY bulk_performance.counts TO STDOUT BINARY.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-03 Thread Marti Raudsepp
Just some ideas that went through my mind when reading your post.

On Wed, Nov 3, 2010 at 17:52, Nick Matheson nick.d.mathe...@noaa.gov wrote:
 than observed raw disk reads (5 MB/s versus 100 MB/s).  Part of this is
 due to the storage overhead we have observed in Postgres.  In the
 example below, it takes 1 GB to store 350 MB of nominal data.

PostgreSQL 8.3 and later have 22 bytes of overhead per row, plus
page-level overhead and internal fragmentation. You can't do anything
about row overheads, but you can recompile the server with larger
pages to reduce page overhead.

 Is there any way using stored procedures (maybe C code that calls
 SPI directly) or some other approach to get close to the expected 35
 MB/s doing these bulk reads?

Perhaps a simpler alternative would be writing your own aggregate
function with four arguments.

If you write this aggregate function in C, it should have similar
performance as the sum() query.

Regards,
Marti



Re: [PERFORM] Simple (hopefully) throughput question?

2010-11-03 Thread Andy Colson

On 11/3/2010 10:52 AM, Nick Matheson wrote:

Hello

We have an application that needs to do bulk reads of ENTIRE
Postgres tables very quickly (i.e. select * from table). We have
observed that such sequential scans run two orders of magnitude slower
than observed raw disk reads (5 MB/s versus 100 MB/s). Part of this is
due to the storage overhead we have observed in Postgres. In the
example below, it takes 1 GB to store 350 MB of nominal data. However
that suggests we would expect to get 35 MB/s bulk read rates.

Observations using iostat and top during these bulk reads suggest
that the queries are CPU bound, not I/O bound. In fact, repeating the
queries yields similar response times. Presumably if it were an I/O
issue the response times would be much shorter the second time through
with the benefit of caching.

We have tried these simple queries using psql, JDBC, pl/java stored
procedures, and libpq. In all cases the client code ran on the same
box as the server. We have experimented with Postgres 8.1, 8.3 and 9.0.

We also tried playing around with some of the server tuning parameters
such as shared_buffers to no avail.

Here is uname -a for a machine we have tested on:

Linux nevs-bdb1.fsl.noaa.gov 2.6.18-194.17.1.el5 #1 SMP Mon Sep 20
07:12:06 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

A sample dataset that reproduces these results looks like the following
(there are no indexes):

Table bulk_performance.counts
 Column |  Type   | Modifiers
--------+---------+-----------
 i1     | integer |
 i2     | integer |
 i3     | integer |
 i4     | integer |

There are 22 million rows in this case.

We HAVE observed that summation queries run considerably faster. In this
case,

select sum(i1), sum(i2), sum(i3), sum(i4) from bulk_performance.counts

runs at 35 MB/s.

Our business logic does operations on the resulting data such that
the output is several orders of magnitude smaller than the input. So
we had hoped that by putting our business logic into stored procedures
(and thus drastically reducing the amount of data flowing to the
client) our throughput would go way up. This did not happen.

So our questions are as follows:

Is there any way using stored procedures (maybe C code that calls
SPI directly) or some other approach to get close to the expected 35
MB/s doing these bulk reads? Or is this the price we have to pay for
using SQL instead of some NoSQL solution. (We actually tried Tokyo
Cabinet and found it to perform quite well. However it does not measure
up to Postgres in terms of replication, data interrogation, community
support, acceptance, etc).

Thanks

Dan Schaffer
Paul Hamer
Nick Matheson




I have no idea if this would be helpful or not, never tried it, but when 
you fire off select * from bigtable pg will create the entire 
resultset in memory (and maybe swap?) and then send it all to the client 
in one big lump.  You might try a cursor and fetch 100-1000 at a time 
from the cursor.  No idea if it would be faster or slower.
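
On the JDBC side, the usual way to get that cursor-style behaviour is
roughly the following sketch (connection details are placeholders; whether
it actually helps here is exactly the open question above).  With autocommit
off, a forward-only result set, and a non-zero fetch size, the driver
fetches rows in batches instead of materializing the whole result set.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "user", "password");

        // Batched (cursor-based) fetching needs autocommit off and a
        // forward-only, read-only result set.
        conn.setAutoCommit(false);
        Statement st = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        st.setFetchSize(1000);  // pull 1000 rows per round trip

        long sum = 0;
        ResultSet rs = st.executeQuery(
                "SELECT i1, i2, i3, i4 FROM bulk_performance.counts");
        while (rs.next()) {
            sum += rs.getInt(1);  // stand-in for the real business logic
        }
        System.out.println("sum(i1) = " + sum);

        rs.close();
        st.close();
        conn.commit();
        conn.close();
    }
}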


-Andy
