Re: [HACKERS] multi-threaded pgbench

2009-07-30 Thread Magnus Hagander
On Wed, Jul 29, 2009 at 23:31, Josh Williams wrote:
> On Tue, 2009-07-28 at 23:38 -0400, Josh Williams wrote:
>> Huh, running the patched version on a single thread with 128 clients
>> just got it to crash.  Actually consistently, three times now.  Will try
>> the same thing on the development box tomorrow morning to get some
>> better debugging information.
>
> So yeah, buffer overrun.
>
> In pgbench.c FD_SETSIZE is redefined to get around the Windows default
> of 64.  But this is done after bringing in winsock2.h (a couple levels
> in as a result of first including postgres_fe.h).  So any fd_set is
> built with an array of 64 descriptors, while pgbench thinks it has 1024
> available to work with.
>
> This was introduced a while back; the multi-threaded patch just makes it
> visible by giving it an important pointer to write over.  Previously it
> would just run over into the loop counter (and probably a couple other
> things) and thus it'd continue on happily with the [sub]set it has.

Yikes.
Yeah, this is fallout from the hacking we did with moving the winsock
includes around a while back. At the time the #defines were added,
winsock came in through the win32.h file :S


> In either case this seems to be a simple fix, to move that #define
> earlier (see pgbench_win32.patch.)

Yes, and it seems to be entirely unrelated to the multithreaded patch.
Thus, applied as a separate patch.



-- 
 Magnus Hagander
 Self: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multi-threaded pgbench

2009-07-29 Thread Greg Smith
This patch is wrapping up nicely.  I re-tested against the updated 
pgbench-mt_20090724 and now I get similar results whether or not 
--enable-thread-safety is enabled on Linux, so that problem is gone. 
Josh's successful Windows tests along with finding the bug he attached a 
patch to are also encouraging.


I re-ran my performance tests with the same basic setup (16 core system, 
database scale=10, read-only tests) but this time increased shared_buffers 
to 256MB just to see if results popped up significantly (they didn't).


Here's a comparison of the original pgbench select-only TPS against the 
new version using 1 thread:


          clients
threads     16      32      64     128
none     91763   69707   68465   63730
1        90797   70117   66324   63626

I ran these a few times and those are basically the same result.  If 
there's a regression using 1 thread instead of 1 process, which I thought
I was seeing at one point with j=1/c=128, under closer investigation it 
would have to be much smaller than the run to run variation of pgbench 
because it vanished when I collected many runs of data.


Running the new pgbench with thread safety turned on:

          clients
threads     16      32      64     128
1        89503   67849   67120   63499
2        97883   91888   87556   84430
4        95319   96409   90445   83569
8        96002   95411   88988   82383
16      103798   95056   87701   82253
32           X   95869   88253   82253

Running it without thread safety turned on so it uses processes instead 
(this is the case I couldn't report on before):


          clients
threads     16      32      64     128
1        89706   68702   64545   62770
2        99224   91677   88812   82442
4        96124   96552   90245   83311
8        97066   96000   89149   83266
16      103276   96088   88276   82652
32           X   97405   90082   83672

Those two tables are also identical relative to the run to run pgbench 
noise.


This looks ready for a committer review to me, I'm happy that the patch 
performs as expected and it seems to work across two platforms.


To step back for a second, I'm testing a fairly optimistic situation--the 
standard RHEL 2.6.18 kernel which doesn't have any major issues here--and 
I see a decent sized speedup (>30%) in the worst case.  I've reported 
before that running pgbench on newer Linux kernels (>=2.6.23) is horribly 
slow, and sure enough the original results kicking off this thread showed 
the same thing:  only 11600 TPS on a modern 8 core system.  That's less 
than 1/4 what that server is capable of, and this patch allows working 
around that issue nicely.  pgbench not scaling up is really a much worse
problem than my test results suggest.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] multi-threaded pgbench

2009-07-29 Thread Josh Williams
On Tue, 2009-07-28 at 23:38 -0400, Josh Williams wrote:
> Huh, running the patched version on a single thread with 128 clients
> just got it to crash.  Actually consistently, three times now.  Will try
> the same thing on the development box tomorrow morning to get some
> better debugging information.

So yeah, buffer overrun.

In pgbench.c FD_SETSIZE is redefined to get around the Windows default
of 64.  But this is done after bringing in winsock2.h (a couple levels
in as a result of first including postgres_fe.h).  So any fd_set is
built with an array of 64 descriptors, while pgbench thinks it has 1024
available to work with.

This was introduced a while back; the multi-threaded patch just makes it
visible by giving it an important pointer to write over.  Previously it
would just run over into the loop counter (and probably a couple other
things) and thus it'd continue on happily with the [sub]set it has.

In either case this seems to be a simple fix, to move that #define
earlier (see pgbench_win32.patch.)
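
A minimal sketch of why the ordering matters, assuming a Winsock-style
fd_set that is laid out as a count followed by an array whose length is
fixed when the header is processed. The names mock_fd_set, MOCK_SETSIZE and
mock_fd_add are hypothetical stand-ins, not Winsock or pgbench identifiers.
Taking the bound from the array itself, instead of from a macro that may
have been redefined later, keeps the add from writing past the end:

```c
#include <stddef.h>

/* Capacity the struct was compiled with (stand-in for the Windows default). */
#define MOCK_SETSIZE 64

/* Hypothetical mirror of a count-plus-array descriptor set. */
typedef struct mock_fd_set
{
    unsigned int fd_count;                  /* descriptors currently stored */
    int          fd_array[MOCK_SETSIZE];    /* storage fixed at compile time */
} mock_fd_set;

/*
 * Add fd to the set.  Returns 1 on success, 0 if the set is full.
 * The limit is derived from the array's real size, so it cannot drift
 * out of step with the struct the way a late "#define FD_SETSIZE 1024" can.
 */
int
mock_fd_add(mock_fd_set *set, int fd)
{
    size_t      capacity = sizeof(set->fd_array) / sizeof(set->fd_array[0]);

    if (set->fd_count >= capacity)
        return 0;               /* refuse instead of overrunning */
    set->fd_array[set->fd_count++] = fd;
    return 1;
}
```

With the struct built for 64 entries, the 65th add is rejected. A bound of
1024 checked against the same 64-entry array would instead write over
whatever follows it in memory, which matches the overrun described above.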

- Josh Williams

diff -c -r1.87 pgbench.c
*** contrib/pgbench/pgbench.c	11 Jun 2009 14:48:51 -	1.87
--- contrib/pgbench/pgbench.c	29 Jul 2009 21:18:18 -
***************
*** 26,31 ****
--- 26,36 ----
   * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
   *
   */
+ 
+ #ifdef WIN32
+ #define FD_SETSIZE 1024		/* set before winsock2.h is included */
+ #endif   /* ! WIN32 */
+ 
  #include "postgres_fe.h"
  
  #include "libpq-fe.h"
***************
*** 34,41 ****
  #include 
  
  #ifdef WIN32
- #undef FD_SETSIZE
- #define FD_SETSIZE 1024
  #include 
  #else
  #include 
--- 39,44 ----



Re: [HACKERS] multi-threaded pgbench

2009-07-28 Thread Josh Williams
On Tue, 2009-07-28 at 12:10 -0400, Greg Smith wrote:
> If your test system 
> is still setup, it might be interesting to try the 64 and 128 client cases 
> with Task Manager open, to see what percentage of the CPU the pgbench 
> driver program is using.  If the pgbench client isn't already pegged at a 
> full CPU, I wouldn't necessarily expect threading it to help--it would just add
> overhead that doesn't buy you anything, which seems to be what you're 
> measuring.

That's a really good point, I do recall seeing pgbench taking only a
fraction of the CPU...  Running it again, it hovers around 6 or 7
percent in both cases, so it's only using up around half a core.

Huh, running the patched version on a single thread with 128 clients
just got it to crash.  Actually consistently, three times now.  Will try
the same thing on the development box tomorrow morning to get some
better debugging information.


> All the Linux tests suggest that limit tends to show up at over 20,000 TPS
> nowadays, so maybe your Windows system is bottlenecking somewhere
> completely different before it reaches saturation on the client.

I figured it was just indicating a limitation of the environment, where
Windows has some kind of inefficiency either in the PG port or just
something inherent in how the OS works.  It does make me wonder where
exactly all that CPU time is going, though.  OProfile, how I miss thee.
But that's a different discussion entirely.

- Josh Williams





Re: [HACKERS] multi-threaded pgbench

2009-07-28 Thread Greg Smith

On Tue, 28 Jul 2009, Josh Williams wrote:

> Maybe pgbench itself is less of a bottleneck in this environment,
> relatively speaking?


On UNIXish systems, you know you've reached the conditions under which the 
threaded pgbench would be helpful if the pgbench client program itself is 
taking up a large percentage of a CPU just by itself.  If your test system
is still setup, it might be interesting to try the 64 and 128 client cases 
with Task Manager open, to see what percentage of the CPU the pgbench 
driver program is using.  If the pgbench client isn't already pegged at a 
full CPU, I wouldn't necessarily expect threading it to help--it would just add
overhead that doesn't buy you anything, which seems to be what you're 
measuring.


All the Linux tests suggest that limit tends to show up at over 20,000 TPS
nowadays, so maybe your Windows system is bottlenecking somewhere
completely different before it reaches saturation on the client.


In any case, Josh's review is exactly what I wanted to see here--the code 
does compile and run successfully for someone besides its author under 
Windows.  Making it *effective* on that platform might end up being 
outside the scope of what we want to chew on right now.  I'll have updated 
performance results to submit later this week against the updated patch.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] multi-threaded pgbench

2009-07-27 Thread Josh Williams
On Wed, 2009-07-22 at 22:23 -0400, Greg Smith wrote:
> Onto performance.  My test system has 16 cores of Xeon X5550 @ 2.67GHz.
> I created a little pgbench database (-s 10) and used the default
> postgresql.conf parameters for everything but max_connections for a
> rough initial test.

To test on Windows, I set up a similar database on an 8-core 2.0GHz
E5335 (closest match I have.)  It's compiled against a fresh CVS pull
from this morning, patched with the "20090724" updated version.  I tried
to mirror the tests as much as possible, including the concurrent thread
counts despite having half the number of available cores.  Doing that
didn't have much impact on the results, but more on that later.

Comparing the unpatched version to the new version running a single
client thread, there's no significant performance difference:

C:\pgsql85>bin\pgbenchorig.exe -S -c 8 -t 10 pgbench
...
tps = 19061.234215 (including connections establishing)

C:\pgsql85>bin\pgbench.exe -S -c 8 -t 10 pgbench
tps = 18852.928562 (including connections establishing)

As a basis of comparison the original pgbench was run with increasing
client counts, which shows the same drop off in throughput past the
16-client sweet spot:

con   tps
  8 18871
 16 19161
 24 18804
 32 18670
 64 17598
128 16664

However I was surprised to see these results for the patched version,
running 16 worker threads (apart from the 8 client run of course.)

C:\pgsql85>bin\pgbench.exe -S -j 16 -c 128 -t 10 pgbench ...
con   tps
  8 18435 (-j 8)
 16 18866
 24 -
 32 17937
 64 17016
128 15930

In all cases the patched version resulted in a lower performing output
than the unpatched version.  It's clearly working, at least in that it's
launching the requested number of worker threads when looking at the
process.  Adjusting the worker thread count down to match the number of
cores yielded identical results in the couple of test cases I ran.
Maybe pgbench itself is less of a bottleneck in this environment,
relatively speaking?

- Josh Williams





Re: [HACKERS] multi-threaded pgbench

2009-07-23 Thread Itagaki Takahiro

Itagaki Takahiro  wrote:

> Greg Smith  wrote:
> > That second code path, when --enable-thread-safety is turned off, crashes 
> > and burns on my Linux system:
> 
> It comes from a conflict of identifiers.
> Renaming identifiers with #define can solve the errors:
> #define pthread_t pg_pthread_t

Here is a patch to fix compile errors by identifier-renaming
when thread-safety is disabled on linux.

Also I fixed file descriptor leaks at the end of benchmark.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



pgbench-mt_20090724.patch
Description: Binary data



Re: [HACKERS] multi-threaded pgbench

2009-07-22 Thread Itagaki Takahiro

Greg Smith  wrote:

> That second code path, when --enable-thread-safety is turned off, crashes 
> and burns on my Linux system:

It comes from a conflict of identifiers.
Renaming identifiers with #define can solve the errors:

#define pthread_t   pg_pthread_t
#define pthread_attr_t  pg_pthread_attr_t
#define pthread_create  pg_pthread_create
#define pthread_join    pg_pthread_join
typedef struct fork_pthread *pthread_t;
...
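
A small sketch of the renaming idea, assuming a system where <sys/types.h>
already declares pthread_t. The contents of struct fork_pthread and the
helper emulated_handle_size are hypothetical, since the patch's real
definitions are not shown in this message:

```c
#include <stddef.h>
#include <sys/types.h>      /* on many systems this already declares pthread_t */

/*
 * Rename the emulation's identifiers before declaring them, so the
 * typedefs below create pg_pthread_t and pg_pthread_attr_t instead of
 * redeclaring the system's pthread_t and pthread_attr_t.
 */
#define pthread_t       pg_pthread_t
#define pthread_attr_t  pg_pthread_attr_t

/* Hypothetical layout for illustration only. */
struct fork_pthread
{
    pid_t       pid;        /* child process standing in for a thread */
    int         pipes[2];   /* pipe used to pass a result back */
};

typedef struct fork_pthread *pthread_t;     /* really declares pg_pthread_t */
typedef int pthread_attr_t;                 /* really declares pg_pthread_attr_t */

/* Code written against the pthread names now picks up the private types. */
size_t
emulated_handle_size(void)
{
    pthread_t   handle = NULL;

    return sizeof(handle);
}
```

The system header has to be included before the #define lines. Its include
guard then keeps the real declarations from being processed again under the
renamed identifiers.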

Another idea is that we don't use the pthread API directly and instead add
a 'pg_thread' wrapper module on top of pthread.

We can choose either implementation... Which is better?


> $ ./pgbench -j 16 -S -c 24 -t 1 pgbench
> number of clients (24) must be a multiple number of threads (16)

It's hard on forking-thread platforms because multiple threads need
to access the job queue. We need to put the queue on inter-process
shared memory, but it introduces additional complexities.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] multi-threaded pgbench

2009-07-22 Thread Greg Smith
I just took multi-threaded pgbench for an initial spin, looks good overall 
with only a couple of small rough edges.


The latest code works differently depending on whether you compiled with 
--enable-thread-safety or not, it defines some structures based on fork if 
it's not enabled:


#elif defined(ENABLE_THREAD_SAFETY)
#include 
#else
#include 
typedef struct fork_pthread *pthread_t;
typedef int pthread_attr_t;
static int pthread_create(pthread_t *thread, pthread_attr_t *attr,
                          void *(*start_routine) (void *), void *arg);

static int pthread_join(pthread_t th, void **thread_return);
#endif

That second code path, when --enable-thread-safety is turned off, crashes 
and burns on my Linux system:


gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv 
-I../../src/interfaces/libpq -I. -I../../src/include -D_GNU_SOURCE   -c -o 
pgbench.o pgbench.c -MMD -MP -MF .deps/pgbench.Po

pgbench.c:72: error: conflicting types for pthread_t
/usr/include/bits/pthreadtypes.h:50: error: previous declaration of 
pthread_t was here

pgbench.c:73: error: conflicting types for pthread_attr_t
/usr/include/bits/pthreadtypes.h:57: error: previous declaration of 
pthread_attr_t was here


So that's the first problem to sort out, I was planning to test that path 
as well as the regular threaded one.  Since I'd expect there to be Linux 
packages built both with and without thread safety enabled, they both 
should work, even though people should always be turning safety on 
nowadays.


We should try to get a Windows based tester here too at some point, 
there's a completely different set of thread wrapper code for that OS that 
could use a look by somebody more familiar than me with that platform.


The second thing that concerns me is that there's a limitation in the code 
where the number of clients must be a multiple of the number of workers. 
When I tried to gradually step up the client volume the tests wouldn't 
run:


$ ./pgbench -j 16 -S -c 24 -t 1 pgbench
number of clients (24) must be a multiple number of threads (16)

Once the larger issues are worked out, it would be much friendlier if it
were possible to pass new threads a client count so that the last in the
pool could service a smaller number.  The logic for that is kind of a
pain--in this case you'd want 8 threads running 2 clients each while 8 ran
1 client--but it would really be much friendlier and more flexible that way.
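
One way to sketch that uneven split, assuming the clients left over after
an even division simply go to the lowest-numbered threads. The function
name clients_for_thread is hypothetical and not taken from the patch:

```c
/*
 * Number of clients that thread number thread_index (0-based) should
 * serve when nclients are spread over nthreads.  Every thread gets the
 * even share, and the first (nclients % nthreads) threads get one more.
 */
int
clients_for_thread(int nclients, int nthreads, int thread_index)
{
    int         base = nclients / nthreads;
    int         extra = nclients % nthreads;

    return base + (thread_index < extra ? 1 : 0);
}
```

For -c 24 -j 16 this gives 2 clients to threads 0 through 7 and 1 client to
threads 8 through 15, which is the 8-and-8 split described above.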


Onto performance.  My test system has 16 cores of Xeon X5550 @ 2.67GHz.
I created a little pgbench database (-s 10) and used the default 
postgresql.conf parameters for everything but max_connections for a rough 
initial test.


Performance on this box drops off pretty fast once you get past 16 
clients; using the original, unpatched pgbench:


c   tps
16  86887
24  70685
32  63787
64  64712
128 60602

A quick test of the new version suggests that there's no glaring
performance regression running it with a single client thread:


$ ./pgbench.orig -S -c 64 -t 1 pgbench
tps = 64712.451737 (including connections establishing)

$ ./pgbench -S -c 64 -t 1 pgbench
tps = 63806.494046 (including connections establishing)

So I moved on to testing with a worker thread per CPU:

./pgbench -j 16 -S -c 16 -t 10 pgbench
./pgbench -j 16 -S -c 32 -t 5 pgbench
./pgbench -j 16 -S -c 64 -t 1 pgbench
./pgbench -j 16 -S -c 128 -t 1 pgbench

And got considerably better results:

c  tps
16  96223
32  89014
64  82487
128 74217

That's as much as a 40% speedup @ 32 clients, and even a decent win at 
lower counts.


The patch looks like it accomplishes its performance goals quite well 
here.  I'll be glad to run some more extensive performance tests, but I'd 
like to at least see the version without --enable-thread-safety fixed 
first so that I can queue up and compare both versions when I go through 
that.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] multi-threaded pgbench

2009-07-19 Thread Robert Haas

On Jul 18, 2009, at 3:40 PM, Greg Stark wrote:

> On Sat, Jul 18, 2009 at 8:25 PM, Robert Haas wrote:
>> On Thu, Jul 9, 2009 at 4:51 AM, Itagaki Takahiro wrote:
>>> Here is an updated version of multi-threaded pgbench patch.
>>
>> Greg (Smith), do you have time to review this version?  If not, I will
>> assign a round-robin reviewer when one becomes available.
>
> Incidentally you could assign me something if you want.

OK.

> I gave feedback on Simon/Your join removal and the Append min/max
> patch. I don't think either has really reached any conclusive
> "finished" state though. I suppose I should mark your patch as
> "returned with feedback" even if it's mostly just "good work, keep
> going"? And the other patch isn't actually in this commitfest but I
> think we're still discussing what it should do.

Well, I think we really need Tom to look at join removal. If he
doesn't have any better ideas for how to structure the code it's not
clear to me that we shouldn't just commit what I already did and then
start future work from there. But this seems like an issue for that
thread rather than this one.

Wrt append min/max I think we should postpone further discussion until
end of commitfest, since it was submitted mid-CommitFest.


...Robert



Re: [HACKERS] multi-threaded pgbench

2009-07-19 Thread Robert Haas
On Sun, Jul 19, 2009 at 12:50 AM, Josh Berkus wrote:
>> Greg (Smith), do you have time to review this version?  If not, I will
>> assign a round-robin reviewer when one becomes available.
>
> I can do a concurrency test of this next week.

Sounds good.

...Robert



Re: [HACKERS] multi-threaded pgbench

2009-07-18 Thread Josh Berkus



> Greg (Smith), do you have time to review this version?  If not, I will
> assign a round-robin reviewer when one becomes available.


I can do a concurrency test of this next week.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] multi-threaded pgbench

2009-07-18 Thread Greg Stark
On Sat, Jul 18, 2009 at 8:25 PM, Robert Haas wrote:
> On Thu, Jul 9, 2009 at 4:51 AM, Itagaki
> Takahiro wrote:
>> Here is an updated version of multi-threaded pgbench patch.
>
> Greg (Smith), do you have time to review this version?  If not, I will
> assign a round-robin reviewer when one becomes available.

Incidentally you could assign me something if you want.

I gave feedback on Simon/Your join removal and the Append min/max
patch. I don't think either has really reached any conclusive
"finished" state though. I suppose I should mark your patch as
"returned with feedback" even if it's mostly just "good work, keep
going"? And the other patch isn't actually in this commitfest but I
think we're still discussing what it should do.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] multi-threaded pgbench

2009-07-18 Thread Robert Haas
On Thu, Jul 9, 2009 at 4:51 AM, Itagaki
Takahiro wrote:
> Here is an updated version of multi-threaded pgbench patch.

Greg (Smith), do you have time to review this version?  If not, I will
assign a round-robin reviewer when one becomes available.

...Robert



Re: [HACKERS] multi-threaded pgbench

2009-07-09 Thread Itagaki Takahiro
Here is an updated version of multi-threaded pgbench patch.

Andrew Dunstan  wrote:

> > Hmm, but how will you communicate stats back from the sub-processes?
> My first reaction is to say "use a pipe."

I added a partial implementation of pthread using fork and pipe for platforms
without ENABLE_THREAD_SAFETY. The pthread version is not strictly needed
if we have the fork version, but I still left it as-is.
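
A rough sketch of the fork-and-pipe idea on a POSIX system, not the patch's
actual code. WorkerResult, run_worker_in_child and demo_worker are
hypothetical names, and the single counter is a stand-in for whatever
statistics a real worker would report:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-worker result; the real patch's fields may differ. */
typedef struct WorkerResult
{
    long        xacts;      /* transactions completed by this worker */
} WorkerResult;

/* Stand-in worker used for illustration: pretends to run n * 10 transactions. */
WorkerResult
demo_worker(int n)
{
    WorkerResult r;

    r.xacts = n * 10L;
    return r;
}

/*
 * Run fn(arg) in a forked child and copy its result back through a pipe.
 * Returns 0 on success, -1 on any failure.
 */
int
run_worker_in_child(WorkerResult (*fn) (int), int arg, WorkerResult *out)
{
    int         fds[2];
    pid_t       pid;
    int         status;
    ssize_t     n;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* child: do the work, write the result, exit without returning */
        WorkerResult r = fn(arg);

        close(fds[0]);
        n = write(fds[1], &r, sizeof(r));
        _exit(n == (ssize_t) sizeof(r) ? 0 : 1);
    }

    /* parent: read the child's result, then reap the child */
    close(fds[1]);
    n = read(fds[0], out, sizeof(*out));
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (n != (ssize_t) sizeof(*out))
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

Calling run_worker_in_child(demo_worker, 7, &result) leaves 70 in
result.xacts. A result larger than this small struct would need a read loop
in the parent, since a single read may return only part of the data.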

The name of the new option is still -j, which is borrowed from pg_restore
and gmake. They use -j for multi-worker processing.

  -j NUM   number of threads (default: 1)

I needed to modify the meaning of tps (excluding connections establishing)
a little because connections are executed in parallel. I subtract average
of connection times from total execution time.

total_time := last_thread_finish_time - first_thread_start_time
tps (including connection) := num_transaction / total_time
tps (excluding connection) := num_transaction /
(total_time - (total_connection_time / num_threads))
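
The formulas above can be written out as a small function. This is only a
sketch; TpsResult and compute_tps are hypothetical names, and times are
assumed to be in seconds:

```c
/* Both throughput figures reported at the end of a run. */
typedef struct TpsResult
{
    double      including;  /* tps including connection establishment */
    double      excluding;  /* tps excluding connection establishment */
} TpsResult;

/*
 * total_time is last_thread_finish_time - first_thread_start_time.
 * total_connection_time is summed over all threads, so it is divided by
 * num_threads to approximate the wall-clock time spent connecting when
 * the threads connect in parallel.
 */
TpsResult
compute_tps(long num_transactions, double total_time,
            double total_connection_time, int num_threads)
{
    TpsResult   r;
    double      avg_connection_time = total_connection_time / num_threads;

    r.including = num_transactions / total_time;
    r.excluding = num_transactions / (total_time - avg_connection_time);
    return r;
}
```

For example, 1000 transactions in 10 seconds with 4 threads that spent 8
seconds connecting in total give 100 tps including connections and
1000 / (10 - 2) = 125 tps excluding them.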

Note that I also fixed a few parts of pgbench:
  * Use instr_time instead of struct timeval.
    Macros in portability/instr_time.h make the code cleaner.
  * Accept "\sleep 1ms" format (no spaces between "1" and "ms") for sleep
    meta command. The old version of pgbench interprets "1ms" as just "1",
    which means "1 s". That was confusing.
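
A sketch of parsing a \sleep argument so that "1ms" and "1 ms" mean the same
thing. The function parse_sleep_usec is hypothetical and takes the whole
argument as one string, which is a simplification; the unit names us, ms and
s and the default of seconds are assumptions based on the description above:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Convert a sleep argument such as "1", "1ms", "1 ms", "5us" or "2s" to
 * microseconds.  A bare number is taken as seconds.  Returns -1 if there
 * are no digits, the number is negative, or the unit is not recognised.
 */
long
parse_sleep_usec(const char *arg)
{
    char       *end;
    long        value = strtol(arg, &end, 10);

    if (end == arg || value < 0)
        return -1;

    /* allow optional spaces between the number and the unit */
    while (*end == ' ')
        end++;

    if (*end == '\0' || strcmp(end, "s") == 0)
        return value * 1000000L;
    if (strcmp(end, "ms") == 0)
        return value * 1000L;
    if (strcmp(end, "us") == 0)
        return value;
    return -1;
}
```

The point of checking what follows the digits is that a trailing "ms" is no
longer silently dropped, which is what turned "1ms" into one second before.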

I'll add the patch to the commitfest page.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



pgbench-mt_20090709.patch
Description: Binary data



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Greg Smith

On Wed, 8 Jul 2009, Tom Lane wrote:

> pg_restore doesn't need anything more than a success/failure result
> from its child processes, but I think pgbench will want more.


The biggest chunk of returned state to consider is how each client 
transaction generates a line of latency information that goes into the log 
file.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Itagaki Takahiro

Andrew Dunstan  wrote:

> I think you should have it use pthreads if available, or Windows threads 
> there, or fork() elsewhere.

Just a question - which platform does not support any threading?
I think threading is very common in modern applications. If there
are such OSes, they seem to be just abandoned and not maintained...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Andrew Dunstan



Tom Lane wrote:
> Andrew Dunstan writes:
>> I think you should have it use pthreads if available, or Windows threads
>> there, or fork() elsewhere.
>
> Hmm, but how will you communicate stats back from the sub-processes?
> pg_restore doesn't need anything more than a success/failure result
> from its child processes, but I think pgbench will want more.

My first reaction is to say "use a pipe."

cheers

andrew



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Tom Lane
Andrew Dunstan  writes:
> I think you should have it use pthreads if available, or Windows threads 
> there, or fork() elsewhere.

Hmm, but how will you communicate stats back from the sub-processes?
pg_restore doesn't need anything more than a success/failure result
from its child processes, but I think pgbench will want more.

regards, tom lane



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Greg Smith

On Wed, 8 Jul 2009, Itagaki Takahiro wrote:

> Multi-threading would be a solution. The attached patch adds -j
> (number of jobs) option to pgbench.

Should probably name this -w "number of workers" to stay consistent with
terminology used on the server side.

> Is it acceptable to use pthread in contrib module?
> If ok, I will add the patch to the next commitfest.

pgbench is basically broken right now, as demonstrated by the lack of
scaling shown in your results and similar ones I've collected.  This looks
like it fixes the primary problem there.  While it would be nice if a 
multi-process based solution were written instead, unless someone is 
willing to step up and volunteer to write one I'd much rather see your 
patch go in than doing nothing at all.  It shouldn't even impact old 
results if you don't toggle the option on.


I have 3 new server systems I was going to run pgbench on anyway in the 
next month as part of my standard performance testing on new hardware. 
I'll be happy to mix in results using the multi-threaded pgbench to check 
the patch's performance, along with the rest of the initial review here.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Stefan Kaltenbrunner

Tom Lane wrote:
> Alvaro Herrera writes:
>> Itagaki Takahiro wrote:
>>> Is it acceptable to use pthread in contrib module?
>
>> We don't have a precedent it seems.  I think the requirement would be
>> that it should compile if pthread support is not present.
>
> Right.  Breaking it for non-pthread environments is not acceptable.
>
> The real question here is whether it will be a problem if pgbench
> delivers significantly different results when built with or without
> threading support.  I can see arguments either way on that ...

well, pgbench as it is now is more or less unusable on modern
hardware for SELECT type queries (way too slow to scale to what the
backend can do these days and the number of cores in a recent box).
It is only somewhat usable on the default update heavy test as well
because even there it is hitting scalability limits (ie I can easily
improve on its numbers with a perl script that forks and issues the same
queries).
I would even go as far as issuing a WARNING if pgbench is invoked and
not compiled with threads if we accept this patch...




Stefan



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Andrew Dunstan



Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Itagaki Takahiro wrote:
>>> Is it acceptable to use pthread in contrib module?
>>
>> We don't have a precedent it seems.  I think the requirement would be
>> that it should compile if pthread support is not present.
>
> My thoughts as well. But I wonder, would it be harder or easier to use
> fork() instead?

  


I have just been down this road to some extent with parallel pg_restore, 
which uses threads on Windows. That might be useful as a bit of a 
template. Extending it to use pthreads would probably be fairly trivial. 
The thread/fork specific stuff ended up being fairly isolated for 
pg_restore. See src/bin/pg_dump/pg_backup_archiver.c:spawn_restore()


I think you should have it use pthreads if available, or Windows threads 
there, or fork() elsewhere.


cheers

andrew



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Tom Lane
Alvaro Herrera  writes:
> Itagaki Takahiro wrote:
>> Is it acceptable to use pthread in contrib module?

> We don't have a precedent it seems.  I think the requirement would be
> that it should compile if pthread support is not present.

Right.  Breaking it for non-pthread environments is not acceptable.

The real question here is whether it will be a problem if pgbench
delivers significantly different results when built with or without
threading support.  I can see arguments either way on that ...

regards, tom lane



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Heikki Linnakangas
Alvaro Herrera wrote:
> Itagaki Takahiro wrote:
> 
>> Is it acceptable to use pthread in contrib module?
> 
> We don't have a precedent it seems.  I think the requirement would be
> that it should compile if pthread support is not present.

My thoughts as well. But I wonder, would it be harder or easier to use
fork() instead?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] multi-threaded pgbench

2009-07-08 Thread Alvaro Herrera
Itagaki Takahiro wrote:

> Is it acceptable to use pthread in contrib module?

We don't have a precedent it seems.  I think the requirement would be
that it should compile if pthread support is not present.

> If ok, I will add the patch to the next commitfest.

Add it anyway -- discussion should happen during commitfest if it
doesn't spark right away.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
