Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-30 Thread Hans-Jürgen Schönig

Bruce Momjian wrote:

Greg Stark wrote:
  
I couldn't get async I/O to work on Linux. That is it "worked" but  
performed the same as reading one block at a time. On solaris the  
situation is reversed.


In what way is fadvise a kludge?



I think he is saying AIO gives us more flexibility, but I am unsure we
need it.
  



absolutely.
posix_fadvise is easy to implement and i would assume that it takes away 
a lot of "guessing" on the OS internals side.
the database usually knows that it is gonna read a lot of data in a 
certain way and it cannot be a bad idea to give the kernel a hint here.
especially synchronized seq scans and so on are real winners here as you 
stop confusing the kernel with XX concurrent readers on the same file.

this can also be an issue with some controller firmwares and so on.

   many thanks,

  hans


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-29 Thread Dann Corbit
>Hi Simon,
>
>He is going to do some investigation in the methods and
>write down the possibilities and then he is going to implement
>something from that for PostgreSQL.
>
>When will this work be complete? We are days away from completing main
>work on 8.4, so you won't get much discussion on this for a few months
>yet. Will it be complete in time for 8.5? Or much earlier even?
> 
>The first guess is that the work will be done for 8.6. 
>Dano is supposed to finish the work and defend his thesis in something
a bit more than a year.
>
>Julius, you don't mention what your role is in this. In what sense is
>Dano's master's thesis a "we" thing?
> 
>I am Dano's mentor and we have a closed contact with Zdenek as well. 
>We would like the project to become a "we" thing as another reason why
to work on the project. 
>It seems to be better to research some ideas at the begging and discuss
the stuff
>during development than just individually writing some piece of code
which could 
>be published afterwards. Especially, when this area seems to be of
interest of 
>more people.

Threads are where future performance is going to come from:
General purpose->
http://www.setup32.com/hardware/cpuchipset/32core-processors-intel-reach
e.php

GPU->
http://wwwx.cs.unc.edu/~lastra/Research/GPU_performance.html
http://www.cs.unc.edu/~geom/GPUSORT/results.html

Database engines that want to exploit the ultimate in performance will
utilize multiple threads of execution.
True, the same thing can be realized by multiple processes, but a
process is more expensive than a thread.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-29 Thread Julius Stroffek

Hi Simon,


He is going to do some investigation in the methods and
write down the possibilities and then he is going to implement
something from that for PostgreSQL.



When will this work be complete? We are days away from completing main
work on 8.4, so you won't get much discussion on this for a few months
yet. Will it be complete in time for 8.5? Or much earlier even?
  
The first guess is that the work will be done for 8.6. Dano is supposed 
to finish the work and defend his thesis in something a bit more than a 
year.

Julius, you don't mention what your role is in this. In what sense is
Dano's master's thesis a "we" thing?
  
I am Dano's mentor and we have a closed contact with Zdenek as well. We 
would like the project to become a "we" thing as another reason why to 
work on the project. It seems to be better to research some ideas at the 
begging and discuss the stuff during development than just individually 
writing some piece of code which could be published afterwards. 
Especially, when this area seems to be of interest of more people.


Cheers

Julo


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Aidan Van Dyk
* Greg Stark <[EMAIL PROTECTED]> [081024 10:48]:
> I thought about how to support both and ran into probblems that would  
> make the resulting solutions quite complex.
> 
> In the libaio view of the world you initiate io and either get a  
> callback or call another syscall to test if it's complete. Either  
> approach has problems for postgres. If the process that initiated io  
> is in the middle of a long query it might takr a long time ot even  
> never get back to complete the io. The callbacks use threads...
> 
> And polling for completion has the problem that another process could  
> be waiting on the io and can't issue a read as long as the first  
> process has the buffer locked and io in progress. I think aio makes a  
> lot more sense if you're using threads so you can start a thread to  
> wait for the io to complete.
> 
> Actually I think it might be doable with a lot of work but I'm worried  
> that it would be a lot of extra complexity even when you're not using  
> it. The current patch doesn't change anything when you're not using it  
> and actually is quite simple.

In the Solaris async IO, are you bound by direct IO?  Does the OS page-cache
still get primed by async reads?  If so, how about starting async IO
into a "throwaway" local buffer;  treat async IO in the same way as
fadvise, a "pre-load the OS page cache so the real read is quick".

Sure, I understand it's not the "perfect model", but it I don't see
PostgreSQL being refactored enough to have a pure async model happening
any time in the near future...

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.


signature.asc
Description: Digital signature


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Greg Stark
Also keep in mind that solaris is open source these days. If someone  
wants it they could always go ahead and add the feature ...


greg

On 24 Oct 2008, at 03:18 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:


Jonah H. Harris wrote:
On Fri, Oct 24, 2008 at 7:59 AM, Hannu Krosing  
<[EMAIL PROTECTED]> wrote:

On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:

While we could build an
abstract prefetch interface and simply use fadvise for it now  
(rather

than OS-specific code), I don't see an easy win in any case.


When building an abstract interface, always use at least two
implementations (I guess that would be fadvise on linux and AIO on
solaris in this case). You are much more likely to get the interface
right this way.


I agree, I just wasn't sure as to whether Greg's patch supported  
both methods.


It does not, and probably will not for the near future;  we can only
hope Solaris suports posix_fadvise() at some point.

--
 Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
 EnterpriseDB http://enterprisedb.com

 + If your life is a hard drive, Christ can be your backup. +


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Greg Stark
I thought about how to support both and ran into probblems that would  
make the resulting solutions quite complex.


In the libaio view of the world you initiate io and either get a  
callback or call another syscall to test if it's complete. Either  
approach has problems for postgres. If the process that initiated io  
is in the middle of a long query it might takr a long time ot even  
never get back to complete the io. The callbacks use threads...


And polling for completion has the problem that another process could  
be waiting on the io and can't issue a read as long as the first  
process has the buffer locked and io in progress. I think aio makes a  
lot more sense if you're using threads so you can start a thread to  
wait for the io to complete.


Actually I think it might be doable with a lot of work but I'm worried  
that it would be a lot of extra complexity even when you're not using  
it. The current patch doesn't change anything when you're not using it  
and actually is quite simple.


greg

On 24 Oct 2008, at 03:18 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:


Jonah H. Harris wrote:
On Fri, Oct 24, 2008 at 7:59 AM, Hannu Krosing  
<[EMAIL PROTECTED]> wrote:

On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:

While we could build an
abstract prefetch interface and simply use fadvise for it now  
(rather

than OS-specific code), I don't see an easy win in any case.


When building an abstract interface, always use at least two
implementations (I guess that would be fadvise on linux and AIO on
solaris in this case). You are much more likely to get the interface
right this way.


I agree, I just wasn't sure as to whether Greg's patch supported  
both methods.


It does not, and probably will not for the near future;  we can only
hope Solaris suports posix_fadvise() at some point.

--
 Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
 EnterpriseDB http://enterprisedb.com

 + If your life is a hard drive, Christ can be your backup. +


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Bruce Momjian
Jonah H. Harris wrote:
> On Fri, Oct 24, 2008 at 7:59 AM, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> > On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:
> >> While we could build an
> >> abstract prefetch interface and simply use fadvise for it now (rather
> >> than OS-specific code), I don't see an easy win in any case.
> >
> > When building an abstract interface, always use at least two
> > implementations (I guess that would be fadvise on linux and AIO on
> > solaris in this case). You are much more likely to get the interface
> > right this way.
> 
> I agree, I just wasn't sure as to whether Greg's patch supported both methods.

It does not, and probably will not for the near future;  we can only
hope Solaris suports posix_fadvise() at some point.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Jonah H. Harris
On Fri, Oct 24, 2008 at 7:59 AM, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:
>> While we could build an
>> abstract prefetch interface and simply use fadvise for it now (rather
>> than OS-specific code), I don't see an easy win in any case.
>
> When building an abstract interface, always use at least two
> implementations (I guess that would be fadvise on linux and AIO on
> solaris in this case). You are much more likely to get the interface
> right this way.

I agree, I just wasn't sure as to whether Greg's patch supported both methods.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-24 Thread Hannu Krosing
On Fri, 2008-10-24 at 00:52 -0400, Jonah H. Harris wrote:
> While we could build an
> abstract prefetch interface and simply use fadvise for it now (rather
> than OS-specific code), I don't see an easy win in any case.

When building an abstract interface, always use at least two
implementations (I guess that would be fadvise on linux and AIO on
solaris in this case). You are much more likely to get the interface
right this way.

--
Hannu 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Heikki Linnakangas

Jonah H. Harris wrote:

On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.


I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.


Let's try to focus. We're not talking about using O_DIRECT, we're 
talking about using asynchronous I/O or posix_fadvise(). And without 
more details on what you mean by benefit, under what circumstances, any 
numbers like that is just unmeasurable handwaving.


In terms of getting the RAID array busy, in Greg's tests posix_fadvise() 
on Linux worked just as well as async I/O works on Solaris. So it 
doesn't seem like there's any inherent performance advantage in the 
async I/O interface over posix_fadvise() + read(),


There is differences between different OS implementations of the 
interfaces. But we're developing software for the future, and for a wide 
range of platforms, and I'm sure operating systems will develop as well. 
The decision should not be made on what is the fastest interface on a 
given operating system in 2008.


Async I/O might have a small potential edge on CPU usage, because less 
system calls are needed. However, let me remind you all that we're 
talking about how to utilize RAID array to do physical, random, I/O as 
fast as possible. IOW, the bottleneck is I/O, by definition. The CPU 
efficiency of the kernel interface to initiate the I/O is insignificant, 
until we reach a large enough random read throughput to saturate the 
CPU, and even then there's probably more significant CPU savings to be 
made elsewhere.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Heikki Linnakangas

Jonah H. Harris wrote:

fadvise is a kludge.


I don't think it's a kludge at all. posix_fadvise() is a pretty nice and 
clean interface to hint the kernel what pages you're going to access in 
the near future. I can't immediately come up with a cleaner interface to 
do that.


Compared to async I/O, it's helluva lot simpler to add a few 
posix_fadvise() calls to an application, than switch to a completely 
different paradigm. And while posix_fadvise() is just a hint, allowing 
the OS to prioritize accordingly, all async I/O requests look the same.



 While it will help, it still makes us completely
reliant on the OS.


That's not a bad thing in my opinion. The OS knows the I/O hardware, 
disk layout, utilization, and so forth, and is in a much better position 
to do I/O scheduling than a user process. The only advantage a user 
process has is that it knows better what pages it's going to need, and 
posix_fadvise() is a good interface to let the user process tell the 
kernel that.



IIRC, we currently have support for rings in the buffer pool, which we could 
read
directly into.


The rings won't help you a bit. It's just a different way to choose 
victim buffers.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark
We did discuss this in Ottawa and I beleive your comment was "the hint  
is in the name" referring to posix_fadvise.


In any case both aio and posix_fadvise are specified by posix so I  
don't see either as a problem on that front.


I don't think we can ignore any longer that we effectively can't use  
raid arrays with postgres. If you have many concurrent queries or  
restrict yourself to sequential scans you're ok but if you're doing  
data warehousing you're going to be pretty disappointed to see your  
shiny raid array performing like a single drive.



greg

On 24 Oct 2008, at 05:42 AM, Tom Lane <[EMAIL PROTECTED]> wrote:


"Jonah H. Harris" <[EMAIL PROTECTED]> writes:

On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark

In what way is fadvise a kludge?



non-portable, requires more user-to-system CPU, ... need I go on?


I'd be interested to know which of these proposals you claim *is*
portable.  The single biggest reason to reject 'em all is that
they aren't.

   regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark
Based on what? I did test this and posted the data. The results I  
posted showed that posix_fadvise on Linux performed nearly as well on  
Linux as async I/O on Solaris on identical hardware.


More importantly it scaled with the number if drives. A 15 drive array  
gets about 15x the performance of a 1 drive array if enough read-ahead  
is done. Plus an extra boost if the input wasn't already sorted which  
presumably reflects the better i/o ordering.





--
greg

On 24 Oct 2008, at 04:29 AM, "Jonah H. Harris"  
<[EMAIL PROTECTED]> wrote:


On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]>  
wrote:

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.


I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.

--
Jonah H. Harris, Senior DBA
myYearbook.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark



On 24 Oct 2008, at 04:31 AM, "Jonah H. Harris"  
<[EMAIL PROTECTED]> wrote:



On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
<[EMAIL PROTECTED]> wrote:
I couldn't get async I/O to work on Linux. That is it "worked" but  
performed

the same as reading one block at a time. On solaris the situation is
reversed.


Hmm, then obviously you did something wrong, because my tests showed
it quite well.  Pull the source to iozone or fio.


I posted the source, feel free to point out what I did wrong. It did  
work on solaris with and without o_direct so I didn't think it was a  
bug in my code.




In what way is fadvise a kludge?


non-portable, requires more user-to-system CPU, ... need I go on?


Well it's just as portable, they're both specified by posix. Actually  
async I/o is in the real-time extensions so one could argue it's less  
portable. Also before posix_fadvise there was plain old fadvise so  
it's portable to older platforms too whereas async I/o isn't.


Posix_fadvise does require two syscalls and two trips to the buffer  
manager. But that doesn't really make it a kludge if the resulting  
code is cleaner than the async I/o code would be. To use async I/o we  
would have to pin all the buffers we're reading which would be quite a  
lot of code changes.


I did ask for feedback on precisely this point of whether two trips to  
the buffer manager was a problem. It would have been nice to get the  
feedback 6 months ago when I posted it instead of now two weeks before  
feature freeze. 
   


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris
On Fri, Oct 24, 2008 at 12:42 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> non-portable, requires more user-to-system CPU, ... need I go on?
>
> I'd be interested to know which of these proposals you claim *is*
> portable.  The single biggest reason to reject 'em all is that
> they aren't.

Yes, that was bad wording on my part.  What I mean to say was
unpredictable.  Different OSes and filesystems handle fadvise
differently (or not at all), which makes any claim to performance gain
configuration-dependent.  My preferred method, using O_DIRECT and
fetching directly into shared buffers, is not without its issues or
challenges as well.  However, by abstracting the multi-block read
interface, we could use more optimal calls depending on the OS.

Having done a bit of research and testing in this area (AIO and buffer
management), I don't see any easy solution.  fadvise will work on some
systems and will likely give some gain on them, but won't work for
everyone.  The alternative is to abstract prefetching and allow
platform-specific code, which we rarely do.  While we could build an
abstract prefetch interface and simply use fadvise for it now (rather
than OS-specific code), I don't see an easy win in any case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Tom Lane
"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
>> In what way is fadvise a kludge?

> non-portable, requires more user-to-system CPU, ... need I go on?

I'd be interested to know which of these proposals you claim *is*
portable.  The single biggest reason to reject 'em all is that
they aren't.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris
On Thu, Oct 23, 2008 at 10:36 PM, Greg Stark
<[EMAIL PROTECTED]> wrote:
> I couldn't get async I/O to work on Linux. That is it "worked" but performed
> the same as reading one block at a time. On solaris the situation is
> reversed.

Hmm, then obviously you did something wrong, because my tests showed
it quite well.  Pull the source to iozone or fio.

> In what way is fadvise a kludge?

non-portable, requires more user-to-system CPU, ... need I go on?

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris
On Thu, Oct 23, 2008 at 8:44 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> True, it is a kludge but if it gives us 95% of the benfit with 10% of
> the code, it is a win.

I'd say, optimistically, maybe 30-45% the benefit over a proper
multi-block read using O_DIRECT.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian
Greg Stark wrote:
> I couldn't get async I/O to work on Linux. That is it "worked" but  
> performed the same as reading one block at a time. On solaris the  
> situation is reversed.
> 
> In what way is fadvise a kludge?

I think he is saying AIO gives us more flexibility, but I am unsure we
need it.

---


> 
> greg
> 
> On 24 Oct 2008, at 01:44 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> 
> > Jonah H. Harris wrote:
> >> On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]>  
> >> wrote:
>  I think the current plan is to use posix_advise() to allow  
>  parallel I/O,
>  rather than async I/O becuase posix_advise() will require fewer  
>  code
>  changes.
> >>>
> >>> These are not necessarily mutually exclusive designs.  fadvise  
> >>> works fine on
> >>> Linux, but as far as I know only async I/O works on Solaris.   
> >>> Linux also has
> >>> an async I/O library, and it's not clear to me yet whether that  
> >>> might work
> >>> even better than the fadvise approach.
> >>
> >> fadvise is a kludge.  While it will help, it still makes us  
> >> completely
> >> reliant on the OS.  For performance reasons, we should be  
> >> supporting a
> >> multi-block read directly into shared buffers.  IIRC, we currently
> >> have support for rings in the buffer pool, which we could read
> >> directly into.  Though, an LRU-based buffer manager design would be
> >> more optimal in this case.
> >
> > True, it is a kludge but if it gives us 95% of the benfit with 10% of
> > the code, it is a win.
> >
> > -- 
> >  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
> >  EnterpriseDB http://enterprisedb.com
> >
> >  + If your life is a hard drive, Christ can be your backup. +
> >
> > -- 
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Stark
I couldn't get async I/O to work on Linux. That is it "worked" but  
performed the same as reading one block at a time. On solaris the  
situation is reversed.


In what way is fadvise a kludge?

greg

On 24 Oct 2008, at 01:44 AM, Bruce Momjian <[EMAIL PROTECTED]> wrote:


Jonah H. Harris wrote:
On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]>  
wrote:
I think the current plan is to use posix_advise() to allow  
parallel I/O,
rather than async I/O becuase posix_advise() will require fewer  
code

changes.


These are not necessarily mutually exclusive designs.  fadvise  
works fine on
Linux, but as far as I know only async I/O works on Solaris.   
Linux also has
an async I/O library, and it's not clear to me yet whether that  
might work

even better than the fadvise approach.


fadvise is a kludge.  While it will help, it still makes us  
completely
reliant on the OS.  For performance reasons, we should be  
supporting a

multi-block read directly into shared buffers.  IIRC, we currently
have support for rings in the buffer pool, which we could read
directly into.  Though, an LRU-based buffer manager design would be
more optimal in this case.


True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.

--
 Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
 EnterpriseDB http://enterprisedb.com

 + If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian
Jonah H. Harris wrote:
> On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
> >> I think the current plan is to use posix_advise() to allow parallel I/O,
> >> rather than async I/O becuase posix_advise() will require fewer code
> >> changes.
> >
> > These are not necessarily mutually exclusive designs.  fadvise works fine on
> > Linux, but as far as I know only async I/O works on Solaris.  Linux also has
> > an async I/O library, and it's not clear to me yet whether that might work
> > even better than the fadvise approach.
> 
> fadvise is a kludge.  While it will help, it still makes us completely
> reliant on the OS.  For performance reasons, we should be supporting a
> multi-block read directly into shared buffers.  IIRC, we currently
> have support for rings in the buffer pool, which we could read
> directly into.  Though, an LRU-based buffer manager design would be
> more optimal in this case.

True, it is a kludge but if it gives us 95% of the benfit with 10% of
the code, it is a win.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Jonah H. Harris
On Thu, Oct 23, 2008 at 4:53 PM, Greg Smith <[EMAIL PROTECTED]> wrote:
>> I think the current plan is to use posix_advise() to allow parallel I/O,
>> rather than async I/O becuase posix_advise() will require fewer code
>> changes.
>
> These are not necessarily mutually exclusive designs.  fadvise works fine on
> Linux, but as far as I know only async I/O works on Solaris.  Linux also has
> an async I/O library, and it's not clear to me yet whether that might work
> even better than the fadvise approach.

fadvise is a kludge.  While it will help, it still makes us completely
reliant on the OS.  For performance reasons, we should be supporting a
multi-block read directly into shared buffers.  IIRC, we currently
have support for rings in the buffer pool, which we could read
directly into.  Though, an LRU-based buffer manager design would be
more optimal in this case.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Greg Smith

On Thu, 23 Oct 2008, Bruce Momjian wrote:


I think the current plan is to use posix_advise() to allow parallel I/O,
rather than async I/O becuase posix_advise() will require fewer code
changes.


These are not necessarily mutually exclusive designs.  fadvise works fine 
on Linux, but as far as I know only async I/O works on Solaris.  Linux 
also has an async I/O library, and it's not clear to me yet whether that 
might work even better than the fadvise approach.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-23 Thread Bruce Momjian
Julius Stroffek wrote:
> Hi All,
> 
> we would like to start some work on improving the performance of
> PostgreSQL in a multi-CPU environment. Dano Vojtek is student at the
> Faculty of Mathematics and Physics of Charles university in Prague
> (http://www.mff.cuni.cz) and he is going to cover this topic in his
> master thesis. He is going to do some investigation in the methods and
> write down the possibilities and then he is going to implement something
> from that for PostgreSQL.
> 
> We want to come out with a serious proposal for this work after
> collecting the feedback/opinions and doing the more serious investigation.

Exciting stuff, and clearly a direction we need to explore.

> Topics that seem to be of interest and most of them were already
> discussed at developers meeting in Ottawa are
> 1.) parallel sorts
> 2.) parallel query execution
> 3.) asynchronous I/O

I think the current plan is to use posix_advise() to allow parallel I/O,
rather than async I/O becuase posix_advise() will require fewer code
changes.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-22 Thread Simon Riggs

On Mon, 2008-10-20 at 21:05 +0200, Julius Stroffek wrote:

> He is going to do some investigation in the methods and
> write down the possibilities and then he is going to implement
> something from that for PostgreSQL.

When will this work be complete? We are days away from completing main
work on 8.4, so you won't get much discussion on this for a few months
yet. Will it be complete in time for 8.5? Or much earlier even?

Julius, you don't mention what your role is in this. In what sense is
Dano's master's thesis a "we" thing?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-21 Thread Myron Scott

I can confirm that bringing Postgres code to multi-thread implementation
requires quite a bit of ground work.  I have been working for a long  
while
with a Postgres 7.* fork that uses pthreads rather than processes.   
The effort

to make all the subsystems thread safe took some time and touched
almost every section of the codebase.

I recently spent some time trying to optimize for Chip Multi-Threading
systems but focused more on total throughput rather than single query
performance.  The biggest wins came from changing some coarse
grained locks in the page buffering system to a finer grained  
implementation.


I also tried to improve single query performance by splitting index and
sequential scans into two threads, one to fault in pages and check tuple
visibility and the other for everything else.  My success was limited  
and
it was hard for me to work the proper costing into the query optimizer  
so

that it fired at the right times.

One place that multiple threads really helped was in index building.

My code is poorly commented and the build system is a mess (I am only
building 64bit SPARC for embedding into another app).  However, I am
using it in production and source is available if it's of any help.

http://weaver2.dev.java.net

Myron Scott



On Oct 20, 2008, at 11:28 PM, Chuck McDevitt wrote:

There is a problem trying to make Postgres do these things in  
Parallel.


The backend code isn’t thread-safe, so doing a multi-thread  
implementation requires quite  a bit of work.


Using multiple processes has its own problems:  The whole way  
locking works equates one process with one transaction (The proc  
table is one entry per process).  Processes would conflict on locks,  
deadlocking themselves, as well as many other problems.


It’s all a good idea, but the work is probably far more than you  
expect.


Async I/O might be easier, if you used pThreads, which is mostly  
portable, but not to all platforms.  (Yes, they do work on Windows)


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
] On Behalf Of Jeffrey Baker

Sent: 2008-10-20 22:25
To: Julius Stroffek
Cc: pgsql-hackers@postgresql.org; Dano Vojtek
Subject: Re: [HACKERS] Multi CPU Queries - Feedback and/or  
suggestions wanted!


On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek <[EMAIL PROTECTED] 
> wrote:

Topics that seem to be of interest and most of them were already
discussed at developers meeting in Ottawa are
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing
[...]
2.)
Different subtrees (or nodes) of the plan could be executed in  
parallel

on different CPUs and the results of this subtrees could be requested
either synchronously or asynchronously.

I don't see why multiple CPUs can't work on the same node of a  
plan.  For instance, consider a node involving a scan with an  
expensive condition, like UTF-8 string length.  If you have four  
CPUs you can bring to bear, each CPU could take every fourth page,  
computing the expensive condition for each tuple in that page.  The  
results of the scan can be retired asynchronously to the next node  
above.


-jwb



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-21 Thread Julius Stroffek

Hi Jeffrey,

thank you for the suggestion. Yes, they potentially can, we'll consider 
this.


Julo

Jeffrey Baker wrote:
I don't see why multiple CPUs can't work on the same node of a plan.  
For instance, consider a node involving a scan with an expensive 
condition, like UTF-8 string length.  If you have four CPUs you can 
bring to bear, each CPU could take every fourth page, computing the 
expensive condition for each tuple in that page.  The results of the 
scan can be retired asynchronously to the next node above.





-jwb


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Chuck McDevitt
There is a problem trying to make Postgres do these things in Parallel.

 

The backend code isn't thread-safe, so doing a multi-thread
implementation requires quite  a bit of work.

 

Using multiple processes has its own problems:  The whole way locking
works equates one process with one transaction (The proc table is one
entry per process).  Processes would conflict on locks, deadlocking
themselves, as well as many other problems.

 

It's all a good idea, but the work is probably far more than you expect.

 

Async I/O might be easier, if you used pThreads, which is mostly
portable, but not to all platforms.  (Yes, they do work on Windows)

 

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Baker
Sent: 2008-10-20 22:25
To: Julius Stroffek
Cc: pgsql-hackers@postgresql.org; Dano Vojtek
Subject: Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions
wanted!

 

On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek
<[EMAIL PROTECTED]> wrote:

Topics that seem to be of interest and most of them were already
discussed at developers meeting in Ottawa are
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing

[...] 

2.)
Different subtrees (or nodes) of the plan could be executed in
parallel
on different CPUs and the results of this subtrees could be
requested
either synchronously or asynchronously.


I don't see why multiple CPUs can't work on the same node of a plan.
For instance, consider a node involving a scan with an expensive
condition, like UTF-8 string length.  If you have four CPUs you can
bring to bear, each CPU could take every fourth page, computing the
expensive condition for each tuple in that page.  The results of the
scan can be retired asynchronously to the next node above.

-jwb



Re: [HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Jeffrey Baker
On Mon, Oct 20, 2008 at 12:05 PM, Julius Stroffek
<[EMAIL PROTECTED]>wrote:

> Topics that seem to be of interest and most of them were already
> discussed at developers meeting in Ottawa are
> 1.) parallel sorts
> 2.) parallel query execution
> 3.) asynchronous I/O
> 4.) parallel COPY
> 5.) parallel pg_dump
> 6.) using threads for parallel processing
>
[...]

> 2.)
> Different subtrees (or nodes) of the plan could be executed in parallel
> on different CPUs and the results of this subtrees could be requested
> either synchronously or asynchronously.


I don't see why multiple CPUs can't work on the same node of a plan.  For
instance, consider a node involving a scan with an expensive condition, like
UTF-8 string length.  If you have four CPUs you can bring to bear, each CPU
could take every fourth page, computing the expensive condition for each
tuple in that page.  The results of the scan can be retired asynchronously
to the next node above.

-jwb


[HACKERS] Multi CPU Queries - Feedback and/or suggestions wanted!

2008-10-20 Thread Julius Stroffek

Hi All,

we would like to start some work on improving the performance of
PostgreSQL in a multi-CPU environment. Dano Vojtek is student at the
Faculty of Mathematics and Physics of Charles university in Prague
(http://www.mff.cuni.cz) and he is going to cover this topic in his
master thesis. He is going to do some investigation in the methods and
write down the possibilities and then he is going to implement something
from that for PostgreSQL.

We want to come out with a serious proposal for this work after
collecting the feedback/opinions and doing the more serious investigation.

Topics that seem to be of interest and most of them were already
discussed at developers meeting in Ottawa are
1.) parallel sorts
2.) parallel query execution
3.) asynchronous I/O
4.) parallel COPY
5.) parallel pg_dump
6.) using threads for parallel processing

A scaling with increasing number of CPUs in 1.) and 2.) will face with
the I/O bottleneck at some point and the benefit gained here should be
nearly the same as for 3.) - the OS or disk could do a better job while
scheduling multiple reads from the disk for the same query at the same time.

1.)
More merges could be executed on different CPUs. However, one N-way
merge on one CPU is probably better than two N/2-way merges on 2 CPUs
while sharing the limit of work_mem together for these. This is specific
and separate from 2.) or 3.) and if something implemented here it could
probably share just the parallel infrastructure code.


2.)
Different subtrees (or nodes) of the plan could be executed in parallel
on different CPUs and the results of this subtrees could be requested
either synchronously or asynchronously.


3.)
The simplest possible way is to change the scan nodes that they will
send out the asynchronous  I/O requests for the next blocks before they
manage to run out of tuples in the block they are going through. The
more advanced way would arise just by implementing 2.) which will then
lead to different scan nodes to be executed on different CPUs at the
same time.


4.) and 5.)
We do not want to focus here, since there are on-going projects already.


6.)
Currently, threads are not used in PostgreSQL (except some cases for 
Windows OS). Generally using them would bring some problems


a) different thread implementations on different OSes
b) crash of the whole process if the problem happens in one thread.
Backends are isolated and the problem in one backend leads to the
graceful shut down of other backends.
c) synchronization problems

* a) seem just to be more for implementation. Is there any problem with 
execution of more threads on any supported OS? Like some planning issue 
that all the threads for the same process end up planned on the same 
CPU? Or something similar?


* b) is fine with using more threads for processing the same query in 
the same backend - if one crashes others could do the graceful shutdown.


* c) does not have to be solved in general because the work of
all the threads will be synchronized and we could expect pretty well 
which data are being accessed by which thread. The memory allocation 
have to be adjusted to be thread safe and should not affect the 
performance (Is different memory context for different threads 
sufficient?). Other common code might need some changes as well. 
Possibly, the synchronization/critical section exclusion could be done 
in executor and only if needed.


* Using processes instead of threads makes other things more complex
  - sharing objects between processes might need much more coding
  - more overhead during execution and synchronization


It seems to that it makes sense to start working on 2) and 3) and we
would like to think of using more threads for processing the same query
within one backend.

We appreciate feedback, comments and/or suggestions.

Cheers

Julo


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers