subject:"Re\: \[HACKERS\] Parallel query and temp_file

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-07 Thread Robert Haas

On Tue, Jul 5, 2016 at 3:59 PM, Peter Geoghegan  wrote:
> On Tue, Jul 5, 2016 at 12:58 PM, Tom Lane  wrote:
>> Perhaps we could change the wording of temp_file_limit's description
>> from "space that a session can use" to "space that a process can use"
>> to help clarify this?
>
> That's all that I was looking for, really.

OK, done that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Tom Lane

Peter Geoghegan  writes:
> On Tue, Jul 5, 2016 at 12:00 PM, Robert Haas  wrote:
>> I think that it is not worth mentioning specifically for
>> temp_file_limit; to me that seems to be a hole with no bottom.  We'll
>> end up arguing about which GUCs should mention it specifically and
>> there will be no end to it.

> I don't think that you need it for any other GUC, so I really don't
> know why you're concerned about a slippery slope.

FWIW, I agree with Robert on this.  It seems just weird to call out
temp_file_limit specifically.  Also, I don't agree that that's the
only interesting per-process resource consumption; max_files_per_process
seems much more likely to cause trouble in practice.

Perhaps we could change the wording of temp_file_limit's description
from "space that a session can use" to "space that a process can use"
to help clarify this?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan

On Tue, Jul 5, 2016 at 12:58 PM, Tom Lane  wrote:
> Perhaps we could change the wording of temp_file_limit's description
> from "space that a session can use" to "space that a process can use"
> to help clarify this?

That's all that I was looking for, really.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan

On Tue, Jul 5, 2016 at 12:00 PM, Robert Haas  wrote:
> I think that it is not worth mentioning specifically for
> temp_file_limit; to me that seems to be a hole with no bottom.  We'll
> end up arguing about which GUCs should mention it specifically and
> there will be no end to it.

I don't think that you need it for any other GUC, so I really don't
know why you're concerned about a slippery slope. The only other
resource GUC that is scoped per session that I can see is
temp_buffers, but that doesn't need to change, since parallel workers
cannot use temp_buffers directly in practice. max_files_per_process is
already clearly per process, so no change needed there either.

I don't see a case other than temp_file_limit that appears to be even
marginally in need of a specific note.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Robert Haas

On Tue, Jul 5, 2016 at 1:58 PM, Peter Geoghegan  wrote:
> On Tue, Jul 5, 2016 at 7:45 AM, Robert Haas  wrote:
>> Since Peter doesn't seem in a hurry to produce a patch for this issue,
>> I wrote one.  It is attached.  I'll commit this in a day or two if
>> nobody objects.
>
> Sorry about the delay.
>
> Your patch seems reasonable, but I thought we'd also want to change
> "per session" to "per session (with an additional temp_file_limit
> allowance within each parallel worker)" for temp_file_limit.
>
> I think it's worthwhile noting this for temp_file_limit specifically,
> since it's explicitly a per session limit, whereas users are quite
> used to the idea that work_mem might be doled out multiple times for
> multiple executor nodes.

I think that it is not worth mentioning specifically for
temp_file_limit; to me that seems to be a hole with no bottom.  We'll
end up arguing about which GUCs should mention it specifically and
there will be no end to it.

We can see what other people think, but that's my position.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Peter Geoghegan

On Tue, Jul 5, 2016 at 7:45 AM, Robert Haas  wrote:
> Since Peter doesn't seem in a hurry to produce a patch for this issue,
> I wrote one.  It is attached.  I'll commit this in a day or two if
> nobody objects.

Sorry about the delay.

Your patch seems reasonable, but I thought we'd also want to change
"per session" to "per session (with an additional temp_file_limit
allowance within each parallel worker)" for temp_file_limit.

I think it's worthwhile noting this for temp_file_limit specifically,
since it's explicitly a per session limit, whereas users are quite
used to the idea that work_mem might be doled out multiple times for
multiple executor nodes.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-07-05 Thread Robert Haas

On Tue, Jun 21, 2016 at 8:15 AM, Robert Haas  wrote:
> On Mon, Jun 20, 2016 at 11:01 PM, Tom Lane  wrote:
>> Peter Geoghegan  writes:
>>> On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
 What I'm tempted to do is trying to document that, as a point of
 policy, parallel query in 9.6 uses up to (workers + 1) times the
 resources that a single session might use.  That includes not only CPU
 but also things like work_mem and temp file space.  This obviously
 isn't ideal, but it's what could be done by the ship date.
>>
>>> Where would that be documented, though? Would it need to be noted in
>>> the case of each such GUC?
>>
>> Why can't we just note this in the number-of-workers GUCs?  It's not like
>> there even *is* a GUC for many of our per-process resource consumption
>> behaviors.
>
> +1.

Since Peter doesn't seem in a hurry to produce a patch for this issue,
I wrote one.  It is attached.  I'll commit this in a day or two if
nobody objects.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-workers-guc-doc.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-21 Thread Robert Haas

On Mon, Jun 20, 2016 at 11:01 PM, Tom Lane  wrote:
> Peter Geoghegan  writes:
>> On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
>>> What I'm tempted to do is trying to document that, as a point of
>>> policy, parallel query in 9.6 uses up to (workers + 1) times the
>>> resources that a single session might use.  That includes not only CPU
>>> but also things like work_mem and temp file space.  This obviously
>>> isn't ideal, but it's what could be done by the ship date.
>
>> Where would that be documented, though? Would it need to be noted in
>> the case of each such GUC?
>
> Why can't we just note this in the number-of-workers GUCs?  It's not like
> there even *is* a GUC for many of our per-process resource consumption
> behaviors.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-20 Thread Tom Lane

Peter Geoghegan  writes:
> On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
>> What I'm tempted to do is trying to document that, as a point of
>> policy, parallel query in 9.6 uses up to (workers + 1) times the
>> resources that a single session might use.  That includes not only CPU
>> but also things like work_mem and temp file space.  This obviously
>> isn't ideal, but it's what could be done by the ship date.

> Where would that be documented, though? Would it need to be noted in
> the case of each such GUC?

Why can't we just note this in the number-of-workers GUCs?  It's not like
there even *is* a GUC for many of our per-process resource consumption
behaviors.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-20 Thread Peter Geoghegan

On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
> What I'm tempted to do is trying to document that, as a point of
> policy, parallel query in 9.6 uses up to (workers + 1) times the
> resources that a single session might use.  That includes not only CPU
> but also things like work_mem and temp file space.  This obviously
> isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-07 Thread Peter Geoghegan

On Tue, Jun 7, 2016 at 8:32 AM, Robert Haas  wrote:
> You previously offered to write a patch for this.  Are you still
> planning to do that?

OK, I'll get to that in the next few days.

I'm slightly concerned that I might have missed a real problem in the
code. I'll need to examine the issue more closely.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-07 Thread Robert Haas

On Sun, Jun 5, 2016 at 4:32 PM, Peter Geoghegan  wrote:
> On Wed, May 18, 2016 at 12:01 PM, Peter Geoghegan  wrote:
>>> I think for 9.6 we just have to document this issue.  In the next
>>> release, we could (and might well want to) try to do something more
>>> clever.
>>
>> Works for me. You may wish to update comments within fd.c at the same time.
>
> I've created a 9.6 open issue for this.

You previously offered to write a patch for this.  Are you still
planning to do that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-06-05 Thread Peter Geoghegan

On Wed, May 18, 2016 at 12:01 PM, Peter Geoghegan  wrote:
>> I think for 9.6 we just have to document this issue.  In the next
>> release, we could (and might well want to) try to do something more
>> clever.
>
> Works for me. You may wish to update comments within fd.c at the same time.

I've created a 9.6 open issue for this.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-18 Thread Peter Geoghegan

On Wed, May 18, 2016 at 3:40 AM, Robert Haas  wrote:
>> I'll write a patch to fix the issue, if there is a consensus on a solution.
>
> I think for 9.6 we just have to document this issue.  In the next
> release, we could (and might well want to) try to do something more
> clever.

Works for me. You may wish to update comments within fd.c at the same time.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-18 Thread David Rowley

On 18 May 2016 at 22:40, Robert Haas  wrote:

> On Tue, May 17, 2016 at 6:40 PM, Peter Geoghegan  wrote:
> > On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan  wrote:
> >> Fundamentally, since temporary_files_size enforcement simply
> >> piggy-backs on low-level fd.c file management, without any
> >> consideration of what the temp files contain, it'll be hard to be sure
> >> that parallel workers will not have issues. I think it'll be far
> >> easier to fix the problem then it would be to figure out if it's
> >> possible to get away with it.
> >
> > I'll write a patch to fix the issue, if there is a consensus on a
> solution.
>
> I think for 9.6 we just have to document this issue.  In the next
> release, we could (and might well want to) try to do something more
> clever.
>
> What I'm tempted to do is trying to document that, as a point of
> policy, parallel query in 9.6 uses up to (workers + 1) times the
> resources that a single session might use.  That includes not only CPU
> but also things like work_mem and temp file space.  This obviously
> isn't ideal, but it's what could be done by the ship date.
>

I was asked (internally I believe) about abuse of work_mem during my work
on parallel aggregates, at the time I didn't really feel like I was abusing
that any more than parallel hash join was.  My thought was that one day it
would be nice if work_mem could be granted to a query and we had some query
marshal system which ensured that the total grants did not exceed the
server wide memory dedicated to work_mem. Of course that's lots of work, as
there's at least one node (HashAgg) which can still blow out work_mem for
bad estimates. For this release, I assumed it wouldn't be too big an issue
if we're shipping with max_parallel_degree = 0 as we could just decorate
the docs with some warnings about work_mem is per node / per worker to
caution users setting this setting any higher. That might be enough to give
us wriggle from for the future where we can make improvements, so I agree
with Robert, the docs seem like the best solution for 9.6.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-18 Thread Robert Haas

On Tue, May 17, 2016 at 6:40 PM, Peter Geoghegan  wrote:
> On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan  wrote:
>> Fundamentally, since temporary_files_size enforcement simply
>> piggy-backs on low-level fd.c file management, without any
>> consideration of what the temp files contain, it'll be hard to be sure
>> that parallel workers will not have issues. I think it'll be far
>> easier to fix the problem then it would be to figure out if it's
>> possible to get away with it.
>
> I'll write a patch to fix the issue, if there is a consensus on a solution.

I think for 9.6 we just have to document this issue.  In the next
release, we could (and might well want to) try to do something more
clever.

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use.  That includes not only CPU
but also things like work_mem and temp file space.  This obviously
isn't ideal, but it's what could be done by the ship date.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-17 Thread Peter Geoghegan

On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan  wrote:
> Fundamentally, since temporary_files_size enforcement simply
> piggy-backs on low-level fd.c file management, without any
> consideration of what the temp files contain, it'll be hard to be sure
> that parallel workers will not have issues. I think it'll be far
> easier to fix the problem then it would be to figure out if it's
> possible to get away with it.

I'll write a patch to fix the issue, if there is a consensus on a solution.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-17 Thread Peter Geoghegan

On Tue, May 17, 2016 at 1:53 PM, Amit Kapila  wrote:
> What kind of special treatment are you expecting for temporary_files_size,
> also why do you think it is required?  Currently neither we build hash in
> parallel nor there is any form of parallel sort work.

I expect only that temporary_files_size be described accurately, and
have new behavior for parallel query that is not surprising. There are
probably several solutions that would meet that standard, and I am not
attached to any particular one of them.

I wrote a parallel sort patch already (CREATE INDEX for the B-Tree
AM), and will post it at an opportune time. So, I think we can expect
your observations about there not being parallel sort work to no
longer apply in a future release, which we should get ahead of now.
Also, won't parallel workers that build their own copy of the hash
table (for a hash join) also use their own temp files, if there is a
need for temp files? I think parallel query will end up sharing temp
files fairly often, and not just out of convenience to implementers
(that is, not just to avoid using shared memory extensively).

Fundamentally, since temporary_files_size enforcement simply
piggy-backs on low-level fd.c file management, without any
consideration of what the temp files contain, it'll be hard to be sure
that parallel workers will not have issues. I think it'll be far
easier to fix the problem then it would be to figure out if it's
possible to get away with it.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Parallel query and temp_file_limit

2016-05-17 Thread Amit Kapila

On Wed, May 18, 2016 at 12:55 AM, Peter Geoghegan  wrote:
>
> temp_file_limit "specifies the maximum amount of disk space that a
> session can use for temporary files, such as sort and hash temporary
> files", according to the documentation. That's not true when parallel
> query is in use, since the global variable temporary_files_size
> receives no special treatment for parallel query.
>

What kind of special treatment are you expecting for temporary_files_size,
also why do you think it is required?  Currently neither we build hash in
parallel nor there is any form of parallel sort work.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

Re: [HACKERS] Parallel query and temp_file_limit

19 matches

Site Navigation

Mail list logo

Footer information