On Mon, Nov 25, 2019 at 10:11 AM Tomas Vondra wrote:
> On Mon, Nov 25, 2019 at 05:33:35PM +0900, Michael Paquier wrote:
> >On Tue, Sep 10, 2019 at 03:47:51PM +0200, Tomas Vondra wrote:
> >> My feeling is that we should get the BNLJ committed first, and then maybe
> >> use some of those
On Mon, Nov 25, 2019 at 07:11:19PM +0100, Tomas Vondra wrote:
> I'm not planning to do any immediate work on this, so I agree with
> marking it as RWF. I think Melanie is working on the BNL patch, which
> seems like the right solution.
Thanks, I have switched the patch as returned with feedback.
On Tue, Sep 10, 2019 at 03:47:51PM +0200, Tomas Vondra wrote:
> My feeling is that we should get the BNLJ committed first, and then maybe
> use some of those additional strategies as fallbacks (depending on which
> issues are still unsolved by the BNLJ).
The glacier is melting more. Tomas,
On Thu, Sep 05, 2019 at 09:54:33AM -0700, Melanie Plageman wrote:
On Tue, Sep 3, 2019 at 9:36 AM Alvaro Herrera wrote:
On 2019-Jul-11, Tomas Vondra wrote:
> On Wed, Jul 10, 2019 at 04:51:02PM -0700, Melanie Plageman wrote:
> > I think implementing support for parallel hashjoin or explicitly
On 2019-Jul-11, Tomas Vondra wrote:
> On Wed, Jul 10, 2019 at 04:51:02PM -0700, Melanie Plageman wrote:
> > I think implementing support for parallel hashjoin or explicitly
> > disabling it would be the bare minimum for this patch, which is why I
> > made 2 its own item. I've marked it as
On Fri, Jul 12, 2019 at 1:16 AM Robert Haas wrote:
> On Mon, May 6, 2019 at 9:49 PM Thomas Munro wrote:
> > Stepping back a bit, I think there is something fishy about the way we
> > detect extreme skew. Is that a factor in this case? Right now we
> > wait until we have a batch that gets
On Mon, May 6, 2019 at 9:49 PM Thomas Munro wrote:
> Stepping back a bit, I think there is something fishy about the way we
> detect extreme skew. Is that a factor in this case? Right now we
> wait until we have a batch that gets split into child batches
> containing exactly 0% and 100% of the
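The 0%/100% split test described above can be sketched as a toy illustration (hypothetical names, not PostgreSQL source):

```python
# Toy sketch of the extreme-skew test described above (hypothetical names,
# not PostgreSQL source): splitting a batch only helps if repartitioning
# moves some tuples out; a 0%/100% split means the keys are duplicates
# that no amount of batch-doubling can separate.

def split_is_useless(parent_tuples, nbatch, batchno, hash_fn=hash):
    """After doubling nbatch, did every tuple stay in (or leave) this batch?"""
    kept = sum(1 for t in parent_tuples if hash_fn(t) % nbatch == batchno)
    moved = len(parent_tuples) - kept
    return kept == 0 or moved == 0   # exactly 0% or 100% stayed behind

# A batch full of one duplicated key: splitting is useless.
print(split_is_useless([42] * 1000, nbatch=8, batchno=42 % 8))   # True
# A batch of distinct keys: splitting spreads the tuples out.
print(split_is_useless(list(range(1000)), nbatch=8, batchno=2))  # False
```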
On Wed, Jul 10, 2019 at 04:51:02PM -0700, Melanie Plageman wrote:
Okay, so, while I do have specific, actual code review/commitfest-y
feedback for the patch in this thread registered for this commitfest,
I wanted to defer that for a later email and use this one to cover off
on a few higher level
On Tue, May 28, 2019 at 03:40:01PM +0800, Hubert Zhang wrote:
On Sat, May 4, 2019 at 8:34 AM Tomas Vondra wrote:
Hi Tomas,
I read your second patch, which uses overflow buf files to reduce the total
number of batches.
It would solve the hash join OOM problem that you discussed above: 8K per
Hi Tomas,
Here is the patch; it could be compatible with your patch, and it focuses on
when to regrow the batch.
On Tue, May 28, 2019 at 3:40 PM Hubert Zhang wrote:
> On Sat, May 4, 2019 at 8:34 AM Tomas Vondra wrote:
>
>> The root cause is that hash join treats batches as pretty much free,
On Sat, May 4, 2019 at 8:34 AM Tomas Vondra wrote:
> The root cause is that hash join treats batches as pretty much free, but
> that's not really true - we do allocate two BufFile structs per batch,
> and each BufFile is ~8kB as it includes PGAlignedBuffer.
>
> The OOM is not very surprising,
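The arithmetic behind that quote is easy to sketch; the ~8 kB per BufFile figure comes from the message above, while the batch counts below are hypothetical, chosen only to show the growth:

```python
# Back-of-envelope sketch of the overhead quoted above: two BufFile structs
# per batch, each ~8 kB because of the embedded PGAlignedBuffer. The batch
# counts are hypothetical, chosen only to show the growth.

BUFFILE_BYTES = 8 * 1024           # ~8 kB per BufFile (PGAlignedBuffer)

def buffile_overhead(nbatch):
    """Bytes spent on BufFile structs alone: inner + outer file per batch."""
    return 2 * nbatch * BUFFILE_BYTES

for nbatch in (32, 4096, 131072):
    print(f"{nbatch:>7} batches -> {buffile_overhead(nbatch) // (1024*1024)} MB")
# 131072 batches already cost 2 GB of BufFiles, regardless of work_mem.
```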
On Tue, May 21, 2019 at 05:38:50PM -0700, Melanie Plageman wrote:
On Wed, May 8, 2019 at 8:08 AM Tomas Vondra wrote:
On Tue, May 07, 2019 at 05:30:27PM -0700, Melanie Plageman wrote:
> One thing I was a little confused by was the nbatch_inmemory member
> of the hashtable. The comment in
On Wed, May 8, 2019 at 8:08 AM Tomas Vondra wrote:
> On Tue, May 07, 2019 at 05:30:27PM -0700, Melanie Plageman wrote:
> > One thing I was a little confused by was the nbatch_inmemory member
> > of the hashtable. The comment in ExecChooseHashTableSize says that
> > it is determining the
On Tue, May 07, 2019 at 05:30:27PM -0700, Melanie Plageman wrote:
On Mon, May 6, 2019 at 8:15 PM Tomas Vondra wrote:
Nope, that's not how it works. It's the array of batches that gets
sliced, not the batches themselves.
It does slightly increase the amount of data we need to
On Tue, May 07, 2019 at 05:43:56PM -0700, Melanie Plageman wrote:
On Tue, May 7, 2019 at 6:59 AM Tomas Vondra wrote:
On Tue, May 07, 2019 at 04:28:36PM +1200, Thomas Munro wrote:
>On Tue, May 7, 2019 at 3:15 PM Tomas Vondra wrote:
>> On Tue, May 07, 2019 at 01:48:40PM
On Tue, May 7, 2019 at 6:59 AM Tomas Vondra wrote:
> On Tue, May 07, 2019 at 04:28:36PM +1200, Thomas Munro wrote:
> >On Tue, May 7, 2019 at 3:15 PM Tomas Vondra wrote:
> >> On Tue, May 07, 2019 at 01:48:40PM +1200, Thomas Munro wrote:
> >> Switching to some other algorithm during execution
On Mon, May 6, 2019 at 8:15 PM Tomas Vondra wrote:
> Nope, that's not how it works. It's the array of batches that gets
> sliced, not the batches themselves.
>
> It does slightly increase the amount of data we need to shuffle between
> the temp files, because we can't write the data directly to
On Tue, May 07, 2019 at 10:42:36AM -0400, Tom Lane wrote:
Tomas Vondra writes:
On Mon, May 06, 2019 at 11:18:28PM -0400, Tom Lane wrote:
Tomas Vondra writes:
Do we actually check how many duplicates are there during planning?
Certainly that's part of the planner's cost estimates ... but
Tomas Vondra writes:
> On Mon, May 06, 2019 at 11:18:28PM -0400, Tom Lane wrote:
>> Tomas Vondra writes:
>>> Do we actually check how many duplicates are there during planning?
>> Certainly that's part of the planner's cost estimates ... but it's
>> only as good as the planner's statistical
On Tue, May 07, 2019 at 04:28:36PM +1200, Thomas Munro wrote:
On Tue, May 7, 2019 at 3:15 PM Tomas Vondra wrote:
On Tue, May 07, 2019 at 01:48:40PM +1200, Thomas Munro wrote:
>Seems expensive for large numbers of slices -- you need to join the
>outer batch against each inner slice.
Nope,
On Mon, May 06, 2019 at 11:18:28PM -0400, Tom Lane wrote:
Tomas Vondra writes:
Do we actually check how many duplicates are there during planning?
Certainly that's part of the planner's cost estimates ... but it's
only as good as the planner's statistical knowledge.
I'm looking at the
On Tue, May 7, 2019 at 3:15 PM Tomas Vondra wrote:
> On Tue, May 07, 2019 at 01:48:40PM +1200, Thomas Munro wrote:
> >Seems expensive for large numbers of slices -- you need to join the
> >outer batch against each inner slice.
>
> Nope, that's not how it works. It's the array of batches that gets
Tomas Vondra writes:
> Do we actually check how many duplicates are there during planning?
Certainly that's part of the planner's cost estimates ... but it's
only as good as the planner's statistical knowledge.
regards, tom lane
On Tue, May 07, 2019 at 01:48:40PM +1200, Thomas Munro wrote:
On Tue, May 7, 2019 at 9:58 AM Melanie Plageman wrote:
On Fri, May 3, 2019 at 5:34 PM Tomas Vondra wrote:
The second patch tries to enforce work_mem more strictly. That would be
impossible if we were to keep all the BufFile
On Tue, May 7, 2019 at 9:58 AM Melanie Plageman wrote:
> On Fri, May 3, 2019 at 5:34 PM Tomas Vondra wrote:
>> The second patch tries to enforce work_mem more strictly. That would be
>> impossible if we were to keep all the BufFile structs in memory, so
>> instead it slices the batches into
On Fri, May 3, 2019 at 5:34 PM Tomas Vondra wrote:
>
> The second patch tries to enforce work_mem more strictly. That would be
> impossible if we were to keep all the BufFile structs in memory, so
> instead it slices the batches into chunks that fit into work_mem, and
> then uses a single
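A rough sketch of that slicing idea (invented names and numbers, not the actual patch, which works at the BufFile level in C):

```python
# Rough sketch of the slicing described above (invented names, not the actual
# patch): keep BufFiles only for the current slice of the batch array, sized
# so the open files fit in work_mem; tuples bound for batches outside the
# slice go through shared temp files and are re-shuffled when their slice
# becomes current -- hence the extra data movement mentioned in the thread.

WORK_MEM = 4 * 1024 * 1024         # hypothetical work_mem budget in bytes
BUFFILE_BYTES = 8 * 1024           # ~8 kB per BufFile, two per batch

def batches_per_slice(work_mem=WORK_MEM):
    # how many batches' worth of (inner, outer) BufFiles fit in work_mem
    return max(1, work_mem // (2 * BUFFILE_BYTES))

def slice_of(batchno, work_mem=WORK_MEM):
    # which slice of the batch array a given batch number falls into
    return batchno // batches_per_slice(work_mem)

print(batches_per_slice())         # 256 batches per slice
print(slice_of(1_000_000))         # a million batches -> slice 3906
```

Only one slice's worth of BufFiles is ever allocated, so the struct overhead stays bounded by work_mem no matter how far nbatch grows.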
Hi,
I'm starting this thread mostly to keep track of patches developed in
response to issue [1] reported on pgsql-performance. The symptoms are
very simple - a query performing a hash join ends up using much more
memory than expected (pretty much ignoring work_mem), and possibly
ending up with