On 07/20/2014 07:17 PM, Tomas Vondra wrote:
On 19.7.2014 20:24, Tomas Vondra wrote:
On 13.7.2014 21:32, Tomas Vondra wrote:
The current patch only implements this for tuples in the main
hash table, not for skew buckets. I plan to do that, but it will
require separate chunks for each skew
On 20 August 2014, 14:05, Heikki Linnakangas wrote:
On 07/20/2014 07:17 PM, Tomas Vondra wrote:
On 19.7.2014 20:24, Tomas Vondra wrote:
On 13.7.2014 21:32, Tomas Vondra wrote:
The current patch only implements this for tuples in the main
hash table, not for skew buckets. I plan to do that,
On 20.7.2014 00:12, Tomas Vondra wrote:
On 19.7.2014 23:07, Tomas Vondra wrote:
On 19.7.2014 20:28, Tomas Vondra wrote:
For the first case, a WARNING at the end of estimate_hash_bucketsize
says this:
WARNING: nbuckets=8388608.00 estfract=0.01
WARNING: nbuckets=65536.00
On 19.7.2014 20:24, Tomas Vondra wrote:
On 13.7.2014 21:32, Tomas Vondra wrote:
The current patch only implements this for tuples in the main
hash table, not for skew buckets. I plan to do that, but it will
require separate chunks for each skew bucket (so we can remove it
without messing
On 14.7.2014 06:29, Stephen Frost wrote:
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
On 6.7.2014 17:57, Stephen Frost wrote:
* Tomas Vondra (t...@fuzzy.cz) wrote:
I can't find the thread / test cases in the archives. I've found this
thread in hackers:
Tomas Vondra t...@fuzzy.cz writes:
I've reviewed the two test cases mentioned here, and sadly there's
nothing that can be 'fixed' by this patch. The problem here lies in the
planning stage, which decides to hash the large table - we can't fix
that in the executor.
We've heard a couple reports
On 13.7.2014 21:32, Tomas Vondra wrote:
The current patch only implements this for tuples in the main hash
table, not for skew buckets. I plan to do that, but it will require
separate chunks for each skew bucket (so we can remove it without
messing with all of them). The chunks for skew
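The dense allocation being discussed stores tuples back-to-back in large chunks instead of allocating each one separately, so an entire batch can be released by walking the chunk list. A minimal standalone sketch of the idea (the struct and function names are illustrative, not the actual patch's):

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_SIZE (32 * 1024)      /* one dense chunk, e.g. 32 kB */

/* Tuples are copied back-to-back into large chunks; the chunks form a
 * linked list, so releasing a batch is one walk over that list instead
 * of one free() per tuple. */
typedef struct Chunk {
    struct Chunk *next;
    size_t        used;             /* bytes already handed out */
    char          data[CHUNK_SIZE];
} Chunk;

typedef struct {
    Chunk *chunks;                  /* head of the chunk list */
} DenseAlloc;

/* Carve 'size' bytes from the current chunk; start a new chunk when full. */
static void *dense_alloc(DenseAlloc *da, size_t size)
{
    if (da->chunks == NULL || da->chunks->used + size > CHUNK_SIZE) {
        Chunk *c = malloc(sizeof(Chunk));
        c->next = da->chunks;
        c->used = 0;
        da->chunks = c;
    }
    void *p = da->chunks->data + da->chunks->used;
    da->chunks->used += size;
    return p;
}

/* Free everything at once -- no per-tuple bookkeeping needed. */
static void dense_free_all(DenseAlloc *da)
{
    while (da->chunks != NULL) {
        Chunk *next = da->chunks->next;
        free(da->chunks);
        da->chunks = next;
    }
}
```

This also makes the skew-bucket complication visible: removing one skew bucket's tuples early only works if they sit in chunks of their own, which is why the snippet above talks about separate chunks per skew bucket.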
On 19.7.2014 20:24, Tom Lane wrote:
Tomas Vondra t...@fuzzy.cz writes:
I've reviewed the two test cases mentioned here, and sadly there's
nothing that can be 'fixed' by this patch. The problem here lies in the
planning stage, which decides to hash the large table - we can't fix
that in the
On 19.7.2014 20:28, Tomas Vondra wrote:
On 19.7.2014 20:24, Tom Lane wrote:
Tomas Vondra t...@fuzzy.cz writes:
I've reviewed the two test cases mentioned here, and sadly there's
nothing that can be 'fixed' by this patch. The problem here lies in the
planning stage, which decides to hash the
On 19.7.2014 23:07, Tomas Vondra wrote:
On 19.7.2014 20:28, Tomas Vondra wrote:
For the first case, a WARNING at the end of estimate_hash_bucketsize
says this:
WARNING: nbuckets=8388608.00 estfract=0.01
WARNING: nbuckets=65536.00 estfract=0.000267
There are 4.3M rows in the
On 12 July 2014 12:43, Tomas Vondra t...@fuzzy.cz wrote:
So let's just get this change done and then do more later.
There's no way back, sadly. The dense allocation turned into a
challenge. I like challenges. I have to solve it or I won't be able to
sleep.
I admire your tenacity, but how about
On 13.7.2014 12:27, Simon Riggs wrote:
On 12 July 2014 12:43, Tomas Vondra t...@fuzzy.cz wrote:
So let's just get this change done and then do more later.
There's no way back, sadly. The dense allocation turned into a
challenge. I like challenges. I have to solve it or I won't be able
to
On 11.7.2014 19:25, Tomas Vondra wrote:
2) walking through the tuples sequentially
--
The other option is not to walk the tuples bucket by bucket, but to
walk through the chunks - we know the tuples are stored as
HashJoinTuple/MinimalTuple, so it
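Walking the chunks rather than the buckets can be sketched as follows - a standalone illustration with length-prefixed records standing in for MinimalTuple (the names are hypothetical, not the executor's):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096

typedef struct Chunk {
    struct Chunk *next;
    size_t        used;
    char          data[CHUNK_SIZE];
} Chunk;

/* Store a tuple as a length prefix plus payload, packed into the
 * current chunk (a new chunk is started when the payload won't fit). */
static void append_tuple(Chunk **head, const void *payload, size_t len)
{
    if (*head == NULL || (*head)->used + sizeof(len) + len > CHUNK_SIZE) {
        Chunk *c = calloc(1, sizeof(Chunk));
        c->next = *head;
        *head = c;
    }
    memcpy((*head)->data + (*head)->used, &len, sizeof(len));
    memcpy((*head)->data + (*head)->used + sizeof(len), payload, len);
    (*head)->used += sizeof(len) + len;
}

/* Visit every tuple by scanning the chunks sequentially -- no bucket
 * pointers involved, just length-prefix hops within each chunk. */
static int count_tuples(const Chunk *head)
{
    int count = 0;
    for (const Chunk *c = head; c != NULL; c = c->next) {
        size_t off = 0;
        while (off < c->used) {
            size_t len;
            memcpy(&len, c->data + off, sizeof(len));
            off += sizeof(len) + len;
            count++;
        }
    }
    return count;
}
```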
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
On 6.7.2014 17:57, Stephen Frost wrote:
* Tomas Vondra (t...@fuzzy.cz) wrote:
I can't find the thread / test cases in the archives. I've found this
thread in hackers:
On 11 July 2014 18:25, Tomas Vondra t...@fuzzy.cz wrote:
Turns out getting this working properly will be quite complicated.
Let's keep this patch simple then. Later research can be another patch.
In terms of memory pressure, having larger joins go 4x faster has a
much more significant reducing
On 12.7.2014 11:39, Simon Riggs wrote:
On 11 July 2014 18:25, Tomas Vondra t...@fuzzy.cz wrote:
Turns out getting this working properly will be quite complicated.
Let's keep this patch simple then. Later research can be another patch.
Well, the dense allocation is independent of the
On 9 July 2014 18:54, Tomas Vondra t...@fuzzy.cz wrote:
(1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
of batches this requires)
If we start off by assuming NTUP_PER_BUCKET = 1, how much memory does
it save to recalculate the hash bucket at 10 instead?
Resizing sounds
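The memory at stake here is the bucket array itself: one pointer per bucket, with nbuckets derived from the tuple count divided by the target tuples-per-bucket and rounded up to a power of two. A small standalone helper (illustrative only; the executor's actual sizing logic differs in details) shows the arithmetic - with 8-byte pointers, 1M tuples at NTUP_PER_BUCKET=1 means an 8 MB bucket array versus 1 MB at NTUP_PER_BUCKET=10:

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next power of two, as the executor does for nbuckets. */
static uint64_t next_pow2(uint64_t n)
{
    uint64_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* The bucket array costs one pointer per bucket; nbuckets is the tuple
 * count divided by the target tuples-per-bucket, rounded up. */
static uint64_t bucket_array_bytes(uint64_t ntuples, uint64_t ntup_per_bucket)
{
    uint64_t nbuckets = next_pow2((ntuples + ntup_per_bucket - 1) / ntup_per_bucket);
    return nbuckets * sizeof(void *);
}
```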
On 11 July 2014, 9:27, Simon Riggs wrote:
On 9 July 2014 18:54, Tomas Vondra t...@fuzzy.cz wrote:
(1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
of batches this requires)
If we start off by assuming NTUP_PER_BUCKET = 1, how much memory does
it save to
On 11 July 2014 10:23, Tomas Vondra t...@fuzzy.cz wrote:
On 11 July 2014, 9:27, Simon Riggs wrote:
On 9 July 2014 18:54, Tomas Vondra t...@fuzzy.cz wrote:
(1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
of batches this requires)
If we start off by assuming
On 10.7.2014 21:33, Tomas Vondra wrote:
On 9.7.2014 16:07, Robert Haas wrote:
On Tue, Jul 8, 2014 at 5:16 PM, Tomas Vondra t...@fuzzy.cz wrote:
Thinking about this a bit more, do we really need to build the hash
table on the first pass? Why not do this:
(1) batching
- read the
On 9.7.2014 16:07, Robert Haas wrote:
On Tue, Jul 8, 2014 at 5:16 PM, Tomas Vondra t...@fuzzy.cz wrote:
Thinking about this a bit more, do we really need to build the hash
table on the first pass? Why not do this:
(1) batching
- read the tuples, stuff them into a simple list
-
On Tue, Jul 8, 2014 at 5:16 PM, Tomas Vondra t...@fuzzy.cz wrote:
Thinking about this a bit more, do we really need to build the hash
table on the first pass? Why not do this:
(1) batching
- read the tuples, stuff them into a simple list
- don't build the hash table yet
(2)
On 9.7.2014 16:07, Robert Haas wrote:
On Tue, Jul 8, 2014 at 5:16 PM, Tomas Vondra t...@fuzzy.cz wrote:
Thinking about this a bit more, do we really need to build the
hash table on the first pass? Why not do this:
(1) batching - read the tuples, stuff them into a simple list -
don't
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
I propose dynamically increasing nbuckets (up to NTUP_PER_BUCKET=1)
once the table is built and there's free space in work_mem. The patch
mentioned above makes implementing this possible / rather simple.
Another idea would be
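The dynamic nbuckets increase amounts to reallocating the bucket array and redistributing entries by their stored hash values, so no tuple data needs rehashing. A hedged standalone sketch of that rebuild step (not the patch's actual code):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Entry {
    struct Entry *next;
    unsigned      hash;   /* full hash value kept alongside the tuple */
} Entry;

/* Grow the bucket array and redistribute entries using their stored
 * hash values; nothing is rehashed from the tuple data itself.
 * new_n must be a power of two. */
static Entry **resize_buckets(Entry **old, unsigned old_n, unsigned new_n)
{
    Entry **buckets = calloc(new_n, sizeof(Entry *));
    for (unsigned i = 0; i < old_n; i++) {
        Entry *e = old[i];
        while (e != NULL) {
            Entry   *next = e->next;
            unsigned b = e->hash & (new_n - 1);
            e->next = buckets[b];
            buckets[b] = e;
            e = next;
        }
    }
    free(old);
    return buckets;
}
```

The rebuild is a single pass over the existing entries, which is why it is cheap enough to do opportunistically once the table is loaded and work_mem headroom is known.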
On 8 July 2014, 14:49, Robert Haas wrote:
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
I propose dynamically increasing nbuckets (up to NTUP_PER_BUCKET=1)
once the table is built and there's free space in work_mem. The patch
mentioned above makes implementing this
On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
On 8 July 2014, 14:49, Robert Haas wrote:
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
I propose dynamically increasing nbuckets (up to NTUP_PER_BUCKET=1)
once the table is built and there's free space
On 8 July 2014, 16:16, Robert Haas wrote:
On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
relations it may be way cheaper to use higher NTUP_PER_BUCKET values
instead of increasing the number of
On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote:
On 8 July 2014, 16:16, Robert Haas wrote:
On Tue, Jul 8, 2014 at 9:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large outer
relations it may be way cheaper to use
On 8.7.2014 19:00, Robert Haas wrote:
On Tue, Jul 8, 2014 at 12:06 PM, Tomas Vondra t...@fuzzy.cz wrote:
On 8 July 2014, 16:16, Robert Haas wrote:
Right, I think that's clear. I'm just pointing out that you get
to decide: you can either start with a larger NTUP_PER_BUCKET and
then reduce
On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
On 8 July 2014, 14:49, Robert Haas wrote:
On Wed, Jul 2, 2014 at 8:13 PM, Tomas Vondra t...@fuzzy.cz wrote:
I propose dynamically increasing nbuckets (up to NTUP_PER_BUCKET=1)
once the table is built and there's free
On 8.7.2014 21:53, Jeff Janes wrote:
On Tue, Jul 8, 2014 at 6:35 AM, Tomas Vondra t...@fuzzy.cz wrote:
Maybe. I'm not against setting NTUP_PER_BUCKET=1, but with large
outer relations it may be way cheaper to use higher NTUP_PER_BUCKET
values instead of increasing the number of batches
Hi,
Thinking about this a bit more, do we really need to build the hash
table on the first pass? Why not do this:
(1) batching
- read the tuples, stuff them into a simple list
- don't build the hash table yet
(2) building the hash table
- we have all the tuples in a simple list,
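The two-pass approach above - collect first, build later - lets the bucket array be sized from the exact tuple count instead of an estimate. A simplified standalone sketch of pass 2 (illustrative names, not the executor's code):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Tuple {
    struct Tuple *next;   /* bucket chain link, unused during pass 1 */
    unsigned      hash;
} Tuple;

/* Pass 2: the tuple count is now known, so the bucket array can be
 * sized for NTUP_PER_BUCKET = 1 up front and filled in one sweep. */
static Tuple **build_table(Tuple *tuples, unsigned ntuples,
                           unsigned *nbuckets_out)
{
    unsigned nbuckets = 1;
    while (nbuckets < ntuples)
        nbuckets <<= 1;
    Tuple **buckets = calloc(nbuckets, sizeof(Tuple *));
    for (unsigned i = 0; i < ntuples; i++) {
        unsigned b = tuples[i].hash & (nbuckets - 1);
        tuples[i].next = buckets[b];
        buckets[b] = &tuples[i];
    }
    *nbuckets_out = nbuckets;
    return buckets;
}
```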
On 6.7.2014 06:47, Stephen Frost wrote:
* Greg Stark (st...@mit.edu) wrote:
Last time we wanted to use bloom filters in hash joins to
filter out tuples that won't match any of the future hash batches
to reduce the amount of tuples that need to be spilled to disk.
However the problem was
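The bloom-filter idea - check the filter before spilling a tuple to a future batch - can be sketched with a basic filter that derives several probe positions from a single hash value. This is a toy illustration, not a proposed implementation; in practice 'hash' would be a good 64-bit hash of the join key:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_NBITS 8192          /* must be a power of two */
#define BLOOM_NHASH 3             /* probes per key */

typedef struct {
    unsigned char bits[BLOOM_NBITS / 8];
} Bloom;

/* Derive the probe positions from one 64-bit hash (the h1 + i*h2 trick). */
static void bloom_probe(uint64_t hash, uint64_t pos[BLOOM_NHASH])
{
    uint64_t h1 = hash;
    uint64_t h2 = (hash >> 32) | (hash << 32);
    for (int i = 0; i < BLOOM_NHASH; i++)
        pos[i] = (h1 + (uint64_t) i * h2) & (BLOOM_NBITS - 1);
}

static void bloom_add(Bloom *bf, uint64_t hash)
{
    uint64_t pos[BLOOM_NHASH];
    bloom_probe(hash, pos);
    for (int i = 0; i < BLOOM_NHASH; i++)
        bf->bits[pos[i] / 8] |= (unsigned char) (1 << (pos[i] % 8));
}

/* 0 means "definitely not present" -- such a tuple cannot match any
 * future batch and need not be spilled; 1 may be a false positive. */
static int bloom_maybe(const Bloom *bf, uint64_t hash)
{
    uint64_t pos[BLOOM_NHASH];
    bloom_probe(hash, pos);
    for (int i = 0; i < BLOOM_NHASH; i++)
        if (!(bf->bits[pos[i] / 8] & (1 << (pos[i] % 8))))
            return 0;
    return 1;
}
```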
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
I can't find the thread / test cases in the archives. I've found this
thread in hackers:
http://www.postgresql.org/message-id/caoezvif-r-ilf966weipk5by-khzvloqpwqurpak3p5fyw-...@mail.gmail.com
Can you point me to the right one, please?
This:
On 6.7.2014 17:57, Stephen Frost wrote:
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
I can't find the thread / test cases in the archives. I've found this
thread in hackers:
http://www.postgresql.org/message-id/caoezvif-r-ilf966weipk5by-khzvloqpwqurpak3p5fyw-...@mail.gmail.com
Can you
* Greg Stark (st...@mit.edu) wrote:
On Thu, Jul 3, 2014 at 11:40 AM, Atri Sharma atri.j...@gmail.com wrote:
IIRC, last time when we tried doing bloom filters, I was short of some real
world useful hash functions that we could use for building the bloom filter.
Last time we wanted to
On 3.7.2014 02:13, Tomas Vondra wrote:
Hi,
while hacking on the 'dynamic nbucket' patch, scheduled for the next CF
(https://commitfest.postgresql.org/action/patch_view?id=1494) I was
repeatedly stumbling over NTUP_PER_BUCKET. I'd like to propose a change
in how we handle it.
TL;DR;
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
However it's likely there are queries where this may not be the case,
i.e. where rebuilding the hash table is not worth it. Let me know if you
can construct such a query (I wasn't able to).
Thanks for working on this! I've been thinking on this for a while
On Thu, Jul 3, 2014 at 11:40 PM, Stephen Frost sfr...@snowman.net wrote:
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
However it's likely there are queries where this may not be the case,
i.e. where rebuilding the hash table is not worth it. Let me know if you
can construct such a query (I
Hi Stephen,
On 3.7.2014 20:10, Stephen Frost wrote:
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
However it's likely there are queries where this may not be the case,
i.e. where rebuilding the hash table is not worth it. Let me know if you
can construct such a query (I wasn't able to).
Thanks for
On Thu, Jul 3, 2014 at 11:40 AM, Atri Sharma atri.j...@gmail.com wrote:
IIRC, last time when we tried doing bloom filters, I was short of some real
world useful hash functions that we could use for building the bloom filter.
Last time we wanted to use bloom filters in hash joins to filter
On 3.7.2014 20:50, Tomas Vondra wrote:
Hi Stephen,
On 3.7.2014 20:10, Stephen Frost wrote:
Tomas,
* Tomas Vondra (t...@fuzzy.cz) wrote:
However it's likely there are queries where this may not be the case,
i.e. where rebuilding the hash table is not worth it. Let me know if you
can
42 matches