* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > I'm certainly curious about those, but I'm also very interested in the
> > possibility of making NTUP_PER_BUCKET much smaller, or perhaps variable
> > depending on the work_mem setting.
>
> Not sure about that. That would make th
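The NTUP_PER_BUCKET discussion above can be sketched as follows. This is an illustration only, not the real ExecChooseHashTableSize(): the constant 10 matches that era's nodeHash.c, but the rounding helper and function names here are stand-ins. The point is that lowering NTUP_PER_BUCKET (or deriving it from work_mem) buys more buckets and therefore shorter chains.

```c
#include <assert.h>
#include <stddef.h>

/* Average tuples per bucket the planner aims for; lowering this
 * trades memory for shorter bucket chains. */
#define NTUP_PER_BUCKET 10

/* Round up to the next power of two, so a bucket number can be
 * derived from the hash value with a simple mask. */
static size_t next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Hypothetical sizing rule: enough buckets that the average chain
 * length is about ntup_per_bucket. */
static size_t choose_nbuckets(size_t ntuples, int ntup_per_bucket)
{
    size_t target = ntuples / ntup_per_bucket + 1;

    return next_pow2(target);
}
```

With the 41K-row table from the test case, NTUP_PER_BUCKET = 10 gives 8192 buckets (chains of ~5 tuples), while NTUP_PER_BUCKET = 1 gives 65536 buckets (chains of ~0.6).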
* Matthias (nitrogen...@gmail.com) wrote:
> >In this example, hashing the large table is actually 2 seconds *faster*
> >than hashing the small table (again, all on my laptop).
>
> Are you running the laptop on battery? When I've benchmarked pgsql
> last time I used my laptop as well and it only oc
In this example, hashing the large table is actually 2 seconds *faster*
than hashing the small table (again, all on my laptop).
Are you running the laptop on battery? When I've benchmarked pgsql last
time I used my laptop as well and it only occurred to me after a lot of
trying that laptops (
On Thu, Apr 04, 2013 at 04:16:12PM -0400, Stephen Frost wrote:
> * Stephen Frost (sfr...@snowman.net) wrote:
> > It does look like reducing bucket depth, as I outlined before through
> > the use of a 2-level hashing system, might help speed up
> > ExecScanHashBucket, as it would hopefully have very
Stephen Frost writes:
> Looking with opannotate, there's two main hotspots in
> ExecScanHashBucket:
> 12846 17.4001 :hashTuple =
> hashtable->buckets[hjstate->hj_CurBucketNo];
> and
> 22100 29.9348 :hashTuple = hashTuple->next;
Those are, of course, pretty trivial statements;
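For readers without the source at hand, the loop containing those two statements has roughly this shape. The struct below is a simplified stand-in for the real HashJoinTuple, but the control flow mirrors ExecScanHashBucket(): both profiled statements are pointer loads into a large, randomly accessed hash table, so they are cache misses rather than expensive computation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the hash-table tuple header. */
typedef struct HashTuple
{
    struct HashTuple *next;      /* chain within the bucket */
    uint32_t          hashvalue; /* full hash, checked before ExecQual */
    int               payload;
} HashTuple;

/* Return the next chain entry whose hash code matches, starting from
 * the bucket head (cur == NULL) or continuing after cur. */
static HashTuple *
scan_bucket(HashTuple **buckets, int bucketno,
            uint32_t hashvalue, HashTuple *cur)
{
    HashTuple *hashTuple;

    if (cur == NULL)
        hashTuple = buckets[bucketno];  /* first profiled hot load */
    else
        hashTuple = cur->next;

    while (hashTuple != NULL)
    {
        /* Skipping entries with the wrong hash code is nearly free
         * compared to running ExecQual on the tuple. */
        if (hashTuple->hashvalue == hashvalue)
            return hashTuple;
        hashTuple = hashTuple->next;    /* second profiled hot load */
    }
    return NULL;
}
```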
* Stephen Frost (sfr...@snowman.net) wrote:
> It does look like reducing bucket depth, as I outlined before through
> the use of a 2-level hashing system, might help speed up
> ExecScanHashBucket, as it would hopefully have very few (eg: 1-2)
> entries to consider instead of more. Along those same
* Stephen Frost (sfr...@snowman.net) wrote:
> 8504125241524.3624 72.3115mcount.c:60
> libc-2.15.so __mcount_internal
> 28370280785 8.1274 80.4389_mcount.S:33
> libc-2.15.so mcount
[...]
And as a side-note, I
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> perf or oprofile reveal anything?
Here's what we get from oprofile (perhaps not too surprising):
Hash the small table / scan the big table:
samples  cum. samples  %  cum. %  linenr info  image name  symbol name
167374
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> > What I'm trying to get at in this overall email is: why in the world is
> > it so expensive to do hash lookups?
>
> perf or oprofile reveal anything?
Working on a test case actually- I've got one now:
http://snowman.net/~sfrost/test_case2.sql
In this ex
Stephen Frost writes:
> I've been fiddling with this on the very much larger overall database
> where this test case came from and have found that hashing the large
> table can actually be *faster* and appears to cause a more consistent
> and constant amount of disk i/o (which is good).
Interesti
Tom, all,
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> So that's at least going in the right direction.
I agree that this is going in the right direction; it certainly would
make the plan that I *expect* to be chosen more likely, however..
I've been fiddling with this on the very much larger overall
I wrote:
> Perhaps what we should do is charge the hash_qual_cost only for some
> small multiple of the number of tuples that we expect will *pass* the
> hash quals, which is a number we have to compute anyway. The multiple
> would represent the rate of hash-code collisions we expect.
I tried the
On Mon, Apr 1, 2013 at 2:35 AM, Jeff Davis wrote:
> On Sun, 2013-03-31 at 15:45 -0400, Tom Lane wrote:
>> Really, when we're traipsing down a bucket
>> list, skipping over bucket entries with the wrong hash code is just
>> about free, or at least it's a whole lot cheaper than applying ExecQual.
>>
On Sun, 2013-03-31 at 15:45 -0400, Tom Lane wrote:
> Really, when we're traipsing down a bucket
> list, skipping over bucket entries with the wrong hash code is just
> about free, or at least it's a whole lot cheaper than applying ExecQual.
>
> Perhaps what we should do is charge the hash_qual_cos
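The costing idea quoted above can be put in numbers. These two functions are stand-ins, not the actual costsize.c formulas: the old behaviour charges the hash-qual cost for every probe tuple scaled by expected bucket occupancy, while the proposal charges it only for a small multiple of the tuples expected to actually pass the quals.

```c
#include <assert.h>

/* Old-style charge: every outer tuple pays for the expected number of
 * bucket entries it must be compared against. */
static double
old_qual_cost(double hash_qual_cost, double outer_rows,
              double bucket_size)
{
    return hash_qual_cost * outer_rows * bucket_size;
}

/* Proposed charge: only tuples expected to pass the quals pay, scaled
 * by a small collision factor (slightly above 1.0). */
static double
proposed_qual_cost(double hash_qual_cost, double passing_rows,
                   double collision_factor)
{
    return hash_qual_cost * passing_rows * collision_factor;
}
```

With illustrative numbers from the test case (4M probe rows, bucket size 10, 41K matching rows), the old model charges thousands of times more qual evaluations than actually occur, which is what pushes the planner toward hashing the small table even when that loses.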
Stephen Frost writes:
> * Jeff Davis (pg...@j-davis.com) wrote:
>> In Stephen's case the table was only 41KB, so something still seems off.
>> Maybe we should model the likelihood of a collision based on the
>> cardinalities (assuming a reasonably good hash function)?
> It's not really 'hash coll
* Jeff Davis (pg...@j-davis.com) wrote:
> In Stephen's case the table was only 41KB, so something still seems off.
> Maybe we should model the likelihood of a collision based on the
> cardinalities (assuming a reasonably good hash function)?
It's not really 'hash collisions' that we're trying to b
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I think the point is that there may *not* be few hash collisions ...
Right, but that's actually all entirely based on concerns over there
being duplicates (hence why we consider the MCVs and ndistinct), which
makes *some* sense given that we currently have
Jeff,
* Jeff Davis (pg...@j-davis.com) wrote:
> On Thu, 2013-03-28 at 19:56 -0400, Stephen Frost wrote:
> > 41K hashed, seqscan 4M: 115030.10 + 1229.46 = 116259.56
> > 4M hashed, seqscan 41K: 1229.46 + 211156.20 = 212385.66
>
> I think those are backwards -- typo?
Yes, sorry, those are backw
On Fri, 2013-03-29 at 16:37 -0400, Tom Lane wrote:
> Jeff Davis writes:
> > Yes, I have run into this issue (or something very similar). I don't
> > understand why the bucketsize even matters much -- assuming few hash
> > collisions, we are not actually evaluating the quals any more times than
>
Jeff Davis writes:
> Yes, I have run into this issue (or something very similar). I don't
> understand why the bucketsize even matters much -- assuming few hash
> collisions, we are not actually evaluating the quals any more times than
> necessary. So why all of the hashjoin-specific logic in dete
On Thu, 2013-03-28 at 19:56 -0400, Stephen Frost wrote:
> 41K hashed, seqscan 4M: 115030.10 + 1229.46 = 116259.56
> 4M hashed, seqscan 41K: 1229.46 + 211156.20 = 212385.66
I think those are backwards -- typo?
> In the end, I think the problem here is that we are charging far too
> much fo
All,
Marty's issue w/ NestLoop reminded me that I'd put together a test
case which illustrates a HashJoin doing the wrong thing.
The test case is here:
http://snowman.net/~sfrost/test_case.sql
Simply load that up and then check out this plan:
explain select * from small_table join