On Sunday, June 23, 2013, Simon Riggs wrote:

> On 23 June 2013 03:16, Stephen Frost <sfr...@snowman.net>
> wrote:
>
> > Will think on it more.
>
> Some other thoughts related to this...
>
> * Why are we building a special kind of hash table? Why don't we just
> use the hash table code that we use in every other place in the backend?
> If that code is so bad why do we use it everywhere else? That is
> extensible, so we could try just using that. (Has anyone actually
> tried?)


I've not looked at the hash table in the rest of the backend.


> * We're not thinking about cache locality and set correspondence
> either. If the join is expected to hardly ever match, then we should
> be using a bitmap as a bloom filter rather than assuming that a very
> large hash table is easily accessible.


That's what I was suggesting earlier, though I don't think it's technically
a bloom filter - doesn't that require multiple hash functions? I don't think
we want to require every data type to provide multiple hash functions.
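
To be concrete about what I mean, here is a rough sketch of a one-hash
bitmap filter sitting in front of the hash table probe. It is Python only
for brevity and is not backend code: the class name, function name, and
bitmap size are made up for illustration, and Python's built-in hash()
stands in for the single hash function each data type already provides.

```python
# Illustrative sketch only: a single-hash bitmap filter checked before
# probing a (potentially large) hash table. All names are hypothetical.

class BitmapFilter:
    def __init__(self, nbits=1 << 16):
        self.nbits = nbits
        self.bits = bytearray(nbits // 8 + 1)

    def _pos(self, key):
        # One hash function per key; this is the only hashing we assume.
        return hash(key) % self.nbits

    def add(self, key):
        pos = self._pos(key)
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        pos = self._pos(key)
        return bool(self.bits[pos >> 3] & (1 << (pos & 7)))


def filtered_probe(inner_keys, outer_keys, nbits=1 << 16):
    """Return the outer keys that have a match on the inner side."""
    filt = BitmapFilter(nbits)
    table = set()
    for key in inner_keys:      # build phase
        filt.add(key)
        table.add(key)

    matches = []
    for key in outer_keys:      # probe phase
        if not filt.might_contain(key):
            continue            # skip the big table entirely
        if key in table:        # filter can give false positives
            matches.append(key)
    return matches
```

The bitmap is small enough to stay in cache, so when the join hardly ever
matches, most outer rows are rejected without touching the hash table. A
classic bloom filter would set k bits per key using k hash functions; the
sketch above sets only one, which is why I hesitate to call it that.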


> * The skew hash table will be hit frequently and would show good L2
> cache usage. I think I'll try adding the skew table always to see if
> that improves the speed of the hash join.
>

The skew table is just for common values though... To be honest, I have
some doubts about that structure really being a terribly good approach for
anything which is completely in memory.

Thanks,

Stephen
