[
https://issues.apache.org/jira/browse/ASTERIXDB-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shiva Jahangiri updated ASTERIXDB-2577:
---------------------------------------
Description:
In the probe() method of the optimized hybrid hash join, if insertion into the
current spilled partition fails, we try to find the biggest spilled partition
and flush it as a victim. If we cannot find any spilled partition with size > 0,
we ASSUME that the record is large and flush it as a big object. Running the
customerOrderCIDHybridHashJoin_Case3() test in TPCHCustomerOrderHashJoinTest
shows that the record size is 206 bytes (so it is smaller than a frame), but
neither the spilled partitions nor the buffer manager has any frame. This is
the problem: there should be 1 frame for each spilled partition. In this case,
we flush the record as a large object. As a result, every record that is
supposed to be inserted into a spilled partition during the probe gets flushed
separately.
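
The decision path described above can be sketched in isolation. This is a
minimal, hypothetical model and not the actual Hyracks code: the class, enum,
and method names are invented for illustration, and the 32 KiB frame size is an
assumption. Only the 206-byte record size comes from the test run. The sketch
adds an explicit size check before treating a record as a large object, and it
reports the missing-frame case separately. Whether that is the right fix is
also an assumption.

```java
public class ProbeSpillSketch {
    // Hypothetical outcomes of a failed insert during the probe phase.
    enum Action { FLUSH_VICTIM_PARTITION, FLUSH_AS_LARGE_OBJECT, REPORT_MISSING_FRAME }

    // Returns the index of the largest non-empty spilled partition, or -1 if
    // every spilled partition currently holds zero bytes.
    static int findBiggestSpilledPartition(int[] spilledPartitionSizes) {
        int victim = -1;
        int max = 0;
        for (int i = 0; i < spilledPartitionSizes.length; i++) {
            if (spilledPartitionSizes[i] > max) {
                max = spilledPartitionSizes[i];
                victim = i;
            }
        }
        return victim;
    }

    // Decides what to do when inserting a record into a spilled partition fails.
    static Action onInsertFailure(int recordSize, int frameSize, int[] spilledPartitionSizes) {
        if (findBiggestSpilledPartition(spilledPartitionSizes) >= 0) {
            return Action.FLUSH_VICTIM_PARTITION;
        }
        // Check the size instead of assuming the record is large.
        if (recordSize > frameSize) {
            return Action.FLUSH_AS_LARGE_OBJECT;
        }
        // A small record with no victim available means a spilled partition
        // has lost the one frame it is supposed to keep.
        return Action.REPORT_MISSING_FRAME;
    }

    public static void main(String[] args) {
        int frameSize = 32768; // assumed frame size, not taken from the report
        System.out.println(onInsertFailure(206, frameSize, new int[] {0, 0, 0}));
        System.out.println(onInsertFailure(206, frameSize, new int[] {0, 4096, 0}));
        System.out.println(onInsertFailure(100000, frameSize, new int[] {0, 0, 0}));
    }
}
```

With the size check in place, the 206-byte record from the test is no longer
classified as a large object when no victim partition exists; it surfaces the
missing-frame condition instead.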
was:
In probe() method in optimized hybrid hash join, if insertion fails on the
current spilled partition, we try to find the biggest spilled partition and
flush it as a victim. If we could not find any spilled partition with size > 0,
then we ASSUME that the record is large and flush it as a big object. By
running customerOrderCIDHybridHashJoin_Case3() test in
TPCHCustomerOrderHashJoinTest, it can be seen that the record size is 206 bytes
(so it is smaller than a frame), but neither the spilled partitions nor the
buffer manager has any frame (This is the problem, there should be 1 frame for
each spilled partition). In this case, we flush the record(without checking if
it is large or not) as a large object. This means that every single record that
is supposed to get inserted to a spilled partition during the probe, will get
flushed separately.
> Flushing small records during the probe in optimized hhj as large objects
> --------------------------------------------------------------------------
>
> Key: ASTERIXDB-2577
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2577
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: *DB - AsterixDB
> Affects Versions: 0.9.4.1
> Reporter: Shiva Jahangiri
> Priority: Major
>
> In the probe() method of the optimized hybrid hash join, if insertion into
> the current spilled partition fails, we try to find the biggest spilled
> partition and flush it as a victim. If we cannot find any spilled partition
> with size > 0, we ASSUME that the record is large and flush it as a big
> object. Running the customerOrderCIDHybridHashJoin_Case3() test in
> TPCHCustomerOrderHashJoinTest shows that the record size is 206 bytes (so it
> is smaller than a frame), but neither the spilled partitions nor the buffer
> manager has any frame. This is the problem: there should be 1 frame for each
> spilled partition. In this case, we flush the record as a large object. As a
> result, every record that is supposed to be inserted into a spilled partition
> during the probe gets flushed separately.