[ 
https://issues.apache.org/jira/browse/ASTERIXDB-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shiva Jahangiri updated ASTERIXDB-2577:
---------------------------------------
    Description: 
In the probe() method of the optimized hybrid hash join, if insertion into the 
current spilled partition fails, we try to find the largest spilled partition 
and flush it as a victim. If no spilled partition with size > 0 can be found, 
we ASSUME that the record is large and flush it as a big object. Running the 
customerOrderCIDHybridHashJoin_Case3() test in TPCHCustomerOrderHashJoinTest 
shows that the record size is 206 bytes (so it is smaller than a frame), but 
neither the spilled partitions nor the buffer manager has any frame. This is 
the problem: there should be 1 frame for each spilled partition. In this case, 
we flush the record as a large object. This means that every single record that 
is supposed to be inserted into a spilled partition during the probe gets 
flushed separately. 
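
To make the failure mode concrete, here is a minimal, self-contained sketch of 
the fallback described above. It is NOT the Hyracks code: the class name, the 
field names, the 32 KB frame size, the 4 partitions, and the 1000 probe records 
are all assumptions made for illustration, and frames are modelled as plain 
byte counters. Only the 206-byte record size comes from the test run.

```java
import java.util.Arrays;

/** Toy model of the probe-side spill fallback; all names are hypothetical. */
public class ProbeSpillSketch {
    static final int FRAME_SIZE = 32 * 1024; // assumed frame size, not from the issue

    final int[] framesHeld;   // frames currently owned by each spilled partition
    final int[] bytesUsed;    // bytes buffered in each spilled partition
    int freeFrames;           // frames still available in the buffer manager
    int victimFlushes;        // times a whole partition was flushed as a victim
    int largeObjectFlushes;   // records flushed one by one as "large objects"

    ProbeSpillSketch(int partitions, int framesPerPartition, int freeFrames) {
        this.framesHeld = new int[partitions];
        this.bytesUsed = new int[partitions];
        Arrays.fill(this.framesHeld, framesPerPartition);
        this.freeFrames = freeFrames;
    }

    /** Appends to the partition, taking free frames if needed; false if no room. */
    boolean tryAppend(int pid, int recordSize) {
        while ((long) framesHeld[pid] * FRAME_SIZE - bytesUsed[pid] < recordSize) {
            if (freeFrames == 0) {
                return false;
            }
            freeFrames--;
            framesHeld[pid]++;
        }
        bytesUsed[pid] += recordSize;
        return true;
    }

    /** Returns the partition with the most buffered bytes, or -1 if all are empty. */
    int findLargestNonEmptyPartition() {
        int victim = -1;
        int max = 0;
        for (int p = 0; p < bytesUsed.length; p++) {
            if (bytesUsed[p] > max) {
                max = bytesUsed[p];
                victim = p;
            }
        }
        return victim;
    }

    /** Insert path as described in the issue, including the large-object fallback. */
    void probeInsert(int pid, int recordSize) {
        while (!tryAppend(pid, recordSize)) {
            int victim = findLargestNonEmptyPartition();
            if (victim < 0) {
                // No partition with size > 0: the record is ASSUMED to be large,
                // even when it is far smaller than a frame.
                largeObjectFlushes++;
                return;
            }
            // Flush the victim and hand its frames back to the pool (simplification).
            bytesUsed[victim] = 0;
            freeFrames += framesHeld[victim];
            framesHeld[victim] = 0;
            victimFlushes++;
        }
    }

    /** Probes 1000 records of 206 bytes across 4 partitions with an empty pool. */
    static ProbeSpillSketch simulate(int framesPerPartition) {
        ProbeSpillSketch s = new ProbeSpillSketch(4, framesPerPartition, 0);
        for (int i = 0; i < 1000; i++) {
            s.probeInsert(i % 4, 206);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println("no frames:   " + simulate(0).largeObjectFlushes);
        System.out.println("1 frame each: " + simulate(1).largeObjectFlushes);
    }
}
```

In this model, when the partitions and the buffer manager hold no frames, every 
probe record takes the large-object path. With 1 frame per spilled partition, 
none do, and full partitions are flushed as victims instead.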

  was:
In the probe() method of the optimized hybrid hash join, if insertion into the 
current spilled partition fails, we try to find the largest spilled partition 
and flush it as a victim. If no spilled partition with size > 0 can be found, 
we ASSUME that the record is large and flush it as a big object. Running the 
customerOrderCIDHybridHashJoin_Case3() test in TPCHCustomerOrderHashJoinTest 
shows that the record size is 206 bytes (so it is smaller than a frame), but 
neither the spilled partitions nor the buffer manager has any frame. This is 
the problem: there should be 1 frame for each spilled partition. In this case, 
we flush the record (without checking whether it is large or not) as a large 
object. This means that every single record that is supposed to be inserted 
into a spilled partition during the probe gets flushed separately. 


> Flushing small records during the probe in optimized hhj as large objects
> -------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-2577
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2577
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: *DB - AsterixDB
>    Affects Versions: 0.9.4.1
>            Reporter: Shiva Jahangiri
>            Priority: Major
>
> In the probe() method of the optimized hybrid hash join, if insertion into 
> the current spilled partition fails, we try to find the largest spilled 
> partition and flush it as a victim. If no spilled partition with size > 0 
> can be found, we ASSUME that the record is large and flush it as a big 
> object. Running the customerOrderCIDHybridHashJoin_Case3() test in 
> TPCHCustomerOrderHashJoinTest shows that the record size is 206 bytes (so it 
> is smaller than a frame), but neither the spilled partitions nor the buffer 
> manager has any frame. This is the problem: there should be 1 frame for each 
> spilled partition. In this case, we flush the record as a large object. This 
> means that every single record that is supposed to be inserted into a 
> spilled partition during the probe gets flushed separately. 



