Sergey Soldatov created PHOENIX-4018:
----------------------------------------

             Summary: HashJoin may produce nulls for LHS table columns
                 Key: PHOENIX-4018
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4018
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.11.0
            Reporter: Sergey Soldatov
            Assignee: Sergey Soldatov
            Priority: Critical


Here is the problem: HashJoinRegionScanner methods (nextRow, for example) use 
the same scanner context that was created in RSRpcServices. That context has 
limits (e.g. a 2 MB size limit). Suppose we have a 3 MB region and the only key 
that matches the join condition is located at the end of the region. In 
HashJoinRegionScanner#nextRow we iterate through the region rows, and once we 
reach the 2 MB limit, every region scanner nextRow call returns a single 
cell and leaves the scanner context in the SIZE_LIMIT_REACHED_MID_ROW state. We 
don't have any logic that checks for that state, so this single cell is treated 
as a complete row in which every column except one is null. 
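
The failure mode can be illustrated with a small, self-contained simulation. 
This is only a sketch: PartialRowSketch, SizeLimitedContext, nextRow and toRow 
are hypothetical stand-ins written for this example, not the HBase or Phoenix 
classes, and cell size is approximated by string length. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not the HBase/Phoenix classes.
class PartialRowSketch {
    /** Tracks size progress against a limit, like a size-limited scanner context. */
    static class SizeLimitedContext {
        final long limit;
        long progress = 0;
        boolean limitReachedMidRow = false;
        SizeLimitedContext(long limit) { this.limit = limit; }
    }

    /** Reads the cells of one row, stopping early once the size limit is hit. */
    static List<String> nextRow(List<String> rowCells, SizeLimitedContext ctx) {
        List<String> out = new ArrayList<>();
        for (String cell : rowCells) {
            out.add(cell);
            ctx.progress += cell.length();
            if (ctx.progress >= ctx.limit && out.size() < rowCells.size()) {
                ctx.limitReachedMidRow = true; // the row is incomplete
                break;
            }
        }
        return out;
    }

    /** Naive consumer: treats the cells as a full row, so missing columns become null. */
    static String[] toRow(List<String> cells, int columns) {
        String[] row = new String[columns];
        for (int i = 0; i < cells.size(); i++) {
            row[i] = cells.get(i);
        }
        return row;
    }

    /** Scans two three-column rows with a limit that is exhausted by the first row. */
    static String[] demo() {
        SizeLimitedContext ctx = new SizeLimitedContext(6);
        nextRow(Arrays.asList("a1", "b1", "c1"), ctx);                    // complete row
        List<String> cells = nextRow(Arrays.asList("a2", "b2", "c2"), ctx); // one cell only
        return toRow(cells, 3); // the mid-row flag on ctx is never checked
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [a2, null, null]
    }
}
```

In this sketch the second row comes back with a single cell, and because the 
consumer never looks at limitReachedMidRow it emits a row with nulls in the 
remaining columns, which mirrors the symptom described above. 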

How to fix it: 
1. For the region scanner we can provide NoLimitScannerContext, so we never 
get a partial result.  
2. We need to update the scanner context that we got from RSRpcServices with 
the real data, based on the size of the results we are going to return. 
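
The two steps can be sketched as a self-contained simulation. It is not the 
actual patch: NoLimitScanSketch, Context, nextRow and nextJoinedRow are 
hypothetical names invented for this example, no HBase or Phoenix API is 
called, and cell size is approximated by string length. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not the HBase/Phoenix classes.
class NoLimitScanSketch {
    /** Tracks size progress against a limit; Long.MAX_VALUE means "no limit". */
    static class Context {
        final long limit;
        long progress = 0;
        Context(long limit) { this.limit = limit; }
        static Context noLimit() { return new Context(Long.MAX_VALUE); }
        boolean limitReached() { return progress >= limit; }
    }

    /** Reads the cells of one row, stopping early once the size limit is hit. */
    static List<String> nextRow(List<String> rowCells, Context ctx) {
        List<String> out = new ArrayList<>();
        for (String cell : rowCells) {
            out.add(cell);
            ctx.progress += cell.length();
            if (ctx.limitReached() && out.size() < rowCells.size()) {
                break; // partial row
            }
        }
        return out;
    }

    /**
     * Step 1: scan the region with a no-limit context, so rows are always whole.
     * Step 2: charge only the size of the returned row to the RPC-level context.
     */
    static List<String> nextJoinedRow(List<List<String>> region, String joinKey,
                                      Context rpcCtx) {
        Context inner = Context.noLimit();
        for (List<String> row : region) {
            List<String> cells = nextRow(row, inner);
            if (cells.get(0).equals(joinKey)) {
                long size = 0;
                for (String cell : cells) {
                    size += cell.length();
                }
                rpcCtx.progress += size;
                return cells;
            }
        }
        return new ArrayList<>(); // no row matched the join condition
    }

    public static void main(String[] args) {
        List<List<String>> region = Arrays.asList(
                Arrays.asList("a1", "b1", "c1"),
                Arrays.asList("a2", "b2", "c2"),
                Arrays.asList("a3", "b3", "c3"));
        Context rpcCtx = new Context(6);
        System.out.println(nextJoinedRow(region, "a3", rpcCtx)); // prints [a3, b3, c3]
        System.out.println(rpcCtx.progress);                     // prints 6
    }
}
```

Here the matching row sits at the end of the region, past the point where a 
size-limited scan would have started returning single cells, yet it comes back 
complete, and the outer context reflects only the data actually returned. 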



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
