Jian Wu has posted comments on this change.

Change subject: IMPALA-2328 Parquet scan should use min/max stats
......................................................................

Patch Set 1:

> Thanks for your fast response and consideration.
> Your comment makes sense. In order to use the min/max stats in as
> many situations as possible, it seems we need a more general
> solution. How about slightly changing my suggestion to do the
> following:
>
> In the Impala FE:
> 1. Use the existing scan tuple for materializing the min stats.
> Create a new tuple identical to the scan tuple for the max stats.
> We evaluate predicates against a row that consists of those two
> tuples.
>
> 2. Analyze the scan predicates and generate a list of
> minMaxConjuncts that are evaluated against that min/max row.
>
> In the Impala BE:
> 3. During the Parquet scan, generate the min/max row, populate the
> min/max tuples and evaluate the list of minMaxConjuncts against
> that row.

Thanks for your further advice. Let me try to summarize your thoughts first.

1. Build a tuple row consisting of two tuples, one holding the min stats and one holding the max stats.
2. Generate the minMaxConjuncts in the FE.
3. Evaluate the minMaxConjuncts against that tuple row during the Parquet scan.

We would also need some way for each SlotRef to read either the min value or the max value, depending on the minMaxConjunct.

Is that right?

--
To view, visit http://gerrit.cloudera.org:8080/3623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91de1f4d0fb2a982d06cd344e41901e3bf3c2cea
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jian Wu <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jian Wu <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
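
For illustration only, here is a minimal Python sketch of the three-step idea discussed in this comment. It is not Impala code: `MinMaxConjunct`, `to_min_max_conjuncts` and `row_group_may_match` are hypothetical names, and the rewrite rules are an assumption about how each predicate could be bound to the min tuple or the max tuple. It shows one possible answer to the open question of how a slot picks its side: the planner step records, per conjunct, which stats tuple to read.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class MinMaxConjunct:
    """Hypothetical conjunct evaluated against the min/max row."""
    column: str
    side: str                          # "min" or "max": stats tuple to read
    op: Callable[[Any, Any], bool]     # applied as op(stat_value, literal)
    literal: Any


def to_min_max_conjuncts(column: str, op_name: str, literal: Any) -> List[MinMaxConjunct]:
    """Planner step (assumed): rewrite `column <op> literal` for the stats row."""
    if op_name == "<":
        return [MinMaxConjunct(column, "min", operator.lt, literal)]
    if op_name == "<=":
        return [MinMaxConjunct(column, "min", operator.le, literal)]
    if op_name == ">":
        return [MinMaxConjunct(column, "max", operator.gt, literal)]
    if op_name == ">=":
        return [MinMaxConjunct(column, "max", operator.ge, literal)]
    if op_name == "=":
        # Equality needs both bounds: min <= literal and max >= literal.
        return [
            MinMaxConjunct(column, "min", operator.le, literal),
            MinMaxConjunct(column, "max", operator.ge, literal),
        ]
    return []  # unsupported predicate: no pruning is attempted


def row_group_may_match(min_tuple: Dict[str, Any],
                        max_tuple: Dict[str, Any],
                        conjuncts: List[MinMaxConjunct]) -> bool:
    """Scan step (assumed): False means the row group can be skipped."""
    row = {"min": min_tuple, "max": max_tuple}
    for conjunct in conjuncts:
        stat = row[conjunct.side].get(conjunct.column)
        if stat is None:
            continue  # missing stats: this conjunct cannot prune anything
        if not conjunct.op(stat, conjunct.literal):
            return False
    return True


# Predicate x < 10 against a row group whose stats are min=12, max=20.
conjuncts = to_min_max_conjuncts("x", "<", 10)
skip = not row_group_may_match({"x": 12}, {"x": 20}, conjuncts)
```

In this sketch the row group above is skipped because its minimum already fails `x < 10`. Recording the side on the conjunct keeps the evaluation loop generic, at the cost of a rewrite rule per predicate type.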
