[
https://issues.apache.org/jira/browse/PIG-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762754#action_12762754
]
Pradeep Kamath commented on PIG-922:
------------------------------------
Some comments on new patch:
PruneColumns.java:
| 274 if (relevantFields!=null &&
relevantFields.needAllFields())
| 275 {
| 276 requiredInputFieldsList.set(j, new
RequiredFields(true));
| 277 continue;
| 278 }
| 279
| 280 // Mapping output map keys to input map keys
| 281 //
| 282 if (rlo instanceof LOCogroup)
| 283 {
| 284 if (relevantFields!=null &&
relevantFields.needAllFields())
| 285 {
| 286 for (Pair<Integer, Integer> pair :
relevantFields.getFields())
| 287 relevantFields.setMapKeysInfo(pair.first,
pair.second,
| 288 new MapKeysInfo(true));
| 289 }
| 290 }
Wouldn't the last if be redundant since it is same as first if and first if is
true, the loop "continues" and never reaches the last if
line numbers per old code:
326 // Collect required map keys in foreach plan
here.
327 // This is the only logical operator that we
collect map keys
328 // which are introduced by the operator here.
329 // For all other logical operators, it is
attached to required fields
330 // of that logical operator, will process in
required fields processing
331 // section
332 for (Pair<Integer, Integer> relevantField :
relevantFields.getFields())
333 {
334 MapKeysInfo mapKeysInfo =
getMapKeysInPlan(forEachPlan, relevantField.second);
335 relevantFields.mergeMapKeysInfo(0,
relevantField.second, mapKeysInfo);
336 }
Why do we get the forEachPlan again here? isn't it the same as the one just
above this code? also are we merging map key references with relayed map keys
here?
ColumnPruner.java:
108 allPruned = false;
This could result in pruning even when all the relevantFields are null OR do
not need any field (pruning should happen only if input had that field pruned)
In general I didn't fully understand the logic here - comments would be good :)
- I felt starting with allPruned true may lead to error - maybe I didn;t
understand
this well
In LOCogroup.java
790 unsetSchema();
791 getSchema();
792 mIsProjectionMapComputed = false;
793 getProjectionMap();
794 return true;
795 }
this should be replaced with return super(); - same comment for
LOSplitOutput.java, LOJoin.java, LOFilter.java, LOSort.java,
In LOForEach.java:
944 for (int i=columns.size()-1;i>=0;i--) {
945 Pair<Integer, Integer> column = columns.get(i);
946 for (LogicalPlan plan : mForEachPlans) {
947 pruneColumnInPlan(plan, column.second);
948 }
949 }
it seems like we will prune the column in all for each plans - a comment to
explain why would be good.
> Logical optimizer: push up project
> ----------------------------------
>
> Key: PIG-922
> URL: https://issues.apache.org/jira/browse/PIG-922
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Daniel Dai
> Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-922-p1_0.patch, PIG-922-p1_1.patch,
> PIG-922-p1_2.patch, PIG-922-p1_3.patch, PIG-922-p1_4.patch,
> PIG-922-p2_preview.patch, PIG-922-p2_preview2.patch, PIG-922-p3_1.patch,
> PIG-922-p3_2.patch, PIG-922-p3_3.patch, PIG-922-p3_4.patch, PIG-922-p3_5.patch
>
>
> This is a continuation work of
> [PIG-697|https://issues.apache.org/jira/browse/PIG-697]. We need to add
> another rule to the logical optimizer: Push up project, ie, prune columns as
> early as possible.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
