Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3643/IMPALA-5344: Fix FE tests on Java 8
......................................................................


Patch Set 5:

(6 comments)

I had a few high-level comments:
* A lot of this depends on order being preserved across multiple classes. If 
one class depends on another class returning things in a particular order, we 
should document that as part of the interface (either in a comment or via the 
types)
* In a lot of cases, if it's important that a particular data structure 
preserves insertion order, we should declare the member or return value as a 
LinkedHash* instead of just a Hash*. Otherwise it's all very mysterious.
* I think in many cases, instead of trying to preserve an arbitrary order 
across multiple processing steps, we should instead put the elements into a 
canonical order at the point where we iterate over them. It's hard to 
understand why things work otherwise.

http://gerrit.cloudera.org:8080/#/c/7073/5/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

Line 443:       Map<String, List<PrivilegeRequest>> tablePrivReqs = 
Maps.newLinkedHashMap();
Mention that the LinkedHashMap is used to preserve the order. I also wonder if 
TreeMap would be better - that way the order would be alphabetical rather than 
arbitrary.


http://gerrit.cloudera.org:8080/#/c/7073/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

Line 203:     public final Map<ExprId, Expr> conjuncts = 
Maps.newLinkedHashMap();
The changes in here all feel a bit arbitrary. Do we understand why it's 
necessary to change all of these?


Line 1811:   private Map<EquivalenceClassId, List<SlotId>> 
getEquivClasses(List<TupleId> tids) {
I don't understand why the order should matter here. It looks like there is 
some interaction between this, DisjointSet, and  createEquivConjuncts() but 
it's not clear what the contract between the components is with regards to 
ordering.


http://gerrit.cloudera.org:8080/#/c/7073/5/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java:

It's really hard to understand why the ordering is necessary in most of these 
places or what the contracts are between the different data structures. Is 
there a way we can just enforce the ordering on output, rather than trying to 
maintain an arbitrary order the whole way through.


http://gerrit.cloudera.org:8080/#/c/7073/5/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

Line 187:         Maps.newLinkedHashMap(innerStmt.getTblProperties());
It was not obvious the me that getTblProperties() had any specific order. I had 
to look at the cup file to figure that out!


http://gerrit.cloudera.org:8080/#/c/7073/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

Line 437:   public Set<RuntimeFilter> getRuntimeFilters() {
I think it would be less brittle to just return a list of filters sorted by 
filterId here. That also will make it less likely that we will unintentionally 
change the result if we change the algorithms in here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7073
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad8e1dccec3a51293a109c420bd2b88b9d1e0625
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to