Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
......................................................................


Patch Set 2:

(6 comments)

Thanks much for the improvement. Some general comments sprinkled among the 
files.

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@395
PS2, Line 395:       String sql = toSql(DEFAULT);
Let's think about the meaning of the options for expressions. Expr nodes won't 
have the state to provide un-rewritten SQL: a node is what it is, and does not 
know if it should be considered rewritten or not. The rewritten state is 
something that the outer non-expr node must provide.

Expr nodes can decide whether to include or exclude implicit casts.

This, then, raises and interesting issue. The ToSqlOptions enum is an enum: we 
get to pick one value. But, some of the options overlap: I may want to include 
implicit casts as one choice, and may want source or rewritten SQL on the other 
hand.

I wonder, can we make this simpler somehow?


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1300
PS2, Line 1300:           name, (printExpr) ? " '" + toSql(DEFAULT) + "'" : "", 
type_.toString()));
This comment applies in a zillion other places. Our general rule for error 
messages should be that we show the user's original SQL without rewrites or 
implicit casts. We have hundreds of such usages and we have to check that new 
messages use the correct form. Can we standardize these somehow?

One thought is to rename the option from DEFAULT to something like ORIGINAL or 
SOURCE. But, this means we still have to sprinkle this option across many error 
messages. This is, really, exposing implementation in too many places.

As it turns out, ParseNode is an interface, with many direct decedents. Does it 
make sense to introduce a new base class, AbstractParseNode, that has a 
userSql() (or sourceSql()) method that we use in error messages?

Recognizing that expressions are different than statements, maybe:

ParseNode
|- Expr (root of all expressions)
|- AbstractStmt (root of all statements)

The statement base would provide the source/rewritten options. Exprs only 
provide the implicit cast or not options.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java
File fe/src/main/java/org/apache/impala/analysis/ParseNode.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@33
PS2, Line 33:   String toSql(ToSqlOptions options);
A common practice in Java is to include the enum definition in the interface 
file if the enum is used only by the interface (and its implementations).


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@217
PS2, Line 217:           "OFFSET requires an ORDER BY clause: " + 
limitElement_.toSql(DEFAULT).trim());
I wonder if we can solve a mess here with exceptions also. Rather than building 
message (very inconsistently) in each message, maybe have a builder:

throw new AnalysisException
  .builder("OFFSET requires an ORDER BY clause")
  .expr(limitElement_)
  .build();

The builder would have methods for expressions (which will call the proper 
toSql version), for statements, for additional context, and so on.

This would avoid the issue of exposing the magic toSql(DEFAULT) incantation in 
zillions of places.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@44
PS2, Line 44: import static org.apache.impala.analysis.ToSqlOptions.DEFAULT;
Two suggestions.

1. Rename this as ToSqlOption (or ToSqlFormat) since we can pass only one 
option.
2. Import the enum, not the static value. Reference it with the enum name: 
ToSqlFormat.DEFAULT (or, based on earlier suggestions, ToSqlFormat.SOURCE).


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@214
PS2, Line 214:         output.append(info_.getSortExprs().get(i).toSql(DEFAULT) 
+ " ");
This is an example of the point made elsewhere. It SEEMS we are asking the 
expression to give the DEFAULT (source) SQL. But, an expression can't make that 
decision. if we want the source SQL, we'd have to have cached it.

I'm working on a project to do just that in a more standardized form. (Though, 
the result is still pretty messy in order to minimize massive changes...)



--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 16 Nov 2018 21:56:53 +0000
Gerrit-HasComments: Yes

Reply via email to