Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10757#issuecomment-172296357
  
    A summary of offline discussion with @rxin and @marmbrus:
    
    The reason why @rxin suggested removing all back-ticks in generated column names is mostly backwards compatibility: those names are now generated using `Expression.sql` instead of `Expression.prettyString`. For example, the following DataFrame
    
    ```scala
    df.selectExpr("id + 1")
    ```
    
    used to produce a single column named `id + 1`, but now it becomes `` `id` 
+ 1``.
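
    To make this concrete, here is a hedged illustration (assuming a `sqlContext` as in `spark-shell`; the exact name strings are based on the description above, not output copied from this PR):

    ```scala
    // Inspect the generated column name via `columns`.
    val df = sqlContext.range(10)
    df.selectExpr("id + 1").columns
    // Before: Array("id + 1")     (from Expression.prettyString)
    // Now:    Array("`id` + 1")   (from Expression.sql)
    ```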
    
    However, I later found that removing back-ticks from generated column names still cannot guarantee this level of backwards compatibility. Although `prettyString` and `sql` often produce similar output, they are inherently different in many cases, for example:
    
    Expression | `prettyString` | `sql`
    ---------- | -------------- | -----
    `a && b` | `a && b` | `a AND b`
    `a.getField("f")` | `a[f]` | `a.f`
    `m.getItem("key")` | `m[key]` | `m["key"]`
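
    At the DataFrame API level, these expressions come from the following `Column` operations (a hedged sketch, not code from this PR; the resulting names are taken from the table above):

    ```scala
    import org.apache.spark.sql.functions.col

    col("a") && col("b")     // prettyString: a && b    sql: a AND b
    col("a").getField("f")   // prettyString: a[f]      sql: a.f
    col("m").getItem("key")  // prettyString: m[key]    sql: m["key"]
    ```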
    
    Basically, we won't be able to replace `prettyString` with `sql` if we do want this level of backwards compatibility. However, the original `prettyString` method doesn't return proper column names in the first place, so I'd rather fix them in Spark 2.0. As for back-tick quoting, a utility method `safeSQLIdent` is added to quote identifiers only when necessary.
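
    Roughly, such a helper only needs to back-tick-quote names that aren't plain identifiers. A minimal sketch (simplified with a regex check, not the exact code in this PR):

    ```scala
    // Back-tick quote an identifier only when it is not a plain identifier,
    // escaping embedded back-ticks by doubling them.
    def safeSQLIdent(name: String): String = {
      val simpleIdent = "[a-zA-Z_][a-zA-Z0-9_]*".r
      name match {
        case simpleIdent() => name
        case _ => s"`${name.replace("`", "``")}`"
      }
    }

    safeSQLIdent("id")      // "id"
    safeSQLIdent("id + 1")  // "`id + 1`"
    ```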


