PavithraRamachandran opened a new pull request #25628: 
[SPARK-28897][Core]'coalesce' error when executing dataframe.na.fill
URL: https://github.com/apache/spark/pull/25628
 
 
   ### What changes were proposed in this pull request?
   **Root Cause:**
   When a DataFrame is created from a select statement with 
**spark.sql.parser.quotedRegexColumnNames=true** and `na.fill` is then called 
on it, _fillCol_ in DataFrameNaFunctions explicitly wraps each column name in 
**backticks (`` ` ``)**. With that setting enabled, the quoted column name is 
interpreted as a regex and becomes an `UnresolvedRegex`, which causes 
resolution of the coalesce expression to fail.
   
   _Observation_
   When the DataFrame is created from a select statement that uses a regex, 
valid column names are returned after the regex filter is applied, so adding 
_backticks_ to the column name in this flow is not needed. To check the impact, 
select statements with regexes were run without the _backticks_, and no 
impact was observed.
   
   **After Fix**
   When the column name is passed to the DataFrame column method, 
**backticks (`` ` ``)** are no longer added, because the value received is a 
valid column name, not a regular expression.
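
   The before/after behaviour described above can be sketched with a small toy 
model. This is not Spark code: `col`, `fill_col`, `UnresolvedRegex`, 
`ResolvedColumn` and the `QUOTED` pattern below are hypothetical stand-ins 
written in plain Python, and they only assume what this description states, 
namely that a backtick-quoted name is treated as a regex when 
quotedRegexColumnNames is enabled.

```python
import re
from dataclasses import dataclass

# Assumed shape of a backtick-quoted identifier (illustrative, not Spark's parser).
QUOTED = re.compile(r"^`(.+)`$")


@dataclass(frozen=True)
class UnresolvedRegex:
    """Stand-in for a column reference that is still a regex pattern."""
    pattern: str


@dataclass(frozen=True)
class ResolvedColumn:
    """Stand-in for a column that was resolved to a concrete name."""
    name: str


def col(name, columns, quoted_regex_enabled):
    """Toy lookup: a quoted name becomes a regex when the flag is on."""
    match = QUOTED.match(name)
    if quoted_regex_enabled and match:
        return UnresolvedRegex(match.group(1))
    plain = match.group(1) if match else name
    if plain not in columns:
        raise KeyError(plain)
    return ResolvedColumn(plain)


def fill_col(name, columns, quoted_regex_enabled, add_backticks):
    """Toy fill: builds a coalesce over the column, which must be resolved."""
    lookup = f"`{name}`" if add_backticks else name
    target = col(lookup, columns, quoted_regex_enabled)
    if not isinstance(target, ResolvedColumn):
        raise TypeError(f"cannot resolve coalesce over {target!r}")
    return f"coalesce({target.name}, <fill value>)"
```

   In this model, `fill_col("name", ["name", "id"], True, add_backticks=True)` 
raises, which mirrors the behaviour before the fix, while the same call with 
`add_backticks=False` returns a coalesce over the plain column name.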
   
   ### Why are the changes needed?
   With this change the column name is no longer treated as a regex, the 
proper Column is derived, and the expression resolves successfully.
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   The patch was tested by adding UT cases and by running various select 
statements (with and without regex) in the spark shell.
   
   Before Fix:
   
![Before](https://user-images.githubusercontent.com/51401130/63996784-417fe600-cb1a-11e9-9c0c-f15a0e9d362c.png)
   
   
   After Fix:
   
![After](https://user-images.githubusercontent.com/51401130/63996792-4e043e80-cb1a-11e9-8ddf-753f9e1444f8.png)
   
   

