eatoncys commented on issue #23010: [SPARK-26012][SQL] Null and '' values should not cause dynamic partition failure of string types
URL: https://github.com/apache/spark/pull/23010#issuecomment-465517963
 
 
   > this looks a little hacky. How about we create an analyzer rule, which deals with `InsertIntoHadoopFsRelationCommand`, and changes its `query` field to do the empty-string-to-null for partition columns?
   
   Sorry, maybe I didn't understand the above correctly. Does it mean adding an analyzer rule to `Analyzer` that matches `InsertIntoHadoopFsRelationCommand` and changes its `query` field? But `InsertIntoHadoopFsRelationCommand` is only created after the `query` plan has already been analyzed, in the code below:
   
   ```scala
   def writeAndRead(
       mode: SaveMode,
       data: LogicalPlan,
       outputColumnNames: Seq[String],
       physicalPlan: SparkPlan): BaseRelation = {
     // ...
     case format: FileFormat =>
       val cmd = planForWritingFileFormat(format, mode, data)
       // ...
       val resolved = cmd.copy(
         partitionColumns = resolvedPartCols,
         outputColumnNames = outputColumnNames)
       resolved.run(sparkSession, physicalPlan)
   ```
   
   `InsertIntoHadoopFsRelationCommand` is created by `planForWritingFileFormat`, and `cmd.run` is then called immediately with the already-analyzed `physicalPlan`, so where should the analyzer rule be added?
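   For reference, a minimal, self-contained sketch of the empty-string-to-null semantics under discussion (plain Scala with no Spark dependency; the object and method names here are hypothetical, chosen only for illustration): a string-typed partition value that is the empty string is normalized to null, so it lands in the default partition instead of failing the dynamic partition insert.
   
   ```scala
   // Hypothetical helper illustrating the empty-string-to-null
   // normalization for string-typed partition column values.
   object EmptyStringToNull {
     // An empty string becomes null; null and non-empty values pass through.
     def normalizePartitionValue(v: String): String =
       if (v != null && v.isEmpty) null else v
   }
   ```
   
   Whether this normalization lives in an analyzer rule rewriting the command's `query` field, or closer to the write path, is exactly the open question above.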
