GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23052#discussion_r236652952
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala ---
    @@ -169,13 +169,18 @@ private[csv] class CsvOutputWriter(
         context: TaskAttemptContext,
         params: CSVOptions) extends OutputWriter with Logging {
     
    -  private val charset = Charset.forName(params.charset)
    +  private var univocityGenerator: Option[UnivocityGenerator] = None
    --- End diff --
    
    I don't mean that it would cause an error, but that it could create many
    generators and writers that are never closed, and it may not be obvious
    that this is happening. Unless we know writes will only happen in one
    thread, what about breaking out the get/create part of this method and
    synchronizing it?
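
    To illustrate the suggestion, here is a minimal sketch of a synchronized
    get/create helper. It is not the code from this pull request:
    SketchGenerator, SketchOutputWriter, getOrCreateGenerator and createdCount
    are hypothetical names used only for illustration, and SketchGenerator is
    a stand-in for UnivocityGenerator that wraps a plain java.io.Writer.

```scala
import java.io.Writer

// Hypothetical stand-in for UnivocityGenerator (not the Spark class).
final class SketchGenerator(writer: Writer) {
  def write(row: String): Unit = writer.write(row + "\n")
  def close(): Unit = writer.close()
}

// Hypothetical stand-in for CsvOutputWriter; newWriter is an assumed factory.
final class SketchOutputWriter(newWriter: () => Writer) {
  private var generator: Option[SketchGenerator] = None
  private var created = 0 // bookkeeping for the illustration only

  // The get/create step is broken out and guarded by this object's monitor,
  // so at most one generator (and one underlying writer) is ever created,
  // even if write() is called from several threads.
  private def getOrCreateGenerator(): SketchGenerator = synchronized {
    generator.getOrElse {
      val g = new SketchGenerator(newWriter())
      created += 1
      generator = Some(g)
      g
    }
  }

  // Only creation is synchronized here; thread safety of the write itself
  // is left to the underlying writer in this sketch.
  def write(row: String): Unit = getOrCreateGenerator().write(row)

  // Closes the generator if one was created.
  def close(): Unit = synchronized {
    generator.foreach(_.close())
    generator = None
  }

  def createdCount: Int = synchronized(created)
}
```

    Without the synchronized block, two threads could both observe None and
    each create a generator; only the last one assigned would be reachable
    from close(), and the others would leak. If writes are known to happen on
    a single thread, the lock is unnecessary.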

