GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23241#discussion_r239509724
  
    --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
    @@ -197,4 +201,8 @@ class ZStdCompressionCodec(conf: SparkConf) extends CompressionCodec {
         // avoid excessive overhead of JNI calls when uncompressing small amounts of data.
         new BufferedInputStream(new ZstdInputStream(s), bufferSize)
       }
    +
    +  override def zstdEventLogCompressedInputStream(s: InputStream): InputStream = {
    +    new BufferedInputStream(new ZstdInputStream(s).setContinuous(true), bufferSize)
    --- End diff ---
    
    That's what I'm wondering about. Is it actually desirable not to fail on a partial frame? I'm not sure. We *shouldn't* encounter a partial frame anywhere else.
    
    This changes a developer API, but it may not even be a breaking change, since there is a default implementation. We can take breaking changes in Spark 3 anyway.
    
    I think I agree with your approach here in the end.
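    
    For context, here's a minimal sketch of what continuous mode buys us. This assumes zstd-jni's ZstdInputStream; openEventLogStream is a hypothetical helper for illustration, not part of this PR. Per zstd-jni's docs, setContinuous(true) means "don't break on unfinished frames", which is exactly the shape of an in-progress event log:
    
        import java.io.{BufferedInputStream, FileInputStream, InputStream}
        
        import com.github.luben.zstd.ZstdInputStream
        
        // Hypothetical helper, not part of this PR: open a possibly
        // still-being-written zstd-compressed event log for reading.
        def openEventLogStream(path: String, bufferSize: Int = 32 * 1024): InputStream = {
          val zs = new ZstdInputStream(new FileInputStream(path))
          // Continuous mode: tolerate a partial trailing frame, surfacing
          // the data written so far instead of a "truncated source" error.
          zs.setContinuous(true)
          new BufferedInputStream(zs, bufferSize)
        }
    
    Without setContinuous(true), the same read should fail once it hits the incomplete final frame, which is the failure this override is working around.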

