[ https://issues.apache.org/jira/browse/SPARK-53190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-53190:
----------------------------------
    Description:
Java `transferTo` is significantly faster than `ByteStreams.copy`.
{code}
scala> import java.io._
import java.io._

scala> spark.time(new FileInputStream("/tmp/4G.bin").transferTo(new FileOutputStream("/dev/null")))
Time taken: 5 ms
val res2: Long = 4294967296

scala> spark.time(com.google.common.io.ByteStreams.copy(new FileInputStream("/tmp/4G.bin"), new FileOutputStream("/dev/null")))
Time taken: 772 ms
val res3: Long = 4294967296
{code}

  was:
{code}
scala> import java.io._
import java.io._

scala> spark.time(new FileInputStream("/tmp/4G.bin").transferTo(new FileOutputStream("/dev/null")))
Time taken: 5 ms
val res2: Long = 4294967296

scala> spark.time(com.google.common.io.ByteStreams.copy(new FileInputStream("/tmp/4G.bin"), new FileOutputStream("/dev/null")))
Time taken: 772 ms
val res3: Long = 4294967296
{code}

> Use Java `InputStream.transferTo` instead of `ByteStreams.copy`
> ---------------------------------------------------------------
>
>                 Key: SPARK-53190
>                 URL: https://issues.apache.org/jira/browse/SPARK-53190
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Dongjoon Hyun
>            Priority: Major
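
The replacement proposed in the issue can be sketched as follows. This is a minimal, self-contained illustration, not Spark's actual change: the object `TransferToSketch` and its helpers `loopCopy` and `transferCopy` are hypothetical names, and `loopCopy` is a plain buffer loop standing in for a stream-copy utility such as `ByteStreams.copy`, so the example needs no Guava dependency. It assumes Java 9 or later, where `InputStream.transferTo` is available.

```scala
import java.io.{InputStream, OutputStream}
import java.nio.file.{Files, Path}

object TransferToSketch {
  // Hypothetical stand-in for a buffer-loop copy utility.
  // This is NOT Guava's implementation of ByteStreams.copy.
  def loopCopy(in: InputStream, out: OutputStream): Long = {
    val buf = new Array[Byte](8192)
    var total = 0L
    var n = in.read(buf)
    while (n != -1) {
      out.write(buf, 0, n)
      total += n
      n = in.read(buf)
    }
    total
  }

  // Copy src to dst with InputStream.transferTo (Java 9+).
  // Returns the number of bytes transferred.
  def transferCopy(src: Path, dst: Path): Long = {
    val in = Files.newInputStream(src)
    try {
      val out = Files.newOutputStream(dst)
      try in.transferTo(out) finally out.close()
    } finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Temporary files; the names and the 1 MiB size are arbitrary choices.
    val src = Files.createTempFile("transfer-src", ".bin")
    val dst = Files.createTempFile("transfer-dst", ".bin")
    try {
      Files.write(src, Array.fill[Byte](1 << 20)(1.toByte))

      // Both approaches should report the full source size.
      val viaTransfer = transferCopy(src, dst)
      assert(viaTransfer == Files.size(src))

      val in = Files.newInputStream(src)
      val out = Files.newOutputStream(dst)
      val viaLoop =
        try loopCopy(in, out)
        finally { in.close(); out.close() }
      assert(viaLoop == viaTransfer)
    } finally {
      Files.deleteIfExists(src)
      Files.deleteIfExists(dst)
    }
  }
}
```

Both helpers return the number of bytes copied, matching the `Long` results in the REPL session above. The sketch checks correctness only; timings depend on the JDK, the file system, and the page cache, so none are reproduced here.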