[ https://issues.apache.org/jira/browse/SPARK-53160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-53160:
----------------------------------
    Description:
In general, Java built-in `Files.writeString` is over 4 times faster than Google `Files.asCharSink`.
{code}
scala> val s = "a".repeat(500_000_000)
scala> spark.time(com.google.common.io.Files.asCharSink(new java.io.File("/dev/null"), java.nio.charset.StandardCharsets.UTF_8).write(s))
Time taken: 265 ms
scala> spark.time(java.nio.file.Files.writeString(java.nio.file.Path.of("/dev/null"), s))
Time taken: 59 ms
val res1: java.nio.file.Path = /dev/null
{code}

> Use Java `Files.writeString` instead of `Files.asCharSink`
> ----------------------------------------------------------
>
>                 Key: SPARK-53160
>                 URL: https://issues.apache.org/jira/browse/SPARK-53160
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes, MLlib, Spark Core, SQL, SS, YARN
>    Affects Versions: 4.1.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>              Labels: pull-request-available
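
For readers who want to see what the replacement looks like at a call site, here is a minimal sketch, assuming Java 11 or newer (where `Files.writeString` and `Files.readString` are available). The helper names `writeUtf8` and `appendUtf8` are hypothetical and are not taken from the Spark change itself. They only illustrate how the default (truncating) and appending modes of Guava's `Files.asCharSink` might map onto the JDK call.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical helper (not from the Spark patch): write `content` as UTF-8.
// With no OpenOption given, Files.writeString creates the file if needed and
// truncates existing content, like asCharSink(file, UTF_8).write(content).
def writeUtf8(path: Path, content: String): Path =
  Files.writeString(path, content, StandardCharsets.UTF_8)

// Hypothetical helper: append instead of truncating, roughly the counterpart
// of asCharSink(file, UTF_8, FileWriteMode.APPEND).write(content).
def appendUtf8(path: Path, content: String): Path =
  Files.writeString(path, content, StandardCharsets.UTF_8,
    StandardOpenOption.CREATE, StandardOpenOption.APPEND)

// Small demonstration on a temporary file.
val tmp = Files.createTempFile("spark-53160-", ".txt")
try {
  writeUtf8(tmp, "hello")
  appendUtf8(tmp, ", world")
  println(Files.readString(tmp)) // prints "hello, world"
} finally {
  Files.deleteIfExists(tmp)
}
```

The sketch passes the charset explicitly to mirror the `asCharSink` call in the description. The two-argument form used in the measurement above defaults to UTF-8.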