[ 
https://issues.apache.org/jira/browse/SPARK-53160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-53160:
----------------------------------
    Description: 
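Java's built-in `java.nio.file.Files.writeString` is over 4 times faster than 
Guava's `com.google.common.io.Files.asCharSink(...).write`. In the measurement 
below, writing a 500-million-character string to `/dev/null` takes about 59 ms 
versus 265 ms.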

{code}
scala> val s = "a".repeat(500_000_000)

scala> spark.time(com.google.common.io.Files.asCharSink(new java.io.File("/dev/null"), java.nio.charset.StandardCharsets.UTF_8).write(s))
Time taken: 265 ms

scala> spark.time(java.nio.file.Files.writeString(java.nio.file.Path.of("/dev/null"), s))
Time taken: 59 ms
val res1: java.nio.file.Path = /dev/null
{code}
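
For illustration, the replacement at a call site might look like the sketch below. It is not taken from the Spark patch: the class `WriteStringSketch` and the helper `writeText` are hypothetical names, and the Guava call appears only in a comment so that the example depends on nothing but the JDK (Java 11 or later, where `Files.writeString` was introduced).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not code from the Spark repository.
final class WriteStringSketch {

    // Before (Guava), shown for comparison only:
    //   com.google.common.io.Files.asCharSink(target.toFile(), StandardCharsets.UTF_8).write(content);
    //
    // After (JDK 11+): write the whole string in one call.
    // The charset is passed explicitly here; writeString also defaults to UTF-8.
    static Path writeText(Path target, String content) throws IOException {
        return Files.writeString(target, content, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("writestring-sketch-", ".txt");
        try {
            writeText(tmp, "hello");
            // Read the file back to confirm the round trip.
            System.out.println(Files.readString(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With no `OpenOption` arguments, `Files.writeString` creates the file if it is missing and truncates it otherwise, which matches the default overwrite behavior of a Guava `CharSink` created without `FileWriteMode.APPEND`.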

> Use Java `Files.writeString` instead of `Files.asCharSink`
> ----------------------------------------------------------
>
>                 Key: SPARK-53160
>                 URL: https://issues.apache.org/jira/browse/SPARK-53160
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes, MLlib, Spark Core, SQL, SS, YARN
>    Affects Versions: 4.1.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>              Labels: pull-request-available


