(spark) branch master updated: [SPARK-53195][CORE] Use Java `InputStream.readNBytes` instead of `ByteStreams.read`

dongjoon Fri, 08 Aug 2025 06:35:32 -0700

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 8eaad0069e3f [SPARK-53195][CORE] Use Java `InputStream.readNBytes` instead of `ByteStreams.read`
8eaad0069e3f is described below

commit 8eaad0069e3f26264be849600ee224b9ca9bafc8
Author: Dongjoon Hyun <dongj...@apache.org>
AuthorDate: Fri Aug 8 06:35:12 2025 -0700

    [SPARK-53195][CORE] Use Java `InputStream.readNBytes` instead of `ByteStreams.read`
    
    ### What changes were proposed in this pull request?
    
    This PR aims to use the Java 9+ `InputStream.readNBytes` API instead of Guava's `ByteStreams.read`.
    
    ### Why are the changes needed?
    
    To simplify the code by using the native Java API.
    
    ```scala
    - var numBytes = ByteStreams.read(gzInputStream, buf, 0, bufSize)
    + var numBytes = gzInputStream.readNBytes(buf, 0, bufSize)
    ```
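    
    The loop that this change touches can be sketched in isolation as
    follows. This is a minimal illustration, not the code in `Utils.scala`:
    the object `ReadNBytesSketch` and the helpers `countBytes` and `gzip` are
    hypothetical names, and the input is an in-memory gzip stream rather than
    a file. It relies on `readNBytes(buf, off, len)` returning 0 at end of
    stream, which is why a `numBytes > 0` loop condition terminates.
    
    ```scala
    import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
    import java.util.zip.{GZIPInputStream, GZIPOutputStream}
    
    object ReadNBytesSketch {
      // Hypothetical helper: counts the bytes remaining in a stream.
      def countBytes(in: InputStream, bufSize: Int = 1024): Long = {
        val buf = new Array[Byte](bufSize)
        var total = 0L
        // readNBytes blocks until bufSize bytes are read or the stream ends,
        // and returns 0 (not -1) once the stream is exhausted.
        var numBytes = in.readNBytes(buf, 0, bufSize)
        while (numBytes > 0) {
          total += numBytes
          numBytes = in.readNBytes(buf, 0, bufSize)
        }
        total
      }
    
      // Hypothetical helper: gzip-compresses a byte array in memory.
      def gzip(data: Array[Byte]): Array[Byte] = {
        val bos = new ByteArrayOutputStream()
        val gz = new GZIPOutputStream(bos)
        try gz.write(data) finally gz.close()
        bos.toByteArray
      }
    
      def main(args: Array[String]): Unit = {
        val original = Array.fill[Byte](5000)(7.toByte)
        val in = new GZIPInputStream(new ByteArrayInputStream(gzip(original)))
        // The count is the uncompressed size, not the compressed size.
        try println(countBytes(in)) finally in.close() // prints 5000
      }
    }
    ```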
    
    ### Does this PR introduce _any_ user-facing change?
    
    No behavior change.
    
    ### How was this patch tested?
    
    Pass the CIs.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #51923 from dongjoon-hyun/SPARK-53195.
    
    Authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 core/src/main/scala/org/apache/spark/util/Utils.scala | 6 +++---
 scalastyle-config.xml                                 | 5 +++++
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 57f70489a860..4f2d5fe8f820 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -46,7 +46,7 @@ import scala.util.matching.Regex
 import _root_.io.netty.channel.unix.Errors.NativeIoException
 import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}
 import com.google.common.collect.Interners
-import com.google.common.io.{ByteStreams, Files => GFiles}
+import com.google.common.io.{Files => GFiles}
 import com.google.common.net.InetAddresses
 import jakarta.ws.rs.core.UriBuilder
 import org.apache.commons.codec.binary.Hex
@@ -1558,10 +1558,10 @@ private[spark] object Utils
       gzInputStream = new GZIPInputStream(new FileInputStream(file))
       val bufSize = 1024
       val buf = new Array[Byte](bufSize)
-      var numBytes = ByteStreams.read(gzInputStream, buf, 0, bufSize)
+      var numBytes = gzInputStream.readNBytes(buf, 0, bufSize)
       while (numBytes > 0) {
         fileSize += numBytes
-        numBytes = ByteStreams.read(gzInputStream, buf, 0, bufSize)
+        numBytes = gzInputStream.readNBytes(buf, 0, bufSize)
       }
       fileSize
     } catch {
diff --git a/scalastyle-config.xml b/scalastyle-config.xml
index e6d0007cae48..42e3913f10d6 100644
--- a/scalastyle-config.xml
+++ b/scalastyle-config.xml
@@ -732,6 +732,11 @@ This file is divided into 3 sections:
     <customMessage>Use Java `write` instead.</customMessage>
   </check>
 
+  <check customId="bytestreamsread" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
+    <parameters><parameter name="regex">\bByteStreams\.read\b</parameter></parameters>
+    <customMessage>Use Java readNBytes instead.</customMessage>
+  </check>
+
   <check customId="bytestreamscopy" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
     <parameters><parameter name="regex">\bByteStreams\.copy\b</parameter></parameters>
     <customMessage>Use Java transferTo instead.</customMessage>
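
The regex in the new `bytestreamsread` check can be tried on its own to see which lines it would flag. The sketch below only exercises the pattern with Scala's standard regex support. It does not run Scalastyle, and the object `ByteStreamsReadRuleSketch` and helper `violates` are hypothetical names. How `RegexChecker` applies the pattern to a file is not shown in this commit, so treat the line-by-line matching here as an assumption.

```scala
object ByteStreamsReadRuleSketch {
  // Same pattern as the `bytestreamsread` check in scalastyle-config.xml.
  private val rule = """\bByteStreams\.read\b""".r

  // Hypothetical helper: true if the pattern occurs anywhere in the line.
  def violates(line: String): Boolean = rule.findFirstIn(line).isDefined

  def main(args: Array[String]): Unit = {
    // The removed call matches the pattern.
    println(violates("var numBytes = ByteStreams.read(gzInputStream, buf, 0, bufSize)")) // true
    // The replacement call does not.
    println(violates("var numBytes = gzInputStream.readNBytes(buf, 0, bufSize)")) // false
    // The trailing \b keeps longer method names such as readFully from matching.
    println(violates("ByteStreams.readFully(in, buf)")) // false
  }
}
```

The trailing `\b` is what limits the rule to `ByteStreams.read` itself, since there is no word boundary between `read` and a following letter.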


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
