This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0-preview
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0-preview by this push:
     new 931cc1b  [SPARK-27259][CORE] Allow setting -1 as length for FileBlock
931cc1b is described below

commit 931cc1ba068f64a835264c1c8fc3431ecd4e31a0
Author: prasha2 <[email protected]>
AuthorDate: Tue Oct 15 22:22:37 2019 -0700

    [SPARK-27259][CORE] Allow setting -1 as length for FileBlock
    
    ### What changes were proposed in this pull request?
    
    This PR aims to update the validation check on `length` from `length >= 0` to `length >= -1` in order to allow setting `-1` to keep the default value.
    
    ### Why are the changes needed?
    
    In Apache Spark 2.2.0, [SPARK-18702](https://github.com/apache/spark/pull/16133/files#diff-2c5519b1cf4308d77d6f12212971544fR27-R38) added `class FileBlock` with `-1` as the initial default value for `length`.
    
    There is no way to set `filePath` only while keeping `length` as `-1`.
    ```scala
      def set(filePath: String, startOffset: Long, length: Long): Unit = {
        require(filePath != null, "filePath cannot be null")
        require(startOffset >= 0, s"startOffset ($startOffset) cannot be negative")
        require(length >= 0, s"length ($length) cannot be negative")
        inputBlock.set(new FileBlock(UTF8String.fromString(filePath), startOffset, length))
      }
    ```
    
    For compressed files (like GZ), the size of a split can be set to -1. This was allowed until Spark 2.1 but regressed starting with Spark 2.2.x. Please note that a split length of -1 also means the length was unknown, which is a valid scenario. Thus, a split length of -1 should be acceptable, as it was before Spark 2.2.
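    
    The relaxed check can be illustrated with a standalone sketch (the `LengthCheckSketch` object and `validate` helper below are hypothetical, not part of Spark) showing which values pass after this change:
    
    ```scala
    // Minimal sketch of the relaxed validation (hypothetical helper, not Spark API).
    // A length of -1 means the split length is unknown, e.g. for a compressed (GZ) file.
    object LengthCheckSketch {
      def validate(length: Long): Unit =
        require(length >= -1, s"length ($length) cannot be smaller than -1")

      def main(args: Array[String]): Unit = {
        validate(1024L) // normal positive length: accepted
        validate(-1L)   // unknown length: accepted after this change
        val rejected =
          try { validate(-2L); false }
          catch { case _: IllegalArgumentException => true } // require throws on failure
        println(s"-2 rejected: $rejected")
      }
    }
    ```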
    
    ### Does this PR introduce any user-facing change?
    
    No
    
    ### How was this patch tested?
    
    This updates a corner case of the requirement check. Manually checked the code.
    
    Closes #26123 from praneetsharma/fix-SPARK-27259.
    
    Authored-by: prasha2 <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
    (cherry picked from commit 57edb4258254fa582f8aae6bfd8bed1069e8155c)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 core/src/main/scala/org/apache/spark/rdd/InputFileBlockHolder.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/rdd/InputFileBlockHolder.scala b/core/src/main/scala/org/apache/spark/rdd/InputFileBlockHolder.scala
index bfe8152..1beb085 100644
--- a/core/src/main/scala/org/apache/spark/rdd/InputFileBlockHolder.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/InputFileBlockHolder.scala
@@ -76,7 +76,7 @@ private[spark] object InputFileBlockHolder {
   def set(filePath: String, startOffset: Long, length: Long): Unit = {
     require(filePath != null, "filePath cannot be null")
     require(startOffset >= 0, s"startOffset ($startOffset) cannot be negative")
-    require(length >= 0, s"length ($length) cannot be negative")
+    require(length >= -1, s"length ($length) cannot be smaller than -1")
     inputBlock.get().set(new FileBlock(UTF8String.fromString(filePath), startOffset, length))
   }
 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
