(spark) branch master updated: [SPARK-55801][PYTHON] Fix type hint of _SimpleStreamReaderWrapper.getCache

ruifengz Mon, 02 Mar 2026 18:58:55 -0800

This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new be10dc03bbb0 [SPARK-55801][PYTHON] Fix type hint of _SimpleStreamReaderWrapper.getCache
be10dc03bbb0 is described below

commit be10dc03bbb0da40f0d5b3058c8907fdffaac41a
Author: Tian Gao <[email protected]>
AuthorDate: Tue Mar 3 10:58:23 2026 +0800

    [SPARK-55801][PYTHON] Fix type hint of _SimpleStreamReaderWrapper.getCache
    
    ### What changes were proposed in this pull request?
    
    Add `None` to the return type hint of the `getCache` method.
    
    ### Why are the changes needed?
    
    The `getCache` method can return `None` but the type hint says otherwise. This confuses the type checker.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    CI.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #54582 from gaogaotiantian/get-cache-type-hint.
    
    Authored-by: Tian Gao <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 python/pyspark/sql/datasource_internal.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/pyspark/sql/datasource_internal.py b/python/pyspark/sql/datasource_internal.py
index 2ac6c280e822..34039216c1d8 100644
--- a/python/pyspark/sql/datasource_internal.py
+++ b/python/pyspark/sql/datasource_internal.py
@@ -19,7 +19,7 @@
 import json
 import copy
 from itertools import chain
-from typing import Iterator, List, Sequence, Tuple, Type, Dict
+from typing import Iterator, List, Optional, Sequence, Tuple, Type, Dict
 
 from pyspark.sql.datasource import (
     DataSource,
@@ -143,7 +143,7 @@ class _SimpleStreamReaderWrapper(DataSourceStreamReader):
             assert self.cache[-1].end == end
         return [SimpleInputPartition(start, end)]
 
-    def getCache(self, start: dict, end: dict) -> Iterator[Tuple]:
+    def getCache(self, start: dict, end: dict) -> Optional[Iterator[Tuple]]:
         start_idx = -1
         end_idx = -1
         for idx, entry in enumerate(self.cache):
@@ -155,7 +155,7 @@ class _SimpleStreamReaderWrapper(DataSourceStreamReader):
                 end_idx = idx
                 break
         if start_idx == -1 or end_idx == -1:
-            return None  # type: ignore[return-value]
+            return None
         # Chain all the data iterator between start offset and end offset
         # need to copy here to avoid exhausting the original data iterator.
         entries = [copy.copy(entry.iterator) for entry in self.cache[start_idx : end_idx + 1]]
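
For readers who want to see what the annotation change means for callers, here is a minimal sketch. It is not the PySpark implementation: `get_cached_rows`, `read_rows`, and the `(start, end, rows)` tuple layout are hypothetical stand-ins chosen for illustration. It shows a lookup that returns `None` on a cache miss, annotated with `Optional[Iterator[Tuple]]` so that no `# type: ignore[return-value]` is needed, and a caller that narrows the result before iterating.

```python
from itertools import chain
from typing import Iterator, List, Optional, Tuple

# Hypothetical cache entry layout: (start offset, end offset, rows).
CacheEntry = Tuple[int, int, List[Tuple]]


def get_cached_rows(
    cache: List[CacheEntry], start: int, end: int
) -> Optional[Iterator[Tuple]]:
    """Return the rows cached between start and end, or None on a miss."""
    start_idx = -1
    end_idx = -1
    for idx, (entry_start, entry_end, _rows) in enumerate(cache):
        if entry_start == start:
            start_idx = idx
        if entry_end == end:
            end_idx = idx
            break
    if start_idx == -1 or end_idx == -1:
        # The Optional[...] return type makes this branch type-correct.
        return None
    selected = cache[start_idx : end_idx + 1]
    return chain.from_iterable(rows for _, _, rows in selected)


def read_rows(cache: List[CacheEntry], start: int, end: int) -> List[Tuple]:
    """Caller side: handle the None case before iterating."""
    rows = get_cached_rows(cache, start, end)
    if rows is None:
        return []
    return list(rows)


cache: List[CacheEntry] = [(0, 1, [("a",)]), (1, 2, [("b",)]), (2, 3, [("c",)])]
hit = read_rows(cache, 0, 2)
miss = read_rows(cache, 5, 6)
```

With the `Optional` annotation, a type checker can flag callers that iterate over the result without first checking for `None`, which the previous annotation together with the `type: ignore` comment would have hidden.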


