rich7420 commented on code in PR #1174:
URL: https://github.com/apache/mahout/pull/1174#discussion_r2923245221


##########
qdp/qdp-python/qumat_qdp/loader.py:
##########
@@ -166,10 +166,15 @@ def source_file(self, path: str, streaming: bool = False) -> QuantumDataLoader:
 
         For streaming=True (Phase 2b), only .parquet is supported; data is read in chunks to reduce memory.
         For streaming=False, supports .parquet, .arrow, .feather, .ipc, .npy, .pt, .pth, .pb.
-        Remote paths (s3://) are supported when the remote-io feature is enabled.
+        Remote paths (s3://, gs://) are supported when the remote-io feature is enabled.
+        Remote URL query/fragment (for example ?versionId=... or #...) is not supported.
         """
         if not path or not isinstance(path, str):
             raise ValueError(f"path must be a non-empty string, got {path!r}")
+        if "://" in path and ("?" in path or "#" in path):
+            raise ValueError(
+                "Remote URL query/fragment is not supported; use plain scheme://bucket/key path."
+            )
         # For remote URLs, extract the key portion for extension checks.
         check_path = path.split("?")[0].rsplit("/", 1)[-1] if "://" in path else path

Review Comment:
   nit: Since remote URLs with a query/fragment are already rejected on L174,
the `path.split("?")[0]` here is effectively a no-op for remote URLs. Could
simplify to just `path.rsplit("/", 1)[-1]`, or add a quick comment noting it's
kept as a defensive fallback. Not a big deal either way.
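
   For illustration, a minimal standalone sketch of the simplified form. The
helper name `key_for_extension_check` is hypothetical (it is not part of
`loader.py`); it only mirrors the validation and key extraction shown in the
diff above, so treat it as an assumption about how the simplification might look:

```python
def key_for_extension_check(path: str) -> str:
    """Hypothetical helper mirroring the checks in the diff above."""
    if not path or not isinstance(path, str):
        raise ValueError(f"path must be a non-empty string, got {path!r}")
    # Remote URLs carrying a query or fragment are rejected up front.
    if "://" in path and ("?" in path or "#" in path):
        raise ValueError(
            "Remote URL query/fragment is not supported; "
            "use plain scheme://bucket/key path."
        )
    # A remote path that reaches this point contains no "?", so
    # split("?")[0] would return it unchanged; rsplit alone is enough.
    return path.rsplit("/", 1)[-1] if "://" in path else path
```

With this shape, `key_for_extension_check("s3://bucket/dir/data.parquet")`
yields the key's final segment, local paths are returned unchanged, and a URL
such as `s3://bucket/key.parquet?versionId=1` raises `ValueError` before the
extension check is reached.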



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
