400Ping commented on code in PR #1174:
URL: https://github.com/apache/mahout/pull/1174#discussion_r2923291289
##########
qdp/qdp-python/qumat_qdp/loader.py:
##########
@@ -166,10 +166,15 @@ def source_file(self, path: str, streaming: bool = False) -> QuantumDataLoader:
         For streaming=True (Phase 2b), only .parquet is supported; data is read in chunks to reduce memory.
         For streaming=False, supports .parquet, .arrow, .feather, .ipc, .npy, .pt, .pth, .pb.
-        Remote paths (s3://) are supported when the remote-io feature is enabled.
+        Remote paths (s3://, gs://) are supported when the remote-io feature is enabled.
+        Remote URL query/fragment (for example ?versionId=... or #...) is not supported.
         """
         if not path or not isinstance(path, str):
             raise ValueError(f"path must be a non-empty string, got {path!r}")
+        if "://" in path and ("?" in path or "#" in path):
+            raise ValueError(
+                "Remote URL query/fragment is not supported; use plain scheme://bucket/key path."
+            )
         # For remote URLs, extract the key portion for extension checks.
         check_path = path.split("?")[0].rsplit("/", 1)[-1] if "://" in path else path
Review Comment:
For reference, see https://github.com/ray-project/ray/pull/61376.
This is a real problem that has come up in Ray Data.
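
To make the concern concrete, here is a minimal sketch of why a query string matters for the extension check. It is an illustration only, not code from this PR or from Ray: `reject_query_or_fragment` and `naive_extension` are hypothetical helper names, and the first one simply mirrors the guard added in the hunk above.

```python
def reject_query_or_fragment(path: str) -> str:
    """Hypothetical helper mirroring the guard added in this hunk."""
    if not path or not isinstance(path, str):
        raise ValueError(f"path must be a non-empty string, got {path!r}")
    # Only remote URLs (anything containing "://") are checked;
    # local paths may still contain "?" or "#".
    if "://" in path and ("?" in path or "#" in path):
        raise ValueError(
            "Remote URL query/fragment is not supported; "
            "use plain scheme://bucket/key path."
        )
    return path


def naive_extension(path: str) -> str:
    """Hypothetical extension lookup that does NOT strip a query string."""
    name = path.rsplit("/", 1)[-1]
    return "." + name.rsplit(".", 1)[-1] if "." in name else ""


# Without the guard, the query string leaks into the detected extension.
print(naive_extension("s3://bucket/data.parquet?versionId=abc"))
# prints ".parquet?versionId=abc"

# With the guard, the same URL is rejected up front.
try:
    reject_query_or_fragment("s3://bucket/data.parquet?versionId=abc")
except ValueError as exc:
    print(exc)
```

Failing early keeps the extension check and the actual read consistent: the loader never has to decide whether a `versionId` that was dropped for the extension check should still be honored when the object is fetched.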