platypii commented on code in PR #115:
URL: https://github.com/apache/parquet-site/pull/115#discussion_r2096981770

##########
content/en/docs/File Format/implementationstatus.md:
##########
@@ -118,7 +118,7 @@ Implementations:
 | Feature | arrow | parquet-java | arrow-go | arrow-rs | cudf | hyparquet | duckdb |
 | ----------------------------------------- | ----- | ------------- | -------- | -------- | ----- | --------- | ------ |
-| External column data (1) | ✅ | ✅ | ❌ | ❌ | (W) | ❌ | ❌ |
+| External column data (1) | ✅ | ✅ (*) | ❌ | ❌ | (W) | ✅ | ❌ |

Review Comment:
Sorry, that wasn't meant for this PR! I'll revert it, as it is separate from my intended update here.

Backstory: I've been implementing support for parquet `file_path` for external column data in hyparquet. The parquet-java implementation has significant limitations:

1) if any `file_path` is set, then _every_ column chunk must be external,
2) all external `file_path`s must point to the same file, and
3) even with those assumptions, I've been unable to make a single example parquet file with external column data that it _can_ read.

That might be on me; I'm still investigating.
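
To make limitations 1) and 2) from the review comment concrete, here is a minimal Python sketch of the check they imply. It only mirrors the prose: the function name `satisfies_reported_limits` and the list-of-paths input are hypothetical, and this is not parquet-java's or hyparquet's actual code. The only name taken from the comment is the `file_path` field.

```python
from typing import Optional, Sequence


def satisfies_reported_limits(file_paths: Sequence[Optional[str]]) -> bool:
    """Check limitations 1) and 2) as described in the review comment.

    `file_paths` holds one entry per column chunk: the chunk's `file_path`,
    or None when the chunk lives in the same file as the metadata.
    Hypothetical helper; it is an assumption about how the reported
    constraints combine, not a port of any real validation logic.
    """
    external = [p for p in file_paths if p is not None]
    if not external:
        # No external column data at all, so neither limitation applies.
        return True
    # 1) If any file_path is set, every column chunk must be external.
    if len(external) != len(file_paths):
        return False
    # 2) All external file_paths must refer to the same file.
    return len(set(external)) == 1
```

Under these assumptions, a file that mixes internal and external column chunks, or one whose chunks point to two different external files, would fail the check. Limitation 3) is an open observation and is not modeled here.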
