gszadovszky commented on issue #3006: URL: https://github.com/apache/parquet-java/issues/3006#issuecomment-2748534423
Thanks @kenwenzel for the clarification. It seems your use case is more "black box"-like, meaning you're not using parquet-java through the Java API, but only transitively via Spark. So you would benefit more from the configurable approach.

What I would like to avoid is implementing caching logic that, even if configurable, is useful for only one specific scenario. I guess we are talking about a local, in-memory cache here. What about distributed caching? Maybe spilling? I think I would push this request back a bit toward the original idea of making parquet-java extensible instead of adding an actual caching implementation. We may still implement some configuration through which users can hook their own implementation into parquet-java.

@wgtmac, WDYT?
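
To make the "hook" idea a bit more concrete, here is a rough sketch of what such an extension point might look like. Everything in it is hypothetical: `MetadataCache`, `NoOpCache`, `InMemoryCache`, `CacheLoader` and the `example.cache.impl` key are illustrative names only and are not part of parquet-java. The assumption is that the library would ship only the interface and a no-op default, while a local, distributed or spilling cache would be supplied by the user and selected by class name in the configuration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical extension point; NOT an existing parquet-java interface.
interface MetadataCache<K, V> {
    // Returns the cached value for the key, calling the loader on a miss.
    V get(K key, Function<K, V> loader);
}

// Default behaviour: no caching, every call goes to the loader.
class NoOpCache<K, V> implements MetadataCache<K, V> {
    public NoOpCache() {}

    @Override
    public V get(K key, Function<K, V> loader) {
        return loader.apply(key);
    }
}

// Example of a user-supplied implementation: a local, unbounded in-memory cache.
// A distributed or spilling cache would implement the same interface.
class InMemoryCache<K, V> implements MetadataCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    public InMemoryCache() {}

    @Override
    public V get(K key, Function<K, V> loader) {
        return entries.computeIfAbsent(key, loader);
    }
}

final class CacheLoader {
    // Hypothetical configuration key holding the implementation class name.
    static final String IMPL_KEY = "example.cache.impl";

    private CacheLoader() {}

    // Instantiates the configured implementation via its no-arg constructor,
    // falling back to the no-op default when the key is not set.
    static <K, V> MetadataCache<K, V> fromConfig(Map<String, String> conf) {
        String className = conf.get(IMPL_KEY);
        if (className == null) {
            return new NoOpCache<>();
        }
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            @SuppressWarnings("unchecked")
            MetadataCache<K, V> cache = (MetadataCache<K, V>) instance;
            return cache;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate cache: " + className, e);
        }
    }
}
```

With something along these lines, a "black box" user going through Spark would only need to set one configuration value and put their implementation on the classpath, and the caching policy itself would stay outside parquet-java.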
