[GitHub] [parquet-mr] guillaume-fetter commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support
guillaume-fetter commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1193643505 Thank you very much! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [parquet-mr] guillaume-fetter commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support
guillaume-fetter commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1154004113 @dossett It depends on your use case. If you are running a simple program that processes data on a single host, you're fine. If you are using a distributed data processing framework (Flink, in my case), you can't pass a DynamicMessage instance from one task to another, or at least I did not find a way to make it work. For unrelated reasons, we are using the SelfDescribingMessage design pattern (https://developers.google.com/protocol-buffers/docs/techniques#self-description). A SelfDescribingMessage is a concrete message type and therefore serializable. On top of that we wrote a Parquet writer that converts the SelfDescribingMessage to a DynamicMessage and then writes it using this upgraded ProtoWriteSupport. It's clearly convoluted unless you are already using a SelfDescribingMessage or an equivalent.
[GitHub] [parquet-mr] guillaume-fetter commented on pull request #963: PARQUET-1020 Add DynamicMessage writing support
guillaume-fetter commented on PR #963: URL: https://github.com/apache/parquet-mr/pull/963#issuecomment-1153626738 Just a heads-up, because I have run into this issue: DynamicMessage is not serializable. That means this feature only covers DynamicMessage instances that stay local to one process. In my use case I have to build the DynamicMessage from another object that is serializable, and do so directly in the writer, which is a bit convoluted.
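The workaround described in these comments (ship a serializable carrier between tasks, rebuild the non-serializable message only inside the writer) can be illustrated with a minimal, JDK-only sketch. Everything in it is hypothetical: `LocalOnlyMessage` stands in for DynamicMessage, `Carrier` stands in for a SelfDescribingMessage-like object, and plain Java serialization stands in for whatever mechanism the processing framework uses between tasks. It does not use the protobuf or parquet-mr APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SerializableCarrierSketch {

    // Hypothetical stand-in for DynamicMessage: deliberately NOT Serializable.
    static final class LocalOnlyMessage {
        final String schema;
        final byte[] payload;

        LocalOnlyMessage(String schema, byte[] payload) {
            this.schema = schema;
            this.payload = payload;
        }
    }

    // Hypothetical stand-in for a SelfDescribingMessage-like carrier:
    // schema and payload are held as plain serializable fields.
    static final class Carrier implements Serializable {
        private static final long serialVersionUID = 1L;
        final String schema;
        final byte[] payload;

        Carrier(String schema, byte[] payload) {
            this.schema = schema;
            this.payload = payload;
        }

        // Called on the writer side only, after the carrier has been shipped.
        LocalOnlyMessage toLocalMessage() {
            return new LocalOnlyMessage(schema, payload.clone());
        }
    }

    static byte[] javaSerialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object javaDeserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // True if the object survives Java serialization, false if it is rejected.
    static boolean canShip(Object o) {
        try {
            javaSerialize(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Carrier sent = new Carrier("demo.Event", "hello".getBytes(StandardCharsets.UTF_8));

        // "Task boundary": the carrier is serialized, shipped, and deserialized.
        Carrier received = (Carrier) javaDeserialize(javaSerialize(sent));

        // Writer side: rebuild the local-only message just before writing.
        LocalOnlyMessage message = received.toLocalMessage();
        System.out.println(message.schema + " " + new String(message.payload, StandardCharsets.UTF_8));

        System.out.println("carrier can be shipped: " + canShip(sent));
        System.out.println("local message can be shipped: " + canShip(message));
    }
}
```

In a real pipeline the rebuild step would presumably turn the carried descriptor and payload into a DynamicMessage inside the Parquet writer, as the comments above describe; the sketch only shows why the conversion has to happen on the writer side.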