[
https://issues.apache.org/jira/browse/SPARK-43361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720264#comment-17720264
]
Hudson commented on SPARK-43361:
--------------------------------
User 'justaparth' has created a pull request for this issue:
https://github.com/apache/spark/pull/41075
> Allow deserializing protobuf enum fields as integers
> ----------------------------------------------------
>
> Key: SPARK-43361
> URL: https://issues.apache.org/jira/browse/SPARK-43361
> Project: Spark
> Issue Type: Improvement
> Components: Protobuf
> Affects Versions: 3.4.0
> Reporter: Parth Upadhyay
> Priority: Major
>
> When deserializing protobuf enum fields, the spark-protobuf library
> converts them to strings holding the enum value's name from the proto
> definition. For example, take this definition:
> {code:java}
> message Person {
>   enum Job {
>     NOTHING = 0;
>     ENGINEER = 1;
>     DOCTOR = 2;
>   }
>   Job job = 1;
> }{code}
> If we have a message like:
> {code:java}
> Person(job=ENGINEER){code}
> Then the deserialized value will be:
> {code:java}
> {"job": "ENGINEER"}{code}
> However, it can be useful to deserialize the enum's integer value rather
> than its name (other major libraries offer this option). In that case the
> deserialized value would be:
> {code:java}
> {"job": 1}{code}
>
> Examples in other libraries:
> * protobuf-java-util JsonFormat:
> [https://javadoc.io/doc/com.google.protobuf/protobuf-java-util/3.10.0/com/google/protobuf/util/JsonFormat.Printer.html#printingEnumsAsInts--]
> * golang/protobuf jsonpb Marshaler:
> [https://pkg.go.dev/github.com/golang/protobuf/jsonpb#Marshaler]
> I propose extending spark-protobuf to add this functionality.
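
The two behaviours described in the issue can be illustrated with a small, library-free sketch. It does not use spark-protobuf or the protobuf runtime: the `Job` class mirrors the enum from the example above, and the function name `person_to_json` and the flag `enums_as_ints` are hypothetical names chosen only for illustration, not the option the pull request introduces.

```python
import json
from enum import IntEnum


class Job(IntEnum):
    """Stand-in for the proto enum Person.Job from the issue description."""
    NOTHING = 0
    ENGINEER = 1
    DOCTOR = 2


def person_to_json(job: Job, enums_as_ints: bool = False) -> str:
    """Render a Person-like record as JSON.

    By default the enum is emitted by name (the current behaviour described
    in the issue). With the hypothetical enums_as_ints flag set, the enum's
    integer value is emitted instead (the requested behaviour).
    """
    value = int(job) if enums_as_ints else job.name
    return json.dumps({"job": value})


if __name__ == "__main__":
    print(person_to_json(Job.ENGINEER))                      # {"job": "ENGINEER"}
    print(person_to_json(Job.ENGINEER, enums_as_ints=True))  # {"job": 1}
```

The integer form stays stable if an enum value is renamed in the proto, which is one reason a consumer might prefer it over the name.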