[
https://issues.apache.org/jira/browse/AVRO-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399160#comment-15399160
]
Michal Turek commented on AVRO-1827:
------------------------------------
Hi, here is a script that forks avro-protobuf and applies the patch from this
issue. I don't have permission to attach it as a file here.
{noformat}
#!/bin/bash -ex
# Fork Avro to fix https://issues.apache.org/jira/browse/AVRO-1827.
# Release tag or branch to apply the patch to, and the version of the patch to
# apply.
#BRANCH_TAG=release-1.7.7
#PATCH=https://issues.apache.org/jira/secure/attachment/12799258/AVRO-1827.patch
BRANCH_TAG=release-1.8.1
PATCH=https://issues.apache.org/jira/secure/attachment/12820911/AVRO-1827.patch
git clone https://github.com/apache/avro.git
pushd avro
git checkout "${BRANCH_TAG}"
pushd lang/java/protobuf/src/
wget "${PATCH}"
dos2unix -c mac AVRO-1827.patch
patch -p0 < AVRO-1827.patch
sed -i 's%<artifactId>avro-protobuf</artifactId>%<artifactId>avro-protobuf-AVRO-1827</artifactId>%' ../pom.xml
popd
mvn clean package
echo
echo 'Result package is stored at:'
find . -name '*.jar' | grep AVRO-1827
popd
# mvn install:install-file \
#   -Dfile=avro/lang/java/protobuf/target/avro-protobuf-AVRO-1827-1.8.1.jar \
#   -DgroupId=org.apache.avro -DartifactId=avro-protobuf-AVRO-1827 -Dversion=1.8.1 \
#   -Dpackaging=jar
{noformat}
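The script calls several external tools (git, wget, dos2unix, patch, sed, find, mvn) and stops at the first failure because of `-e`, so it may help to check for them before cloning. Below is a minimal sketch of such a check; the function name `missing_tools` is hypothetical and is not part of the script above.

```shell
#!/bin/bash
# Sketch only: print the name of every given command that is not on PATH.
# missing_tools is a hypothetical helper, not part of the fork script.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools invoked by the fork script.
missing="$(missing_tools git wget dos2unix patch sed find mvn)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are printed on one line.
  echo "Missing tools:" $missing >&2
fi
```

If the check prints nothing, all of the listed tools were found; otherwise install the reported ones before running the fork script.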
> Handling correctly optional fields when converting Protobuf to Avro
> -------------------------------------------------------------------
>
> Key: AVRO-1827
> URL: https://issues.apache.org/jira/browse/AVRO-1827
> Project: Avro
> Issue Type: Improvement
> Affects Versions: 1.7.7, 1.8.0
> Reporter: Jakub Kahovec
> Attachments: AVRO-1827.patch, AVRO-1827.patch, AVRO-1827.patch
>
>
> Hello,
> in the current implementation of the protobuf-to-avro conversion, optional
> protobuf fields are given default values in the avro schema when no default
> is specified explicitly.
> So for instance when the protobuf field is defined as
> {quote}
> optional int64 fieldInt64 = 1;
> {quote}
> in the avro schema it appears as
> {quote}
> "name" : "fieldInt64",
> "type" : "long",
> "default" : 0
> {quote}
> The problem with this implementation is that we lose the information about
> whether the field was present in the original protobuf: when we ask for this
> field's value in avro, we are given the default value either way.
> What I'm proposing instead is that if the protobuf field is defined as
> optional and has no default value, the generated avro schema type will use
> a union of the matching type and the null type, with default value null.
> It is going to look like this:
> {quote}
> "name" : "fieldInt64",
> "type" : [ "null", "long" ],
> "default" : null
> {quote}
> I'm aware that this is a breaking change, but I think it is the proper way to
> handle optional fields.
> I've also created a patch which fixes the conversion.
> Jakub