[
https://issues.apache.org/jira/browse/HIVE-18956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465693#comment-16465693
]
gonglinglei commented on HIVE-18956:
------------------------------------
{code:java}
@Override
public void initialize(Configuration configuration, Properties properties)
    throws SerDeException {
  ...
  if (!badSchema) {
    this.avroSerializer = new AvroSerializer();
    this.avroDeserializer = new AvroDeserializer();
  }
}
{code}
This is already fixed by
[HIVE-18410|https://issues.apache.org/jira/browse/HIVE-18410]: both
{{AvroSerializer}} and {{AvroDeserializer}} are now instantiated in {{initialize}},
so the lazy null checks in the getters are no longer needed.
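To illustrate why eager creation avoids the race, here is a minimal, self-contained sketch. It is not the Hive code: {{StubSerializer}}, {{EagerSerDe}} and {{EagerInitDemo}} are hypothetical stand-ins, and it assumes {{initialize}} runs once before the object is shared with worker threads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for AvroSerializer; not the Hive class.
class StubSerializer {}

// Hypothetical stand-in for AvroSerDe, reduced to the initialization pattern.
class EagerSerDe {
    private StubSerializer serializer;

    // Assumed to be called once, before the instance is handed to other threads.
    void initialize() {
        this.serializer = new StubSerializer();
    }

    // Only reads the field: no null check and no lazy creation to race on.
    StubSerializer getSerializer() {
        return serializer;
    }
}

class EagerInitDemo {
    // Returns how many distinct serializer instances the worker threads observed.
    static int distinctInstancesSeen(int threads) {
        EagerSerDe serDe = new EagerSerDe();
        serDe.initialize();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<StubSerializer> task = serDe::getSerializer;
            List<Future<StubSerializer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                // Submitting to the executor happens-before the task runs,
                // so each task sees the field written by initialize().
                results.add(pool.submit(task));
            }
            Set<StubSerializer> seen = new HashSet<>();
            for (Future<StubSerializer> f : results) {
                seen.add(f.get());
            }
            return seen.size();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstancesSeen(8));
    }
}
```

The sketch relies on the object being safely handed off to the workers (here via the executor); whether that holds for every execution engine is outside what this example shows.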
> AvroSerDe Race Condition
> ------------------------
>
> Key: HIVE-18956
> URL: https://issues.apache.org/jira/browse/HIVE-18956
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 2.3.2
> Reporter: BELUGA BEHR
> Priority: Trivial
>
> {code}
> @Override
> public Writable serialize(Object o, ObjectInspector objectInspector)
>     throws SerDeException {
>   if (badSchema) {
>     throw new BadSchemaException();
>   }
>   return getSerializer().serialize(o, objectInspector, columnNames,
>       columnTypes, schema);
> }
>
> @Override
> public Object deserialize(Writable writable) throws SerDeException {
>   if (badSchema) {
>     throw new BadSchemaException();
>   }
>   return getDeserializer().deserialize(columnNames, columnTypes, writable,
>       schema);
> }
> ...
> private AvroDeserializer getDeserializer() {
>   if (avroDeserializer == null) {
>     avroDeserializer = new AvroDeserializer();
>   }
>   return avroDeserializer;
> }
>
> private AvroSerializer getSerializer() {
>   if (avroSerializer == null) {
>     avroSerializer = new AvroSerializer();
>   }
>   return avroSerializer;
> }
> {code}
> The {{getDeserializer}} and {{getSerializer}} methods are not thread-safe, so
> neither are the {{deserialize}} and {{serialize}} methods. This probably didn't
> matter with MapReduce, but now that we have Spark/Tez, it may be an issue.
> For example, three threads could all enter {{getSerializer}}, all see that
> {{avroSerializer}} is _null_, and each create a new instance; they would
> then race to assign their own object to the {{avroSerializer}} field.
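
For reference, one way to make a lazy getter of this shape safe is to guard the check-then-create step with a lock. The following is only a sketch under stated assumptions: {{CountingSerializer}}, {{LazyHolder}} and {{LazyGetterDemo}} are hypothetical names, not Hive classes, and this is an alternative to, not a description of, the change made in HIVE-18410.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for AvroSerializer; counts how many instances get built.
class CountingSerializer {
    static final AtomicInteger CREATED = new AtomicInteger();

    CountingSerializer() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder with a lock-guarded lazy getter.
class LazyHolder {
    private CountingSerializer serializer; // guarded by "this"

    // synchronized makes the null check and the assignment one atomic step,
    // so only the first caller creates the instance.
    synchronized CountingSerializer getSerializer() {
        if (serializer == null) {
            serializer = new CountingSerializer();
        }
        return serializer;
    }
}

class LazyGetterDemo {
    // Starts `threads` workers that all call the getter at about the same time
    // and returns how many serializer instances were constructed.
    static int instancesCreated(int threads) {
        CountingSerializer.CREATED.set(0);
        LazyHolder holder = new LazyHolder();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line the threads up before the call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                holder.getSerializer();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CountingSerializer.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + instancesCreated(3));
    }
}
```

Removing {{synchronized}} from the getter restores the unguarded check-then-create pattern described in the issue, where more than one instance may be constructed depending on thread timing.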