[
https://issues.apache.org/jira/browse/HIVE-18956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Mollitor resolved HIVE-18956.
-----------------------------------
Resolution: Won't Fix
> AvroSerDe Race Condition
> ------------------------
>
> Key: HIVE-18956
> URL: https://issues.apache.org/jira/browse/HIVE-18956
> Project: Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Affects Versions: 3.0.0, 2.3.2
> Reporter: David Mollitor
> Priority: Trivial
>
> {code}
> @Override
> public Writable serialize(Object o, ObjectInspector objectInspector) throws SerDeException {
>   if (badSchema) {
>     throw new BadSchemaException();
>   }
>   return getSerializer().serialize(o, objectInspector, columnNames, columnTypes, schema);
> }
>
> @Override
> public Object deserialize(Writable writable) throws SerDeException {
>   if (badSchema) {
>     throw new BadSchemaException();
>   }
>   return getDeserializer().deserialize(columnNames, columnTypes, writable, schema);
> }
> ...
> private AvroDeserializer getDeserializer() {
>   if (avroDeserializer == null) {
>     avroDeserializer = new AvroDeserializer();
>   }
>   return avroDeserializer;
> }
>
> private AvroSerializer getSerializer() {
>   if (avroSerializer == null) {
>     avroSerializer = new AvroSerializer();
>   }
>   return avroSerializer;
> }
> {code}
> The {{getDeserializer}} and {{getSerializer}} methods lazily initialize their
> fields without any synchronization, so they are not thread safe, and neither
> are the {{deserialize}} and {{serialize}} methods that call them. This
> probably did not matter with MapReduce, but now that we have Spark/Tez, it
> may be an issue. For example, three threads could all enter
> {{getSerializer}}, all see that {{avroSerializer}} is _null_, and each create
> an instance; they would then race to assign their new object to the
> {{avroSerializer}} field.
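
One common way to close the window described in the quoted report is double-checked locking on a volatile field. The following is only a minimal sketch, not Hive code and not a fix adopted by the project (the issue was resolved as Won't Fix). FakeSerializer, SafeLazyHolder and distinctInstances are hypothetical names introduced for illustration. The sketch covers the lazy initialization only and assumes nothing about whether the real AvroSerializer or AvroDeserializer objects are themselves safe to share between threads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for AvroSerializer; it has no behavior of its own.
class FakeSerializer {
}

// Hypothetical holder showing thread-safe lazy initialization.
class SafeLazyHolder {
    // volatile so that a fully constructed instance is visible to all threads.
    private volatile FakeSerializer serializer;

    FakeSerializer getSerializer() {
        FakeSerializer local = serializer;      // single volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = serializer;             // re-check while holding the lock
                if (local == null) {
                    local = new FakeSerializer();
                    serializer = local;
                }
            }
        }
        return local;
    }

    // Calls getSerializer() from several threads released at the same moment
    // and returns how many distinct instances they observed.
    static int distinctInstances(int threads) {
        SafeLazyHolder holder = new SafeLazyHolder();
        Set<FakeSerializer> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();              // line every thread up behind the latch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(holder.getSerializer());
            });
            workers[i].start();
        }
        start.countDown();                      // release all threads together
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }
}
```

With the re-check inside the synchronized block, every caller should observe the same instance, so SafeLazyHolder.distinctInstances(3) is expected to return 1. The unsynchronized version in the report gives no such guarantee.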