> I'm using the Avro C library to serialize and de-serialize data. I'm seeing
> issues (seg-faults) when I use it with multi-threading; the same code works
> fine when I run it in single-threaded mode. To give some details about my
> code, on init I read the schema from a file and create a global
> avro_schema_t object. Multiple threads then use this global variable (schema)
> to serialize and de-serialize data. The schema itself is never modified
> during run-time in my code. From what I understood by going through the Avro
> code, I don't think the schema is modified by Avro code either during
> serializing/de-serializing. If this is in fact the case, the schema is
> essentially a read-only global and should be fine with multiple threads
> accessing it. I haven't found any documentation that claims
> that Avro C is thread-safe. It would be really helpful if someone who has
> used Avro C in a multi-threaded environment could share their experience.
> Also, let me know if what I am trying is in fact possible.

Which library version are you using?  Anything in the 1.5 branch or earlier
doesn't make any guarantees about thread safety.  A while back I checked in a
patch for AVRO-746 [1] that made the various incref and decref functions
thread-safe, but this was only applied to the Subversion HEAD, and not
back-ported to 1.5.  You're right that the contents of the schema objects
aren't modified during serialization or deserialization, but some of the helper
objects that are created do update the reference counts of any schema pointers
that they hold.  Without the AVRO-746 patch, those updates aren't atomic, so
you could easily have race conditions that would cause the schema objects to
be freed while there were still references to them.
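
To make the reference-count issue concrete, here is a minimal sketch of what a
thread-safe incref/decref pair can look like.  This is not Avro's actual code:
shared_obj, obj_incref, obj_decref and run_refcount_demo are hypothetical
names, and the sketch assumes C11 atomics and POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 16
#define ITERATIONS  100000

/* Hypothetical stand-in for a reference-counted schema; not an Avro type. */
typedef struct {
    atomic_int refcount;
    int payload;                      /* read-only after creation */
} shared_obj;

shared_obj *obj_new(int payload)
{
    shared_obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    atomic_init(&o->refcount, 1);     /* the creator owns one reference */
    o->payload = payload;
    return o;
}

shared_obj *obj_incref(shared_obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
    return o;
}

/* Drops one reference; returns 1 if this call freed the object. */
int obj_decref(shared_obj *o)
{
    int previous = atomic_fetch_sub(&o->refcount, 1);
    assert(previous >= 1);            /* catches an over-release */
    if (previous == 1) {
        free(o);
        return 1;
    }
    return 0;
}

void *refcount_worker(void *arg)
{
    shared_obj *o = arg;
    for (int i = 0; i < ITERATIONS; i++) {
        obj_incref(o);                /* like a helper object taking a reference */
        obj_decref(o);                /* ... and releasing it again */
    }
    return NULL;
}

/* Returns the reference count left once all workers have finished,
 * or -1 if the object could not be allocated. */
int run_refcount_demo(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int started = 0;
    shared_obj *o = obj_new(42);

    if (o == NULL)
        return -1;
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, refcount_worker, o) == 0)
            started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    int remaining = atomic_load(&o->refcount);
    obj_decref(o);                    /* drop the creator's reference */
    return remaining;
}
```

Because every update is atomic, run_refcount_demo(4) should come back with
exactly the creator's one reference.  With a plain int counter, concurrent
updates can be lost, and a count that reaches zero too early frees the object
while other threads still hold pointers to it.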

Can you try the latest Subversion HEAD and see if that fixes the segfaults?  
Also note that even with HEAD, it's only the incref and decref functions that 
are thread-safe.  If you're doing any updates or modifications to an Avro 
object, it should only be used within a single thread.  And if any object is 
used in multiple threads, you can only read from it.

An alternative, if you can't use HEAD, is to create a separate copy of the 
schema for each thread, using avro_schema_copy.

[1] https://issues.apache.org/jira/browse/AVRO-746

cheers
–doug
