Hi all, I have a batch application that serializes fixed-length files to Avro. The program reads a file and then spawns multiple threads that serialize the records to Avro and write them to HDFS. Profiling shows a hotspot in GenericData.getDefaultValue(Field field), because its internal cache is based on a synchronized map. Is there a reason this has to be a synchronized map? I am not getting enough write throughput, so any thoughts on how to achieve faster writes would be very helpful.
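To make the contention concrete, here is a simplified sketch of the pattern I think I am hitting. It is not Avro's actual code: the class name, the method names, and the String keys (standing in for Field) are made up for illustration. The first lookup goes through a single lock shared by every writer thread; the second is a hypothetical alternative that avoids the global lock, at the cost of holding strong references to its keys.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative only: a stand-in for a shared default-value cache.
class DefaultValueCacheSketch {

    // Every get/put on this map takes the same monitor, so all writer
    // threads serialize on it even when they only read cached values.
    static final Map<String, Object> SYNC_CACHE =
            Collections.synchronizedMap(new WeakHashMap<>());

    // Hypothetical alternative: reads of already-cached keys do not block.
    // Unlike WeakHashMap, this keeps strong references to its keys.
    static final Map<String, Object> CONCURRENT_CACHE = new ConcurrentHashMap<>();

    // Check-then-put through the synchronized wrapper (two lock acquisitions
    // on a miss, one on a hit).
    static Object lookupSynchronized(String key, Function<String, Object> loader) {
        Object value = SYNC_CACHE.get(key);
        if (value == null) {
            value = loader.apply(key);
            SYNC_CACHE.put(key, value);
        }
        return value;
    }

    // Atomic compute-if-absent without a map-wide lock.
    static Object lookupConcurrent(String key, Function<String, Object> loader) {
        return CONCURRENT_CACHE.computeIfAbsent(key, loader);
    }
}
```

Both lookups return the same values; the difference only shows up when many threads call them at once.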
Thanks
