[
https://issues.apache.org/jira/browse/AVRO-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553598#comment-17553598
]
tansion edited comment on AVRO-3531 at 6/13/22 1:59 PM:
--------------------------------------------------------
And it is true that the issue is not easy to reproduce (the same code had been
running well for more than two years before I hit this issue).
java.util.IdentityHashMap#put
{code:java}
/**
 * Returns index for Object x.
 */
private static int hash(Object x, int length) {
    int h = System.identityHashCode(x);
    // Multiply by -127, and left-shift to use least bit as part of hash
    return ((h << 1) - (h << 8)) & (length - 1);
}

/**
 * Circularly traverses table of size len.
 */
private static int nextKeyIndex(int i, int len) {
    return (i + 2 < len ? i + 2 : 0);
}

public V put(K key, V value) {
    final Object k = maskNull(key);

    retryAfterResize: for (;;) {
        final Object[] tab = table;
        final int len = tab.length;
        int i = hash(k, len);

        for (Object item; (item = tab[i]) != null;
             i = nextKeyIndex(i, len)) {
            if (item == k) {
                @SuppressWarnings("unchecked")
                V oldValue = (V) tab[i + 1];
                tab[i + 1] = value;
                return oldValue;
            }
        }

        final int s = size + 1;
        // Use optimized form of 3 * s.
        // Next capacity is len, 2 * current capacity.
        if (s + (s << 1) > len && resize(len))
            continue retryAfterResize;

        modCount++;
        tab[i] = k;
        tab[i + 1] = value;
        size = s;
        return null;
    }
}
{code}
The default capacity is 32 key/value pairs, backed by a table of length 64
(key and value slots interleave). put() resizes once 3 * size exceeds the
table length, i.e. on roughly the 22nd insertion (about 2/3 of 32). But size
and the table are updated without any synchronization, so when around that
many distinct keys are inserted concurrently and their identity hashes spread
across different slots (like 0~31), lost updates can cause the resize to be
skipped while every slot becomes occupied.
Once no slot is left null, the probe loop in get() has no terminating
condition for an absent key, so every subsequent lookup spins forever.
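A minimal sketch of the hazard (my own test harness, not from the Avro code
base; the race is timing-dependent, so it illustrates the misuse pattern
rather than guaranteeing a hang on any given run):
{code:java}
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

public class IdentityHashMapRace {
    // Shared, unsynchronized map -- the same misuse pattern as
    // GenericDatumReader.stringClassCache.
    static final Map<Object, Boolean> CACHE = new IdentityHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        final int threads = 30;
        final CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    return;
                }
                // Distinct keys inserted concurrently; lost updates to
                // size can defeat the 3 * size > table-length resize
                // check, leaving every slot occupied.
                for (int i = 0; i < 32; i++) {
                    CACHE.put(new Object(), Boolean.TRUE);
                    // Once the table is completely full, a lookup that
                    // misses never finds a null slot and spins forever.
                    CACHE.get(new Object());
                }
            }).start();
        }
        start.countDown(); // release all threads at once
    }
}
{code}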
> GenericDatumReader used from multiple threads can enter an infinite loop due
> to misuse of IdentityHashMap
> ----------------------------------------------------------------------------------------
>
> Key: AVRO-3531
> URL: https://issues.apache.org/jira/browse/AVRO-3531
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.11.0
> Reporter: tansion
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hi,
> I am working on a Java project that uses Kafka with Avro
> serialization/deserialization in a messaging platform.
> In our production environment, we hit a serious issue in the deserialization
> process. The GenericDatumReader somehow gets into an infinite loop, and it
> happens only occasionally.
> When the issue happens, the thread stack looks like this:
>
> {code:java}
> "DmqFixedRateConsumer-Thread-17" #453 daemon prio=5 os_prio=0 tid=0x00007f2ae1832800 nid=0xef49 runnable [0x00007f2a743fc000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.IdentityHashMap.get(IdentityHashMap.java:337)
>     at org.apache.avro.generic.GenericDatumReader.getStringClass(GenericDatumReader.java:503)
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:454)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:191)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:187)
>     at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:291)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>     at com.xxx.xxx.xxx.xxx.xxx.XXX.deserialize(XXX.java:252)
>     at com.xxx.xxx.xxx.xxx.xxx.ZZZ.deserialize(ZZZ.java:216)
>     at com.xxx.xxx.xxx.xxx.xxx.SSS.processMessage(SSS.java:152)
>     at com.xxx.xxx.xxx.xxx.xxx.SSS.loopProcess(SSS.java:127)
>     at com.xxx.xxx.xxx.xxx.xxx.SSS$$Lambda$172/367082698.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748) {code}
> We created 30 threads, and every one of them shows the same stack as above!
> They are all stuck in the IdentityHashMap.get() method.
>
> According to this mail thread [1.7.6 Slow
> Deserialization|https://www.mail-archive.com/[email protected]/msg02902.html],
> the Reader is supposed to be thread-safe. In practice, it appears it is not.
> Why?
> org.apache.avro.generic.GenericDatumReader#getStringClass
>
> {code:java}
> /**
>  * Called to read strings. Subclasses may override to use a different string
>  * representation. By default, this calls {@link #readString(Object,Decoder)}.
>  */
> protected Object readString(Object old, Schema expected, Decoder in) throws IOException {
>   Class stringClass = getStringClass(expected);
>   if (stringClass == String.class) {
>     return in.readString();
>   }
>   if (stringClass == CharSequence.class) {
>     return readString(old, in);
>   }
>   return newInstanceFromString(stringClass, in.readString());
> }
>
> private Map<Schema, Class> stringClassCache = new IdentityHashMap<>();
>
> private Class getStringClass(Schema s) {
>   Class c = stringClassCache.get(s);
>   if (c == null) {
>     c = findStringClass(s);
>     stringClassCache.put(s, c);
>   }
>   return c;
> }
> {code}
> IdentityHashMap is not thread-safe, and its javadoc states this clearly!
> Just like the well-known HashMap infinite-loop problem under concurrent use,
> the same failure mode applies to IdentityHashMap: getStringClass() can call
> put() on the unsynchronized stringClassCache from several threads at once.
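> For illustration, a sketch of how the cache could be made safe (my own
> example, not an actual Avro patch; it assumes findStringClass(Schema) has no
> side effects):
> {code:java}
> import java.util.Collections;
> import java.util.IdentityHashMap;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentMap;
>
> // Option 1: keep reference-identity semantics, add mutual exclusion.
> private final Map<Schema, Class> stringClassCache =
>     Collections.synchronizedMap(new IdentityHashMap<>());
>
> // Option 2: lock-free reads via ConcurrentHashMap. Note it keys by
> // equals()/hashCode() rather than reference identity, so equal-but-distinct
> // Schema instances would share one cache entry.
> private final ConcurrentMap<Schema, Class> stringClassCache2 = new ConcurrentHashMap<>();
>
> private Class getStringClass(Schema s) {
>   // computeIfAbsent is atomic per key, so findStringClass runs at most
>   // once per schema even under concurrent calls.
>   return stringClassCache2.computeIfAbsent(s, this::findStringClass);
> }
> {code}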
> My question is: can GenericDatumReader be fixed so that it is genuinely
> thread-safe? Or do we need to avoid sharing a single GenericDatumReader
> instance across threads (see the sketch below)?
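> In the meantime, a minimal application-side workaround sketch (my own
> illustration; the PerThreadReader class and its shape are hypothetical):
> give each thread its own reader via ThreadLocal so the internal caches are
> never shared:
> {code:java}
> import java.io.IOException;
> import org.apache.avro.Schema;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.DecoderFactory;
>
> public class PerThreadReader {
>   // One reader per thread: the reader's internal IdentityHashMap caches
>   // are confined to a single thread and can never race.
>   private final ThreadLocal<GenericDatumReader<GenericRecord>> reader;
>
>   public PerThreadReader(Schema schema) {
>     this.reader = ThreadLocal.withInitial(() -> new GenericDatumReader<>(schema));
>   }
>
>   public GenericRecord deserialize(byte[] payload) throws IOException {
>     Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
>     return reader.get().read(null, decoder);
>   }
> }
> {code}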
> Thanks a lot,
> Xtsong.
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)