josh gruenberg created KAFKA-3499:
-------------------------------------

             Summary: byte[] should not be used as Map key nor Set member
                 Key: KAFKA-3499
                 URL: https://issues.apache.org/jira/browse/KAFKA-3499
             Project: Kafka
          Issue Type: Bug
          Components: kafka streams
            Reporter: josh gruenberg


On the JVM, arrays do not override equals() or hashCode(); they inherit the
identity-based implementations from Object, which ignore array contents. As a
result, collections that rely on equals/hashCode (e.g., HashMap, HashSet, and
their variants) treat two arrays with equal contents as distinct elements.
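
The behavior described above can be reproduced with a few lines of plain
Java. This is only an illustrative sketch; the class name ArrayKeyDemo and
its helper methods are made up for this example and are not part of Kafka.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class (not part of Kafka).
class ArrayKeyDemo {

    // Adds two distinct arrays with equal contents and returns the set size.
    static int sizeAfterAddingEqualArrays() {
        Set<byte[]> set = new HashSet<>();
        set.add(new byte[] {1, 2, 3});
        set.add(new byte[] {1, 2, 3});
        return set.size();
    }

    // Looks up an equal-content (but different) array instance in the set.
    static boolean containsEqualContentArray() {
        Set<byte[]> set = new HashSet<>();
        set.add(new byte[] {1, 2, 3});
        return set.contains(new byte[] {1, 2, 3});
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        System.out.println(a.equals(b));                  // false: identity comparison
        System.out.println(Arrays.equals(a, b));          // true: content comparison
        System.out.println(sizeAfterAddingEqualArrays()); // 2: both arrays are kept
        System.out.println(containsEqualContentArray());  // false: lookup misses
    }
}
```
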

Many of the Kafka Streams internal classes currently use generic HashMaps and
Sets to manage caches and invalidation status. For example,
RocksDBStore.cacheDirtyKeys is a HashSet<K>, and in RocksDBWindowStore the
elements are constructed as RocksDBStore<byte[], byte[]>, so K is byte[].

Similarly, the MemoryLRUCache<K, RocksDBCacheEntry> internally holds a
LinkedHashMap<K,V> map and a HashSet<K> keys, and these end up holding byte[]
keys. Finally, user code may attempt to use any of these provided types with
byte[] keys, in which case lookups and removals by an equal-content key
silently miss.

Keys that are byte arrays should be wrapped in a type that incorporates the
array contents in its computation of equals/hashCode. java.nio.ByteBuffer is
one such type that could be used, but a purpose-built immutable class would
likely be a better solution.
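
A minimal sketch of such a purpose-built wrapper follows. The name BytesKey
and its methods are hypothetical and chosen only for illustration; this is
not the class Kafka ships, just one way the idea could look. It copies the
array defensively so the key cannot change after insertion, and delegates to
Arrays.equals/Arrays.hashCode for content-based comparison.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable wrapper for byte[] keys (illustration only).
final class BytesKey {
    private final byte[] bytes;
    private final int hash; // cached; safe because the contents never change

    BytesKey(byte[] bytes) {
        this.bytes = bytes.clone();              // defensive copy
        this.hash = Arrays.hashCode(this.bytes); // content-based hash
    }

    // Returns a copy so callers cannot mutate the key's contents.
    byte[] get() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BytesKey)) return false;
        return Arrays.equals(bytes, ((BytesKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    public static void main(String[] args) {
        Map<BytesKey, String> cache = new HashMap<>();
        cache.put(new BytesKey(new byte[] {1, 2, 3}), "value");
        // A different instance with equal contents finds the entry.
        System.out.println(cache.get(new BytesKey(new byte[] {1, 2, 3}))); // value

        // ByteBuffer also compares contents, but only the remaining bytes,
        // so its equality changes as the buffer's position moves.
        ByteBuffer x = ByteBuffer.wrap(new byte[] {1, 2, 3});
        ByteBuffer y = ByteBuffer.wrap(new byte[] {1, 2, 3});
        System.out.println(x.equals(y)); // true
        x.get();                         // advances x's position
        System.out.println(x.equals(y)); // false
    }
}
```

The ByteBuffer lines in main show why a dedicated immutable type may be
preferable: a ByteBuffer's equals/hashCode depend on its mutable position and
limit, so a buffer that is read after being used as a key can no longer be
found in the map.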





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
