[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065155#comment-16065155 ]
Guozhang Wang commented on KAFKA-4750:
--------------------------------------

[~evis] Inside the RocksDB store, after serialization, if we get a "null" byte array (NOTE: it is not the "null" object that gets passed into the API) then we should always treat it as a delete call; i.e. the current implementation inside RocksDB is ok:

{code}
private void putInternal(byte[] rawKey, byte[] rawValue) {
    if (rawValue == null) {
        // a null byte array always means delete
        try {
            db.delete(wOptions, rawKey);
        } catch (RocksDBException e) {
            ...
        }
    } else {
        try {
            db.put(wOptions, rawKey, rawValue);
        } catch (RocksDBException e) {
            ...
        }
    }
}
{code}

The question is whether, at the API layer, we also want the "null" object to indicate deletion. Currently we are a bit vague on this, so I was proposing two options to make it clear (rough sketches of both are appended at the end of this message):

1) Clarify in the javadoc that a null value in {{put(key, value)}} indicates deletion; if the value is the "null" object, by-pass the serde and pass "null" bytes directly into the inner functions, and vice versa for deserialization; do not dictate to user-customized serdes how to handle null values, since we are not going to call them with null values any more.

2) Do NOT state in the javadoc that a null value in {{put(key, value)}} indicates deletion; instead, implement {{delete(key)}} directly throughout all the layers of stores rather than calling {{put(key, null)}}; recommend that user-customized serdes handle null values themselves.

I am a bit inclined towards the second option, while [~mjsax] seems to favor the first. I'd like to hear how others think.


> KeyValueIterator returns null values
> ------------------------------------
>
>                 Key: KAFKA-4750
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4750
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>            Reporter: Michal Borowiecki
>            Assignee: Evgeny Veretennikov
>              Labels: newbie
>         Attachments: DeleteTest.java
>
>
> The API for the ReadOnlyKeyValueStore.range method promises that the returned iterator will not return null values. However, after upgrading from 0.10.0.0 to 0.10.1.1 we found that null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and it bears a resemblance to the SAMZA-94 defect. The problem seems to be the same as it was there: when deleting entries with a serializer that does not return null when null is passed in, the state store doesn't actually delete that key/value pair, but the iterator will return a null value for that key.
> When I modified our serializer to return null when null is passed in, the problem went away. However, I believe this should be fixed in Kafka Streams, perhaps with a similar approach as in SAMZA-94.
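
To make the failure mode in the issue description concrete, here is a hypothetical serializer of the kind the reporter describes (the class name and behaviour are illustrative only, not taken from the attached DeleteTest.java): it never returns null bytes for a null object, which is exactly the case where {{put(key, null)}} no longer reaches RocksDB as a delete and the iterator later yields a null value for that key.

{code}
import java.nio.charset.StandardCharsets;
import java.util.Map;

import org.apache.kafka.common.serialization.Serializer;

// Hypothetical serializer that never returns null bytes.
public class NullUnsafeStringSerializer implements Serializer<String> {

    @Override
    public void configure(final Map<String, ?> configs, final boolean isKey) { }

    @Override
    public byte[] serialize(final String topic, final String data) {
        // String.valueOf(null) is the 4-character string "null", so a null
        // object is serialized to non-null bytes and the store never sees a
        // null byte array to turn into a delete.
        return String.valueOf(data).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() { }
}
{code}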
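
Going back to the two options in the comment above: a minimal sketch of option 1), assuming a hypothetical serde-aware wrapper around the raw byte store (the class and constructor below are made up for illustration, they are not the actual Kafka Streams store classes):

{code}
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.state.KeyValueStore;

// Illustrative wrapper only: the layer between the typed API and the byte store.
class SerdeAwareStore<K, V> {

    private final KeyValueStore<Bytes, byte[]> inner;  // e.g. the RocksDB-backed byte store
    private final Serde<K> keySerde;
    private final Serde<V> valueSerde;

    SerdeAwareStore(final KeyValueStore<Bytes, byte[]> inner,
                    final Serde<K> keySerde,
                    final Serde<V> valueSerde) {
        this.inner = inner;
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
    }

    public void put(final K key, final V value) {
        final Bytes rawKey = Bytes.wrap(keySerde.serializer().serialize(null, key));
        // The null object never reaches the user serde: it is mapped straight
        // to null bytes, which putInternal() interprets as a delete.
        final byte[] rawValue = value == null ? null : valueSerde.serializer().serialize(null, value);
        inner.put(rawKey, rawValue);
    }

    public V get(final K key) {
        final byte[] rawValue = inner.get(Bytes.wrap(keySerde.serializer().serialize(null, key)));
        // ... and vice versa on the read path: null bytes bypass deserialization.
        return rawValue == null ? null : valueSerde.deserializer().deserialize(null, rawValue);
    }
}
{code}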
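
Under option 2), the same hypothetical wrapper would instead expose a real {{delete(key)}} that is forwarded through every layer as a delete and never rewritten as {{put(key, null)}}:

{code}
// Added to the hypothetical wrapper above; the KeyValueStore interface already
// declares delete(key) returning the old value, so the byte store is called directly.
public V delete(final K key) {
    final Bytes rawKey = Bytes.wrap(keySerde.serializer().serialize(null, key));
    final byte[] oldRawValue = inner.delete(rawKey);
    return oldRawValue == null ? null : valueSerde.deserializer().deserialize(null, oldRawValue);
}
{code}

Under this option, {{put(key, null)}} would simply hand the null object to the user serde, which would then be responsible for handling it.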