There are few race condition because of the way this class is implemented. 
Mainly the `RuntimeFilterSink` can be accessed in context of Netty thread and 
FragmentExecutor thread. Netty's thread will just add each received 
`RuntimeFilterWritable` into the queue and be done with it. 
The race condition mainly appears w.r.t `AsyncAggregateWorker` thread and 
`FragmentExecutor` thread where async thread might be updating the shared 
`aggregated` instance and fragment executor thread will be using the same 
instance thinking it's the older filter (specifically underlying bloomfilter 
DrillBuff). Also during `close()` there can be issues like async thread might 
have just received another runtimeFilter and `close()` will then update the 
running state and close `aggregated` instance and thinks queue is empty. 
Whereas async thread can then try to `aggregate` the received runtimeFilter.

Please define a clean contract for this class. Few things to consider:

- Async aggregated thread can be started during creation of RuntimeFilterSink
- Consider using `BlockingQueue` since async thread should block until next 
item becomes available rather than just spinning based on a state.
- access to shared resource `RuntimeFilterWritable aggregated` needs to be 
protected by a lock.
- async thread to check for `running` state before aggregating and after 
retrieving an element from the queue. In case of running state set to false 
should `clear` the polled element.
- This class should just return bloom filter list and fieldList rather than 
entire aggregated `RuntimeFilterWritable` since that can be modified by caller 
as it exposes setter methods.

[ Full content available at: https://github.com/apache/drill/pull/1459 ]
This message was relayed via gitbox.apache.org for [email protected]

Reply via email to