Loading/evicting blocks will help reduce the cache's memory footprint, but we
still won't be able to release the underlying resources. We can definitely add
heuristics to help release the resources, as you mentioned, but there is no
accurate way to track all the iterators/iterables created and free them
once they are no longer needed. While the API is aimed at a nice user
experience, I think we should give users the option to optimize their
performance if they choose to. Do you agree?
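
To make the opt-in idea concrete, here is a minimal sketch of an iterator that
users can close explicitly. The names CloseableIterator, StoreIterator and
sumFirst are hypothetical and are not part of Beam's state API; the in-memory
delegate only stands in for a store-backed iterator such as a rocksDb one.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical interface (an assumption, not an existing Beam type):
// an Iterator that can also be closed to release its backing resources.
interface CloseableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close(); // narrowed so callers need not handle a checked exception
}

// Stand-in for a store-backed iterator. A real implementation would wrap
// the store's native iterator instead of an in-memory one.
final class StoreIterator<T> implements CloseableIterator<T> {
    private final Iterator<T> delegate;
    private boolean closed = false;

    StoreIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override
    public T next() {
        if (closed) {
            throw new NoSuchElementException("iterator is closed");
        }
        return delegate.next();
    }

    @Override
    public void close() {
        // A real implementation would release the native handle here.
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class CloseableIteratorExample {
    // Reads at most `limit` entries, then releases the iterator even though
    // it was not fully consumed.
    static int sumFirst(StoreIterator<Integer> it, int limit) {
        int sum = 0;
        try (StoreIterator<Integer> entries = it) {
            for (int i = 0; i < limit && entries.hasNext(); i++) {
                sum += entries.next();
            }
        }
        return sum;
    }
}
```

Users who ignore close() would still get a plain Iterator, so this only adds
an option for those who want deterministic release.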

Thanks,
Xinyu

On Thu, May 10, 2018 at 3:25 PM, Lukasz Cwik <lc...@google.com> wrote:

> Users won't reliably close/release the resources and forcing them to will
> make the user experience worse.
> It will make a lot more sense to use a file format which allows random
> access and use a cache to load/evict blocks of the state from memory.
> If that is not possible, use an iterable which frees the resource after a
> certain amount of inactivity or uses weak references.
>
> On Thu, May 10, 2018 at 3:07 PM Xinyu Liu <xinyuliu...@gmail.com> wrote:
>
>> Hi, folks,
>>
>> I'm in the middle of implementing the MapState and SetState in our Samza
>> runner. We noticed that the state returns the Java Iterable for reading
>> entries, keys, etc. For state backed by file-based kv store like rocksDb,
>> we need to be able to let users explicitly close the iterator/iterable to
>> release the resources. Otherwise we have to load the iterable into memory so
>> we can safely close the underlying rocksDb iterator, similar to Flink's
>> implementation. But this won't work for states that don't fit into
>> memory. I chatted with Kenn and he also agrees we need this capability to
>> avoid bulk read/write. This seems to be a general use case and I'm
>> wondering if we can add support for it? I am happy to contribute to this
>> if needed. Any feedback is highly appreciated.
>>
>> Thanks,
>> Xinyu
>>
>
