I don't agree. I believe you can track the iterators/iterables that are
created and freed by using weak references and reference queues (or other
methods). Having a few people work 10x as hard to provide a good
implementation is much better then having 100s or 1000s of users suffering
through a more complicated API.

On Thu, May 10, 2018 at 3:44 PM Xinyu Liu <xinyuliu...@gmail.com> wrote:

> Load/evict blocks will help reduce the cache memory footprint, but we
> still won't be able to release the underlying resources. We can add
> definitely heuristics to help release the resources as you mentioned, but
> there is no accurate way to track all the iterators/iterables created and
> free them up once not needed. I think while the API is aimed at nice user
> experience, we should have the option to let users optimize their
> performance if they choose to. Do you agree?
>
> Thanks,
> Xinyu
>
> On Thu, May 10, 2018 at 3:25 PM, Lukasz Cwik <lc...@google.com> wrote:
>
>> Users won't reliably close/release the resources and forcing them to will
>> make the user experience worse.
>> It will make a lot more sense to use a file format which allows random
>> access and use a cache to load/evict blocks of the state from memory.
>> If that is not possible, use an iterable which frees the resource after a
>> certain amount of inactivity or uses weak references.
>>
>> On Thu, May 10, 2018 at 3:07 PM Xinyu Liu <xinyuliu...@gmail.com> wrote:
>>
>>> Hi, folks,
>>>
>>> I'm in the middle of implementing the MapState and SetState in our Samza
>>> runner. We noticed that the state returns the Java Iterable for reading
>>> entries, keys, etc. For state backed by file-based kv store like rocksDb,
>>> we need to be able to let users explicitly close iterator/iterable to
>>> release the resources.Otherwise we have to load the iterable into memory so
>>> we can safely close the underlying rocksDb iterator, similar to Flink's
>>> implementation. But this won't work for states that don't fit into
>>> memory. I chatted with Kenn and he also agrees we need this capability to
>>> avoid bulk read/write. This seems to be a general use case and I'm
>>> wondering if we can add the support to it? I am happy to contribute to this
>>> if needed. Any feedback is highly appreciated.
>>>
>>> Thanks,
>>> Xinyu
>>>
>>
>

Reply via email to