Github user kkhatua commented on the issue:
    I chose a `TreeSet` because it is a binary tree structure that keeps up to 
the maximum permitted number of profiles sorted and in memory. 
    When Drill detects changes, 
    it will fetch all the available profiles in the PStore and reconstruct the 
tree (since the order of the profiles returned by the `FileSystem` is not 
    I tried using the `PathFilter` to fetch only new profiles, but the 
`FileSystem` cost of fetching only new profiles is the same as fetching the 
entire list! Also, some profiles might have been deleted while new ones were 
added, so a full reconstruction takes care of that scenario as well. 
    To evict, as I construct the `TreeSet`, I simply pop the oldest (by 
filename) entry. The Guava cache options don't seem to provide a way to define 
the basis on which to evict entries.
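
    As a rough illustration of the eviction described above, here is a minimal 
sketch. `BoundedProfileSet` and its methods are hypothetical names, not the 
classes in this PR, and it assumes that the lexicographically smallest filename 
is the oldest profile. If the naming scheme sorts the other way, `pollLast()` 
would be the call to use instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper (not Drill's actual class): keeps at most maxEntries
// profile filenames sorted in memory and evicts the oldest on overflow.
class BoundedProfileSet {
    private final int maxEntries;
    private final TreeSet<String> names = new TreeSet<>();

    BoundedProfileSet(int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        this.maxEntries = maxEntries;
    }

    // Add one filename; if the set grows past the cap, drop the first entry.
    // ASSUMPTION: smallest filename == oldest profile.
    void add(String fileName) {
        names.add(fileName);
        if (names.size() > maxEntries) {
            names.pollFirst();
        }
    }

    // Copy of the retained filenames, newest first under the same assumption.
    List<String> newestFirst() {
        return new ArrayList<>(names.descendingSet());
    }
}
```

    Evicting while inserting means the set never holds more than one entry 
above the cap, even when the full listing is much larger.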
    I believe @vrozov's work on DRILL-6053 addresses locking during writes 
specifically. The lock I used (and need) is for reads to ensure that multiple 
requests don't trigger an expensive FileSystem call for the same state of the 
    For example, consider T# as timestamps:
    * `currBasePathModified` = T0 
    * _ThreadA_ requests at t=T1 and issues a read-lock
    * _ThreadB_ requests at t=T2 but is waiting for read-lock
    If the `TreeSet` exists and no change is detected, _ThreadA_ will use the 
`TreeSet` contents and then release the lock. 
    If the `TreeSet` exists and a change is detected, _ThreadA_ will 
reconstruct the `TreeSet` before using its contents and it will update 
`lastBasePathModified`, before releasing the lock.
    When _ThreadB_ gets the read-lock, it discovers that the `TreeSet` was 
already updated during the wait. As of t=T2, this is the most recent snapshot, 
so _ThreadB_ uses the `TreeSet`'s contents rather than reconstructing it. Any 
reconstruction is deferred to the next request.
    We're using `lastBasePathModified` to provide pseudo-versioned access to 
the list. That means if more profiles are added *after* _ThreadB_ started 
waiting for the read-lock, that will not trigger the `FileSystem` call right 
away. 
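
    The read path described above can be sketched roughly as follows. This is 
not the PR's code: `VersionedProfileCache`, `snapshot()` and `rebuildCount()` 
are hypothetical names, the two suppliers stand in for the base path's 
modification time and the expensive `FileSystem` listing, and a plain 
`ReentrantLock` stands in for the read-lock.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: rebuild the sorted set only when the base path's
// modification time differs from the one recorded at the last rebuild.
class VersionedProfileCache {
    private final ReentrantLock lock = new ReentrantLock();
    private final LongSupplier basePathModified;         // stand-in for the directory mtime
    private final Supplier<TreeSet<String>> fullListing; // stand-in for the FileSystem call
    private long lastBasePathModified = Long.MIN_VALUE;
    private TreeSet<String> profiles = new TreeSet<>();
    private int rebuilds = 0;

    VersionedProfileCache(LongSupplier basePathModified,
                          Supplier<TreeSet<String>> fullListing) {
        this.basePathModified = basePathModified;
        this.fullListing = fullListing;
    }

    // Returns a copy of the current profile set, rebuilding it first if needed.
    TreeSet<String> snapshot() {
        lock.lock(); // a second caller blocks here while the first one rebuilds
        try {
            long currBasePathModified = basePathModified.getAsLong();
            if (currBasePathModified != lastBasePathModified) {
                profiles = fullListing.get(); // full reconstruction
                lastBasePathModified = currBasePathModified;
                rebuilds++;
            }
            return new TreeSet<>(profiles);
        } finally {
            lock.unlock();
        }
    }

    // Number of full reconstructions so far.
    int rebuildCount() {
        lock.lock();
        try {
            return rebuilds;
        } finally {
            lock.unlock();
        }
    }
}
```

    In this sketch a caller that was blocked on the lock finds 
`lastBasePathModified` already current once it gets in, so it reuses the 
rebuilt set instead of issuing a second listing.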

