Navneet,
Thank you for the warning. A cursory search didn't turn up any patent on
the algorithms. As this paper is 14 years old I'm not sure if contacting
the original authors would be fruitful, but unless there's a better way
to determine this I'll have to try if this is an approach I consider
further.
Harshad,
You're correct, it might be an easier initial approach to improve our
existing algorithms and attempt to parallelize serial parts instead of
rip the system out and try again with a far more complex algorithm. I'll
look at the current LRU-k approach we take and see if any
parallelization is possible, or working with mutexes could be improved.
As for the concerns, I do not believe that the current LRU-k
implementation notes block size or temporary/permanent blocks. It seems
strictly based on time. Evicting hash tables would take new structures
build to detect, evaluate, and evict said tables I believe. It's
something to consider but attention might be better spent on the mutex
situation and general algorithm efficiency.
On 4/27/18 2:39 PM, Navneet Potti wrote:
Caution! That paper is from IBM. A lot of the caching algorithms developed there
(including one <https://dl.acm.org/citation.cfm?id=2987553> I worked on, which
handles the problem of blocks being of different sizes) come with patent restrictions.
On Apr 27, 2018, at 2:21 PM, Harshad Deshmukh <hars...@cs.wisc.edu> wrote:
Hi Dylan,
That's a good find! Thanks for sharing the paper.
I have dealt with the eviction policy code a little bit. Would you mind
describing the issues first? As far as I know, the referencing and
de-referencing of blocks is serial and sometimes causes performance
bottlenecks. We may need to analyze if the bottlenecks are due to the eviction
algorithm or synchronization involved in the eviction policy code. I tried the
following fix
https://github.com/hbdeshmukh/incubator-quickstep/commit/4b32336a1b89f70d1c17ffe7956ae6f78e691d7c
to make the synchronization a bit fine grained and it helped a bit. Please
feel free to adopt this fix.
As a separate discussion, we have a complicated buffer management problem for
following reasons. I am not sure if the current LRU-k eviction implementation
addresses these concerns:
1. Blocks could be of different sizes and have different lifetimes (stored
vs temporary). Therefore the consequences of evicting different blocks could be
different.
2. We currently do not evict join hash tables. When memory is really scarce,
the buffer pool should evict hash tables.
How much attention should we pay to buffer management is more of a strategic
decision, considering that our focus is on in-memory settings.
Thanks,
Harshad
________________________________
From: Dylan Bacon <dba...@wisc.edu>
Sent: Friday, April 27, 2018 2:07:14 PM
To: dev@quickstep.incubator.apache.org
Subject: Eviction Policy Algorithm
Attached screenshot of an algorithm that seems worthwhile to implement
in Quickstep for an eviction policy. The language and structure would
need to be adapted from cache pages to our database blocks, I'm not sure
how straightforward that would be. Sending this to the dev email for
broader discussion as I initially sent to individuals. Thoughts and
input are welcome.
Source paper:
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C660AD720FD52C67A063B270F4B6DDF4?doi=10.1.1.105.6057&rep=rep1&type=pdf
--
Regards,
Dylan Bacon
University of Wisconsin - Madison
Department of Computer Sciences
dba...@wisc.edu