[ https://issues.apache.org/jira/browse/TAJO-5?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885471#comment-13885471 ]
Henry Saputra commented on TAJO-5:
----------------------------------
[~hyunsik], since the latest Tajo does not currently use YARN as its
resource manager (RM), is this JIRA ticket still valid?
> Cache mechanism to keep instances of opened BSTIndexs in PullServerAuxService
> -----------------------------------------------------------------------------
>
> Key: TAJO-5
> URL: https://issues.apache.org/jira/browse/TAJO-5
> Project: Tajo
> Issue Type: Improvement
> Components: data shuffle
> Reporter: Hyunsik Choi
> Assignee: Henry Saputra
> Labels: newbie
>
> PullServerAuxService is a YARN auxiliary service that repartitions
> intermediate data. It is similar to the ShuffleHandler of MRv2.
> PullServerAuxService supports both hash and range repartitioning.
> It serves data through a Netty-based HTTP server.
> To retrieve range-partitioned data, PullServerAuxService uses a binary
> search tree index (BSTIndex.java). It opens the BSTIndex anew for every
> request for range-partitioned data, which may cause overhead. See
> messageReceived in PullServer and getFileChunks in PullServerAuxService.
> If PullServerAuxService used a cache that keeps instances of opened
> BSTIndex and data files, it could avoid this overhead.
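
The cache described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Tajo's code: OpenIndexCache, its opener function, and the maxOpen limit are hypothetical names, and the reader type is left generic (anything Closeable) because the actual BSTIndex reader API is not shown in this ticket. The sketch keeps at most maxOpen readers open, keyed by index file path, and closes the least recently used one on eviction.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache of opened index readers, keyed by index file path. */
class OpenIndexCache<R extends Closeable> {
    private final Function<String, R> opener;      // assumed: opens a reader for a path
    private final LinkedHashMap<String, R> readers;

    OpenIndexCache(final int maxOpen, Function<String, R> opener) {
        this.opener = opener;
        // accessOrder = true makes iteration order least-recently-used first.
        this.readers = new LinkedHashMap<String, R>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                if (size() > maxOpen) {
                    closeQuietly(eldest.getValue()); // release the evicted reader
                    return true;
                }
                return false;
            }
        };
    }

    /** Returns the cached reader for the path, opening it only on a miss. */
    synchronized R get(String path) {
        R reader = readers.get(path);
        if (reader == null) {
            reader = opener.apply(path);
            readers.put(path, reader);
        }
        return reader;
    }

    /** Number of readers currently held open. */
    synchronized int openCount() {
        return readers.size();
    }

    /** Closes every cached reader, e.g. when the service stops. */
    synchronized void closeAll() {
        for (R reader : readers.values()) {
            closeQuietly(reader);
        }
        readers.clear();
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Best effort: a failed close of an evicted reader is not fatal here.
        }
    }
}
```

One caveat with this sketch: eviction closes a reader even if another request thread is still using it, so a real implementation would probably need reference counting or a similar guard before closing.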
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)