[
https://issues.apache.org/jira/browse/TEZ-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621362#comment-14621362
]
Saikat commented on TEZ-2172:
-----------------------------
One solution could be to use a LinkedHashMap<InputAttemptIdentifier, Integer>.
(LinkedHashMap removes entries by key efficiently, and in our scenario each
Fetcher runs in its own thread context, so the map need not be thread safe.)
The Integer value could be a dummy field;
we would retrieve the keys and work with them.
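
A minimal sketch of that idea follows. It is not the Tez implementation:
AttemptId is a hypothetical stand-in for InputAttemptIdentifier, and
PendingAttempts, DUMMY and the method names are made up for illustration.
It only shows that a LinkedHashMap keeps insertion order while removing by
key without scanning the collection, and it assumes single-threaded use as
described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for Tez's InputAttemptIdentifier (the real class has more fields).
final class AttemptId {
    final int inputIndex;
    final int attemptNumber;

    AttemptId(int inputIndex, int attemptNumber) {
        this.inputIndex = inputIndex;
        this.attemptNumber = attemptNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AttemptId)) return false;
        AttemptId other = (AttemptId) o;
        return inputIndex == other.inputIndex && attemptNumber == other.attemptNumber;
    }

    @Override
    public int hashCode() {
        return Objects.hash(inputIndex, attemptNumber);
    }
}

// Holds the attempts one fetcher still has to process. Not thread safe by design.
final class PendingAttempts {
    // The map value is never read; it only exists because a Map needs one.
    private static final Integer DUMMY = 0;

    private final Map<AttemptId, Integer> remaining = new LinkedHashMap<>();

    void add(AttemptId id) {
        remaining.put(id, DUMMY);
    }

    // Hash lookup instead of the linear scan done by List.remove(Object).
    boolean remove(AttemptId id) {
        return remaining.remove(id) != null;
    }

    int size() {
        return remaining.size();
    }

    // Keys come back in insertion order, as they would from a List.
    List<AttemptId> inOrder() {
        return new ArrayList<>(remaining.keySet());
    }
}
```

Since the value is unused, java.util.LinkedHashSet (which is backed by a
LinkedHashMap internally) would give the same behaviour without the dummy
field. Either way the lookup relies on equals() and hashCode() of the key,
so the spillId concern raised in the issue description still applies.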
> FetcherOrderedGrouped using List to store InputAttemptIdentifier can lead to
> some inefficiency during remove() operation
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: TEZ-2172
> URL: https://issues.apache.org/jira/browse/TEZ-2172
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Saikat
>
> As part of fixing TEZ-2001, FetcherOrderedGrouped stores
> InputAttemptIdentifier in a List. This can lead to some inefficiency, since
> the size of this list can be ~30, and remove() calls can be expensive.
> Option 1: use the spillId in the hashCode, or a wrapping structure for
> just this. However, spillId cannot be added to the hashCode as it would
> break ShuffleScheduler shuffleInfoEventsMap.
> Option 2: consider using Map with an identifier.
> Need to consider other options as well. Creating this jira as a placeholder
> to fix this issue.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)