[
https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008327#comment-16008327
]
Jason Lowe commented on TEZ-3334:
---------------------------------
Sorry for the delay. I finally got some time to look at this.
h6. ContainerLauncherWrapper
Does it make sense to create an abstract class in tez-common that derives from
ContainerLauncher and has an abstract dagComplete method? Then we can have the
existing launchers derive from that rather than ContainerLauncher directly and
can RTTI check against that one class rather than maintain a specific list of
container launchers.
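A minimal sketch of that idea follows. The types here are stand-ins, not the real Tez classes: {{ContainerLauncher}} is reduced to a single method, and the names {{DagCleanupContainerLauncher}}, {{RecordingLauncher}} and {{LauncherWrapper}} are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Tez's ContainerLauncher; the real class has a richer API.
abstract class ContainerLauncher {
    abstract void launchContainer(String containerId);
}

// Hypothetical intermediate base class that adds the dagComplete hook.
abstract class DagCleanupContainerLauncher extends ContainerLauncher {
    abstract void dagComplete(String dagId);
}

// Hypothetical concrete launcher that records the DAGs it was told about.
class RecordingLauncher extends DagCleanupContainerLauncher {
    final List<String> completedDags = new ArrayList<>();

    @Override
    void launchContainer(String containerId) {
        // Launching is out of scope for this sketch.
    }

    @Override
    void dagComplete(String dagId) {
        completedDags.add(dagId);
    }
}

// Wrapper that needs only one type check instead of a list of known launchers.
class LauncherWrapper {
    private final ContainerLauncher delegate;

    LauncherWrapper(ContainerLauncher delegate) {
        this.delegate = delegate;
    }

    // Returns true if the notification was forwarded to the delegate.
    boolean dagComplete(String dagId) {
        if (delegate instanceof DagCleanupContainerLauncher) {
            ((DagCleanupContainerLauncher) delegate).dagComplete(dagId);
            return true;
        }
        return false;
    }
}
```

Launchers that do not extend the intermediate class are simply skipped, so third-party launchers keep working unchanged.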
h6. LocalContainerLauncher
I'm confused why we go through the trouble of serializing the hardcoded value
of zero into the aux service protocol buffer, stuff it into the env, then
immediately go fetch it back out and extract the integer from the byte buffer.
Isn't this a complicated way to say, shufflePort = 0?
tezDefaultComponentName only needs to be computed when cleanupDagDataOnComplete
is true. Actually may not be needed at all, see related comment for
DagDeleteRunnable below.
The reflection instantiation is invoking a constructor signature that isn't in
the DeletionTracker abstract class? Isn't that too much knowledge about the
actual class being created?
h6. DagDeleteRunnable
tezDefaultComponentName is unused? I think this transitively means pluginName
is unused in DeletionTracker, which would simplify its constructor signature.
h6. DeletionTracker
Nit: addNodeShufflePorts method name being plural implies more than one port
can be added but it's only for adding a single node, port pair.
h6. AMContainerHelpers
As Sidd mentioned before, we should avoid the redundant conf key lookup when
creating each container launch context.
h6. ShuffleInputEventHandler and ShuffleInputEventHandlerOrderedGrouped
We could do a better job of leveraging the emptyPartitionsBitSet. Currently we
iterate it bit-by-bit. Instead we could mask it with the desired bits to
examine and iterate the result with nextSetBit. This should be a lot faster if
there are a lot of bits to iterate and we expect a significant number of the
partitions to not be empty. Can be postponed to a followup JIRA if desired.
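One possible reading of the masking approach, as a standalone sketch. It assumes a set bit means "partition is empty"; the class and method names are hypothetical and not taken from the patch.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class EmptyPartitionScan {
    // Returns the empty partitions within [start, start + count).
    // Assumes a set bit in emptyPartitions marks an empty partition.
    static List<Integer> emptyPartitionsInRange(BitSet emptyPartitions, int start, int count) {
        // Build a mask covering only the partitions of interest.
        BitSet masked = new BitSet(start + count);
        masked.set(start, start + count);
        // Keep only the bits that are both in range and marked empty.
        masked.and(emptyPartitions);

        List<Integer> result = new ArrayList<>();
        // nextSetBit jumps directly to the next set bit, so runs of
        // non-empty partitions are skipped instead of tested one by one.
        for (int p = masked.nextSetBit(start); p >= 0; p = masked.nextSetBit(p + 1)) {
            result.add(p);
        }
        return result;
    }
}
```

If the non-empty partitions are wanted instead, replacing {{and}} with {{andNot}} on the same mask yields them with the same iteration loop.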
h6. DagDeleteRunnable
Do we need to do any cleanup on the httpConnection?
h6. DeletionTrackerImpl
What if the submission to the executor throws RejectedExecutionException
because the executor was already shut down and a late dagComplete was invoked?
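A sketch of one way to tolerate the late call, assuming it is acceptable to drop the deletion request once the executor has been shut down. {{DeletionSubmitter}} and {{trySubmit}} are hypothetical names, not part of the patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

class DeletionSubmitter {
    private final ExecutorService executor;

    DeletionSubmitter(ExecutorService executor) {
        this.executor = executor;
    }

    // Returns true if the task was accepted, false if the executor
    // rejected it (for example because it was already shut down).
    boolean trySubmit(Runnable deletionTask) {
        try {
            executor.execute(deletionTask);
            return true;
        } catch (RejectedExecutionException e) {
            // A late dagComplete after shutdown lands here; real code
            // would likely log this rather than let it propagate.
            return false;
        }
    }
}
```

This relies on the default rejection policy of the JDK thread pools, which throws RejectedExecutionException for tasks submitted after shutdown.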
> Tez Custom Shuffle Handler
> --------------------------
>
> Key: TEZ-3334
> URL: https://issues.apache.org/jira/browse/TEZ-3334
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Jonathan Eagles
> Attachments: TEZ-3334.1.patch, TEZ-3334.2.patch
>
>
> For conditions where auto-parallelism is reduced (e.g. TEZ-3222), a custom
> shuffle handler could help reduce the number of fetches and could more
> efficiently fetch data. In particular if a reducer is fetching 100 pieces
> serially from the same mapper it could do this in one fetch call.