[
https://issues.apache.org/jira/browse/TEZ-4075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940640#comment-16940640
]
Siddharth Seth commented on TEZ-4075:
-------------------------------------
Some comments from a quick look at the patch, will look in more detail tomorrow.
Line 121 of the patch: Don't remember exactly what DISK vs MEMORY does. Will
check and comment. Would expect this to always be MEMORY.
IFile.write* methods - Can the bufferFull checks be collapsed into a single
function?
IFile.write* - I was initially thinking this should be a write, then determine
the size to see if there is an overflow (with buffer expansion). Probably
better this way though. The size check doesn't add an overhead, does it?
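
To illustrate the two IFile.write* points above, here is a minimal sketch of a single shared overflow check that runs before each write. It is not taken from the patch: BoundedBufferWriter, reserve, and the byte limit are hypothetical names used only for illustration, and the real IFile writer may be structured differently.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical stand-in for an in-memory writer; NOT the actual IFile writer. */
final class BoundedBufferWriter {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int limit;          // assumed maximum number of buffered bytes
    private boolean bufferFull = false;

    BoundedBufferWriter(int limit) {
        this.limit = limit;
    }

    /** The one place where the overflow check lives; every write* method calls it. */
    private boolean reserve(int extraBytes) {
        if (bufferFull || (long) buffer.size() + extraBytes > limit) {
            bufferFull = true;        // sticky: once full, later writes are rejected too
            return false;
        }
        return true;
    }

    /** Writes one byte if it fits; returns false if the buffer would overflow. */
    boolean writeByte(int b) {
        if (!reserve(1)) {
            return false;
        }
        buffer.write(b);
        return true;
    }

    /** Writes a byte range if it fits; returns false if the buffer would overflow. */
    boolean write(byte[] data, int off, int len) {
        if (!reserve(len)) {
            return false;
        }
        buffer.write(data, off, len);
        return true;
    }

    boolean isBufferFull() {
        return bufferFull;
    }

    int size() {
        return buffer.size();
    }
}
```

In this sketch the check costs one addition and one comparison per call, so checking the size before writing should be cheap compared with writing first and measuring afterwards.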
UnorderedPartitionedKVWriter - this.cachedStream = ... ... // Can this be done
only if pipelined == false and numPartitions == 1? That's when this scenario
kicks in.
canSendDataOverDME - If using disk, compressed size is used? If using mem, the
uncompressed size is used? (The doc for the parameter should ideally mention
this)
Will apply the patch tomorrow and look in a little more detail. Also, could you
please post it as a GitHub PR? I think Tez supports these now, and it makes
reviews and back-and-forth easier.
> Tez: Reimplement tez.runtime.transfer.data-via-events.enabled
> -------------------------------------------------------------
>
> Key: TEZ-4075
> URL: https://issues.apache.org/jira/browse/TEZ-4075
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Gopal Vijayaraghavan
> Assignee: Richard Zhang
> Priority: Major
> Attachments: TEZ-4075.10.patch, TEZ-4075.15.patch, TEZ-4075.16.patch,
> Tez-4075.5.patch, Tez-4075.8.patch
>
>
> This was factored out by TEZ-2196, which does skip buffers for 1-partition
> data exchanges (therefore goes to disk directly).
> {code}
> if (shufflePayload.hasData()) {
> shuffleManager.addKnownInput(shufflePayload.getHost(),
> shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
> DataProto dataProto = shufflePayload.getData();
> FetchedInput fetchedInput =
> inputAllocator.allocate(dataProto.getRawLength(),
> dataProto.getCompressedLength(), srcAttemptIdentifier);
> moveDataToFetchedInput(dataProto, fetchedInput, hostIdentifier);
> shuffleManager.addCompletedInputWithData(srcAttemptIdentifier,
> fetchedInput);
> } else {
> shuffleManager.addKnownInput(shufflePayload.getHost(),
> shufflePayload.getPort(), srcAttemptIdentifier, srcIndex);
> }
> {code}
> got removed in
> https://github.com/apache/tez/commit/1ba1f927c16a1d7c273b6cd1a8553e5269d1541a
> It would be better to buffer up to the 512-byte event size limit before
> writing to disk, since creating a new file always incurs disk traffic, even
> if the file is eventually served out of the buffer cache.
> The total overhead of receiving an event, then firing an HTTP call to fetch
> the data, etc. adds approximately 100-150 ms to a query. Transferring the
> data through the event skips the disk entirely and also removes the extra
> IOPS incurred.
> This channel is not suitable for large-scale event transport, but
> specifically the workload here deals with 1-row control tables which consume
> more bandwidth with HTTP headers and hostnames than the 93 byte payload.
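
The buffering idea in the description above can be sketched as an output stream that keeps data in memory until it would exceed a threshold, and only then creates a file. This is a minimal illustration under stated assumptions, not the approach taken in the attached patches: SpillOnOverflowStream and its methods are hypothetical names, and the threshold would be the 512-byte event limit mentioned in the description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: buffer in memory, create the file only on overflow. */
final class SpillOnOverflowStream extends OutputStream {
    private final int threshold;      // e.g. 512 bytes, per the issue description
    private final Path spillPath;     // file that is created only if we overflow
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream disk;        // null until the first spill

    SpillOnOverflowStream(int threshold, Path spillPath) {
        this.threshold = threshold;
        this.spillPath = spillPath;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] data, int off, int len) throws IOException {
        // Spill before the write that would push the buffer past the threshold.
        if (disk == null && memory.size() + len > threshold) {
            disk = Files.newOutputStream(spillPath);
            memory.writeTo(disk);     // move already-buffered bytes to the file
            memory = null;
        }
        if (disk != null) {
            disk.write(data, off, len);
        } else {
            memory.write(data, off, len);
        }
    }

    /** True while no file has been created. */
    boolean isInMemory() {
        return disk == null;
    }

    /** Bytes that could be attached to an event; valid only before a spill. */
    byte[] inMemoryBytes() {
        if (disk != null) {
            throw new IllegalStateException("data was spilled to " + spillPath);
        }
        return memory.toByteArray();
    }

    @Override
    public void close() throws IOException {
        if (disk != null) {
            disk.close();
        }
    }
}
```

With a 512-byte threshold, a 93-byte payload like the one described stays in memory and no file is ever created; only larger outputs pay the cost of touching the disk.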