[
https://issues.apache.org/jira/browse/TEZ-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086857#comment-14086857
]
Siddharth Seth commented on TEZ-1379:
-------------------------------------
bq. Ideally, enabling or disabling compression should be determined at runtime
based on the cost of doing it. E.g. small data need not be compressed.
Yep, this is something that Tez could potentially choose to do on its own at
some point.
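As a rough illustration of the size-based idea, a runtime check could look something like the sketch below. The class name, the method, and the 1 MiB cutoff are all hypothetical; nothing like this exists in Tez as part of this patch.

```java
// Hypothetical sketch only: not part of Tez or of the TEZ-1379 patch.
final class CompressionHeuristic {
    // Assumed cutoff (1 MiB); a real implementation would tune or measure this.
    static final long MIN_BYTES_TO_COMPRESS = 1L << 20;

    private CompressionHeuristic() {
    }

    // Returns true when the estimated output is large enough that the
    // compression cost is assumed to pay off.
    static boolean shouldCompress(long estimatedOutputBytes) {
        return estimatedOutputBytes >= MIN_BYTES_TO_COMPRESS;
    }
}
```

A caller would pass whatever size estimate it has for the output; small outputs skip compression and anything at or above the cutoff gets compressed.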
bq. If most of the time the partitioner/comparator/etc. confs are going to be
null, then should we have setters for them instead of making them a nullable
argument? Else users may end up passing whatever conf they have into the
argument unnecessarily.
From a usability perspective, I think having them together is useful:
configure everything about a partitioner/comparator/combiner in one call. It
does definitely have the potential of users just specifying a conf because they
do not know whether one is required. One other option is to have multiple
methods: setComparator(String comparator) and setComparator(String comparator,
Configuration conf). Thoughts?
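A minimal sketch of the two-overload option, to make the trade-off concrete. The builder class and its getters are hypothetical, and a Map<String, String> stands in for the Hadoop Configuration so the example compiles on its own; the actual patch may look quite different.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an edge configurer builder; not the API in the patch.
// Map<String, String> replaces org.apache.hadoop.conf.Configuration here so the
// sketch has no Hadoop dependency.
final class SortedOutputConfigBuilder {
    private String comparatorClassName;
    private Map<String, String> comparatorConf = Collections.emptyMap();

    // Common case: the comparator needs no configuration of its own.
    SortedOutputConfigBuilder setComparator(String comparatorClassName) {
        return setComparator(comparatorClassName, null);
    }

    // Less common case: the comparator reads settings from a conf.
    SortedOutputConfigBuilder setComparator(String comparatorClassName,
                                            Map<String, String> conf) {
        if (comparatorClassName == null) {
            throw new IllegalArgumentException("comparator class name is required");
        }
        this.comparatorClassName = comparatorClassName;
        if (conf == null) {
            this.comparatorConf = Collections.emptyMap();
        } else {
            // Defensive copy so later changes by the caller do not leak in.
            this.comparatorConf = new HashMap<String, String>(conf);
        }
        return this;
    }

    String getComparatorClassName() {
        return comparatorClassName;
    }

    Map<String, String> getComparatorConf() {
        return comparatorConf;
    }
}
```

With this shape, users who have no comparator-specific settings never see a conf parameter, while the two-argument form still configures everything in one call.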
bq. Would be helpful to know what the defaults for compression/https/etc. are,
so we don't have to enable it if it's already enabled.
The defaults currently come from TezRuntimeConfiguration.
bq. Why have these been marked unstable? What's the recommended way to configure
inputs/outputs?
I'll remove this from the current patch. It can be addressed in a separate JIRA if
required. Currently, setFromConfiguration seems to be the standard way to
configure edges, since everything is specified in tez-site anyway.
bq. Why is this private?
This is private for the Configurers. The standard constructor for the Output
already accepts these parameters, so it will never need to be invoked by users.
bq. Rename OnFileSortedOutputConfiguration etc to Configurer?
I'll do that in a patch just before committing; otherwise the review gets difficult.
> EdgeConfigurers should accept a Partitioner configuration, accept parameters
> for compression and secure shuffle
> ---------------------------------------------------------------------------------------------------------------
>
> Key: TEZ-1379
> URL: https://issues.apache.org/jira/browse/TEZ-1379
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Priority: Blocker
> Attachments: TEZ-1379.1.txt
>
>