[
https://issues.apache.org/jira/browse/TEZ-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974173#comment-13974173
]
Hitesh Shah commented on TEZ-698:
---------------------------------
Comments:
{code}
getContext().canCommit()
{code}
- is this thread safe?
bq. MRHelpers.getReduceResource(tezConf)
- do we really need to use MRHelpers for resource calculation and java opts?
{code}
+ byte[] intermediateDataPayload =
+ MRHelpers.createMRIntermediateDataPayload(tezConf,
Text.class.getName(),
+ IntWritable.class.getName(), true, null, null);
{code}
- "intermediate data payload" could be a bit confusing for someone reading the
code. Is this the payload for an edge, or for both the input and the output? From
the perspective of a user looking at example code, are the payloads for an
input/output pair meant to be the same?
{code}
tezConf.set(TezConfiguration.TEZ_AM_JAVA_OPTS,
MRHelpers.getMRAMJavaOpts(tezConf));
{code}
- can this just rely on Tez configs and not use MR configs?
{code}
public static void main(String[] args) throws Exception {
- if ((args.length%2) != 0) {
+ if (args.length != 2) {
printUsage();
System.exit(2);
}
{code}
- does this work with -D params (generic options such as -Dkey=value)?
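To illustrate the concern: with a strict "args.length != 2" check, an invocation such as "-Dsome.key=value in out" has three arguments and would hit printUsage() unless the -D options are stripped first. In Hadoop this is normally left to GenericOptionsParser/ToolRunner. The plain-Java sketch below is only a stand-in for that step; the class DashDArgs and its members are hypothetical names, not part of the patch or of any Tez/Hadoop API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Tez/Hadoop API): separates -D options from positional args. */
public class DashDArgs {
  /** Properties collected from "-Dkey=value" and "-D key=value". */
  public final Map<String, String> props = new LinkedHashMap<>();
  /** Arguments that remain once the -D options are removed. */
  public final List<String> positional = new ArrayList<>();

  public static DashDArgs parse(String[] args) {
    DashDArgs out = new DashDArgs();
    for (int i = 0; i < args.length; i++) {
      String a = args[i];
      if (a.equals("-D") && i + 1 < args.length) {
        out.addProp(args[++i]);          // "-D key=value" form
      } else if (a.startsWith("-D") && a.length() > 2) {
        out.addProp(a.substring(2));     // "-Dkey=value" form
      } else {
        out.positional.add(a);           // everything else is positional
      }
    }
    return out;
  }

  private void addProp(String kv) {
    int eq = kv.indexOf('=');
    if (eq < 0) {
      props.put(kv, "");                 // no '=': treat as an empty value
    } else {
      props.put(kv.substring(0, eq), kv.substring(eq + 1));
    }
  }

  public static void main(String[] args) {
    DashDArgs parsed = parse(args);
    // Check the arity on the remaining args, not on the raw args array.
    if (parsed.positional.size() != 2) {
      System.err.println("Usage: [-Dkey=value ...] <in> <out>");
      return;
    }
    System.out.println("props=" + parsed.props + " positional=" + parsed.positional);
  }
}
```

The point is only that the arity check should run on the arguments that remain after generic-option parsing, whichever parser is actually used.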
{code}
+ public static byte[] createUserPayload(Configuration conf,
+ String outputFormatName, boolean useNewApi) throws IOException {
+ Configuration outputConf = new JobConf(conf);
+ outputConf.set(MRJobConfig.OUTPUT_FORMAT_CLASS_ATTR, outputFormatName);
+ MultiStageMRConfToTezTranslator.translateVertexConfToTez(outputConf,
+ null);
+ MRHelpers.doJobClientMagic(outputConf);
+ return TezUtils.createUserPayloadFromConf(outputConf);
+ }
{code}
- useNewApi unused?
- what does doJobClientMagic do? :)
- is "Configuration outputConf = new JobConf(conf);" needed? What if conf is
already an instance of JobConf?
- the above comments also apply to the MRInput-related code.
> Make it easy to create and configure
> MRInput/MROutput/ShuffleInput/SortedOutput
> -------------------------------------------------------------------------------
>
> Key: TEZ-698
> URL: https://issues.apache.org/jira/browse/TEZ-698
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-698.1.patch
>
>
> We have moved away from MR and it's not necessary for anyone to write mappers
> and reducers or to configure them. But MR input and output and Shuffle
> related inputs/outputs are still used. Currently we have to invoke a host of
> methods to configure them. If we can have a single API to make these configs
> then it would really help. Secondly, for IO pairs like
> ShuffleInput/SortedOutput, their configs are related (KV types, e.g.), so it
> may be useful to have a combined API that generates configs for both in a
> single call.