[
https://issues.apache.org/jira/browse/TEZ-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384796#comment-15384796
]
Hitesh Shah commented on TEZ-3357:
----------------------------------
Comments:
- TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP - usedNumDagsPerGroup - configs
do not use camel case. Either "." or "-".
- ConfigurationScope(Scope.AM) - does not make sense here at all.
- Why is TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP_DEFAULT needed? Can't it
fall back to the configured value used when writing/creating these groups?
- Changes in TezUtilsInternal: In retrospect, the grouping logic can belong
in TezDAGId.java. A couple of points though:
- use just daggroup or groupId and do not reference timeline in docs or var
names
- s/DAGGROUP/daggroup/
- s/AtsGroup/DagGroup/
- For "TimelineEntityGroupPlugin implements Configurable" - how are we
certain that YARN will invoke setConf on the plugin? The base class
TimelineEntityGroupPlugin does not require a plugin to implement methods
defined as part of the Configurable interface.
- What if setConf is never called?
{code}
conf.get(TezConfiguration.TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP,
184
TezConfiguration.TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP_DEFAULT).split(",");
{code}
- maybe use conf.getStrings() ?
{code}
* Comma separated list of numDagsPerGroup used until now. This is used by
the TimelineCacheService to generate
1233 * TimelineEntityGroupIds. Do not add too many here, it will affect
the performance of ATS during query time.
{code}
- Needs a bit more clarity on what this config implies or maybe a reference
to the write path where this will be used?
TestTimelineCachePluginImpl needs tests for where
TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP_DEFAULT is configured to empty i.e.
<value></value> or more than one value i.e. "50,100".
{code}
for (int i = 0; i < usedNumGroups.length; ++i) {
187 allNumGroupsPerDag[i] = Integer.parseInt(usedNumGroups[i]);
188 }
{code}
- what if someone sets an invalid value? The stack trace without an
accompaning message of which config property would make it tough for a user to
figure out what is wrong.
> Change TimecachePlugin to return grouped entity ids.
> ----------------------------------------------------
>
> Key: TEZ-3357
> URL: https://issues.apache.org/jira/browse/TEZ-3357
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: TEZ-3357.01.patch
>
>
> TimelineCachePlugin has to return TimelineEntityGroupId grouped based on dag
> id.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)