[
https://issues.apache.org/jira/browse/TEZ-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387132#comment-15387132
]
Hitesh Shah commented on TEZ-3358:
----------------------------------
Some comments - mostly minor:
- Let's use a slightly different naming convention for
TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP - it is a bit too similar to
TEZ_HISTORY_LOGGING_NUM_DAGS_PER_GROUP and can potentially cause confusion.
- TEZ_HISTORY_LOGGING_NUM_DAGS_PER_GROUP - add javadoc describing the impact on
HDFS in terms of the number of files per DAG vs. per group.
- s/ATS/YARN Timeline/
- Also, let's mark all of these new configs as Private and Unstable.
- "getGroupId(int numDagsPerGroup)" - should this throw an invalid arg for
groupCnt == 1?
{code}
Set<TimelineEntityGroupId> groupId =
convertToTimelineEntityGroupIds(entityType, entityId);
+ if (groupId != null && !groupId.isEmpty()) {
+ groupIds.addAll(groupId);
+ appIdSet.add(groupId.iterator().next().getApplicationId());
}
{code}
- Could this code be moved into createTimelineEntityGroupIds(), or does it
need to be replicated in the various places where it is currently used?
- In TestTimelineCachePluginImpl, minor nit: use "new Configuration(false)"
instead of "new Configuration()".
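
To make the "getGroupId(int numDagsPerGroup)" question above concrete, here is a
minimal sketch of what rejecting a group size of 1 could look like. It is not
the code from the patch: the class name DagGroupIdSketch, the method
groupIndex, and the assumption that DAG sequence numbers start at 1 are all
hypothetical and used only for illustration.

```java
/**
 * Hypothetical sketch of argument validation for DAG grouping.
 * This is NOT the TEZ-3358 patch code; names and numbering are assumptions.
 */
final class DagGroupIdSketch {

    private DagGroupIdSketch() {
        // Static helper only; no instances.
    }

    /**
     * Maps a DAG's sequence number to the index of the group it falls into.
     *
     * @param dagSequenceNumber sequence number of the DAG, assumed to start at 1
     * @param numDagsPerGroup   number of DAGs sharing one group, must be > 1
     * @return zero-based group index
     * @throws IllegalArgumentException if either argument is out of range
     */
    static int groupIndex(int dagSequenceNumber, int numDagsPerGroup) {
        if (dagSequenceNumber < 1) {
            throw new IllegalArgumentException(
                "dagSequenceNumber must be >= 1, got " + dagSequenceNumber);
        }
        // A group of size 1 is the same as no grouping, so treat it as a
        // caller error instead of silently producing one group per DAG.
        if (numDagsPerGroup <= 1) {
            throw new IllegalArgumentException(
                "numDagsPerGroup must be > 1, got " + numDagsPerGroup);
        }
        return (dagSequenceNumber - 1) / numDagsPerGroup;
    }
}
```

With a group size of 5, DAGs 1 through 5 map to group 0 and DAG 6 maps to
group 1. Whether a size of 1 should throw or fall back to the per-DAG id is
the open question in the comment; the sketch only shows the "throw" option.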
> Group ATSLogs for multiple DAGs into one file.
> ----------------------------------------------
>
> Key: TEZ-3358
> URL: https://issues.apache.org/jira/browse/TEZ-3358
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: TEZ-3358.01.patch, TEZ-3358.02.patch
>
>
> Currently we create one history log file per DAG. Change this to use one group for
> multiple DAGs to prevent the creation of too many files on HDFS.