Hi Robert,
Could you provide more details on the kind of Tez job you are running?
For example, when I run orderedwordcount, the HDFS counters show up
correctly in the VERTEX_FINISHED history event for the Sorter vertex:
2015-02-07 18:38:54,617 INFO [Dispatcher thread: Central]
history.HistoryEventHandler:
[HISTORY][DAG:dag_1423172314047_0008_1][Event:VERTEX_FINISHED]:
vertexName=Sorter, vertexId=vertex_1423172314047_0008_1_02,
initRequestedTime=1423363130484, initedTime=1423363130531,
startRequestedTime=1423363130567, startedTime=1423363130567,
finishTime=1423363134615, timeTaken=4048, status=SUCCEEDED, diagnostics=,
counters=Counters: 40, File System Counters, FILE_BYTES_READ=7901,
FILE_BYTES_WRITTEN=7901, FILE_READ_OPS=0, FILE_LARGE_READ_OPS=0,
FILE_WRITE_OPS=0, HDFS_BYTES_READ=0, HDFS_BYTES_WRITTEN=8078, HDFS_READ_OPS=3,
HDFS_LARGE_READ_OPS=0, HDFS_WRITE_OPS=2,
org.apache.tez.common.counters.TaskCounter, REDUCE_INPUT_GROUPS=10,
REDUCE_INPUT_RECORDS=251, COMBINE_INPUT_RECORDS=0, SPILLED_RECORDS=251,
NUM_SHUFFLED_INPUTS=1, NUM_SKIPPED_INPUTS=0, NUM_FAILED_SHUFFLE_INPUTS=0,
MERGED_MAP_OUTPUTS=1, GC_TIME_MILLIS=0, COMMITTED_HEAP_BYTES=257425408,
OUTPUT_RECORDS=251, ADDITIONAL_SPILLS_BYTES_WRITTEN=7901,
ADDITIONAL_SPILLS_BYTES_READ=7901, SHUFFLE_BYTES=7901,
SHUFFLE_BYTES_DECOMPRESSED=7897, SHUFFLE_BYTES_TO_MEM=7901,
SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_DISK_DIRECT=0, NUM_MEM_TO_DISK_MERGES=0,
NUM_DISK_TO_DISK_MERGES=0, SHUFFLE_PHASE_TIME=28, MERGE_PHASE_TIME=38,
FIRST_EVENT_RECEIVED=10, LAST_EVENT_RECEIVED=10, Shuffle Errors, BAD_ID=0,
CONNECTION=0, IO_ERROR=0, WRONG_LENGTH=0, WRONG_MAP=0, WRONG_REDUCE=0,
vertexStats=firstTaskStartTime=1423363134261, firstTasksToStart=[
task_1423172314047_0008_1_02_000000 ], lastTaskFinishTime=1423363134610,
lastTasksToFinish=[ task_1423172314047_0008_1_02_000000 ], minTaskDuration=349,
maxTaskDuration=349, avgTaskDuration=349.0, numSuccessfulTasks=1,
shortestDurationTasks=[ task_1423172314047_0008_1_02_000000 ],
longestDurationTasks=[ task_1423172314047_0008_1_02_000000 ],
vertexTaskStats={numFailedTasks=0, numSucceededTasks=1,
numKilledTaskAttempts=0, numKilledTasks=0, numFailedTaskAttempts=0,
numCompletedTasks=1}
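
If it helps, here is a minimal Python sketch for pulling the counters out of
a VERTEX_FINISHED event like the one above, so you can check which vertices
report a non-zero HDFS_BYTES_WRITTEN. This is not part of Tez: the function
names and regexes are hypothetical, and they assume the history text has the
same "KEY=value, KEY=value" layout as this log line.

```python
import re

# Assumption: counters appear as UPPER_CASE_NAME=<integer>, as in the log above.
COUNTER_RE = re.compile(r"\b([A-Z][A-Z0-9_]*)=(\d+)\b")
# Assumption: the vertex name is the text between "vertexName=" and the next comma.
VERTEX_RE = re.compile(r"vertexName=([^,]+),")


def parse_vertex_finished(event_text):
    """Return (vertex_name, counters) for one VERTEX_FINISHED event, else None.

    event_text is the text of a single history event (wrapped lines are fine,
    as long as no KEY=value pair is split across a line break).
    """
    if "[Event:VERTEX_FINISHED]" not in event_text:
        return None
    name_match = VERTEX_RE.search(event_text)
    if name_match is None:
        return None
    # Lower-case fields such as timeTaken=4048 are not matched by COUNTER_RE.
    counters = {key: int(value) for key, value in COUNTER_RE.findall(event_text)}
    return name_match.group(1), counters


def writes_to_hdfs(counters):
    """True if the vertex reported any bytes written to HDFS."""
    return counters.get("HDFS_BYTES_WRITTEN", 0) > 0
```

Running parse_vertex_finished over each VERTEX_FINISHED event and keeping the
vertices for which writes_to_hdfs is true should point at the vertices that
write the final output.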
Thanks,
-- Hitesh
On Feb 7, 2015, at 6:27 PM, Grandl Robert <[email protected]> wrote:
> Guys,
>
> I was trying to figure out some counters from the job history files. For
> reducers, I am trying to find which reducer writes to HDFS (i.e., the last
> reducers in the job's chain of vertices).
>
> Map vertices have HDFS: BYTES_WRITTEN and HDFS: BYTES_READ counters,
> which makes sense. However, I could not find the corresponding ones for
> reducers, other than the OUTPUT bytes.
>
> Am I missing something?
>
> Thanks for your help,
> Robert
>