[
https://issues.apache.org/jira/browse/MAPREDUCE-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742712#action_12742712
]
Owen O'Malley edited comment on MAPREDUCE-157 at 8/12/09 9:47 PM:
------------------------------------------------------------------
I'm confused about what the goal of using Avro here would be.
Let's review the goals:
1. Get an easily parseable text format.
2. Not require excessive amounts of time for logging.
2a. Not require excessive object allocations.
It seems that to use Avro, we'd need to create the Avro objects and then write
them out. I'd rather just use a JsonWriter to write the events directly to the
stream; reading would of course be the reverse. Using Avro would be like writing
XML files by generating the necessary DOM objects. You can do it (and in fact
Configuration is written that way. *sigh*), but it costs a lot of time.
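To make the contrast concrete, here is a minimal sketch of the streaming approach, written in Python for brevity. The function names, event types, and field names are hypothetical and are not taken from any Hadoop or Avro API. Each field is encoded and written to the stream as soon as it is seen, so no intermediate record object is built for the whole event.

```python
import io
import json


def write_event(out, event_type, fields):
    """Stream one event as a single JSON line.

    `fields` is an iterable of (name, value) pairs. Each pair is encoded
    and written immediately; no dict or record object is built for the
    whole event. (Hypothetical helper, for illustration only.)
    """
    out.write('{"type":')
    out.write(json.dumps(event_type))
    for name, value in fields:
        out.write(",")
        out.write(json.dumps(name))
        out.write(":")
        out.write(json.dumps(value))
    out.write("}\n")


def read_events(stream):
    """Reverse direction: parse one event per non-empty line."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


buf = io.StringIO()
write_event(buf, "JOB_SUBMITTED", [("jobid", "job_0001"), ("user", "alice")])
write_event(buf, "TASK_STARTED", [("taskid", "task_0001_m_000003")])
text = buf.getvalue()
```

The reader in this sketch does build a dict per line for simplicity; a real reader could use a pull parser to avoid that as well.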
Not having seen the Avro text format, I can't evaluate how much overhead it
adds. None of the features of Avro seem compelling in this case, and they could
easily lead to unfortunate choices.
Furthermore, I don't know if there are any guarantees about the Avro text
format's stability. We need stability in this format.
> Job History log file format is not friendly for external tools.
> ---------------------------------------------------------------
>
> Key: MAPREDUCE-157
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-157
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Owen O'Malley
> Assignee: Jothi Padmanabhan
>
> Currently, parsing the job history logs with external tools is very difficult
> because of the format. The most critical problem is that newlines aren't
> escaped in the strings. That makes using tools like grep, sed, and awk very
> tricky.
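
The newline problem described above can be illustrated with a small sketch. The key="value" layout below is a hypothetical stand-in for an unescaped text format, not the actual job history layout, and the record contents are made up. It compares a line-oriented search (roughly what grep does) over an unescaped record and over the same record encoded as JSON.

```python
import json

# Made-up event whose string value contains a newline.
record = {"event": "TASK_FAILED", "error": "disk full\nretrying"}

# Hypothetical unescaped layout: the raw value is written as-is,
# so the embedded newline splits the record across two lines.
raw_line = 'event="{event}" error="{error}"'.format(**record)

# JSON encoding escapes the newline, so the record stays on one line.
json_line = json.dumps(record)

# Line-oriented search, similar in spirit to `grep TASK_FAILED`.
raw_matches = [l for l in raw_line.splitlines() if "TASK_FAILED" in l]
json_matches = [l for l in json_line.splitlines() if "TASK_FAILED" in l]
```

In the unescaped case the matching line carries only part of the error message; in the JSON case the single matching line can be parsed back into the full record.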