[ 
https://issues.apache.org/jira/browse/PIG-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548058
 ] 

Alan Gates commented on PIG-12:
-------------------------------

A couple of things:

1) When I put in the log4j stuff, I had pig use its own appender rather than 
the root one so that we could set up our own appenders, log levels, etc. 
independent of hadoop.  Maybe that isn't the right choice.  Maybe we should 
assume that users will want the same from pig and hadoop log4j.  I'm open to 
modifying this if we don't think the initial approach is the best one.

2) I agree that we don't want hadoop and pig to be outputting log records that 
are formatted differently.  My concern is that if we control format through 
configuration, it is difficult for a user to turn the timestamp on or off.  
(Not difficult in the sense that they can't do it, as they can and it is 
totally flexible.  But difficult in the sense that how many users know how to 
manage log4j and tweak their configuration to change the timestamp format?)  If 
we do it through the code we can provide the user command line options to 
control it.
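
To make point 2 concrete, here is a minimal sketch of the "through the code" option, not what Pig actually does. It assumes a hypothetical `-timestamp` command line flag and a hypothetical helper class `LogPatternSketch`; neither exists in Pig. The only log4j-specific pieces are the conversion pattern strings (`%d{ISO8601}`, `%-5p`, `%c`, `%m%n`), which are standard log4j 1.x `PatternLayout` conversion characters. The sketch only picks the pattern string; wiring it into an appender is left as a comment so the example runs without log4j on the classpath.

```java
public class LogPatternSketch {
    // log4j conversion characters: %p = level, %c = logger name,
    // %m = message, %n = line separator, %d{...} = date.
    static final String BASE_PATTERN = "%-5p %c - %m%n";
    static final String TIMESTAMP_PREFIX = "%d{ISO8601} ";

    // Returns the pattern with or without the leading timestamp.
    static String conversionPattern(boolean withTimestamp) {
        return withTimestamp ? TIMESTAMP_PREFIX + BASE_PATTERN : BASE_PATTERN;
    }

    // Hypothetical flag name, used here for illustration only.
    static boolean wantsTimestamp(String[] args) {
        for (String arg : args) {
            if ("-timestamp".equals(arg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String pattern = conversionPattern(wantsTimestamp(args));
        // With log4j available, this string would be handed to the layout,
        // e.g. new org.apache.log4j.PatternLayout(pattern), and that layout
        // attached to pig's appender.
        System.out.println(pattern);
    }
}
```

The user then toggles the timestamp with a flag and never has to touch a log4j properties file, at the cost of pig and hadoop possibly ending up with different formats, which is the concern in point 2.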



> Please add timestamps to pig map/reduce progress messages
> ---------------------------------------------------------
>
>                 Key: PIG-12
>                 URL: https://issues.apache.org/jira/browse/PIG-12
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Olga Natkovich
>
> From one of the users: 
> ------------------------------
> I'm spending a lot of time trying to optimize my pig queries for short
> run-times.  This process would be much easier if, in the progress output
> from pig (currently on stdout, but hopefully soon moving to stderr?!), the
> initiation and completion of each map/reduce job could be timestamped.  Pig
> already spits out messages of the form "----- MapReduce Job -----",
> "Input: ...", "Combine: ...", etc; could you just add a "Timestamp: ..."
> field as well?  Or ideally, both "Starting timestamp: ..." and
> "Finishing timestamp ...".
> Additional comments from another user:
> ------------------------------------------------------
> I'm adding my vote for this as well.
> I'd like to know the timestamp and "running time" in seconds or D:H:M:S:
> Thu Oct 25 10:06:01 GMT 2007 (0:00:12:56): 56% done
> Starting and stopping timestamps in the log would also be valuable.
> Unfortunately, there's no "workaround" such as putting a date command before
> and after the pig command in logging -- queuing times can be seconds to hours
> and completely mess up any notion of job execution time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
