Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 

The following page has been changed by OlgaN:

  ===== 4.1.1 Logging =====
- stderr of streaming application needs to be captured and presented to the 
user in an easily digestible format. The user will be presented the output of 
each streaming task separately with the header that includes the following 
information:  task name, task result code, start time, end time, input size, 
and primary output size.
+ Users will have control over the handling of `stderr` from their streaming 
application. By default, in case of errors, the full error information would be 
brought to the client and stored in the client-side log.
+ In addition, a user can request that `stderr` be stored in DFS for both 
successful and failed jobs. This is done by adding a `stderr` spec to the 
streaming command declaration:
+ {{{
+ define CMD `` stderr('stream.stderr')  
+ }}}
+ In this case, the streaming `stderr` will be stored in the `_logs` directory 
in the job's output directory. Note that the same Pig job can have multiple 
streaming applications associated with it. It is up to the user to give each 
one a different `stderr` name to avoid conflicts.
+ Pig would store up to '''500''' logs per streaming job in this location. The 
limit is imposed to make sure that we don't create a large number of small 
files in DFS and waste space and name node resources. The user can specify a 
smaller number via the `limit` keyword in the `stderr` spec:
+ {{{
+ define CMD `` stderr('stream.stderr' limit 100)  
+ }}}
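As a rough sketch of how the two options above might be combined, the following script declares two streaming commands in one Pig job, each with its own `stderr` name, and caps one of them at 100 logs. It assumes the syntax proposed on this page; the script names (`clean.pl`, `score.pl`), the aliases, and the input and output paths are hypothetical and used only for illustration.
{{{
-- hypothetical commands; distinct stderr names avoid conflicts under _logs
define CLEAN `clean.pl` stderr('clean.stderr');
-- keep at most 100 stderr logs for this command instead of the default 500
define SCORE `score.pl` stderr('score.stderr' limit 100);

A = load 'input';
B = stream A through CLEAN;
C = stream B through SCORE;
store C into 'output';
}}}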
+ The logs would only contain stderr information from the streaming 
application. The content will include a header and a footer. The header will 
include task name, start time, input size, input file and input range if 
available. The footer will contain result code, end time, and primary output 
  ===== 4.1.2 Error Handling =====
