[ 
https://issues.apache.org/jira/browse/PIG-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080280#comment-13080280
 ] 

Dmitriy V. Ryaboy commented on PIG-2207:
----------------------------------------

You read my mind.

I think for starters, we should increment the warning counters in groups based 
on the UDF. Right now all the UDF_WARNING_1s roll together, but we could split 
them by UDF, for example MyUDF / UDF_WARNING_1 and MyOtherUDF / UDF_WARNING_2.
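
To make the grouping idea concrete, here is a rough sketch of counters keyed 
first by UDF name and then by warning enum. It is only an illustration: 
GroupedWarningCounters and DemoWarning are hypothetical names (DemoWarning 
stands in for PigWarning), and none of this is existing Pig code.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for PigWarning, used only so the sketch compiles alone.
enum DemoWarning { UDF_WARNING_1, UDF_WARNING_2 }

// Hypothetical helper (not part of Pig): one counter per (UDF name, warning) pair.
final class GroupedWarningCounters {
    // Outer key: UDF name (the "group"); inner key: warning name; value: count.
    private final Map<String, Map<String, Long>> groups = new TreeMap<>();

    // Increment the counter for this warning inside the given UDF's group.
    void increment(String udfName, Enum<?> warning) {
        groups.computeIfAbsent(udfName, k -> new TreeMap<>())
              .merge(warning.name(), 1L, Long::sum);
    }

    // Current count for the pair, or 0 if it was never incremented.
    long get(String udfName, Enum<?> warning) {
        Map<String, Long> group =
            groups.getOrDefault(udfName, Collections.<String, Long>emptyMap());
        return group.getOrDefault(warning.name(), 0L);
    }
}
```

With this layout, two UDFs that both use UDF_WARNING_1 no longer share a 
counter, since each UDF name gets its own inner map.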

I also think the warning message should be printed "at least once" -- the 
counters alone aren't always sufficient. We could get this fairly cheaply by 
keeping a weak-reference hash map of seen warnings, and logging the message 
the first time we see each new warning.
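
A minimal sketch of the "at least once" idea, assuming the seen-warnings set 
is backed by java.util.WeakHashMap. LogOnceWarner is a hypothetical name, and 
the in-memory list stands in for whatever logger would really be used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.WeakHashMap;

// Hypothetical helper (not part of Pig): counts every warning, logs each
// distinct message the first time it is seen.
final class LogOnceWarner {
    // Set view over a WeakHashMap: entries can be dropped once the message
    // string is no longer referenced elsewhere, so memory stays bounded.
    private final Set<String> seen =
        Collections.newSetFromMap(new WeakHashMap<String, Boolean>());
    // Stand-in for a real logger, kept in memory so the sketch is testable.
    private final List<String> logged = new ArrayList<>();
    private long total;

    void warn(String msg) {
        total++;                 // the counter is always incremented
        if (seen.add(msg)) {     // add() returns true only for an unseen message
            logged.add(msg);     // "log" the message text on first sight
        }
    }

    long totalWarnings() { return total; }

    List<String> loggedMessages() { return logged; }
}
```

Because the keys are weakly held, a message whose string has been collected 
may be logged again later. That still satisfies "at least once", it just is 
not "exactly once".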

> Support custom counters for aggregating warnings from different udfs
> --------------------------------------------------------------------
>
>                 Key: PIG-2207
>                 URL: https://issues.apache.org/jira/browse/PIG-2207
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Thejas M Nair
>              Labels: newbie
>             Fix For: 0.10
>
>
> Pig allows udfs to aggregate warning messages instead of writing out a 
> separate warning message each time. Udfs can do this by logging the warning 
> using the EvalFunc.warn(String msg, Enum) call. But the udfs are forced to 
> use the PigWarning class if the warning needs to be printed at the end of 
> the pig script.
> For example, with the changes in PIG-2191, some of the builtin udfs are using 
> PigWarning.UDF_WARNING_1 as argument in calls to EvalFunc.warn. This will 
> result in the warning count being printed on STDERR -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning UDF_WARNING_1 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> But it would be better if a udf such as the LOWER udf could use a custom 
> warning counter, so that the STDERR output looks like this -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning LOWER_FUNC_INPUT_WARNING 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> A new function could be added to support this, something like 
> EvalFunc.warn(String warnName, String warnMsg). A specific counter group 
> could be used for udf warnings (see org.apache.hadoop.mapred.Counters), and 
> the counters for that group could be handled during the final warning 
> aggregation done in MapReduceLauncher.computeWarningAggregate().

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
