Thejas M Nair commented on PIG-420:

The idea proposed by Rekha seems to be a better alternative to 'limit on 
nothing'. It would be good to have something similar to C++ preprocessor 
macros. This way the "if debug" decisions can be made at compile time, and 
there will be no runtime performance impact.

Pig could have some syntax to denote debug-only sections of the Pig script, 
something like:
a = load 'file';
b = #IFDEF DEBUG { limit a 100; }
    #ELSE { a; /* assuming we start supporting the syntax "b = a;" */ }
c = filter b by $0 == 1;
#IFDEF DEBUG { store c into 'debug_file'; }
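To make the compile-time idea concrete, here is a minimal sketch of how such sections could be expanded before the script ever reaches the Pig parser. It is written in Python purely for illustration and is not part of Pig. The `#IFDEF NAME { ... } #ELSE { ... }` form is the hypothetical syntax from the example above, the `preprocess` function and its `defined` argument are assumed names, and the sketch assumes directive bodies contain no nested braces. The sample script uses standard Pig Latin `limit a 100` and `==`.

```python
import re

# Hypothetical directive syntax, taken from the proposal above:
#   #IFDEF NAME { body } [#ELSE { body }]
# Assumption: bodies never contain nested braces.
_DIRECTIVE = re.compile(
    r"#IFDEF\s+(\w+)\s*\{([^{}]*)\}(?:\s*#ELSE\s*\{([^{}]*)\})?"
)


def preprocess(script, defined):
    """Expand #IFDEF/#ELSE sections in a script given a set of defined names."""
    def expand(match):
        name, if_body, else_body = match.groups()
        # Keep the #IFDEF body when the name is defined; otherwise keep the
        # #ELSE body, or nothing at all if there is no #ELSE.
        chosen = if_body if name in defined else (else_body or "")
        return chosen.strip()

    return _DIRECTIVE.sub(expand, script)


script = """a = load 'file';
b = #IFDEF DEBUG { limit a 100; } #ELSE { a; }
c = filter b by $0 == 1;
#IFDEF DEBUG { store c into 'debug_file'; }
"""

debug_script = preprocess(script, {"DEBUG"})  # debug run: DEBUG is defined
prod_script = preprocess(script, set())       # production run: nothing defined

print(debug_script)
print(prod_script)
```

Because the selection happens on the script text, the production script contains no limit or debug store at all, so nothing is left for the runtime to evaluate.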


> Limit on nothing functionality
> ------------------------------
>                 Key: PIG-420
>                 URL: https://issues.apache.org/jira/browse/PIG-420
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Anand Murugappan
> Pig 2.0 implements the limit feature but as a standalone statement. 
> Limit is very useful in debug mode where we could run queries on smaller 
> amount of data (faster and on fewer nodes) to iron out issues but in the 
> production mode we would like to run through all the data. It would be good 
> to have an easy "switch" between debug and prod mode using the limit statement 
> without having to change the underlying code templates. Given that LIMIT is a 
> separate standalone statement it gets hard to parametrize the code. 
> For instance a query template might look like, 
> A = LOAD '...';
> B = LIMIT A $N;
> C = FOREACH B .... 
> In debug mode, we would like to set the variable $N to 100 but in prod mode 
> we would like to set it to a 'special value' that would skip the LIMIT and 
> let us run it on all the data. 

