[ https://issues.apache.org/jira/browse/PIG-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780181#action_12780181 ]
Romain Rigaux commented on PIG-420:
-----------------------------------

We have commands that look like Unix commands (e.g. top-queries) and run Pig scripts underneath. These commands take parameters such as -limit (how many results to return), and the user specifies -limit N, where N is an integer. This is then simply transformed into:

{code}
B = LIMIT A $N;
{code}

It would be nice if we could specify -limit * and have the compiler remove the statement (for users who want everything). Currently we use either a custom limit UDF filter or LIMIT with Integer.MAX_VALUE (soon Long.MAX_VALUE!).

> Limit on nothing functionality
> ------------------------------
>
>                 Key: PIG-420
>                 URL: https://issues.apache.org/jira/browse/PIG-420
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Anand Murugappan
>
> Pig 2.0 implements the limit feature, but only as a standalone statement.
> Limit is very useful in debug mode, where we can run queries on a smaller
> amount of data (faster and on fewer nodes) to iron out issues, but in
> production mode we would like to run through all the data. It would be good
> to have an easy "switch" between debug and prod mode using the limit statement
> without having to change the underlying code templates. Because LIMIT is a
> separate standalone statement, it is hard to parametrize the code.
> For instance, a query template might look like:
> A = LOAD '...';
> B = LIMIT A $N;
> C = FOREACH B ....
> In debug mode we would like to set the variable $N to 100, but in prod mode
> we would like to set it to a 'special value' that does not apply LIMIT and
> lets us run on all the data.
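
For illustration, the Integer.MAX_VALUE workaround mentioned in the comment could be sketched as a small wrapper-side helper. This is only a sketch under assumptions: the names resolve_limit, render and TEMPLATE are hypothetical, the wrapper is assumed to do plain text substitution of $N before handing the script to Pig, and the template is the one quoted in the issue with its elided parts left as they are.

```python
import re

# Java's Integer.MAX_VALUE, used as the "no practical limit" stand-in.
INT_MAX = 2**31 - 1


def resolve_limit(arg):
    """Map a -limit argument to the value substituted for $N.

    '*' means "return everything"; since LIMIT cannot be switched off,
    it is mapped to the largest int value (the workaround described above).
    """
    if arg == "*":
        return INT_MAX
    n = int(arg)  # raises ValueError for non-numeric input
    if n <= 0:
        raise ValueError("limit must be a positive integer or '*'")
    return n


def render(template, limit_arg):
    """Substitute $N in a Pig script template (hypothetical helper)."""
    return re.sub(r"\$N\b", str(resolve_limit(limit_arg)), template)


# Template as quoted in the issue; '...' and '....' are elided in the source.
TEMPLATE = "A = LOAD '...';\nB = LIMIT A $N;\nC = FOREACH B ....\n"

if __name__ == "__main__":
    print(render(TEMPLATE, "100"))  # debug mode
    print(render(TEMPLATE, "*"))    # prod mode
```

Note that this still runs a LIMIT operator in prod mode; it does not remove the statement from the plan, which is what the issue actually asks for.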