It would help if you were a bit clearer about what you want to do. The question is fairly vague, and the right solution depends on the details. Keep in mind that in M/R (and thus in Pig), you have no idea how many records will be given to a given mapper, and thus to a given instance of your UDF in a given JVM. For all you know, it could be one instance per row. That would be super inefficient, but it would mean there is no point in storing anything statically!
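To make that concrete, here is a rough plain-Java sketch, not real Pig code. NextValueFn is a hypothetical stand-in for your UDF (a real one would extend Pig's EvalFunc), runTask stands in for one task running in its own JVM, and the array contents and the wrap-around at the end of the array are my assumptions. Each task builds its own function object, so the index starts from zero in every task regardless of how many rows that task happens to receive.

```java
import java.util.ArrayList;
import java.util.List;

public class UdfStateSketch {

    /** Hypothetical stand-in for a Pig UDF; a real one would extend EvalFunc. */
    static class NextValueFn {
        private final String[] values;
        private int index = 0; // state lives only in this instance (this task)

        NextValueFn(String[] values) {
            this.values = values;
        }

        /** Returns the next array value; wrapping around is an assumption. */
        String exec() {
            String v = values[index];
            index = (index + 1) % values.length;
            return v;
        }
    }

    /** Simulates one task: a fresh instance processes however many rows it is given. */
    static List<String> runTask(int rows) {
        NextValueFn fn = new NextValueFn(new String[] {"a", "b", "c"});
        List<String> out = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            out.add(fn.exec());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runTask(4)); // a task that was handed 4 rows
        System.out.println(runTask(2)); // another task, handed 2 rows, starts over
    }
}
```

The second task starts again at "a" because it never sees the first task's index. Making the field static would not change that on a cluster, since each task's JVM gets its own copy of the static.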
There are ways to guarantee that all of the data will stream through one UDF instance, namely grouping, whether that is a group all or grouping by a key. In that situation, though, you don't need static variables, because within a given task the same instance of the class will do all of the processing, so an ordinary instance field is enough. It depends on your data and what you want to do with it, of course. But static variables will not let you "communicate" between instances that are processing data, because they are in different JVMs.

2012/3/1 Shibu Thomas <[email protected]>

> Hi,
>
> I am trying to use a static variable in PIG UDF which will be invoked from
> a foreach statement
>
> This static variable will be used an index into an array to return the
> next value from the array.
>
> I want to understand the implications of the same
>
> Thanks
>
> Shibu Thomas
> MSCIS-IS
> Office : +91 (40) 669 32660
> Mobile: +91 95811 51116
>
