It should not be. Pig does not cache myUDF.jar; every run picks up
myUDF.jar again from /home/user/project/lib.
Daniel
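One way to check which jar a class was actually loaded from is to ask the JVM for the class's code source. The following is only a minimal sketch in plain Java: JarLocator and describe are hypothetical names used for illustration, and in the Pig case the call would presumably go inside the UDF itself (for example, logged from its exec method) rather than in a main method.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Pig.
class JarLocator {
    // Describes where the JVM loaded the given class from (jar or directory).
    static String describe(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            // Classes loaded by the bootstrap loader usually have no code source.
            return cls.getName() + " <- no code source (e.g. a JDK bootstrap class)";
        }
        return cls.getName() + " <- " + src.getLocation();
    }

    public static void main(String[] args) {
        // For a UDF, pass the UDF class here instead (assumption: it is on the classpath).
        System.out.println(describe(JarLocator.class));
        System.out.println(describe(String.class));
    }
}
```

If the printed location is not /home/user/project/lib/myUDF.jar, an older copy of the class elsewhere on the classpath is a likely suspect.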
On 06/16/2011 06:09 AM, Jameson Li wrote:
Great. Following the wiki page
http://wiki.apache.org/pig/PigStorageWithInputPath and the setting
-Dpig.noSplitCombination=true, I can now get the file name in Pig.
But I have another problem.
I modified the UDF code, rebuilt it with ant, and generated a new jar file
(I am sure the jar file has been updated):
pig -x local
register /home/user/project/lib/myUDF.jar
a = load 'aaa';
b = foreach a generate com.company.pig.myUDF();
dump b;
I found that the result still comes from the old jar file and UDF class,
so I think the UDF classes have been cached somewhere.
Am I right?
And how can I use the newest jar file after recompiling?
Thanks very much.
2011/6/15 Daniel Dai <[email protected]>
Check http://wiki.apache.org/pig/PigStorageWithInputPath. You will
also need to disable split combination: -Dpig.noSplitCombination=true
Daniel
On 06/13/2011 04:07 AM, Jameson Li wrote:
Hi, I have some files in hdfs://path/load/ like this:
file_29_00001 file_47_00001 file_16_00001 ...
These files are generated by other M/R jobs. Each file contains only one
column, and the number in the file name between 'file_' and '_00001' is
an id. I want to add the id to the loaded data, like this (I think I
need to write a LoadFunc to get the id):
a = load '/path/load/' using com.company.pig.GetIDFromFileName();
dump a;
// here the relation 'a' will have two columns: the original column and
// the id.
My questions are:
1. Is there an existing function that can get the id from the file
name?
2. I think this method in Pig 0.6.0 could help me:
bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end)
Specifies a portion of an InputStream to read tuples.
http://pig.apache.org/docs/r0.6.0/api/org/apache/pig/builtin/PigStorage.html#bindTo%28java.lang.String,org.apache.pig.impl.io.BufferedPositionedInputStream,long,long%29
But I can't find the same method in Pig 0.8.1. Which method can I use to
work with the input file in the Pig 0.8.1 API? Thanks very much.
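
The id-parsing step described in this message does not depend on Pig and can be sketched in plain Java. This is only an illustration, assuming file names of the form file_<id>_00001 as in the examples above; FileNameId and extractId are hypothetical names, not part of any Pig API. A custom loader would still have to supply the path of the file being read.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: pulls the id out of names like "file_29_00001".
class FileNameId {
    // Matches "file_<digits>_00001" at the end of a name or path; the digits are the id.
    private static final Pattern NAME = Pattern.compile("file_(\\d+)_00001$");

    // Returns the id as a string, or null when the name does not follow the pattern.
    static String extractId(String pathOrName) {
        Matcher m = NAME.matcher(pathOrName);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractId("hdfs://path/load/file_29_00001")); // prints 29
        System.out.println(extractId("file_47_00001"));                  // prints 47
    }
}
```

Returning null for non-matching names leaves it to the caller to decide whether to skip such files or fail.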