It should not be. Pig does not cache myUDF.jar; every run picks up myUDF.jar again from /home/user/project/lib.

Daniel
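
If the old behaviour persists, it may help to check which location a class is actually loaded from. The following is a minimal sketch using only the standard Java API; the class name WhichJar and the method locationOf are hypothetical names chosen for illustration, and the same call could be logged from inside the UDF with the UDF class in place of WhichJar.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a
    // placeholder string when the JVM reports no code source
    // (as it does for core classes such as java.lang.String).
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "unknown (no code source)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Hypothetical usage: print where this class itself came from.
        System.out.println(locationOf(WhichJar.class));
    }
}
```

If the printed location is not /home/user/project/lib/myUDF.jar, an older copy of the class elsewhere on the classpath is probably being picked up first.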

On 06/16/2011 06:09 AM, Jameson Li wrote:
Great. Following the wiki (http://wiki.apache.org/pig/PigStorageWithInputPath) and setting -Dpig.noSplitCombination=true, I can now get the file name in Pig.

But I have another problem.
I modified the UDF code, rebuilt it with ant, and generated a new jar file (I am sure the jar file was updated). Then I ran:
pig -x local
register /home/user/project/lib/myUDF.jar
a = load 'aaa';
b = foreach a generate com.company.pig.myUDF();
dump b;

I found that the result still comes from the old jar file and UDF class, so I think the UDF classes have been cached somewhere.

Am I right?
And how can I use the newest jar file after recompiling?

Thanks very much.

2011/6/15 Daniel Dai <[email protected]>

    Check http://wiki.apache.org/pig/PigStorageWithInputPath, also you
    will need to disable split combination: -Dpig.noSplitCombination=true

    Daniel


    On 06/13/2011 04:07 AM, Jameson Li wrote:
    Hi, I have some files in hdfs://path/load/ like this:
    file_29_00001 file_47_00001 file_16_00001 ...
    These files are generated by other M/R jobs. Each file contains only
    one column, and the number in the file name between 'file_' and
    '_00001' is an id. I want to add the id to the loaded data, like
    this (I think I should write a LoadFunc to get the id):
    a = load '/path/load/' as com.company.pig.GetIDFromFileName();
    dump a;
    //here the alias 'a' will have two columns: one is the original column and
    the other is the id.

    My questions are these:
    1. Is there an existing function that can get the id from the file
    name?
    2. I think this method in Pig 0.6.0 could help me:

    bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end)
    <http://pig.apache.org/docs/r0.6.0/api/org/apache/pig/builtin/PigStorage.html#bindTo%28java.lang.String,org.apache.pig.impl.io.BufferedPositionedInputStream,long,long%29>
    Specifies a portion of an InputStream to read tuples.

    But I can't find the same method in Pig 0.8.1. Which method can I use
    to work with the input file in the Pig 0.8.1 API? Thanks very much.
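
Once the input path is available (for example through the loader described on the wiki page above), the id between 'file_' and the trailing sequence number can be extracted with a regular expression. This is a minimal sketch in plain Java, independent of the Pig API; the class name FileIdExtractor and the method extractId are hypothetical, and the pattern assumes the last path component always looks like file_<digits>_<digits>.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileIdExtractor {
    // Matches a final path component such as "file_29_00001" and
    // captures the first run of digits ("29"). Assumed naming scheme.
    private static final Pattern FILE_ID =
            Pattern.compile("(?:^|/)file_(\\d+)_\\d+$");

    // Returns the id embedded in the file name, or Optional.empty()
    // when the path does not follow the assumed naming scheme.
    public static Optional<String> extractId(String path) {
        Matcher m = FILE_ID.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // prints "29"
        System.out.println(
                extractId("hdfs://path/load/file_29_00001").orElse("no id"));
    }
}
```

A custom LoadFunc could call such a helper on the path of the current split and append the result to each tuple it returns.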


