[ 
https://issues.apache.org/jira/browse/PIG-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773389#action_12773389
 ] 

Ankur commented on PIG-958:
---------------------------

> Can you explain this a little bit more - ......
In the earlier patch (958.v3.patch), after moving the results out of the task's 
current working directory, I was manually deleting that directory. This was to 
ensure that empty part files don't get moved to the final output directory. But 
doing so causes Hadoop to complain that it can no longer write to the task's 
output dir, and the task fails.

> I saw compile errors while trying to run unit test: ...
Did you compile pig.jar and run the core tests first? This creates the 
necessary classes and jar files on the local machine that the contrib tests 
require.

On my local machine:
gan...@grainflydivide-dr:pig_trunk$ ant 
...
buildJar:
     [echo] svnString 830456
      [jar] Building jar: /home/gankur/eclipse/workspace/pig_trunk/build/pig-0.6.0-dev-core.jar
      [jar] Building jar: /home/gankur/eclipse/workspace/pig_trunk/build/pig-0.6.0-dev.jar
     [copy] Copying 1 file to /home/gankur/eclipse/workspace/pig_trunk

gan...@grainflydivide-dr:pig_trunk$ ant test
...
test-core:
   [delete] Deleting directory /home/gankur/eclipse/workspace/pig_trunk/build/test/logs
    [mkdir] Created dir: /home/gankur/eclipse/workspace/pig_trunk/build/test/logs
    [junit] Running org.apache.pig.test.TestAdd
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.024 sec
    [junit] Running org.apache.pig.test.TestAlgebraicEval
...
gan...@grainflydivide-dr:pig_trunk$ cd contrib/piggybank/java/
gan...@grainflydivide-dr:java$ ant test
...
test:
     [echo]  *** Running UDF tests ***
   [delete] Deleting directory /home/gankur/eclipse/workspace/pig_trunk/contrib/piggybank/java/build/test/logs
    [mkdir] Created dir: /home/gankur/eclipse/workspace/pig_trunk/contrib/piggybank/java/build/test/logs
    [junit] Running org.apache.pig.piggybank.test.evaluation.TestEvalString
    [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.15 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.TestMathUDF
    [junit] Tests run: 35, Failures: 0, Errors: 0, Time elapsed: 0.123 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.TestStat
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.114 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.datetime.TestDiffDate
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.105 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.decode.TestDecode
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.089 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.string.TestHashFNV
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.094 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.string.TestLookupInFiles
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 17.163 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.string.TestRegex
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.092 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.TestSearchQuery
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.093 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.TestTop
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.099 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.apachelogparser.TestDateExtractor
    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.087 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.apachelogparser.TestHostExtractor
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.083 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.apachelogparser.TestSearchEngineExtractor
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.091 sec
    [junit] Running org.apache.pig.piggybank.test.evaluation.util.apachelogparser.TestSearchTermExtractor
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.1 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestCombinedLogLoader
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.535 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestCommonLogLoader
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.54 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestHelper
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.014 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestMultiStorage
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 16.964 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestMyRegExLoader
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.452 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestRegExLoader
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.302 sec
    [junit] Running org.apache.pig.piggybank.test.storage.TestSequenceFileLoader
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.883 sec

BUILD SUCCESSFUL
Total time: 58 seconds



> Splitting output data on key field
> ----------------------------------
>
>                 Key: PIG-958
>                 URL: https://issues.apache.org/jira/browse/PIG-958
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Ankur
>         Attachments: 958.v3.patch, 958.v4.patch
>
>
> Pig users often face the need to split the output records into a bunch of 
> files and directories depending on the type of record. Pig's SPLIT operator 
> is useful when record types are few and known in advance. In cases where type 
> is not directly known but is derived dynamically from values of a key field 
> in the output tuple, a custom store function is a better solution.
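
To illustrate the idea in the issue description, here is a minimal sketch in 
plain Python. It is not Pig's store function API and not the code in the 
attached patches; the function name split_by_key, the per-key directory 
layout, and the part-00000 file name are assumptions made for illustration. 
It routes each record to a directory named after the value of one key field, 
and opens an output file only when the first record for that key arrives, so 
no empty part files are created.

```python
import os
import tempfile


def split_by_key(records, key_index, out_dir, delimiter="\t"):
    """Write each record to out_dir/<key>/part-00000, chosen by one field.

    Hypothetical helper: keys are used directly as directory names, so
    this sketch assumes they contain no path separators.
    """
    handles = {}
    try:
        for record in records:
            key = str(record[key_index])
            if key not in handles:
                # Create the per-key directory and file lazily, on the
                # first record seen for this key.
                key_dir = os.path.join(out_dir, key)
                os.makedirs(key_dir, exist_ok=True)
                handles[key] = open(os.path.join(key_dir, "part-00000"), "w")
            line = delimiter.join(str(field) for field in record)
            handles[key].write(line + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    # Return the distinct keys that received at least one record.
    return sorted(handles)


if __name__ == "__main__":
    rows = [("a", 1), ("b", 2), ("a", 3)]
    with tempfile.TemporaryDirectory() as tmp:
        print(split_by_key(rows, 0, tmp))
```

Unlike SPLIT, nothing here needs to know the set of keys in advance; the 
output layout is derived from the data as it is written.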

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
