[ 
https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012785#comment-13012785
 ] 

Olga Natkovich commented on PIG-1899:
-------------------------------------

The patch looks good. A few comments:

(1) Several scripts are placed directly under pig/tools. I wonder whether we
should have a test subdirectory under it.
(2) It would be great for each file, especially scripts and UDFs, to have a
little more information on what it does. For instance, generate_data.pl just
says that it generates data, but not what kind or what parameters it supports.
(3) CreateMap.java, TOMAP.java - there is already a TOMAP function in builtin,
which I think does something very similar.
(4) UPPER.java, TOBAG.java - these are also part of builtins.
(5) pig/udfs/java/build.xml - not sure exactly what this is for, but the
location is kind of strange and it also refers to HowlDriver.
(6) There is also one reference to yahoo that needs to be removed (just grep
for yahoo).

> Pig needs a tool for doing end to end testing efficiently
> ---------------------------------------------------------
>
>                 Key: PIG-1899
>                 URL: https://issues.apache.org/jira/browse/PIG-1899
>             Project: Pig
>          Issue Type: Test
>          Components: tools
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: PIG-1899.patch, PIG-1899.patch, e2e.patch
>
>
> Pig currently uses junit for all testing.  junit is good for unit tests, but 
> limited for end to end and integration testing.
> Building an end to end test in junit is cumbersome (a lot of setup and such
> to do using MiniCluster).  Given that expected results must be known
> beforehand and hand crafted, they must be kept very small, usually ten or
> fewer rows.  This does not lead to realistic testing scenarios.
> A test tool is needed that allows the test developer to write a Pig Latin 
> script and specify a source of truth against which to test the results of 
> running this Pig Latin script.  A database or a previous version of Pig can 
> then be used as that source of truth.  This will allow developers to quickly 
> add new tests that return more than trivial results.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
