Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

2019-04-10 Thread Nandor Kollar via Review Board


> On Jan. 26, 2018, 4:58 p.m., Rohini Palaniswamy wrote:
> > bin/pig
> > Line 480 (original), 478 (patched)
> > 
> >
> > How does it work when both h2 and h3 jars are generated in one target?

Is it possible to build both the h2 and h3 jars into one target? I assumed that 
only one jar is present there, built for either Hadoop 2 or Hadoop 3.

If two jars are present, Pig won't start:
Error: Could not find or load main class org.apache.pig.Main

Should we prepare for this case too? I'm not sure what the best approach is here.
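
One possible guard for the two-jar case, as a minimal sketch only: pick the core jar by Hadoop major version and fail with a clear message when the choice is ambiguous. The helper name `pick_core_jar`, the `pig-<version>-core-h<major>.jar` naming scheme and the `0.18.0` version are assumptions for illustration; none of this is taken from bin/pig or bin/pig.py.

```python
import re

# Assumed naming scheme: pig-<version>-core-h<major>.jar (hypothetical here).
CORE_JAR = re.compile(r"^pig-.+-core-h(\d+)\.jar$")


def pick_core_jar(jar_names, hadoop_major):
    """Return the one core jar built for the given Hadoop major version."""
    matches = []
    for name in jar_names:
        found = CORE_JAR.match(name)
        if found and int(found.group(1)) == hadoop_major:
            matches.append(name)
    if len(matches) != 1:
        # Zero or several candidates: refuse to guess and report what was seen.
        raise RuntimeError(
            "expected exactly one core jar for Hadoop %d, found %d: %s"
            % (hadoop_major, len(matches), matches)
        )
    return matches[0]


# Hypothetical directory listing containing both jars.
jars = ["pig-0.18.0-core-h2.jar", "pig-0.18.0-core-h3.jar", "piggybank.jar"]
print(pick_core_jar(jars, 3))
```

With both jars present, the launcher would select by version instead of putting both on the classpath. How the Hadoop major version is detected is left open here.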


- Nandor


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/#review196182
---


On April 10, 2019, 11:49 a.m., Nandor Kollar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65239/
> ---
> 
> (Updated April 10, 2019, 11:49 a.m.)
> 
> 
> Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and 
> Adam Szita.
> 
> 
> Repository: pig-git
> 
> 
> Description
> ---
> 
> This is an initial patch that adds Hadoop 3 support to Pig in addition to 
> Hadoop 2.
> 
> Major modifications:
>  * No breaking API change was introduced in Hadoop 3; the current code 
> compiles with Hadoop 3
>  * The hadoopversion property selects which Hadoop version the tests run 
> against; the default is Hadoop 2
>  * Hadoop 3 introduced a security fix: only whitelisted environment variables 
> are passed to MiniCluster
>  * In Hadoop 3, hadoop-site.xml is deprecated and replaced by 
> core-site.xml, hdfs-site.xml and mapred-site.xml. For the sake of simplicity, 
> I decided to write the config into all of these files in MiniCluster.java 
> (and into hadoop-site.xml too, to stay compatible with Hadoop 2). We might 
> want to have different files for Hadoop 2 and separate the properties for 
> Hadoop 3.
>  * TestErrorHandling.java: small format change in the error message; modified 
> the assert so it works on both Hadoop 2 and Hadoop 3
>  * HadoopShims: the code is identical to Hadoop 2, so I'm not sure we need 
> shims any more. I think we should move it to src instead.
>  * Split properties into 3 files: common properties, Hadoop 2 specific 
> properties and Hadoop 3 specific properties
>  * ivy.xml: new config for Hadoop 3
>  * build.xml: new target to package both hadoop2 and hadoop3. I'm not sure 
> this is needed; if we move the shims, then I think we don't need this target
>  * HBase unit test fails on Hadoop 3 (as per 
> https://hbase.apache.org/book.html HBase 1.x is not tested against Hadoop 3)
> 
> 
> Diffs
> ---
> 
>   bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
>   bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
>   build.xml 9bb1b125eef3c9381466b8f043ecd31ff0d56dc7 
>   ivy.xml a93da2a9a20edf103dcbf08abc92301a1c2fda9e 
>   ivy/libraries-h2.properties PRE-CREATION 
>   ivy/libraries-h3.properties PRE-CREATION 
>   ivy/libraries.properties 1abc9677e6c0747bb102c563b4f6642274541476 
>   test/org/apache/pig/parser/TestErrorHandling.java 
> 15e09031c360cea5f81609129ac3a6d38d68d3ea 
>   test/org/apache/pig/parser/TestQueryParserUtils.java 
> 1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
>   test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
>   test/org/apache/pig/test/MiniCluster.java 
> a7532ad750f06ffae5a03024b1658ff77152c902 
>   test/org/apache/pig/test/MiniGenericCluster.java 
> 674860f880407595d68c4eea2b67e2d6465417fe 
>   test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
>   test/org/apache/pig/test/YarnMiniCluster.java 
> 69d808124a4e9be661f1fda25755075dcb6607b1 
> 
> 
> Diff: https://reviews.apache.org/r/65239/diff/4/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Nandor Kollar
> 
>



Re: Review Request 65239: PIG-5253: Pig Hadoop 3 support

2019-04-10 Thread Nandor Kollar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65239/
---

(Updated April 10, 2019, 11:49 a.m.)


Review request for pig, Daniel Dai, Koji Noguchi, Rohini Palaniswamy, and Adam 
Szita.


Changes
---

- Hadoop 2 jar is created even when build is hadoop 3
- missing important h3 dependencies for local mode


Repository: pig-git


Description
---

This is an initial patch that adds Hadoop 3 support to Pig in addition to 
Hadoop 2.

Major modifications:
 * No breaking API change was introduced in Hadoop 3; the current code compiles 
with Hadoop 3
 * The hadoopversion property selects which Hadoop version the tests run 
against; the default is Hadoop 2
 * Hadoop 3 introduced a security fix: only whitelisted environment variables 
are passed to MiniCluster
 * In Hadoop 3, hadoop-site.xml is deprecated and replaced by core-site.xml, 
hdfs-site.xml and mapred-site.xml. For the sake of simplicity, I decided to 
write the config into all of these files in MiniCluster.java (and into 
hadoop-site.xml too, to stay compatible with Hadoop 2). We might want to have 
different files for Hadoop 2 and separate the properties for Hadoop 3.
 * TestErrorHandling.java: small format change in the error message; modified 
the assert so it works on both Hadoop 2 and Hadoop 3
 * HadoopShims: the code is identical to Hadoop 2, so I'm not sure we need shims 
any more. I think we should move it to src instead.
 * Split properties into 3 files: common properties, Hadoop 2 specific 
properties and Hadoop 3 specific properties
 * ivy.xml: new config for Hadoop 3
 * build.xml: new target to package both hadoop2 and hadoop3. I'm not sure 
this is needed; if we move the shims, then I think we don't need this target
 * HBase unit test fails on Hadoop 3 (as per https://hbase.apache.org/book.html 
HBase 1.x is not tested against Hadoop 3)
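
The site-file point above can be sketched as follows. This is an illustration in Python, not the patch's Java code in MiniCluster.java: it writes one set of properties, in Hadoop's `<configuration>`/`<property>` XML layout, into all four file names mentioned in the description. The helper `write_site_files` and the sample property value are assumptions.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# File names from the description: the legacy file plus the three replacements.
SITE_FILES = ("hadoop-site.xml", "core-site.xml", "hdfs-site.xml", "mapred-site.xml")


def write_site_files(conf_dir, properties):
    """Write the same properties into every site file; return the paths."""
    root = ET.Element("configuration")
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = properties[name]
    tree = ET.ElementTree(root)
    written = []
    for file_name in SITE_FILES:
        path = os.path.join(conf_dir, file_name)
        tree.write(path, encoding="utf-8", xml_declaration=True)
        written.append(path)
    return written


# Hypothetical usage with a throwaway directory and a sample property.
conf_dir = tempfile.mkdtemp()
paths = write_site_files(conf_dir, {"fs.defaultFS": "hdfs://localhost:9000"})
print([os.path.basename(p) for p in paths])
```

Writing identical content everywhere keeps the test setup simple; the cost is that version-specific properties are not separated, which is the follow-up the description suggests.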


Diffs (updated)
---

  bin/pig 3fcf165106cccbe75fc1c61ea74732456ae50fc7 
  bin/pig.py b6c396579c54359f430c6e74d055ec7f27ae2197 
  build.xml 9bb1b125eef3c9381466b8f043ecd31ff0d56dc7 
  ivy.xml a93da2a9a20edf103dcbf08abc92301a1c2fda9e 
  ivy/libraries-h2.properties PRE-CREATION 
  ivy/libraries-h3.properties PRE-CREATION 
  ivy/libraries.properties 1abc9677e6c0747bb102c563b4f6642274541476 
  test/org/apache/pig/parser/TestErrorHandling.java 
15e09031c360cea5f81609129ac3a6d38d68d3ea 
  test/org/apache/pig/parser/TestQueryParserUtils.java 
1c217e3cab9c4b5dc51289a883aa696dcd2feeea 
  test/org/apache/pig/test/MapReduceMiniCluster.java PRE-CREATION 
  test/org/apache/pig/test/MiniCluster.java 
a7532ad750f06ffae5a03024b1658ff77152c902 
  test/org/apache/pig/test/MiniGenericCluster.java 
674860f880407595d68c4eea2b67e2d6465417fe 
  test/org/apache/pig/test/Util.java 788a72fe3ceca08ec61ae425a393b5b0936454f4 
  test/org/apache/pig/test/YarnMiniCluster.java 
69d808124a4e9be661f1fda25755075dcb6607b1 


Diff: https://reviews.apache.org/r/65239/diff/4/

Changes: https://reviews.apache.org/r/65239/diff/3-4/


Testing
---


Thanks,

Nandor Kollar



[jira] Subscription: PIG patch available

2019-04-10 Thread jira
Issue Subscription
Filter: PIG patch available (38 issues)

Subscriber: pigdaily

Key         Summary
PIG-5386    Pig local mode with bundled Hadoop broken
            https://issues.apache.org/jira/browse/PIG-5386
PIG-5385    Skip calling extra gc() before spilling large bag when unnecessary
            https://issues.apache.org/jira/browse/PIG-5385
PIG-5377    Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
            https://issues.apache.org/jira/browse/PIG-5377
PIG-5369    Add llap-client dependency
            https://issues.apache.org/jira/browse/PIG-5369
PIG-5360    Pig sets working directory of input file systems causes exception thrown
            https://issues.apache.org/jira/browse/PIG-5360
PIG-5338    Prevent deep copy of DataBag into Jython List
            https://issues.apache.org/jira/browse/PIG-5338
PIG-5323    Implement LastInputStreamingOptimizer in Tez
            https://issues.apache.org/jira/browse/PIG-5323
PIG-5273    _SUCCESS file should be created at the end of the job
            https://issues.apache.org/jira/browse/PIG-5273
PIG-5256    Bytecode generation for POFilter and POForeach
            https://issues.apache.org/jira/browse/PIG-5256
PIG-5160    SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
            https://issues.apache.org/jira/browse/PIG-5160
PIG-5115    Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
            https://issues.apache.org/jira/browse/PIG-5115
PIG-5106    Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
            https://issues.apache.org/jira/browse/PIG-5106
PIG-5081    Can not run pig on spark source code distribution
            https://issues.apache.org/jira/browse/PIG-5081
PIG-5080    Support store alias as spark table
            https://issues.apache.org/jira/browse/PIG-5080
PIG-5057    IndexOutOfBoundsException when pig reducer processOnePackageOutput
            https://issues.apache.org/jira/browse/PIG-5057
PIG-5029    Optimize sort case when data is skewed
            https://issues.apache.org/jira/browse/PIG-5029
PIG-4926    Modify the content of start.xml for spark mode
            https://issues.apache.org/jira/browse/PIG-4926
PIG-4913    Reduce jython function initiation during compilation
            https://issues.apache.org/jira/browse/PIG-4913
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues.apache.org/jira/browse/PIG-4849
PIG-4750    REPLACE_MULTI should compile Pattern once and reuse it
            https://issues.apache.org/jira/browse/PIG-4750
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4373    Implement PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-1804    Alow Jython function to implement Algebraic and/or Accumulator interfaces
            https://issues.apache.org/jira/browse/PIG-1804

You may edit this subscription at:
https://issues.apache.org/jira/secure/EditSubscription!default.jspa?subId=16328=12322384