[jira] [Commented] (PIG-5140) fix TestEmptyInputDir unit test failure after PIG-5132

2017-02-27 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885835#comment-15885835
 ] 

Adam Szita commented on PIG-5140:
-

Seems like Pig on Spark was never able to handle empty directories. If 0 splits 
(partitions) are read, Spark cuts the job submission short before notifying 
listeners here: 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L577.
Because of this, the mapping of the new JobId to its job group never happens as 
it normally would at: 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala#L190

Uploaded [^PIG-5140.0.patch] to fix Pig's Spark integration so that it doesn't 
expect new JobIds when the Store RDD has zero partitions.
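
For illustration, a minimal sketch of that kind of guard (class and method names are hypothetical, not the actual patch code):
{code}
import org.apache.spark.api.java.JavaRDD;

// Minimal sketch, assuming the fix boils down to not waiting for a new Spark JobId
// when the Store RDD is empty. Names here are illustrative only.
public final class EmptyInputGuard {
    private EmptyInputGuard() {}

    /**
     * Returns true if Pig should expect (and wait for) a new Spark JobId for this
     * Store RDD. With zero partitions the DAGScheduler returns early, no job start
     * event is fired, and no JobId-to-job-group mapping is ever registered.
     */
    public static boolean expectsNewJobId(JavaRDD<?> storeRdd) {
        return storeRdd.getNumPartitions() > 0;
    }
}
{code}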

> fix TestEmptyInputDir unit test failure after PIG-5132
> --
>
> Key: PIG-5140
> URL: https://issues.apache.org/jira/browse/PIG-5140
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5140.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5140) fix TestEmptyInputDir unit test failure after PIG-5132

2017-02-27 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5140:

Attachment: PIG-5140.0.patch

> fix TestEmptyInputDir unit test failure after PIG-5132
> --
>
> Key: PIG-5140
> URL: https://issues.apache.org/jira/browse/PIG-5140
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5140.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (PIG-5140) fix TestEmptyInputDir unit test failure after PIG-5132

2017-02-27 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on PIG-5140 started by Adam Szita.
---
> fix TestEmptyInputDir unit test failure after PIG-5132
> --
>
> Key: PIG-5140
> URL: https://issues.apache.org/jira/browse/PIG-5140
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5153) Change of behavior in FLATTEN(map)

2017-02-27 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5153:
---

Assignee: Adam Szita

> Change of behavior in FLATTEN(map)
> --
>
> Key: PIG-5153
> URL: https://issues.apache.org/jira/browse/PIG-5153
> Project: Pig
>  Issue Type: Test
>Reporter: Koji Noguchi
>Assignee: Adam Szita
>Priority: Minor
>
> In PIG-5085 we changed the behavior of FLATTEN on map.  
> (I didn't even know this was even allowed until I saw the following test 
> failure.)
> e2e nightly FOREACH_6 started failing after this change.
> {code}
> 'num' => 6,
> 'pig' => q\register :FUNCPATH:/testudf.jar;
> a = load ':INPATH:/singlefile/studenttab10k' as (name, age, gpa);
> b = foreach a generate flatten(name) as n, 
> flatten(org.apache.pig.test.udf.evalfunc.CreateMap((chararray)name, gpa)) as 
> m;
> store b into ':OUTPATH:' using 
> org.apache.pig.test.udf.storefunc.StringStore();\,
> },
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5153) Change of behavior in FLATTEN(map)

2017-02-27 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885501#comment-15885501
 ] 

Adam Szita commented on PIG-5153:
-

[~rohini], sure I'll handle this

> Change of behavior in FLATTEN(map)
> --
>
> Key: PIG-5153
> URL: https://issues.apache.org/jira/browse/PIG-5153
> Project: Pig
>  Issue Type: Test
>Reporter: Koji Noguchi
>Priority: Minor
>
> In PIG-5085 we changed the behavior of FLATTEN on map.  
> (I didn't even know this was even allowed until I saw the following test 
> failure.)
> e2e nightly FOREACH_6 started failing after this change.
> {code}
> 'num' => 6,
> 'pig' => q\register :FUNCPATH:/testudf.jar;
> a = load ':INPATH:/singlefile/studenttab10k' as (name, age, gpa);
> b = foreach a generate flatten(name) as n, 
> flatten(org.apache.pig.test.udf.evalfunc.CreateMap((chararray)name, gpa)) as 
> m;
> store b into ':OUTPATH:' using 
> org.apache.pig.test.udf.storefunc.StringStore();\,
> },
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-27 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885405#comment-15885405
 ] 

Adam Szita commented on PIG-5132:
-

[~kellyzly] what you are talking about (ivy/lib/Pig) is build related.
The list above is in *runtime.dependencies-withouthadoop.jar* and is runtime 
related. These jars were there before the merge, and I think we should leave 
them there.

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, hadoop-streaming.jar, 
> jenkins.5132.2.fix.PNG, PIG-5132.1_fixes.patch, PIG-5132.1.patch, 
> PIG-5132.2.fix.patch, PIG-5132.2.zip, PIG-5132.aftermerge.0.patch, 
> PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5140) fix TestEmptyInputDir unit test failure after PIG-5132

2017-02-24 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882967#comment-15882967
 ] 

Adam Szita commented on PIG-5140:
-

Looks like when using Spark as the execution engine we get exceptions if we try 
to load empty inputs. I will take a look.

> fix TestEmptyInputDir unit test failure after PIG-5132
> --
>
> Key: PIG-5140
> URL: https://issues.apache.org/jira/browse/PIG-5140
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5140) fix TestEmptyInputDir unit test failure after PIG-5132

2017-02-24 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5140:
---

Assignee: Adam Szita

> fix TestEmptyInputDir unit test failure after PIG-5132
> --
>
> Key: PIG-5140
> URL: https://issues.apache.org/jira/browse/PIG-5140
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-24 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882252#comment-15882252
 ] 

Adam Szita commented on PIG-5132:
-

Attached [^PIG-5132.aftermerge.0.patch] with the small fixes and 
hadoop-streaming.jar. Please apply the patch and replace hadoop-streaming.jar 
under test/e2e/pig/lib

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, hadoop-streaming.jar, 
> jenkins.5132.2.fix.PNG, PIG-5132.1_fixes.patch, PIG-5132.1.patch, 
> PIG-5132.2.fix.patch, PIG-5132.2.zip, PIG-5132.aftermerge.0.patch, 
> PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-24 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: hadoop-streaming.jar

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, hadoop-streaming.jar, 
> jenkins.5132.2.fix.PNG, PIG-5132.1_fixes.patch, PIG-5132.1.patch, 
> PIG-5132.2.fix.patch, PIG-5132.2.zip, PIG-5132.aftermerge.0.patch, 
> PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-24 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: PIG-5132.aftermerge.0.patch

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, jenkins.5132.2.fix.PNG, 
> PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.2.fix.patch, 
> PIG-5132.2.zip, PIG-5132.aftermerge.0.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-24 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882240#comment-15882240
 ] 

Adam Szita commented on PIG-5132:
-

Thanks [~kellyzly]! I see your new commit, it looks much, much better now.
There are only a few things we still have to take care of, in:
- libraries.properties
- POMergeJoin.java
- hadoop-streaming.jar
- build.xml
- ivy.xml

I'll put together a quick patch for these shortly, and then it should be good.

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, jenkins.5132.2.fix.PNG, 
> PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.2.fix.patch, 
> PIG-5132.2.zip, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (PIG-5149) fix TestPredeployedJar unit test failure after PIG-5132

2017-02-23 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita resolved PIG-5149.
-
Resolution: Duplicate

> fix TestPredeployedJar unit test failure after PIG-5132
> ---
>
> Key: PIG-5149
> URL: https://issues.apache.org/jira/browse/PIG-5149
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5149) fix TestPredeployedJar unit test failure after PIG-5132

2017-02-23 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880706#comment-15880706
 ] 

Adam Szita commented on PIG-5149:
-

This seems to fail on some OSes and pass on others. It seems that this is not 
Spark related; see PIG-5152. I'm closing this one as a duplicate.

> fix TestPredeployedJar unit test failure after PIG-5132
> ---
>
> Key: PIG-5149
> URL: https://issues.apache.org/jira/browse/PIG-5149
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5154) Fix GFCross related issues after merging from trunk to spark

2017-02-23 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5154:

Status: Patch Available  (was: Open)

> Fix GFCross related issues after merging from trunk to spark
> 
>
> Key: PIG-5154
> URL: https://issues.apache.org/jira/browse/PIG-5154
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5154.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5154) Fix GFCross related issues after merging from trunk to spark

2017-02-23 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880420#comment-15880420
 ] 

Adam Szita commented on PIG-5154:
-

Several unit tests were failing after the merge because GFCross started to 
depend on the MRConfiguration.TASK_ID value, which is not set in Spark mode. 
Attached [^PIG-5154.0.patch] to fix this.
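
A minimal sketch of the kind of workaround this implies (the property key and names are assumptions, not the committed patch):
{code}
import org.apache.hadoop.conf.Configuration;

// Minimal sketch, assuming the fix amounts to providing a task id value in Spark
// mode before GFCross reads it. The property key and naming scheme are assumptions.
public final class SparkTaskIdDefaults {
    // Assumption: MRConfiguration.TASK_ID resolves to this Hadoop property name.
    private static final String TASK_ID_KEY = "mapreduce.task.id";

    private SparkTaskIdDefaults() {}

    public static void ensureTaskId(Configuration conf, int partitionIndex) {
        if (conf.get(TASK_ID_KEY) == null) {
            // Any stable per-partition value lets GFCross derive its parallelism seed.
            conf.set(TASK_ID_KEY, "task_spark_partition_" + partitionIndex);
        }
    }
}
{code}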

> Fix GFCross related issues after merging from trunk to spark
> 
>
> Key: PIG-5154
> URL: https://issues.apache.org/jira/browse/PIG-5154
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5154.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5154) Fix GFCross related issues after merging from trunk to spark

2017-02-23 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5154:

Attachment: PIG-5154.0.patch

> Fix GFCross related issues after merging from trunk to spark
> 
>
> Key: PIG-5154
> URL: https://issues.apache.org/jira/browse/PIG-5154
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5154.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (PIG-5154) Fix GFCross related issues after merging from trunk to spark

2017-02-23 Thread Adam Szita (JIRA)
Adam Szita created PIG-5154:
---

 Summary: Fix GFCross related issues after merging from trunk to 
spark
 Key: PIG-5154
 URL: https://issues.apache.org/jira/browse/PIG-5154
 Project: Pig
  Issue Type: Sub-task
Reporter: Adam Szita
Assignee: Adam Szita






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5137) fix TestBuiltin unit test failure after PIG-5132

2017-02-23 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880413#comment-15880413
 ] 

Adam Szita commented on PIG-5137:
-

Also, please don't commit this until the final patch for the parent ticket 
PIG-5132 is committed.

> fix TestBuiltin unit test failure  after PIG-5132
> -
>
> Key: PIG-5137
> URL: https://issues.apache.org/jira/browse/PIG-5137
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5137.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5137) fix TestBuiltin unit test failure after PIG-5132

2017-02-23 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880254#comment-15880254
 ] 

Adam Szita commented on PIG-5137:
-

+1
The failing test was testUniqueID().
For future reference: after merging from trunk the input changed from 
"1\n2\n3\n4\n5\n" to "1\n2\n3\n4\n5"; the last newline char was removed by 
PIG-4881, so we no longer have to handle Spark mode as a special case here.

> fix TestBuiltin unit test failure  after PIG-5132
> -
>
> Key: PIG-5137
> URL: https://issues.apache.org/jira/browse/PIG-5137
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5137.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-23 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880198#comment-15880198
 ] 

Adam Szita commented on PIG-5132:
-

[~kellyzly]: there are still a lot of differences that your patch doesn't cover. 
I only listed some examples before, not the whole list; see 
[^diffOfPatches.png].
Here:
- blue means a diff in the file
- black denotes files missing from your side
- green means files that should be deleted but are still present on your side
You can see that throughout ivy, documentation, hadoop-streaming and other 
locations there are still differences to be pulled from trunk into your patch.

So your recent patch (PIG-5132.2.fix.patch) only fixes 4 problems, and I don't 
think it is worth fixing every little issue one by one manually.
I still think you should revert the committed patch and *git apply 
PIG-5132.1.patch*. After this you have to *git apply PIG-5132.1_fixes.patch*, as 
I wrote in the steps above.
The reason why I attached two diffs is:
PIG-5132.1.patch: the merge-from-trunk diff (but this does not necessarily result in 
a buildable Pig code base)
PIG-5132.1_fixes.patch: takes care of the necessary changes on the spark branch side to 
follow trunk properly (e.g. removing SparkMiniCluster.java from hadoop20, 
moving SparkMiniCluster from hadoop23 under the pig/test directory)

If you apply these patches one after the other (using git apply), the hadoop20 and 
hadoop23 dirs get removed from shims because git doesn't keep empty directory 
structures.


> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, jenkins.5132.2.fix.PNG, 
> PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.2.fix.patch, 
> PIG-5132.2.zip, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-23 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: diffOfPatches.png

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: diffOfPatches.png, jenkins.5132.2.fix.PNG, 
> PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.2.fix.patch, 
> PIG-5132.2.zip, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5145) fix TestLineageFindRelVisitor unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5145:
---

Assignee: Adam Szita

> fix TestLineageFindRelVisitor unit test failure after PIG-5132
> --
>
> Key: PIG-5145
> URL: https://issues.apache.org/jira/browse/PIG-5145
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5145.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5145) fix TestLineageFindRelVisitor unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878567#comment-15878567
 ] 

Adam Szita commented on PIG-5145:
-

The order of tuples is reversed after the join in 
testUDFForwardingLoadCasterWithMultipleParams.
Expected order:
"123"
"456"
"789"
Actual order:
"789"
"456"
"123"

I'm not sure there is a contract that a join must preserve order.
If not, we can add an order-by statement to this test case (which is about 
types, so it should be valid to do), just like in the attached patch.
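
For illustration, a minimal sketch of that adjustment (aliases and inputs are hypothetical, not the actual test code):
{code}
import java.util.Iterator;

import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

// Minimal sketch, assuming the patch just adds an ORDER BY after the join so the
// expected "123"/"456"/"789" sequence no longer depends on join output order.
public final class OrderedJoinCheck {
    private OrderedJoinCheck() {}

    public static Iterator<Tuple> openSortedJoin(PigServer pig) throws Exception {
        pig.registerQuery("a = LOAD 'input1' AS (f1:chararray);");
        pig.registerQuery("b = LOAD 'input2' AS (f1:chararray);");
        pig.registerQuery("joined = JOIN a BY f1, b BY f1;");
        // Deterministic order for the assertions that follow in the test
        pig.registerQuery("sorted = ORDER joined BY a::f1;");
        return pig.openIterator("sorted");
    }
}
{code}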

> fix TestLineageFindRelVisitor unit test failure after PIG-5132
> --
>
> Key: PIG-5145
> URL: https://issues.apache.org/jira/browse/PIG-5145
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5145.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5145) fix TestLineageFindRelVisitor unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5145:

Attachment: PIG-5145.0.patch

> fix TestLineageFindRelVisitor unit test failure after PIG-5132
> --
>
> Key: PIG-5145
> URL: https://issues.apache.org/jira/browse/PIG-5145
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5145.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (PIG-5147) fix TestMultiQuery unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita resolved PIG-5147.
-
Resolution: Duplicate

> fix TestMultiQuery unit test failure after PIG-5132
> ---
>
> Key: PIG-5147
> URL: https://issues.apache.org/jira/browse/PIG-5147
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (PIG-5144) fix TestJoinSmoke unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878430#comment-15878430
 ] 

Adam Szita edited comment on PIG-5144 at 2/22/17 3:17 PM:
--

Seems to be FRJoin related:
{code}
java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.apache.pig.test.TestJoinSmoke.testFRJoin(TestJoinSmoke.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
{code}


was (Author: szita):
Seems to be FRJoin related

> fix TestJoinSmoke unit test failure after PIG-5132
> --
>
> Key: PIG-5144
> URL: https://issues.apache.org/jira/browse/PIG-5144
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Nandor Kollar
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5144) fix TestJoinSmoke unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5144:
---

Assignee: Nandor Kollar

> fix TestJoinSmoke unit test failure after PIG-5132
> --
>
> Key: PIG-5144
> URL: https://issues.apache.org/jira/browse/PIG-5144
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Nandor Kollar
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5144) fix TestJoinSmoke unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878430#comment-15878430
 ] 

Adam Szita commented on PIG-5144:
-

Seems to be FRJoin related

> fix TestJoinSmoke unit test failure after PIG-5132
> --
>
> Key: PIG-5144
> URL: https://issues.apache.org/jira/browse/PIG-5144
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Nandor Kollar
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5138) fix TestCounters unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878424#comment-15878424
 ] 

Adam Szita commented on PIG-5138:
-

We just need to exclude this suite on Spark. There are a lot of metrics that throw 
UnsupportedOperationException in Spark mode. (This test was not failing before 
the merge because MR mode was hard-wired into it.)
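
For illustration, one way the exclusion could be expressed (a sketch assuming JUnit 4 and the test.exec.type system property used by the build; the attached patch may do it differently, e.g. via an exclude list):
{code}
import org.junit.Assume;
import org.junit.Before;

// Minimal sketch, assuming the exclusion is expressed as a JUnit assumption keyed
// on the test.exec.type system property.
public class TestCountersSparkGuard {
    @Before
    public void skipOnSpark() {
        String execType = System.getProperty("test.exec.type", "mr");
        Assume.assumeFalse("Counters are unsupported in Spark mode",
                "spark".equalsIgnoreCase(execType));
    }
}
{code}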

> fix TestCounters unit test failure after PIG-5132
> -
>
> Key: PIG-5138
> URL: https://issues.apache.org/jira/browse/PIG-5138
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5138.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5138) fix TestCounters unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5138:

Attachment: PIG-5138.0.patch

> fix TestCounters unit test failure after PIG-5132
> -
>
> Key: PIG-5138
> URL: https://issues.apache.org/jira/browse/PIG-5138
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5138.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5138) fix TestCounters unit test failure after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5138:
---

Assignee: Adam Szita

> fix TestCounters unit test failure after PIG-5132
> -
>
> Key: PIG-5138
> URL: https://issues.apache.org/jira/browse/PIG-5138
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878265#comment-15878265
 ] 

Adam Szita commented on PIG-5132:
-

[~kellyzly] I see you have committed your patch (PIG-5132.2.patch). I think 
we'll be better off reverting this commit, as the patch contains a lot of errors.
Some examples:
1. XPathTest#testExecTupleWithDontIgnoreNamespace is present 3 times
2. XPathTest#testFunctionInXPath is present 2 times
3. Random whitespace came into NonFSLoadFunc:21 and :22
...etc.
It seems the tests cannot even be built (*ant clean jar test*) because the 
shims/test/hadoop2 folder doesn't exist.
This is because some SVN add/mv/rm commands still have to be run as well.

I think the following steps should be taken after reverting the last commit:
{code}
#my patch was generated using git diff HEAD~ --full-index --binary so it has every piece;
#no need to download anything else, just use git apply instead of patch
git apply PIG-5132.1.patch
git apply PIG-5132.1_fixes.patch
#let's check if everything is okay:
ant clean jar pigunit-jar
ant clean jar -f contrib/piggybank/java/build.xml
#delete build files before committing to svn
ant clean
ant clean -f contrib/piggybank/java/build.xml
#svn rm files that were deleted
svn rm contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/HadoopJobHistoryLoader.java
svn rm contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestHadoopJobHistoryLoader.java
svn rm shims/src/hadoop20/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapBase.java
svn rm shims/src/hadoop20/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapReduce.java
svn rm shims/src/hadoop20/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java
svn rm shims/src/hadoop20/org/apache/pig/backend/hadoop20/PigJobControl.java
svn rm shims/src/hadoop23/org/apache/hadoop/mapred/DowngradeHelper.java
svn rm shims/src/hadoop23/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapBase.java
svn rm shims/src/hadoop23/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapReduce.java
svn rm shims/src/hadoop23/org/apache/pig/backend/hadoop/executionengine/shims/HadoopShims.java
svn rm shims/src/hadoop23/org/apache/pig/backend/hadoop23/PigJobControl.java
svn rm shims/test/hadoop20/org/apache/pig/test/MiniCluster.java
svn rm shims/test/hadoop20/org/apache/pig/test/SparkMiniCluster.java
svn rm shims/test/hadoop20/org/apache/pig/test/TezMiniCluster.java
svn rm shims/test/hadoop23/org/apache/pig/test/MiniCluster.java
svn rm shims/test/hadoop23/org/apache/pig/test/SparkMiniCluster.java
svn rm shims/test/hadoop23/org/apache/pig/test/TezMiniCluster.java
svn rm src/META-INF/services/org.apache.hadoop.mapreduce.protocol.ClientProtocolProvider
svn rm src/docs/jdiff/pig_0.15.0.xml
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantBooleanObjectInspector.java
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantDoubleObjectInspector.java
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantFloatObjectInspector.java
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantIntObjectInspector.java
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantLongObjectInspector.java
svn rm src/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaConstantStringObjectInspector.java
svn rm test/e2e/pig/lib/hadoop-0.23.0-streaming.jar
svn rm test/excluded-tests-20
svn rm test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld
#recursively add every new file/folder (this is why we have to ant clean some steps above)
svn add .
#finally commit
svn commit
{code}




> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1_fixes.patch, PIG-5132.1.patch, 
> PIG-5132.2.zip, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (PIG-5134) fix TestAvroStorage unit test failures after PIG-5132

2017-02-22 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5134:
---

Assignee: Nandor Kollar

> fix  TestAvroStorage unit test failures after PIG-5132
> --
>
> Key: PIG-5134
> URL: https://issues.apache.org/jira/browse/PIG-5134
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Nandor Kollar
> Fix For: spark-branch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-22 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15877732#comment-15877732
 ] 

Adam Szita commented on PIG-5132:
-

Test runs are now finished with a lot of failures:
E2E (spark mode with MR as benchmark):
[exec] Final results ,PASSED: 515  FAILED: 28   SKIPPED: 95   ABORTED: 104  
FAILED DEPENDENCY: 0

Unit tests (-Dtest.exec.type=spark):
{code}
./TEST-org.apache.pig.builtin.TestAvroStorage.txt:Tests run: 41, Failures: 1, 
Errors: 0, Time elapsed: 23.737 sec
./TEST-org.apache.pig.builtin.TestOrcStoragePushdown.txt:Tests run: 16, 
Failures: 9, Errors: 0, Time elapsed: 32.895 sec
./TEST-org.apache.pig.test.pigunit.TestPigTest.txt:Tests run: 21, Failures: 0, 
Errors: 1, Time elapsed: 31.092 sec
./TEST-org.apache.pig.test.TestBuiltin.txt:Tests run: 75, Failures: 1, Errors: 
0, Time elapsed: 34.719 sec
./TEST-org.apache.pig.test.TestCounters.txt:Tests run: 13, Failures: 1, Errors: 
8, Time elapsed: 19.508 sec
./TEST-org.apache.pig.test.TestCustomPartitioner.txt:Tests run: 4, Failures: 0, 
Errors: 1, Time elapsed: 25.945 sec
./TEST-org.apache.pig.test.TestEmptyInputDir.txt:Tests run: 9, Failures: 4, 
Errors: 0, Time elapsed: 9.081 sec
./TEST-org.apache.pig.test.TestEvalPipeline2.txt:Tests run: 55, Failures: 0, 
Errors: 1, Time elapsed: 30.533 sec
./TEST-org.apache.pig.test.TestFRJoinNullValue.txt:Tests run: 4, Failures: 4, 
Errors: 0, Time elapsed: 8.468 sec
./TEST-org.apache.pig.test.TestFRJoin.txt:Tests run: 18, Failures: 11, Errors: 
0, Time elapsed: 18.057 sec
./TEST-org.apache.pig.test.TestJoinSmoke.txt:Tests run: 3, Failures: 1, Errors: 
0, Time elapsed: 9.582 sec
./TEST-org.apache.pig.test.TestLineageFindRelVisitor.txt:Tests run: 6, 
Failures: 1, Errors: 0, Time elapsed: 6.655 sec
./TEST-org.apache.pig.test.TestMultiQuery.txt:Tests run: 16, Failures: 2, 
Errors: 1, Time elapsed: 12.556 sec
./TEST-org.apache.pig.test.TestPigRunner.txt:Tests run: 32, Failures: 1, 
Errors: 0, Time elapsed: 35.882 sec
./TEST-org.apache.pig.test.TestPredeployedJar.txt:Tests run: 2, Failures: 0, 
Errors: 1, Time elapsed: 32.77 sec
./TEST-org.apache.pig.test.TestPruneColumn.txt:Tests run: 71, Failures: 2, 
Errors: 2, Time elapsed: 22.66 sec
{code}

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1_fixes.patch, PIG-5132.1.patch, 
> PIG-5132.2.zip, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-21 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876266#comment-15876266
 ] 

Adam Szita commented on PIG-5132:
-

[~kellyzly]: [~nkollar] and I have created the merge patch too, see 
[^PIG-5132.1.patch].
It is smaller because file movements are handled as moves rather than 
deletions/additions. It also includes binary file changes (e.g. ant-contrib.jar).
Basically the diff was generated as
{code}
git diff HEAD~ --full-index --binary
{code}
After resolving the merge conflicts we saw some build problems. These, along 
with some e2e testing fixes, can be found in a subsequent patch: 
[^PIG-5132.1_fixes.patch].
We're running unit and e2e tests right now.

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-21 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: PIG-5132.1_fixes.patch

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1_fixes.patch, PIG-5132.1.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-21 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: PIG-5132.1.patch

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-21 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: (was: PIG-5132.1.patch)

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5132) Merge from trunk (5) [Spark Branch]

2017-02-21 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5132:

Attachment: PIG-5132.1.patch

> Merge from trunk (5) [Spark Branch]
> ---
>
> Key: PIG-5132
> URL: https://issues.apache.org/jira/browse/PIG-5132
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5132.1.patch, PIG-5132.patch
>
>
> merge changes from trunk to branch.
> the latest commit in trunk is
>  92df45d - (origin/trunk, origin/HEAD, trunk) PIG-5085: Support FLATTEN of 
> maps (szita via rohini) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-02-21 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875950#comment-15875950
 ] 

Adam Szita commented on PIG-5110:
-

[~rohini]: doc and test updated in [^PIG-5110.1.patch] according to your 
comments

> Removing schema alias and :: coming from parent relation
> 
>
> Key: PIG-5110
> URL: https://issues.apache.org/jira/browse/PIG-5110
> Project: Pig
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: PIG-5110.0.patch, PIG-5110.1.patch
>
>
> Customers have asked for a feature to get rid of the schema alias prefixes. 
> CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field 
> alias and ::
> I would like to find a way to disable this feature. (The burden of making 
> sure not to have duplicate aliases - and hence the appropriate 
> FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-02-21 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5110:

Attachment: PIG-5110.1.patch

> Removing schema alias and :: coming from parent relation
> 
>
> Key: PIG-5110
> URL: https://issues.apache.org/jira/browse/PIG-5110
> Project: Pig
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: PIG-5110.0.patch, PIG-5110.1.patch
>
>
> Customers have asked for a feature to get rid of the schema alias prefixes. 
> CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field 
> alias and ::
> I would like to find a way to disable this feature. (The burden of making 
> sure not to have duplicate aliases - and hence the appropriate 
> FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5131) Typo issue in MinutesBetween

2017-02-20 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874715#comment-15874715
 ] 

Adam Szita commented on PIG-5131:
-

Hi [~kalyanhadoop], you can also attach a patch with the required change; 
please refer to 
https://cwiki.apache.org/confluence/display/PIG/HowToContribute#HowToContribute-MakingChanges
 for more info.

> Typo issue in MinutesBetween
> 
>
> Key: PIG-5131
> URL: https://issues.apache.org/jira/browse/PIG-5131
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Reporter: Kalyan
>Priority: Minor
>
> http://pig.apache.org/docs/r0.16.0/func.html#minutes-between
> Use the MinutsBetween function to get the number of minutes between the two 
> given datetime objects.
> Correct the above line like below:
> Use the MinutesBetween function to get the number of minutes between the two 
> given datetime objects.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-4899) The number of records of input file is calculated wrongly in spark mode in multiquery case

2017-02-20 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4899:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

>  The number of records of input file is calculated wrongly in spark mode in 
> multiquery case
> ---
>
> Key: PIG-4899
> URL: https://issues.apache.org/jira/browse/PIG-4899
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-4899.2.patch, PIG-4899.3IncrFrom2.patch, 
> PIG-4899.patch
>
>
> The sparkCounter that calculates the records of the input 
> file (LoadConverter#ToTupleFunction#apply) will be executed multiple times in 
> the multiquery case. This causes the input record count to be calculated 
> wrongly. For example:
> {code}
> #--
> # Spark Plan  
> #--
> Spark node scope-534
> Split - scope-548
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-538
> |   |
> |   |---C: Filter[bag] - scope-495
> |   |   |
> |   |   Less Than or Equal[boolean] - scope-498
> |   |   |
> |   |   |---Project[int][1] - scope-496
> |   |   |
> |   |   |---Constant(5) - scope-497
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp804709981:org.apache.pig.impl.io.InterStorage)
>  - scope-546
> |   |
> |   |---B: Filter[bag] - scope-507
> |   |   |
> |   |   Equal To[boolean] - scope-510
> |   |   |
> |   |   |---Project[int][0] - scope-508
> |   |   |
> |   |   |---Constant(3) - scope-509
> |
> |---A: New For Each(false,false,false)[bag] - scope-491
> |   |
> |   Cast[int] - scope-483
> |   |
> |   |---Project[bytearray][0] - scope-482
> |   |
> |   Cast[int] - scope-486
> |   |
> |   |---Project[bytearray][1] - scope-485
> |   |
> |   Cast[int] - scope-489
> |   |
> |   |---Project[bytearray][2] - scope-488
> |
> |---A: 
> Load(hdfs://localhost:48350/user/root/input:org.apache.pig.builtin.PigStorage)
>  - scope-481
> Spark node scope-540
> C: 
> Store(hdfs://localhost:48350/user/root/output:org.apache.pig.builtin.PigStorage)
>  - scope-502
> |
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-539
> Spark node scope-542
> D: 
> Store(hdfs://localhost:48350/user/root/output2:org.apache.pig.builtin.PigStorage)
>  - scope-533
> |
> |---D: FRJoin[tuple] - scope-525
> |   |
> |   Project[int][0] - scope-522
> |   |
> |   Project[int][0] - scope-523
> |   |
> |   Project[int][0] - scope-524
> |
> 
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-541
> Spark node scope-545
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp-2036144538:org.apache.pig.impl.io.InterStorage)
>  - scope-547
> |
> |---A1: New For Each(false,false,false)[bag] - scope-521
> |   |
> |   Cast[int] - scope-513
> |   |
> |   |---Project[bytearray][0] - scope-512
> |   |
> |   Cast[int] - scope-516
> |   |
> |   |---Project[bytearray][1] - scope-515
> |   |
> |   Cast[int] - scope-519
> |   |
> |   |---Project[bytearray][2] - scope-518
> |
> |---A1: 
> Load(hdfs://localhost:48350/user/root/input2:org.apache.pig.builtin.PigStorage)
>  - scope-511---
> {code}
> PhysicalOperator (LoadA) will be executed in 
> LoadConverter#ToTupleFunction#apply more times than it should, because this is 
> a multi-query case. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-4899) The number of records of input file is calculated wrongly in spark mode in multiquery case

2017-02-20 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874701#comment-15874701
 ] 

Adam Szita commented on PIG-4899:
-

I have tested this with a single script:
{code}
fs -rm -r multi_1 multi_2
A = LOAD 'ccymin.csv' AS (id:int, ccy:chararray);
B1 = FOREACH A GENERATE id;
B2 = FOREACH A GENERATE ccy;
STORE B1 INTO 'multi_1';
STORE B2 INTO 'multi_2';
{code}

This used to throw an NPE in yarn mode, but with the last change 
(0786828e276f6db3f6355324f170e2a4658ce7fc) it succeeds.
Nandor, Kelly: thanks for the review and the commit, I'm marking this as 
resolved.

>  The number of records of input file is calculated wrongly in spark mode in 
> multiquery case
> ---
>
> Key: PIG-4899
> URL: https://issues.apache.org/jira/browse/PIG-4899
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-4899.2.patch, PIG-4899.3IncrFrom2.patch, 
> PIG-4899.patch
>
>
> The sparkCounter that calculates the records of the input 
> file (LoadConverter#ToTupleFunction#apply) will be executed multiple times in 
> the multiquery case. This causes the input record count to be calculated 
> wrongly. For example:
> {code}
> #--
> # Spark Plan  
> #--
> Spark node scope-534
> Split - scope-548
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-538
> |   |
> |   |---C: Filter[bag] - scope-495
> |   |   |
> |   |   Less Than or Equal[boolean] - scope-498
> |   |   |
> |   |   |---Project[int][1] - scope-496
> |   |   |
> |   |   |---Constant(5) - scope-497
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp804709981:org.apache.pig.impl.io.InterStorage)
>  - scope-546
> |   |
> |   |---B: Filter[bag] - scope-507
> |   |   |
> |   |   Equal To[boolean] - scope-510
> |   |   |
> |   |   |---Project[int][0] - scope-508
> |   |   |
> |   |   |---Constant(3) - scope-509
> |
> |---A: New For Each(false,false,false)[bag] - scope-491
> |   |
> |   Cast[int] - scope-483
> |   |
> |   |---Project[bytearray][0] - scope-482
> |   |
> |   Cast[int] - scope-486
> |   |
> |   |---Project[bytearray][1] - scope-485
> |   |
> |   Cast[int] - scope-489
> |   |
> |   |---Project[bytearray][2] - scope-488
> |
> |---A: 
> Load(hdfs://localhost:48350/user/root/input:org.apache.pig.builtin.PigStorage)
>  - scope-481
> Spark node scope-540
> C: 
> Store(hdfs://localhost:48350/user/root/output:org.apache.pig.builtin.PigStorage)
>  - scope-502
> |
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-539
> Spark node scope-542
> D: 
> Store(hdfs://localhost:48350/user/root/output2:org.apache.pig.builtin.PigStorage)
>  - scope-533
> |
> |---D: FRJoin[tuple] - scope-525
> |   |
> |   Project[int][0] - scope-522
> |   |
> |   Project[int][0] - scope-523
> |   |
> |   Project[int][0] - scope-524
> |
> 
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-541
> Spark node scope-545
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp-2036144538:org.apache.pig.impl.io.InterStorage)
>  - scope-547
> |
> |---A1: New For Each(false,false,false)[bag] - scope-521
> |   |
> |   Cast[int] - scope-513
> |   |
> |   |---Project[bytearray][0] - scope-512
> |   |
> |   Cast[int] - scope-516
> |   |
> |   |---Project[bytearray][1] - scope-515
> |   |
> |   Cast[int] - scope-519
> |   |
> |   |---Project[bytearray][2] - scope-518
> |
> |---A1: 
> Load(hdfs://localhost:48350/user/root/input2:org.apache.pig.builtin.PigStorage)
>  - scope-511---
> {code}
> PhysicalOperator (LoadA) will be executed in 
> LoadConverter#ToTupleFunction#apply more times than it should, because this is 
> a multi-query case. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5085) Support FLATTEN of maps

2017-02-18 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873268#comment-15873268
 ] 

Adam Szita commented on PIG-5085:
-

Thanks Rohini for reviewing.

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch, PIG-5085.2.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (PIG-5085) Support FLATTEN of maps

2017-02-17 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871910#comment-15871910
 ] 

Adam Szita edited comment on PIG-5085 at 2/17/17 2:19 PM:
--

Comments addressed in [^PIG-5085.2.patch], thanks [~rohini]
Meanwhile I finished running unit tests, they've all passed.


was (Author: szita):
Comments addressed in [^PIG-5085.2.patch], thanks [~rohini]

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch, PIG-5085.2.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5085) Support FLATTEN of maps

2017-02-17 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871910#comment-15871910
 ] 

Adam Szita commented on PIG-5085:
-

Comments addressed in [^PIG-5085.2.patch], thanks [~rohini]

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch, PIG-5085.2.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5085) Support FLATTEN of maps

2017-02-17 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5085:

Attachment: PIG-5085.2.patch

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch, PIG-5085.2.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-4899) The number of records of input file is calculated wrongly in spark mode in multiquery case

2017-02-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4899:

Status: Patch Available  (was: Reopened)

>  The number of records of input file is calculated wrongly in spark mode in 
> multiquery case
> ---
>
> Key: PIG-4899
> URL: https://issues.apache.org/jira/browse/PIG-4899
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-4899.2.patch, PIG-4899.3IncrFrom2.patch, 
> PIG-4899.patch
>
>
> The sparkCounter that calculates the records of the input 
> file (LoadConverter#ToTupleFunction#apply) will be executed multiple times in 
> the multiquery case. This causes the number of input records to be calculated 
> wrongly. For example:
> {code}
> #--
> # Spark Plan  
> #--
> Spark node scope-534
> Split - scope-548
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-538
> |   |
> |   |---C: Filter[bag] - scope-495
> |   |   |
> |   |   Less Than or Equal[boolean] - scope-498
> |   |   |
> |   |   |---Project[int][1] - scope-496
> |   |   |
> |   |   |---Constant(5) - scope-497
> |   |
> |   
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp804709981:org.apache.pig.impl.io.InterStorage)
>  - scope-546
> |   |
> |   |---B: Filter[bag] - scope-507
> |   |   |
> |   |   Equal To[boolean] - scope-510
> |   |   |
> |   |   |---Project[int][0] - scope-508
> |   |   |
> |   |   |---Constant(3) - scope-509
> |
> |---A: New For Each(false,false,false)[bag] - scope-491
> |   |
> |   Cast[int] - scope-483
> |   |
> |   |---Project[bytearray][0] - scope-482
> |   |
> |   Cast[int] - scope-486
> |   |
> |   |---Project[bytearray][1] - scope-485
> |   |
> |   Cast[int] - scope-489
> |   |
> |   |---Project[bytearray][2] - scope-488
> |
> |---A: 
> Load(hdfs://localhost:48350/user/root/input:org.apache.pig.builtin.PigStorage)
>  - scope-481
> Spark node scope-540
> C: 
> Store(hdfs://localhost:48350/user/root/output:org.apache.pig.builtin.PigStorage)
>  - scope-502
> |
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-539
> Spark node scope-542
> D: 
> Store(hdfs://localhost:48350/user/root/output2:org.apache.pig.builtin.PigStorage)
>  - scope-533
> |
> |---D: FRJoin[tuple] - scope-525
> |   |
> |   Project[int][0] - scope-522
> |   |
> |   Project[int][0] - scope-523
> |   |
> |   Project[int][0] - scope-524
> |
> 
> |---Load(hdfs://localhost:48350/tmp/temp649016960/tmp48836938:org.apache.pig.impl.io.InterStorage)
>  - scope-541
> Spark node scope-545
> Store(hdfs://localhost:48350/tmp/temp649016960/tmp-2036144538:org.apache.pig.impl.io.InterStorage)
>  - scope-547
> |
> |---A1: New For Each(false,false,false)[bag] - scope-521
> |   |
> |   Cast[int] - scope-513
> |   |
> |   |---Project[bytearray][0] - scope-512
> |   |
> |   Cast[int] - scope-516
> |   |
> |   |---Project[bytearray][1] - scope-515
> |   |
> |   Cast[int] - scope-519
> |   |
> |   |---Project[bytearray][2] - scope-518
> |
> |---A1: 
> Load(hdfs://localhost:48350/user/root/input2:org.apache.pig.builtin.PigStorage)
>  - scope-511---
> {code}
> PhysicalOperator (LoadA) will be executed in 
> LoadConverter#ToTupleFunction#apply more times than it should because 
> this is a multi-query case. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5085:

Status: Patch Available  (was: In Progress)

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5085:

Fix Version/s: 0.17.0

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870133#comment-15870133
 ] 

Adam Szita commented on PIG-5085:
-

Attached [^PIG-5085.1.patch] to include documentation.

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5085:

Attachment: PIG-5085.1.patch

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-5085.0.patch, PIG-5085.1.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870021#comment-15870021
 ] 

Adam Szita commented on PIG-5085:
-

Attached [^PIG-5085.0.patch] with the feature; it basically breaks maps down into 
multiple tuples.
Example:
{code}
-----------------------------------------------
| A | id:chararray | m:map(:chararray)        | 
-----------------------------------------------
|   | 3            | {color=green, name=kiwi} | 
-----------------------------------------------
------------------------------------------------------------
| B | id:chararray | m::key:chararray | m::value:chararray | 
------------------------------------------------------------
|   | 3            | color            | green              | 
|   | 3            | name             | kiwi               | 
------------------------------------------------------------
{code}
I'm still waiting for the unit tests to finish and will attach another patch with 
the doc changes too.
[~rohini] can you please take a look to see if you agree with my approach?
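
For reference, here is a minimal Java sketch of the idea (not the actual PIG-5085 
implementation; the class and method names below are made up): the map is turned 
into a bag of (key, value) tuples, which FLATTEN then expands into the separate 
rows shown in B above.
{code}
// Hypothetical sketch only, using Pig's public Tuple/DataBag factories.
import java.util.Map;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class MapFlattenSketch {
    private static final TupleFactory TUPLES = TupleFactory.getInstance();
    private static final BagFactory BAGS = BagFactory.getInstance();

    // For {color=green, name=kiwi} this returns {(color,green), (name,kiwi)};
    // flattening that bag yields one (m::key, m::value) pair per input row.
    public static DataBag toKeyValueBag(Map<String, Object> map) throws Exception {
        DataBag bag = BAGS.newDefaultBag();
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            Tuple t = TUPLES.newTuple(2);
            t.set(0, entry.getKey());    // becomes m::key
            t.set(1, entry.getValue());  // becomes m::value
            bag.add(t);
        }
        return bag;
    }
}
{code}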

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-5085.0.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5085) Support FLATTEN of maps

2017-02-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5085:

Attachment: PIG-5085.0.patch

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-5085.0.patch
>
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (PIG-5085) Support FLATTEN of maps

2017-02-15 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on PIG-5085 started by Adam Szita.
---
> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5128) Fix TestPigRunner.simpleMultiQueryTest3 unit test failure

2017-02-15 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867441#comment-15867441
 ] 

Adam Szita commented on PIG-5128:
-

[~kellyzly] The NPE is a known issue affecting all yarn-mode Pig-on-Spark jobs at 
the moment, as stated in PIG-4899.
Can you please review that ticket and commit the patch uploaded there 
(PIG-4899.3IncrFrom2.patch)? I believe it will fix the concern above.

> Fix TestPigRunner.simpleMultiQueryTest3 unit test failure
> -
>
> Key: PIG-5128
> URL: https://issues.apache.org/jira/browse/PIG-5128
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: Nandor Kollar
> Fix For: spark-branch
>
> Attachments: PIG-5128.patch
>
>
> After PIG-4891 is committed, TestPigRunner.simpleMultiQueryTest3 
> fails(detailed see 
> https://builds.apache.org/job/Pig-spark/lastCompletedBuild/testReport/org.apache.pig.test/TestPigRunner/simpleMultiQueryTest3/).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5044) Create SparkCompiler#getSamplingJob in spark mode

2017-02-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859466#comment-15859466
 ] 

Adam Szita commented on PIG-5044:
-

[~kellyzly] RB reports the latest diff to be PIG-5044_3.patch; I can't see the 
new (_4) diff uploaded there.
Can you please post it there as well?

> Create SparkCompiler#getSamplingJob in spark mode
> -
>
> Key: PIG-5044
> URL: https://issues.apache.org/jira/browse/PIG-5044
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5044_2.patch, PIG-5044_3.patch, PIG-5044_4.patch
>
>
> Like MRCompiler#getSamplingJob, we also need a function like that to sample 
> data from a file, sort sampling data  and generate output by 
> UDF(org.apache.pig.impl.builtin.FindQuantiles).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-5044) Create SparkCompiler#getSamplingJob in spark mode

2017-02-09 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5044:

Summary: Create SparkCompiler#getSamplingJob in spark mode  (was: Create 
SparlCompiler#getSamplingJob in spark mode)

> Create SparkCompiler#getSamplingJob in spark mode
> -
>
> Key: PIG-5044
> URL: https://issues.apache.org/jira/browse/PIG-5044
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: liyunzhang_intel
>Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-5044_2.patch, PIG-5044_3.patch, PIG-5044_4.patch
>
>
> Like MRCompiler#getSamplingJob, we also need a function like that to sample 
> data from a file, sort sampling data  and generate output by 
> UDF(org.apache.pig.impl.builtin.FindQuantiles).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-5127) Test fail when running test-core-mrtez

2017-02-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859229#comment-15859229
 ] 

Adam Szita commented on PIG-5127:
-

+1, looks good, test passes

> Test fail when running test-core-mrtez
> --
>
> Key: PIG-5127
> URL: https://issues.apache.org/jira/browse/PIG-5127
> Project: Pig
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.17.0
>
> Attachments: PIG-5127-1.patch
>
>
> For example, the following command fail:
> ant -Dtestcase=TestPredeployedJar test-core-mrtez
> The reason is mr test left hadoop-site.xml and interfere with tez test. 
> MiniMRCluster and MiniTezCluster use a different set of config files 
> (hadoop-site.xml vs core-site.xml+hdfs-site.xml) and will only clear it's own 
> config file when starting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-4748) DateTimeWritable forgets Chronology

2017-02-08 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858161#comment-15858161
 ] 

Adam Szita commented on PIG-4748:
-

Would [^PIG-4748.2.patch] be something you had in mind [~rohini], [~daijy]?

> DateTimeWritable forgets Chronology
> ---
>
> Key: PIG-4748
> URL: https://issues.apache.org/jira/browse/PIG-4748
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Martin Junghanns
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4748.2.patch, PIG-4748.patch
>
>
> The following test fails:
> {code}
> @Test
> public void foo() throws IOException {
> DateTime nowIn = DateTime.now();
> DateTimeWritable in = new DateTimeWritable(nowIn);
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> DataOutputStream dataOut = new DataOutputStream(outputStream);
> in.write(dataOut);
> dataOut.flush();
> // read from byte[]
> DateTimeWritable out = new DateTimeWritable();
> ByteArrayInputStream inputStream = new ByteArrayInputStream(
>   outputStream.toByteArray());
> DataInputStream dataIn = new DataInputStream(inputStream);
> out.readFields(dataIn);
> assertEquals(in.get(), out.get());
> }
> {code}
> In equals(), the original instance has
> {code}
> ISOChronology[Europe/Berlin]
> {code}
> while the deserialized instance has
> {code}
> ISOChronology[+01:00]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-4913) Reduce jython function initiation during compilation

2017-02-07 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4913:

Status: Patch Available  (was: In Progress)

> Reduce jython function initiation during compilation
> 
>
> Key: PIG-4913
> URL: https://issues.apache.org/jira/browse/PIG-4913
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-4913.2.patch, PIG-4913.patch
>
>
> While investigating PIG-4908, saw that ScriptEngine.getScriptAsStream was 
> invoked way too many times during compilation phase for a simple script.
> {code:title=sleep.py}
> #!/usr/bin/python
> import time;
> @outputSchema("sltime:int")
> def sleep(num):
> if num == 1:
> print "Sleeping for %d minutes" % num;
> time.sleep(num * 60);
> return num;
> {code}
> {code:title=sleep.pig}
> register 'sleep.py' using jython;
> A = LOAD '/tmp/sleepdata' as (f1:int);
> B = FOREACH A generate $0, sleep($0);
> STORE B into '/tmp/tezout';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-4913) Reduce jython function initiation during compilation

2017-02-07 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855973#comment-15855973
 ] 

Adam Szita commented on PIG-4913:
-

I see, [~rohini].
It looks like the frequent recompilation is caused by the getFunction method, 
which always calls the init method. This is necessary if we are using multiple 
Python scripts: one PythonInterpreter instance seems to be bound to one script 
(from which we can retrieve the Python locals, etc.).

I've attached a new patch [^PIG-4913.2.patch] to address this.
My approach is to keep a pool of these interpreters in memory for future use so 
Pig doesn't have to recompile each time. If the pool is full we remove the 
oldest instance and recompile that Python script the next time it is needed.
The pool size can be set via
{code}
static final int INTERPRETER_POOL_SIZE = 10;
{code}
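
For illustration, a minimal Java sketch of the pooling idea, assuming Jython's 
PythonInterpreter API - this is not the code in the patch, just the eviction 
behaviour described above:
{code}
// Hypothetical sketch of the interpreter pool (not the PIG-4913 patch itself).
import java.util.LinkedHashMap;
import java.util.Map;
import org.python.util.PythonInterpreter;

public class InterpreterPoolSketch {
    static final int INTERPRETER_POOL_SIZE = 10;

    // Insertion-ordered map; the eldest entry is dropped once the pool is full,
    // so that script gets recompiled the next time it is requested.
    private static final Map<String, PythonInterpreter> POOL =
        new LinkedHashMap<String, PythonInterpreter>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, PythonInterpreter> eldest) {
                return size() > INTERPRETER_POOL_SIZE;
            }
        };

    public static synchronized PythonInterpreter getInterpreter(String scriptPath) {
        PythonInterpreter interpreter = POOL.get(scriptPath);
        if (interpreter == null) {
            interpreter = new PythonInterpreter();
            interpreter.execfile(scriptPath);  // compile and run the script once
            POOL.put(scriptPath, interpreter);
        }
        return interpreter;
    }
}
{code}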


> Reduce jython function initiation during compilation
> 
>
> Key: PIG-4913
> URL: https://issues.apache.org/jira/browse/PIG-4913
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-4913.2.patch, PIG-4913.patch
>
>
> While investigating PIG-4908, saw that ScriptEngine.getScriptAsStream was 
> invoked way too many times during compilation phase for a simple script.
> {code:title=sleep.py}
> #!/usr/bin/python
> import time;
> @outputSchema("sltime:int")
> def sleep(num):
> if num == 1:
> print "Sleeping for %d minutes" % num;
> time.sleep(num * 60);
> return num;
> {code}
> {code:title=sleep.pig}
> register 'sleep.py' using jython;
> A = LOAD '/tmp/sleepdata' as (f1:int);
> B = FOREACH A generate $0, sleep($0);
> STORE B into '/tmp/tezout';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (PIG-4748) DateTimeWritable forgets Chronology

2017-02-02 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4748:

Attachment: PIG-4748.2.patch

> DateTimeWritable forgets Chronology
> ---
>
> Key: PIG-4748
> URL: https://issues.apache.org/jira/browse/PIG-4748
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Martin Junghanns
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4748.2.patch, PIG-4748.patch
>
>
> The following test fails:
> {code}
> @Test
> public void foo() throws IOException {
> DateTime nowIn = DateTime.now();
> DateTimeWritable in = new DateTimeWritable(nowIn);
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> DataOutputStream dataOut = new DataOutputStream(outputStream);
> in.write(dataOut);
> dataOut.flush();
> // read from byte[]
> DateTimeWritable out = new DateTimeWritable();
> ByteArrayInputStream inputStream = new ByteArrayInputStream(
>   outputStream.toByteArray());
> DataInputStream dataIn = new DataInputStream(inputStream);
> out.readFields(dataIn);
> assertEquals(in.get(), out.get());
> }
> {code}
> In equals(), the original instance has
> {code}
> ISOChronology[Europe/Berlin]
> {code}
> while the deserialized instance has
> {code}
> ISOChronology[+01:00]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (PIG-4748) DateTimeWritable forgets Chronology

2017-02-02 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850147#comment-15850147
 ] 

Adam Szita commented on PIG-4748:
-

I updated the patch for this, see [^PIG-4748.2.patch].
DateTimeWritable now writes (long, int, int), corresponding to the DateTime 
instant (millis), the offset in millis, and the position in the zone ID list, 
respectively. The zone list itself is carried along to the backends via the 
JobConf. For some inputs we can only rely on the offset in millis (e.g. +01:00); 
for others we rely on the zone ID (e.g. America/Los_Angeles).
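
To make the layout concrete, here is a hedged Java sketch of the (long, int, int) 
serialization described above - not the exact patch code; the zone ID list is 
assumed to have been shipped to the backend already (e.g. via the JobConf):
{code}
// Hypothetical sketch of the serialization layout described above.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.List;
import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

public class DateTimeLayoutSketch {
    public static void write(DataOutput out, DateTime dt, List<String> zoneIds) throws IOException {
        out.writeLong(dt.getMillis());                         // instant in millis
        out.writeInt(dt.getZone().getOffset(dt.getMillis()));  // offsetInMillis
        out.writeInt(zoneIds.indexOf(dt.getZone().getID()));   // -1 when only the offset is known
    }

    public static DateTime read(DataInput in, List<String> zoneIds) throws IOException {
        long millis = in.readLong();
        int offsetMillis = in.readInt();
        int zoneIndex = in.readInt();
        DateTimeZone zone = (zoneIndex >= 0)
                ? DateTimeZone.forID(zoneIds.get(zoneIndex))    // e.g. America/Los_Angeles
                : DateTimeZone.forOffsetMillis(offsetMillis);   // e.g. +01:00
        return new DateTime(millis, zone);
    }
}
{code}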

> DateTimeWritable forgets Chronology
> ---
>
> Key: PIG-4748
> URL: https://issues.apache.org/jira/browse/PIG-4748
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Martin Junghanns
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4748.2.patch, PIG-4748.patch
>
>
> The following test fails:
> {code}
> @Test
> public void foo() throws IOException {
> DateTime nowIn = DateTime.now();
> DateTimeWritable in = new DateTimeWritable(nowIn);
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> DataOutputStream dataOut = new DataOutputStream(outputStream);
> in.write(dataOut);
> dataOut.flush();
> // read from byte[]
> DateTimeWritable out = new DateTimeWritable();
> ByteArrayInputStream inputStream = new ByteArrayInputStream(
>   outputStream.toByteArray());
> DataInputStream dataIn = new DataInputStream(inputStream);
> out.readFields(dataIn);
> assertEquals(in.get(), out.get());
> }
> {code}
> In equals(), the original instance has
> {code}
> ISOChronology[Europe/Berlin]
> {code}
> while the deserialized instance has
> {code}
> ISOChronology[+01:00]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] (PIG-4748) DateTimeWritable forgets Chronology

2017-01-31 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847362#comment-15847362
 ] 

Adam Szita commented on PIG-4748:
-

[~daijy], what's your opinion?

> DateTimeWritable forgets Chronology
> ---
>
> Key: PIG-4748
> URL: https://issues.apache.org/jira/browse/PIG-4748
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Martin Junghanns
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4748.patch
>
>
> The following test fails:
> {code}
> @Test
> public void foo() throws IOException {
> DateTime nowIn = DateTime.now();
> DateTimeWritable in = new DateTimeWritable(nowIn);
> ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
> DataOutputStream dataOut = new DataOutputStream(outputStream);
> in.write(dataOut);
> dataOut.flush();
> // read from byte[]
> DateTimeWritable out = new DateTimeWritable();
> ByteArrayInputStream inputStream = new ByteArrayInputStream(
>   outputStream.toByteArray());
> DataInputStream dataIn = new DataInputStream(inputStream);
> out.readFields(dataIn);
> assertEquals(in.get(), out.get());
> }
> {code}
> In equals(), the original instance has
> {code}
> ISOChronology[Europe/Berlin]
> {code}
> while the deserialized instance has
> {code}
> ISOChronology[+01:00]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Work started] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-01-26 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on PIG-5110 started by Adam Szita.
---
> Removing schema alias and :: coming from parent relation
> 
>
> Key: PIG-5110
> URL: https://issues.apache.org/jira/browse/PIG-5110
> Project: Pig
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
>
> Customers have asked for a feature to get rid of the schema alias prefixes. 
> CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field 
> alias and ::
> I would like to find a way to disable this feature. (The burden of making 
> sure not to have duplicate aliases - and hence the appropriate 
> FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-01-26 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839997#comment-15839997
 ] 

Adam Szita commented on PIG-5110:
-

Thanks for the tip [~rohini], I'll attach a basic implementation shortly.

> Removing schema alias and :: coming from parent relation
> 
>
> Key: PIG-5110
> URL: https://issues.apache.org/jira/browse/PIG-5110
> Project: Pig
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
>
> Customers have asked for a feature to get rid of the schema alias prefixes. 
> CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field 
> alias and ::
> I would like to find a way to disable this feature. (The burden of making 
> sure not to have duplicate aliases - and hence the appropriate 
> FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-5114) Getting error 1006-unable to iterate alias for r

2017-01-26 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839834#comment-15839834
 ] 

Adam Szita commented on PIG-5114:
-

Can you share some more details please?
A stack trace of the error and the Pig script would both come in handy.

> Getting error 1006-unable to iterate alias for r
> 
>
> Key: PIG-5114
> URL: https://issues.apache.org/jira/browse/PIG-5114
> Project: Pig
>  Issue Type: Bug
> Environment: OS  - Ubuntu 16.04
> 2 Virtual machines with OS Ubuntu-16.04 having 
> Hadoop 2.5.1 installed as master and slave.
> HBase 1.1.4 installed in distributed mode.
> Pig-15 is installed in master virtual machine.
>Reporter: Sandip Samaddar
>
> I am using 2 virtual machines where one is the hadoop master and the other is 
> the hadoop slave. I have installed HBase 1.1.4 in distributed mode, and pig 15 
> is installed on the master. Now I open pig in mapreduce mode, load a txt file 
> from hdfs and then dump it, and I get the error "unable to iterate alias". 
> But in local mode dump is working fine.
> It is also worth mentioning that I ran 
> ant clean tar -Dhadoopversion=23 -Dhbase95.version=1.1.2 
> -Dforrest.home=/home/hduser/forrest/apache-forrest-0.9 
> The build was successful, but I am still getting the error. Kindly help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-20 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831663#comment-15831663
 ] 

Adam Szita commented on PIG-4923:
-

I see, [~daijy]. I believe [^PIG-4923.mvnDeployFix.patch] should fix this, 
although I don't have permission for the Apache mvn repo, so let me know in case 
of any issues.

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: ant-contrib-1.0b3.jar, PIG-4923.1.patch, 
> PIG-4923.2.IncrementalHadoop3.patch, PIG-4923.2.patch, PIG-4923.3.patch, 
> PIG-4923.4.patch, PIG-4923.5.patch, PIG-4923.mvnDeployFix.patch
>
>
> To facilitate the future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce the code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-20 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4923:

Attachment: PIG-4923.mvnDeployFix.patch

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: ant-contrib-1.0b3.jar, PIG-4923.1.patch, 
> PIG-4923.2.IncrementalHadoop3.patch, PIG-4923.2.patch, PIG-4923.3.patch, 
> PIG-4923.4.patch, PIG-4923.5.patch, PIG-4923.mvnDeployFix.patch
>
>
> To facilitate the future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce the code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-01-18 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828231#comment-15828231
 ] 

Adam Szita commented on PIG-5110:
-

Obviously this can be worked around with FOREACH operators and custom alias 
names for small relations, but for relations with a lot of fields, or where we 
don't know the field names, this is not an option.

Possible solutions that come to mind are:
-Introducing a Pig property to disable the feature, and hooking the 
configuration into the relevant operators so that they know whether prepending 
with :: is required or not
  e.g. in 
https://github.com/apache/pig/blob/trunk/src/org/apache/pig/newplan/logical/relational/LOJoin.java#L150
 we can enhance the condition according to what the new property is set to (a 
rough sketch follows at the end of this comment).
-Expanding the query language by introducing a new operator just for this 
purpose - this could be used just before STORE or DUMP operators to fix the 
alias names.

[~rohini] what's your opinion?
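
A rough Java sketch of the first option, just to show its shape - the property 
name below is made up for illustration and is not an existing Pig setting:
{code}
// Hypothetical sketch: gate the "parent::" prefixing on a (made-up) property.
import java.util.Properties;

public class AliasPrefixSketch {
    public static String outputAlias(Properties props, String parentAlias, String fieldAlias) {
        boolean keepPrefix = Boolean.parseBoolean(
                props.getProperty("pig.schema.disambiguate", "true"));  // hypothetical property name
        return keepPrefix ? parentAlias + "::" + fieldAlias : fieldAlias;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("pig.schema.disambiguate", "false");
        System.out.println(outputAlias(props, "a", "name"));  // prints "name" instead of "a::name"
    }
}
{code}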

> Removing schema alias and :: coming from parent relation
> 
>
> Key: PIG-5110
> URL: https://issues.apache.org/jira/browse/PIG-5110
> Project: Pig
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
>
> Customers have asked for a feature to get rid of the schema alias prefixes. 
> CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field 
> alias and ::
> I would like to find a way to disable this feature. (The burden of making 
> sure not to have duplicate aliases - and hence the appropriate 
> FrontendException getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-5110) Removing schema alias and :: coming from parent relation

2017-01-18 Thread Adam Szita (JIRA)
Adam Szita created PIG-5110:
---

 Summary: Removing schema alias and :: coming from parent relation
 Key: PIG-5110
 URL: https://issues.apache.org/jira/browse/PIG-5110
 Project: Pig
  Issue Type: New Feature
Reporter: Adam Szita
Assignee: Adam Szita


Customers have asked for a feature to get rid of the schema alias prefixes. 
CROSS, JOIN, FLATTEN, etc.. prepend the field name with the parent field alias 
and ::
I would like to find a way to disable this feature. (The burden of making sure 
not to have duplicate aliases - and hence the appropriate FrontendException 
getting thrown - is on the user)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4913) Reduce jython function initiation during compilation

2017-01-18 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827971#comment-15827971
 ] 

Adam Szita commented on PIG-4913:
-

[~rohini] can you share your thoughts please?

> Reduce jython function initiation during compilation
> 
>
> Key: PIG-4913
> URL: https://issues.apache.org/jira/browse/PIG-4913
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
> Attachments: PIG-4913.patch
>
>
> While investigating PIG-4908, saw that ScriptEngine.getScriptAsStream was 
> invoked way too many times during compilation phase for a simple script.
> {code:title=sleep.py}
> #!/usr/bin/python
> import time;
> @outputSchema("sltime:int")
> def sleep(num):
> if num == 1:
> print "Sleeping for %d minutes" % num;
> time.sleep(num * 60);
> return num;
> {code}
> {code:title=sleep.pig}
> register 'sleep.py' using jython;
> A = LOAD '/tmp/sleepdata' as (f1:int);
> B = FOREACH A generate $0, sleep($0);
> STORE B into '/tmp/tezout';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-5109) Remove HadoopJobHistoryLoader

2017-01-18 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827835#comment-15827835
 ] 

Adam Szita commented on PIG-5109:
-

SVN commands to run:
{code}
svn rm 
contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/HadoopJobHistoryLoader.java
svn rm 
contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/TestHadoopJobHistoryLoader.java
 
{code}

> Remove HadoopJobHistoryLoader
> -
>
> Key: PIG-5109
> URL: https://issues.apache.org/jira/browse/PIG-5109
> Project: Pig
>  Issue Type: Sub-task
>  Components: piggybank
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5109.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-5109) Remove HadoopJobHistoryLoader

2017-01-18 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5109:

Status: Patch Available  (was: Open)

> Remove HadoopJobHistoryLoader
> -
>
> Key: PIG-5109
> URL: https://issues.apache.org/jira/browse/PIG-5109
> Project: Pig
>  Issue Type: Sub-task
>  Components: piggybank
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5109.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-18 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827827#comment-15827827
 ] 

Adam Szita commented on PIG-4923:
-

[~daijy], I've created a subtask (PIG-5109) for the removal of 
HadoopJobHistoryLoader - I think it will be easier for everyone in the future 
to refer to a separate ticket on the matter.
As for the h2 classifier, I don't think it is used in mvn-install after this patch. 
If I try:
{code}
ant clean mvn-install
{code}
{code}
...
mvn-install:
[artifact:install] [INFO] Installing 
/Users/szita/shadow/CDH/pig/build/pig-0.17.0-SNAPSHOT-h2.jar to 
/Users/szita/.m2/repository/org/apache/pig/pig/0.17.0-SNAPSHOT/pig-0.17.0-SNAPSHOT.jar
[artifact:install] [INFO] Installing 
/Users/szita/shadow/CDH/pig/build/pig-0.17.0-SNAPSHOT-sources.jar to 
/Users/szita/.m2/repository/org/apache/pig/pig/0.17.0-SNAPSHOT/pig-0.17.0-SNAPSHOT-sources.jar
[artifact:install] [INFO] Installing 
/Users/szita/shadow/CDH/pig/build/pig-0.17.0-SNAPSHOT-javadoc.jar to 
/Users/szita/.m2/repository/org/apache/pig/pig/0.17.0-SNAPSHOT/pig-0.17.0-SNAPSHOT-javadoc.jar
[artifact:install] [INFO] Installing /Users/szita/shadow/CDH/pig/pigunit.jar to 
/Users/szita/.m2/repository/org/apache/pig/pigunit/0.17.0-SNAPSHOT/pigunit-0.17.0-SNAPSHOT.jar
[artifact:install] [INFO] Installing 
/Users/szita/shadow/CDH/pig/build/pig-0.17.0-SNAPSHOT-smoketests.jar to 
/Users/szita/.m2/repository/org/apache/pig/pigsmoke/0.17.0-SNAPSHOT/pigsmoke-0.17.0-SNAPSHOT.jar
[artifact:install] [INFO] Installing 
/Users/szita/shadow/CDH/pig/contrib/piggybank/java/piggybank.jar to 
/Users/szita/.m2/repository/org/apache/pig/piggybank/0.17.0-SNAPSHOT/piggybank-0.17.0-SNAPSHOT.jar

BUILD SUCCESSFUL
Total time: 46 seconds
{code}

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: ant-contrib-1.0b3.jar, PIG-4923.1.patch, 
> PIG-4923.2.IncrementalHadoop3.patch, PIG-4923.2.patch, PIG-4923.3.patch, 
> PIG-4923.4.patch, PIG-4923.5.patch
>
>
> To facilitate the future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce the code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-5109) Remove HadoopJobHistoryLoader

2017-01-18 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5109:

Attachment: PIG-5109.0.patch

> Remove HadoopJobHistoryLoader
> -
>
> Key: PIG-5109
> URL: https://issues.apache.org/jira/browse/PIG-5109
> Project: Pig
>  Issue Type: Sub-task
>  Components: piggybank
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-5109.0.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-5109) Remove HadoopJobHistoryLoader

2017-01-18 Thread Adam Szita (JIRA)
Adam Szita created PIG-5109:
---

 Summary: Remove HadoopJobHistoryLoader
 Key: PIG-5109
 URL: https://issues.apache.org/jira/browse/PIG-5109
 Project: Pig
  Issue Type: Sub-task
  Components: piggybank
Reporter: Adam Szita
Assignee: Adam Szita






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PIG-5085) Support FLATTEN of maps

2017-01-16 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-5085:
---

Assignee: Adam Szita

> Support FLATTEN of maps
> ---
>
> Key: PIG-5085
> URL: https://issues.apache.org/jira/browse/PIG-5085
> Project: Pig
>  Issue Type: New Feature
>Reporter: Rohini Palaniswamy
>Assignee: Adam Szita
>
>   I have come across users asking for this quite a few times. Don't see why 
> we should not support it with FLATTEN instead of users having to write a UDF 
> for that



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-13 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822411#comment-15822411
 ] 

Adam Szita commented on PIG-4728:
-

Thanks for the review

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch, PIG-4728-3.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (PIG-4631) UT TestHBaseStorage failed with HBase 1.1.0+

2017-01-13 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita resolved PIG-4631.
-
Resolution: Fixed

> UT TestHBaseStorage failed with HBase 1.1.0+
> 
>
> Key: PIG-4631
> URL: https://issues.apache.org/jira/browse/PIG-4631
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.16.0
>Reporter: Xiang Li
>Priority: Minor
> Fix For: 0.17.0
>
>
> Pig UT TestHBaseStorage failed with : java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/procedure2/Procedure when HBase is 1.1.0+
> Need to add hadoop-procedure into the dependency list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4631) UT TestHBaseStorage failed with HBase 1.1.0+

2017-01-13 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4631:

Fix Version/s: 0.17.0

> UT TestHBaseStorage failed with HBase 1.1.0+
> 
>
> Key: PIG-4631
> URL: https://issues.apache.org/jira/browse/PIG-4631
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.16.0
>Reporter: Xiang Li
>Priority: Minor
> Fix For: 0.17.0
>
>
> Pig UT TestHBaseStorage failed with : java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/procedure2/Procedure when HBase is 1.1.0+
> Need to add hadoop-procedure into the dependency list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4631) UT TestHBaseStorage failed with HBase 1.1.0+

2017-01-13 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4631:

Affects Version/s: (was: 0.17.0)
   0.16.0

> UT TestHBaseStorage failed with HBase 1.1.0+
> 
>
> Key: PIG-4631
> URL: https://issues.apache.org/jira/browse/PIG-4631
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.16.0
>Reporter: Xiang Li
>Priority: Minor
> Fix For: 0.17.0
>
>
> Pig UT TestHBaseStorage failed with : java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/procedure2/Procedure when HBase is 1.1.0+
> Need to add hadoop-procedure into the dependency list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4631) UT TestHBaseStorage failed with HBase 1.1.0+

2017-01-13 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4631:

Affects Version/s: (was: 0.15.0)
   0.17.0

> UT TestHBaseStorage failed with HBase 1.1.0+
> 
>
> Key: PIG-4631
> URL: https://issues.apache.org/jira/browse/PIG-4631
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.16.0
>Reporter: Xiang Li
>Priority: Minor
> Fix For: 0.17.0
>
>
> Pig UT TestHBaseStorage failed with : java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/procedure2/Procedure when HBase is 1.1.0+
> Need to add hadoop-procedure into the dependency list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4631) UT TestHBaseStorage failed with HBase 1.1.0+

2017-01-13 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822407#comment-15822407
 ] 

Adam Szita commented on PIG-4631:
-

This is now fixed by PIG-4728; I'm closing this ticket.

> UT TestHBaseStorage failed with HBase 1.1.0+
> 
>
> Key: PIG-4631
> URL: https://issues.apache.org/jira/browse/PIG-4631
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.15.0
>Reporter: Xiang Li
>Priority: Minor
>
> Pig UT TestHBaseStorage failed with : java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/procedure2/Procedure when HBase is 1.1.0+
> Need to add hadoop-procedure into the dependency list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-12 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4728:

Attachment: PIG-4728-3.patch

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch, PIG-4728-3.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-12 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821205#comment-15821205
 ] 

Adam Szita commented on PIG-4728:
-

[~rohini], I agree on pulling hbase-procedure in for testing only. New patch 
[^PIG-4728-3.patch] is attached with this change; I also fixed the 
copyDependencies targets in build.xml so that the proper hbase jars and other 
required dependencies are copied to the lib dir.

[~yuzhih...@gmail.com], I've run the following simple Pig-HBase scenario on a 
cluster (with Pig carrying [^PIG-4728-3.patch]):
create an HBase table, write to it with Pig, check the content from the HBase 
shell and finally load the data back into Pig:

{code}
hbase(main):000:0>create 'ccys', 'ccy', 'country', 'rate'
{code}
{code}
grunt> cat ccymin.csv
52000   EUR HUN 100.999
0   GBP HUN 100.3713375
1   NOK HUN 102.9676184
2   NOK HUN 104.6205186
3   NOK HUN 102.4529758
4   HUF HUN 104.8512737
5   JPY HUN 101.3875097
6   USD HUN 101.5471822
7   DKK USA 102.922554
8   EUR HUN 103.3619292

grunt> A = LOAD 'ccymin.csv' using PigStorage() as (id: long, ccy: chararray, 
country: chararray, rate: float);
grunt> B = FOREACH A GENERATE ccy, country, rate;
grunt> STORE B INTO 'hbase://ccys' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('ccy:ccy country:country 
rate:rate');
{code}
{code}
hbase(main):001:0> t = get_table 'ccys'
hbase(main):002:0> t.get 'HUF'
COLUMN             CELL
 ccy:ccy           timestamp=1484130902361, value=HUN
 country:country   timestamp=1484130902361, value=104.85127

2 row(s) in 0.0670 seconds
{code}
{code}
grunt> C = LOAD 'hbase://ccys' USING 
org.apache.pig.backend.hadoop.hbase.HBaseStorage('ccy:* country:*', '-loadKey 
true');
(DKK,[ccy#USA],[country#102.922554])
(EUR,[ccy#HUN],[country#103.36193])
(GBP,[ccy#HUN],[country#100.37134])
(HUF,[ccy#HUN],[country#104.85127])
(JPY,[ccy#HUN],[country#101.38751])
(NOK,[ccy#HUN],[country#102.45297])
(USD,[ccy#HUN],[country#101.54718])
{code}

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch, PIG-4728-3.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-5087) e2e Native3 failing after PIG-4923 (dropping hadoop 1.x)

2017-01-12 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15820686#comment-15820686
 ] 

Adam Szita commented on PIG-5087:
-

Interestingly this doesn't fail on my cluster when running the 'Native' suite.
Also, org.apache.hadoop.streaming.PipeMapRunner seems to be present in both 
hadoop-streaming.jar and hadoop-0.23.0-streaming.jar, so it's strange that a 
ClassNotFoundException is thrown.

Anyway, +1 (non-binding) for removing the 23 config on the e2e side, thanks.

> e2e Native3 failing after PIG-4923 (dropping hadoop 1.x)
> 
>
> Key: PIG-5087
> URL: https://issues.apache.org/jira/browse/PIG-5087
> Project: Pig
>  Issue Type: Test
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Trivial
> Attachments: pig-5087-v01.patch
>
>
> After hadoop 1.x was dropped in PIG-4923, e2e Native 3 test started failing 
> with 
> {noformat}
> 2017-01-10 22:23:38,070 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.streaming.PipeMapRunner not found
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2248)
>   at org.apache.hadoop.mapred.JobConf.getMapRunnerClass(JobConf.java:1127)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1850)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> Class org.apache.hadoop.streaming.PipeMapRunner not found
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2216)
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2240)
>   ... 8 more
> Caused by: java.lang.ClassNotFoundException: Class 
> org.apache.hadoop.streaming.PipeMapRunner not found
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2122)
>   at 
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2214)
>   ... 9 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-10 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15815107#comment-15815107
 ] 

Adam Szita commented on PIG-4728:
-

PIG-4923 was recently resolved - Hadoop 1 support is now officially removed and 
Pig is now built with Hadoop 2.7.3 by default.
I think we should move ahead with this ticket and upgrade the HBase version.
Looking at the compatibility matrix at http://hbase.apache.org/book.html#hadoop, 
HBase 1.2.4 seems to be the most recent stable release that also matches 
Hadoop 2.7.3.
Let me know if there are any objections to this version.

I uploaded the change in [^PIG-4728-2.patch].
I got all-green TestHBaseStorage*.java test runs for both MR and TEZ modes. 
[~daijy], is there any other tool for verification?

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-10 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4728:

Attachment: PIG-4728-2.patch

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-10 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4728:

Attachment: (was: PIG-4728-2.patch)

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In hbase 1.x releases, hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-10 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-4728:

Attachment: PIG-4728-2.patch

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In HBase 1.x releases, the hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (PIG-4728) Compilation against hbase 1.x fails with hbase-hadoop1-compat not found

2017-01-10 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita reassigned PIG-4728:
---

Assignee: Adam Szita

> Compilation against hbase 1.x fails with hbase-hadoop1-compat not found
> ---
>
> Key: PIG-4728
> URL: https://issues.apache.org/jira/browse/PIG-4728
> Project: Pig
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4728-1.patch, PIG-4728-2.patch
>
>
> With the following change:
> {code}
> diff --git a/ivy/libraries.properties b/ivy/libraries.properties
> index c40befd..41ce9fb 100644
> --- a/ivy/libraries.properties
> +++ b/ivy/libraries.properties
> @@ -46,7 +46,7 @@ hadoop-common.version=2.6.0
>  hadoop-hdfs.version=2.6.0
>  hadoop-mapreduce.version=2.6.0
>  hbase94.version=0.94.1
> -hbase95.version=0.98.12-${hbase.hadoop.version}
> +hbase95.version=1.1.2
>  hsqldb.version=1.8.0.10
>  hive.version=1.2.1
>  httpcomponents.version=4.1
> {code}
> I ran 'ant compile'.
> However, compilation failed with:
> {code}
> [ivy:resolve] ::
> [ivy:resolve] ::  UNRESOLVED DEPENDENCIES ::
> [ivy:resolve] ::
> [ivy:resolve] :: org.apache.hbase#hbase-hadoop1-compat;1.1.2: 
> not found
> [ivy:resolve] ::
> {code}
> In HBase 1.x releases, the hbase-hadoop1-compat module doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812474#comment-15812474
 ] 

Adam Szita commented on PIG-4923:
-

Looks good, thanks for the fix and all the guidance and review too, [~rohini]

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4923.1.patch, PIG-4923.2.IncrementalHadoop3.patch, 
> PIG-4923.2.patch, PIG-4923.3.patch, PIG-4923.4.patch, PIG-4923.5.patch, 
> ant-contrib-1.0b3.jar
>
>
> To facilitate future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15811350#comment-15811350
 ] 

Adam Szita commented on PIG-4923:
-

Thanks a lot [~rohini]!

The code in SVN now looks good - although we need to do a couple of things:
- Add a .gitignore in shims/test/hadoop2 so that the dir structure is present in 
git as well (otherwise {{git clone; ant clean pigunit-jar}} will fail)
- Remove the unnecessary old (and empty) dirs from shims (e.g.: 
http://svn.apache.org/repos/asf/pig/trunk/shims/test/hadoop20 )

{code}
#Add empty .gitignore so that git keeps dir structure
touch shims/test/hadoop2/.gitignore
svn add shims/test/hadoop2/.gitignore

#Delete empty dir structures
svn rm shims/src/hadoop20
svn rm shims/src/hadoop23
svn rm shims/test/hadoop20
svn rm shims/test/hadoop23
{code} 

I guess all of this could go with this ticket as well?
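
For the git side of the same cleanup, a rough equivalent of the failure described above (sketch only - paths are taken from the comment, and the clone URL is just the GitHub mirror used for illustration):

{code}
# git does not track empty directories, so without a placeholder file
# shims/test/hadoop2 is simply missing after a fresh clone.
git clone https://github.com/apache/pig.git
cd pig
ls shims/test/hadoop2     # fails unless a .gitignore (or any file) keeps the dir
ant clean pigunit-jar     # the build step the comment above expects to break
{code}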

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4923.1.patch, PIG-4923.2.IncrementalHadoop3.patch, 
> PIG-4923.2.patch, PIG-4923.3.patch, PIG-4923.4.patch, PIG-4923.5.patch, 
> ant-contrib-1.0b3.jar
>
>
> To facilitate future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4923) Drop Hadoop 1.x support in Pig 0.17

2017-01-06 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15806070#comment-15806070
 ] 

Adam Szita commented on PIG-4923:
-

[~rohini], can you please let me know what you think?

> Drop Hadoop 1.x support in Pig 0.17
> ---
>
> Key: PIG-4923
> URL: https://issues.apache.org/jira/browse/PIG-4923
> Project: Pig
>  Issue Type: Improvement
>Reporter: Daniel Dai
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-4923.1.patch, PIG-4923.2.IncrementalHadoop3.patch, 
> PIG-4923.2.patch, PIG-4923.3.patch, PIG-4923.4.patch, PIG-4923.5.patch, 
> ant-contrib-1.0b3.jar
>
>
> To facilitate future development, we want to get rid of the legacy Hadoop 
> 1.x support and reduce code complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-1804) Alow Jython function to implement Algebraic and/or Accumulator interfaces

2017-01-04 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-1804:

Fix Version/s: 0.17.0

> Alow Jython function to implement Algebraic and/or Accumulator interfaces
> -
>
> Key: PIG-1804
> URL: https://issues.apache.org/jira/browse/PIG-1804
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Julien Le Dem
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-1804.0.patch
>
>
> Currently Python UDFs can only be simple functions. For performance 
> improvements in advanced use cases it should be possible to implement the 
> Algebraic and/or Accumulator interfaces.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-1804) Alow Jython function to implement Algebraic and/or Accumulator interfaces

2017-01-04 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-1804:

Affects Version/s: (was: 0.9.0)
   Status: Patch Available  (was: Open)

> Alow Jython function to implement Algebraic and/or Accumulator interfaces
> -
>
> Key: PIG-1804
> URL: https://issues.apache.org/jira/browse/PIG-1804
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.16.0
>Reporter: Julien Le Dem
>Assignee: Adam Szita
> Fix For: 0.17.0
>
> Attachments: PIG-1804.0.patch
>
>
> Currently Python UDFs can only be simple functions. For performance 
> improvements in advanced use cases it should be possible to implement the 
> Algebraic and/or Accumulator interfaces.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

