[jira] [Commented] (PIG-5201) Null handling on FLATTEN

2017-08-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121076#comment-16121076
 ] 

Daniel Dai commented on PIG-5201:
-

What's your idea for the column padding?

> Null handling on FLATTEN
> 
>
> Key: PIG-5201
> URL: https://issues.apache.org/jira/browse/PIG-5201
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: pig-5201-v00-testonly.patch, pig-5201-v01.patch, 
> pig-5201-v02.patch, pig-5201-v03.patch
>
>
> Sometimes, FLATTEN(null) or FLATTEN(bag-with-null) seem to produce incorrect 
> results.
> Test code/script to follow.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5256) Bytecode generation for POFilter and POForeach

2017-08-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121068#comment-16121068
 ] 

Daniel Dai commented on PIG-5256:
-

Haven't checked the code line by line, but the overall approach looks good to 
me. The patch flattened expression tree and nested plan, so we can avoid 
virtual function call, and have a cleaner solution for multi-time evaluation 
issue. This can be extended to flatten the whole operator tree in the future 
(whole stage codegen).

> Bytecode generation for POFilter and POForeach
> --
>
> Key: PIG-5256
> URL: https://issues.apache.org/jira/browse/PIG-5256
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.18.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : Pig-trunk #2038

2017-08-09 Thread Apache Jenkins Server
See 



[jira] [Commented] (PIG-5201) Null handling on FLATTEN

2017-08-09 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120561#comment-16120561
 ] 

Koji Noguchi commented on PIG-5201:
---

bq. Shall we produce the same number of nulls columns according to schema?
Yes, we should.

bq. It might be the same as PIG-2537.
Could be, but I don't think we can fix at Utf8StorageConverter loadcaster level.


> Null handling on FLATTEN
> 
>
> Key: PIG-5201
> URL: https://issues.apache.org/jira/browse/PIG-5201
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: pig-5201-v00-testonly.patch, pig-5201-v01.patch, 
> pig-5201-v02.patch, pig-5201-v03.patch
>
>
> Sometimes, FLATTEN(null) or FLATTEN(bag-with-null) seem to produce incorrect 
> results.
> Test code/script to follow.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5201) Null handling on FLATTEN

2017-08-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120551#comment-16120551
 ] 

Daniel Dai commented on PIG-5201:
-

It is equivalent to flatten a scalar. That's sounds fine. How about columns? 
Shall we produce the same number of nulls columns according to schema? It might 
be the same as PIG-2537.

> Null handling on FLATTEN
> 
>
> Key: PIG-5201
> URL: https://issues.apache.org/jira/browse/PIG-5201
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: pig-5201-v00-testonly.patch, pig-5201-v01.patch, 
> pig-5201-v02.patch, pig-5201-v03.patch
>
>
> Sometimes, FLATTEN(null) or FLATTEN(bag-with-null) seem to produce incorrect 
> results.
> Test code/script to follow.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5201) Null handling on FLATTEN

2017-08-09 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120468#comment-16120468
 ] 

Koji Noguchi commented on PIG-5201:
---

[~daijy], as I wrote above, are you ok with FLATTEN(null-bag) not dropping ?  
(e.g. if this bag has schema, then FLATTEN(null) be expanded to multiple nulls 
to match the number of fields)

> Null handling on FLATTEN
> 
>
> Key: PIG-5201
> URL: https://issues.apache.org/jira/browse/PIG-5201
> Project: Pig
>  Issue Type: Bug
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Minor
> Fix For: 0.18.0
>
> Attachments: pig-5201-v00-testonly.patch, pig-5201-v01.patch, 
> pig-5201-v02.patch, pig-5201-v03.patch
>
>
> Sometimes, FLATTEN(null) or FLATTEN(bag-with-null) seem to produce incorrect 
> results.
> Test code/script to follow.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5283) Configuration is not passed to SparkPigSplits on the backend

2017-08-09 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5283:

   Resolution: Fixed
Fix Version/s: 0.18.0
   Status: Resolved  (was: Patch Available)

> Configuration is not passed to SparkPigSplits on the backend
> 
>
> Key: PIG-5283
> URL: https://issues.apache.org/jira/browse/PIG-5283
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: 0.18.0
>
> Attachments: PIG-5283.0.patch, PIG-5283.1.patch
>
>
> When a Hadoop ObjectWritable is created during a Spark job, the instantiated 
> PigSplit (wrapped into a SparkPigSplit) is given an empty Configuration 
> instance.
> This happens 
> [here|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5283) Configuration is not passed to SparkPigSplits on the backend

2017-08-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119583#comment-16119583
 ] 

Adam Szita commented on PIG-5283:
-

[^PIG-5283.1.patch] committed to trunk. Thanks for the review Rohini, Liyun and 
Nandor

> Configuration is not passed to SparkPigSplits on the backend
> 
>
> Key: PIG-5283
> URL: https://issues.apache.org/jira/browse/PIG-5283
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: PIG-5283.0.patch, PIG-5283.1.patch
>
>
> When a Hadoop ObjectWritable is created during a Spark job, the instantiated 
> PigSplit (wrapped into a SparkPigSplit) is given an empty Configuration 
> instance.
> This happens 
> [here|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5283) Configuration is not passed to SparkPigSplits on the backend

2017-08-09 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119488#comment-16119488
 ] 

liyunzhang_intel commented on PIG-5283:
---

[~szita]: thanks for your explaination. +1

> Configuration is not passed to SparkPigSplits on the backend
> 
>
> Key: PIG-5283
> URL: https://issues.apache.org/jira/browse/PIG-5283
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: PIG-5283.0.patch, PIG-5283.1.patch
>
>
> When a Hadoop ObjectWritable is created during a Spark job, the instantiated 
> PigSplit (wrapped into a SparkPigSplit) is given an empty Configuration 
> instance.
> This happens 
> [here|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5283) Configuration is not passed to SparkPigSplits on the backend

2017-08-09 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119467#comment-16119467
 ] 

Adam Szita commented on PIG-5283:
-

[~kellyzly] because insisde {{readFields}} [compression configs are 
used|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigSplit.java#L307]

> Configuration is not passed to SparkPigSplits on the backend
> 
>
> Key: PIG-5283
> URL: https://issues.apache.org/jira/browse/PIG-5283
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: PIG-5283.0.patch, PIG-5283.1.patch
>
>
> When a Hadoop ObjectWritable is created during a Spark job, the instantiated 
> PigSplit (wrapped into a SparkPigSplit) is given an empty Configuration 
> instance.
> This happens 
> [here|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] Subscription: PIG patch available

2017-08-09 Thread jira
Issue Subscription
Filter: PIG patch available (35 issues)

Subscriber: pigdaily

Key Summary
PIG-5268Review of org.apache.pig.backend.hadoop.datastorage.HDataStorage
https://issues-test.apache.org/jira/browse/PIG-5268
PIG-5267Review of org.apache.pig.impl.io.BufferedPositionedInputStream
https://issues-test.apache.org/jira/browse/PIG-5267
PIG-5264Remove deprecated keys from PigConfiguration
https://issues-test.apache.org/jira/browse/PIG-5264
PIG-5246Modify bin/pig about SPARK_HOME, SPARK_ASSEMBLY_JAR after upgrading 
spark to 2
https://issues-test.apache.org/jira/browse/PIG-5246
PIG-5160SchemaTupleFrontend.java is not thread safe, cause PigServer thrown 
NPE in multithread env
https://issues-test.apache.org/jira/browse/PIG-5160
PIG-5157Upgrade to Spark 2.0
https://issues-test.apache.org/jira/browse/PIG-5157
PIG-5115Builtin AvroStorage generates incorrect avro schema when the same 
pig field name appears in the alias
https://issues-test.apache.org/jira/browse/PIG-5115
PIG-5106Optimize when mapreduce.input.fileinputformat.input.dir.recursive 
set to true
https://issues-test.apache.org/jira/browse/PIG-5106
PIG-5081Can not run pig on spark source code distribution
https://issues-test.apache.org/jira/browse/PIG-5081
PIG-5080Support store alias as spark table
https://issues-test.apache.org/jira/browse/PIG-5080
PIG-5057IndexOutOfBoundsException when pig reducer processOnePackageOutput
https://issues-test.apache.org/jira/browse/PIG-5057
PIG-5029Optimize sort case when data is skewed
https://issues-test.apache.org/jira/browse/PIG-5029
PIG-4926Modify the content of start.xml for spark mode
https://issues-test.apache.org/jira/browse/PIG-4926
PIG-4913Reduce jython function initiation during compilation
https://issues-test.apache.org/jira/browse/PIG-4913
PIG-4849pig on tez will cause tez-ui to crash,because the content from 
timeline server is too long. 
https://issues-test.apache.org/jira/browse/PIG-4849
PIG-4750REPLACE_MULTI should compile Pattern once and reuse it
https://issues-test.apache.org/jira/browse/PIG-4750
PIG-4684Exception should be changed to warning when job diagnostics cannot 
be fetched
https://issues-test.apache.org/jira/browse/PIG-4684
PIG-4656Improve String serialization and comparator performance in 
BinInterSedes
https://issues-test.apache.org/jira/browse/PIG-4656
PIG-4598Allow user defined plan optimizer rules
https://issues-test.apache.org/jira/browse/PIG-4598
PIG-4551Partition filter is not pushed down in case of SPLIT
https://issues-test.apache.org/jira/browse/PIG-4551
PIG-4539New PigUnit
https://issues-test.apache.org/jira/browse/PIG-4539
PIG-4515org.apache.pig.builtin.Distinct throws ClassCastException
https://issues-test.apache.org/jira/browse/PIG-4515
PIG-4323PackageConverter hanging in Spark
https://issues-test.apache.org/jira/browse/PIG-4323
PIG-4313StackOverflowError in LIMIT operation on Spark
https://issues-test.apache.org/jira/browse/PIG-4313
PIG-4251Pig on Storm
https://issues-test.apache.org/jira/browse/PIG-4251
PIG-4002Disable combiner when map-side aggregation is used
https://issues-test.apache.org/jira/browse/PIG-4002
PIG-3952PigStorage accepts '-tagSplit' to return full split information
https://issues-test.apache.org/jira/browse/PIG-3952
PIG-3911Define unique fields with @OutputSchema
https://issues-test.apache.org/jira/browse/PIG-3911
PIG-3877Getting Geo Latitude/Longitude from Address Lines
https://issues-test.apache.org/jira/browse/PIG-3877
PIG-3873Geo distance calculation using Haversine
https://issues-test.apache.org/jira/browse/PIG-3873
PIG-3864ToDate(userstring, format, timezone) computes DateTime with strange 
handling of Daylight Saving Time with location based timezones
https://issues-test.apache.org/jira/browse/PIG-3864
PIG-3668COR built-in function when atleast one of the coefficient values is 
NaN
https://issues-test.apache.org/jira/browse/PIG-3668
PIG-3655BinStorage and InterStorage approach to record markers is broken
https://issues-test.apache.org/jira/browse/PIG-3655
PIG-3587add functionality for rolling over dates
https://issues-test.apache.org/jira/browse/PIG-3587
PIG-1804Alow Jython function to implement Algebraic and/or Accumulator 
interfaces
https://issues-test.apache.org/jira/browse/PIG-1804

You may edit this subscription at:
https://issues-test.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328=12322384


[jira] Subscription: PIG patch available

2017-08-09 Thread jira
Issue Subscription
Filter: PIG patch available (33 issues)

Subscriber: pigdaily

Key Summary
PIG-5283Configuration is not passed to SparkPigSplits on the backend
https://issues.apache.org/jira/browse/PIG-5283
PIG-5268Review of org.apache.pig.backend.hadoop.datastorage.HDataStorage
https://issues.apache.org/jira/browse/PIG-5268
PIG-5267Review of org.apache.pig.impl.io.BufferedPositionedInputStream
https://issues.apache.org/jira/browse/PIG-5267
PIG-5256Bytecode generation for POFilter and POForeach
https://issues.apache.org/jira/browse/PIG-5256
PIG-5160SchemaTupleFrontend.java is not thread safe, cause PigServer thrown 
NPE in multithread env
https://issues.apache.org/jira/browse/PIG-5160
PIG-5115Builtin AvroStorage generates incorrect avro schema when the same 
pig field name appears in the alias
https://issues.apache.org/jira/browse/PIG-5115
PIG-5106Optimize when mapreduce.input.fileinputformat.input.dir.recursive 
set to true
https://issues.apache.org/jira/browse/PIG-5106
PIG-5081Can not run pig on spark source code distribution
https://issues.apache.org/jira/browse/PIG-5081
PIG-5080Support store alias as spark table
https://issues.apache.org/jira/browse/PIG-5080
PIG-5057IndexOutOfBoundsException when pig reducer processOnePackageOutput
https://issues.apache.org/jira/browse/PIG-5057
PIG-5029Optimize sort case when data is skewed
https://issues.apache.org/jira/browse/PIG-5029
PIG-4926Modify the content of start.xml for spark mode
https://issues.apache.org/jira/browse/PIG-4926
PIG-4913Reduce jython function initiation during compilation
https://issues.apache.org/jira/browse/PIG-4913
PIG-4849pig on tez will cause tez-ui to crash,because the content from 
timeline server is too long. 
https://issues.apache.org/jira/browse/PIG-4849
PIG-4750REPLACE_MULTI should compile Pattern once and reuse it
https://issues.apache.org/jira/browse/PIG-4750
PIG-4684Exception should be changed to warning when job diagnostics cannot 
be fetched
https://issues.apache.org/jira/browse/PIG-4684
PIG-4656Improve String serialization and comparator performance in 
BinInterSedes
https://issues.apache.org/jira/browse/PIG-4656
PIG-4598Allow user defined plan optimizer rules
https://issues.apache.org/jira/browse/PIG-4598
PIG-4551Partition filter is not pushed down in case of SPLIT
https://issues.apache.org/jira/browse/PIG-4551
PIG-4539New PigUnit
https://issues.apache.org/jira/browse/PIG-4539
PIG-4515org.apache.pig.builtin.Distinct throws ClassCastException
https://issues.apache.org/jira/browse/PIG-4515
PIG-4323PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4251Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4002Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3877Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3864ToDate(userstring, format, timezone) computes DateTime with strange 
handling of Daylight Saving Time with location based timezones
https://issues.apache.org/jira/browse/PIG-3864
PIG-3668COR built-in function when atleast one of the coefficient values is 
NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3587add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587
PIG-1804Alow Jython function to implement Algebraic and/or Accumulator 
interfaces
https://issues.apache.org/jira/browse/PIG-1804

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328=12322384