[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302653=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302653
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 28/Aug/19 08:16
Start Date: 28/Aug/19 08:16
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525635634
 
 
   Thanks Kyle.
   
   @RyanSkraba raises some valid points. There is something odd in our 
current translation, in particular the fact that we ignore keys for the 
`Reshuffle.viaRandomKey()` case.
   We should probably file a JIRA to track this and discuss it on the mailing 
list (some [previous discussion on 
Reshuffle](https://lists.apache.org/thread.html/820064a81c86a6d44f21f0d6c68ea3f46cec151e5e1a0b52eeed3fbf@%3Cdev.beam.apache.org%3E)).
   
   I was also wondering to what extent our current implementation (in 
particular for the random key case) could repartition into more 
partitions (based on available CPUs). Of course this risks consuming 
more resources than the job defines, but on the other hand it could be a way 
to optimize such shuffles downstream. (This is a different subject, though; I 
am just thinking aloud.)
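
The idea of repartitioning by available CPUs can be sketched roughly as follows. This is only an illustration, not anything from the Spark runner: the class `PartitionCountSketch` and the method `targetPartitions` are hypothetical names, and the policy (never shrink, grow up to the CPU count) is an assumption.

```java
final class PartitionCountSketch {
  /**
   * Hypothetical policy: keep the current partition count if it already
   * covers the available CPUs, otherwise grow to one partition per CPU.
   */
  static int targetPartitions(int currentPartitions, int availableCpus) {
    if (currentPartitions < 1 || availableCpus < 1) {
      throw new IllegalArgumentException("counts must be positive");
    }
    return Math.max(currentPartitions, availableCpus);
  }

  public static void main(String[] args) {
    // availableProcessors() only reports the cores visible to this JVM,
    // not the cores of a whole cluster.
    int cpus = Runtime.getRuntime().availableProcessors();
    System.out.println("target partitions: " + targetPartitions(4, cpus));
  }
}
```

A real runner would need the parallelism of the cluster rather than of the local JVM, and raising the partition count this way is exactly where the risk of using more resources than the job defines comes from.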
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 302653)
Time Spent: 2h 50m  (was: 2h 40m)

> Portable Spark Reshuffle coder cast exception
> -
>
> Key: BEAM-7864
> URL: https://issues.apache.org/jira/browse/BEAM-7864
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
> Fix For: 2.16.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> running :sdks:python:test-suites:portable:py35:portableWordCountBatch in 
> either loopback or docker mode on master fails with exception:
>  
> java.lang.ClassCastException: org.apache.beam.sdk.coders.LengthPrefixCoder 
> cannot be cast to org.apache.beam.sdk.coders.KvCoder
>  at 
> org.apache.beam.runners.spark.translation.SparkBatchPortablePipelineTranslator.translateReshuffle(SparkBatchPortablePipelineTranslator.java:400)
>  at 
> org.apache.beam.runners.spark.translation.SparkBatchPortablePipelineTranslator.translate(SparkBatchPortablePipelineTranslator.java:147)
>  at 
> org.apache.beam.runners.spark.SparkPipelineRunner.lambda$run$1(SparkPipelineRunner.java:96)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
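
The trace above shows a downcast to `KvCoder` failing inside `translateReshuffle`. The failure mode can be reproduced in miniature with stand-in types. The classes below are hypothetical and only share names with the Beam coders; they are not the Beam classes, and the guarded variant is one possible defensive pattern, not the fix applied in the pull request.

```java
final class CoderCastSketch {
  // Stand-in types for illustration only; these are NOT the Beam coder classes.
  interface Coder {}

  static final class BytesCoder implements Coder {}

  static final class KvCoder implements Coder {
    final Coder keyCoder;
    final Coder valueCoder;

    KvCoder(Coder keyCoder, Coder valueCoder) {
      this.keyCoder = keyCoder;
      this.valueCoder = valueCoder;
    }
  }

  static final class LengthPrefixCoder implements Coder {
    final Coder inner;

    LengthPrefixCoder(Coder inner) {
      this.inner = inner;
    }
  }

  /** Unconditional downcast: throws ClassCastException for any non-KV coder. */
  static KvCoder uncheckedKv(Coder coder) {
    return (KvCoder) coder;
  }

  /** Guarded variant: reports whether the KV-specific path applies at all. */
  static boolean isKv(Coder coder) {
    return coder instanceof KvCoder;
  }
}
```

Calling `uncheckedKv` with a `LengthPrefixCoder` fails in the same way as the reported exception, whereas a translation that never needs to split keys from values has no reason to perform the cast.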



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302652=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302652
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 28/Aug/19 08:16
Start Date: 28/Aug/19 08:16
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525635634
 
 
   Thanks Kyle.
   
   @RyanSkraba raises some valid points. There is something odd in our 
current translation, in particular the fact that we ignore keys for the 
`Reshuffle.viaRandomKey()` case.
   We should probably file a JIRA to track this and discuss it on the mailing 
list (some [previous discussion on Reshuffle 
here](https://lists.apache.org/thread.html/820064a81c86a6d44f21f0d6c68ea3f46cec151e5e1a0b52eeed3fbf@%3Cdev.beam.apache.org%3E)).
   
   I was also wondering to what extent our current implementation (in 
particular for the random key case) could repartition into more 
partitions (based on available CPUs). Of course this risks consuming 
more resources than the job defines, but on the other hand it could be a way 
to optimize such shuffles downstream. (This is a different subject, though; I 
am just thinking aloud.)
 



Issue Time Tracking
---

Worklog Id: (was: 302652)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302443=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302443
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 27/Aug/19 22:30
Start Date: 27/Aug/19 22:30
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 302443)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302442=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302442
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 27/Aug/19 22:29
Start Date: 27/Aug/19 22:29
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525507950
 
 
   Failing tests are unrelated (BEAM-8025 and BEAM-8102).
 



Issue Time Tracking
---

Worklog Id: (was: 302442)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302404
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 27/Aug/19 20:57
Start Date: 27/Aug/19 20:57
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525479659
 
 
   Run Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 302404)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=302190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-302190
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 27/Aug/19 16:44
Start Date: 27/Aug/19 16:44
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525385967
 
 
   Run Java Spark PortableValidatesRunner Batch
 



Issue Time Tracking
---

Worklog Id: (was: 302190)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301854
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 27/Aug/19 10:05
Start Date: 27/Aug/19 10:05
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525234062
 
 
   Still LGTM -- as I mentioned, I'm not entirely sure *why* the original 
implementation is done as it is!
   
   It seems that there could be some useful discussion around `Reshuffle` and 
its contract -- if it weren't deprecated!
   
   As far as I can tell, the *intention* of `Reshuffle.of` is to ensure that 
the "upstream" is materialized and that, if upstream is ever rematerialized for 
any reason, the resulting partitions are deterministic.  We don't care whether 
partition sizes are skewed.
   
   The reference implementation implies that records with the same key are in 
the same partition, but the Spark replacement has never done this.  It also 
rebalances the partitions (deterministically) whether we care or not.
   
   The *intention* of `Reshuffle.viaRandomKey` is to additionally balance the 
partitions.  We don't care whether the results are deterministic.
   
   The Spark implementation rebalances the partitions, but it would have done 
that anyway if the entire `.viaRandomKey` were replaced with an `.of`.  The 
entire `viaRandomKey()` translation is unnecessary cruft in Spark 
_unless_ random repartitioning is a requirement.  Is it?
   
   `I want materialization of partitions of approximately the same size`  does 
not mean `I need the data to be randomly assigned to partitions.`
   
   *Anyway* sorry for the sidetrack!  I just feel like I might be missing a 
piece here and would welcome clarity  :D  In my opinion, all of the current 
usages of `Reshuffle.of()` and `viaRandomKey()` are valid with this 
re-implementation, except for the Deduplicate in GDF which isn't relevant.
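
The distinction between "partitions of approximately the same size" and "randomly assigned to partitions" can be made concrete with a small sketch. It is plain Java with no Beam or Spark dependency, the class and method names are hypothetical, and neither method is claimed to match what either project actually does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class PartitionAssignmentSketch {
  /** Round-robin: deterministic, and partition sizes differ by at most one. */
  static <T> List<List<T>> roundRobin(List<T> records, int numPartitions) {
    List<List<T>> parts = emptyPartitions(numPartitions);
    for (int i = 0; i < records.size(); i++) {
      parts.get(i % numPartitions).add(records.get(i));
    }
    return parts;
  }

  /** Random key: balanced only in expectation, and not repeatable unless the Random is seeded. */
  static <T> List<List<T>> randomKey(List<T> records, int numPartitions, Random rng) {
    List<List<T>> parts = emptyPartitions(numPartitions);
    for (T record : records) {
      parts.get(rng.nextInt(numPartitions)).add(record);
    }
    return parts;
  }

  private static <T> List<List<T>> emptyPartitions(int numPartitions) {
    List<List<T>> parts = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      parts.add(new ArrayList<>());
    }
    return parts;
  }
}
```

Both methods spread records over all partitions, but only the round-robin one gives the same layout on every recomputation, which is the property attributed above to `Reshuffle.of`.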
 



Issue Time Tracking
---

Worklog Id: (was: 301854)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301543=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301543
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 26/Aug/19 21:50
Start Date: 26/Aug/19 21:50
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525047135
 
 
   Run Python Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 301543)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301542=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301542
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 26/Aug/19 21:50
Start Date: 26/Aug/19 21:50
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525047102
 
 
   Run Java Spark PortableValidatesRunner Batch
 



Issue Time Tracking
---

Worklog Id: (was: 301542)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301513=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301513
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 26/Aug/19 21:01
Start Date: 26/Aug/19 21:01
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525031072
 
 
   Run Java Spark PortableValidatesRunner Batch
 



Issue Time Tracking
---

Worklog Id: (was: 301513)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301514
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 26/Aug/19 21:01
Start Date: 26/Aug/19 21:01
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] 
Simplify/generalize Spark reshuffle translation
URL: https://github.com/apache/beam/pull/9410#issuecomment-525031112
 
 
   Run Python Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 301514)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (BEAM-7864) Portable Spark Reshuffle coder cast exception

2019-08-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=301452=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301452
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 26/Aug/19 20:08
Start Date: 26/Aug/19 20:08
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] Fix Spark 
reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410#issuecomment-525011659
 
 
   Thanks for the review @RyanSkraba. In that case, I will change the shared 
translation instead of just the portable runner's.
 



Issue Time Tracking
---

Worklog Id: (was: 301452)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (BEAM-7864) Portable spark Reshuffle coder cast exception

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=300464&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300464
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 23/Aug/19 19:43
Start Date: 23/Aug/19 19:43
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] fix Spark 
reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410#issuecomment-524438917
 
 
   R: @iemejia Any particular reason we need to separate the keys and values in 
the current reshuffle translation? 
https://github.com/apache/beam/blob/c5d45331796693ad48c0cceaf1e4d9903c1d98fb/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/GroupCombineFunctions.java#L191-L197
 



Issue Time Tracking
---

Worklog Id: (was: 300464)
Time Spent: 50m  (was: 40m)

> Portable spark Reshuffle coder cast exception
> -
>
> Key: BEAM-7864
> URL: https://issues.apache.org/jira/browse/BEAM-7864
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 50m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-7864) Portable spark Reshuffle coder cast exception

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=300447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300447
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 23/Aug/19 19:24
Start Date: 23/Aug/19 19:24
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] fix Spark 
reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410#issuecomment-524433753
 
 
   Run Python Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 300447)
Time Spent: 40m  (was: 0.5h)

> Portable spark Reshuffle coder cast exception
> -
>
> Key: BEAM-7864
> URL: https://issues.apache.org/jira/browse/BEAM-7864
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 40m
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-7864) Portable spark Reshuffle coder cast exception

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=299849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299849
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 23/Aug/19 01:38
Start Date: 23/Aug/19 01:38
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] fix Spark 
reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410#issuecomment-524138861
 
 
   Run Java Spark PortableValidatesRunner Batch
 



Issue Time Tracking
---

Worklog Id: (was: 299849)
Time Spent: 0.5h  (was: 20m)

> Portable spark Reshuffle coder cast exception
> -
>
> Key: BEAM-7864
> URL: https://issues.apache.org/jira/browse/BEAM-7864
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 0.5h
>  Remaining Estimate: 0h





[jira] [Work logged] (BEAM-7864) Portable spark Reshuffle coder cast exception

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=299846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299846
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 23/Aug/19 01:30
Start Date: 23/Aug/19 01:30
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9410: [BEAM-7864] fix 
Spark reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410
 
 
   The previous implementation of reshuffle on the portable Spark runner made 
assumptions about its inputs that proved false when running some Python 
pipelines. This new translation is more general, which fixes that.
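
   To illustrate the kind of assumption involved: the stack trace on the issue shows a `LengthPrefixCoder` being cast to `KvCoder`. The following is only a minimal sketch of that failure mode, not the code in this PR. The classes below are hypothetical stand-ins that merely share names with Beam's coders, and `assumeKv` / `asKv` are illustrative helpers that do not exist in Beam.

```java
import java.util.Optional;

public class CoderCastSketch {
    /** Hypothetical stand-in for a coder type; NOT Beam's real Coder class. */
    public interface Coder {}

    /** Hypothetical leaf coder. */
    public static final class BytesCoder implements Coder {}

    /** Hypothetical wrapper coder, named after the one in the stack trace. */
    public static final class LengthPrefixCoder implements Coder {
        public final Coder inner;

        public LengthPrefixCoder(Coder inner) {
            this.inner = inner;
        }
    }

    /** Hypothetical key/value coder, named after the cast target in the stack trace. */
    public static final class KvCoder implements Coder {
        public final Coder keyCoder;
        public final Coder valueCoder;

        public KvCoder(Coder keyCoder, Coder valueCoder) {
            this.keyCoder = keyCoder;
            this.valueCoder = valueCoder;
        }
    }

    /** Assumes the input is always a KvCoder; throws ClassCastException otherwise. */
    public static KvCoder assumeKv(Coder coder) {
        return (KvCoder) coder;
    }

    /** Checks the runtime type first, so non-KV coders can be handled generically. */
    public static Optional<KvCoder> asKv(Coder coder) {
        return coder instanceof KvCoder ? Optional.of((KvCoder) coder) : Optional.empty();
    }

    public static void main(String[] args) {
        // A coder that is not a KvCoder at the top level, as in the failing pipeline.
        Coder wrapped = new LengthPrefixCoder(new BytesCoder());

        System.out.println("treated as KV: " + asKv(wrapped).isPresent());
        try {
            assumeKv(wrapped);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```

   A translation that does not depend on the element coder being a KV coder avoids the cast entirely; the type check above is just one way to show why the unchecked cast is fragile.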
   

[jira] [Work logged] (BEAM-7864) Portable spark Reshuffle coder cast exception

2019-08-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7864?focusedWorklogId=299847&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-299847
 ]

ASF GitHub Bot logged work on BEAM-7864:


Author: ASF GitHub Bot
Created on: 23/Aug/19 01:30
Start Date: 23/Aug/19 01:30
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9410: [BEAM-7864] fix Spark 
reshuffle translation with Python SDK
URL: https://github.com/apache/beam/pull/9410#issuecomment-524137681
 
 
   Run Java Spark PortableValidatesRunner Batch
   
 



Issue Time Tracking
---

Worklog Id: (was: 299847)
Time Spent: 20m  (was: 10m)

> Portable spark Reshuffle coder cast exception
> -
>
> Key: BEAM-7864
> URL: https://issues.apache.org/jira/browse/BEAM-7864
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 20m
>  Remaining Estimate: 0h


