Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39812228
I'm still tracking down exactly where this problem is coming from, but here's a little more detail on what's going wrong with Python accumulators when the closure passed ...
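For context, the underlying failure mode is easy to reproduce outside Spark. The following is a minimal, self-contained sketch (not Spark's actual code; `Acc` and `cloneViaSerializer` are hypothetical stand-ins for a PySpark accumulator and the closure serializer): a serialization round-trip on a closure yields a detached copy of everything it captured, so updates made through the clone never reach the original object.

```scala
import java.io._

// Hypothetical stand-in for an accumulator captured by a job's closure.
class Acc(var value: Int) extends Serializable {
  def add(n: Int): Unit = value += n
}

object CloneDemo {
  // Round-trip an object through Java serialization, standing in for
  // cloning a closure with a closure serializer.
  def cloneViaSerializer[T](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    val acc = new Acc(0)
    val closure = () => acc.add(1)  // captures acc by reference

    val cloned = cloneViaSerializer(closure)
    cloned()                        // updates the copy of acc inside the clone

    println(acc.value)              // prints 0: the original never saw the update
  }
}
```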
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39610852
So I just added this to `ClosureCleanerSuite` and I'm not seeing the same behavior. (I already had added a captured-field test to `ClosureCleanerSuite`.) I don't have ...
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39610969
(e.g. here:
https://github.com/willb/spark/commit/12c63a7e03bce359fd7eb7faf0a054bd32f85824#diff-f949ef08cc8a2b36861af3beb4309a88R161)
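For reference, a captured-field test has roughly the following shape (an illustrative sketch only; `Unserializable` and `CapturedFieldDemo` are hypothetical names, not the code from the commit above): a closure that reads a field of its enclosing instance captures `this`, which is exactly the case a closure cleaner has to handle.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// A class that is deliberately NOT serializable.
class Unserializable {
  val factor = 2
  // Referencing the field `factor` makes the lambda capture `this`.
  val scale: Int => Int = x => x * factor
}

object CapturedFieldDemo {
  def main(args: Array[String]): Unit = {
    val f = new Unserializable().scale
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(f)  // fails: the captured outer instance can't be serialized
      println("serialized OK")
    } catch {
      case e: NotSerializableException =>
        println(s"not serializable: ${e.getMessage}")  // expected without cleaning
    }
  }
}
```

A suite-level test would then assert how cleaning treats such a closure; here the captured field is genuinely needed, so the capture cannot simply be nulled out and a serializability check should reject the closure.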
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39613329
@mateiz, these naming and stylistic suggestions make sense; thanks!
Cloning the closure in `runJob` is what caused Python accumulators to stop working. I will have ...
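Concretely, the problematic pattern looks something like the following (a hedged sketch with illustrative names, not Spark's exact `runJob` code): round-tripping the closure through the closure serializer proves it is serializable, but also silently replaces it with a clone.

```scala
import java.nio.ByteBuffer
import scala.reflect.ClassTag
import org.apache.spark.serializer.SerializerInstance

object ClosureCheck {
  // Illustrative sketch: serialize-then-deserialize checks serializability
  // but hands back a *copy* of the closure.
  def cloneClosure[T <: AnyRef : ClassTag](func: T, ser: SerializerInstance): T = {
    val bytes: ByteBuffer = ser.serialize(func)
    ser.deserialize[T](bytes)
  }
}
```

If the job then runs the clone, a PySpark accumulator captured by `func` is duplicated along with it, and updates land in the copy rather than in the driver's original object.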
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39613399
(BTW, as a matter of style, is it better to rebase my branch or add another
commit that makes the change?)
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39617193
OK, I've made the changes and will push updates after re-running tests
locally. (I'll also follow up on the Python accumulators.)
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39626070
Matei, my latest commit addresses the style and naming issues; I'll need to dig into the cause of the Python issue some more over the weekend. Thanks again for your ...
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39490073
Thanks for taking another look, Matei! I know there's a lot of stuff to
get in before the merge window closes and appreciate the update.
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-39144319
It looks like this Travis error is the same one others have seen on the dev list -- that is, the Hive test is timing out.
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-38839409
I have rebased this branch to remove the commit that took out the
serializability check in `DAGScheduler`.
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-38330588
@mateiz, I was pretty confused about this, but it looks like the Python accumulator tests are what's failing. I'm not super-familiar with PySpark yet but am trying ...
Github user willb commented on the pull request:
https://github.com/apache/spark/pull/189#issuecomment-38340085
The Python accumulators still work on master and with the changes from #143 (which serializes the closure at closure cleaning but doesn't use the serialized value ...)
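The distinction matters: a check that discards the serialized bytes leaves the original closure, and anything it captures such as accumulators, in place. A minimal sketch of that idea, with illustrative names (not the actual code from #143):

```scala
import java.io.NotSerializableException
import org.apache.spark.SparkException
import org.apache.spark.serializer.SerializerInstance

object SerializabilityCheck {
  // Illustrative sketch: prove the closure can be shipped, but keep running
  // the original object so captured state (e.g. accumulators) stays shared.
  def ensureSerializable(func: AnyRef, ser: SerializerInstance): Unit =
    try {
      ser.serialize(func)  // result intentionally discarded
    } catch {
      case e: NotSerializableException =>
        throw new SparkException("Task not serializable: " + e.getMessage, e)
    }
}
```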