GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/21913
[WIP][SPARK-24005][CORE] Remove usage of Scala's parallel collection
## What changes were proposed in this pull request?
This PR proposes to replace the usage of Scala parallel collections on the
executor side with new `parmap()` methods. The methods use futures to transform
a sequential collection by applying a lambda function to each element in
parallel. The result of `parmap` is another regular (sequential) collection.
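To make the idea concrete, here is a minimal sketch of what a futures-based `parmap` could look like. It is not the code from this pull request: the object name `ParMapSketch`, the `maxThreads` parameter, and the per-call thread pool are assumptions made for illustration only.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ParMapSketch {
  // Hypothetical helper: the signature and the thread-pool handling are
  // assumptions for illustration, not the implementation in the PR.
  def parmap[I, O](in: Seq[I], maxThreads: Int)(f: I => O): Seq[O] = {
    // A dedicated fixed-size pool bounds the parallelism for this call.
    val pool = Executors.newFixedThreadPool(maxThreads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Start one future per element; each applies f on a pool thread.
      val futures = in.map(x => Future(f(x)))
      // Future.sequence keeps the input order; block until all complete.
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally {
      // Release the threads even if one of the futures failed.
      pool.shutdown()
    }
  }
}
```

For example, `ParMapSketch.parmap(Seq(1, 2, 3), maxThreads = 2)(_ * 2)` returns `Seq(2, 4, 6)`: the elements are processed concurrently, but the result is an ordinary sequential `Seq` in the original order.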
## How was this patch tested?
I am still testing the changes.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 par-map
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21913.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21913
----
commit ac26dea47f624d4617a24ef75fb145050862a3dd
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-07-29T17:15:48Z
Initial implementation of parmap function
commit 506fa93107269f3a2a701ca20ddad655766ff503
Author: Maxim Gekk <maxim.gekk@...>
Date: 2018-07-29T18:18:18Z
Initial implementation of parmap()
----