Github user kellrott commented on the pull request:
https://github.com/apache/spark/pull/1292#issuecomment-76562027
This was originally written in response to Spark not scaling well when
multiple jobs run at the same time (
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201407.mbox/%3ccakxmip2he3mocuu09hf5dxreq7kgc3rvnujtsu54w82bawn...@mail.gmail.com%3E
). One of the main reasons it wasn't written as an external package was
that too much of the MLlib code was marked as private and couldn't be used.
Kyle
On Fri, Feb 27, 2015 at 1:18 AM, Xiangrui Meng <[email protected]>
wrote:
> @nchammas <https://github.com/nchammas> @kellrott
> <https://github.com/kellrott> Sorry that I haven't had time to review
> this PR. The main issue is that this PR is not small. Though there are no
> breaking changes, the added code is very similar to the ungrouped code, and
> hence we have to maintain both in the future. I'm wondering how this compares
> to letting Spark launch multiple jobs at the same time while using existing
> code.
>
> Given the current backlog, I'd suggest maintaining it as a Spark package (
> http://spark-packages.org/), so users can grab it easily and try it out.
>
> Reply to this email directly or view it on GitHub
> <https://github.com/apache/spark/pull/1292#issuecomment-76360812>.
>