[
https://issues.apache.org/jira/browse/SDAP-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705206#comment-16705206
]
ASF GitHub Bot commented on SDAP-151:
-------------------------------------
fgreg opened a new pull request #60: SDAP-151 Determine parallelism
automatically for Spark analytics (#50)
URL: https://github.com/apache/incubator-sdap-nexus/pull/60
* Removed the spark configuration parameter, added an nparts configuration
parameter, and automatically compute parallelism for Spark-based time series.
* SDAP-151 Determine parallelism automatically for Spark analytics
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> Determine parallelism automatically for Spark analytics
> -------------------------------------------------------
>
> Key: SDAP-151
> URL: https://issues.apache.org/jira/browse/SDAP-151
> Project: Apache Science Data Analytics Platform
> Issue Type: Improvement
> Reporter: Joseph Jacob
> Assignee: Joseph Jacob
> Priority: Major
>
> Some of the built-in NEXUS analytics like TimeSeries and TimeAvgMap currently
> get the desired parallelism from a job request parameter like
> "spark=mesos,16,32". If that is omitted, we currently default to
> "spark=local,1,1", which runs on a single core. Instead, we would like to
> automatically determine the appropriate level of parallelism based on the
> job's input data size.
>
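As a rough illustration of the idea in the issue, the partition count could be derived from the input data size rather than a fixed "spark=mesos,16,32" request parameter. The function name, target partition size, and core multiplier below are illustrative assumptions, not SDAP code:

```python
# Hypothetical sketch: derive a Spark partition count from input size.
# All names and thresholds are assumptions for illustration only.

TARGET_BYTES_PER_PARTITION = 128 * 1024 * 1024  # aim for ~128 MB per task


def auto_parallelism(input_bytes, max_cores, min_parts=1):
    """Return a partition count proportional to input size.

    Scales with input_bytes (ceiling division by the target partition
    size) but is capped at a small multiple of the available cores so
    tasks do not become too fine-grained.
    """
    # Ceiling division without importing math.ceil.
    parts = max(min_parts, -(-input_bytes // TARGET_BYTES_PER_PARTITION))
    return min(parts, 3 * max_cores)
```

For example, a 1 GB input with 16 cores available would yield 8 partitions, while a very large input would be capped at 3x the core count instead of producing thousands of tiny tasks.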
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)