GitHub user kwmonroe opened a pull request:
https://github.com/apache/bigtop/pull/194
BIGTOP-2737: Spark charm doesn't handle HA or examples well
See [jira](https://issues.apache.org/jira/browse/BIGTOP-2737) for details.
In short, the spark charm did not transition cleanly between execution modes:
services were started when they were not needed, stale zk connection info
persisted, and so on. Fix this by waiting for required services to become ready
and starting only the spark services that the current execution mode needs.
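The mode-dependent start logic described above can be sketched roughly as
follows. This is a minimal illustration, not the charm's actual code: the mode
strings, the service names, and both helper functions (`services_for_mode`,
`wait_until`) are assumptions made for this example.

```python
import time


def services_for_mode(mode):
    """Return the Spark daemons that should run locally for a given mode.

    The mode strings and service names here are assumptions for this sketch.
    """
    if mode == "standalone":
        # Standalone mode needs its own master and worker daemons.
        return ["spark-master", "spark-worker"]
    # In yarn or local modes, no standalone master/worker is needed.
    return []


def wait_until(is_ready, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll is_ready() until it returns True or the timeout elapses.

    Returns True if the condition became ready in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

A caller might first use `wait_until` with a check that the zk ensemble is
reachable, and only then start whatever `services_for_mode` returns.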
Also, with the release of bigtop-1.2, we get spark-2.1. Hooray! However,
the `spark-examples.jar` has moved, so our actions (`sparkpi`, `pagerank`, etc)
need to locate it programmatically instead of relying on a hard-coded path.
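One way to locate the jar without hard-coding its path is a recursive glob.
The sketch below is an assumption about how this could be done, not the code in
this PR; the function name `find_examples_jar` and the filename pattern
`spark-examples*.jar` are illustrative.

```python
import glob
import os


def find_examples_jar(search_roots):
    """Return the first spark-examples jar found under the given roots.

    Returns None if no root contains a match. The filename pattern is an
    assumption; adjust it to whatever the installed packages provide.
    """
    for root in search_roots:
        # "**" with recursive=True matches the root itself and any subdirectory.
        pattern = os.path.join(root, "**", "spark-examples*.jar")
        matches = sorted(glob.glob(pattern, recursive=True))
        if matches:
            return matches[0]
    return None
```

For example, `find_examples_jar(["/usr/lib/spark"])` would search one
installation directory; that path is itself an assumption about where the
packages install spark.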
On the subject of benchmarks, the SparkBench suite that we've been making
available to our spark charm doesn't work with spark-2.1, so we'll need to
deprecate it or possibly find a version that works. For now, just make sure we
can run a consistent set of benchmarks across spark-1.5 and spark-2.1 -- that
means `sparkpi` and `pagerank`.
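As an illustration of keeping the two benchmarks consistent across spark
versions, the sketch below builds a `spark-submit` command line from the
built-in example classes. The helper `build_submit_command` and its arguments
are hypothetical; the jar path is expected to be discovered at runtime rather
than hard-coded.

```python
# Built-in Spark example classes used by the two benchmarks.
EXAMPLE_CLASSES = {
    "sparkpi": "org.apache.spark.examples.SparkPi",
    "pagerank": "org.apache.spark.examples.SparkPageRank",
}


def build_submit_command(example, master_url, examples_jar, args=()):
    """Build a spark-submit argument list for a built-in example.

    Hypothetical helper: raises KeyError for an unknown example name.
    """
    return [
        "spark-submit",
        "--class", EXAMPLE_CLASSES[example],
        "--master", master_url,
        examples_jar,
        # Application arguments are passed as strings after the jar.
        *[str(a) for a in args],
    ]
```

The resulting list can be handed to `subprocess.check_call` by whichever action
runs the benchmark.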
And finally, the spark charm reactive logic was difficult to follow. Use
this opportunity to refactor it to ease future maintenance.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/juju-solutions/bigtop bug/spark-zk-example-fixes
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/bigtop/pull/194.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #194
----
commit 8f4c5e05af08539f8e4e55b16282c37986d430ad
Author: Kevin W Monroe <[email protected]>
Date: 2017-03-28T20:46:17Z
find spark example vs hard code (path is not the same between spark1 and
spark2)
commit 2b42991feb2ddd834053067fe05bd7f90db68fca
Author: Kevin W Monroe <[email protected]>
Date: 2017-03-30T19:26:18Z
wait for zk ensemble when zks change in HA mode
commit 29eba746167f7fa22b901f6e74c54a0675ad9896
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-03T17:02:30Z
remove unused resources_mirror config; reset zk string correctly; do not
worry about zks when a sparkpeer departs (zk relations will reconfigure as
needed)
commit 83dc0fe113bf27367b568ac437ed487387a08369
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-03T21:05:11Z
reorder reactive methods
commit 1672cd804386142b2197e38d10db758c2eda803e
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-03T21:06:14Z
fix deployment mode logic; update comment
commit f09936e4a056958ee98051ab24fca47cd9811653
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T03:14:59Z
WIP: spark updates
- consolidate calls to get_master_url
- remove 2nd bigtop trigger (WIP to see if spark-worker is ok)
- stop master/worker in yarn mode
- remove unused 'upgrading' check in start()
- refactor reactive methods:
- only include nn/rm in hosts[] if in yarn mode
- clean up zk logic
- play it safe and reconfigure on new peers (not sure if needed)
- fold config.changed into reinstall_spark
commit c096e4d4ed62999a07313a17f311c9a70d23ecfd
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T04:39:12Z
update pagerank benchmark to use built-in spark example
commit 9a3ebc69a6b3cc9bae073e41d636e2f0c563ece4
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T04:39:53Z
update pagerank benchmark to use built-in spark example
commit c4fbd61d0ddf33cac151aaff939fdb330dc3cef6
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T18:14:50Z
start/stop logic fix: only start master/worker in standalone; wait for
possible master recovery
commit 03d4dfb92d6d3a9d983575576a528fcdf3d5aa4d
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T20:43:40Z
support mem config opts; better status reporting; refactor install into
hadoop helper and standalone helper
commit cbe57ff7c99b7c3bb8b0d28ea6a0190c1e4247c1
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T21:06:45Z
export mem vars
commit 8ed403a77dacbd168f62ad7e0a088ead6338aa81
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T22:44:57Z
increase timeout on worker restart; log deployment matrix
commit 8ba1904c779eb3d0cc4054e6e3e2de6fe5ac488e
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-04T23:34:24Z
fix logic gating reinstall
commit e86782885d99379eb1713d5920b23b66dc3b7a72
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-05T18:21:39Z
fix pagerank for yarn mode
commit 5b0a230a7cc3d14342a936c334e80fe289145c2a
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-05T18:30:49Z
only wait on startup if the master started (may not start in non-HA
standalone mode)
commit daf8407e067a9b4ee6e3bcd46cb32a7249258730
Author: Kevin W Monroe <[email protected]>
Date: 2017-04-05T18:53:19Z
fix yarn/hdfs logic
----