[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73470437
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27089/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73470429
  
  [Test build #27089 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27089/consoleFull)
 for   PR 4385 at commit 
[`7e2b4be`](https://github.com/apache/spark/commit/7e2b4be993e4029204bee14647e0564de528c490).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73566658
  
Thanks @florianverhein - I just tested this locally and it worked fine. 
Lets wait for @nchammas to sign off and we can merge it after that (also 
Jenkins passed !)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73615976
  
Thanks. No problem.
I also have [SPARK-5641] ready to go once this is merged.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73616133
  
Looks like 3 users in agreement and to my moderately trained eye looks good 
too. Jenkins says aye, so let's merge it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4385


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73611158
  
Great. Thanks for that @shivaram.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-09 Thread nchammas
Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73612751
  
LGTM. Thanks for working on this @florianverhein.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread nchammas
Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73390259
  
I agree we don't want to put much effort into this feature since the use 
case isn't standard.

Maybe instead we can force the repo to be named `spark-ec2` by checking the 
input? In general I just prefer we over-restrict and avoid user 
confusion/frustration from untested code paths where possible.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73332095
  
Thanks @nchammas. I've tested only with a fork named `spark-ec2`. I relied 
on your comment/request re cloning from a different name (though I agree 
there's no reason why it shouldn't work, since git will just override the 
default directory name it clones into). 
I think it's feature creep beyond the scope of the original issue though, 
and I have a feeling that if someone forked to a different name, they might 
also want to change the paths in their altered spark-ec2 scripts to reflect 
that, which wouldn't work. So I'm not sure it's a feature that's worth much 
effort at this stage. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24278389
  
--- Diff: ec2/spark_ec2.py ---
@@ -1007,6 +1023,11 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Prevent breaking ami_prefix
--- End diff --

Sure, why not. Will wait for any other changes to this.
I think it's a pretty rare case, which currently would just fail at with 
Could not resolve AMI at:   But we may as well cover it. 
 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73409569
  
  [Test build #27036 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27036/consoleFull)
 for   PR 4385 at commit 
[`7beea2d`](https://github.com/apache/spark/commit/7beea2dedbe5fc609137b166eeb09aa0cc4ed7cb).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73409574
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27036/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73406716
  
  [Test build #27036 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27036/consoleFull)
 for   PR 4385 at commit 
[`7beea2d`](https://github.com/apache/spark/commit/7beea2dedbe5fc609137b166eeb09aa0cc4ed7cb).
 * This patch **does not merge cleanly**.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73464449
  
  [Test build #27089 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27089/consoleFull)
 for   PR 4385 at commit 
[`7e2b4be`](https://github.com/apache/spark/commit/7e2b4be993e4029204bee14647e0564de528c490).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73460952
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/27079/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73460944
  
  [Test build #27079 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27079/consoleFull)
 for   PR 4385 at commit 
[`8b653dc`](https://github.com/apache/spark/commit/8b653dcf64e6888ca167cc13f106bb05992911c0).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24311076
  
--- Diff: ec2/spark_ec2.py ---
@@ -1026,6 +1042,17 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Prevent breaking ami_prefix (/, .git and startswith checks)
+# Prevent forks with non spark-ec2 names for now.
+if opts.spark_ec2_git_repo.endswith(/) or \
+opts.spark_ec2_git_repo.endswith(.git) or \
+not opts.spark_ec2_git_repo.startswith(https://github.com;) 
or \
+not opts.spark_ec2_git_repo.endswith(spark-ec2):
+print  stderr, spark-ec2-git-repo must be a github repo and it 
must not have a  \
+ training / or .git.  \
--- End diff --

This should be `trailing` instead of `training` ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24311190
  
--- Diff: ec2/spark_ec2.py ---
@@ -1026,6 +1042,17 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Prevent breaking ami_prefix (/, .git and startswith checks)
+# Prevent forks with non spark-ec2 names for now.
+if opts.spark_ec2_git_repo.endswith(/) or \
+opts.spark_ec2_git_repo.endswith(.git) or \
+not opts.spark_ec2_git_repo.startswith(https://github.com;) 
or \
+not opts.spark_ec2_git_repo.endswith(spark-ec2):
+print  stderr, spark-ec2-git-repo must be a github repo and it 
must not have a  \
+ training / or .git.  \
--- End diff --

yep. typo. fixed - thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-7344
  
There may be flakiness in a test or in Jenkins. I think you can ignore 
failures from outside Python for now. The CI says it doesn't add classes, good. 
But the bigger question I see in the thread is whether it's worth making this 
change? or was part of this going to be reverted?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73442070
  
@srowen any feedback/tips about this?
I expect the merge issue, as there's been a commit to this file since my 
last commit on this fork.
When I run run-tests, it produces a few failures in Spark itself, which is 
strange. 
The only tests relevant to this change that I can see in dev/run-tests is 
lint-python. 
This patch won't add any public classes. It's just a small change to 
ec2/spark_ec2.py. Does this mean it will never pass the CI tests as they stand?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24308635
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
masterformat(rb=repo_branch)
--- End diff --

@nchammas Just realised that master is actually correct here, since it 
refers to the fact that this is happening on the master node (as per original 
log message and consistent with other log messages), rather than being a 
reference a branch name. 

I'll add it back in (but keep the .../tree/... change) unless you have any 
objections. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73457137
  
  [Test build #27079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/27079/consoleFull)
 for   PR 4385 at commit 
[`8b653dc`](https://github.com/apache/spark/commit/8b653dcf64e6888ca167cc13f106bb05992911c0).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73457613
  
@srowen @nchammas Ok, I've rebased on top of current master and added all 
requested changes as best I understand them (also minor change to a log line to 
as per comment in an earlier commit). Tested against regressions without the 
new args, as well as with them (both repo and branch args), successfully 
bringing up clusters in both cases. Passes /dev/lint-python (I believe there 
are no additional tests for ec2/*.py right?).
I think it's ready to go if @nchammas gives the green light (and fingers 
crossed for Jenkins).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73442633
  
@srowen The proposed change is definitely worth making. There was a small 
extension/generalisation suggested/requested by @nchammas, which as you saw we 
were discussing above. However I think we agree that this extension is out of 
scope - so we're just finalising.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73443768
  
@nchammas I can revert the input check back to what I had earlier to 
enforce `spark-ec2` only. We can remove the check later if needed. 
Given the help string and the type of user who would try forks with 
different names, I think it's over-cautious. But I'm more concerned with 
getting this finalised so really have no strong opinions. You can decide ;-)

Apart from this (please let me know what you'd like) and the github input 
check you mentioned above, is there anything else required? We've done a full 
circle and the testing loop is expensive (time to spin up EC2 clusters and 
manually check, etc), so I'd really like to get this PR out of the way.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-08 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73443784
  
OK, rebase your branch and push any other changes to it, and let's see that 
the python style checks and tests pass, and hopefully Jenkins does too. If it 
looks good and @nchammas agrees we can merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-06 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24264051
  
--- Diff: ec2/spark_ec2.py ---
@@ -1007,6 +1023,11 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Prevent breaking ami_prefix
--- End diff --

If we're validating this input, perhaps we should also check that the repo 
string starts with https://github.com;. Sounds good?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-06 Thread nchammas
Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73295741
  
@florianverhein This is looking good. Have you tested this against a fork 
named `spark-ec2` as well as a fork named something else?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24221789
  
--- Diff: ec2/spark_ec2.py ---
@@ -145,6 +145,14 @@ def parse_args():
 default=DEFAULT_SPARK_GITHUB_REPO,
 help=Github repo from which to checkout supplied commit hash 
(default: %default))
 parser.add_option(
+--spark-ec2-git-repo,
+default=DEFAULT_SPARK_EC2_GITHUB_REPO,
+help=Github repo from which to checkout spark-ec2 (default: 
%default))
+parser.add_option(
+--spark-ec2-branch,
--- End diff --

I'll do the former so that it's consistent with the existing 
--spark-git-repo arg


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread shivaram
Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73185820
  
Thanks @florianverhein for the change - This is a pretty useful change as I 
often modify these variables inline for my experiments. 

@nchammas @JoshRosen could you take a look at the python style changes and 
make sure they are okay ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24224925
  
--- Diff: ec2/spark_ec2.py ---
@@ -1007,6 +1022,14 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Limit naming to avoid breaking things, as we rely on the repo
--- End diff --

Sorry, misread/understood. Will change and test. May as well keep the CLI 
check, but use it to  ensure there's no trailing / or .git. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24222907
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
masterformat(rb=repo_branch)
 ssh(
 host=master,
 opts=opts,
 command=rm -rf spark-ec2
 +   
-+ git clone https://github.com/mesos/spark-ec2.git -b 
{b}.format(b=MESOS_SPARK_EC2_BRANCH)
++ git clone {rb}.format(rb=repo_branch)
--- End diff --

Good point. I'll enforce this by checking the CLI option.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24222849
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
masterformat(rb=repo_branch)
--- End diff --

missed that, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73185215
  
Thanks for prompt feedback @nchammas. Much appreciated. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24224602
  
--- Diff: ec2/spark_ec2.py ---
@@ -1007,6 +1022,14 @@ def real_main():
 print  stderr, ebs-vol-num cannot be greater than 8
 sys.exit(1)
 
+# Limit naming to avoid breaking things, as we rely on the repo
--- End diff --

Can't we allow people to name the repo whatever they want? All we need to 
do is make sure the repo gets cloned to `spark-ec2` regardless of what it is 
actually called, no?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24207640
  
--- Diff: ec2/spark_ec2.py ---
@@ -145,6 +145,14 @@ def parse_args():
 default=DEFAULT_SPARK_GITHUB_REPO,
 help=Github repo from which to checkout supplied commit hash 
(default: %default))
 parser.add_option(
+--spark-ec2-git-repo,
+default=DEFAULT_SPARK_EC2_GITHUB_REPO,
+help=Github repo from which to checkout spark-ec2 (default: 
%default))
+parser.add_option(
+--spark-ec2-branch,
+default=DEFAULT_SPARK_EC2_BRANCH,
+help=Spark-ec2 branch to use (default: %default))
--- End diff --

Nit: Lowercase `spark-ec2`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24208073
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
--- End diff --

Can't we keep this as part of the `ssh(...)` call? Or does the line become 
too long / hard to format?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24207944
  
--- Diff: ec2/spark_ec2.py ---
@@ -330,7 +338,10 @@ def get_spark_ami(opts):
 print  stderr,\
 Don't recognize %s, assuming type is pvm % opts.instance_type
 
-ami_path = %s/%s/%s % (AMI_PREFIX, opts.region, instance_type)
+# URL prefix from which to fetch AMI information
+ami_prefix = 
{r}/{b}/ami-list.format(r=opts.spark_ec2_git_repo.replace(github,raw.github),
 b=opts.spark_ec2_branch)
--- End diff --

What if GitHub is using Spark and they have a copy under `github/spark-ec2`?

I'd say to be safe/defensive, match on `https://github.com` and only at the 
very start of the string.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24208401
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
master.format(rb=repo_branch)
--- End diff --

This print looks incorrect.

 Cloning spark-ec2 scripts from https://github.com/mesos/spark-ec2 -b 
branch-1.3 on master

Should probably instead be something like:

 Cloning spark-ec2 scripts from 
https://github.com/mesos/spark-ec2/tree/branch-1.3 ...

Also, formatting nit: Typically we use 3 dots `...` to end statements like 
these.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73156805
  
Thank @florianverhein for submitting this. It should be useful. I know I've 
manually edited `spark_ec2.py` several times to point it to different places.

cc @shivaram 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24218617
  
--- Diff: ec2/spark_ec2.py ---
@@ -145,6 +145,14 @@ def parse_args():
 default=DEFAULT_SPARK_GITHUB_REPO,
 help=Github repo from which to checkout supplied commit hash 
(default: %default))
 parser.add_option(
+--spark-ec2-git-repo,
+default=DEFAULT_SPARK_EC2_GITHUB_REPO,
+help=Github repo from which to checkout spark-ec2 (default: 
%default))
+parser.add_option(
+--spark-ec2-branch,
+default=DEFAULT_SPARK_EC2_BRANCH,
+help=Spark-ec2 branch to use (default: %default))
--- End diff --

Haha, I was debating this. Changed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24218680
  
--- Diff: ec2/spark_ec2.py ---
@@ -330,7 +338,10 @@ def get_spark_ami(opts):
 print  stderr,\
 Don't recognize %s, assuming type is pvm % opts.instance_type
 
-ami_path = %s/%s/%s % (AMI_PREFIX, opts.region, instance_type)
+# URL prefix from which to fetch AMI information
+ami_prefix = 
{r}/{b}/ami-list.format(r=opts.spark_ec2_git_repo.replace(github,raw.github),
 b=opts.spark_ec2_branch)
--- End diff --

Good point. 
replace doesn't seem to support regexes, so to keep things simple I matched 
like you suggested, and limited to 1 match


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24218731
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
--- End diff --

No reason for it other than I use it twice


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread florianverhein
Github user florianverhein commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24218853
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
master.format(rb=repo_branch)
--- End diff --

I wanted to keep it closer to the actual clone command used. 
I don't know git well enough to be certain that it's always equivalent to 
the {r}/tree/{b} pattern.
But happy to change. Just let me know. 

fixed the 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24219625
  
--- Diff: ec2/spark_ec2.py ---
@@ -145,6 +145,14 @@ def parse_args():
 default=DEFAULT_SPARK_GITHUB_REPO,
 help=Github repo from which to checkout supplied commit hash 
(default: %default))
 parser.add_option(
+--spark-ec2-git-repo,
+default=DEFAULT_SPARK_EC2_GITHUB_REPO,
+help=Github repo from which to checkout spark-ec2 (default: 
%default))
+parser.add_option(
+--spark-ec2-branch,
+default=DEFAULT_SPARK_EC2_BRANCH,
+help=spark-ec2 branch to use (default: %default))
--- End diff --

Another super nit here: Is it a spark-ec2 branch or a GitHub repo branch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24219572
  
--- Diff: ec2/spark_ec2.py ---
@@ -145,6 +145,14 @@ def parse_args():
 default=DEFAULT_SPARK_GITHUB_REPO,
 help=Github repo from which to checkout supplied commit hash 
(default: %default))
 parser.add_option(
+--spark-ec2-git-repo,
+default=DEFAULT_SPARK_EC2_GITHUB_REPO,
+help=Github repo from which to checkout spark-ec2 (default: 
%default))
+parser.add_option(
+--spark-ec2-branch,
--- End diff --

Nit: This should probably be `--spark-ec2-git-branch` to match 
`--spark-ec2-git-repo`.

Or alternately, `--spark-ec2-repo` and `--spark-ec2-branch`, since we don't 
take any git repo (it has to be on GitHub) and we call that out in the help 
text.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24219713
  
--- Diff: ec2/spark_ec2.py ---
@@ -330,7 +338,10 @@ def get_spark_ami(opts):
 print  stderr,\
 Don't recognize %s, assuming type is pvm % opts.instance_type
 
-ami_path = %s/%s/%s % (AMI_PREFIX, opts.region, instance_type)
+# URL prefix from which to fetch AMI information
+ami_prefix = 
{r}/{b}/ami-list.format(r=opts.spark_ec2_git_repo.replace(https://github.com,https://raw.github.com,1),
 b=opts.spark_ec2_branch)
--- End diff --

Looks good. Does `dev/lint-python` pass on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24219781
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
masterformat(rb=repo_branch)
--- End diff --

We're saying the branch twice here, once in `repo_branch`, and then again 
with the hard coded `master` in this string. I don't think it's correct.

Something that says:

 Cloning spark-ec2 scripts from 
https://github.com/mesos/spark-ec2/tree/branch-1.3 ...

should be fine since that's the convention that GitHub follows and this is 
just an info message anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-05 Thread nchammas
Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24219900
  
--- Diff: ec2/spark_ec2.py ---
@@ -643,12 +654,14 @@ def setup_cluster(conn, master_nodes, slave_nodes, 
opts, deploy_ssh_key):
 
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
+repo_branch={r} -b {b}.format(r=opts.spark_ec2_git_repo, 
b=opts.spark_ec2_branch)
+print Cloning spark-ec2 scripts from {rb} on 
masterformat(rb=repo_branch)
 ssh(
 host=master,
 opts=opts,
 command=rm -rf spark-ec2
 +   
-+ git clone https://github.com/mesos/spark-ec2.git -b 
{b}.format(b=MESOS_SPARK_EC2_BRANCH)
++ git clone {rb}.format(rb=repo_branch)
--- End diff --

I think we need to change this to always clone to `spark-ec2`.

Otherwise, if people fork `spark-ec2` to something with a different name, 
the cloned repo will have a random name and stuff will break like the `rm -rf` 
command a few lines up.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-72996393
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-04 Thread florianverhein
GitHub user florianverhein opened a pull request:

https://github.com/apache/spark/pull/4385

[SPARK-5611] [EC2] Allow spark-ec2 repo and branch to be set on CLI of 
spark_ec2.py

and by extension, the ami-list

Useful for using alternate spark-ec2 repos or branches. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/florianverhein/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4385.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4385


commit 811138824bf23de0e42a5786b729cfea2261222f
Author: Florian Verhein florian.verh...@gmail.com
Date:   2015-02-05T05:18:03Z

[SPARK-5611] [EC2] Allow spark-ec2 repo and branch to be set on CLI of 
spark_ec2.py




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org