[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51133444 lol, it was worth a shot. Anyway, I just noticed that there are a few line break fixes I missed. Gonna submit one more commit for that. Sorry about the spam.

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51134952 OK, I've submitted the final set of fixes to undo the damage caused by calling `autopep8` with the wrong settings. Once tests pass for this, I'd say it's ready.

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51135706 Jenkins, retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51139536 Hmm, I'm not sure why this latest round of tests failed.

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51139547 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51143491 I don't see how these test failures might be related to the changes introduced in this PR. I see that the issue @JoshRosen called out earlier here has been [resolved

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-03 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-50983745 Hey Pat! 1. I've edited the title accordingly. 2. Makes sense. I'll take a crack at fetching `pep8` lazily as you describe. Is there something you can point

[GitHub] spark pull request: [SPARK-2627] [PySpark] have the build enforce ...

2014-08-03 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1744#issuecomment-51001657 Jenkins, my man, retest this please.

[GitHub] spark pull request: [SPARK-1981] Add AWS Kinesis streaming support

2014-08-02 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1434#issuecomment-50974810 This was an epic pull request. Nice work, people.

[GitHub] spark pull request: [SPARK-2627] have the build enforce PEP 8 auto...

2014-08-02 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/1744 [SPARK-2627] have the build enforce PEP 8 automatically As described in [SPARK-2627](https://issues.apache.org/jira/browse/SPARK-2627), we'd like Python code to automatically be checked for PEP 8
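The build check described in this PR runs the `pep8` tool over the Python sources. As a rough illustration of the kind of rule such a step enforces, here is a toy version of the 79-character line-length check; this is an invented sketch for illustration, not the actual tool or the PR's script.

```python
# Toy version of one PEP 8 rule (the 79-character line limit), in the
# spirit of what a pep8 lint step in the build would enforce.
# Illustrative only; the real check is done by the `pep8` tool.
def find_long_lines(source, max_length=79):
    """Return (line_number, length) for each line exceeding max_length."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > max_length
    ]

sample = "short_line = 1\n" + "long_line = " + "'x' + " * 20 + "'x'\n"
print(find_long_lines(sample))  # flags only the second line
```

The real build step fails the CI run when any such violation is reported, which is what makes the style check enforceable rather than advisory.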

[GitHub] spark pull request: [SPARK-2141] Adding getPersistentRddIds and un...

2014-07-28 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1082#issuecomment-50351593 Hi folks, Kan asked me to comment here on my use case for this PySpark call since I reported the [original JIRA issue](https://issues.apache.org/jira/browse/SPARK-2141)

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-22 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49760182 do you mind submitting a pep8 checker as part of Jenkins? Will do. I won't be able to work on this today, but I will open a separate PR for this this week

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-22 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49791186 I've created [SPARK-2627](https://issues.apache.org/jira/browse/SPARK-2627) to track the Jenkins/CI part of this work. I'll post future updates there.

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/1505#discussion_r15198830

--- Diff: python/pyspark/cloudpickle.py ---
@@ -55,7 +55,7 @@
 import dis
 import traceback
-#relevant opcodes
+# relevant opcodes
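The one-character change in the diff above is the PEP 8 comment-spacing rule (a comment's `#` should be followed by a space; the `pep8` tool reports this family of issues as E262/E265). A toy checker for that rule, as an invented sketch rather than the real implementation:

```python
# Hypothetical mini-check in the spirit of pep8's comment-spacing rule:
# a comment should start with '# ' (hash followed by a space).
# Illustrative only; the real checks are pep8's E262/E265.
def comment_spacing_issues(source):
    """Return line numbers of comments lacking a space after '#'."""
    issues = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip()
        # '#!' is exempted so shebang lines are not flagged.
        if stripped.startswith("#") and not stripped.startswith(("# ", "#!")):
            issues.append(number)
    return issues

print(comment_spacing_issues("#relevant opcodes\n# relevant opcodes\n"))
```

Running it on the two comment variants from the diff flags only the first, pre-fix form.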

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49671550 @davies PEP 8 recommends using [implied line continuation over backslashes](http://legacy.python.org/dev/peps/pep-0008/#maximum-line-length) where possible. It appears
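The PEP 8 guidance referenced above can be shown side by side; the expressions and names here are invented for illustration:

```python
# Discouraged: breaking a long expression with a backslash.
total_with_backslash = 1 + 2 + \
    3 + 4

# Preferred: implied line continuation inside parentheses,
# per PEP 8's "maximum line length" section.
total_with_parens = (1 + 2 +
                     3 + 4)

print(total_with_backslash, total_with_parens)
```

Both forms evaluate identically; the parenthesized form is preferred because it survives edits (a stray space after a backslash silently breaks the continuation).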

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/1505#discussion_r15200778

--- Diff: python/pyspark/cloudpickle.py ---
@@ -55,7 +55,7 @@
 import dis
 import traceback
-#relevant opcodes
+# relevant opcodes

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49678538 OK, I've fixed the problems you pointed out, reverted the changes to cloudpickle, and confirmed that `python/run-tests` passes.

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49689180 Reynold, is there anything else we need to clean up before we can have `pep8` checks become part of the CI cycle? Also, it sounds like we want to make

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-21 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1505#issuecomment-49698547 If we have to, we could probably somehow package `pep8` and its dependencies as a standalone. It's doable but I think also a bit ugly and harder to update

[GitHub] spark pull request: [SPARK-2470] PEP8 fixes to PySpark

2014-07-20 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/1505 [SPARK-2470] PEP8 fixes to PySpark This pull request aims to resolve all outstanding PEP8 violations in PySpark. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: name ec2 instances and security groups consist...

2014-07-10 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1344#issuecomment-48637382 OK, sounds good to me!

[GitHub] spark pull request: name ec2 instances and security groups consist...

2014-07-09 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1344#issuecomment-48496508 For what it's worth, I think `spark-ec2` _should_ somehow tag all the instances it launches with the word Spark so that they are easy to find in the EC2 console

[GitHub] spark pull request: name ec2 instances and security groups consist...

2014-07-09 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1344#issuecomment-48514129 If users want to select all spark clusters, can't they just name their clusters spark-XXX? Absolutely, and I think that's fine. I would worry a bit

[GitHub] spark pull request: name ec2 instances and security groups consist...

2014-07-09 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1344#issuecomment-48566678 @pwendell - Should I specifically mention you when responding, or is that not necessary? Just making sure you got my response. Anyway, in summary, I think

[GitHub] spark pull request: [SPARK-2065] give launched instances names

2014-06-10 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/1043 [SPARK-2065] give launched instances names This update resolves [SPARK-2065](https://issues.apache.org/jira/browse/SPARK-2065). It gives launched EC2 instances descriptive names by using instance

[GitHub] spark pull request: [SPARK-2065] give launched instances names

2014-06-10 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1043#issuecomment-45684503 Caveat: I had trouble testing this with spot instances. For some reason, even when I set my spot price really high, I could not get Amazon to grant me spot instances

[GitHub] spark pull request: [SPARK-2065] give launched instances names

2014-06-10 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/1043#issuecomment-45696914 Ah, I see. :) Thank you for the gentle pointer. I've fixed 2 PEP8 violations in my patch, as well as 1 PEP8 violation elsewhere.

[GitHub] spark pull request: updated link to mailing list

2014-05-30 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/923 updated link to mailing list You can merge this pull request into a Git repository by running: $ git pull https://github.com/nchammas/spark patch-1 Alternatively you can review and apply

[GitHub] spark pull request: [SPARK-1308] Add partitions() method to PySpar...

2014-04-22 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/218#issuecomment-41060733 Dunno if I've done this correctly, but I've rebased my fork and changed this method per our discussion here. The change list seems to contain everything I merged

[GitHub] spark pull request: [SPARK-1308] Add partitions() method to PySpar...

2014-04-22 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/218#issuecomment-41062176 Ah, looks like I have indeed [messed up this PR](http://stackoverflow.com/q/6102297/877069). I will revisit this issue later.

[GitHub] spark pull request: [SPARK-1308] Add partitions() method to PySpar...

2014-04-22 Thread nchammas
Github user nchammas closed the pull request at: https://github.com/apache/spark/pull/218

[GitHub] spark pull request: [SPARK-1308] Add partitions() method to PySpar...

2014-03-24 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/218 [SPARK-1308] Add partitions() method to PySpark RDDs I've added the `partitions()` method per [the discussion here](http://apache-spark-user-list.1001560.n3.nabble.com/How-many-partitions-is-my-RDD

[GitHub] spark pull request: [SPARK-1308] Add partitions() method to PySpar...

2014-03-24 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/218#discussion_r10913578

--- Diff: python/pyspark/rdd.py ---
@@ -202,6 +202,12 @@ def flatMap(self, f, preservesPartitioning=False):
         def func(s, iterator): return
