[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-16 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55793191
  
Thanks!  I've merged this to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1592





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-16 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55793986
  
Great! Thanks





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55687685
  
Now that #2385 has been merged, this looks like it will be ready to merge 
as soon as you rebase it on top of master.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-15 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55693317
  
Ok, just merged with master.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55504491
  
SQL LGTM.

@JoshRosen is this ready to go?





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/1592#discussion_r17514641
  
--- Diff: python/pyspark/context.py ---
@@ -36,6 +37,65 @@
 
 from py4j.java_collections import ListConverter
 
+__all__ = ["JavaStackTrace", "SparkContext"]
--- End diff --

This file sets `__all__` further down on line 100; we should only set it 
once.  Actually, do you mind just moving the `extract_concise_traceback` stuff 
to its own file?  There's actually an open JIRA ticket for this 
([SPARK-1087](https://issues.apache.org/jira/browse/SPARK-1087)) and if you're 
going to touch the traceback code anyways now seems like a good time to do this 
refactoring.  You could name the file something like `traceback_utils.py`.
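
As a rough illustration of why `__all__` should be set only once: when a module assigns it twice, the later assignment silently replaces the earlier one. The sketch below uses a hypothetical in-memory module (`fake_context` is an assumption made for this example, not the real `context.py`) to show the effect.

```python
import types

# Hypothetical module source with two __all__ assignments; the class names
# mirror the diff above, but this is not the actual PySpark file.
source = '''
__all__ = ["JavaStackTrace", "SparkContext"]

class JavaStackTrace(object):
    pass

__all__ = ["SparkContext"]  # a later assignment replaces the first one

class SparkContext(object):
    pass
'''

# Execute the source inside a fresh module object.
module = types.ModuleType("fake_context")
exec(source, module.__dict__)

# Only the last assignment survives, so JavaStackTrace is no longer exported.
exported = list(module.__all__)
print(exported)
```

With two assignments, `from fake_context import *` would expose only the names in the final list, which is easy to miss after a merge.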





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/1592#discussion_r17514658
  
--- Diff: python/pyspark/rdd.py ---
@@ -704,7 +651,8 @@ def collect(self):
         """
         Return a list that contains all of the elements in this RDD.
         """
-        with _JavaStackTrace(self.context) as st:
+        from pyspark.context import JavaStackTrace
--- End diff --

Can you place this import at the top of the file, alongside the other 
PySpark imports?  If you placed it here to avoid a circular import, then this 
is further evidence that the traceback functions belong in their own file.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/1592#discussion_r17514679
  
--- Diff: python/pyspark/sql.py ---
@@ -1624,15 +1636,40 @@ def count(self):
         return self._jschema_rdd.count()
 
     def collect(self):
-        """
-        Return a list that contains all of the rows in this RDD.
+        """Return a list that contains all of the rows in this RDD.
 
-        Each object in the list is on Row, the fields can be accessed as
+        Each object in the list is a Row, the fields can be accessed as
         attributes.
+
+        Unlike the base RDD implementation of collect, this implementation
+        leverages the query optimizer to perform a collect on the SchemaRDD,
+        which supports features such as filter pushdown.
+
+        >>> srdd = sqlCtx.inferSchema(rdd)
+        >>> srdd.collect()
+        [Row(field1=1, field2=u'row1'), ..., Row(field1=3, field2=u'row3')]
         """
-        rows = RDD.collect(self)
+        from pyspark.context import JavaStackTrace
--- End diff --

Same here; this import should be at the top of the file.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55507731
  
@staple @marmbrus Aside from my comments on moving the traceback functions 
into their own file, this looks good to me.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-13 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55516200
  
Thanks for taking a look, guys.

Hmm, it looks like the duplicate __all__ variables result from a recent 
merge. I went ahead and created a separate PR for SPARK-1087, to put the 
traceback code in its own file: https://github.com/apache/spark/pull/2385. Once 
that’s merged to master, I’ll circle back to finish up this PR, making the 
import changes you requested.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55306732
  
@staple The Jenkins pull request builder is in an odd state of flux right 
now.  I've manually re-triggered your build (I should have self-service 
"retest this please" working more consistently by sometime next week).





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55306927
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/71/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55307181
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20165/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55307274
  
@JoshRosen Great, thanks for your help!





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55321104
  
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/71/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaStackTrace(object):`






[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55321504
  
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20165/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch **passes** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaStackTrace(object):`






[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55227656
  
Thanks! I'm not getting any more NPEs now. I went ahead and merged.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55235315
  
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/67/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55244823
  
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/67/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd).
 * This patch **fails** unit tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaStackTrace(object):`






[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55283908
  
Hi, the failing test was CheckpointSuite / 'recovery with file input 
stream'. This test passed when I ran the tests locally, and it sometimes fails 
spuriously according to this ticket:

'flaky test case in streaming.CheckpointSuite'
https://issues.apache.org/jira/browse/SPARK-1600





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-11 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55283980
  
I'm assuming I don't have permission to ask jenkins to run a test myself, 
right?





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-10 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55216993
  
I just merged #2323 to master.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-09 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-55003590
  
Great! I'll go ahead and merge once #2323 is in.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-54900606
  
#2323 seems to fix the Kryo error for me.  Let me know if you have further 
issues after updating this to master.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-07 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-54760558
  
Hmm, that kryo error is unfortunate.  We'll probably need to add a special 
serializer in SparkSqlSerializer to work around this.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-54694556
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-09-02 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-54252457
  
Hey @staple, if you have time to update this, it would be great to include it.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-08-03 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-51014567
  
Hi folks, I’ve merged with the most recent code (pushed to my branch), but 
since that merge I am getting NPEs in Kryo for schemas containing array 
data type fields in the sql.py tests. I’m away from home, with no real dev 
system and spotty internet access until Thursday, so unfortunately I think 
it’s impractical to diagnose the problem until then. Sorry for the delay.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-08-02 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50978374
  
This seems to have captured a bunch of unrelated changes during the rebase.





[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-08-02 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50981508
  
Sorry, I'm away from home and had limited time and access to attempt the 
merge last night. I didn't finish it and, as you mentioned, messed up the 
included commits. I'll post an explicit comment here when the merge is ready.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-08-01 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50947257
  
@staple Could you rebase this PR on top of PR-1598? It is very close to being merged.




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-31 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50722408
  
@staple Could you hold off for one more day until we merge the changes in 
#1598 about serialization between Java and Python?




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-31 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50722658
  
Sure, no problem.




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-30 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50577338
  
Sure, I’m fine with reworking based on other changes (it seems that some 
merge conflicts have already cropped up in master since I submitted my PR last 
week). I think my change set is a little simpler than the one you linked to, so 
would it make sense for me to wait until that one goes in?

I also thought I’d add a couple of notes on what I had in mind with this 
patch:

1) I added a new Row serialization pathway between python and java, based 
on JList[Array[Byte]] versus the existing RDD[Array[Byte]]. I wasn’t 
overjoyed about doing this, but I noticed that some QueryPlans implement 
optimizations in executeCollect(), which outputs an Array[Row] rather than the 
typical RDD[Row] that can be shipped to python using the existing serialization 
code. To me it made sense to ship the Array[Row] over to python directly 
instead of converting it back to an RDD[Row] just for the purpose of sending 
the Rows to python using the existing serialization code. But let me know if 
you have any thoughts about this.

2) I moved JavaStackTrace from rdd.py to context.py. This made sense to me 
since JavaStackTrace is all about configuring a context attribute, and the 
_extract_concise_traceback function it depends on was already being called 
separately from context.py (as a ‘private’ function of rdd.py).
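
The pathway described in note 1 can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the patch: `Row`, `jvm_execute_collect`, and `collect_rows` are hypothetical stand-ins, and `pickle` is assumed here only as a convenient byte encoding. The point is that already-collected rows arrive as a list of byte strings and are decoded directly, with no conversion back to an RDD.

```python
import pickle
from collections import namedtuple

# Hypothetical stand-in for a SQL Row; not the PySpark class.
Row = namedtuple("Row", ["field1", "field2"])


def jvm_execute_collect(rows):
    """Stand-in for the JVM side: hand back already-collected rows as a list
    of byte strings (the role JList[Array[Byte]] plays in the note above)."""
    return [pickle.dumps(tuple(r), protocol=2) for r in rows]


def collect_rows(serialized):
    """Python side: decode each byte string into a Row directly."""
    return [Row(*pickle.loads(b)) for b in serialized]


payload = jvm_execute_collect([Row(1, "row1"), Row(2, "row2")])
rows = collect_rows(payload)
print(rows)
```

The design choice being illustrated is that a result which is already a local array does not need to be redistributed just to reuse the RDD-based serialization code.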




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-30 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50577608
  
Yeah, I'm hoping to merge #1346 as soon as it passes Jenkins, so I'd wait 
for that.

> I also thought I’d add a couple of notes on what I had in mind with 
> this patch: ...

Can you add these notes to the PR description so that they get included in 
the commit message?




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-30 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50577874
  
Sure, added the notes.




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-30 Thread staple
Github user staple commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50705813
  
Hi, I've updated the patch to work with the new code in master.




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-29 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50576177
  
Thanks for working on this!  We'll need to coordinate merging with #1346 
and related PRs. (cc @yhuai)

@JoshRosen can you look at the other pyspark changes?




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1592#issuecomment-50196980
  
Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1421#issuecomment-49085915
  
Jenkins, test this please.




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1421#issuecomment-49086561
  
QA tests have started for PR 1421. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16691/consoleFull




[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...

2014-07-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1421#issuecomment-49099033
  
QA results for PR 1421:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16691/consoleFull

