[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55793191 Thanks! I've merged this to master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1592
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55793986 Great! Thanks
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55687685 Now that #2385 has been merged, this looks like it will be ready to merge as soon as you rebase it on top of master.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55693317 Ok, just merged with master.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55504491 SQL LGTM. @JoshRosen is this ready to go?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1592#discussion_r17514641 --- Diff: python/pyspark/context.py --- @@ -36,6 +37,65 @@ from py4j.java_collections import ListConverter +__all__ = [JavaStackTrace, SparkContext] --- End diff -- This file sets `__all__` further down on line 100; we should only set it once. Actually, do you mind just moving the `extract_concise_traceback` stuff to its own file? There's actually an open JIRA ticket for this ([SPARK-1087](https://issues.apache.org/jira/browse/SPARK-1087)) and if you're going to touch the traceback code anyways now seems like a good time to do this refactoring. You could name the file something like `traceback_utils.py`.
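Editorial sketch: the comment above suggests giving the traceback helpers their own module. The snippet below is a rough idea of what such a standalone helper could look like, built only on the standard library. It is not the PySpark implementation; `CallSite` and `first_frame_outside` are hypothetical names chosen for illustration.

```python
import os
import traceback
from collections import namedtuple

# Hypothetical record describing where user code called into a library.
CallSite = namedtuple("CallSite", "function file linenum")


def first_frame_outside(package_dir):
    """Return the innermost stack frame whose file is not under package_dir.

    Returns None if every frame on the stack lives inside package_dir.
    """
    package_dir = os.path.abspath(package_dir)
    # Drop the last entry, which is this helper's own frame.
    stack = traceback.extract_stack()[:-1]
    # Walk from the innermost caller outwards.
    for filename, lineno, func, _text in reversed(stack):
        if not os.path.abspath(filename).startswith(package_dir + os.sep):
            return CallSite(function=func, file=filename, linenum=lineno)
    return None
```

Because a helper like this depends only on the standard library, both `context.py` and `rdd.py` could import it at the top of the file without importing each other.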
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1592#discussion_r17514658 --- Diff: python/pyspark/rdd.py --- @@ -704,7 +651,8 @@ def collect(self): Return a list that contains all of the elements in this RDD. -with _JavaStackTrace(self.context) as st: +from pyspark.context import JavaStackTrace --- End diff -- Can you place this import at the top of the file, alongside the other PySpark imports? If you placed it here to avoid a circular import, then this is further evidence that the traceback functions belong in their own file.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1592#discussion_r17514679 --- Diff: python/pyspark/sql.py --- @@ -1624,15 +1636,40 @@ def count(self): return self._jschema_rdd.count() def collect(self): - -Return a list that contains all of the rows in this RDD. +Return a list that contains all of the rows in this RDD. -Each object in the list is on Row, the fields can be accessed as +Each object in the list is a Row, the fields can be accessed as attributes. + +Unlike the base RDD implementation of collect, this implementation +leverages the query optimizer to perform a collect on the SchemaRDD, +which supports features such as filter pushdown. + + srdd = sqlCtx.inferSchema(rdd) + srdd.collect() +[Row(field1=1, field2=u'row1'), ..., Row(field1=3, field2=u'row3')] -rows = RDD.collect(self) +from pyspark.context import JavaStackTrace --- End diff -- Same here; this import should be at the top of the file.
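Editorial sketch: the docstring in the diff above says the overridden `collect` goes through the query optimizer instead of the base RDD path. The toy model below shows that kind of override in plain Python. Every class and method here (`PlainRDD`, `PlannedRDD`, `ToyPlan`, `execute_collect`) is a hypothetical stand-in, not PySpark or Spark SQL API, and the "optimization" is reduced to honouring a row limit.

```python
class ToyPlan(object):
    """Hypothetical query plan that can stop early when given a limit."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_scanned = 0  # bookkeeping so the saving is observable

    def execute_collect(self, limit=None):
        wanted = self.rows if limit is None else self.rows[:limit]
        self.rows_scanned += len(wanted)
        return list(wanted)


class PlainRDD(object):
    """Stand-in for a base RDD: collect() reads every partition."""

    def __init__(self, partitions):
        self.partitions = partitions

    def collect(self):
        return [row for part in self.partitions for row in part]

    def take(self, n):
        # Naive version: materialize everything, then slice.
        return self.collect()[:n]


class PlannedRDD(PlainRDD):
    """Stand-in for a plan-backed RDD that overrides collect and take."""

    def __init__(self, partitions, plan):
        super(PlannedRDD, self).__init__(partitions)
        self.plan = plan

    def collect(self):
        # Delegate to the plan so any plan-level optimization applies.
        return self.plan.execute_collect()

    def take(self, n):
        return self.plan.execute_collect(limit=n)
```

In this toy, `PlannedRDD.take(2)` touches two rows, while the inherited `PlainRDD.take(2)` would read every partition first.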
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55507731 @staple @marmbrus Aside from my comments on moving the traceback functions into their own file, this looks good to me.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55516200 Thanks for taking a look, guys. Hmm, it looks like the duplicate __all__ variables result from a recent merge. I went ahead and created a separate PR for SPARK-1087, to put the traceback code in its own file: https://github.com/apache/spark/pull/2385. Once that's merged to master I'll circle back to finish up this PR, making the include changes you requested.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55306732 @staple The Jenkins pull request builder is in an odd state of flux right now. I've manually re-triggered your build (I should have self-service "retest this please" working more consistently by sometime next week).
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55306927 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/71/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55307181 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20165/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55307274 @JoshRosen Great, thanks for your help!
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55321104 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/71/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaStackTrace(object):`
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55321504 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20165/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaStackTrace(object):`
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55227656 Thanks!, I'm not getting any more NPEs now. I went ahead and merged.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55235315 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/67/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55244823 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/67/consoleFull) for PR 1592 at commit [`e3b802f`](https://github.com/apache/spark/commit/e3b802fc4483259e724fd4c2c3d7e1b338b550fd). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaStackTrace(object):`
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55283908 Hi, the failing test was CheckpointSuite / 'recovery with file input stream'. This test passed when I ran the tests locally, and it sometimes fails spuriously according to this ticket: 'flaky test case in streaming.CheckpointSuite' https://issues.apache.org/jira/browse/SPARK-1600
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55283980 I'm assuming I don't have permission to ask jenkins to run a test myself, right?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55216993 I just merged #2323 to master.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-55003590 Great! I'll go ahead and merge once #2323 is in.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-54900606 #2323 seems to fix the Kryo error for me. Let me know if you have further issues after updating this to master.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-54760558 Hmm, that kryo error is unfortunate. We'll probably need to add a special serializer in SparkSqlSerializer to work around this.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-54694556 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-54252457 Hey @staple if you have time to update this would be great to include it.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-51014567 Hi folks, I've merged with the most recent code (pushed to my branch), but for some reason with the most recent merge I am getting NPEs in Kryo for schemas containing array data type fields in the sql.py tests. I'm away from home, with no real dev system and spotty internet access until Thursday, so unfortunately I think it's impractical to diagnose the problem until then. Sorry for the delay.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50978374 This seems to have captured a bunch of unrelated changes during the rebase.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50981508 Sorry, I'm away from home and had limited time / access to try and do the merge last night - which I didn't finish, and as you mentioned messed up the included commits. I'll post an explicit comment here when the merge is ready.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50947257 @staple could you rebase this pr to PR-1598? it gets very close to merge.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50722408 @staple Could you hold it one more day until we merge the changes in #1598 about serialization between Java and Python?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50722658 Sure, no problem.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50577338 Sure, I'm fine with reworking based on other changes (it seems that some merge conflicts have already cropped up in master since I submitted my PR last week). I think my change set is a little simpler than the one you linked to, so would it make sense for me to wait until that one goes in? I also thought I'd add a couple of notes on what I had in mind with this patch: 1) I added a new Row serialization pathway between python and java, based on JList[Array[Byte]] versus the existing RDD[Array[Byte]]. I wasn't overjoyed about doing this, but I noticed that some QueryPlans implement optimizations in executeCollect(), which outputs an Array[Row] rather than the typical RDD[Row] that can be shipped to python using the existing serialization code. To me it made sense to ship the Array[Row] over to python directly instead of converting it back to an RDD[Row] just for the purpose of sending the Rows to python using the existing serialization code. But let me know if you have any thoughts about this. 2) I moved JavaStackTrace from rdd.py to context.py. This made sense to me since JavaStackTrace is all about configuring a context attribute, and the _extract_concise_traceback function it depends on was already being called separately from context.py (as a "private" function of rdd.py).
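Editorial sketch: note 2 above describes JavaStackTrace as a helper that configures an attribute on the context. A minimal context manager with that shape is shown below, assuming a context object with a single `call_site` attribute. `FakeContext`, `CallSiteScope`, and `call_site` are hypothetical names for illustration; the real class talks to the JVM-side context and is not reproduced here.

```python
class FakeContext(object):
    """Hypothetical stand-in for a context object with one attribute."""

    def __init__(self):
        self.call_site = None


class CallSiteScope(object):
    """Set context.call_site on entry and reset it to None on exit."""

    def __init__(self, context, description):
        self._context = context
        self._description = description

    def __enter__(self):
        self._context.call_site = self._description
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Runs on normal exit and when the body raises.
        self._context.call_site = None
        return False  # never swallow the caller's exception
```

A call such as `with CallSiteScope(ctx, "collect at example.py:3"):` would then wrap the body of `collect()`, so the attribute is only set while the job is being launched. Note that this simplified version resets to `None` instead of restoring a previous value, so it is not suitable for nesting.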
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50577608 Yeah, I'm hoping to merge #1346 as soon as it passes Jenkins, so I'd wait for that. "I also thought I'd add a couple of notes on what I had in mind with this patch: ..." Can you add these notes to the PR description so that they get included in the commit message?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50577874 Sure, added the notes.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50705813 Hi, I've updated the patch to work with the new code in master.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50576177 Thanks for working on this! We'll need to coordinate merging with #1346 and related PRs. (cc @yhuai) @JoshRosen can you look at the other pyspark changes?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1592#issuecomment-50196980 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49085915 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49086561 QA tests have started for PR 1421. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16691/consoleFull
[GitHub] spark pull request: [SPARK-2314][SQL] Override collect and take in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1421#issuecomment-49099033 QA results for PR 1421: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes. For more information see test output: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16691/consoleFull