Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-4104
Jenkins, retest this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-4769
Can you try using the existing `addTaskCompletionListener` with
`context.isInterrupted()`, roll back the other changes / listener interfaces,
and add a streaming unit
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/3917#issuecomment-5142
Hey, I commented on the JIRA, but some recent changes in the way we publish
artifacts actually make this a more tenable change.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4917#issuecomment-77769395
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-4567
Actually, this just occurred to me: why not use `addTaskCompletionListener`
and check `TaskContext.isInterrupted()` inside of your listener in order to
decide whether
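The suggestion above can be illustrated with a minimal, self-contained sketch of the pattern: register one completion listener and branch on the interrupted flag inside it. The trait below only mirrors the shape of Spark's `TaskContext.addTaskCompletionListener` / `isInterrupted` API for illustration; it is not Spark code.

```scala
// Minimal stand-in for the relevant slice of Spark's TaskContext API.
trait MiniTaskContext {
  def isInterrupted: Boolean
  def addTaskCompletionListener(f: MiniTaskContext => Unit): Unit
}

class SimpleTaskContext(interrupted: Boolean) extends MiniTaskContext {
  private var listeners = List.empty[MiniTaskContext => Unit]
  def isInterrupted: Boolean = interrupted
  def addTaskCompletionListener(f: MiniTaskContext => Unit): Unit =
    listeners ::= f
  // Invoked by the "scheduler" when the task finishes or is killed.
  def markTaskCompleted(): Unit = listeners.foreach(_(this))
}

var sawInterruption = false
val ctx = new SimpleTaskContext(interrupted = true)
ctx.addTaskCompletionListener { c =>
  // One listener handles both normal completion and interruption,
  // so no extra listener interface is needed.
  if (c.isInterrupted) sawInterruption = true
}
ctx.markTaskCompleted()
```

The point of the review comment is exactly this shape: the existing listener hook already fires in both cases, so checking the interrupted flag inside it avoids introducing a new listener interface.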
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/spark/pull/4917#discussion_r26007391
--- Diff:
examples/src/main/scala/org/apache/spark/examples/graphx/LiveJournalPageRank.scala
---
@@ -30,14 +25,14 @@ object LiveJournalPageRank {
def
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4917#issuecomment-77769390
[Test build #28373 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28373/consoleFull)
for PR 4917 at commit
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-4362
I think we discussed this on the first PR, but it would be good to add an
explicit note here summarizing why we're not using `TaskCompletionListener`,
since its
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-4344
[Test build #28374 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28374/consoleFull)
for PR 4135 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4917#issuecomment-77765206
[Test build #28373 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28373/consoleFull)
for PR 4917 at commit
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/4940
[SPARK-6215][SQL] Shorten apply and update funcs in GenerateProjection
Some code in `GenerateProjection` looks redundant and can be shortened.
You can merge this pull request into a Git repository
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4940#issuecomment-77741269
[Test build #28371 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28371/consoleFull)
for PR 4940 at commit
Github user shivaram commented on the pull request:
https://github.com/apache/spark/pull/4916#issuecomment-77781120
@srowen I'm fine with the line either way and it shouldn't really matter.
As I described above the problem of master_instance_type being different from
slaves is a
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4923#discussion_r26011565
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
---
@@ -575,15 +583,32 @@ private[spark] object PythonRDD extends Logging {
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4935#issuecomment-77786580
I'm going to close this PR and re-open against master, since it looks like
the conflicts shouldn't be too bad.
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/4805#discussion_r26011962
--- Diff:
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/SparkKafkaUtils.scala
---
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4923#issuecomment-77785375
Now that this has been updated to collect results via a socket, it looks
like we may finally be able to close
https://issues.apache.org/jira/browse/SPARK-677, one of
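The "collect results via a socket" idea mentioned above can be sketched in a self-contained way: the JVM side serves the result bytes on an ephemeral local port, and the client (Python, in the real PR) connects and drains the stream. The names `serveBytes`/`fetchBytes` are placeholders for illustration, not Spark's actual `PythonRDD` code.

```scala
import java.net.{ServerSocket, Socket}

// Serve a payload on a free local port from a background thread and
// return the port number for the client to connect to.
def serveBytes(payload: Array[Byte]): Int = {
  val server = new ServerSocket(0) // port 0 = pick a free ephemeral port
  val thread = new Thread(() => {
    val sock = server.accept()
    try sock.getOutputStream.write(payload)
    finally { sock.close(); server.close() }
  })
  thread.setDaemon(true)
  thread.start()
  server.getLocalPort
}

// Client side: connect, read everything, and always close the socket.
def fetchBytes(port: Int): Array[Byte] = {
  val sock = new Socket("localhost", port)
  try sock.getInputStream.readAllBytes()
  finally sock.close()
}
```

Compared with writing results to a temporary file, this hands the data directly to the consumer without touching disk, which is what makes the linked issue closable.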
Github user kellyzly commented on the pull request:
https://github.com/apache/spark/pull/4491#issuecomment-77786363
@srowen, @tgravescs, @vanzin: Encrypted shuffle can make the shuffle process
safer; I think it is necessary in Spark. The previous design reuses Hadoop's
encrypted
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/4805#discussion_r26011997
--- Diff:
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala
---
@@ -158,4 +166,37 @@ class
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4942#issuecomment-77790044
Can one of the admins verify this patch?
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/4940#discussion_r26011394
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateProjection.scala
---
@@ -84,9 +84,15 @@ object
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4923#issuecomment-77786229
This looks really good to me overall. It would be great if you could
update the PR description to reflect the most recent changes (collecting via a
socket instead of
Github user jerryshao commented on the pull request:
https://github.com/apache/spark/pull/4805#issuecomment-77787450
Hi @koeninger, thanks a lot for your review. I will fix all the comments you
raised.
The reason why I put updating ZK in `StreamingListener` rather
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/4805#discussion_r26012257
--- Diff:
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala
---
@@ -158,4 +166,37 @@ class
Github user uronce-cc commented on a diff in the pull request:
https://github.com/apache/spark/pull/4901#discussion_r26012420
--- Diff: ec2/spark_ec2.py ---
@@ -872,9 +890,16 @@ def deploy_files(conn, root_dir, opts, master_nodes,
slave_nodes, modules):
if . in
Github user hhbyyh commented on the pull request:
https://github.com/apache/spark/pull/4899#issuecomment-77789383
Thanks a lot for providing the feedback.
Move it to comments as suggested.
GitHub user yejiming opened a pull request:
https://github.com/apache/spark/pull/4942
Branch 1.3
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-1.3
Alternatively you can review and apply these changes as
GitHub user yejiming opened a pull request:
https://github.com/apache/spark/pull/4943
Update RandomForest.scala
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/yejiming/spark master
Alternatively you can review and apply these
Github user yejiming closed the pull request at:
https://github.com/apache/spark/pull/4943
Github user suyanNone commented on the pull request:
https://github.com/apache/spark/pull/4887#issuecomment-77791515
@srowen it's also my fault for not making the description clearer... I will
update the description so it makes more sense
Github user yejiming closed the pull request at:
https://github.com/apache/spark/pull/4942
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4941#issuecomment-77792236
[Test build #28375 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28375/consoleFull)
for PR 4941 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4382#issuecomment-77792262
[Test build #28377 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28377/consoleFull)
for PR 4382 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4941#issuecomment-77792239
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/4926#discussion_r26013109
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala
---
@@ -179,7 +179,12 @@ private[hive] case class
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/4926#discussion_r26013144
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/UDFSuite.scala
---
@@ -32,5 +32,6 @@ class UDFSuite extends QueryTest {
Github user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/4805#discussion_r26013229
--- Diff:
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala
---
@@ -82,8 +83,12 @@ class
Github user zhpengg commented on a diff in the pull request:
https://github.com/apache/spark/pull/4909#discussion_r26013467
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
---
@@ -467,7 +467,9 @@ private[spark] class Master(
* two executors on the
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/4940#issuecomment-77793875
@chenghao-intel These are the random accessors for the row (`SpecificRow`)
objects produced by the projection `GenerateProjection`. So I think they are
not resolved
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4899#issuecomment-77794351
[Test build #28376 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28376/consoleFull)
for PR 4899 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4899#issuecomment-77794358
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/4940#discussion_r26013756
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateProjection.scala
---
@@ -84,9 +84,15 @@ object
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-8576
[Test build #28374 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28374/consoleFull)
for PR 4135 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4135#issuecomment-8577
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4923#discussion_r26011521
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
---
@@ -341,7 +342,7 @@ private[spark] object PythonRDD extends Logging {
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4899#issuecomment-77789229
[Test build #28376 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28376/consoleFull)
for PR 4899 at commit
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/4916#discussion_r26010428
--- Diff: ec2/spark_ec2.py ---
@@ -1259,6 +1259,15 @@ def real_main():
cluster_instances=(master_nodes + slave_nodes),
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4923#discussion_r26011616
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
---
@@ -575,15 +583,32 @@ private[spark] object PythonRDD extends Logging {
Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4923#discussion_r26011596
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
---
@@ -575,15 +583,32 @@ private[spark] object PythonRDD extends Logging {
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4941#issuecomment-77786854
[Test build #28375 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28375/consoleFull)
for PR 4941 at commit
Github user jerryshao commented on the pull request:
https://github.com/apache/spark/pull/4723#issuecomment-77786791
Thanks @tdas for your review; maybe we should first figure out a way to test
the Kafka Python API.
GitHub user nchammas opened a pull request:
https://github.com/apache/spark/pull/4941
[SPARK-6219] [Build] Check that Python code compiles
This PR expands the Python lint checks so that they check for obvious
compilation errors in our Python code.
This PR also bumps up the
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4944#issuecomment-77798728
[Test build #28378 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28378/consoleFull)
for PR 4944 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4805#issuecomment-77800424
[Test build #28379 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28379/consoleFull)
for PR 4805 at commit
Github user jerryshao commented on the pull request:
https://github.com/apache/spark/pull/4805#issuecomment-77801344
Hi @koeninger, would you please review this again? Thanks a lot; I appreciate
your time.
Here I still keep using a HashMap for the Time-to-offset relation
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/4907#issuecomment-77801338
ok to test
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4382#issuecomment-77798081
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4382#issuecomment-77798078
[Test build #28377 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28377/consoleFull)
for PR 4382 at commit
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/4427#issuecomment-77801470
LGTM
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4944#issuecomment-77803785
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4944#issuecomment-77803782
[Test build #28378 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28378/consoleFull)
for PR 4944 at commit
Github user JoshRosen closed the pull request at:
https://github.com/apache/spark/pull/4935
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/4935#issuecomment-77798705
I've opened an updated PR against the master branch at #4944 and added a
regression test (which was slightly non-trivial to write).
GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/4944
[SPARK-6209] Clean up connections in ExecutorClassLoader after failing to
load classes (master branch PR)
ExecutorClassLoader does not ensure proper cleanup of network connections
that it opens.
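The shape of the fix described above (releasing a connection on both the success and failure paths) can be sketched with plain `java.io` streams. The names `fetchClassBytes`/`openStream` are hypothetical placeholders, not methods of the real `ExecutorClassLoader`.

```scala
import java.io.{ByteArrayOutputStream, IOException, InputStream}

// Read all bytes from a freshly opened stream, returning None when the
// read fails (e.g. class not found remotely). The finally block is the
// point: the underlying connection is closed on every path.
def fetchClassBytes(openStream: () => InputStream): Option[Array[Byte]] = {
  val in = openStream()
  try {
    val buf = new ByteArrayOutputStream()
    val chunk = new Array[Byte](4096)
    var n = in.read(chunk)
    while (n != -1) { buf.write(chunk, 0, n); n = in.read(chunk) }
    Some(buf.toByteArray)
  } catch {
    case _: IOException => None
  } finally {
    in.close() // released whether loading succeeded or failed
  }
}
```

Without the `finally`, a failed class lookup would leak its connection, which is the leak the PR title describes.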
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4805#issuecomment-77800707
[Test build #28380 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28380/consoleFull)
for PR 4805 at commit
Github user rezazadeh commented on the pull request:
https://github.com/apache/spark/pull/4934#issuecomment-77801646
Thank you for this PR @staple !
@mengxr I suggested to @staple to first implement without backtracking to
keep the PR as simple as possible. According to his
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4907#issuecomment-77801605
[Test build #28381 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28381/consoleFull)
for PR 4907 at commit
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/3850#issuecomment-77749877
In order to close this out, is it worth just merging this into 1.0 as
@JoshRosen suggests?
Same for some of these other back-ports to 0.9 or 1.0.
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4917#discussion_r26005247
--- Diff:
examples/src/main/scala/org/apache/spark/examples/graphx/LiveJournalPageRank.scala
---
@@ -30,14 +25,14 @@ object LiveJournalPageRank {
def
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4922#issuecomment-77750283
Sounds like a good improvement, changes look OK to my mildly informed eyes,
and you have both reviewed and tested the change. LGTM.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/4916#issuecomment-77750513
I'll wait another day in case the consensus is that this line should indeed
change, but it sounds like it is likely fine as is.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/4922
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4940#issuecomment-77743415
[Test build #28371 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28371/consoleFull)
for PR 4940 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4940#issuecomment-77743416
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4917#discussion_r26005234
--- Diff: bin/pyspark ---
@@ -60,6 +60,9 @@ fi
#
# For backwards-compatibility, we retain the old IPYTHON and IPYTHON_OPTS
variables.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/4933
Github user shivaram commented on a diff in the pull request:
https://github.com/apache/spark/pull/4901#discussion_r26010373
--- Diff: ec2/spark_ec2.py ---
@@ -872,9 +890,16 @@ def deploy_files(conn, root_dir, opts, master_nodes,
slave_nodes, modules):
if . in
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/4940#issuecomment-77785497
@viirya I am not sure we really need this change; the check `if (i < 0 || i >=
this.length)` seems unnecessary to me, because the `ordinal` should have
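The bounds check being debated can be shown with a self-contained, simplified stand-in for a generated fixed-width row; this is an illustration of the pattern, not the actual `SpecificRow` code emitted by `GenerateProjection`.

```scala
// A fixed-width row whose accessor validates the ordinal before dispatch.
class FixedRow(values: Array[Any]) {
  val length: Int = values.length
  def apply(i: Int): Any = {
    // The check under discussion: reject out-of-range ordinals explicitly
    // instead of relying on the underlying array to fail.
    if (i < 0 || i >= this.length)
      throw new IndexOutOfBoundsException(s"ordinal $i out of range [0, $length)")
    values(i)
  }
}
```

The reviewer's question is whether this explicit check is redundant when the `ordinal` is always produced by trusted generated code; keeping it trades a branch per access for a clearer failure mode.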
Github user nchammas commented on the pull request:
https://github.com/apache/spark/pull/4941#issuecomment-77787089
cc @JoshRosen @davies
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/4930#issuecomment-77787040
LGTM
Github user koeninger commented on a diff in the pull request:
https://github.com/apache/spark/pull/4805#discussion_r26012160
--- Diff:
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/DirectKafkaInputDStream.scala
---
@@ -158,4 +166,37 @@ class
GitHub user hhbyyh reopened a pull request:
https://github.com/apache/spark/pull/4899
[SPARK-6177][MLlib] LDA should check partitions size of the input
JIRA: https://issues.apache.org/jira/browse/SPARK-6177
Add coalesce to LDA example to avoid the possible massive partitions
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4904#issuecomment-77756133
[Test build #28372 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28372/consoleFull)
for PR 4904 at commit
Github user cloud-fan commented on the pull request:
https://github.com/apache/spark/pull/4904#issuecomment-77756049
Hi @marmbrus , it feels hard for me to resolve the base attribute but not
the GetFields that are on top. When we get into `LogicalPlan#resolve`, the
`Attribute`s are
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/4904#issuecomment-77760987
[Test build #28372 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28372/consoleFull)
for PR 4904 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/4904#issuecomment-77760988
Test PASSed.
Refer to this link for build results (access rights to CI server needed):