Github user avi8tr commented on the issue:
https://github.com/apache/spark/pull/16782
Hi, thanks for explaining that there is a purpose for the retention and
passing of the user-supplied arguments outside of the function call (while not
changing the public api). This fix enabling
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16809
thanks, merging to master!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/17089
closing in favor of https://github.com/apache/spark/pull/16809
Github user cloud-fan closed the pull request at:
https://github.com/apache/spark/pull/17089
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/17100
Jenkins, ok to test.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17064#discussion_r103526741
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -634,4 +634,28 @@ class CachedTableSuite extends QueryTest with
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/9524
**[Test build #73602 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73602/consoleFull)**
for PR 9524 at commit
GitHub user actuaryzhang opened a pull request:
https://github.com/apache/spark/pull/17103
[Minor][Doc] Update GLM doc to include tweedie distribution
Update GLM documentation to include the Tweedie distribution. #16344
@jkbradley @yanboliang
You can merge this pull
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17064#discussion_r103527113
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -634,4 +634,28 @@ class CachedTableSuite extends QueryTest with
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17103
**[Test build #73601 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73601/testReport)**
for PR 17103 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17056
**[Test build #73595 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73595/testReport)**
for PR 17056 at commit
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/17090
@MLnick Thanks for showing those comparison numbers. If your
implementation is faster, then I'm happy going with it. I do wonder if we
might hit scalability issues with RDDs which we would not
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17056
Merged build finished. Test PASSed.
Github user tejasapatil commented on the issue:
https://github.com/apache/spark/pull/17056
cc @cloud-fan @gatorsmile
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/17100
@rberenguel: how about adding the "[SQL]" tag to this? While the
feature request comes out of PySpark, it's changing the SQL code.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17056
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73595/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17100
**[Test build #73604 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73604/testReport)**
for PR 17100 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17104
**[Test build #73603 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73603/testReport)**
for PR 17104 at commit
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17064#discussion_r103528104
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -634,4 +634,28 @@ class CachedTableSuite extends QueryTest with
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
@jinxing64 thanks for updating this to be just the simpler fix. Since the
original jira has a bit of a longer discussion on it, do you mind opening a new
jira for this change, and linking it to the
Github user squito commented on the issue:
https://github.com/apache/spark/pull/16867
other than a bit of jira re-organization, lgtm
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17103
Merged build finished. Test PASSed.
Github user mgummelt commented on a diff in the pull request:
https://github.com/apache/spark/pull/17045#discussion_r103529500
--- Diff:
resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala
---
@@ -256,7 +259,7
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17103
**[Test build #73601 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73601/testReport)**
for PR 17103 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17056
**[Test build #73596 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73596/testReport)**
for PR 17056 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17103
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73601/
Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Merged build finished. Test FAILed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16639
**[Test build #73598 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73598/testReport)**
for PR 16639 at commit
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17064#discussion_r103530085
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ---
@@ -634,4 +634,28 @@ class CachedTableSuite extends QueryTest with
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16639
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73598/
Test FAILed.
Github user mgummelt commented on the issue:
https://github.com/apache/spark/pull/17045
@srowen Can we get a merge? This is a bugfix, so it probably belongs in
all supported branches (1.6, 2.0, 2.1, master)
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17056
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17056
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73596/
Test PASSed.
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/10307#discussion_r103530578
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -388,16 +388,18 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/17096
Thank you for taking this over @HyukjinKwon :)
Github user Yunni commented on the issue:
https://github.com/apache/spark/pull/17092
@jkbradley @MLnick Here is a clean PR. Sorry for messing up the previous
one!
@merlintang I am happy to continue our discussion here:
https://issues.apache.org/jira/browse/SPARK-19771 as
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16499#discussion_r103530816
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -1018,7 +1025,9 @@ private[spark] class BlockManager(
try
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/16938#discussion_r103531754
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
---
@@ -412,15 +412,28 @@ case class DataSource(
Github user mgummelt commented on the issue:
https://github.com/apache/spark/pull/13326
What value is there in showing killed drivers that never ran?
Github user holdenk commented on a diff in the pull request:
https://github.com/apache/spark/pull/16845#discussion_r103532524
--- Diff: python/pyspark/util.py ---
@@ -0,0 +1,45 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/17093#discussion_r103532597
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala
---
@@ -300,7 +300,8 @@ object
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/17096
Let's double check with @viirya to make sure his comment was addressed, but
I really appreciate the improved test coverage :)
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16784
**[Test build #73605 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73605/testReport)**
for PR 16784 at commit
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/17090
Fitting into the CV / evaluator is actually fairly straightforward. It's
just that the semantics of `transform` for top-k recommendation must fit into
whatever we decide on for `RankingEvaluator`,
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17104
**[Test build #73603 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73603/testReport)**
for PR 17104 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17104
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17104
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73603/
Test PASSed.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/17104
Does it not work? I thought the fully qualified name was fine.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/16330
Yeah, the log file should be fine to put in a temp dir.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16784
**[Test build #73606 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73606/testReport)**
for PR 16784 at commit
Github user wangmiao1981 commented on the issue:
https://github.com/apache/spark/pull/16784
@jkbradley I simplified the tests and modified the data generation API by
using toSparse method, which eliminates the index variable.
"Is this multivariate online summarizer issue really a
Github user VinceShieh commented on a diff in the pull request:
https://github.com/apache/spark/pull/16883#discussion_r103597822
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ---
@@ -163,25 +190,28 @@ class StringIndexerModel (
}
Github user tejasapatil commented on a diff in the pull request:
https://github.com/apache/spark/pull/17075#discussion_r103598158
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala ---
@@ -193,7 +193,7 @@ class DecimalSuite extends SparkFunSuite
Github user sitalkedia commented on the issue:
https://github.com/apache/spark/pull/17088
>> Why is this a no-op when the shuffle service isn't enabled? It looks
like you mark the slave as lost in all cases?
@kayousterhout - You are right. It's kind of confusing that we are
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17064
**[Test build #73641 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73641/testReport)**
for PR 17064 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17088
**[Test build #73640 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73640/testReport)**
for PR 17088 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17109
**[Test build #73645 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73645/testReport)**
for PR 17109 at commit
Github user windpiger commented on a diff in the pull request:
https://github.com/apache/spark/pull/16938#discussion_r103600104
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
---
@@ -412,15 +412,28 @@ case class DataSource(
Github user zjffdu commented on a diff in the pull request:
https://github.com/apache/spark/pull/10307#discussion_r103600310
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -388,16 +388,18 @@ def csv(self, path, schema=None, sep=None,
encoding=None, quote=None, escape=Non
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17110
**[Test build #73644 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73644/testReport)**
for PR 17110 at commit
GitHub user jinxing64 opened a pull request:
https://github.com/apache/spark/pull/17111
[SPARK-19777] Scan runningTasksSet when check speculatable tasks in TaskSetManager.
## What changes were proposed in this pull request?
When check speculatable tasks in
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17047
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73626/
Test PASSed.
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/16987#discussion_r103603671
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala ---
@@ -208,6 +208,11 @@ trait StreamTest extends QueryTest with
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17047
Merged build finished. Test PASSed.
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/16987#discussion_r103603693
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -662,6 +665,154 @@ class FileStreamSourceSuite extends
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/16987#discussion_r103603705
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -662,6 +665,154 @@ class FileStreamSourceSuite extends
Github user lw-lin commented on a diff in the pull request:
https://github.com/apache/spark/pull/16987#discussion_r103603687
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
---
@@ -662,6 +665,154 @@ class FileStreamSourceSuite extends
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17112
**[Test build #73654 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73654/testReport)**
for PR 17112 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17112
**[Test build #73654 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73654/testReport)**
for PR 17112 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17112
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73654/
Test FAILed.
Github user gczsjdy commented on the issue:
https://github.com/apache/spark/pull/16476
@gatorsmile Do you think we can merge this PR? Or is there something that
needs to be modified?
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/17112
Merged build finished. Test FAILed.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/16677
retest this please.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16677
**[Test build #73655 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73655/testReport)**
for PR 16677 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/17053
**[Test build #73635 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73635/testReport)**
for PR 17053 at commit
Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/16959#discussion_r103606657
--- Diff:
core/src/test/scala/org/apache/spark/scheduler/OutputCommitCoordinatorSuite.scala
---
@@ -195,6 +195,17 @@ class OutputCommitCoordinatorSuite
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16910
**[Test build #73666 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73666/testReport)**
for PR 16910 at commit
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/17105
So, tl;dr: I think we should support duplicated names, like everything else
in Spark does.
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/17105
@actuaryzhang there's a bit of a history about this... but long story
short, Spark does support DataFrame with multiple columns having the same name,
for example
```
# in pyspark
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/17056#discussion_r103616628
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
---
@@ -732,6 +743,51 @@ object HiveHashFunction extends
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/17056#discussion_r103616585
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
---
@@ -732,6 +743,51 @@ object HiveHashFunction extends