[GitHub] spark pull request #15820: [SPARK-18373][SS][Kafka]Make failOnDataLoss=false...

2016-11-18 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15820#discussion_r88758179 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -47,40 +51,191 @@ private[kafka010] case class

[GitHub] spark pull request #15820: [SPARK-18373][SS][Kafka]Make failOnDataLoss=false...

2016-11-18 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15820#discussion_r88757785 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala --- @@ -47,40 +51,191 @@ private[kafka010] case class

[GitHub] spark issue #15594: [SPARK-18061][SQL][Security] Spark Thriftserver needs to...

2016-11-18 Thread lresende
Github user lresende commented on the issue: https://github.com/apache/spark/pull/15594 @vanzin I believe this might be your realm :) Could you please help review this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-11-18 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15915 To clarify, does this overflow only occur when `spark.storage.unrollMemoryThreshold` has been manually set to a huge value which doesn't fit into an int? Setting the chunk size to `Intege
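
The kind of overflow asked about here can be illustrated outside Spark. This is a minimal sketch, assuming only that a byte count held as a 64-bit value gets narrowed to a signed 32-bit int somewhere along the way. The helper `to_int32` and the 3 GiB figure are hypothetical and not taken from the PR.

```python
INT32_MAX = 2**31 - 1  # same value as Java's Integer.MAX_VALUE


def to_int32(n: int) -> int:
    """Emulate a JVM long-to-int narrowing conversion (keeps the low 32 bits)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


# Hypothetical threshold of 3 GiB, larger than a signed 32-bit int can hold.
threshold_bytes = 3 * 1024**3

narrowed = to_int32(threshold_bytes)        # wraps around to a negative number
clamped = min(threshold_bytes, INT32_MAX)   # one common way to avoid the wrap
```

Values that already fit in 32 bits pass through `to_int32` unchanged, so the problem only shows up for unusually large settings.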

[GitHub] spark pull request #15594: [SPARK-18061][SQL][Security] Spark Thriftserver n...

2016-11-18 Thread lresende
Github user lresende commented on a diff in the pull request: https://github.com/apache/spark/pull/15594#discussion_r88752705 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIService.scala --- @@ -57,7 +59,24 @@ private[hive] class S

[GitHub] spark issue #15827: [SPARK-18187][STREAMING] CompactibleFileStreamLog should...

2016-11-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15827 Thanks for doing this. I just merged #15852. Could you close this PR, please?

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r88756017 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -305,40 +312,82 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-18 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/13065#discussion_r88755507 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala --- @@ -103,5 +109,192 @@ case class GenerateExec( } }

[GitHub] spark issue #15936: [SPARK-18504][SQL] Scalar subquery with extra group by c...

2016-11-18 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/15936 I think I have contaminated the PR with some of my old code. I will push a new PR soon.

[GitHub] spark pull request #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-18 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/13065#discussion_r88754155 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -113,4 +117,25 @@ class WholeStageCodegenSuite extends S

[GitHub] spark issue #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13065 **[Test build #68875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68875/consoleFull)** for PR 13065 at commit [`af9a516`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #15935: [SPARK-18188] add checksum for blocks of broadcast

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15935 **[Test build #68874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68874/consoleFull)** for PR 15935 at commit [`328ca39`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #15936: [SPARK-18504][SQL] Scalar subquery with extra group by c...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15936 Can one of the admins verify this patch?

[GitHub] spark pull request #15936: [SPARK-18504][SQL] Scalar subquery with extra gro...

2016-11-18 Thread nsyca
GitHub user nsyca opened a pull request: https://github.com/apache/spark/pull/15936 [SPARK-18504][SQL] Scalar subquery with extra group by columns returning incorrect result ## What changes were proposed in this pull request? This PR blocks an incorrect result scenario in s

[GitHub] spark pull request #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13065#discussion_r88753306 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala --- @@ -103,5 +109,182 @@ case class GenerateExec( }

[GitHub] spark pull request #15935: [SPARK-] add checksum for blocks of broadcast

2016-11-18 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/15935 [SPARK-] add checksum for blocks of broadcast ## What changes were proposed in this pull request? A TorrentBroadcast is serialized and compressed first, then split into fixed-size blocks,
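
The idea named in the PR title, a checksum per broadcast block, can be sketched in a few lines. This is an illustration only and not Spark's implementation: the function names, the use of `zlib` for compression, and the choice of Adler-32 as the checksum are all assumptions made for the example.

```python
import zlib


def split_with_checksums(payload: bytes, block_size: int):
    """Compress payload, cut it into fixed-size blocks, and checksum each block."""
    compressed = zlib.compress(payload)
    blocks = [compressed[i:i + block_size]
              for i in range(0, len(compressed), block_size)]
    return [(block, zlib.adler32(block)) for block in blocks]


def reassemble(blocks_with_checksums) -> bytes:
    """Verify every block against its checksum, then decompress the whole payload."""
    for index, (block, expected) in enumerate(blocks_with_checksums):
        if zlib.adler32(block) != expected:
            raise ValueError(f"corrupt block {index}")
    return zlib.decompress(b"".join(block for block, _ in blocks_with_checksums))
```

Checking each block before decompression means a corrupted block is reported by index, instead of surfacing later as an opaque decompression or deserialization error.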

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-18 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15923 @joshrosen @zsxwing Could you help review this one?

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-11-18 Thread skonto
Github user skonto commented on the issue: https://github.com/apache/spark/pull/14650 @rxin @srowen could I get merge pls if there are no other issues?

[GitHub] spark issue #15934: [SPARK-18497][SS]Make ForeachSink support watermark

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15934 **[Test build #68873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68873/consoleFull)** for PR 15934 at commit [`0f3e4af`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #15880: [SPARK-17913][SQL] compare long and string type column m...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15880 In the future, more code changes might break Hive compatibility accidentally or intentionally. Maybe, we can introduce a configuration flag. For example, in DB2, we have a flag like

[GitHub] spark issue #15934: [SPARK-18497][SS]Make ForeachSink support watermark

2016-11-18 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/15934 LGTM

[GitHub] spark issue #15934: [SPARK-18497][SS]Make ForeachSink support watermark

2016-11-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15934 cc @marmbrus

[GitHub] spark pull request #15934: [SPARK-18497][SS]Make ForeachSink support waterma...

2016-11-18 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/15934 [SPARK-18497][SS]Make ForeachSink support watermark ## What changes were proposed in this pull request? The issue in ForeachSink is that the newly created Dataset still uses the old QueryExecutio

[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15868 Documentation and PR descriptions are updated.

[GitHub] spark issue #15933: [SPARK-18505][SQL] Simplify AnalyzeColumnCommand

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15933 **[Test build #68872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68872/consoleFull)** for PR 15933 at commit [`1a713fd`](https://github.com/apache/spark/commit/1

[GitHub] spark pull request #15933: [SPARK-18505][SQL] Simplify AnalyzeColumnCommand

2016-11-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15933#discussion_r88747950 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -97,7 +97,7 @@ private[hive] class HiveClientImpl( }

[GitHub] spark pull request #15933: [SPARK-18505][SQL] Simplify AnalyzeColumnCommand

2016-11-18 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/15933 [SPARK-18505][SQL] Simplify AnalyzeColumnCommand ## What changes were proposed in this pull request? I'm spending more time at the design & code level for cost-based optimizer now, and have found

[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15868 **[Test build #68871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68871/consoleFull)** for PR 15868 at commit [`28718d0`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15923 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68870/ Test PASSed. ---

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15923 Merged build finished. Test PASSed.

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15923 **[Test build #68870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68870/consoleFull)** for PR 15923 at commit [`c85a216`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pa...

2016-11-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15866

[GitHub] spark issue #15866: [SPARK-18422][CORE] Fix wholeTextFiles test to pass on W...

2016-11-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15866 Merged to master/2.1

[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15868 Yeah, please improve the document comments and PR descriptions. It sounds like this is almost ready to merge after these changes.

[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-18 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15868 Ah, right. Not always!

[GitHub] spark issue #15868: [SPARK-18413][SQL] Add `maxConnections` JDBCOption

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15868 > This option applies only to writing with additional coalesce operation. It is not clear. We do not always call `coalesce`. : )

[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

2016-11-18 Thread Yunni
Github user Yunni commented on the issue: https://github.com/apache/spark/pull/15874 @jkbradley Awesome, thanks so much! :) Now that the API is finalized, I will work on the User Doc

[GitHub] spark pull request #13065: [SPARK-15214][SQL] Code-generation for Generate

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/13065#discussion_r88740233 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala --- @@ -103,5 +109,182 @@ case class GenerateExec( }

[GitHub] spark issue #15926: [SPARK-16803] [SQL] SaveAsTable does not work when targe...

2016-11-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15926 cc @cloud-fan

[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

2016-11-18 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15874 I will take a look.

[GitHub] spark issue #15870: [SPARK-18425][Structured Streaming][Tests] Test `Compact...

2016-11-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15870 Thanks for doing this. Could you resolve the conflicts, please?

[GitHub] spark issue #15932: [SPARK-18448][CORE] SparkSession should implement java.l...

2016-11-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15932 Sure I'll put it in first thing tomorrow if there are no objections.

[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

2016-11-18 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15874 @Yunni Thanks for the updates! I don't think we should include AND-amplification for 2.1 since we're already in QA. But it'd be nice to get it in 2.2. Also, 2.2 will give us plenty of time to d

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/11119 @yinxusen I took a look at the updates. Will you be able to create the design doc that Joseph mentioned?

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88725427 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -35,7 +38,25 @@ import org.apache.spark.sql.functions.{col, udf} import

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88724978 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -35,7 +38,25 @@ import org.apache.spark.sql.functions.{col, udf} import

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88714626 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -124,7 +147,8 @@ class KMeansModel private[ml] ( @Since("2.0.0")

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88713108 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala --- @@ -414,6 +414,8 @@ object KMeans { val RANDOM = "random" @Sin

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88722547 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -145,18 +150,67 @@ class KMeansSuite extends SparkFunSuite with MLlibT

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88713359 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -284,11 +309,26 @@ class KMeans @Since("1.5.0") ( /** @group se

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88715322 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -306,6 +346,25 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88725396 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -35,7 +38,25 @@ import org.apache.spark.sql.functions.{col, udf} import

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-11-18 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r88713635 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -284,11 +309,26 @@ class KMeans @Since("1.5.0") ( /** @group se

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15921 Merged build finished. Test FAILed.

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15921 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68868/ Test FAILed. ---

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15921 **[Test build #68868 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68868/consoleFull)** for PR 15921 at commit [`306f7fd`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15921: [SPARK-18493] Add missing python APIs: withWaterm...

2016-11-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15921#discussion_r88731863 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -495,7 +498,10 @@ class Dataset[T] private[sql]( def checkpoint(): Dataset[

[GitHub] spark pull request #15921: [SPARK-18493] Add missing python APIs: withWaterm...

2016-11-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15921#discussion_r88731845 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -485,7 +485,10 @@ class Dataset[T] private[sql]( def isStreaming: Boolean =

[GitHub] spark pull request #15921: [SPARK-18493] Add missing python APIs: withWaterm...

2016-11-18 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15921#discussion_r88731880 --- Diff: python/pyspark/sql/dataframe.py --- @@ -322,6 +322,53 @@ def show(self, n=20, truncate=True): def __repr__(self): return "Data

[GitHub] spark pull request #15813: [SPARK-18362][SQL] Use TextFileFormat in JsonFile...

2016-11-18 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/15813#discussion_r88728867 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala --- @@ -173,35 +178,17 @@ class CSVFileFormat extends

[GitHub] spark issue #15925: [SPARK-18436][SQL]isin with a empty list throw exception

2016-11-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15925 That's why I was suggesting just generate "false" for in(empty seq).
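
The suggestion can be sketched as a small predicate builder. This is a hypothetical helper, not Spark's `isin` or its `In` expression: the name `isin_predicate` and the SQL-like string output are assumptions for illustration, and NULL semantics are deliberately left out.

```python
def isin_predicate(column: str, values: list) -> str:
    """Render a SQL-like membership test; an empty list becomes the literal false."""
    if not values:
        # No candidate values: nothing can match, so skip building an IN clause.
        return "false"
    rendered = ", ".join(repr(v) for v in values)
    return f"{column} IN ({rendered})"
```

Short-circuiting on the empty list avoids ever constructing an `IN ()` clause, which is the case that throws in the issue title.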

[GitHub] spark pull request #15877: [SPARK-18429] [SQL] implement a new Aggregate for...

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15877#discussion_r88722954 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CountMinSketchAgg.scala --- @@ -0,0 +1,150 @@ +/* + * L

[GitHub] spark issue #15932: [SPARK-18448][CORE] SparkSession should implement java.l...

2016-11-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15932 Nice - given it's a tiny change I'd put this in 2.1, unless somebody else objects. @srowen ?

[GitHub] spark issue #15917: SPARK-18252: Using RoaringBitmap for bloom filters

2016-11-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15917 @ponkin mind closing the pull request given your findings on https://issues.apache.org/jira/browse/SPARK-18252 ?

[GitHub] spark pull request #15898: [SPARK-18457][SQL] ORC and other columnar formats...

2016-11-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15898

[GitHub] spark issue #15898: [SPARK-18457][SQL] ORC and other columnar formats using ...

2016-11-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15898 Thanks - merging in master/branch-2.1.

[GitHub] spark pull request #15852: [Spark-18187] [SQL] CompactibleFileStreamLog shou...

2016-11-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15852

[GitHub] spark issue #15852: [Spark-18187] [SQL] CompactibleFileStreamLog should not ...

2016-11-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15852 LGTM. Thanks! Merging to master and 2.1.

[GitHub] spark issue #15899: [SPARK-18466] added withFilter method to RDD

2016-11-18 Thread reggert
Github user reggert commented on the issue: https://github.com/apache/spark/pull/15899 I don't get why you say that it "doesn't even work in general". Under what circumstances doesn't it work? I've never run into any problems with it. The "simple syntactic sugar" allows very

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15923 **[Test build #68870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68870/consoleFull)** for PR 15923 at commit [`c85a216`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #15909: [SPARK-18475] Be able to increase parallelism in Structu...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15909 **[Test build #68869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68869/consoleFull)** for PR 15909 at commit [`e92a038`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15909: [SPARK-18475] Be able to increase parallelism in Structu...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68869/ Test PASSed. ---

[GitHub] spark issue #15909: [SPARK-18475] Be able to increase parallelism in Structu...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15909 Merged build finished. Test PASSed.

[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13320 Merged build finished. Test PASSed.

[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68863/ Test PASSed. ---

[GitHub] spark issue #13320: [SPARK-13184][SQL] Add a datasource-specific option minP...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13320 **[Test build #68863 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68863/consoleFull)** for PR 13320 at commit [`ba3f2a0`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15889 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68861/ Test PASSed. ---

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15889 Merged build finished. Test PASSed.

[GitHub] spark issue #15730: [SPARK-18218][ML][MLLib] Optimize BlockMatrix multiplica...

2016-11-18 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/15730 Hi @WeichenXu123 Thank you for this PR. Sorry for taking so long to get back to you. Your optimization would be very helpful. I have a couple thoughts though. Your examples always take into account f

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15889 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68859/ Test PASSed. ---

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15889 **[Test build #68861 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68861/consoleFull)** for PR 15889 at commit [`ee5b035`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15889 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #15889: [SPARK-18445][BUILD][DOCS] Fix the markdown for `Note:`/...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15889 **[Test build #68859 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68859/consoleFull)** for PR 15889 at commit [`4576f7d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13736: [SPARK-12113][SQL] Add some timing metrics for blocking ...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13736 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #13736: [SPARK-12113][SQL] Add some timing metrics for blocking ...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13736 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68862/ Test PASSed. ---

[GitHub] spark issue #13736: [SPARK-12113][SQL] Add some timing metrics for blocking ...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13736 **[Test build #68862 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68862/consoleFull)** for PR 13736 at commit [`10dca0e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15909: [SPARK-18475] Be able to increase parallelism in Structu...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15909 **[Test build #68869 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68869/consoleFull)** for PR 15909 at commit [`e92a038`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #14812: [SPARK-17237][SQL] Remove unnecessary backticks in a piv...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14812 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #14812: [SPARK-17237][SQL] Remove unnecessary backticks in a piv...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14812 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68860/ Test PASSed. ---

[GitHub] spark issue #14812: [SPARK-17237][SQL] Remove unnecessary backticks in a piv...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14812 **[Test build #68860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68860/consoleFull)** for PR 14812 at commit [`22743c7`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15921: [SPARK-18493] Add missing python APIs: withWatermark and...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15921 **[Test build #68868 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68868/consoleFull)** for PR 15921 at commit [`306f7fd`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #15901: [SPARK-18467][SQL] Extracts method for preparing argumen...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15901 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #15901: [SPARK-18467][SQL] Extracts method for preparing argumen...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15901 **[Test build #68858 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68858/consoleFull)** for PR 15901 at commit [`f8acda6`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15901: [SPARK-18467][SQL] Extracts method for preparing argumen...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15901 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68858/ Test FAILed. ---

[GitHub] spark issue #15932: [SPARK-18448][CORE] SparkSession should implement java.l...

2016-11-18 Thread ash211
Github user ash211 commented on the issue: https://github.com/apache/spark/pull/15932 Yep that's precisely what I was envisioning. Thanks @srowen ! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request #15929: [SPARK-18053][SQL] compare unsafe and safe comple...

2016-11-18 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15929#discussion_r88701973 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -512,6 +517,9 @@ class CodegenContext

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68856/ Test PASSed. ---

[GitHub] spark issue #15930: [SPARK-18501][ML][SparkR] Fix spark.glm errors when fitt...

2016-11-18 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15930 cc @mengxr @jkbradley --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13909 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13909 **[Test build #68856 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68856/consoleFull)** for PR 13909 at commit [`b713712`](https://github.com/apache/spark/commit/
