[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71280/testReport)** for PR 16566 at commit

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16565 **[Test build #71281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71281/consoleFull)** for PR 16565 at commit

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95921320 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark issue #16549: [SPARK-19151][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-12 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16549 ok thanks~ I am working to find out the reason why it failed --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16467 **[Test build #71283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71283/testReport)** for PR 16467 at commit

[GitHub] spark issue #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/16233 @yhuai I was trying to do that yesterday and found that something went wrong with JIRA; will try again later today.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71282/ Test FAILed. ---

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Merged build finished. Test FAILed.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71282/testReport)** for PR 16566 at commit

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16565 okay!

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16565 Could you change the PR title to `[SPARK-17237][SQL][Backport-2.0] Remove backticks in a pivot result schema`

[GitHub] spark pull request #16524: [SPARK-19110][MLLIB][FollowUP]: Add a unit test f...

2017-01-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16524

[GitHub] spark issue #16550: [SPARK-19178][SQL] convert string of large numbers to in...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16550 **[Test build #71287 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71287/testReport)** for PR 16550 at commit

[GitHub] spark pull request #16542: [SPARK-18905][STREAMING] Fix the issue of removin...

2017-01-12 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/16542#discussion_r95935489 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala --- @@ -200,19 +200,19 @@ class JobScheduler(val ssc:

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-12 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95935452 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -116,6 +116,12 @@ case class

[GitHub] spark pull request #16500: [SPARK-19120] [SPARK-19121] Refresh Metadata Cach...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16500#discussion_r95937423 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -392,7 +392,9 @@ case class InsertIntoHiveTable(

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16566 ``` * checking Rd \usage sections ... WARNING Duplicated \argument entries in documentation object 'fitted': 'object' 'method' '...' ```

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71289 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71289/testReport)** for PR 16355 at commit

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95938844 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -196,6 +196,12 @@ test_that("create DataFrame from RDD", { expect_equal(dtypes(df),

[GitHub] spark pull request #16564: [SPARK-19065][SQL]Don't inherit expression id in ...

2017-01-12 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16564#discussion_r95938778 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -898,11 +899,15 @@ class DatasetSuite extends QueryTest with

[GitHub] spark issue #16542: [SPARK-18905][STREAMING] Fix the issue of removing a fai...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16542 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71288/ Test PASSed. ---

[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16528 **[Test build #71292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71292/testReport)** for PR 16528 at commit

[GitHub] spark issue #16542: [SPARK-18905][STREAMING] Fix the issue of removing a fai...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16542 Merged build finished. Test PASSed.

[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16528 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71292/ Test FAILed. ---

[GitHub] spark issue #16542: [SPARK-18905][STREAMING] Fix the issue of removing a fai...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16542 **[Test build #71288 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71288/testReport)** for PR 16542 at commit

[GitHub] spark issue #16528: [SPARK-19148][SQL] do not expose the external table conc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16528 **[Test build #71292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71292/testReport)** for PR 16528 at commit

[GitHub] spark issue #16559: [WIP] Add expression index and test cases

2017-01-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16559 We already have `GetArrayItem` and `GetMapValue`, and we have special parser rules to support them, e.g. `SELECT array_col[3], map_col['key']`. We can just treat `index` as an alias if

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71289/ Test PASSed. ---

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16547 Could you also update the JIRA and the PR description after your changes?

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16547 Discussed with @tdas offline. Regarding the changes to the append mode, it's unclear whether adding the filter is better, because it will apply the filter on all rows but there are usually only few

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #71278 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71278/testReport)** for PR 16557 at commit

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Merged build finished. Test FAILed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71278/ Test FAILed. ---

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923580 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -307,54 +307,11 @@ object ScalaReflection extends

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16565 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71281/ Test FAILed. ---

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16565 **[Test build #71281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71281/consoleFull)** for PR 16565 at commit

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16565 Merged build finished. Test FAILed.

[GitHub] spark pull request #16546: [WIP][SQL] Put check in ExpressionEncoder.fromRow...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16546#discussion_r95927063 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -120,6 +120,32 @@ object ScalaReflection extends

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15671 Thanks for the updates! For docConcentration and quantileProbabilities, I agree it could be problematic if these are too large. How about: * We don't log docConcentration since that

[GitHub] spark issue #16524: [SPARK-19110][MLLIB][FollowUP]: Add a unit test for test...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16524 LGTM Merging with master Thanks!

[GitHub] spark issue #16527: [SPARK-19146][Core]Drop more elements when stageData.tas...

2017-01-12 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/16527 I use the following code to log the time spent trimming stages/jobs: ```:scala /** If stages is too large, remove and garbage collect old stages */ private def trimStagesIfNecessary(stages:

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16503 **[Test build #71284 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71284/testReport)** for PR 16503 at commit

[GitHub] spark issue #16542: [SPARK-18905][STREAMING] Fix the issue of removing a fai...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16542 **[Test build #71288 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71288/testReport)** for PR 16542 at commit

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread imatiach-msft
Github user imatiach-msft commented on the issue: https://github.com/apache/spark/pull/16355 @jkbradley thanks, I've updated the code based on your latest comments - I removed k and the verification for the setters.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16355 Merged build finished. Test PASSed.

[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16355 **[Test build #71289 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71289/testReport)** for PR 16355 at commit

[GitHub] spark issue #15324: [SPARK-16872][ML] Gaussian Naive Bayes Classifier

2017-01-12 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/15324 @jkbradley What's your opinion on whether GNB should be a separate Classifier or a modeltype in the existing NB?

[GitHub] spark pull request #16565: [SPARK-17237][SQL] Remove backticks in a pivot re...

2017-01-12 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/16565 [SPARK-17237][SQL] Remove backticks in a pivot result schema ## What changes were proposed in this pull request? Pivoting adds backticks (e.g. 3_count(\`c\`)) in column names and, in some

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71282 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71282/testReport)** for PR 16566 at commit

[GitHub] spark issue #12064: [SPARK-14272][ML] Evaluate GaussianMixtureModel with Log...

2017-01-12 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/12064 @yanboliang Updated! Thanks for reviewing!

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/16547 Thanks for the feedback! Ah, sure, let me update accordingly.

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16467 Merged build finished. Test PASSed.

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71283/ Test PASSed. ---

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-12 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r95933009 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskDescription.scala --- @@ -52,7 +55,43 @@ private[spark] class TaskDescription( val

[GitHub] spark pull request #16564: [SPARK-19065][SQL]Don't inherit expression id in ...

2017-01-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16564#discussion_r95942576 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -898,11 +899,15 @@ class DatasetSuite extends QueryTest with

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71276/ Test PASSed. ---

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16564 **[Test build #71275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71275/testReport)** for PR 16564 at commit

[GitHub] spark issue #16564: [SPARK-19065][SS] Don't inherit expression id in dropDup...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16564 Discussed with @marmbrus offline and decided to not support using `df("col")` like `ds.dropDuplicates("_1").select(ds("_1").as[String], ds("_2").as[Int])`.

[GitHub] spark issue #16564: [SPARK-19065][SS] Don't inherit expression id in dropDup...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16564 Also cc @cloud-fan

[GitHub] spark issue #16556: [SPARK-19184][MLlib] Improve numerical stability for met...

2017-01-12 Thread tygert
Github user tygert commented on the issue: https://github.com/apache/spark/pull/16556 Maybe @hl475 could add his remark about how "we don't use LAPACK here since there is no pivoted QR in LAPACK that stops when the rank is exhausted" to the inline comments he highlighted?

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95910083 --- Diff: R/pkg/R/context.R --- @@ -91,6 +91,13 @@ objectFile <- function(sc, path, minPartitions = NULL) { #' will write it to disk and send the file

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95909536 --- Diff: R/pkg/R/context.R --- @@ -128,12 +135,15 @@ parallelize <- function(sc, coll, numSlices = 1) { objectSize <- object.size(coll)

[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16564 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71277/ Test PASSed. ---

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-12 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95916520 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555 @@ +/* +

[GitHub] spark pull request #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/16566 [SparkR]: add bisecting kmeans R wrapper ## What changes were proposed in this pull request? Add R wrapper for bisecting Kmeans. As JIRA is down, I will update title to link

[GitHub] spark issue #10498: [SPARK-12539][SQL] support writing bucketed table

2017-01-12 Thread infinitymittal
Github user infinitymittal commented on the issue: https://github.com/apache/spark/pull/10498 Hi, There is a limitation: "Can't insert bucketed data into existing hive tables." Are there any plans to relax it? I want to insert data using a query into an already

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15018 @zapletal-martin Pinging since you wrote the original PR: There's discussion here about whether IsotonicRegression should support negative weights. Is there a good reason to? I haven't seen

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Merged build finished. Test FAILed.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71280/testReport)** for PR 16566 at commit

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71280/ Test FAILed. ---

[GitHub] spark issue #16208: [WIP][SPARK-10849][SQL] Adds a new column metadata prope...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16208 @sureshthalamati Could you resolve the conflicts in both PRs? Thanks!

[GitHub] spark pull request #16514: [SPARK-19128] [SQL] Refresh Cache after Set Locat...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16514#discussion_r95928814 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -555,6 +557,61 @@ class HiveDDLSuite } }

[GitHub] spark pull request #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/16233#discussion_r95929443 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #16500: [SPARK-19120] [SPARK-19121] Refresh Metadata Cache After...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16500 Let me think whether we can improve the existing verification mechanism for both caches in the test cases. It can help us to know what the caches actually contain.

[GitHub] spark pull request #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter...

2017-01-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16481

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16481 I'll update JIRA once the service is back.

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16481 LGTM, merging to master! It conflicts with branch-2.1, can you send a new PR? thanks

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95939102 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -196,6 +196,12 @@ test_that("create DataFrame from RDD", { expect_equal(dtypes(df),

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16503 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71284/ Test PASSed. ---

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16503 Merged build finished. Test PASSed.

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16503 **[Test build #71284 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71284/testReport)** for PR 16503 at commit

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test PASSed.

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/15671 @jkbradley Updated. Thanks for reviewing!

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71286/ Test PASSed. ---

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95922004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16481 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71279/ Test PASSed. ---

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923248 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -307,54 +307,11 @@ object ScalaReflection extends

[GitHub] spark pull request #16500: [SPARK-19120] [SPARK-19121] Refresh Metadata Cach...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16500#discussion_r95929082 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -392,7 +392,9 @@ case class InsertIntoHiveTable(

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15671 **[Test build #71285 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71285/testReport)** for PR 15671 at commit

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16467 **[Test build #71283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71283/testReport)** for PR 16467 at commit

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #71286 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71286/testReport)** for PR 15505 at commit

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15671 **[Test build #71285 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71285/testReport)** for PR 15671 at commit

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15671 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71285/ Test PASSed. ---

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15671 Merged build finished. Test PASSed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #71291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71291/testReport)** for PR 16557 at commit

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16481 **[Test build #71279 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71279/testReport)** for PR 16481 at commit

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923927 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark pull request #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorith...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16355#discussion_r95928035 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -160,6 +162,17 @@ object KMeansSuite {

[GitHub] spark pull request #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorith...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16355#discussion_r95928023 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala --- @@ -51,6 +54,21 @@ class BisectingKMeansSuite

[GitHub] spark issue #16503: [SPARK-18113] Use ask to replace askWithRetry in canComm...

2017-01-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16503 @ash211 Thanks a lot for your comment. I've already fixed the failing Scala style tests. Running `./dev/scalastyle` passed. Could you give another look?
