[GitHub] spark issue #16500: [SPARK-19120] [SPARK-19121] Refresh Metadata Cache After...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16500 Let me think whether we can improve the existing verification mechanism for both caches in the test cases. It can help us to know what the caches actually contain. --- If your project is set

[GitHub] spark pull request #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/16233#discussion_r95929443 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #16500: [SPARK-19120] [SPARK-19121] Refresh Metadata Cach...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16500#discussion_r95929082 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala --- @@ -392,7 +392,9 @@ case class InsertIntoHiveTable(

[GitHub] spark issue #12064: [SPARK-14272][ML] Evaluate GaussianMixtureModel with Log...

2017-01-12 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/12064 @yanboliang Updated! Thanks for reviewing!

[GitHub] spark pull request #16514: [SPARK-19128] [SQL] Refresh Cache after Set Locat...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16514#discussion_r95928814 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -555,6 +557,61 @@ class HiveDDLSuite } }

[GitHub] spark pull request #16524: [SPARK-19110][MLLIB][FollowUP]: Add a unit test f...

2017-01-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16524

[GitHub] spark issue #15671: [SPARK-18206][ML]Add instrumentation for MLP,NB,LDA,AFT,...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15671 Thanks for the updates! For docConcentration and quantileProbabilities, I agree it could be problematic if these are too large. How about: * We don't log docConcentration since that

[GitHub] spark issue #16524: [SPARK-19110][MLLIB][FollowUP]: Add a unit test for test...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/16524 LGTM Merging with master Thanks!

[GitHub] spark pull request #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorith...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16355#discussion_r95928023 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/BisectingKMeansSuite.scala --- @@ -51,6 +54,21 @@ class BisectingKMeansSuite

[GitHub] spark pull request #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorith...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16355#discussion_r95928035 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/KMeansSuite.scala --- @@ -160,6 +162,17 @@ object KMeansSuite {

[GitHub] spark pull request #16546: [WIP][SQL] Put check in ExpressionEncoder.fromRow...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16546#discussion_r95927063 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -120,6 +120,32 @@ object ScalaReflection extends

[GitHub] spark issue #16208: [WIP][SPARK-10849][SQL] Adds a new column metadata prope...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16208 @sureshthalamati Could you resolve the conflicts in both PRs? Thanks!

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16565 okay!

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16565 Could you change the PR title to `[SPARK-17237][SQL][Backport-2.0] Remove backticks in a pivot result schema`

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16565 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71281/ Test FAILed.

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16565 Merged build finished. Test FAILed.

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16565 **[Test build #71281 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71281/consoleFull)** for PR 16565 at commit

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71282/ Test FAILed.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Merged build finished. Test FAILed.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71282/testReport)** for PR 16566 at commit

[GitHub] spark issue #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/16233 @yhuai I was trying to do that yesterday and found that something went wrong with JIRA; will try again later today.

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923927 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923580 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -307,54 +307,11 @@ object ScalaReflection extends

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95923248 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -307,54 +307,11 @@ object ScalaReflection extends

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16467 **[Test build #71283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71283/testReport)** for PR 16467 at commit

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95922004 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16481 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71279/ Test PASSed.

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16481 **[Test build #71279 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71279/testReport)** for PR 16481 at commit

[GitHub] spark issue #16549: [SPARK-19151][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-12 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16549 ok thanks~ I am working to find out the reason why it failed

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95921320 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71282 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71282/testReport)** for PR 16566 at commit

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Merged build finished. Test FAILed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71278/ Test FAILed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #71278 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71278/testReport)** for PR 16557 at commit

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Merged build finished. Test FAILed.

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71280/testReport)** for PR 16566 at commit

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16566 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71280/ Test FAILed.

[GitHub] spark issue #10498: [SPARK-12539][SQL] support writing bucketed table

2017-01-12 Thread infinitymittal
Github user infinitymittal commented on the issue: https://github.com/apache/spark/pull/10498 Hi, There is the limitation of "Can't insert bucketed data into existing hive tables.". Do we have any plans to relax the same? I want to insert data using a query into an already

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15018 @zapletal-martin Pinging since you wrote the original PR: There's discussion here about whether IsotonicRegression should support negative weights. Is there a good reason to? I haven't seen

[GitHub] spark issue #16565: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16565 **[Test build #71281 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71281/consoleFull)** for PR 16565 at commit

[GitHub] spark issue #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71280/testReport)** for PR 16566 at commit

[GitHub] spark pull request #16566: [SparkR]: add bisecting kmeans R wrapper

2017-01-12 Thread wangmiao1981
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/16566 [SparkR]: add bisecting kmeans R wrapper ## What changes were proposed in this pull request? Add R wrapper for bisecting Kmeans. As JIRA is down, I will update title to link

[GitHub] spark pull request #16565: [SPARK-17237][SQL] Remove backticks in a pivot re...

2017-01-12 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/16565 [SPARK-17237][SQL] Remove backticks in a pivot result schema ## What changes were proposed in this pull request? Pivoting adds backticks (e.g. 3_count(\`c\`)) in column names and, in some

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16547 Could you also update the JIRA and the PR description after your changes?

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16547 Discussed with @tdas offline. Regarding the changes to the append mode, it's unclear whether adding the filter is better, because it will apply the filter to all rows but there are usually only few

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-01-12 Thread ron8hu
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r95916520 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,555 @@ +/* +

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r95914526 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -312,90 +313,120 @@ class IsotonicRegression private

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r95914244 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -312,90 +313,120 @@ class IsotonicRegression private

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r95914074 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -312,90 +313,120 @@ class IsotonicRegression private

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2017-01-12 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r95914179 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -312,90 +313,120 @@ class IsotonicRegression private

[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16564 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71277/ Test PASSed.

[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16564 Merged build finished. Test PASSed.

[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16564 **[Test build #71277 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71277/testReport)** for PR 16564 at commit

[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates

2017-01-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/15945 @cloud-fan How about this fix?

[GitHub] spark issue #16547: [SPARK-19168][Structured Streaming] Improvement: filter ...

2017-01-12 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/16547 @zsxwing updated as per your comments; would you take another look?

[GitHub] spark issue #14812: [SPARK-17237][SQL] Remove backticks in a pivot result sc...

2017-01-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/14812 okay, thanks!

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2017-01-12 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15018 @neggert I'll ask @mengxr about the negative weights since he oversaw the original work here.

[GitHub] spark issue #16208: [WIP][SPARK-10849][SQL] Adds a new column metadata prope...

2017-01-12 Thread sureshthalamati
Github user sureshthalamati commented on the issue: https://github.com/apache/spark/pull/16208 @miketrewartha Yes, I am hoping one of the fixes for this issue will get merged. I proposed two solutions: this PR, and another one, https://github.com/apache/spark/pull/16209. Waiting

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95909368 --- Diff: R/pkg/R/context.R --- @@ -91,6 +91,13 @@ objectFile <- function(sc, path, minPartitions = NULL) { #' will write it to disk and send the file

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95909985 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -196,6 +196,12 @@ test_that("create DataFrame from RDD", { expect_equal(dtypes(df),

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95910083 --- Diff: R/pkg/R/context.R --- @@ -91,6 +91,13 @@ objectFile <- function(sc, path, minPartitions = NULL) { #' will write it to disk and send the file

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95909536 --- Diff: R/pkg/R/context.R --- @@ -128,12 +135,15 @@ parallelize <- function(sc, coll, numSlices = 1) { objectSize <- object.size(coll)

[GitHub] spark pull request #16541: [SPARK-19088][SQL] Optimize sequence type deseria...

2017-01-12 Thread michalsenkyr
Github user michalsenkyr commented on a diff in the pull request: https://github.com/apache/spark/pull/16541#discussion_r95909808 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -589,6 +590,171 @@ case class MapObjects

[GitHub] spark issue #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter should...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16481 **[Test build #71279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71279/testReport)** for PR 16481 at commit

[GitHub] spark pull request #16481: [SPARK-19092] [SQL] Save() API of DataFrameWriter...

2017-01-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16481#discussion_r95905076 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -413,10 +413,85 @@ case class DataSource(

[GitHub] spark issue #16556: [SPARK-19184][MLlib] Improve numerical stability for met...

2017-01-12 Thread tygert
Github user tygert commented on the issue: https://github.com/apache/spark/pull/16556 Maybe @hl475 could add his remark about how "we don't use LAPACK here since there is no pivoted QR in LAPACK that stops when the rank is exhausted" to the inline comments he highlighted?

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #71278 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71278/testReport)** for PR 16557 at commit

[GitHub] spark issue #16208: [WIP][SPARK-10849][SQL] Adds a new column metadata prope...

2017-01-12 Thread miketrewartha
Github user miketrewartha commented on the issue: https://github.com/apache/spark/pull/16208 @sureshthalamati Are you still planning on trying to get this merged soon? This would be a hugely useful feature for us!

[GitHub] spark issue #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/16233 @jiangxb1987 Once jira is back, let's create jiras to address follow-up issues (probably you have already done that before jira went down).

[GitHub] spark pull request #16233: [SPARK-18801][SQL] Support resolve a nested view

2017-01-12 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/16233#discussion_r95896683 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #16564: [SPARK-19065][SQL]Don't inherit expression id in dropDup...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16564 **[Test build #71277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71277/testReport)** for PR 16564 at commit

[GitHub] spark issue #16564: [SPARK-19065][SS] Don't inherit expression id in dropDup...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16564 Discussed with @marmbrus offline and decided to not support using `df("col")` like `ds.dropDuplicates("_1").select(ds("_1").as[String], ds("_2").as[Int])`.

[GitHub] spark issue #16564: [SPARK-19065][SS] Don't inherit expression id in dropDup...

2017-01-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16564 Also cc @cloud-fan

[GitHub] spark issue #16346: [SPARK-16654][CORE] Add UI coverage for Application Leve...

2017-01-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16346 @ajbozarth whoops, I hadn't noticed the extra "Blacklisted" column in the summary table at the top -- I was thinking we'd add another row for blacklisted executors. But I actually think the current

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16564 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71275/ Test PASSed.

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16564 Merged build finished. Test PASSed.

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16564 **[Test build #71275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71275/testReport)** for PR 16564 at commit

[GitHub] spark issue #16559: [WIP] Add expression index and test cases

2017-01-12 Thread aray
Github user aray commented on the issue: https://github.com/apache/spark/pull/16559 It can already be done with the `posexplode` UDTF like
```
with t as (values (array(1,2,3)), (array(4,5,6)) as (a))
select col from t lateral view posexplode(a) tt where pos = 2
```
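For readers unfamiliar with `posexplode`, here is a minimal pure-Python sketch of what the query in the comment above computes. It assumes 0-based positions, as in Spark SQL's `posexplode`, and does not use Spark at all; the helper names `posexplode_rows` and `element_at_position` are hypothetical, chosen only for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def posexplode_rows(array: Sequence[int]) -> Iterator[Tuple[int, int]]:
    """Yield one (pos, col) pair per array element, with pos starting at 0.

    This only imitates the shape of posexplode's output; it is not Spark code.
    """
    return enumerate(array)


def element_at_position(rows: List[Sequence[int]], pos: int) -> List[int]:
    """Emulate: select col from t lateral view posexplode(a) tt where pos = <pos>."""
    return [
        col
        for array in rows                      # each row of table t, column a
        for p, col in posexplode_rows(array)   # lateral view: one output row per element
        if p == pos                            # the where clause
    ]


# The two rows from the comment's inline table.
rows = [[1, 2, 3], [4, 5, 6]]
result = element_at_position(rows, 2)
print(result)  # [3, 6]
```

Rows whose array is shorter than the requested position simply contribute nothing in this sketch, because no (pos, col) pair matches the filter.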

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16564 Hmm, I'm not sure that I agree with the solution from #15427. I do not think that it should be valid to have two different expressions that have the same expression id. There are many cases where

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16512 Merged build finished. Test PASSed.

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71276/ Test PASSed.

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16512 **[Test build #71276 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71276/testReport)** for PR 16512 at commit

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71272/ Test FAILed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16557 Merged build finished. Test FAILed.

[GitHub] spark issue #16557: [SPARK-18693][ML][MLLIB][WIP] ML Evaluators should use w...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16557 **[Test build #71272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71272/testReport)** for PR 16557 at commit

[GitHub] spark issue #16473: [SPARK-19069] [CORE] Expose task 'status' and 'duration'...

2017-01-12 Thread paragpc
Github user paragpc commented on the issue: https://github.com/apache/spark/pull/16473 cc @zsxwing, @vanzin

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16512 We probably need to add the getNumPartition to complement this...

[GitHub] spark issue #16346: [SPARK-16654][CORE] Add UI coverage for Application Leve...

2017-01-12 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/16346 I haven't been able to take a detailed look at the code, but all of my UI concerns seemed to have been addressed in the previous pr. I am wondering what you mean by having the summary table list

[GitHub] spark pull request #16512: [SPARK-18335][SPARKR] createDataFrame to support ...

2017-01-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16512#discussion_r95874458 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -196,6 +196,12 @@ test_that("create DataFrame from RDD", { expect_equal(dtypes(df),

[GitHub] spark issue #16564: [SPARK-19065][SS]Rewrite Alias in StreamExecution if nec...

2017-01-12 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/16564 LGTM as long as we decide to preserve #15427. The root cause of this issue is still the `df("col")` syntax, which is the motivation behind #15427. We decided not to deprecate/remove this

[GitHub] spark issue #16523: [SPARK-19142][SparkR]:spark.kmeans should take seed, ini...

2017-01-12 Thread wangmiao1981
Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/16523 @felixcheung I will review these items after wrapping up my current work. I am currently working on two items: bug 18011 and bisecting kmeans. Bisecting kmeans should be ready soon. Bug

[GitHub] spark issue #16527: [SPARK-19146][Core]Drop more elements when stageData.tas...

2017-01-12 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/16527 My only concern would be users misunderstanding how `spark.ui.retainedTasks` works from its documentation. But I agree that this is a good change to reduce friction, and updating all the retained

[GitHub] spark issue #16249: [SPARK-18828][SPARKR] Refactor scripts for R

2017-01-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16249 @shivaram how about it? ;)

[GitHub] spark pull request #16550: [SPARK-19178][SQL] convert string of large number...

2017-01-12 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/16550#discussion_r95871812 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -835,6 +835,187 @@ public UTF8String

[GitHub] spark issue #16555: [SPARK-19180][SQL] the offset of short should be 4 in Of...

2017-01-12 Thread aray
Github user aray commented on the issue: https://github.com/apache/spark/pull/16555 The title should say 2.

[GitHub] spark issue #16523: [SPARK-19142][SparkR]:spark.kmeans should take seed, ini...

2017-01-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16523 LGTM. It would probably be a good idea to review the vignettes, the programming guide, and the R examples to see if there is anything to add (there might not be - we don't want to overload

[GitHub] spark issue #16512: [SPARK-18335][SPARKR] createDataFrame to support numPart...

2017-01-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16512 **[Test build #71276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71276/testReport)** for PR 16512 at commit

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @holdenk Indeed. Not the most fortunate moment for making a bunch of connected PRs :)

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16534 @holdenk I don't think it should go to the point release at all (same as https://github.com/apache/spark/pull/16533 which, depending on the resolution, may introduce new functionality or breaking

[GitHub] spark issue #16534: [SPARK-19161][PYTHON][SQL] Improving UDF Docstrings

2017-01-12 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16534 It's a bit hard to follow up with those during the JIRA maintenance window - I'll follow up after JIRA comes back online :)
