[GitHub] spark pull request #14452: [SPARK-16849][SQL] Improve subquery execution by ...

2016-09-01 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/14452#discussion_r77123756 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -89,6 +90,8 @@ abstract class Optimizer(sessionCatalog:

[GitHub] spark issue #14765: [SPARK-15815] Keeping tell yarn the target executors in ...

2016-09-01 Thread suyanNone
Github user suyanNone commented on the issue: https://github.com/apache/spark/pull/14765 jenkins retest. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-09-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r77124744 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -590,8 +590,12 @@ class SQLBuilder private ( object E

[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14452 **[Test build #64758 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64758/consoleFull)** for PR 14452 at commit [`e9b0952`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14883: [SPARK-17319] [SQL] Move addJar from HiveSessionState to...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14883 **[Test build #64767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64767/consoleFull)** for PR 14883 at commit [`ad37055`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14452 Merged build finished. Test PASSed.

[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14452 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64758/ Test PASSed.

[GitHub] spark pull request #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14433#discussion_r77125115 --- Diff: core/src/main/scala/org/apache/spark/internal/Logging.scala --- @@ -135,7 +136,12 @@ private[spark] trait Logging { val replLevel = Opti

[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/14712 Maybe I found another bug in the master branch? When calculating statistics for data source tables, we do not exclude the staging directory. However, we exclude them when `AnalyzeTableCom

[GitHub] spark pull request #14883: [SPARK-17319] [SQL] Move addJar from HiveSessionS...

2016-09-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14883#discussion_r77125955 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -69,6 +71,29 @@ private[sql] class SharedState(val sparkContext:

[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14659 **[Test build #64757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64757/consoleFull)** for PR 14659 at commit [`ae42093`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14659 Merged build finished. Test PASSed.

[GitHub] spark issue #14659: [SPARK-16757] Set up Spark caller context to HDFS

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64757/ Test PASSed.

[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13584 **[Test build #64765 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64765/consoleFull)** for PR 13584 at commit [`1701252`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13584 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64765/ Test PASSed.

[GitHub] spark issue #13584: [SPARK-15509][ML][SparkR] R MLlib algorithms should supp...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13584 Merged build finished. Test PASSed.

[GitHub] spark pull request #14868: [SPARK-16283][SQL] Implements percentile_approx a...

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/14868#discussion_r77126967 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -0,0 +1,321 @@ +/* + *

[GitHub] spark pull request #14868: [SPARK-16283][SQL] Implements percentile_approx a...

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/14868#discussion_r77127003 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala --- @@ -0,0 +1,321 @@ +/* + *

[GitHub] spark pull request #14913: [SPARK-17358][SQL] Cached table(parquet/orc) shou...

2016-09-01 Thread watermen
GitHub user watermen opened a pull request: https://github.com/apache/spark/pull/14913 [SPARK-17358][SQL] Cached table(parquet/orc) should be shard between beelines ## What changes were proposed in this pull request? Cached table(parquet/orc) couldn't be shard between beelines,

[GitHub] spark pull request #14901: [SPARK-17347][SQL][Examples]Encoder in Dataset ex...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14901#discussion_r77127538 --- Diff: examples/src/main/scala/org/apache/spark/examples/sql/SparkSQLExample.scala --- @@ -203,7 +203,7 @@ object SparkSQLExample { // No pre-defi

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread lw-lin
GitHub user lw-lin opened a pull request: https://github.com/apache/spark/pull/14914 [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) instead of ArrayBuffer.append(A) in performance critical paths ## What changes were proposed in this pull request? We should generally use `A

[GitHub] spark issue #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) instead ...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14914 **[Test build #64769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64769/consoleFull)** for PR 14914 at commit [`fba1c5e`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #14913: [SPARK-17358][SQL] Cached table(parquet/orc) should be s...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14913 **[Test build #64768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64768/consoleFull)** for PR 14913 at commit [`fc93356`](https://github.com/apache/spark/commit/f

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread lw-lin
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77128029 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new Array

[GitHub] spark issue #14911: [SPARK-17355] Workaround for HIVE-14684 / HiveResultSetM...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14911 **[Test build #64761 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64761/consoleFull)** for PR 14911 at commit [`6b56880`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14911: [SPARK-17355] Workaround for HIVE-14684 / HiveResultSetM...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14911 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64761/ Test PASSed.

[GitHub] spark issue #14911: [SPARK-17355] Workaround for HIVE-14684 / HiveResultSetM...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14911 Merged build finished. Test PASSed.

[GitHub] spark issue #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) instead ...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14914 **[Test build #64770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64770/consoleFull)** for PR 14914 at commit [`980a3a4`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #14910: [SPARK-17271] [SQL] Remove redundant `semanticEquals()` ...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14910 **[Test build #64760 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64760/consoleFull)** for PR 14910 at commit [`56eb557`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14910: [SPARK-17271] [SQL] Remove redundant `semanticEquals()` ...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14910 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64760/ Test PASSed.

[GitHub] spark issue #14910: [SPARK-17271] [SQL] Remove redundant `semanticEquals()` ...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14910 Merged build finished. Test PASSed.

[GitHub] spark issue #14823: [SPARK-17257][SQL] the physical plan of CREATE TABLE or ...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14823 **[Test build #64762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64762/consoleFull)** for PR 14823 at commit [`52a40d9`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14823: [SPARK-17257][SQL] the physical plan of CREATE TABLE or ...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14823 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64762/ Test PASSed.

[GitHub] spark issue #14823: [SPARK-17257][SQL] the physical plan of CREATE TABLE or ...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14823 Merged build finished. Test PASSed.

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77129894 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new Array

[GitHub] spark issue #14640: [SPARK-17055] [MLLIB] add labelKFold to CrossValidator

2016-09-01 Thread VinceShieh
Github user VinceShieh commented on the issue: https://github.com/apache/spark/pull/14640 Updates: 1. code refactoring. Rename the API to align with Sklearn changes 2. add implementation in CrossValidator

[GitHub] spark issue #14892: [SPARK-17329] [BUILD] Don't build PRs with -Pyarn unless...

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14892 Merged to master

[GitHub] spark pull request #14892: [SPARK-17329] [BUILD] Don't build PRs with -Pyarn...

2016-09-01 Thread srowen
Github user srowen closed the pull request at: https://github.com/apache/spark/pull/14892

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread lw-lin
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77130880 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new Array

[GitHub] spark issue #14891: [SQL][DOC][MINOR] Add (Scala-specific) and (Java-specifi...

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14891 OK fair enough LGTM

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread lw-lin
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77131280 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new Array

[GitHub] spark issue #14531: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SQL] Fix mult...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14531 **[Test build #64763 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64763/consoleFull)** for PR 14531 at commit [`4bcb306`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14908: [WEBUI][SPARK-17352]Executor computing time can be negat...

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14908 LGTM

[GitHub] spark issue #14531: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SQL] Fix mult...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64763/ Test PASSed.

[GitHub] spark issue #14531: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SQL] Fix mult...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14531 Merged build finished. Test PASSed.

[GitHub] spark issue #14900: [WEBUI][SPARK-17342] Style of event timeline is broken

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14900 LGTM. In light of this change, was https://github.com/apache/spark/pull/14791 necessary, or at least still a valid change?

[GitHub] spark pull request #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14597#discussion_r77131705 --- Diff: python/pyspark/mllib/feature.py --- @@ -276,24 +276,64 @@ class ChiSqSelector(object): """ Creates a ChiSquared feature selector.

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14863 I don't know if performance is important here. I'd rather either batch this together with other changes that make this change consistently or drop this one.

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14567 OK I'd leave these changes to Python people like @davies @holdenk @MLnick to comment on from here. I think style changes can be OK if they're consistent, enforceable, and moving the code towards more

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77132904 --- Diff: docs/ml-features.md --- @@ -1102,7 +1102,8 @@ for more details on the API. ## QuantileDiscretizer `QuantileDiscretizer` takes a colum

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77133636 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala --- @@ -114,10 +115,10 @@ final class QuantileDiscretizer @Since("1.6.0")

[GitHub] spark issue #14868: [SPARK-16283][SQL] Implements percentile_approx aggregat...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14868 LGTM, merging to master!

[GitHub] spark issue #14877: fixed typos

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14877 Merged to master

[GitHub] spark pull request #14868: [SPARK-16283][SQL] Implements percentile_approx a...

2016-09-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14868

[GitHub] spark pull request #14877: fixed typos

2016-09-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14877

[GitHub] spark pull request #14531: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SQL] F...

2016-09-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14531

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread VinceShieh
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77134887 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala --- @@ -114,10 +115,10 @@ final class QuantileDiscretizer @Since("1.6

[GitHub] spark issue #14531: [SPARK-17353] [SPARK-16943] [SPARK-16942] [SQL] Fix mult...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14531 merging to master! @gatorsmile can you send a new PR to backport it to 2.0? thanks!

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77135297 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala --- @@ -114,10 +115,10 @@ final class QuantileDiscretizer @Since("1.6.0")

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77135576 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new Array

[GitHub] spark issue #14823: [SPARK-17257][SQL] the physical plan of CREATE TABLE or ...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14823 thanks for the review, merging to master!

[GitHub] spark pull request #14823: [SPARK-17257][SQL] the physical plan of CREATE TA...

2016-09-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14823

[GitHub] spark issue #14910: [SPARK-17271] [SQL] Remove redundant `semanticEquals()` ...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14910 thanks, merging to master!

[GitHub] spark pull request #14910: [SPARK-17271] [SQL] Remove redundant `semanticEqu...

2016-09-01 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14910

[GitHub] spark issue #14515: [SPARK-16926] [SQL] Remove partition columns from partit...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14515 ok to test

[GitHub] spark pull request #14515: [SPARK-16926] [SQL] Remove partition columns from...

2016-09-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14515#discussion_r77136669 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala --- @@ -162,7 +162,13 @@ private[hive] case class MetastoreRelation(

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14863 I agree. I would prefer if Spark examples also "promote" the good practice of Python, i.e., replacing 'map' and 'filter' by list or map comprehension ('reduce' has no equivalent on comprehension), e
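
The preference voiced in this comment can be illustrated with a minimal Python sketch. The sample list and variable names are hypothetical and not taken from the Spark examples or from the pull request, and reading "map comprehension" as a dict comprehension is an assumption.

```python
from functools import reduce

# Hypothetical sample data, chosen only for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# map/filter style: square the even numbers.
squares_functional = list(
    map(lambda x: x * x, filter(lambda x: x % 2 == 0, numbers))
)

# Equivalent list comprehension: one expression, no lambdas.
squares_of_evens = [x * x for x in numbers if x % 2 == 0]

# Dict comprehension (assumed meaning of "map comprehension").
square_by_number = {x: x * x for x in numbers if x % 2 == 0}

# reduce has no comprehension counterpart, so it stays a function call.
total = reduce(lambda acc, x: acc + x, squares_of_evens, 0)
```

Both forms produce the same list; the comprehension simply folds the filter condition and the mapping expression into a single construct.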

[GitHub] spark issue #14912: [SPARK-17357][SQL] Simplified predicates should be pushe...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14912 **[Test build #64766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64766/consoleFull)** for PR 14912 at commit [`9e1c315`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14912: [SPARK-17357][SQL] Simplified predicates should be pushe...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14912 Merged build finished. Test PASSed.

[GitHub] spark issue #14912: [SPARK-17357][SQL] Simplified predicates should be pushe...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14912 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64766/ Test PASSed.

[GitHub] spark issue #14863: [SPARK-16992][PYSPARK] use map comprehension in doc

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14863 OK well I'd leave it to people here with more taste to agree about what's canonical but I take your word for it. I'm mostly interested in consistency if anything.

[GitHub] spark issue #14515: [SPARK-16926] [SQL] Remove partition columns from partit...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14515 **[Test build #64771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64771/consoleFull)** for PR 14515 at commit [`fd37123`](https://github.com/apache/spark/commit/f

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-09-01 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r77137539 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala --- @@ -52,7 +52,8 @@ case class LogicalRelation(

[GitHub] spark pull request #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector...

2016-09-01 Thread mpjlu
Github user mpjlu commented on a diff in the pull request: https://github.com/apache/spark/pull/14597#discussion_r77137991 --- Diff: python/pyspark/mllib/feature.py --- @@ -276,24 +276,64 @@ class ChiSqSelector(object): """ Creates a ChiSquared feature selector.

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread VinceShieh
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77138037 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala --- @@ -114,10 +115,10 @@ final class QuantileDiscretizer @Since("1.6

[GitHub] spark pull request #14915: [SPARK-17356][SQL] Fix out of memory issue when c...

2016-09-01 Thread clockfly
GitHub user clockfly opened a pull request: https://github.com/apache/spark/pull/14915 [SPARK-17356][SQL] Fix out of memory issue when calling TreeNode.toJSON ## What changes were proposed in this pull request? class `org.apache.spark.sql.types.Metadata` is widely used in ml

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread VinceShieh
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77138983 --- Diff: docs/ml-features.md --- @@ -1102,7 +1102,8 @@ for more details on the API. ## QuantileDiscretizer `QuantileDiscretizer` takes a c

[GitHub] spark issue #14915: [SPARK-17356][SQL][WIP] Fix out of memory issue when gen...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14915 **[Test build #64772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64772/consoleFull)** for PR 14915 at commit [`368e097`](https://github.com/apache/spark/commit/3

[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-09-01 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14712#discussion_r77139470 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala --- @@ -88,24 +85,53 @@ case class AnalyzeTableCommand(tabl

[GitHub] spark pull request #14915: [SPARK-17356][SQL][WIP] Fix out of memory issue w...

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request: https://github.com/apache/spark/pull/14915#discussion_r77139596 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala --- @@ -617,7 +618,9 @@ abstract class TreeNode[BaseType <: TreeNo

[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-09-01 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/14712 @gatorsmile Yes, we should exclude the staging dir.

[GitHub] spark issue #14151: [SPARK-16496][SQL] Add wholetext as option for reading t...

2016-09-01 Thread ScrapCodes
Github user ScrapCodes commented on the issue: https://github.com/apache/spark/pull/14151 Thanks @gatorsmile. I was actually wondering where I can document this option.

[GitHub] spark issue #14915: [SPARK-17356][SQL][WIP] Fix out of memory issue when gen...

2016-09-01 Thread clockfly
Github user clockfly commented on the issue: https://github.com/apache/spark/pull/14915 @mengxr @yhuai, comments?

[GitHub] spark issue #14151: [SPARK-16496][SQL] Add wholetext as option for reading t...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14151 **[Test build #64773 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64773/consoleFull)** for PR 14151 at commit [`8ac37c1`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #14873: [SPARK-17308]Improved the spark core code by replacing a...

2016-09-01 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/14873 From my understanding, it is more a personal preference than a code style issue. We may change the code for now, but how can we guarantee that other people will not use pattern matching in the future?

[GitHub] spark issue #14873: [SPARK-17308]Improved the spark core code by replacing a...

2016-09-01 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14873 Although I'm slightly positive on it, I would not merge if there are two slightly negative reviews. I think it's a bit more than style preference, but not much more. Is there ever a benefit to patter

[GitHub] spark issue #14883: [SPARK-17319] [SQL] Move addJar from HiveSessionState to...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14883 **[Test build #64767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64767/consoleFull)** for PR 14883 at commit [`ad37055`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14883: [SPARK-17319] [SQL] Move addJar from HiveSessionState to...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14883 Merged build finished. Test PASSed.

[GitHub] spark issue #14883: [SPARK-17319] [SQL] Move addJar from HiveSessionState to...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14883 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64767/ Test PASSed.

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77141646 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SortOrder.scala --- @@ -26,21 +26,40 @@ import org.apache.spark.util.col

[GitHub] spark pull request #14914: [SPARK-17359][SQL][MLLib] Use ArrayBuffer.+=(A) i...

2016-09-01 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14914#discussion_r77141688 --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Matrices.scala --- @@ -999,7 +999,7 @@ object Matrices { val data = new ArrayBu

[GitHub] spark pull request #14858: [SPARK-17219][ML] Add NaN value handling in Bucke...

2016-09-01 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14858#discussion_r77141804 --- Diff: docs/ml-features.md --- @@ -1102,7 +1102,8 @@ for more details on the API. ## QuantileDiscretizer `QuantileDiscretizer` takes a colum

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77141840 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -324,7 +324,7 @@ queryPrimary ; sortItem

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77142212 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -1204,9 +1204,29 @@ class AstBuilder extends SqlBaseBa

[GitHub] spark issue #14873: [SPARK-17308]Improved the spark core code by replacing a...

2016-09-01 Thread shiv4nsh
Github user shiv4nsh commented on the issue: https://github.com/apache/spark/pull/14873 @jerryshao: AFAIK it is always better not to use pattern matching on a boolean, and it reduces the bytecode too. You can take a look here: http://stackoverflow.com/questions/9266822/pa
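
For readers following the SPARK-17308 thread, here is a minimal sketch of the two styles being compared. It is not taken from the pull request; `BooleanBranching`, `describeWithMatch`, and `describeWithIf` are hypothetical names chosen for illustration, and the sketch only shows that the two forms behave the same. It makes no claim about the bytecode size mentioned in the comment.

```scala
// Illustrative only: names below are hypothetical, not from the Spark code base.
object BooleanBranching {
  // Style under discussion: pattern matching on a Boolean value.
  def describeWithMatch(flag: Boolean): String = flag match {
    case true  => "enabled"
    case false => "disabled"
  }

  // Proposed replacement: a plain if/else expression.
  def describeWithIf(flag: Boolean): String =
    if (flag) "enabled" else "disabled"

  def main(args: Array[String]): Unit = {
    // Both forms return the same result for every Boolean input.
    for (flag <- Seq(true, false)) {
      assert(describeWithMatch(flag) == describeWithIf(flag))
    }
    println(describeWithIf(true))
  }
}
```

Since both functions are equivalent, the choice between them comes down to readability and whatever the compiler emits, which is the point the reviewers in this thread disagree on.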

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77142563 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala --- @@ -1204,9 +1204,29 @@ class AstBuilder extends SqlBaseBa

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77143025 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SortPrefixUtils.scala --- @@ -40,29 +40,64 @@ object SortPrefixUtils { def g

[GitHub] spark pull request #14842: [SPARK-10747][SQL] Support NULLS FIRST|LAST claus...

2016-09-01 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/14842#discussion_r77143047 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SortPrefixUtils.scala --- @@ -89,7 +124,8 @@ object SortPrefixUtils { * Returns w

[GitHub] spark issue #14913: [SPARK-17358][SQL] Cached table(parquet/orc) should be s...

2016-09-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14913 **[Test build #64768 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64768/consoleFull)** for PR 14913 at commit [`fc93356`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14913: [SPARK-17358][SQL] Cached table(parquet/orc) should be s...

2016-09-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14913 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64768/ Test PASSed.
