[GitHub] spark issue #15492: [DO NOT MERGE][TEST] Testing flakiness of StreamingQuery...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15492 **[Test build #67057 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67057/consoleFull)** for PR 15492 at commit

[GitHub] spark pull request #15497: [Test][SPARK-16002][Follow-up] Fix flaky test in ...

2016-10-17 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15497#discussion_r83585137 --- Diff: core/src/main/scala/org/apache/spark/util/ManualClock.scala --- @@ -27,6 +27,7 @@ package org.apache.spark.util private[spark] class

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15316 **[Test build #67058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67058/consoleFull)** for PR 15316 at commit

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15316 Also cc @cloud-fan and @liancheng --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15316 cc @hvanhovell @rxin Any more comments about this PR? I assume Spark 2.0.2 needs it. Recently, when we were analyzing the JIRA https://issues.apache.org/jira/browse/SPARK-17709, we are

[GitHub] spark issue #15508: [DO-NOT-MERGE]

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15508 Merged build finished. Test FAILed.

[GitHub] spark issue #15508: [DO-NOT-MERGE]

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15508 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67059/ Test FAILed. ---

[GitHub] spark issue #12761: [SPARK-14464] [MLLIB] Better support for logistic regres...

2016-10-17 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/12761 I'm benchmarking LOR with 14M features on an internal company dataset (unfortunately, it's not public). Regarding using a sparse data structure for aggregation, I'm not so sure how much this

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15511 BTW, I guess per-line JSON also complies with a standard - https://tools.ietf.org/html/rfc7159#section-4. We should add a test, fix the title to summarise what the PR proposes and fill the PR

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15316 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67058/ Test PASSed. ---

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread codlife
Github user codlife commented on the issue: https://github.com/apache/spark/pull/15511 Compilation is OK, but when we call show(), we get a _corrupt_record; besides, when we call select on this df, we get an exception.

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15316 Merged build finished. Test PASSed.

[GitHub] spark pull request #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query i...

2016-10-17 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/15502

[GitHub] spark issue #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query in CTAS ...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15502 Thanks! Close it now.

[GitHub] spark pull request #15510: new

2016-10-17 Thread codlife
Github user codlife closed the pull request at: https://github.com/apache/spark/pull/15510

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11119 **[Test build #67051 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67051/consoleFull)** for PR 11119 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11119 Merged build finished. Test PASSed.

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15481 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67054/ Test FAILed. ---

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15511 I guess it'd be nicer if this PR resembled https://github.com/apache/spark/pull/14151 The suggested change is to read each JSON object per file, for which I guess we can share some code in the

[GitHub] spark issue #15302: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-10-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15302 Hi, @hvanhovell . When using `Expression`, I faced two situations. - `checkAnalysis` raises exceptions because the column is unresolved, e.g., `country` is unresolved. - As a

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67052/ Test FAILed. ---

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15511 OK, I think in both cases "standard" JSON is read, and in both cases, each record is a JSON document. These aren't different cases. If you mean to read small JSON files as records, you just use

[GitHub] spark issue #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query in CTAS ...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15502 cc @yhuai @hvanhovell @cloud-fan I guess this needs to be merged to 2.0.2 ASAP?

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15316 LGTM, cc @hvanhovell @rxin for final sign-off

[GitHub] spark issue #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query in CTAS ...

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15502 LGTM, merging to 2.0!

[GitHub] spark issue #13780: [SPARK-16063][SQL] Add storageLevel to Dataset

2016-10-17 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/13780 @marmbrus thanks for merging this. For me there is still an open question around handling of deser storage levels on the PySpark side (see my comments

[GitHub] spark pull request #15509: Merge pull request #1 from apache/master

2016-10-17 Thread someorz
GitHub user someorz opened a pull request: https://github.com/apache/spark/pull/15509 Merge pull request #1 from apache/master ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested?

[GitHub] spark issue #15508: [DO-NOT-MERGE]

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15508 **[Test build #67059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67059/consoleFull)** for PR 15508 at commit

[GitHub] spark pull request #15510: new

2016-10-17 Thread codlife
GitHub user codlife opened a pull request: https://github.com/apache/spark/pull/15510 new ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread codlife
Github user codlife commented on the issue: https://github.com/apache/spark/pull/15511 In a standard JSON file, a multi-line JSON object is legal, but currently we can only load single-line JSON objects directly.
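
To illustrate the distinction raised in this comment, here is a minimal sketch in plain Python (standard library only, not Spark's reader). It assumes a hypothetical sample record and a hypothetical helper, `parse_per_line`, that treats every line as its own JSON document, which is roughly how a line-oriented reader behaves. A pretty-printed object fails line by line, while the same text parses fine as one document.

```python
import json

# A pretty-printed JSON object spanning several lines (hypothetical sample data).
multi_line_doc = '{\n  "name": "spark",\n  "stars": 1\n}'

# The same record written on a single line, as in line-delimited JSON.
single_line_doc = '{"name": "spark", "stars": 1}'


def parse_per_line(text):
    """Parse each non-empty line as its own JSON document.

    Returns (records, corrupt_lines). Lines that are not valid JSON on
    their own are collected instead of raising.
    """
    records, corrupt = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            corrupt.append(line)
    return records, corrupt


# Line-by-line parsing yields no records for the multi-line object ...
print(parse_per_line(multi_line_doc))
# ... but works for the single-line form.
print(parse_per_line(single_line_doc))
# Parsing the whole text as one document handles the multi-line form.
print(json.loads(multi_line_doc))
```

In this sketch the fragments of the multi-line object end up in the corrupt list, which is analogous to (though not the same mechanism as) the `_corrupt_record` behaviour mentioned elsewhere in this thread.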

[GitHub] spark pull request #15512: The SerializerInstance instance used when deseria...

2016-10-17 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15512 The SerializerInstance instance used when deserializing a TaskResult is not reused ## What changes were proposed in this pull request? The following code is called when the DirectTaskResult

[GitHub] spark issue #15274: [SPARK-17699] Support for parsing JSON string columns

2016-10-17 Thread DanielMe
Github user DanielMe commented on the issue: https://github.com/apache/spark/pull/15274 Is there any workaround I can use to achieve a similar effect in 1.6?

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/15481 LGTM, sorry for bringing in the deadlock issue.

[GitHub] spark issue #15509: Merge pull request #1 from apache/master

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15509 Can one of the admins verify this patch?

[GitHub] spark pull request #15509: Merge pull request #1 from apache/master

2016-10-17 Thread someorz
Github user someorz closed the pull request at: https://github.com/apache/spark/pull/15509

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15481 Merged build finished. Test FAILed.

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15481 **[Test build #67054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67054/consoleFull)** for PR 15481 at commit

[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15423 **[Test build #67060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67060/consoleFull)** for PR 15423 at commit

[GitHub] spark issue #15505: [WIP][SPARK-17931]taskScheduler has some unneeded serial...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67062/consoleFull)** for PR 15505 at commit

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/11119 Please remove `WIP` in the description.

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15511 I don't quite understand this -- what does "standard" mean? This still doesn't load a 'standard JSON' file.

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15316 **[Test build #67058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67058/consoleFull)** for PR 15316 at commit

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15503 Merged to master/2.0

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #67052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67052/consoleFull)** for PR 15376 at commit

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Merged build finished. Test FAILed.

[GitHub] spark pull request #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query i...

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15502#discussion_r83586555 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -510,7 +510,7 @@ private[hive] case class

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67055/ Test PASSed. ---

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15316 LGTM

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15511 Can one of the admins verify this patch?

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15376 `KafkaSourceSuite` failure seems to be irrelevant. ``` [info] KafkaSourceSuite: [info] - cannot stop Kafka stream (1 minute, 1 second) [info] - subscribing topic by name from

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15376 Retest this please.

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r83600176 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +312,20 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15148 **[Test build #67055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67055/consoleFull)** for PR 15148 at commit

[GitHub] spark issue #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15148 Merged build finished. Test PASSed.

[GitHub] spark issue #15508: [DO-NOT-MERGE]

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15508 cc @rxin

[GitHub] spark pull request #15508: [DO-NOT-MERGE]

2016-10-17 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15508 [DO-NOT-MERGE] You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark api-backport Alternatively you can review and apply these

[GitHub] spark issue #15508: [DO-NOT-MERGE]

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15508 **[Test build #67059 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67059/consoleFull)** for PR 15508 at commit

[GitHub] spark issue #15505: [SPARK-17931]taskScheduler has some unneeded serializati...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67062/consoleFull)** for PR 15505 at commit

[GitHub] spark issue #15505: [SPARK-17931]taskScheduler has some unneeded serializati...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test FAILed.

[GitHub] spark issue #15505: [SPARK-17931]taskScheduler has some unneeded serializati...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67062/ Test FAILed. ---

[GitHub] spark issue #15512: The SerializerInstance instance used when deserializing ...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15512 **[Test build #67063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67063/consoleFull)** for PR 15512 at commit

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #67064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67064/consoleFull)** for PR 15376 at commit

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15512 Hm, if the benchmark you give generalizes much that is certainly compelling. I guess I'm surprised that instantiating the object can be so expensive relative to deserialization since it just happens

[GitHub] spark issue #15467: [SPARK-17912][SQL] Refactor code generation to get data ...

2016-10-17 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15467 Will do On Sun, Oct 16, 2016, 11:35 PM Kazuaki Ishizaki wrote: > @ericl , could you please review this? cc > @davies

[GitHub] spark pull request #15502: [SPARK-17892] [SQL] [2.0] Do Not Optimize Query i...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15502#discussion_r83587470 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -510,7 +510,7 @@ private[hive] case class

[GitHub] spark issue #15509: Merge pull request #1 from apache/master

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15509 Could you please close this? It seems mistakenly opened. @someorz

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11119 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67051/ Test PASSed. ---

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #67061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67061/consoleFull)** for PR 14136 at commit

[GitHub] spark pull request #15511: [SPARK-17969]I think it's user unfriendly to proc...

2016-10-17 Thread codlife
GitHub user codlife opened a pull request: https://github.com/apache/spark/pull/15511 [SPARK-17969]I think it's user unfriendly to process standard json file with DataFrame ## What changes were proposed in this pull request? Currently, with DataFrame API, we can't load

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r83599465 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +312,20 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15512 ok to test

[GitHub] spark pull request #15503: Fix example of tf_idf with minDocFreq

2016-10-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15503

[GitHub] spark issue #15495: [SPARK-17620][SQL] Determine Serde by hive.default.filef...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15495 **[Test build #67097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67097/consoleFull)** for PR 15495 at commit

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15421 Thanks - the lines in [3], [4] will be called if we do any operation on the DataFrame. i.e. something like `dim(c)`. Also can we use the same test case that is in the test file checked in ?

[GitHub] spark issue #15428: [SPARK-17219][ML] enchanced NaN value handling in Bucket...

2016-10-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15428 Thanks! I'll take a look. Could you please fix the typo in the title? "enchanced" -> "enhanced"

[GitHub] spark issue #15266: [SPARK-17693] [SQL] Fixed Insert Failure To Data Source ...

2016-10-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15266 retest this please

[GitHub] spark issue #15266: [SPARK-17693] [SQL] Fixed Insert Failure To Data Source ...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15266 **[Test build #67098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67098/consoleFull)** for PR 15266 at commit

[GitHub] spark issue #15521: [SPARK-17980] [SQL] Fix refreshByPath for converted Hive...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15521 **[Test build #67090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67090/consoleFull)** for PR 15521 at commit

[GitHub] spark issue #15521: [SPARK-17980] [SQL] Fix refreshByPath for converted Hive...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15521 Merged build finished. Test PASSed.

[GitHub] spark issue #15521: [SPARK-17980] [SQL] Fix refreshByPath for converted Hive...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15521 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67090/ Test PASSed. ---

[GitHub] spark issue #15445: [SPARK-17817][PySpark][FOLLOWUP] PySpark RDD Repartition...

2016-10-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15445 ping @davies @felixcheung Could you review this again? Thanks.

[GitHub] spark issue #15517: [SPARK-17972][SQL] Cache analyzed plan instead of optimi...

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15517 We need to swap `analyzed` and `optimized` somewhere in your PR description.

[GitHub] spark issue #14198: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2016-10-17 Thread zasdfgbnm
Github user zasdfgbnm commented on the issue: https://github.com/apache/spark/pull/14198 Hi @holdenk, I think I'm done. I created a test for this issue and found from it that Spark has the same issue not only for float but also for byte and short. After several commits,

[GitHub] spark issue #15501: Branch 2.0

2016-10-17 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15501 please close this issue, thanks!

[GitHub] spark issue #15518: [SPARK-17974] Refactor FileCatalog classes to simplify t...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15518 **[Test build #67092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67092/consoleFull)** for PR 15518 at commit

[GitHub] spark issue #15518: [SPARK-17974] Refactor FileCatalog classes to simplify t...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15518 Merged build finished. Test PASSed.

[GitHub] spark issue #15518: [SPARK-17974] Refactor FileCatalog classes to simplify t...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67092/ Test PASSed. ---

[GitHub] spark issue #15285: [SPARK-17711] Compress rolled executor log

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15285 **[Test build #67099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67099/consoleFull)** for PR 15285 at commit

[GitHub] spark issue #15518: [SPARK-17974] Refactor FileCatalog classes to simplify t...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15518 Merged build finished. Test PASSed.

[GitHub] spark issue #15518: [SPARK-17974] Refactor FileCatalog classes to simplify t...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67084/ Test PASSed. ---

[GitHub] spark pull request #15518: [SPARK-17974] Refactor FileCatalog classes to sim...

2016-10-17 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/15518#discussion_r83741391 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -626,8 +626,9 @@ class

[GitHub] spark issue #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return duplica...

2016-10-17 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/15450 The cases you enumerated are the ones I was thinking of. The changes introduced here would alleviate those problems, I agree. What I'm wondering is if this problem still exists in other cases. If

[GitHub] spark pull request #15009: [SPARK-17443][SPARK-11035] Stop Spark Application...

2016-10-17 Thread kishorvpatil
Github user kishorvpatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15009#discussion_r83744591 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -760,7 +787,7 @@ private[spark] class Client( .foreach { case

[GitHub] spark issue #15471: [SPARK-17919] Make timeout to RBackend configurable in S...

2016-10-17 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15471 Jenkins, retest this please

[GitHub] spark issue #15519: [WIP][SQL][STREAMING][TEST] Fix flaky tests in Streaming...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15519 **[Test build #3359 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3359/consoleFull)** for PR 15519 at commit

[GitHub] spark issue #13775: [SPARK-16060][SQL] Vectorized Orc reader

2016-10-17 Thread tejasapatil
Github user tejasapatil commented on the issue: https://github.com/apache/spark/pull/13775 Earlier this year I spent some time trying out Presto's ORC reader with Spark. In a standalone benchmark, Presto's ORC reader is 3x faster than the one in Hive. My experimental

[GitHub] spark issue #15520: [SPARK-13747][SQL]Fix concurrent executions in ForkJoinP...

2016-10-17 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15520 > Is it to avoid the blocking() call in Await? Yep.

[GitHub] spark issue #15522: [MINOR][DOC] Add more built-in sources in sql-programmin...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15522 Can one of the admins verify this patch?

[GitHub] spark pull request #15285: [SPARK-17711] Compress rolled executor log

2016-10-17 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15285#discussion_r83742279 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala --- @@ -115,6 +117,19 @@ private[ui] class LogPage(parent: WorkerWebUI) extends

[GitHub] spark pull request #15285: [SPARK-17711] Compress rolled executor log

2016-10-17 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15285#discussion_r83749687 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1448,14 +1450,35 @@ private[spark] object Utils extends Logging {
