[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212756422 **[Test build #56485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56485/consoleFull)** for PR 12556 at commit [`c8708f7`](https://github.com/apache/spark/commit/c8708f7e9395811c9796bcbba68f63243bdda6cc).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * ` * [STORED AS file_format | STORED BY storage_handler_class [WITH SERDEPROPERTIES (...)]]`
  * `case class CreateTableAsSelectLogicalPlan(`

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212756425 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56485/ Test FAILed.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212756424 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212756240 **[Test build #56485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56485/consoleFull)** for PR 12556 at commit [`c8708f7`](https://github.com/apache/spark/commit/c8708f7e9395811c9796bcbba68f63243bdda6cc).
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212754891 Merged build finished. Test PASSed.
[GitHub] spark pull request: Update DAGScheduler.scala
Github user jodersky commented on the pull request: https://github.com/apache/spark/pull/12524#issuecomment-212754822 Is this related to #12436?
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212754896 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56475/ Test PASSed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212754203 **[Test build #56475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56475/consoleFull)** for PR 10024 at commit [`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14786] Remove hive-cli dependency from ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12551
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60529161

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SortPrefixUtils.scala ---
@@ -66,6 +66,32 @@ object SortPrefixUtils {
   }

   /**
+   * Returns whether the specified SortOrder can be satisfied with a radix sort on the prefix.
+   */
+  def canSortFullyWithPrefix(sortOrder: SortOrder): Boolean = {
+    sortOrder.dataType match {
+      // TODO(ekl) long-type is problematic because its null prefix representation collides with
+      // the lowest possible long value. Handle this special case outside radix sort.
+      case LongType if sortOrder.nullable =>
+        false
+      case BooleanType | ByteType | ShortType | IntegerType | LongType | DateType |
+          TimestampType | FloatType | DoubleType =>
--- End diff --

Never mind, the prefix of `null` has been changed.
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60529190

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Sort.scala ---
@@ -139,11 +148,15 @@ case class Sort(
     val dataSize = metricTerm(ctx, "dataSize")
     val spillSize = metricTerm(ctx, "spillSize")
     val spillSizeBefore = ctx.freshName("spillSizeBefore")
+    val startTime = ctx.freshName("startTime")
+    val sortTime = metricTerm(ctx, "sortTime")
     s"""
        | if ($needToSort) {
        |   $addToSorter();
        |   Long $spillSizeBefore = $metrics.memoryBytesSpilled();
+       |   Long $startTime = System.nanoTime();
--- End diff --

`long`
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212753448 LGTM
[GitHub] spark pull request: [SPARK-14786] Remove hive-cli dependency from ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12551#issuecomment-212753365 Merging in master.
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60528873

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SortPrefixUtils.scala ---
@@ -66,6 +66,32 @@ object SortPrefixUtils {
   }

   /**
+   * Returns whether the specified SortOrder can be satisfied with a radix sort on the prefix.
+   */
+  def canSortFullyWithPrefix(sortOrder: SortOrder): Boolean = {
+    sortOrder.dataType match {
+      // TODO(ekl) long-type is problematic because its null prefix representation collides with
+      // the lowest possible long value. Handle this special case outside radix sort.
+      case LongType if sortOrder.nullable =>
+        false
+      case BooleanType | ByteType | ShortType | IntegerType | LongType | DateType |
+          TimestampType | FloatType | DoubleType =>
--- End diff --

The prefix of null for DoubleType (Double.NegativeInfinity) also collides with the lowest possible long value.
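The collision being discussed can be shown in isolation. The following is a hypothetical sketch, not Spark's actual encoding: if a sorter reserves the smallest possible 8-byte prefix as its null sentinel, that sentinel is bit-identical to the prefix of a genuine minimum value, so a prefix-only radix pass cannot tell the two apart.

```java
// Hypothetical sketch of the null-prefix collision. Assume a sorter encodes
// "null" as the smallest possible 8-byte prefix and uses the identity mapping
// as the prefix for longs. A real Long.MIN_VALUE then produces the same bits
// as the null sentinel, so comparing prefixes alone cannot distinguish a null
// row from a genuine minimum value.
public class NullPrefixCollision {
    // Assumed null sentinel: the smallest 8-byte prefix (hypothetical).
    static final long NULL_PREFIX = Long.MIN_VALUE;

    // Hypothetical identity prefix for long values.
    static long longPrefix(long value) {
        return value;
    }

    public static void main(String[] args) {
        long minValuePrefix = longPrefix(Long.MIN_VALUE);
        // The sentinel and a real minimum value collide bit-for-bit.
        System.out.println(NULL_PREFIX == minValuePrefix); // prints "true"
    }
}
```

This is why the patch routes nullable LongType (and, per the comment above, nullable DoubleType) around the radix-sort fast path instead of sorting fully by prefix.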
[GitHub] spark pull request: [SPARK-14786] Remove hive-cli dependency from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12551#issuecomment-212752845 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56470/ Test PASSed.
[GitHub] spark pull request: [SPARK-14786] Remove hive-cli dependency from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12551#issuecomment-212752841 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14346] [SQL] Show create table
Github user xwu0226 closed the pull request at: https://github.com/apache/spark/pull/12406
[GitHub] spark pull request: [SPARK-14346] [SQL] Show create table
Github user xwu0226 commented on the pull request: https://github.com/apache/spark/pull/12406#issuecomment-212752759 Thanks @wangmiao1981. This happened again: all the commits after my last commit got pulled into this PR. I need to close it and will submit a new PR.
[GitHub] spark pull request: [SPARK-14786] Remove hive-cli dependency from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12551#issuecomment-212752681 **[Test build #56470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56470/consoleFull)** for PR 12551 at commit [`e8c6f35`](https://github.com/apache/spark/commit/e8c6f35e4475586e87df6582ea51a18727ad8062).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14346] [SQL] Show create table
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12406#issuecomment-212752658 @xwu0226 use `git rebase upstream/master`, not `git merge upstream/master`. I had the same issue before: git merge will add others' commits to your PR, while git rebase replays only your own commits on top of upstream and leaves others' commits out of the PR.
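The difference can be demonstrated end-to-end in a throwaway repository. This is a minimal sketch (the repository layout, branch names, and commit messages are invented); the key observation is that after rebasing onto the upstream branch, the feature branch is exactly one commit ahead of upstream, with no foreign commits mixed in.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the upstream repository: one shared base commit.
git init -q -b master upstream
cd upstream
echo base > shared.txt
git add shared.txt
git -c user.name=u -c user.email=u@example.com commit -q -m "base"
cd ..

# Contributor's clone with a feature branch and one local commit.
git clone -q upstream work
cd work
git checkout -q -b feature
echo mine > feature.txt
git add feature.txt
git -c user.name=c -c user.email=c@example.com commit -q -m "feature work"

# Meanwhile upstream moves forward with someone else's commit.
cd ../upstream
echo more >> shared.txt
git -c user.name=u -c user.email=u@example.com commit -q -am "unrelated upstream commit"
cd ../work

# Rebase replays only our commit on top of the new upstream head, so the
# branch diff against upstream stays just "feature work". A merge would
# instead entangle the unrelated commit (plus a merge commit) in the history.
git fetch -q origin
git -c user.name=c -c user.email=c@example.com rebase -q origin/master
ahead=$(git rev-list --count origin/master..feature)
echo "$ahead"   # 1: only our own commit is ahead of upstream
```

After a rebase the branch history has been rewritten, so updating an existing PR branch requires a force push (`git push --force-with-lease`).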
[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-212752513 **[Test build #56484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56484/consoleFull)** for PR 12560 at commit [`239cabf`](https://github.com/apache/spark/commit/239cabf459fee929b19d541bf019de445ea2026d).
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60528592

--- Diff: core/src/test/scala/org/apache/spark/util/collection/unsafe/sort/PrefixComparatorsSuite.scala ---
@@ -110,4 +112,12 @@ class PrefixComparatorsSuite extends SparkFunSuite with PropertyChecks {
     assert(PrefixComparators.DOUBLE.compare(nan1Prefix, doubleMaxPrefix) === 1)
   }

+  test("double prefix comparator handles negative NaNs properly") {
+    val negativeNan: Double = java.lang.Double.longBitsToDouble(0xfff1L)
+    assert(negativeNan.isNaN)
+    assert(java.lang.Double.doubleToRawLongBits(negativeNan) < 0)
+    val prefix = PrefixComparators.DoublePrefixComparator.computePrefix(negativeNan)
+    val doubleMaxPrefix = PrefixComparators.DoublePrefixComparator.computePrefix(Double.MaxValue)
+    assert(PrefixComparators.DOUBLE.compare(prefix, doubleMaxPrefix) === 1)
--- End diff --

Could you also test MinValue, 0, NegativeInfinity, PositiveInfinity?
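The cases requested above can be checked against the standard order-preserving double-to-long encoding (flip all bits of a negative double, flip only the sign bit otherwise, and compare the results as unsigned longs). This is an illustrative sketch of that well-known transform, not necessarily the exact code in `PrefixComparators`:

```java
// Illustrative sketch (not necessarily Spark's exact implementation) of an
// order-preserving double -> long prefix: flip all bits of negative doubles,
// flip only the sign bit of non-negative ones, then compare as unsigned longs.
public class DoublePrefixDemo {
    static long prefix(double value) {
        // doubleToLongBits (unlike the Raw variant) canonicalizes every NaN,
        // including "negative" NaNs with the sign bit set, to the same
        // positive NaN pattern, so all NaNs sort together above +Infinity.
        long bits = Double.doubleToLongBits(value);
        long mask = (bits >> 63) | Long.MIN_VALUE; // all ones if negative, else just the sign bit
        return bits ^ mask;
    }

    public static void main(String[] args) {
        double negativeNan = Double.longBitsToDouble(0xfff0000000000001L);
        double[] ascending = {
            Double.NEGATIVE_INFINITY, -Double.MAX_VALUE, -1.0, 0.0,
            Double.MIN_VALUE, 1.0, Double.MAX_VALUE,
            Double.POSITIVE_INFINITY, negativeNan // NaN sorts above infinity
        };
        // The prefix order must match the double order across all these cases.
        for (int i = 1; i < ascending.length; i++) {
            long lo = prefix(ascending[i - 1]);
            long hi = prefix(ascending[i]);
            if (Long.compareUnsigned(lo, hi) >= 0) {
                throw new AssertionError("order violated at index " + i);
            }
        }
        System.out.println("prefix order matches double order");
    }
}
```

Note that this encoding also distinguishes -0.0 from +0.0 (the former sorts first), which a prefix comparator may or may not want.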
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212752316 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56478/ Test FAILed.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212752210 **[Test build #56478 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56478/consoleFull)** for PR 12556 at commit [`a5408e5`](https://github.com/apache/spark/commit/a5408e526a06a5d2629f21df1005696122916214).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212752313 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12559#issuecomment-212751784 **[Test build #56483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56483/consoleFull)** for PR 12559 at commit [`e7afed9`](https://github.com/apache/spark/commit/e7afed92a21835bbb6d92df2dbd51fa872e2dbfa).
[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS
GitHub user wangmiao1981 opened a pull request: https://github.com/apache/spark/pull/12560

[SPARK-14571][ML] Log instrumentation in ALS

## What changes were proposed in this pull request?

Add log instrumentation for the parameters rank, numUserBlocks, numItemBlocks, implicitPrefs, alpha, userCol, itemCol, ratingCol, predictionCol, maxIter, regParam, nonnegative, checkpointInterval, and seed. Also add log instrumentation for numUserFeatures and numItemFeatures.

## How was this patch tested?

Manual test: set a breakpoint in IntelliJ and run `def testALS()`. Single-step through the code and check that the log method is called.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangmiao1981/spark log Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12560.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12560

commit 239cabf459fee929b19d541bf019de445ea2026d
Author: wm...@hotmail.com
Date: 2016-04-21T05:31:33Z

    add instrumentation to ALS
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212751532 LGTM
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12559#issuecomment-212750650 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12559#issuecomment-212750651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56481/ Test FAILed.
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12559#issuecomment-212750649 **[Test build #56481 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56481/consoleFull)** for PR 12559 at commit [`f17f42a`](https://github.com/apache/spark/commit/f17f42ac60e46fd26586dc9cb960689d7869f700).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-10001][Core] Interrupt tasks in repl wi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12557#issuecomment-212750487 **[Test build #56482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56482/consoleFull)** for PR 12557 at commit [`4f9bf69`](https://github.com/apache/spark/commit/4f9bf695344a5c4c54372eaa6bf54af0d2da1f74).
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12559#issuecomment-212750481 **[Test build #56481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56481/consoleFull)** for PR 12559 at commit [`f17f42a`](https://github.com/apache/spark/commit/f17f42ac60e46fd26586dc9cb960689d7869f700).
[GitHub] spark pull request: [SPARK-13643][SQL] Implement SparkSession
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12553#issuecomment-212750214 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56468/ Test FAILed.
[GitHub] spark pull request: [SPARK-13643][SQL] Implement SparkSession
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12553#issuecomment-212750212 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13643][SQL] Implement SparkSession
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12553#issuecomment-212750048 **[Test build #56468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56468/consoleFull)** for PR 12553 at commit [`7ccfb38`](https://github.com/apache/spark/commit/7ccfb38d1cf378d017fd6a570e41d16bb02dbf86).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14793][SQL] Code generation for large c...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/12559 [SPARK-14793][SQL] Code generation for large complex type exceeds JVM size limit.

## What changes were proposed in this pull request?

Code generation for complex types (`CreateArray`, `CreateMap`, `CreateStruct`, `CreateNamedStruct`) exceeds the JVM size limit for large numbers of elements.

## How was this patch tested?

I added some tests to check whether the generated code for these expressions exceeds the limit or not.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-14793

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12559.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12559

commit f17f42ac60e46fd26586dc9cb960689d7869f700
Author: Takuya UESHIN
Date: 2016-04-21T05:14:42Z

    Split wide complex type creation into blocks due to JVM code size limit.
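The commit message above ("Split wide complex type creation into blocks") describes chunking the generated code. As a rough standalone illustration of the idea — not Spark's actual codegen; the class name, chunk size, and per-element operation here are all hypothetical — evaluating a wide value in fixed-size chunks, one helper per chunk, keeps each method's body bounded and therefore under the JVM's 64KB-per-method bytecode limit:

```java
// Sketch of the splitting idea: rather than one huge method that evaluates
// every element of a wide CreateArray/CreateStruct, work is broken into
// fixed-size chunks, each handled by a separate helper method.
class SplitCodegenSketch {
    static final int CHUNK_SIZE = 100; // assumed chunk size, not Spark's actual constant

    // Evaluate all elements by delegating to per-chunk helpers.
    static long[] evalWideArray(long[] inputs) {
        long[] out = new long[inputs.length];
        for (int start = 0; start < inputs.length; start += CHUNK_SIZE) {
            int end = Math.min(start + CHUNK_SIZE, inputs.length);
            evalChunk(inputs, out, start, end); // stands in for one generated helper method
        }
        return out;
    }

    // One "generated" helper: evaluates a bounded slice of the elements,
    // so its bytecode size does not grow with the total element count.
    static void evalChunk(long[] in, long[] out, int start, int end) {
        for (int i = start; i < end; i++) {
            out[i] = in[i] * 2; // placeholder for per-element expression code
        }
    }
}
```

In real codegen the chunks would be emitted as separate generated functions; the loop here merely stands in for that dispatch.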
[GitHub] spark pull request: [SPARK-13988][Core] Make replaying event logs ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11800#issuecomment-212749912 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56474/ Test FAILed.
[GitHub] spark pull request: [SPARK-13988][Core] Make replaying event logs ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11800#issuecomment-212749911 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13988][Core] Make replaying event logs ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11800#issuecomment-212749831 **[Test build #56474 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56474/consoleFull)** for PR 11800 at commit [`858e8ff`](https://github.com/apache/spark/commit/858e8ffefaa26c45249f81ed047ff00c77416bb6).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14597] [Streaming] Streaming Listener t...
Github user agsachin commented on the pull request: https://github.com/apache/spark/pull/12357#issuecomment-212749355 Hey, I closed this thread as we are moving towards approach 2, explained in the JIRA: https://issues.apache.org/jira/browse/SPARK-14597
[GitHub] spark pull request: [SPARK-14597] [Streaming] Streaming Listener t...
Github user agsachin closed the pull request at: https://github.com/apache/spark/pull/12357
[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-212748601 @viirya Thanks for bearing with me.
[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-212748355 @HyukjinKwon Yea, thanks for the comment. I will remove the null check then, and I will make a change to CSVOption to avoid the null exception.
[GitHub] spark pull request: [SPARK-10001][Core] Interrupt tasks in repl wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12557#issuecomment-212748073 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212748052 @davies All tests have now passed, so could you take a look again? Thanks.
[GitHub] spark pull request: [SPARK-10001][Core] Interrupt tasks in repl wi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12557#issuecomment-212748076 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56480/ Test FAILed.
[GitHub] spark pull request: [SPARK-10001][Core] Interrupt tasks in repl wi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12557#issuecomment-212748057 **[Test build #56480 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56480/consoleFull)** for PR 12557 at commit [`94323b9`](https://github.com/apache/spark/commit/94323b9a7e498c03f0de93bc536e5fe6710d062b).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212747490 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212747494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56466/ Test PASSed.
[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60527018

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -414,8 +414,42 @@ class Analyzer(
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
-    case i @ InsertIntoTable(u: UnresolvedRelation, _, _, _, _) =>
-      i.copy(table = EliminateSubqueryAliases(getTable(u)))
+    case i @ InsertIntoTable(u: UnresolvedRelation, parts, child, _, _) if child.resolved =>
+      val table = getTable(u)
+      // adding the table's partitions or validate the query's partition info
+      table match {
+        case relation: PartitionedRelation if relation.partitionColumns.nonEmpty =>
+          val tablePartitionNames = relation.partitionColumns.map(_.name)
+          if (parts.keys.nonEmpty) {
+            // the query's partitioning must match the table's partitioning
+            // this is set for queries like: insert into ... partition (one = "a", two = )
+            if (tablePartitionNames.size != parts.keySet.size) {
+              throw new AnalysisException(
+                s"""Requested partitioning does not match the ${u.tableIdentifier} table:
+                   |Requested partitions: ${parts.keys.mkString(",")}
+                   |Table partitions: ${tablePartitionNames.mkString(",")}""".stripMargin)
+            }
+            // assumes partition columns are correctly placed at the end of the child's output
+            i.copy(table = EliminateSubqueryAliases(table))
+          } else {
+            // Set up the table's partition scheme with all dynamic partitions by moving partition
+            // columns to the end of the column list, in partition order.
+            val (inputPartCols, columns) = child.output.partition { attr =>
+              tablePartitionNames.contains(attr.name)
+            }
+            // All partition columns are dynamic because this InsertIntoTable had no partitioning
+            val partColumns = tablePartitionNames.map { name =>
--- End diff --

When will `partColumns` be different from `inputPartCols`? Seems never?
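The else-branch under review moves partition columns to the end of the output, in the table's declared partition order. A standalone sketch of that reordering (hypothetical names and plain Java lists, not the Analyzer's Catalyst types):

```java
import java.util.ArrayList;
import java.util.List;

class PartitionReorderSketch {
    // Return the columns with all partition columns moved to the end,
    // ordered as the table declares them (not as the query emitted them).
    static List<String> reorder(List<String> output, List<String> tablePartitionNames) {
        List<String> result = new ArrayList<>();
        for (String col : output) {
            if (!tablePartitionNames.contains(col)) {
                result.add(col); // non-partition columns keep their relative order
            }
        }
        // append partition columns in the table's partition order
        result.addAll(tablePartitionNames);
        return result;
    }
}
```

This also suggests an answer to the review question: `partColumns` and `inputPartCols` presumably hold the same columns and can differ only in ordering, since `partition` preserves the query's column order while the name-driven map follows the table's order.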
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212747343 **[Test build #56466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56466/consoleFull)** for PR 10024 at commit [`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60526988

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -414,8 +414,42 @@ class Analyzer(
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
-    case i @ InsertIntoTable(u: UnresolvedRelation, _, _, _, _) =>
-      i.copy(table = EliminateSubqueryAliases(getTable(u)))
+    case i @ InsertIntoTable(u: UnresolvedRelation, parts, child, _, _) if child.resolved =>
+      val table = getTable(u)
+      // adding the table's partitions or validate the query's partition info
+      table match {
+        case relation: PartitionedRelation if relation.partitionColumns.nonEmpty =>
+          val tablePartitionNames = relation.partitionColumns.map(_.name)
+          if (parts.keys.nonEmpty) {
+            // the query's partitioning must match the table's partitioning
+            // this is set for queries like: insert into ... partition (one = "a", two = )
+            if (tablePartitionNames.size != parts.keySet.size) {
--- End diff --

why do we only check size here?
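A size-only check accepts a partition spec whose names do not actually match the table's partition columns. A sketch of the stricter set comparison the review comment hints at (hypothetical helper, not the PR's code):

```java
import java.util.Set;

class PartitionCheckSketch {
    // Comparing only sizes would accept e.g. {year, nonsense} against table
    // partitions {year, month}; comparing the sets catches the mismatch.
    static boolean partitioningMatches(Set<String> requested, Set<String> tablePartitions) {
        return requested.equals(tablePartitions);
    }
}
```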
[GitHub] spark pull request: [SPARK-10001][Core] Allow interrupting tasks i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12557#issuecomment-212747247 **[Test build #56480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56480/consoleFull)** for PR 12557 at commit [`94323b9`](https://github.com/apache/spark/commit/94323b9a7e498c03f0de93bc536e5fe6710d062b).
[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-212747103 @viirya Ah, thanks! However, I am a bit worried whether disallowing `null` as an option value is correct. I don't want to be picky, but I don't think it is guaranteed that no option ever takes `null` as a value. In some external data sources or future options, there might be cases where setting an option to `null` and not setting it at all are not the same.
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60526773

--- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/PrefixComparators.java ---
@@ -28,88 +28,84 @@ public class PrefixComparators {
   private PrefixComparators() {}

-  public static final StringPrefixComparator STRING = new StringPrefixComparator();
-  public static final StringPrefixComparatorDesc STRING_DESC = new StringPrefixComparatorDesc();
-  public static final BinaryPrefixComparator BINARY = new BinaryPrefixComparator();
-  public static final BinaryPrefixComparatorDesc BINARY_DESC = new BinaryPrefixComparatorDesc();
-  public static final LongPrefixComparator LONG = new LongPrefixComparator();
-  public static final LongPrefixComparatorDesc LONG_DESC = new LongPrefixComparatorDesc();
-  public static final DoublePrefixComparator DOUBLE = new DoublePrefixComparator();
-  public static final DoublePrefixComparatorDesc DOUBLE_DESC = new DoublePrefixComparatorDesc();
-
-  public static final class StringPrefixComparator extends PrefixComparator {
-    @Override
-    public int compare(long aPrefix, long bPrefix) {
-      return UnsignedLongs.compare(aPrefix, bPrefix);
-    }
-
+  public static final PrefixComparator STRING = new UnsignedPrefixComparator();
+  public static final PrefixComparator STRING_DESC = new UnsignedPrefixComparatorDesc();
+  public static final PrefixComparator BINARY = new UnsignedPrefixComparator();
+  public static final PrefixComparator BINARY_DESC = new UnsignedPrefixComparatorDesc();
+  public static final PrefixComparator LONG = new SignedPrefixComparator();
+  public static final PrefixComparator LONG_DESC = new SignedPrefixComparatorDesc();
+  public static final PrefixComparator DOUBLE = new SignedPrefixComparator();
+  public static final PrefixComparator DOUBLE_DESC = new SignedPrefixComparatorDesc();
+
+  public static final class StringPrefixComparator {
     public static long computePrefix(UTF8String value) {
       return value == null ? 0L : value.getPrefix();
     }
   }

-  public static final class StringPrefixComparatorDesc extends PrefixComparator {
-    @Override
-    public int compare(long bPrefix, long aPrefix) {
-      return UnsignedLongs.compare(aPrefix, bPrefix);
+  public static final class BinaryPrefixComparator {
+    public static long computePrefix(byte[] bytes) {
+      return ByteArray.getPrefix(bytes);
     }
   }

-  public static final class BinaryPrefixComparator extends PrefixComparator {
-    @Override
-    public int compare(long aPrefix, long bPrefix) {
-      return UnsignedLongs.compare(aPrefix, bPrefix);
+  public static final class DoublePrefixComparator {
+    public static long computePrefix(double value) {
+      // Java's doubleToLongBits already canonicalizes all NaN values to the lowest possible NaN,
+      // so there's nothing special we need to do here.
+      return Double.doubleToLongBits(value);
     }
+  }

-    public static long computePrefix(byte[] bytes) {
-      return ByteArray.getPrefix(bytes);
-    }
+  /**
+   * Provides radix sort parameters. Comparators implementing this also are indicating that the
+   * ordering they define is compatible with radix sort.
+   */
+  public static abstract class RadixSortSupport extends PrefixComparator {
+    /** @return Whether the sort should be descending in binary sort order. */
+    public abstract boolean sortDescending();
+
+    /** @return Whether the sort should take into account the sign bit. */
+    public abstract boolean sortSigned();
   }

-  public static final class BinaryPrefixComparatorDesc extends PrefixComparator {
+  //
+  // Standard prefix comparator implementations
+  //
+
+  public static final class UnsignedPrefixComparator extends RadixSortSupport {
+    @Override public final boolean sortDescending() { return false; }
+    @Override public final boolean sortSigned() { return false; }
     @Override
-    public int compare(long bPrefix, long aPrefix) {
+    public final int compare(long aPrefix, long bPrefix) {
       return UnsignedLongs.compare(aPrefix, bPrefix);
     }
   }

-  public static final class LongPrefixComparator extends PrefixComparator {
+  public static final class UnsignedPrefixComparatorDesc extends RadixSortSupport {
+    @Override public final boolean sortDescending() { return true; }
+    @Override public final boolean sortSigned() { return false; }
     @Override
-    public int compare(long a, long b) {
-      return (a < b) ? -1 : (a > b) ? 1 : 0;
+    public
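The `UnsignedPrefixComparator`/`SignedPrefixComparator` split in the diff above boils down to two ways of ordering a 64-bit prefix. A minimal sketch using the JDK's own helpers (`Long.compareUnsigned` stands in here for the Guava `UnsignedLongs.compare` call in the patch):

```java
class PrefixCompareSketch {
    // Unsigned order: treats the prefix as a raw 64-bit value,
    // so a prefix with the top bit set (e.g. -1L == 0xFFFF...) sorts last.
    static int compareUnsigned(long a, long b) {
        return Long.compareUnsigned(a, b);
    }

    // Signed order: honors the sign bit, so negative prefixes sort first.
    static int compareSigned(long a, long b) {
        return Long.compare(a, b);
    }
}
```

The `sortSigned()` flag in the diff tells the radix sort which of these two binary orders the comparator defines, so the sort can handle the sign bit's bucket specially.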
[GitHub] spark pull request: [SPARK-10101] [SQL] Add maxlength to JDBC fiel...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/8374#issuecomment-212746977 Seems not, because it has gone stale. If nobody takes this, I'll do it.
[GitHub] spark pull request: [SPARK-14459] [SQL] Detect relation partitioni...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12239#discussion_r60526769

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -414,8 +414,42 @@ class Analyzer(
   }

   def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
-    case i @ InsertIntoTable(u: UnresolvedRelation, _, _, _, _) =>
-      i.copy(table = EliminateSubqueryAliases(getTable(u)))
+    case i @ InsertIntoTable(u: UnresolvedRelation, parts, child, _, _) if child.resolved =>
+      val table = getTable(u)
+      // adding the table's partitions or validate the query's partition info
+      table match {
+        case relation: PartitionedRelation if relation.partitionColumns.nonEmpty =>
+          val tablePartitionNames = relation.partitionColumns.map(_.name)
+          if (parts.keys.nonEmpty) {
+            // the query's partitioning must match the table's partitioning
+            // this is set for queries like: insert into ... partition (one = "a", two = )
+            if (tablePartitionNames.size != parts.keySet.size) {
--- End diff --

why do we only check size here?
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212746211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56462/ Test PASSed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212746210 Build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12525#issuecomment-212746170 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56467/ Test FAILed.
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12525#issuecomment-212746169 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212746062 **[Test build #56462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56462/consoleFull)** for PR 10024 at commit [`7ea7274`](https://github.com/apache/spark/commit/7ea727470735cb2a420bd5411af0202d264d9ec7).
* This patch passes all tests.
* This patch **does not merge cleanly**.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12525#issuecomment-212746059 **[Test build #56467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56467/consoleFull)** for PR 12525 at commit [`3fcc4c3`](https://github.com/apache/spark/commit/3fcc4c34f515cfcb2b6dd56480e8824d1fa66e46).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14724] Use radix sort for shuffles and ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12490#discussion_r60526550

--- Diff: core/src/main/java/org/apache/spark/shuffle/sort/ShuffleInMemorySorter.java ---
@@ -87,18 +102,18 @@ public void expandPointerArray(LongArray newArray) {
       array.getBaseOffset(),
       newArray.getBaseObject(),
       newArray.getBaseOffset(),
-      array.size() * 8L
+      array.size() * (Long.BYTES / memoryAllocationFactor)
--- End diff --

* ?
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212745547 **[Test build #56472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56472/consoleFull)** for PR 12556 at commit [`25da47d`](https://github.com/apache/spark/commit/25da47dfb040e0936eda8bfd5b282e1d3e094b5a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212745631 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56472/ Test FAILed.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212745630 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-212745533 @HyukjinKwon ok, to make things clearer: @mathieulongtin wants to use `null` as an option value in spark-csv from Databricks. So he opened a PR to allow passing `None` from Python to Scala instead of the string "None"; the passed `None` becomes `null` on the Scala side. However, passing `null` as an option value in the current code causes a null exception, because the Spark CSV data source does not handle `null`. In this PR, I filter out nulls and disallow using null as an option value. That is not what @mathieulongtin wanted in his original PR.
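The behavior described above — rejecting `None` option values on the Python side so that `null` never reaches the Scala CSV reader — can be sketched as follows. The helper name and the exact error type are assumptions for illustration, not the PR's actual code:

```python
# Hypothetical sketch of the null-check described above: reject None
# option values before they are forwarded to the JVM, where they would
# arrive as Scala null and crash the CSV data source.


def sanitize_options(options):
    for key, value in options.items():
        if value is None:
            # disallow null as an option value, per the PR's approach
            raise ValueError("option %r must not be None" % key)
    # option values are typically stringified before crossing to the JVM
    return {k: str(v) for k, v in options.items()}
```

With this check, `option('quote', None)` fails fast with a clear message instead of surfacing later as a null pointer exception inside the data source.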
[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-212745308 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56463/ Test PASSed.
[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-212745302 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14790] Always run scalastyle on sbt com...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12555#issuecomment-212744664 **[Test build #56463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56463/consoleFull)** for PR 12555 at commit [`403fab6`](https://github.com/apache/spark/commit/403fab62fe6f40f2b21e09f06ff1481cb2e19cec). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60526334 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -238,4 +281,42 @@ class VectorizedHashMapGenerator( |} """.stripMargin } + + private def computeHash( + input: String, + dataType: DataType, + result: String, + ctx: CodegenContext): String = { --- End diff -- We usually put `ctx` as the first argument
[GitHub] spark pull request: [SPARK-14787][SQL] Upgrade Joda-Time library f...
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/12552#issuecomment-212744081 @HyukjinKwon thanks for picking up this PR & taking the time to investigate the places where the changes could be useful for Spark. This looks good to me as well.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60526194 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -238,4 +281,42 @@ class VectorizedHashMapGenerator( |} """.stripMargin } + + private def computeHash( --- End diff -- genComputeHash ?
[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12494#discussion_r60526159 --- Diff: python/pyspark/sql/readwriter.py --- @@ -367,16 +370,19 @@ def format(self, source): @since(1.5) def option(self, key, value): """Adds an output option for the underlying data source. + +>>> csvpath = os.path.join(tempfile.mkdtemp(), 'data') +>>> df.write.option('quote', None).format('csv').save(csvpath) --- End diff -- I added a null check that disallows null input.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60526133 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/BenchmarkWholeStageCodegen.scala --- @@ -224,6 +224,127 @@ class BenchmarkWholeStageCodegen extends SparkFunSuite { */ } + ignore("aggregate with string key") { +val N = 20 << 20 + +val benchmark = new Benchmark("Aggregate w string key", N) +def f(): Unit = sqlContext.range(N).selectExpr("id", "cast(id & 1023 as string) as k") + .groupBy("k").count().collect() + +benchmark.addCase(s"codegen = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "false") + f() +} + +benchmark.addCase(s"codegen = T hashmap = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "false") + f() +} + +benchmark.addCase(s"codegen = T hashmap = T") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "true") + f() +} + +benchmark.run() + +/* +Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02 on Mac OS X 10.11.4 +Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz +Aggregate w string key: Best/Avg Time(ms)Rate(M/s) Per Row(ns) Relative + --- +codegen = F 3307 / 3376 6.3 157.7 1.0X +codegen = T hashmap = F 2364 / 2471 8.9 112.7 1.4X +codegen = T hashmap = T 1740 / 1841 12.0 83.0 1.9X +*/ + } + + ignore("aggregate with decimal key") { +val N = 20 << 20 + +val benchmark = new Benchmark("Aggregate w decimal key", N) +def f(): Unit = sqlContext.range(N).selectExpr("id", "cast(id & 65535 as decimal) as k") + .groupBy("k").count().collect() + +benchmark.addCase(s"codegen = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "false") + f() +} + +benchmark.addCase(s"codegen = T hashmap = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "false") 
+ f() +} + +benchmark.addCase(s"codegen = T hashmap = T") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "true") + f() +} + +benchmark.run() + +/* +Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02 on Mac OS X 10.11.4 +Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz +Aggregate w decimal key: Best/Avg Time(ms)Rate(M/s) Per Row(ns) Relative + --- +codegen = F 2756 / 2817 7.6 131.4 1.0X +codegen = T hashmap = F 1580 / 1647 13.3 75.4 1.7X +codegen = T hashmap = T 641 / 662 32.7 30.6 4.3X +*/ + } + + ignore("aggregate with multiple key types") { +val N = 20 << 20 + +val benchmark = new Benchmark("Aggregate w multiple keys", N) +def f(): Unit = sqlContext.range(N) + .selectExpr( +"id", +"(id & 1023) as k1", +"cast(id & 1023 as string) as k2", +"cast(id & 1023 as int) as k3", +"cast(id & 1023 as double) as k4", +"cast(id & 1023 as float) as k5", +"id > 1023 as k6") + .groupBy("k1", "k2", "k3", "k4", "k5", "k6") + .sum() + .collect() + +benchmark.addCase(s"codegen = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "false") + f() +} + +benchmark.addCase(s"codegen = T hashmap = F") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "false") + f() +} + +benchmark.addCase(s"codegen = T hashmap = T") { iter => + sqlContext.setConf("spark.sql.codegen.wholeStage", "true") + sqlContext.setConf("spark.sql.codegen.aggregate.map.enabled", "true") + f() +} + +benchmark.run() + +/* +Java HotSpot(TM) 64-Bit Server VM
[GitHub] spark pull request: TEST - Throw exception on unsupported analyze ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12558#issuecomment-212742755 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56473/ Test FAILed.
[GitHub] spark pull request: TEST - Throw exception on unsupported analyze ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12558#issuecomment-212742754 Merged build finished. Test FAILed.
[GitHub] spark pull request: TEST - Throw exception on unsupported analyze ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12558#issuecomment-212742683 **[Test build #56473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56473/consoleFull)** for PR 12558 at commit [`a37eb1d`](https://github.com/apache/spark/commit/a37eb1d60b4c630dbd85753cb17b9d8c7f25ef20). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60526044 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -69,17 +90,29 @@ class VectorizedHashMapGenerator( val generatedSchema: String = s""" |new org.apache.spark.sql.types.StructType() - |${(groupingKeySchema ++ bufferSchema).map(key => - s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.${key.dataType})""") - .mkString("\n")}; + |${(groupingKeySchema ++ bufferSchema).map { key => +key.dataType match { + case d: DecimalType => +s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.createDecimalType( + |${d.precision}, ${d.scale}))""".stripMargin + case _ => +s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.${key.dataType})""" +} + }.mkString("\n")}; """.stripMargin val generatedAggBufferSchema: String = s""" |new org.apache.spark.sql.types.StructType() - |${bufferSchema.map(key => -s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.${key.dataType})""") -.mkString("\n")}; + |${bufferSchema.map { key => --- End diff -- same as this one
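The diff above special-cases `DecimalType` when emitting the Java source that rebuilds the schema, because a decimal's precision and scale cannot be recovered from a bare `DataTypes` constant. A rough Python analogue of that per-field string-generation step — the function name and type spellings are illustrative, not the actual codegen:

```python
# Sketch of generating Java schema-builder source per field, mirroring
# the Scala codegen in the diff above: decimals need
# createDecimalType(precision, scale); other types map to a constant.


def add_field_source(name, data_type, precision=None, scale=None):
    if data_type == "decimal":
        return ('.add("%s", org.apache.spark.sql.types.DataTypes'
                '.createDecimalType(%d, %d))' % (name, precision, scale))
    return ('.add("%s", org.apache.spark.sql.types.DataTypes.%s)'
            % (name, data_type))
```

The design point is the same as in the review comment: the generated source must carry the decimal's parameters explicitly, since they are part of the type, not of its name.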
[GitHub] spark pull request: [SPARK-14780][R] Add `setLogLevel` to SparkR
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12547#issuecomment-212742481 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60525999 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -69,17 +90,29 @@ class VectorizedHashMapGenerator( val generatedSchema: String = s""" |new org.apache.spark.sql.types.StructType() - |${(groupingKeySchema ++ bufferSchema).map(key => - s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.${key.dataType})""") - .mkString("\n")}; + |${(groupingKeySchema ++ bufferSchema).map { key => +key.dataType match { --- End diff -- It's to pull this big expression out of the string
[GitHub] spark pull request: [SPARK-14780][R] Add `setLogLevel` to SparkR
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12547#issuecomment-212742482 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56479/ Test PASSed.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60525942 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -69,17 +90,29 @@ class VectorizedHashMapGenerator( val generatedSchema: String = s""" |new org.apache.spark.sql.types.StructType() - |${(groupingKeySchema ++ bufferSchema).map(key => - s""".add("${key.name}", org.apache.spark.sql.types.DataTypes.${key.dataType})""") - .mkString("\n")}; + |${(groupingKeySchema ++ bufferSchema).map { key => +key.dataType match { --- End diff -- more `|` ?
[GitHub] spark pull request: [SPARK-14780][R] Add `setLogLevel` to SparkR
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12547#issuecomment-212742410 **[Test build #56479 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56479/consoleFull)** for PR 12547 at commit [`0abf874`](https://github.com/apache/spark/commit/0abf874b4b5402197c28b74ba16f50b46c81a1d4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60525850 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -40,12 +41,32 @@ import org.apache.spark.sql.types.StructType */ class VectorizedHashMapGenerator( ctx: CodegenContext, +aggregateExpressions: Seq[AggregateExpression], generatedClassName: String, groupingKeySchema: StructType, bufferSchema: StructType) { - val groupingKeys = groupingKeySchema.map(k => (k.dataType.typeName, ctx.freshName("key"))) - val bufferValues = bufferSchema.map(k => (k.dataType.typeName, ctx.freshName("value"))) - val groupingKeySignature = groupingKeys.map(_.productIterator.toList.mkString(" ")).mkString(", ") + case class Buffer(dataType: DataType, name: String) + val groupingKeys = groupingKeySchema.map(k => Buffer(k.dataType, ctx.freshName("key"))) + val bufferValues = bufferSchema.map(k => Buffer(k.dataType, ctx.freshName("value"))) + val groupingKeySignature = groupingKeys.map(key => (ctx.javaType(key.dataType), key.name)) --- End diff -- `s"${ctx.javaType(key.dataType)} ${key.name}"` will be easier to understand
[GitHub] spark pull request: [SPARK-14680][SQL]Support all datatypes to use...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12440#discussion_r60525541 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -271,6 +271,74 @@ class CodegenContext { } /** + * Returns the specialized code to set a given value in a column vector for a given `DataType`. + */ + def setValue(batch: String, row: String, dataType: DataType, ordinal: Int, + value: String): String = { +val jt = javaType(dataType) +dataType match { + case _ if isPrimitiveType(jt) => +s"$batch.column($ordinal).put${primitiveTypeName(jt)}($row, $value);" + case t: DecimalType => s"$batch.column($ordinal).putDecimal($row, $value, ${t.precision});" + case t: StringType => s"$batch.column($ordinal).putByteArray($row, $value.getBytes());" + case _ => +throw new IllegalArgumentException(s"cannot generate code for unsupported type: $dataType") +} + } + + /** + * Returns the specialized code to set a given value in a column vector for a given `DataType` + * that could potentially be nullable. + */ + def updateColumn( + batch: String, + row: String, + dataType: DataType, + ordinal: Int, + ev: ExprCode, + nullable: Boolean): String = { +if (nullable) { + // Can't call setNullAt on DecimalType, because we need to keep the offset --- End diff -- For batch, this is not true.
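The `updateColumn` helper in the diff above emits a null-aware store into a column vector when the expression is nullable, and a direct store otherwise. A hedged Python sketch of that branching — the emitted Java strings are illustrative only, with `putLong` standing in for the type-dispatched setter that `setValue` chooses:

```python
# Hypothetical sketch of the code generation discussed above: produce a
# null-checked store when the expression is nullable, otherwise a
# direct store. The emitted strings approximate, not reproduce, the
# actual generated Java.


def update_column_code(batch, row, ordinal, is_null, value, nullable):
    store = "%s.column(%d).putLong(%s, %s);" % (batch, ordinal, row, value)
    if nullable:
        return ("if (%s) { %s.column(%d).putNull(%s); } else { %s }"
                % (is_null, batch, ordinal, row, store))
    return store
```

The non-nullable path skips the branch entirely, which is the usual reason codegen threads a `nullable` flag through helpers like this.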
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212739055 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56459/ Test PASSed.
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212739052 Build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12525#discussion_r60525166 --- Diff: core/src/main/scala/org/apache/spark/scheduler/StageInfo.scala --- @@ -36,7 +36,7 @@ class StageInfo( val rddInfos: Seq[RDDInfo], val parentIds: Seq[Int], val details: String, -val taskMetrics: TaskMetrics = new TaskMetrics, +val taskMetrics: TaskMetrics = null, --- End diff -- creating `TaskMetrics` is not that cheap (it registers and un-registers accumulators), so we use null here, since the metrics are not needed when the default value is used.
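The rationale above — avoid eagerly constructing an expensive default argument that most callers replace or never use — is a general pattern. A minimal, generic Python illustration (not Spark's code) using a `None` sentinel in place of the eager default:

```python
# Generic illustration of the pattern described above: a None default
# instead of eagerly building an expensive object on every call.


class ExpensiveMetrics:
    instances = 0

    def __init__(self):
        # stands in for TaskMetrics, whose construction registers
        # accumulators and is therefore not cheap
        ExpensiveMetrics.instances += 1


class StageInfo:
    def __init__(self, task_metrics=None):
        # None sentinel: nothing is allocated on the default path
        self.task_metrics = task_metrics
```

Callers that care pass a real object; the default path allocates nothing, which is exactly the saving the comment describes.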
[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10024#issuecomment-212738764 **[Test build #56459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56459/consoleFull)** for PR 10024 at commit [`e009d95`](https://github.com/apache/spark/commit/e009d95c715879269253da2b47e669ffc2e13683). * This patch passes all tests. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212737750 **[Test build #56478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56478/consoleFull)** for PR 12556 at commit [`a5408e5`](https://github.com/apache/spark/commit/a5408e526a06a5d2629f21df1005696122916214).
[GitHub] spark pull request: [SPARK-14792][SQL] Move as many parsing rules ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12556#issuecomment-212737840 cc @hvanhovell
[GitHub] spark pull request: [SPARK-14780][R] Add `setLogLevel` to SparkR
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12547#issuecomment-212737812 **[Test build #56479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56479/consoleFull)** for PR 12547 at commit [`0abf874`](https://github.com/apache/spark/commit/0abf874b4b5402197c28b74ba16f50b46c81a1d4).
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12525#discussion_r60524791 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -244,7 +243,14 @@ private[spark] class ListenerTaskMetrics(accumUpdates: Seq[AccumulableInfo]) ext private[spark] object TaskMetrics extends Logging { - def empty: TaskMetrics = new TaskMetrics + /** + * Create an empty task metrics that doesn't register its accumulators. + */ + def empty: TaskMetrics = { --- End diff -- This is not only used in test.
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12525#issuecomment-212735921

**[Test build #56477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56477/consoleFull)** for PR 12525 at commit [`ce0262b`](https://github.com/apache/spark/commit/ce0262b156990ed4a8e5ff854794a33f4bef582a).
[GitHub] spark pull request: [SPARK-14753][CORE] remove internal flag in Ac...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12525#issuecomment-212733494

LGTM pending Jenkins.
[GitHub] spark pull request: [SPARK-14782][SPARK-14778][SQL] Remove HiveCon...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12550
[GitHub] spark pull request: [SPARK-14780][R] Add `setLogLevel` to SparkR
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/12547#discussion_r60524553

--- Diff: R/pkg/R/context.R ---
@@ -225,3 +225,17 @@ broadcast <- function(sc, object) {
 setCheckpointDir <- function(sc, dirName) {
   invisible(callJMethod(sc, "setCheckpointDir", suppressWarnings(normalizePath(dirName))))
 }
+
+#' Set new log level
+#'
+#' Set new log level: "ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN"
+#' @param sc Spark Context to use
+#' @param level New log level
+#' @examples
--- End diff --

Thank you, @felixcheung . I'll fix soon.
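The level names listed in the roxygen doc above ("ALL" through "WARN") are the standard log4j levels that Spark's `setLogLevel` accepts. As an illustration only (this helper is hypothetical and not part of the PR), a minimal Python sketch of validating a level string the way such a setter would:

```python
# Hypothetical sketch: validate a log-level string against the levels
# documented in the SparkR setLogLevel roxygen comment above.
VALID_LEVELS = {"ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN"}

def check_log_level(level):
    """Return the normalized (upper-case) level, or raise ValueError."""
    normalized = str(level).upper()
    if normalized not in VALID_LEVELS:
        raise ValueError(
            "invalid log level %r; expected one of %s"
            % (level, sorted(VALID_LEVELS)))
    return normalized
```

In PySpark the analogous call is `sc.setLogLevel("WARN")` on a live `SparkContext`; the JVM side ultimately rejects strings outside this set, which is why the SparkR doc enumerates them explicitly.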