[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13703
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60634/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13703
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13703
  
**[Test build #60634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60634/consoleFull)** for PR 13703 at commit [`b7952d3`](https://github.com/apache/spark/commit/b7952d332084864c595eb0a9e28ed68c100510e6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ReaderUtils(object):`
  * `class DataFrameReader(ReaderUtils):`
  * `class DataStreamReader(ReaderUtils):`





[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...

2016-06-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13631
  
@cloud-fan no problem at all.





[GitHub] spark pull request #13652: [SPARK-15613] [SQL] Fix incorrect days to millis ...

2016-06-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/13652#discussion_r67289086
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala ---
@@ -851,6 +851,29 @@ object DateTimeUtils {
   }
 
   /**
+   * Lookup the offset for given millis seconds since 1970-01-01 00:00:00 in a timezone.
--- End diff --

Since 1970-01-01 00:00:00 in UTC, or in the local timezone?
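
The two readings asked about here give different answers, which is why the doc comment needs to pick one. As a minimal sketch (in Python rather than the Scala of `DateTimeUtils`; the function names are hypothetical and the fixed-offset zone is chosen only for illustration), the same millisecond count can be read either as elapsed time since the UTC epoch or as a wall-clock reading counted from midnight 1970-01-01 in the zone itself:

```python
from datetime import datetime, timedelta, timezone, tzinfo

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_MILLI = timedelta(milliseconds=1)


def offset_from_utc_millis(utc_millis: int, tz: tzinfo) -> int:
    """Zone offset (ms) at an instant given as millis since 1970-01-01 00:00:00 UTC."""
    instant = (EPOCH_UTC + timedelta(milliseconds=utc_millis)).astimezone(tz)
    return instant.utcoffset() // ONE_MILLI


def offset_from_local_millis(local_millis: int, tz: tzinfo) -> int:
    """Zone offset (ms) for a wall-clock reading given as millis since
    1970-01-01 00:00:00 in the zone itself."""
    wall_clock = datetime(1970, 1, 1) + timedelta(milliseconds=local_millis)
    return wall_clock.replace(tzinfo=tz).utcoffset() // ONE_MILLI


def local_millis_to_utc_millis(local_millis: int, tz: tzinfo) -> int:
    """Convert a wall-clock millisecond count into millis since the UTC epoch."""
    return local_millis - offset_from_local_millis(local_millis, tz)


# Illustrative fixed-offset zone (UTC+8); not taken from the patch.
utc_plus_8 = timezone(timedelta(hours=8))
utc_millis = local_millis_to_utc_millis(8 * 60 * 60 * 1000, utc_plus_8)
```

With a fixed offset both lookups return the same value. In a zone with daylight saving time they can disagree near a transition, so the comment on the new method should state which interpretation its argument uses.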





[GitHub] spark issue #13698: [SQL] Removes FileFormat.prepareRead

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13698
  
**[Test build #60635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60635/consoleFull)** for PR 13698 at commit [`eeb8d52`](https://github.com/apache/spark/commit/eeb8d520dcab29ff384c1627ec9ad9b9502898fa).





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13703
  
**[Test build #60634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60634/consoleFull)** for PR 13703 at commit [`b7952d3`](https://github.com/apache/spark/commit/b7952d332084864c595eb0a9e28ed68c100510e6).





[GitHub] spark issue #13697: [SPARK-15977][SQL] Fix TRUNCATE TABLE for Spark specific...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13697
  
**[Test build #60633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60633/consoleFull)** for PR 13697 at commit [`59dfab9`](https://github.com/apache/spark/commit/59dfab9dc1fb40d8e2d9cadcb0176a9694b79011).





[GitHub] spark issue #13631: [SPARK-15911][SQL] Remove the additional Project to be c...

2016-06-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13631
  
Hi @viirya, we are auditing the insertion behaviour of Spark SQL and expect to reach an agreement this week. How about we revisit this PR after that?





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13703
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60632/





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13703
  
**[Test build #60632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60632/consoleFull)** for PR 13703 at commit [`ac5df18`](https://github.com/apache/spark/commit/ac5df18481e45c30d5ad51716f01458cafa2dcf2).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class ReaderUtils(object):`
  * `class DataFrameReader(ReaderUtils):`
  * `class DataStreamReader(ReaderUtils):`





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13703
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #13663: [SPARK-15950][SQL] Eliminate unreachable code at ...

2016-06-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13663#discussion_r67288169
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
@@ -26,6 +26,38 @@ import org.apache.spark.sql.test.SharedSQLContext
 class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  test("primitive type on array") {
+val df = sparkContext.parallelize(Seq(1, 2), 1).toDF("v")
--- End diff --

How do these tests prove that `zeroOutNullBytes` is eliminated?





[GitHub] spark issue #13697: [SPARK-15977][SQL] Fix TRUNCATE TABLE for Spark specific...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13697
  
**[Test build #3116 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3116/consoleFull)** for PR 13697 at commit [`765be24`](https://github.com/apache/spark/commit/765be24bd871e062e05ae1d825771fae52fa6598).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13703
  
**[Test build #60632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60632/consoleFull)** for PR 13703 at commit [`ac5df18`](https://github.com/apache/spark/commit/ac5df18481e45c30d5ad51716f01458cafa2dcf2).





[GitHub] spark issue #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/13703
  
@zsxwing Can you please take a look?





[GitHub] spark pull request #13703: [SPARK-15981][Fixed bug and added tests

2016-06-15 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/13703

[SPARK-15981][Fixed bug and added tests

## What changes were proposed in this pull request?

- Fixed bug in Python API of DataStreamReader
- Reduced code duplication between DataStreamReader and DataFrameReader
- Added missing Python doctests
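
One common way to remove this kind of duplication is to move the shared logic into a base class that both readers inherit. The sketch below only illustrates that pattern: every class and method name in it is hypothetical, and it is not the PySpark code changed by this pull request.

```python
class ReaderUtilsSketch:
    """Hypothetical base class holding logic shared by both readers."""

    def __init__(self):
        self._options = {}

    @staticmethod
    def _to_option_string(value):
        # Render booleans in lowercase so both readers pass the same strings on.
        if isinstance(value, bool):
            return str(value).lower()
        return str(value)

    def option(self, key, value):
        """Record one option and return self so calls can be chained."""
        self._options[key] = self._to_option_string(value)
        return self


class BatchReaderSketch(ReaderUtilsSketch):
    """Hypothetical batch reader; only the load step differs."""

    def load(self, path):
        return ("batch", path, dict(self._options))


class StreamReaderSketch(ReaderUtilsSketch):
    """Hypothetical streaming reader; accepts a single path string only."""

    def load(self, path):
        if not isinstance(path, str):
            raise TypeError("path can only be a single string")
        return ("stream", path, dict(self._options))


batch = BatchReaderSketch().option("header", True).load("/tmp/in")
stream = StreamReaderSketch().option("maxFilesPerTrigger", 1).load("/tmp/in")
```

The option handling lives in one place, so a fix there reaches both readers, while each subclass keeps its own argument checks.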


## How was this patch tested?
New tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark SPARK-15981

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13703.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13703


commit ac5df18481e45c30d5ad51716f01458cafa2dcf2
Author: Tathagata Das 
Date:   2016-06-16T05:08:12Z

Fixed bug and added tests







[GitHub] spark pull request #13663: [SPARK-15950][SQL] Eliminate unreachable code at ...

2016-06-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13663#discussion_r67288116
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -71,7 +71,8 @@ case class CreateArray(children: Seq[Expression]) extends Expression {
   s"""
 final ArrayData ${ev.value} = new $arrayClass($values);
 this.$values = null;
-  """)
+  """,
+  isNull = "false")
--- End diff --

With this, I think we can remove the `final boolean ${ev.isNull} = false;` from the generated code.





[GitHub] spark issue #13698: [SQL] Removes FileFormat.prepareRead

2016-06-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13698
  
LGTM





[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/13684#discussion_r67287926
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1856,10 +1856,11 @@ setMethod("where",
 #' the subset of columns.
 #'
 #' @param x A SparkDataFrame.
-#' @param colnames A character vector of column names.
+#' @param ... A character vector of column names or string column names.
+#' If the first argument contains a character vector then the following column names are ignored.
--- End diff --

Align this line with "A character ..." on the line above.





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13700
  
**[Test build #3118 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3118/consoleFull)** for PR 13700 at commit [`036662b`](https://github.com/apache/spark/commit/036662bcab458039d4c37009cf9dd040c45e090c).





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13700
  
**[Test build #3117 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3117/consoleFull)** for PR 13700 at commit [`036662b`](https://github.com/apache/spark/commit/036662bcab458039d4c37009cf9dd040c45e090c).





[GitHub] spark issue #13702: [SPARK-15980][SQL] Add PushPredicateThroughObjectConsume...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13702
  
**[Test build #60631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60631/consoleFull)** for PR 13702 at commit [`75805a4`](https://github.com/apache/spark/commit/75805a491dca1c4f8b03bae43449c56fc31a5584).





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60629/





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13684
  
Thank you, @shivaram and @sun-rui .
Now, it's ready for review again.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60629/consoleFull)** for PR 13684 at commit [`51d870e`](https://github.com/apache/spark/commit/51d870eeba4714aa3b522748a3b4a7c285324bd9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #13702: [SPARK-15980][SQL] Add PushPredicateThroughObject...

2016-06-15 Thread ueshin
GitHub user ueshin opened a pull request:

https://github.com/apache/spark/pull/13702

[SPARK-15980][SQL] Add PushPredicateThroughObjectConsumer rule to Optimizer.

## What changes were proposed in this pull request?

This PR adds a `PushPredicateThroughObjectConsumer` rule that pushes predicates down through `ObjectConsumer`.
As an example, I implemented pushing a typed filter down through `SerializeFromObject`.
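
The intent of such a rule can be pictured with a toy model. The sketch below is plain Python, not Catalyst: `Person`, `serialize`, and both pipeline functions are hypothetical stand-ins. It assumes the predicate is a pure function of the object, which is what makes applying it before serialization equivalent to applying it afterwards.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Person:
    name: str
    age: int


def serialize(person: Person) -> Dict[str, object]:
    """Stand-in for turning a typed object into a row."""
    return {"name": person.name, "age": person.age}


def filter_above_serialize(
    people: List[Person], pred: Callable[[Person], bool]
) -> List[Dict[str, object]]:
    """Unoptimized shape: serialize everything, then rebuild each object
    just to evaluate the typed predicate."""
    rows = [serialize(p) for p in people]
    return [row for row in rows if pred(Person(**row))]


def filter_below_serialize(
    people: List[Person], pred: Callable[[Person], bool]
) -> List[Dict[str, object]]:
    """Pushed-down shape: evaluate the predicate on the objects first,
    so only surviving objects are serialized."""
    return [serialize(p) for p in people if pred(p)]


people = [Person("a", 17), Person("b", 30), Person("c", 45)]
is_adult = lambda p: p.age >= 18
before = filter_above_serialize(people, is_adult)
after = filter_below_serialize(people, is_adult)
```

Both shapes return the same rows; the pushed-down one skips serializing rejected objects and avoids rebuilding objects from rows.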

## How was this patch tested?

Added tests to check that pushing a typed filter down through `SerializeFromObject` works correctly.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ueshin/apache-spark issues/SPARK-15980

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13702.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13702


commit 492193c32b8c490032153003e679b152f36b130f
Author: Takuya UESHIN 
Date:   2016-06-15T10:16:31Z

Add PushPredicateThroughObjectConsumer rule.

commit 75805a491dca1c4f8b03bae43449c56fc31a5584
Author: Takuya UESHIN 
Date:   2016-06-15T10:21:34Z

Add a test to DatasetSuite to check if map and filter runs correctly.







[GitHub] spark issue #13635: [SPARK-15159][SPARKR] SparkR SparkSession API

2016-06-15 Thread sun-rui
Github user sun-rui commented on the issue:

https://github.com/apache/spark/pull/13635
  
@shivaram, I will probably take a look at this tonight.





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13700
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60626/





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13700
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13700
  
**[Test build #60626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60626/consoleFull)** for PR 13700 at commit [`036662b`](https://github.com/apache/spark/commit/036662bcab458039d4c37009cf9dd040c45e090c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on ...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/12836





[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread shivaram
Github user shivaram commented on the issue:

https://github.com/apache/spark/pull/12836
  
Merging this to master and branch-2.0





[GitHub] spark issue #13661: [SPARK-15942][REPL] Unblock `:reset` command in REPL.

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13661
  
**[Test build #60630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60630/consoleFull)** for PR 13661 at commit [`4a70912`](https://github.com/apache/spark/commit/4a709123df115a86820f545481ff9e047e8143a5).





[GitHub] spark pull request #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... s...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13678





[GitHub] spark pull request #13561: [SPARK-15824][SQL] Run 'with ... insert ... selec...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13561





[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...

2016-06-15 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13678
  
thanks, merging to master/2.0





[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/10896
  
Merged build finished. Test FAILed.





[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/10896
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60624/
Test FAILed.





[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/10896
  
**[Test build #60624 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60624/consoleFull)**
 for PR 10896 at commit 
[`c40388a`](https://github.com/apache/spark/commit/c40388acf657f18ae1431cb33515e395f6c64eb0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13641: [SPARK-10258][DOC][ML] Add @Since annotations to ml.feat...

2016-06-15 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/13641
  
@MLnick I found that you did not add ```@Since``` to all of the param definitions. Is 
this expected? I think we should add them.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60629 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60629/consoleFull)**
 for PR 13684 at commit 
[`51d870e`](https://github.com/apache/spark/commit/51d870eeba4714aa3b522748a3b4a7c285324bd9).





[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11863
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60622/
Test PASSed.





[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11863
  
Merged build finished. Test PASSed.





[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11863
  
**[Test build #60622 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60622/consoleFull)**
 for PR 11863 at commit 
[`b404304`](https://github.com/apache/spark/commit/b4043049a9e864bf0b0c0e13affe5102c6c9278c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12836
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60621/
Test PASSed.





[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/12836
  
Merged build finished. Test PASSed.





[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12836
  
**[Test build #60621 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60621/consoleFull)**
 for PR 12836 at commit 
[`fe36d24`](https://github.com/apache/spark/commit/fe36d24139ca0fe22b80836b85c7ec67503c1104).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13640: [SPARK-15916][SQL] Correctly pushdown top level AND oper...

2016-06-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13640
  
LGTM





[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13701
  
**[Test build #60628 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60628/consoleFull)**
 for PR 13701 at commit 
[`5711ae4`](https://github.com/apache/spark/commit/5711ae46ca29c8e10bd218b31ca0e37694c61886).





[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13701
  
@yhuai I am not sure the row-group info is exposed. But I've manually 
output the `totalRowCount` in `SpecificParquetRecordReaderBase` to check the 
total number of rows this `RecordReader` will eventually read. The results are 
shown in the PR description.





[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13701
  
**[Test build #60627 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60627/consoleFull)**
 for PR 13701 at commit 
[`97ccacf`](https://github.com/apache/spark/commit/97ccacfca1f7a039bc7bf7b8a4f8f975deb70197).





[GitHub] spark pull request #13701: [SPARK-15639][SQL] Try to push down filter at Row...

2016-06-15 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/13701

 [SPARK-15639][SQL] Try to push down filter at RowGroups level for parquet 
reader

## What changes were proposed in this pull request?

The base class `SpecificParquetRecordReaderBase`, used by the vectorized 
parquet reader, tries to get pushed-down filters from the given 
configuration. These pushed-down filters are used for RowGroups-level filtering. 
However, we never set the filters to push down in the configuration, so the 
filters are not actually pushed down for RowGroups-level filtering. This patch 
fixes that by setting the pushed-down filters in the configuration for the 
reader.

The benchmark, which excludes the time spent writing the Parquet file:

    test("Benchmark for Parquet") {
      // Int shift counts wrap modulo 32, so this evaluates to 1 << 18 = 262144.
      val N = 1 << 50
      withParquetTable((0 until N).map(i => (101, i)), "t") {
        val benchmark = new Benchmark("Parquet reader", N)
        benchmark.addCase("reading Parquet file", 10) { iter =>
          sql("SELECT _1 FROM t where t._1 < 100").collect()
        }
        benchmark.run()
      }
    }

By default, `withParquetTable` runs the tests for both the vectorized and 
non-vectorized readers. I only let it run the vectorized reader.

After this patch:

    Java HotSpot(TM) 64-Bit Server VM 1.8.0_25-b17 on Linux 3.13.0-57-generic
    Westmere E56xx/L56xx/X56xx (Nehalem-C)
    Parquet reader:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    -------------------------------------------------------------------------------------
    reading Parquet file                76 /   88          3.4         291.0       1.0X
Before this patch:

    Java HotSpot(TM) 64-Bit Server VM 1.8.0_25-b17 on Linux 3.13.0-57-generic
    Westmere E56xx/L56xx/X56xx (Nehalem-C)
    Parquet reader:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    -------------------------------------------------------------------------------------
    reading Parquet file                81 /   91          3.2         310.2       1.0X

Next, I ran the benchmark for the non-pushdown case, using the same benchmark 
code but with the pushdown configuration disabled.

After this patch:

    Parquet reader:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    -------------------------------------------------------------------------------------
    reading Parquet file                80 /   95          3.3         306.5       1.0X

Before this patch:

    Parquet reader:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
    -------------------------------------------------------------------------------------
    reading Parquet file                80 /  103          3.3         306.7       1.0X

For the non-pushdown case, the results suggest that this patch doesn't affect 
the normal code path.
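
As a rough cross-check of the benchmark tables above, the per-row and rate columns can be recomputed from the best time and the row count. The sketch below assumes N is the Int value of `1 << 50`, which the JVM evaluates as `1 << 18` because Int shift counts use only the low five bits. It also uses the rounded best times printed in the pushdown tables, so it only approximately reproduces the reported figures. The object and helper names are illustrative and are not part of Spark's `Benchmark` utility.

```scala
object BenchmarkArithmetic {
  // Int shifts use only the low 5 bits of the shift count, so 1 << 50 == 1 << 18.
  val n: Int = 1 << 50

  // Nanoseconds per row for a run that took `bestMs` milliseconds over `rows` rows.
  def perRowNs(bestMs: Double, rows: Int): Double = bestMs * 1e6 / rows

  // Throughput in millions of rows per second.
  def rateMPerSec(bestMs: Double, rows: Int): Double = rows / (bestMs * 1e3)

  def main(args: Array[String]): Unit = {
    println(s"N = $n")
    // Best times (ms) from the pushdown tables: 76 after the patch, 81 before.
    for (bestMs <- Seq(76.0, 81.0)) {
      val perRow = perRowNs(bestMs, n)
      val rate = rateMPerSec(bestMs, n)
      println(f"best=$bestMs%.0f ms  perRow=$perRow%.1f ns  rate=$rate%.1f M/s")
    }
  }
}
```

With these inputs the per-row values come out near 290 ns and 309 ns, close to the reported 291.0 and 310.2. The small gap is consistent with the tables printing best times rounded to whole milliseconds.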

I've manually output the `totalRowCount` in 
`SpecificParquetRecordReaderBase` to see whether this patch actually filters the 
row-groups. When running the above benchmark:

After this patch:
`totalRowCount = 0`

Before this patch:
`totalRowCount = 131072`
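
To illustrate why the count can drop to zero, here is a minimal, self-contained sketch of row-group pruning by min/max statistics. It is a simplified model, not Spark's or Parquet's actual code: `RowGroupStats`, `mightContainLessThan`, and `totalRowCount` are hypothetical names, and the row-group sizes are made up. The idea is that a pushed-down `_1 < 100` predicate lets a reader skip any row group whose minimum value is already 100 or more, which is every row group when all rows hold 101.

```scala
object RowGroupPruningSketch {
  // Hypothetical per-row-group statistics: row count plus min/max of column _1.
  final case class RowGroupStats(rowCount: Long, min: Int, max: Int)

  // A row group can contain a value < bound only if its minimum is < bound.
  def mightContainLessThan(stats: RowGroupStats, bound: Int): Boolean =
    stats.min < bound

  // Rows the reader would visit; `pushedBound = None` models "no filter pushed down".
  def totalRowCount(groups: Seq[RowGroupStats], pushedBound: Option[Int]): Long =
    groups
      .filter(g => pushedBound.forall(b => mightContainLessThan(g, b)))
      .map(_.rowCount)
      .sum

  def main(args: Array[String]): Unit = {
    // Illustrative layout: two row groups in which every _1 value is 101.
    val groups = Seq(RowGroupStats(1000L, 101, 101), RowGroupStats(500L, 101, 101))
    println(totalRowCount(groups, None))      // no pushdown: 1500
    println(totalRowCount(groups, Some(100))) // `_1 < 100` pushed down: 0
  }
}
```

Without a pushed-down bound every row group is read. With the bound, the statistics alone rule out all of them, which matches the direction of the `totalRowCount` change reported above.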


## How was this patch tested?
Existing tests should pass.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 
vectorized-reader-push-down-filter2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13701.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13701


commit 5687a3b5527817c809244305468bfe4968bedcec
Author: Liang-Chi Hsieh 
Date:   2016-05-28T05:03:06Z

Try to push down filter at RowGroups level for parquet reader.

commit 077f7f8813a76d38c8a6d898ec54e401c91b6014
Author: Liang-Chi Hsieh 
Date:   2016-06-09T21:19:47Z

Merge remote-tracking branch 'upstream/master' into 
vectorized-reader-push-down-filter

Conflicts:

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala

commit 97ccacfca1f7a039bc7bf7b8a4f8f975deb70197
Author: Liang-Chi Hsieh 
Date:   2016-06-14T07:22:53Z

Merge remote-tracking branch 'upstream/master' into 
vectorized-reader-push-down-filter





[GitHub] spark issue #13697: [SPARK-15977][SQL] Fix TRUNCATE TABLE for Spark specific...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13697
  
**[Test build #3116 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3116/consoleFull)**
 for PR 13697 at commit 
[`765be24`](https://github.com/apache/spark/commit/765be24bd871e062e05ae1d825771fae52fa6598).





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60625/
Test FAILed.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60625 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60625/consoleFull)**
 for PR 13684 at commit 
[`4430d72`](https://github.com/apache/spark/commit/4430d7252dbe9296d39c15a632318b32a7382b0c).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13371: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread yhuai
Github user yhuai commented on the issue:

https://github.com/apache/spark/pull/13371
  
Yea. Since this one was closed by asfgit, I am not sure you can reopen it.

On Wed, Jun 15, 2016 at 7:39 PM -0700, "Liang-Chi Hsieh" wrote:

> @yhuai ok. Do you mean I need to create a new PR for this?





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60623/
Test FAILed.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60623 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60623/consoleFull)**
 for PR 13684 at commit 
[`e6dd074`](https://github.com/apache/spark/commit/e6dd074f9ce44984da15665d36dfa976a6ee384c).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13684
  
Merged build finished. Test FAILed.





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13700
  
**[Test build #60626 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60626/consoleFull)**
 for PR 13700 at commit 
[`036662b`](https://github.com/apache/spark/commit/036662bcab458039d4c37009cf9dd040c45e090c).





[GitHub] spark issue #13700: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13700
  
https://github.com/apache/spark/pull/13696 is the original patch for master.






[GitHub] spark issue #13696: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13696
  
See https://github.com/apache/spark/pull/13700





[GitHub] spark pull request #13700: [SPARK-15979][SQL] Rename various Parquet support...

2016-06-15 Thread rxin
GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/13700

[SPARK-15979][SQL] Rename various Parquet support classes (branch-2.0).

## What changes were proposed in this pull request?
This patch renames various Parquet support classes from CatalystAbc to 
ParquetAbc. This new naming makes more sense for two reasons:

1. These are not optimizer-related (i.e. Catalyst) classes.
2. We are in the Spark code base, so it is clearer to call out that these 
are Parquet support classes rather than some Spark classes.

## How was this patch tested?
Renamed test cases as well.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark parquet-rename-branch-2.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13700.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13700


commit 036662bcab458039d4c37009cf9dd040c45e090c
Author: Reynold Xin 
Date:   2016-06-16T03:23:19Z

[SPARK-15979][SQL] Rename various Parquet support classes.







[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60625 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60625/consoleFull)**
 for PR 13684 at commit 
[`4430d72`](https://github.com/apache/spark/commit/4430d7252dbe9296d39c15a632318b32a7382b0c).





[GitHub] spark issue #13474: [SPARK-15547][SQL] nested case class in encoder can have...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13474
  
Looks like it's not. I'm going to merge it in 2.0.






[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/10896
  
**[Test build #60624 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60624/consoleFull)**
 for PR 10896 at commit 
[`c40388a`](https://github.com/apache/spark/commit/c40388acf657f18ae1431cb33515e395f6c64eb0).





[GitHub] spark issue #13474: [SPARK-15547][SQL] nested case class in encoder can have...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13474
  
Was this merged into 2.0?






[GitHub] spark pull request #13691: [SPARK-15851][Build] Fix the call of the bash scr...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13691





[GitHub] spark pull request #13612: [SPARK-15851] [Build] Fix the call of the bash sc...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13612





[GitHub] spark pull request #13694: [SPARK-13498] [SQL] Increment the recordsRead inp...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13694





[GitHub] spark pull request #11373: [SPARK-13498] [SQL] Increment the recordsRead inp...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11373





[GitHub] spark issue #13691: [SPARK-15851][Build] Fix the call of the bash script to ...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13691
  
Merging in master/2.0.






[GitHub] spark issue #3976: [SPARK-5173]support python application running on yarn cl...

2016-06-15 Thread lianhuiwang
Github user lianhuiwang commented on the issue:

https://github.com/apache/spark/pull/3976
  
@dileep1236 Can you find pyspark.zip in SPARK_HOME/python/lib/?





[GitHub] spark issue #13694: [SPARK-13498] [SQL] Increment the recordsRead input metr...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13694
  
Merging in master/2.0.






[GitHub] spark issue #13696: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13696
  
I merged this into master but there is a conflict in 2.0. Going to submit a 
separate pr.






[GitHub] spark pull request #13696: [SPARK-15979][SQL] Rename various Parquet support...

2016-06-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13696





[GitHub] spark issue #13447: [SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT E...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13447
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60619/
Test PASSed.





[GitHub] spark issue #13447: [SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT E...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13447
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13447: [SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT E...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13447
  
**[Test build #60619 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60619/consoleFull)**
 for PR 13447 at commit 
[`84ed71a`](https://github.com/apache/spark/commit/84ed71ad89f3571f317694fd5803ce341a3b46b8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #13696: [SPARK-15979][SQL] Rename various Parquet support classe...

2016-06-15 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13696
  
Merging in master/2.0.






[GitHub] spark issue #13699: [SPARK-15958] Make initial buffer size for the Sorter co...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13699
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60620/
Test FAILed.





[GitHub] spark issue #13699: [SPARK-15958] Make initial buffer size for the Sorter co...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13699
  
**[Test build #60620 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60620/consoleFull)**
 for PR 13699 at commit 
[`82e540c`](https://github.com/apache/spark/commit/82e540c00c85f0c9f2d9d34c37420752fce2d18f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `final class UnsafeExternalRowSorter `





[GitHub] spark issue #13699: [SPARK-15958] Make initial buffer size for the Sorter co...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13699
  
Merged build finished. Test FAILed.





[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...

2016-06-15 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13676#discussion_r67281085
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---
@@ -479,7 +354,287 @@ private[hive] trait HiveInspectors {
   }
 
   /**
-   * Builds specific unwrappers ahead of time according to object inspector
+   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
+   * Constant Null object inspector =>
+   *   return null
+   * Constant object inspector =>
+   *   extract the value from constant object inspector
+   * If object inspector prefers writable =>
+   *   extract writable from `data` and then get the catalyst type from the writable
+   * Extract the java object directly from the object inspector
+   *
+   * NOTICE: the complex data type requires recursive unwrapping.
+   */
+  def unwrapperFor(objectInspector: ObjectInspector): Any => Any =
+objectInspector match {
+  case coi: ConstantObjectInspector if coi.getWritableConstantValue == null =>
+data: Any => null
+  case poi: WritableConstantStringObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.toString)
+  case poi: WritableConstantHiveVarcharObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
+  case poi: WritableConstantHiveCharObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
+  case poi: WritableConstantHiveDecimalObjectInspector =>
+data: Any =>
+  HiveShim.toCatalystDecimal(
+PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
+poi.getWritableConstantValue.getHiveDecimal)
+  case poi: WritableConstantTimestampObjectInspector =>
+data: Any => {
+  val t = poi.getWritableConstantValue
+  t.getSeconds * 1000000L + t.getNanos / 1000L
+}
+  case poi: WritableConstantIntObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantDoubleObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantBooleanObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantLongObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantFloatObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantShortObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantByteObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantBinaryObjectInspector =>
+data: Any => {
+  val writable = poi.getWritableConstantValue
+  val temp = new Array[Byte](writable.getLength)
+  System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
+  temp
+}
+  case poi: WritableConstantDateObjectInspector =>
+data: Any =>
+  DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
+  case mi: StandardConstantMapObjectInspector =>
+val keyUnwrapper = unwrapperFor(mi.getMapKeyObjectInspector)
+val valueUnwrapper = unwrapperFor(mi.getMapValueObjectInspector)
+data: Any => {
+  // take the value from the map inspector object, rather than the input data
+  val keyValues = mi.getWritableConstantValue.asScala.toSeq
+  val keys = keyValues.map(kv => keyUnwrapper(kv._1)).toArray
+  val values = keyValues.map(kv => valueUnwrapper(kv._2)).toArray
+  ArrayBasedMapData(keys, values)
+}
+  case li: StandardConstantListObjectInspector =>
+val unwrapper = unwrapperFor(li.getListElementObjectInspector)
+data: Any => {
+  // take the value from the list inspector object, rather than the input data
+  val values = li.getWritableConstantValue.asScala
+.map(unwrapper)
+.toArray
+  new GenericArrayData(values)
+}
+  case poi: VoidObjectInspector =>
+data: Any =>
+  null // always be null for void object inspector
+  case pi: PrimitiveObjectInspector => pi match {
+// We think HiveVarchar/HiveChar is also a String
+case hvoi: 
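
The quoted diff replaces per-row pattern matching with an unwrapper that is chosen once per object inspector and then reused for every row. A minimal sketch of that idea follows, assuming hypothetical stand-in types (`Inspector`, `ConstantInspector`, `StringInspector`, `ListInspector`) in place of Hive's real `ObjectInspector` hierarchy; it illustrates the pattern only and is not the code in the PR.

```scala
// Hypothetical stand-ins for Hive's ObjectInspector hierarchy (not the real API).
sealed trait Inspector
case class ConstantInspector(value: Any) extends Inspector
case object StringInspector extends Inspector
case class ListInspector(element: Inspector) extends Inspector

object UnwrapperSketch {
  // Match on the inspector once and return a closure specialised for it,
  // so the per-row call does no further type dispatch on the inspector.
  def unwrapperFor(inspector: Inspector): Any => Any = inspector match {
    case ConstantInspector(v) =>
      // A constant inspector ignores the input and returns its own value
      // (which may be null).
      (_: Any) => v
    case StringInspector =>
      (data: Any) => if (data == null) null else data.toString
    case ListInspector(element) =>
      // Complex types unwrap recursively: build the element unwrapper up front.
      val unwrapElement = unwrapperFor(element)
      (data: Any) => data match {
        case null => null
        case xs: Seq[_] => xs.map(unwrapElement)
        case other =>
          throw new IllegalArgumentException(s"Expected a Seq but got: $other")
      }
  }
}
```

In this sketch, `UnwrapperSketch.unwrapperFor(ListInspector(StringInspector))` is built once and can then be applied to each row's value.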

[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13684
  
**[Test build #60623 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60623/consoleFull)**
 for PR 13684 at commit 
[`e6dd074`](https://github.com/apache/spark/commit/e6dd074f9ce44984da15665d36dfa976a6ee384c).





[GitHub] spark issue #13684: [SPARK-15908][R] Add varargs-type dropDuplicates() funct...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13684
  
Thank you, @sun-rui . 
Now, this PR checks all parameters' types correctly.





[GitHub] spark issue #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pattern ma...

2016-06-15 Thread lianhuiwang
Github user lianhuiwang commented on the issue:

https://github.com/apache/spark/pull/13676
  
LGTM





[GitHub] spark pull request #13684: [SPARK-15908][R] Add varargs-type dropDuplicates(...

2016-06-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13684#discussion_r67280096
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1869,14 +1871,22 @@ setMethod("where",
 #' path <- "path/to/file.json"
 #' df <- read.json(path)
 #' dropDuplicates(df)
+#' dropDuplicates(df, "col1", "col2")
 #' dropDuplicates(df, c("col1", "col2"))
 #' }
 setMethod("dropDuplicates",
   signature(x = "SparkDataFrame"),
-  function(x, colNames = columns(x)) {
-stopifnot(class(colNames) == "character")
-
-sdf <- callJMethod(x@sdf, "dropDuplicates", as.list(colNames))
+  function(x, col, ...) {
--- End diff --

Thank you for review, @sun-rui !
I'll improve like that.





[GitHub] spark issue #13371: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13371
  
@yhuai ok. Do you mean I need to create a new PR for this?





[GitHub] spark issue #11929: [SPARK-13934][SQL] fixed table identifier

2016-06-15 Thread yangw1234
Github user yangw1234 commented on the issue:

https://github.com/apache/spark/pull/11929
  
Hi @hvanhovell , just checked. In branch-1.6 latest code, yes this problem 
still exists. Branch master and branch-2.0 don't have this problem.





[GitHub] spark issue #12836: [SPARK-12922][SparkR][WIP] Implement gapply() on DataFra...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12836
  
**[Test build #60621 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60621/consoleFull)**
 for PR 12836 at commit 
[`fe36d24`](https://github.com/apache/spark/commit/fe36d24139ca0fe22b80836b85c7ec67503c1104).





[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11863
  
**[Test build #60622 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60622/consoleFull)**
 for PR 11863 at commit 
[`b404304`](https://github.com/apache/spark/commit/b4043049a9e864bf0b0c0e13affe5102c6c9278c).





[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/11863
  
Jenkins, retest this please





[GitHub] spark pull request #13676: [SPARK-15956] [SQL] When unwrapping ORC avoid pat...

2016-06-15 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/13676#discussion_r67277135
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala ---
@@ -479,7 +354,287 @@ private[hive] trait HiveInspectors {
   }
 
   /**
-   * Builds specific unwrappers ahead of time according to object inspector
+   * Strictly follows the following order in unwrapping (constant OI has the higher priority):
+   * Constant Null object inspector =>
+   *   return null
+   * Constant object inspector =>
+   *   extract the value from constant object inspector
+   * If object inspector prefers writable =>
+   *   extract writable from `data` and then get the catalyst type from the writable
+   * Extract the java object directly from the object inspector
+   *
+   * NOTICE: the complex data type requires recursive unwrapping.
+   */
+  def unwrapperFor(objectInspector: ObjectInspector): Any => Any =
+objectInspector match {
+  case coi: ConstantObjectInspector if coi.getWritableConstantValue == null =>
+data: Any => null
+  case poi: WritableConstantStringObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.toString)
+  case poi: WritableConstantHiveVarcharObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.getHiveVarchar.getValue)
+  case poi: WritableConstantHiveCharObjectInspector =>
+data: Any =>
+  UTF8String.fromString(poi.getWritableConstantValue.getHiveChar.getValue)
+  case poi: WritableConstantHiveDecimalObjectInspector =>
+data: Any =>
+  HiveShim.toCatalystDecimal(
+PrimitiveObjectInspectorFactory.javaHiveDecimalObjectInspector,
+poi.getWritableConstantValue.getHiveDecimal)
+  case poi: WritableConstantTimestampObjectInspector =>
+data: Any => {
+  val t = poi.getWritableConstantValue
+  t.getSeconds * 1000000L + t.getNanos / 1000L
+}
+  case poi: WritableConstantIntObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantDoubleObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantBooleanObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantLongObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantFloatObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantShortObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantByteObjectInspector =>
+data: Any =>
+  poi.getWritableConstantValue.get()
+  case poi: WritableConstantBinaryObjectInspector =>
+data: Any => {
+  val writable = poi.getWritableConstantValue
+  val temp = new Array[Byte](writable.getLength)
+  System.arraycopy(writable.getBytes, 0, temp, 0, temp.length)
+  temp
+}
+  case poi: WritableConstantDateObjectInspector =>
+data: Any =>
+  DateTimeUtils.fromJavaDate(poi.getWritableConstantValue.get())
+  case mi: StandardConstantMapObjectInspector =>
+val keyUnwrapper = unwrapperFor(mi.getMapKeyObjectInspector)
+val valueUnwrapper = unwrapperFor(mi.getMapValueObjectInspector)
+data: Any => {
+  // take the value from the map inspector object, rather than the input data
+  val keyValues = mi.getWritableConstantValue.asScala.toSeq
+  val keys = keyValues.map(kv => keyUnwrapper(kv._1)).toArray
+  val values = keyValues.map(kv => valueUnwrapper(kv._2)).toArray
+  ArrayBasedMapData(keys, values)
+}
+  case li: StandardConstantListObjectInspector =>
+val unwrapper = unwrapperFor(li.getListElementObjectInspector)
+data: Any => {
+  // take the value from the list inspector object, rather than the input data
+  val values = li.getWritableConstantValue.asScala
+.map(unwrapper)
+.toArray
+  new GenericArrayData(values)
+}
+  case poi: VoidObjectInspector =>
+data: Any =>
+  null // always be null for void object inspector
+  case pi: PrimitiveObjectInspector => pi match {
+// We think HiveVarchar/HiveChar is also a String
+case 
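
The timestamp branch in the quoted diff turns a seconds value plus a nanosecond remainder into a single long; Catalyst represents timestamps as microseconds since the epoch. The arithmetic can be sketched on its own as below. The object and method names (`TimestampMicros`, `toMicros`) are hypothetical and used only for illustration.

```scala
object TimestampMicros {
  val MicrosPerSecond: Long = 1000000L
  val NanosPerMicro: Long = 1000L

  // Whole seconds scale up to microseconds; the nanosecond remainder
  // scales down, truncating anything finer than one microsecond.
  def toMicros(seconds: Long, nanos: Int): Long =
    seconds * MicrosPerSecond + nanos / NanosPerMicro
}
```

For example, one second plus 500000 nanoseconds is 1000000 + 500 = 1000500 microseconds.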

[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-15 Thread koeninger
Github user koeninger commented on the issue:

https://github.com/apache/spark/pull/11863
  
Jenkins, retest this please
On Jun 15, 2016 8:43 PM, "UCB AMPLab"  wrote:

> Merged build finished. Test FAILed.






[GitHub] spark issue #13698: [SQL] Removes FileFormat.prepareRead

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13698
  
Merged build finished. Test PASSed.





[GitHub] spark issue #13698: [SQL] Removes FileFormat.prepareRead

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13698
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60618/
Test PASSed.




