date:20160627

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13906
  
Hi, @rxin .
I just remembered this PR while looking at your whitelist PR. :)
Any advice for this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #13939: [SPARK-16248][SQL] Whitelist the list of Hive fallback f...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13939
  
**[Test build #61358 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61358/consoleFull)** for PR 13939 at commit [`ef5db42`](https://github.com/apache/spark/commit/ef5db42b6630c7c891c9f0e5252daf4a37ddca91).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61356/
Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #11863: [SPARK-12177][Streaming][Kafka] Update KafkaDStreams to ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11863
  
**[Test build #61359 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61359/consoleFull)** for PR 11863 at commit [`db95290`](https://github.com/apache/spark/commit/db9529066e9c9dab145f09f2332284f6869ed312).



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13939#discussion_r68701105
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -221,4 +214,18 @@ private[sql] class HiveSessionCatalog(
 }
 }
   }
+
+  /** List of functions we pass over to Hive. Note that over time this list should go to 0. */
+  // We have a list of Hive built-in functions that we do not support. So, we will check
+  // Hive's function registry and lazily load needed functions into our own function registry.
+  // Those Hive built-in functions are
+  // compute_stats, context_ngrams, create_union,
+  // current_user ,elt, ewah_bitmap, ewah_bitmap_and, ewah_bitmap_empty, ewah_bitmap_or, field,
+  // histogram_numeric, in_file, index, inline, java_method, map_keys, map_values,
+  // matchpath, ngrams, noop, noopstreaming, noopwithmap, noopwithmapstreaming,
+  // parse_url, parse_url_tuple, percentile, percentile_approx, posexplode, reflect, reflect2,
+  // regexp, sentences, stack, std, str_to_map, windowingtablefunction, xpath, xpath_boolean,
+  // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
+  // xpath_short, and xpath_string.
+  private val hiveFunctions = Seq("percentile", "percentile_approx")
--- End diff --

Oh.
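
For readers following the diff above, here is a minimal sketch of how a whitelist like `hiveFunctions` could gate the fallback to Hive. It is a toy model, not the code in `HiveSessionCatalog`: the object name `HiveFallbackSketch`, the `resolve` method, and the contents of `nativeFunctions` and `hiveRegistry` are all assumptions made for illustration. Only the two whitelisted names come from the diff.

```scala
// Toy model of a whitelist-guarded fallback. None of these names are Spark APIs.
object HiveFallbackSketch {
  // Functions the native registry already handles (illustrative subset).
  private val nativeFunctions = Set("assert_true", "collect_list", "collect_set")

  // Only these names may fall back to Hive; taken from the diff above.
  private val hiveFunctions = Seq("percentile", "percentile_approx")

  // Stand-in for Hive's much larger built-in registry (assumed contents).
  private val hiveRegistry =
    Set("percentile", "percentile_approx", "histogram_numeric", "xpath")

  /** Reports which registry resolves the function, or an error message. */
  def resolve(name: String): Either[String, String] = {
    val key = name.toLowerCase
    if (nativeFunctions.contains(key)) Right("native")
    else if (hiveFunctions.contains(key) && hiveRegistry.contains(key)) Right("hive")
    else Left(s"Undefined function: $name")
  }
}
```

In this sketch a function such as `xpath` exists in the stand-in Hive registry but is rejected because it is not whitelisted, which is the behavior a shrinking list would gradually extend to every Hive built-in.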



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread mengxr

Github user mengxr commented on the issue:

https://github.com/apache/spark/pull/13921
  
I think the error occurred because this PR left `predict`, `write.ml`, etc. 
documented without a title. So this PR has to be combined with SPARK-16144. 
Basically, let us add some docs to the function declarations under `generics.R`.

cc: @yinxusen 



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700984
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Oh, @rxin . I misunderstood your question. Yes. We did not register the Hive 
function beforehand.



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13939#discussion_r68700956
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -162,17 +162,6 @@ private[sql] class HiveSessionCatalog(
 }
   }
 
-  // We have a list of Hive built-in functions that we do not support. So, we will check
-  // Hive's function registry and lazily load needed functions into our own function registry.
-  // Those Hive built-in functions are
-  // assert_true, collect_list, collect_set, compute_stats, context_ngrams, create_union,
--- End diff --

assert_true, collect_list, collect_set are supported already



[GitHub] spark pull request #13939: [SPARK-16248][SQL] Whitelist the list of Hive fal...

2016-06-27 Thread rxin

GitHub user rxin opened a pull request:

https://github.com/apache/spark/pull/13939

[SPARK-16248][SQL] Whitelist the list of Hive fallback functions - WIP

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?
N/A



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rxin/spark hive-whitelist

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13939


commit ef5db42b6630c7c891c9f0e5252daf4a37ddca91
Author: Reynold Xin 
Date:   2016-06-28T05:53:22Z

[SPARK-16248][SQL] Whitelist the list of Hive fallback functions





[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)** for PR 13937 at commit [`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61357 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61357/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61351/
Test PASSed.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Merged build finished. Test PASSed.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700695
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

I mean we need to call `createTempFunction` with `double` children instead 
of `decimal` children.



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700636
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

@rxin . Actually, we do call `createTempFunction` for the Hive function on the 
fly, but with a **different** signature (Decimal).
`makeFunctionBuilder` indeed uses `children` implicitly. That's the reason 
why I renamed `lookupFunction` to `subLookupFunction` and repeat the same 
process with different children.
```
  val builder = makeFunctionBuilder(functionName, className)
  // Put this Hive built-in function to our function registry.
  val info = new ExpressionInfo(className, functionName)
  createTempFunction(functionName, info, builder, ignoreIfExists = false)
```
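
The retry described above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the patch itself: `Arg`, `DataType`, and the string result are hypothetical stand-ins for Catalyst expressions, and the assumption that the inner lookup rejects decimal arguments is only there to trigger the second attempt. The names `lookupFunction` and `subLookupFunction` mirror the diff.

```scala
// Simplified model of "look up, and on failure retry with different children".
object RetryLookupSketch {
  sealed trait DataType
  case object DecimalType extends DataType
  case object DoubleType extends DataType

  // A typed argument; a stand-in for an expression (hypothetical).
  final case class Arg(value: BigDecimal, dataType: DataType)

  // Hypothetical inner lookup that is assumed to reject decimal arguments.
  def subLookupFunction(name: String, children: Seq[Arg]): String = {
    if (children.exists(_.dataType == DecimalType))
      throw new IllegalArgumentException(s"$name does not accept decimal arguments")
    s"$name(${children.map(_.value).mkString(", ")})"
  }

  // Try the original children first; on failure, retry once with every
  // decimal child re-tagged as a double.
  def lookupFunction(name: String, children: Seq[Arg]): String =
    try subLookupFunction(name, children)
    catch {
      case _: IllegalArgumentException =>
        val recast = children.map {
          case Arg(v, DecimalType) => Arg(v, DoubleType)
          case other => other
        }
        subLookupFunction(name, recast)
    }
}
```

The sketch retries exactly once and only rewrites decimal children, which illustrates why a fix of this shape depends on knowing in advance which argument types cause the first lookup to fail.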



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61356 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61356/consoleFull)** for PR 13806 at commit [`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
Again, I think the error message is not related to this change. I will 
retest this and meanwhile try to build locally.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13938



[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13938
  
LGTM - merging in master/2.0.




[GitHub] spark issue #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programmin...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13938
  
**[Test build #61355 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61355/consoleFull)** for PR 13938 at commit [`7455a49`](https://github.com/apache/spark/commit/7455a4925ea0f859ea3978930f03e972a7e07929).



[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/9183
  
I think @yanboliang just needs to push this forward and get people to review 
it.




[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700193
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Regarding the following comment, I think this is exactly how Spark 1.6 and 
earlier behaved, so I don't think it is a problem.
> this will fail again as soon as we pass in an argument with a slightly 
different value



[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13839
  
LGTM pending tests.




[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700137
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

yea i think the problem is that we don't register the hive function?



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68700034
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

Hi, @hvanhovell .
I tried again, but, as you saw in my first commit, this happens while 
resolving `UnresolvedFunction`.


https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L884

IMHO, we cannot do this in `ExpectsInputTypes`.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61349/consoleFull)** for PR 13937 at commit [`5246bcf`](https://github.com/apache/spark/commit/5246bcfa1ba510c281c456b0f61bf32f70d10174).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13839: [SPARK-16128][SQL] Allow setting length of characters to...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13839
  
**[Test build #61354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61354/consoleFull)** for PR 13839 at commit [`b170741`](https://github.com/apache/spark/commit/b170741c4b286893e20b8894f20812af1d6e6fd4).



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68699881
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+try {
+  subLookupFunction(name, children)
+} catch {
--- End diff --

@dongjoon-hyun the current fix is quite brittle; this will fail again as 
soon as we pass in an argument with a slightly different value. The Analyzer 
will create casts to the proper type if we implement `ExpectsInputTypes`. So 
this seems like the best course of action. It might not be the easiest fix, or 
entirely possible; but I'd prefer to try this first.
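
To illustrate the idea of letting the analyzer insert casts from declared input types, here is a small self-contained sketch. It does not use Catalyst: `Expr`, `Literal`, `Cast`, `Func`, and `coerce` are hypothetical stand-ins that only mimic the general shape of an expression declaring its expected types and a rule wrapping mismatched children in casts.

```scala
// Toy model of cast insertion driven by declared input types.
object ImplicitCastSketch {
  sealed trait DataType
  case object DecimalType extends DataType
  case object DoubleType extends DataType

  sealed trait Expr { def dataType: DataType }
  final case class Literal(value: String, dataType: DataType) extends Expr
  final case class Cast(child: Expr, dataType: DataType) extends Expr

  // A function call that declares the type it expects for each child,
  // in the spirit of `ExpectsInputTypes` (hypothetical shape).
  final case class Func(name: String, inputTypes: Seq[DataType], children: Seq[Expr])

  // Analyzer-like rule: wrap each child whose type differs from the
  // declared one in a Cast. Children beyond the declared types are dropped
  // by zip, so this sketch assumes both sequences have the same length.
  def coerce(f: Func): Func =
    f.copy(children = f.children.zip(f.inputTypes).map {
      case (child, expected) if child.dataType != expected => Cast(child, expected)
      case (child, _) => child
    })
}
```

With this shape, a decimal literal passed to a function that declares a double input is wrapped in a cast regardless of its value, which is the robustness argument made in the comment above.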



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61349/



[GitHub] spark issue #9183: [SPARK-11215] [ML] Add multiple columns support to String...

2016-06-27 Thread pkch

Github user pkch commented on the issue:

https://github.com/apache/spark/pull/9183
  
What needs to happen to move this forward? This was a PR that would have 
been the first iteration of a significant improvement in handling of wide 
datasets.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13937
  
Merged build finished. Test PASSed.



[GitHub] spark pull request #13938: [SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL pro...

2016-06-27 Thread yhuai

GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/13938

[SPARK-15863][SQL][DOC][FOLLOW-UP] Update SQL programming guide.

## What changes were proposed in this pull request?
This PR makes several updates to SQL programming guide.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13938


commit ce0f54e074099f2c416169d5f62f93b23587f43a
Author: Yin Huai 
Date:   2016-06-28T04:20:12Z

wip

commit 7455a4925ea0f859ea3978930f03e972a7e07929
Author: Yin Huai 
Date:   2016-06-28T05:26:33Z

[SPARK-15863][SQL][DOC] Update SQL programming guide.





[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread hhbyyh

Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/13937
  
LGTM



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13933
  
LGTM -- cc @tdas to take a look since he wrote the original patch.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61353/



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61353 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68699390
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -435,6 +434,37 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
   }
 
   /**
+   * Parse a key-value map from a [[OptionParameterListContext]], assuming all values are
+   * specified. This allows string, boolean, decimal and integer literals which are converted
+   * to strings.
+   */
+  override def visitOptionParameterList(ctx: OptionParameterListContext): Map[String, String] = {
+    // TODO: Currently it does not treat null. Hive does not allow null for metadata and
+    // throws an exception.
+    val properties = ctx.optionParameter.asScala.map { property =>
+      val key = visitTablePropertyKey(property.key)
+      val value = if (property.value.STRING != null) {
+        string(property.value.STRING)
+      } else if (property.value.booleanValue != null) {
+        property.value.getText.toLowerCase
+      } else {
+        property.value.getText
+      }
+      key -> value
+    }
+
+    // Check for duplicate property names.
+    checkDuplicateKeys(properties, ctx)
+    val props = properties.toMap
+    val badKeys = props.filter { case (_, v) => v == null }.keys
--- End diff --

NIT (not your code): `val badKeys = props.collect { case (key, null) => key }`
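
For comparison, a small self-contained sketch of the two forms. The `props` map and the object and method names here are made up for illustration; only the `filter`/`collect` expressions mirror the diff and the suggestion.

```scala
object BadKeysExample {
  // Hypothetical option map; a null value stands for an option given without a value.
  val props: Map[String, String] =
    Map("path" -> "/tmp/data", "header" -> null, "sep" -> null)

  // Form used in the diff: filter the entries, then project the keys.
  def viaFilter(m: Map[String, String]): Set[String] =
    m.filter { case (_, v) => v == null }.keys.toSet

  // Suggested form: match the null value directly in the pattern and
  // return the key in a single pass.
  def viaCollect(m: Map[String, String]): Set[String] =
    m.collect { case (key, null) => key }.toSet
}
```

Both return the same keys; `collect` just avoids building the intermediate filtered map.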



[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...

2016-06-27 Thread hhbyyh

Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/13936
  
Just saw @yanboliang opened a jira for this too. I'll close the PR and 
resolve the jira.



[GitHub] spark pull request #13936: [SPARK-16243][ML] model loading backward compatib...

2016-06-27 Thread hhbyyh

Github user hhbyyh closed the pull request at:

https://github.com/apache/spark/pull/13936



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68699131
  
--- Diff: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -45,11 +45,11 @@ statement
     | ALTER DATABASE identifier SET DBPROPERTIES tablePropertyList     #setDatabaseProperties
     | DROP DATABASE (IF EXISTS)? identifier (RESTRICT | CASCADE)?      #dropDatabase
     | createTableHeader ('(' colTypeList ')')? tableProvider
-        (OPTIONS tablePropertyList)?
+        (OPTIONS optionParameterList)?
         (PARTITIONED BY partitionColumnNames=identifierList)?
         bucketSpec?                                                    #createTableUsing
     | createTableHeader tableProvider
-        (OPTIONS tablePropertyList)?
--- End diff --

Why not generalize the `tableProperty` rule and use `optionValue` (rename 
it to something more consistent) as its value rule? Seems easier.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61353 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61353/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread gatorsmile

Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/13933
  
cc @rxin The code is ready for review. Thanks!



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
Hm... am I doing something wrong here?



[GitHub] spark pull request #13517: [SPARK-14839][SQL] Support for other types as opt...

2016-06-27 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/13517#discussion_r68698738
  
--- Diff: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -252,6 +252,21 @@ tablePropertyKey
 | STRING
 ;
 
+optionParameterList
+: '(' optionParameter (',' optionParameter)* ')'
+;
+
+optionParameter
+: key=tablePropertyKey (EQ? value=optionValue)?
--- End diff --

We could remove `EQ?` here. This is actually not supported by data source 
tables. 



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61352/



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61352 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13918
  
Yea it's good to have this in branch-2.0.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61352 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61352/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread HyukjinKwon

Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/13806
  
retest this please



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13918
  
Thank you for merging, @liancheng ! :)



[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13914
  
Thank you for merging, @rxin .



[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread liancheng

Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/13918
  
Thanks, merged to master.

@rxin Shall we have this in branch-2.0 at this stage?



[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/13915
  
@mengxr 's idea sounds good to me, too.
May I update this PR, @rxin ?



[GitHub] spark pull request #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger vi...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13918



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61351 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61351/consoleFull)**
 for PR 13937 at commit 
[`ce04e08`](https://github.com/apache/spark/commit/ce04e08e5fff17ecdf47a1934ae8a453d051b67e).



[GitHub] spark pull request #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordRea...

2016-06-27 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13914



[GitHub] spark issue #13517: [SPARK-14839][SQL] Support for other types as option in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13517
  
cc @hvanhovell for this one



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61337/



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61337 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61337/consoleFull)**
 for PR 13806 at commit 
[`5ade5a2`](https://github.com/apache/spark/commit/5ade5a2b2aa5064baed055f4f26f9335d4cb0ca0).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61350 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61350/consoleFull)**
 for PR 13937 at commit 
[`8be63d5`](https://github.com/apache/spark/commit/8be63d5dbd8e3e62fd23248efa6be826e09e3ce3).



[GitHub] spark pull request #13930: [SPARK-16228][SQL] HiveSessionCatalog should retu...

2016-06-27 Thread dongjoon-hyun

Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/13930#discussion_r68697699
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala ---
@@ -174,6 +175,18 @@ private[sql] class HiveSessionCatalog(
   // xpath_double, xpath_float, xpath_int, xpath_long, xpath_number,
   // xpath_short, and xpath_string.
   override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
+    try {
+      subLookupFunction(name, children)
+    } catch {
--- End diff --

Thank you for the advice, @hvanhovell .
Do you mean adding `ExpectsInputTypes` to `HiveSimpleUDF`, 
`HiveGenericUDF`, `HiveUDAFFunction`?
We only have 4 expressions to handle all generic Hive functions. So, 
currently, `makeFunctionBuilder` seems to do its type checking by calling 
`udf.dataType` on the fly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #13914: [SPARK-16111][SQL][DOC] Hide SparkOrcNewRecordReader in ...

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13914
  
Merging in master/2.0.




[GitHub] spark issue #13915: [SPARK-16081][BUILD] Disallow using `l` as variable name

2016-06-27 Thread rxin

Github user rxin commented on the issue:

https://github.com/apache/spark/pull/13915
  
yea I think you can argue this should be discouraged but not necessarily 
justify banning.




[GitHub] spark issue #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS

2016-06-27 Thread hqzizania

Github user hqzizania commented on the issue:

https://github.com/apache/spark/pull/13891
  
@mengxr this is a simple imitation of the loop that ALS uses in 
`computeFactors[ID]()`. It runs on a bare-metal node with 4 cores. All tests 
use all cores by splitting the RDD into multiple partitions.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13806
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61336/
Test FAILed.



[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...

2016-06-27 Thread yanboliang

Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/13937#discussion_r68697383
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala ---
@@ -206,24 +206,21 @@ object PCAModel extends MLReadable[PCAModel] {
 override def load(path: String): PCAModel = {
   val metadata = DefaultParamsReader.loadMetadata(path, sc, className)
 
-  // explainedVariance field is not present in Spark <= 1.6
-  val versionRegex = "([0-9]+)\\.([0-9]+).*".r
-  val hasExplainedVariance = metadata.sparkVersion match {
-case versionRegex(major, minor) =>
-  major.toInt >= 2 || (major.toInt == 1 && minor.toInt > 6)
-case _ => false
-  }
+  val versionRegex = "([0-9]+)\\.(.+)".r
+  val versionRegex(major, _) = metadata.sparkVersion
 
   val dataPath = new Path(path, "data").toString
-  val model = if (hasExplainedVariance) {
+  val model = if (major.toInt >= 2) {
 val Row(pc: DenseMatrix, explainedVariance: DenseVector) =
   sparkSession.read.parquet(dataPath)
 .select("pc", "explainedVariance")
 .head()
 new PCAModel(metadata.uid, pc, explainedVariance)
   } else {
-val Row(pc: DenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
-new PCAModel(metadata.uid, pc, Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
+// explainedVariance field is not present and we use the old matrix in Spark <= 2.0
+val Row(pc: OldDenseMatrix) = sparkSession.read.parquet(dataPath).select("pc").head()
+new PCAModel(metadata.uid, pc.asML,
+  Vectors.dense(Array.empty[Double]).asInstanceOf[DenseVector])
--- End diff --

Here we combine the ```explainedVariance``` field issue and the old matrix 
issue together to handle backward compatibility.
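
The version gate in the diff above can be tried out on its own. The following is a minimal sketch, assuming only the regex shown in the diff; `VersionGate`, `majorVersion`, and `usesNewFormat` are hypothetical names introduced for illustration and are not part of Spark.

```scala
object VersionGate {
  // Same pattern as in the diff: capture the major version, ignore the rest.
  private val versionRegex = "([0-9]+)\\.(.+)".r

  /** Returns the major version, or None if the string does not match. */
  def majorVersion(sparkVersion: String): Option[Int] = sparkVersion match {
    case versionRegex(major, _) => Some(major.toInt)
    case _ => None
  }

  /** Hypothetical helper: true when the saved model would take the `major >= 2` branch. */
  def usesNewFormat(sparkVersion: String): Boolean =
    majorVersion(sparkVersion).exists(_ >= 2)
}
```

Unlike the pattern-binding `val versionRegex(major, _) = ...` in the diff, which throws a `MatchError` on an unexpected version string, this sketch returns `None` and falls back to the old-format branch. Whether that fallback is desirable is a design choice the PR does not discuss.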



[GitHub] spark issue #13806: [SPARK-16044][SQL] Backport input_file_name() for data s...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13806
  
**[Test build #61336 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61336/consoleFull)**
 for PR 13806 at commit 
[`2a55091`](https://github.com/apache/spark/commit/2a550912f1194e9c212d9f4f78824eaf375ddccc).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13937
  
**[Test build #61349 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61349/consoleFull)**
 for PR 13937 at commit 
[`5246bcf`](https://github.com/apache/spark/commit/5246bcfa1ba510c281c456b0f61bf32f70d10174).



[GitHub] spark issue #13937: [SPARK-16245] [ML] model loading backward compatibility ...

2016-06-27 Thread yanboliang

Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/13937
  
cc @hhbyyh @mengxr 



[GitHub] spark pull request #13937: [SPARK-16245] [ML] model loading backward compati...

2016-06-27 Thread yanboliang

GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/13937

[SPARK-16245] [ML] model loading backward compatibility for ml.feature.PCA

## What changes were proposed in this pull request?
Adds model loading backward compatibility for ml.feature.PCA.

## How was this patch tested?
Existing unit tests and a manual test of loading old models.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-16245

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/13937.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13937


commit 5246bcfa1ba510c281c456b0f61bf32f70d10174
Author: Yanbo Liang 
Date:   2016-06-28T04:42:41Z

model loading backward compatibility for ml.feature.PCA





[GitHub] spark issue #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS

2016-06-27 Thread hqzizania

Github user hqzizania commented on the issue:

https://github.com/apache/spark/pull/13891
  
code for testing

```
  // Imports this snippet relies on (assumed; the aliases match those used in
  // Spark's ALS.scala). Paste into spark-shell or wrap in an object to run.
  import java.{util => ju}

  import scala.collection.mutable
  import scala.util.Random

  import com.github.fommil.netlib.BLAS.{getInstance => blas}

  def run(rank: Int, a: Int): Unit = {
    println(s"blas.getclass() = ${blas.getClass.toString} on process $rank")

    val m = 1 << a          // number of observations
    val n = 1 << a - 1      // features per observation; parses as 1 << (a - 1)
    val stack = 1 << a - 2  // observations per stacked update; parses as 1 << (a - 2)
    val matrix = new Array[Array[Float]](m).map { x =>
      val y = new Array[Float](n)
      y.map(a => Random.nextFloat())
    }
    val bVector = new Array[Double](m).map(x => Random.nextDouble())
    val ls = new NormalEquation(n)

    for (u <- 0 to 3) {
      // Baseline: one rank-1 update (dspr) per observation.
      ls.reset()
      val t0 = System.nanoTime()
      for (i <- 0 until m)
        ls.add(matrix(i), bVector(i))
      val t1 = System.nanoTime()
      // nanoTime is in nanoseconds, so divide by 1000000 to report milliseconds.
      println("nostack Elapsed time: " + (t1 - t0) / 1000000 + s"ms on process $rank")

      // Stacked: copy `stack` observations into one buffer, then one dsyrk per buffer.
      ls.reset()
      val t2 = System.nanoTime()
      var i = 0
      while (i < m) {
        val matrixBuffer = mutable.ArrayBuilder.make[Double]
        val bBuffer = mutable.ArrayBuilder.make[Double]
        for (s <- 0 until stack) {
          for (j <- 0 until n) {
            matrixBuffer += matrix(i + s)(j)
          }
          bBuffer += bVector(i + s)
        }
        i += stack
        ls.addStack(matrixBuffer.result(), bBuffer.result(), stack)
      }
      val t3 = System.nanoTime()
      println("stack Elapsed time: " + (t3 - t2) / 1000000 + s"ms on process $rank")
    }
  }

  class NormalEquation(val k: Int) extends Serializable {

    /** Number of entries in the upper triangular part of a k-by-k matrix. */
    val triK = k * (k + 1) / 2
    /** A^T^ * A */
    val ata = new Array[Double](triK)
    /** A^T^ * b */
    val atb = new Array[Double](k)

    private val da = new Array[Double](k)
    private val ata2 = new Array[Double](k * k)
    private val upper = "U"

    private def copyToDouble(a: Array[Float]): Unit = {
      var i = 0
      while (i < k) {
        da(i) = a(i)
        i += 1
      }
    }

    /** Folds the upper triangle of the full matrix ata2 into the packed array ata. */
    private def copyToTri(): Unit = {
      var ii = 0
      for (i <- 0 until k)
        for (j <- 0 to i) {
          ata(ii) += ata2(i * k + j)
          ata2(i * k + j) = 0
          ii += 1
        }
    }

    /** Adds an observation. */
    def add(a: Array[Float], b: Double, c: Double = 1.0): this.type = {
      require(c >= 0.0)
      require(a.length == k)
      copyToDouble(a)
      blas.dspr(upper, k, c, da, 1, ata)
      if (b != 0.0) {
        blas.daxpy(k, c * b, da, 1, atb, 1)
      }
      this
    }

    /** Adds a stack of observations. */
    def addStack(a: Array[Double], b: Array[Double], n: Int): this.type = {
      require(a.length == n * k)
      blas.dsyrk(upper, "N", k, n, 1.0, a, k, 1.0, ata2, k)
      copyToTri()
      blas.dgemv("N", k, n, 1.0, a, k, b, 1, 1.0, atb, 1)
      this
    }

    /** Merges another normal equation object. */
    def merge(other: NormalEquation): this.type = {
      require(other.k == k)
      blas.daxpy(ata.length, 1.0, other.ata, 1, ata, 1)
      blas.daxpy(atb.length, 1.0, other.atb, 1, atb, 1)
      this
    }

    /** Resets everything to zero, which should be called after each solve. */
    def reset(): Unit = {
      ju.Arrays.fill(ata, 0.0)
      ju.Arrays.fill(ata2, 0.0)
      ju.Arrays.fill(atb, 0.0)
    }
  }
```

results:


![image](https://cloud.githubusercontent.com/assets/9315372/16404009/6914f620-3d2d-11e6-9df4-3d838341794e.png)


![image](https://cloud.githubusercontent.com/assets/9315372/16403992/42797270-3d2d-11e6-8ecf-401796b29cfa.png)
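
The two code paths in the benchmark are meant to accumulate the same packed upper triangle of A^T^ * A. The sketch below checks that without BLAS, using plain loops as stand-ins for the `dspr` and `dsyrk` calls. `PackedTriangle` and its methods are hypothetical names for illustration, and the loops only mirror what those routines compute under the column-major, "U" layout used in the test code.

```scala
object PackedTriangle {
  /** Packed upper-triangular, column-major index of entry (row, col), with row <= col. */
  def packedIndex(row: Int, col: Int): Int = col * (col + 1) / 2 + row

  /** Stand-in for a rank-1 update: adds x * x^T into the packed upper triangle. */
  def rank1Update(k: Int, x: Array[Double], packed: Array[Double]): Unit = {
    for (col <- 0 until k; row <- 0 to col) {
      packed(packedIndex(row, col)) += x(row) * x(col)
    }
  }

  /**
   * Stand-in for a rank-n update: adds A * A^T into the upper triangle of a full
   * k-by-k column-major array. `a` holds n observations of length k, back to back.
   */
  def rankNUpdate(k: Int, n: Int, a: Array[Double], full: Array[Double]): Unit = {
    for (col <- 0 until k; row <- 0 to col) {
      var sum = 0.0
      var s = 0
      while (s < n) {
        sum += a(s * k + row) * a(s * k + col)
        s += 1
      }
      full(col * k + row) += sum
    }
  }

  /** Same traversal as copyToTri(): walk the upper triangle column by column. */
  def copyToPacked(k: Int, full: Array[Double], packed: Array[Double]): Unit = {
    var ii = 0
    for (i <- 0 until k; j <- 0 to i) {
      packed(ii) += full(i * k + j)
      full(i * k + j) = 0.0
      ii += 1
    }
  }
}
```

For two observations `(1, 2, 3)` and `(4, 5, 6)`, two calls to `rank1Update` and one `rankNUpdate` followed by `copyToPacked` give the same packed array. That equivalence is the property the stacked path depends on.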





[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13936
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13936
  
**[Test build #61346 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61346/consoleFull)**
 for PR 13936 at commit 
[`077bc8f`](https://github.com/apache/spark/commit/077bc8f9a387d69ebb9c508f3122007155ad2fda).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13936: [SPARK-16243][ML] model loading backward compatibility f...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13936
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61346/
Test PASSed.



[GitHub] spark issue #13935: [SPARK-16242] [MLlib] [PySpark] Conversion between old/n...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13935
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13935: [SPARK-16242] [MLlib] [PySpark] Conversion between old/n...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13935
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61345/
Test PASSed.



[GitHub] spark issue #13935: [SPARK-16242] [MLlib] [PySpark] Conversion between old/n...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13935
  
**[Test build #61345 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61345/consoleFull)**
 for PR 13935 at commit 
[`1178933`](https://github.com/apache/spark/commit/11789339b0eab023bca61e24ac5e73f715a2d97a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13868
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13868
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61344/
Test PASSed.



[GitHub] spark issue #13868: [SPARK-15899] [SQL] Fix the construction of the file pat...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13868
  
**[Test build #61344 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61344/consoleFull)**
 for PR 13868 at commit 
[`f247423`](https://github.com/apache/spark/commit/f24742308c61c9ed7f572b1f1aacfafda666571a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13933
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61342/
Test PASSed.



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13933
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13933
  
**[Test build #61342 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61342/consoleFull)**
 for PR 13933 at commit 
[`bf1c9b5`](https://github.com/apache/spark/commit/bf1c9b5007f2054e5ff6ae9cbb2dc039706e9949).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13921
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61348/
Test FAILed.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13921
  
Merged build finished. Test FAILed.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13921
  
**[Test build #61348 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61348/consoleFull)**
 for PR 13921 at commit 
[`70c312f`](https://github.com/apache/spark/commit/70c312f683159b63089c25cf62dfab1074305191).



[GitHub] spark issue #13603: [SPARK-15865][CORE] Blacklist should not result in job h...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13603
  
Merged build finished. Test PASSed.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13921
  
**[Test build #61348 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61348/consoleFull)**
 for PR 13921 at commit 
[`70c312f`](https://github.com/apache/spark/commit/70c312f683159b63089c25cf62dfab1074305191).
 * This patch **fails some tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ShowFunctionsCommand(`



[GitHub] spark issue #13603: [SPARK-15865][CORE] Blacklist should not result in job h...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13603
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61341/
Test PASSed.



[GitHub] spark issue #13603: [SPARK-15865][CORE] Blacklist should not result in job h...

2016-06-27 Thread SparkQA

Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/13603
  
**[Test build #61341 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61341/consoleFull)**
 for PR 13603 at commit 
[`f4e95c6`](https://github.com/apache/spark/commit/f4e95c624db60802d08fd0e64c68fa9e0593a086).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread keypointt

Github user keypointt commented on the issue:

https://github.com/apache/spark/pull/13921
  
OK doing it now



[GitHub] spark issue #13921: [SPARK-16140][MLlib][SparkR][Docs] Group k-means method ...

2016-06-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/13921
  
Merged build finished. Test FAILed.


