[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14258 **[Test build #62912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62912/consoleFull)** for PR 14258 at commit [`22f2f78`](https://github.com/apache/spark/commit/22f2f786bceeb599645c12210e3f49e66378ba6c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14258 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62911/ Test FAILed.
[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14258 Merged build finished. Test FAILed.
[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14258 **[Test build #62911 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62911/consoleFull)** for PR 14258 at commit [`fa94e3c`](https://github.com/apache/spark/commit/fa94e3cc99e93aea708a609733bbe9364b904efe).
* This patch **fails R style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14258: [Spark-16579][SparkR] add install_spark function
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14258 **[Test build #62911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62911/consoleFull)** for PR 14258 at commit [`fa94e3c`](https://github.com/apache/spark/commit/fa94e3cc99e93aea708a609733bbe9364b904efe).
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 BTW, this is not only about user-given schemas. Currently, datasources based on `FileFormat` always read data into a DataFrame while ignoring the nullability in the schema (for both user-given and inferred/read schemas). However, this does not happen when the same datasources read streams (or when using another JSON API). So this PR tries to make them consistent by ignoring the nullability in the schema.
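For illustration only (the names and schema representation here are hypothetical, not Spark's actual API), the behavior being made consistent above, forcing every field of a user-given schema to nullable before reading, can be sketched as:

```python
# Hypothetical sketch of "ignore nullability in a user-given schema":
# every field is forced to nullable before the data is read.
def as_nullable(schema):
    """Return a copy of the schema with every field marked nullable.

    A schema here is a list of (name, type, nullable) tuples, a stand-in
    for Spark's StructType used only to illustrate the idea.
    """
    return [(name, dtype, True) for (name, dtype, _) in schema]

user_schema = [("id", "int", False), ("name", "string", True)]
# Every field comes back nullable, regardless of what the user declared.
print(as_nullable(user_schema))
```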
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 Thanks for the feedback @cloud-fan! If the user-given schema is wrong, it is handled differently by each datasource:
- For JSON and CSV, it is generally permissive (for example, allowing compatibility among numeric types).
- For ORC and Parquet, it is generally strict about types, so they don't allow such compatibility (except for a very few cases, e.g. for Parquet, https://github.com/apache/spark/pull/14272 and https://github.com/apache/spark/pull/14278). I think so. Should we disallow specifying schemas for these?
- For JDBC, it does not take a user-given schema since it does not implement `SchemaRelationProvider`.
[GitHub] spark pull request #14257: [SPARK-16621][SQL] Generate stable SQLs in SQLBui...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14257
[GitHub] spark issue #14257: [SPARK-16621][SQL] Generate stable SQLs in SQLBuilder
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14257 LGTM, merging to master. Thanks!
[GitHub] spark issue #14358: [SPARK-16729][SQL] Throw analysis exception for invalid ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14358 **[Test build #62910 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62910/consoleFull)** for PR 14358 at commit [`0161896`](https://github.com/apache/spark/commit/016189620d711f8e8abb0b2886b9b35ac1321911).
[GitHub] spark issue #14358: [SPARK-16729][SQL] Throw analysis exception for invalid ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14358 LGTM, pending jenkins.
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14124 What will happen if the given schema is wrong? It seems weird that we allow users to provide schema while reading the data, but without validating it.
[GitHub] spark pull request #14358: [SPARK-16729][SQL] Throw analysis exception for i...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14358#discussion_r72380761
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
@@ -54,7 +54,9 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     // follow [[org.apache.spark.sql.catalyst.expressions.Cast.canCast]] logic
     // to ensure we test every possible cast situation here
     atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
-      checkNullCast(from, to)
+      if (Cast.canCast(from, to)) {
--- End diff --
removed
[GitHub] spark pull request #14358: [SPARK-16729][SQL] Throw analysis exception for i...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14358#discussion_r72380667
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
@@ -54,7 +54,9 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     // follow [[org.apache.spark.sql.catalyst.expressions.Cast.canCast]] logic
     // to ensure we test every possible cast situation here
     atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
-      checkNullCast(from, to)
+      if (Cast.canCast(from, to)) {
--- End diff --
Ah, this is doing a self cast. I read it wrong; let me remove it.
[GitHub] spark pull request #14296: [SPARK-16639][SQL] The query with having conditio...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14296#discussion_r72380502
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1207,6 +1207,17 @@ class Analyzer(
           val alias = Alias(ae, ae.toString)()
           aggregateExpressions += alias
           alias.toAttribute
+          // Replacing [[NamedExpression]] causes the error on [[Grouping]] because the
+          // grouping column will be new attribute created by adding additional [[Alias]].
+          // So we can't find the grouping column and replace it in the rule
+          // [[ResolveGroupingAnalytics]].
--- End diff --
I don't quite understand this comment, can you give a concrete example?
[GitHub] spark pull request #14358: [SPARK-16729][SQL] Throw analysis exception for i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14358#discussion_r72380321
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
@@ -54,7 +54,9 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     // follow [[org.apache.spark.sql.catalyst.expressions.Cast.canCast]] logic
     // to ensure we test every possible cast situation here
     atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
-      checkNullCast(from, to)
+      if (Cast.canCast(from, to)) {
--- End diff --
```
def canCast(from: DataType, to: DataType): Boolean = (from, to) match {
  case (fromType, toType) if fromType == toType => true
  ..
```
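As an aside, the point being made here can be shown with a small sketch (not Spark's code): zipping a list with itself only ever yields self-pairs, and the quoted first rule of `canCast` makes identical types always castable, so the guard in the test can never be false.

```python
# Sketch (not Spark's code) of why the canCast guard is redundant above:
# zipping a list with itself only ever yields self-pairs.
atomic_types = ["byte", "short", "int", "long", "date", "timestamp"]

def can_cast(from_type, to_type):
    # Mirrors only the quoted fast path of Cast.canCast;
    # the remaining rules are elided.
    return from_type == to_type

pairs = list(zip(atomic_types, atomic_types))
assert all(f == t for f, t in pairs)          # every pair is a self-pair
assert all(can_cast(f, t) for f, t in pairs)  # so the guard always passes
```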
[GitHub] spark pull request #14358: [SPARK-16729][SQL] Throw analysis exception for i...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14358#discussion_r72380199
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
@@ -54,7 +54,9 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     // follow [[org.apache.spark.sql.catalyst.expressions.Cast.canCast]] logic
     // to ensure we test every possible cast situation here
     atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
-      checkNullCast(from, to)
+      if (Cast.canCast(from, to)) {
--- End diff --
Not all atomicTypes can cast from each other? E.g. date.
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72380101
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLCompatibilityFunctionSuite.scala ---
@@ -69,4 +73,25 @@ class SQLCompatibilityFunctionSuite extends QueryTest with SharedSQLContext {
       sql("SELECT nvl2(null, 1, 2.1d), nvl2('n', 1, 2.1d)"),
       Row(2.1, 1.0))
   }
+
+  test("SPARK-16730 cast alias functions for Hive compatibility") {
+    checkAnswer(
+      sql("SELECT boolean(1), tinyint(1), smallint(1), int(1), bigint(1)"),
+      Row(true, 1.toByte, 1.toShort, 1, 1L))
+
+    checkAnswer(
+      sql("SELECT float(1), double(1), decimal(1)"),
+      Row(1.toFloat, 1.0, new BigDecimal(1)))
+
+    checkAnswer(
+      sql("SELECT date(\"2014-04-04\"), timestamp(date(\"2014-04-04\"))"),
+      Row(new java.util.Date(114, 3, 4), new Timestamp(114, 3, 4, 0, 0, 0, 0)))
+
+    checkAnswer(
+      sql("SELECT string(1)"),
+      Row("1"))
+
+    // Error handling: only one argument
+    assert(intercept[AnalysisException](sql("SELECT string(1, 2)")).getMessage.contains("one arg"))
--- End diff --
fixed
[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14364 **[Test build #62909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62909/consoleFull)** for PR 14364 at commit [`3b78da3`](https://github.com/apache/spark/commit/3b78da343c06b7f1df2a67136cda99b4b74bc0f7).
[GitHub] spark pull request #14358: [SPARK-16729][SQL] Throw analysis exception for i...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14358#discussion_r72379875
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala ---
@@ -54,7 +54,9 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
     // follow [[org.apache.spark.sql.catalyst.expressions.Cast.canCast]] logic
     // to ensure we test every possible cast situation here
     atomicTypes.zip(atomicTypes).foreach { case (from, to) =>
-      checkNullCast(from, to)
+      if (Cast.canCast(from, to)) {
--- End diff --
Why this check? Doesn't `from` always equal `to` here?
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379526
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -408,8 +409,21 @@ object FunctionRegistry {
     expression[BitwiseAnd]("&"),
     expression[BitwiseNot]("~"),
     expression[BitwiseOr]("|"),
-    expression[BitwiseXor]("^")
-
+    expression[BitwiseXor]("^"),
+
+    // Cast aliases (SPARK-16730)
+    castAlias("boolean", BooleanType),
+    castAlias("tinyint", ByteType),
+    castAlias("smallint", ShortType),
+    castAlias("int", IntegerType),
+    castAlias("bigint", LongType),
--- End diff --
OK, agreed.
[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/14364 mostly LGTM, thanks for working on it!
[GitHub] spark issue #14358: [SPARK-16729][SQL] Throw analysis exception for invalid ...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/14358 Is this good to merge?
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379517
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -408,8 +409,21 @@ object FunctionRegistry {
     expression[BitwiseAnd]("&"),
     expression[BitwiseNot]("~"),
     expression[BitwiseOr]("|"),
-    expression[BitwiseXor]("^")
-
+    expression[BitwiseXor]("^"),
+
+    // Cast aliases (SPARK-16730)
+    castAlias("boolean", BooleanType),
+    castAlias("tinyint", ByteType),
+    castAlias("smallint", ShortType),
+    castAlias("int", IntegerType),
+    castAlias("bigint", LongType),
+    castAlias("float", FloatType),
+    castAlias("double", DoubleType),
+    castAlias("decimal", DecimalType.USER_DEFAULT),
--- End diff --
This is not what Hive does by default, but what Spark SQL's cast defaults to. I think it is a bug, but I'm not sure if it is intentional. I suggest we change this in a separate pull request, since there is more than one place to check.
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379516
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLCompatibilityFunctionSuite.scala ---
@@ -69,4 +73,25 @@ class SQLCompatibilityFunctionSuite extends QueryTest with SharedSQLContext {
       sql("SELECT nvl2(null, 1, 2.1d), nvl2('n', 1, 2.1d)"),
       Row(2.1, 1.0))
   }
+
+  test("SPARK-16730 cast alias functions for Hive compatibility") {
+    checkAnswer(
+      sql("SELECT boolean(1), tinyint(1), smallint(1), int(1), bigint(1)"),
+      Row(true, 1.toByte, 1.toShort, 1, 1L))
+
+    checkAnswer(
+      sql("SELECT float(1), double(1), decimal(1)"),
+      Row(1.toFloat, 1.0, new BigDecimal(1)))
+
+    checkAnswer(
+      sql("SELECT date(\"2014-04-04\"), timestamp(date(\"2014-04-04\"))"),
+      Row(new java.util.Date(114, 3, 4), new Timestamp(114, 3, 4, 0, 0, 0, 0)))
+
+    checkAnswer(
+      sql("SELECT string(1)"),
+      Row("1"))
+
+    // Error handling: only one argument
+    assert(intercept[AnalysisException](sql("SELECT string(1, 2)")).getMessage.contains("one arg"))
--- End diff --
How about we use the full error message here?
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379420
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -408,8 +409,21 @@ object FunctionRegistry {
     expression[BitwiseAnd]("&"),
     expression[BitwiseNot]("~"),
     expression[BitwiseOr]("|"),
-    expression[BitwiseXor]("^")
-
+    expression[BitwiseXor]("^"),
+
+    // Cast aliases (SPARK-16730)
+    castAlias("boolean", BooleanType),
+    castAlias("tinyint", ByteType),
+    castAlias("smallint", ShortType),
+    castAlias("int", IntegerType),
+    castAlias("bigint", LongType),
+    castAlias("float", FloatType),
+    castAlias("double", DoubleType),
+    castAlias("decimal", DecimalType.USER_DEFAULT),
--- End diff --
Can you double-check it with Hive? What's the default decimal type in Hive?
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user petermaxlee commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379213
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -408,8 +409,21 @@ object FunctionRegistry {
     expression[BitwiseAnd]("&"),
     expression[BitwiseNot]("~"),
     expression[BitwiseOr]("|"),
-    expression[BitwiseXor]("^")
-
+    expression[BitwiseXor]("^"),
+
+    // Cast aliases (SPARK-16730)
+    castAlias("boolean", BooleanType),
+    castAlias("tinyint", ByteType),
+    castAlias("smallint", ShortType),
+    castAlias("int", IntegerType),
+    castAlias("bigint", LongType),
--- End diff --
I think that's actually worse, because it makes it less clear what the function name is when looking at this source file. Also, if for some reason we change `LongType.simpleString` in the future, these functions will subtly break.
[GitHub] spark issue #14362: [SPARK-16730][SQL] Implement function aliases for type c...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/14362 Closing this one in favor of https://github.com/apache/spark/pull/14364
[GitHub] spark pull request #14362: [SPARK-16730][SQL] Implement function aliases for...
Github user petermaxlee closed the pull request at: https://github.com/apache/spark/pull/14362
[GitHub] spark issue #14375: [SPARK-15194] [ML] Add Python ML API for MultivariateGau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14375 Can one of the admins verify this patch?
[GitHub] spark pull request #14364: [SPARK-16730][SQL] Implement function aliases for...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14364#discussion_r72379046
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -408,8 +409,21 @@ object FunctionRegistry {
     expression[BitwiseAnd]("&"),
     expression[BitwiseNot]("~"),
     expression[BitwiseOr]("|"),
-    expression[BitwiseXor]("^")
-
+    expression[BitwiseXor]("^"),
+
+    // Cast aliases (SPARK-16730)
+    castAlias("boolean", BooleanType),
+    castAlias("tinyint", ByteType),
+    castAlias("smallint", ShortType),
+    castAlias("int", IntegerType),
+    castAlias("bigint", LongType),
--- End diff --
Using `LongType.simpleString` instead of `"bigint"` looks better. Same for the others.
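The tradeoff debated here can be sketched with a hypothetical registry (not Spark's actual `FunctionRegistry`): deriving the registered name from the type stays in sync with the type definition, while a literal string keeps the name visible at the call site.

```python
# Hypothetical registry sketch contrasting the two styles debated above.
class LongType:
    # Stand-in for Scala's LongType.simpleString.
    simple_string = "bigint"

registry = {}

def cast_alias(name, dtype):
    """Register a cast-style function under the given name."""
    registry[name] = dtype

# Style 1: literal name -- the registered function name is explicit here,
# but would silently diverge if the type's name ever changed.
cast_alias("bigint", LongType)

# Style 2: derived name -- stays in sync with the type definition,
# but the actual registered name is no longer visible at the call site.
cast_alias(LongType.simple_string, LongType)

assert "bigint" in registry
```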
[GitHub] spark issue #13248: [SPARK-15194] [ML] Add Python ML API for MultivariateGau...
Github user praveendareddy21 commented on the issue: https://github.com/apache/spark/pull/13248 Reopened the pull request on master branch. https://github.com/apache/spark/pull/14375
[GitHub] spark pull request #14375: [SPARK-15194] [ML] Add Python ML API for Multivar...
GitHub user praveendareddy21 opened a pull request: https://github.com/apache/spark/pull/14375 [SPARK-15194] [ML] Add Python ML API for MultivariateGaussian

## What changes were proposed in this pull request?

Added MultivariateGaussian and tests to match Scala's ML API. Ran pep8 and made other doc changes. Reopening the pull request from the 2.0 branch at an admin's request.

## How was this patch tested?

Unit tests: MultiVariateGaussianTests. Also tested manually on a local setup.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/praveendareddy21/spark master

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/14375.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #14375

commit aa5f64c1c7d84d88b1c972e5f18236af615bd89f
Author: red
Date: 2016-07-27T04:04:57Z

    added Multivariate Gaussian for ML API
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14333 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62908/ Test PASSed.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14333 Merged build finished. Test PASSed.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14333 **[Test build #62908 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62908/consoleFull)** for PR 14333 at commit [`7f042a2`](https://github.com/apache/spark/commit/7f042a2172166d0de413297351b4fe9b04168071). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14341: [Minor][Doc][SQL] Fix two documents regarding siz...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14341#discussion_r72378681

```
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -122,10 +124,10 @@ object SQLConf {
   val DEFAULT_SIZE_IN_BYTES = SQLConfigBuilder("spark.sql.defaultSizeInBytes")
     .internal()
-    .doc("The default table size used in query planning. By default, it is set to a larger " +
-      "value than `spark.sql.autoBroadcastJoinThreshold` to be more conservative. That is to say " +
-      "by default the optimizer will not choose to broadcast a table unless it knows for sure " +
-      "its size is small enough.")
+    .doc("The default table size used in query planning. By default, it is set to Long.MaxValue " +
+      "which is more than `spark.sql.autoBroadcastJoinThreshold` to be more conservative. " +
```

--- End diff --

`which is larger than`
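For context on why the `Long.MaxValue` default is conservative: the planner only broadcasts a relation whose estimated size falls below `spark.sql.autoBroadcastJoinThreshold`, so a table of unknown size that defaults to `Long.MaxValue` can never be chosen for a broadcast join. The sketch below is an illustrative simplification of that size test, not Spark's actual planner code; only the 10 MB default threshold is Spark's real value.

```scala
// Simplified model of the broadcast-join size test described in the doc string.
def shouldBroadcast(estimatedSizeInBytes: Long, thresholdInBytes: Long): Boolean =
  estimatedSizeInBytes <= thresholdInBytes

val autoBroadcastJoinThreshold = 10L * 1024 * 1024  // Spark's default: 10 MB

// A small known table may be broadcast; an unknown-size table never is.
val smallTable = shouldBroadcast(1024L, autoBroadcastJoinThreshold)
val unknownSizeTable = shouldBroadcast(Long.MaxValue, autoBroadcastJoinThreshold)
```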
[GitHub] spark pull request #14207: [SPARK-16552] [SQL] Store the Inferred Schemas in...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14207#discussion_r72378623

```
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -252,6 +252,209 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach {
     }
   }
+
+  private def createDataSourceTable(
+      path: File,
+      userSpecifiedSchema: Option[String],
+      userSpecifiedPartitionCols: Option[String]): (StructType, Seq[String]) = {
```

--- End diff --

how about we pass in the expected schema and partCols, and do the check in this method?
[GitHub] spark pull request #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression w...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/14182#discussion_r72377491

```
--- Diff: R/pkg/R/mllib.R ---
@@ -292,6 +299,43 @@ setMethod("summary", signature(object = "NaiveBayesModel"),
     return(list(apriori = apriori, tables = tables))
   })
+
+#' Isotonic Regression Model
+#' Fits an Isotonic Regression model against a Spark DataFrame, similarly to R's isoreg().
+#' Users can print, make predictions on the produced model and save the model to the input path.
+#'
+#' @param data SparkDataFrame for training
+#' @param formula A symbolic description of the model to be fitted. Currently only a few formula
+#'        operators are supported, including '~', '.', ':', '+', and '-'.
+#' @param isotonic Whether the output sequence should be isotonic/increasing (true) or
+#'        antitonic/decreasing (false)
+#' @param featureIndex The index of the feature if \code{featuresCol} is a vector column (default: `0`),
+#'        no effect otherwise
+#' @return \code{spark.isotonicRegression} returns a fitted Isotonic Regression model
+#' @rdname spark.isotonicRegression
+#' @name spark.isotonicRegression
+#' @export
```

--- End diff --

Add `@examples`
[GitHub] spark pull request #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression w...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/14182#discussion_r72377343

```
--- Diff: R/pkg/NAMESPACE ---
@@ -24,7 +24,8 @@ exportMethods("glm",
               "spark.kmeans",
               "fitted",
               "spark.naiveBayes",
-              "spark.survreg")
+              "spark.survreg",
+              "spark.isotonicRegression")
```

--- End diff --

Spark MLlib `IsotonicRegression` is more similar to R's [`pava`](http://www.inside-r.org/packages/cran/Iso/docs/pava)? Would `spark.pava` be a better name?
[GitHub] spark pull request #14182: [SPARK-16444][WIP][SparkR]: Isotonic Regression w...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/14182#discussion_r72376639

```
--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/IsotonicRegressionWrapper.scala ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ml.r
+
+import org.apache.hadoop.fs.Path
+import org.json4s._
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods._
+
+import org.apache.spark.ml.{Pipeline, PipelineModel}
+import org.apache.spark.ml.attribute.{Attribute, AttributeGroup, NominalAttribute}
+import org.apache.spark.ml.feature.RFormula
+import org.apache.spark.ml.regression.{IsotonicRegression, IsotonicRegressionModel}
+import org.apache.spark.ml.util._
+import org.apache.spark.sql.{DataFrame, Dataset}
+
+private[r] class IsotonicRegressionWrapper private (
+    val pipeline: PipelineModel,
+    val labels: Array[String],
+    val features: Array[String]) extends MLWritable {
+
+  private val isotonicRegressionModel: IsotonicRegressionModel =
+    pipeline.stages(1).asInstanceOf[IsotonicRegressionModel]
+
+  lazy val boundaries: Array[Double] = isotonicRegressionModel.boundaries.toArray
+
+  lazy val predictions: Array[Double] = isotonicRegressionModel.predictions.toArray
+
+  def fitted(method: String): Array[Double] = {
+    if (method == "boundaries") {
+      boundaries
+    } else if (method == "predictions") {
+      predictions
+    } else {
+      throw new UnsupportedOperationException(
+        s"Method (boundaries or predictions) required but $method found.")
+    }
+  }
+
+  def transform(dataset: Dataset[_]): DataFrame = {
+    pipeline.transform(dataset).drop(isotonicRegressionModel.getFeaturesCol)
+  }
+
+  override def write: MLWriter =
+    new IsotonicRegressionWrapper.IsotonicRegressionWrapperWriter(this)
+}
+
+private[r] object IsotonicRegressionWrapper
+    extends MLReadable[IsotonicRegressionWrapper] {
+
+  def fit(
+      data: DataFrame,
+      formula: String,
+      isotonic: Boolean,
+      featureIndex: Int): IsotonicRegressionWrapper = {
+
+    val rFormulaModel = new RFormula()
+      .setFormula(formula)
+      .fit(data)
+
+    // get feature names from output schema
+    val schema = rFormulaModel.transform(data).schema
+    val labelAttr = Attribute.fromStructField(schema(rFormulaModel.getLabelCol))
+      .asInstanceOf[NominalAttribute]
+    val labels = labelAttr.values.get
```

--- End diff --

Since `IsotonicRegression` is a regression model, it's unnecessary to extract labels from column metadata (we actually did not save `NominalAttribute` for regression models). Thanks!
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14333 **[Test build #62908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62908/consoleFull)** for PR 14333 at commit [`7f042a2`](https://github.com/apache/spark/commit/7f042a2172166d0de413297351b4fe9b04168071).
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14333 Merged build finished. Test FAILed.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14333 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62907/ Test FAILed.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14333 **[Test build #62907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62907/consoleFull)** for PR 14333 at commit [`dc17da8`](https://github.com/apache/spark/commit/dc17da8eec232fcf2296deefb64222a6d07a0983). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/14353#discussion_r72375060

```
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) extends Expression {
   override def foldable: Boolean = children.forall(_.foldable)
-  override def checkInputDataTypes(): TypeCheckResult =
-    TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), "function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+      TypeCheckResult.TypeCheckSuccess
```

--- End diff --

Hi, @yhuai. Could you give me some advice?
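The branch under discussion special-cases all-decimal inputs so that decimals with different precision and scale can later be widened to a common decimal type instead of being rejected. A self-contained sketch of that check, using simplified stand-in types rather than Spark's catalyst classes (`SqlType`, `arrayInputsOk`, and the fallback rule are illustrative assumptions):

```scala
// Simplified stand-ins for catalyst data types.
sealed trait SqlType
final case class DecimalType(precision: Int, scale: Int) extends SqlType
case object IntType extends SqlType
case object StringType extends SqlType

// Mirrors the shape of the proposed checkInputDataTypes: all-decimal inputs
// pass (a common wider decimal can be derived later); otherwise every input
// must already share one type.
def arrayInputsOk(types: Seq[SqlType]): Boolean =
  types.forall(_.isInstanceOf[DecimalType]) || types.distinct.size <= 1
```

So `array(decimal(10,2), decimal(38,18))` would be accepted for widening, while `array(int, string)` would still be rejected.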
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14311 Looks like the second Jenkins run failed a slightly different set of tests than the first.
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14311 Merged build finished. Test FAILed.
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14311 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62906/ Test FAILed.
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14311 **[Test build #62906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62906/consoleFull)** for PR 14311 at commit [`7cccb39`](https://github.com/apache/spark/commit/7cccb39ec967df68427304a605dd52deade11573). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14333 **[Test build #62907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62907/consoleFull)** for PR 14333 at commit [`dc17da8`](https://github.com/apache/spark/commit/dc17da8eec232fcf2296deefb64222a6d07a0983).
[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...
Github user lirui-intel commented on the issue: https://github.com/apache/spark/pull/12775 Could anybody help review this PR? Thanks.
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/13824 That is fine too. I'd just do something like:

```
Seq("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON").foreach { envname =>
  // code to set the value
}
```

To avoid the repetition. BTW I just noticed you have a typo in the PR title.
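The deduplicated form suggested above can be fleshed out into a self-contained snippet. Here `env` stands in for the launcher's mutable environment map, `sysEnv` is passed in instead of reading `sys.env` directly so the logic is testable, and `propagatePythonEnv` is an illustrative name, not Spark's:

```scala
import scala.collection.mutable

// Propagate PYSPARK_DRIVER_PYTHON / PYSPARK_PYTHON from the submitter's
// environment into the driver's env map, without clobbering values the
// user already set explicitly.
def propagatePythonEnv(env: mutable.Map[String, String],
                       sysEnv: Map[String, String]): Unit = {
  Seq("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON").foreach { envname =>
    if (!env.contains(envname)) {
      sysEnv.get(envname).foreach(v => env(envname) = v)
    }
  }
}
```

A user-supplied `PYSPARK_PYTHON` in `env` wins; only missing keys are filled from the submitter's environment.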
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user KevinGrealish commented on the issue: https://github.com/apache/spark/pull/13824 How about just this:

```
// propagate PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON to driver in cluster mode
if (!env.contains("PYSPARK_DRIVER_PYTHON")) {
  sys.env.get("PYSPARK_DRIVER_PYTHON").foreach(env("PYSPARK_DRIVER_PYTHON") = _)
}
if (!env.contains("PYSPARK_PYTHON")) {
  sys.env.get("PYSPARK_PYTHON").foreach(env("PYSPARK_PYTHON") = _)
}
```
[GitHub] spark issue #14296: [SPARK-16639][SQL] The query with having condition that ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14296 ping @cloud-fan Any more comments? Thanks.
[GitHub] spark issue #14333: [SPARK-16696][ML][MLLib] destroy KMeans bcNewCenters whe...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/14333 @srowen I checked the places where `RDD.persist` is referenced: `AFTSurvivalRegression`, `LinearRegression`, and `LogisticRegression` persist the input training RDD and unpersist it when `train` returns, which seems OK. `recommend.ALS` persists many RDDs and appears to unpersist them all correctly. mllib's `BisectingKMeans.run` contains a TODO "unpersist old indices"; I'll check it now. The others seem OK. The places where `Broadcast.persist` is referenced were already checked in this PR. I think they are all properly handled here.
[GitHub] spark issue #14341: [Minor][Doc][SQL] Fix two documents regarding size in by...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14341 ping @cloud-fan
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14270 Merged build finished. Test PASSed.
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14270 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62904/ Test PASSed.
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14270 **[Test build #62904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62904/consoleFull)** for PR 14270 at commit [`8923c58`](https://github.com/apache/spark/commit/8923c58d324b8083ffb423d165f4707ec4395db2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14270 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62903/ Test PASSed.
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14270 Merged build finished. Test PASSed.
[GitHub] spark issue #14340: [SPARK-16534][Streaming][Kafka] Add Python API support f...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14340 So I would like to -1 this patch. I think it's been a mistake to support DStreams in Python -- yes, it satisfies a checkbox and Spark can claim there's streaming support in Python. However, the tooling and maturity for working with streaming data (both in Spark and the broader ecosystem) is simply not there. It is a lot of baggage to maintain, and it creates the wrong impression that production streaming jobs can be written in Python.
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14270 **[Test build #62903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62903/consoleFull)** for PR 14270 at commit [`b9c9a7a`](https://github.com/apache/spark/commit/b9c9a7aa2b831247ae04d655f537223a02bc8440). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #14175: [SPARK-16522][MESOS] Spark application throws exc...
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/14175#discussion_r72368797 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -552,7 +552,12 @@ private[spark] class MesosCoarseGrainedSchedulerBackend( taskId: String, reason: String): Unit = { stateLock.synchronized { - removeExecutor(taskId, SlaveLost(reason)) + // Do not call removeExecutor() after this scheduler backend was stopped because --- End diff -- what about submitting another JIRA issue on better handling of state management after stop() is called for CoarseGrainedSchedulerBackend?
[GitHub] spark issue #14175: [SPARK-16522][MESOS] Spark application throws exception ...
Github user sun-rui commented on the issue: https://github.com/apache/spark/pull/14175 Sure, will add it
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/13824 I think there are two ways to solve the problem that might be a little better... The first is to try to keep the current behavior. If, in L734 (where you're removing the current code that checks the `appMasterEnv` conf), you make a copy of the keys that were added to the `env` variable, you can then apply `addPathToEnvironment` just to the keys that are there. That means that the code will merge any user configuration with env variables created by Spark itself; otherwise it will use the user's override. The second is to read certain env variables using a special method that first looks at `spark.yarn.appMasterEnv.FOO` and if it doesn't exist, `sys.env("FOO")`. Then you could modify the code that currently reads `PYSPARK_DRIVER_PYTHON` and friends using that new method, instead of directly peeking at `sys.env`. You could also apply that new method to the code that currently reads `PYTHONPATH`. I think the latter is a better solution than you currently have, since it avoids hardcoding these env variable names in more places.
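The fallback lookup vanzin describes can be sketched in shell. This is a hypothetical illustration, not Spark's actual implementation (which reads the SparkConf in Scala); the `OVERRIDE_<NAME>` variables stand in for `spark.yarn.appMasterEnv.<NAME>` entries.

```shell
# Hypothetical sketch of the proposed lookup order: a configured override
# (simulated here as OVERRIDE_<NAME>) wins; otherwise fall back to the
# ambient environment variable, mirroring sys.env("NAME") in the Scala code.
resolve_env() {
  name="$1"
  override=$(eval "printf '%s' \"\${OVERRIDE_${name}:-}\"")
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
  else
    eval "printf '%s\n' \"\${${name}:-}\""
  fi
}

PYSPARK_DRIVER_PYTHON=python2
OVERRIDE_PYSPARK_DRIVER_PYTHON=python3
resolve_env PYSPARK_DRIVER_PYTHON   # prints "python3": the override wins
```

The point of centralizing the lookup in one function is the same as in the comment above: the precedence rule lives in one place instead of being repeated wherever `PYSPARK_DRIVER_PYTHON` or `PYTHONPATH` is read.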
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14311 Ok weird, I can't reproduce locally any more, even after many tries. I wonder if it's just very rarely flaky, though that seems unlikely.
[GitHub] spark issue #14311: [SPARK-16550] [core] Certain classes fail to deserialize...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14311 **[Test build #62906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62906/consoleFull)** for PR 14311 at commit [`7cccb39`](https://github.com/apache/spark/commit/7cccb39ec967df68427304a605dd52deade11573).
[GitHub] spark pull request #14349: [SPARK-16524][SQL] Add RowBatch and RowBasedHashM...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14349
[GitHub] spark issue #14065: [SPARK-14743][YARN] Add a configurable credential manage...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14065 I'm having trouble finding the bandwidth to look at the updated patch, but it's on my list... there were some replies to my comments that I want to take a closer look at.
[GitHub] spark issue #14349: [SPARK-16524][SQL] Add RowBatch and RowBasedHashMapGener...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/14349 Merging in master.
[GitHub] spark pull request #14349: [SPARK-16524][SQL] Add RowBatch and RowBasedHashM...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/14349#discussion_r72365970 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/VariableLengthRowBasedKeyValueBatch.java --- @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark.sql.catalyst.expressions; + +import org.apache.spark.memory.TaskMemoryManager; +import org.apache.spark.sql.types.*; +import org.apache.spark.unsafe.Platform; + +/** + * An implementation of `RowBasedKeyValueBatch` in which key-value records have variable lengths. + * + * The format for each record looks like this: + * [4 bytes total size = (klen + vlen + 4)] [4 bytes key size = klen] + * [UnsafeRow for key of length klen] [UnsafeRow for Value of length vlen] + * [8 bytes pointer to next] + * Thus, record length = 4 + 4 + klen + vlen + 8 + */ +public final class VariableLengthRowBasedKeyValueBatch extends RowBasedKeyValueBatch { --- End diff -- you can write the test suites in scala -- it tends to simplify the code.
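The record-length formula in the quoted doc comment can be sanity-checked with quick shell arithmetic; the klen/vlen values below are arbitrary example sizes, not taken from the patch.

```shell
# Record layout from the doc comment above:
# [4-byte total size][4-byte key size][key bytes][value bytes][8-byte next pointer]
klen=12
vlen=20
total_size_field=$((klen + vlen + 4))     # the value stored in the first 4 bytes
record_len=$((4 + 4 + klen + vlen + 8))   # full on-heap footprint of the record
echo "$total_size_field $record_len"      # 36 48
```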
[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14364 Merged build finished. Test PASSed.
[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14364 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62902/ Test PASSed.
[GitHub] spark issue #14364: [SPARK-16730][SQL] Implement function aliases for type c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14364 **[Test build #62902 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62902/consoleFull)** for PR 14364 at commit [`b8fbcab`](https://github.com/apache/spark/commit/b8fbcab1d5bf78f10b0edce2a1011080a38f4fc6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13152: [SPARK-15353] [CORE] Making peer selection for block rep...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13152 > The topology info is only queried when the executor initiates and is assumed to stay the same throughout the life of the executor. Depending on the cluster manager being used, I am assuming the exact way this information is provided may differ. Resolving this at the master makes this implementation simpler as only the master needs to be able to access the service/script/class being used to resolve the topology. The communication overhead is minimal as the executors do have to communicate with the master when they initiate anyways. I see, that makes sense, though it is a little weird to ask the master for info that you use to register right away later. > The getRandomPeer() method was doing quite a bit more than just getting a random peer. It was being used to manage/mutate state, which was being mutated in other places as well. I tried to keep the block placement strategy and the usage of its output separate, to make it simpler to provide a new block placement strategy. I also thought it would be best to de-couple any internal replication state management with the block replication strategy, while still keeping the structure of the state the same. Still, I think it would be a smaller change to just move some of that logic out of getRandomPeer(), and retain the rest. Then you just need to implement getNextPeer(), and BlockManager doesn't need to worry about tracking the prioritized order internally.
[GitHub] spark issue #14226: [SPARK-16580][CORE] class Accumulator in package spark i...
Github user keypointt commented on the issue: https://github.com/apache/spark/pull/14226 @srowen sorry, I had a very busy week; now I have time to look into it. Will keep you posted :)
[GitHub] spark pull request #14231: [SPARK-16586] Change the way the exit code of lau...
Github user zasdfgbnm commented on a diff in the pull request: https://github.com/apache/spark/pull/14231#discussion_r72364403 --- Diff: bin/spark-class --- @@ -65,24 +65,25 @@ fi # characters that would be otherwise interpreted by the shell. Read that in a while loop, populating # an array that will be used to exec the final command. # -# The exit code of the launcher is appended to the output, so the parent shell removes it from the -# command array and checks the value to see if the launcher succeeded. -build_command() { - "$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@" - printf "%d\0" $? -} +# To keep both the output and the exit code of the launcher, the output is first converted to a hex +# dump which prevents the bash from getting rid of the NULL character, and the exit code retrieved +# from the bash array ${PIPESTATUS[@]}. +# +# Note that the seperator NULL character can not be replace with space or '\n' so that the command +# won't fail if some path of the user contain special characher such as '\n' or space +# +# Also note that when the launcher fails, it might not output something ending with '\0' [SPARK-16586] +_CMD=$("$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"|xxd -p|tr -d '\n';exit ${PIPESTATUS[0]}) --- End diff -- The launcher doesn't actually launch anything; instead, it just outputs the command that should be used to launch the desired class, separated by `\0`.
[GitHub] spark pull request #14231: [SPARK-16586] Change the way the exit code of lau...
Github user zasdfgbnm commented on a diff in the pull request: https://github.com/apache/spark/pull/14231#discussion_r72364159 --- Diff: bin/spark-class --- @@ -65,24 +65,25 @@ fi # characters that would be otherwise interpreted by the shell. Read that in a while loop, populating # an array that will be used to exec the final command. # -# The exit code of the launcher is appended to the output, so the parent shell removes it from the -# command array and checks the value to see if the launcher succeeded. -build_command() { - "$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@" - printf "%d\0" $? -} +# To keep both the output and the exit code of the launcher, the output is first converted to a hex +# dump which prevents the bash from getting rid of the NULL character, and the exit code retrieved +# from the bash array ${PIPESTATUS[@]}. +# +# Note that the seperator NULL character can not be replace with space or '\n' so that the command +# won't fail if some path of the user contain special characher such as '\n' or space +# +# Also note that when the launcher fails, it might not output something ending with '\0' [SPARK-16586] +_CMD=$("$RUNNER" -Xmx128m -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"|xxd -p|tr -d '\n';exit ${PIPESTATUS[0]}) --- End diff -- If the launcher fails, it is sufficient to terminate the script and exit with the nonzero `$?`. But if it succeeds, then the output, which contains `\0`, should be used to start a new command. That's why, when we execute `"$RUNNER" -Xmx128m ...`, we should try to store both the exit code and the output.
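The hex-dump trick under discussion can be reproduced standalone. This sketch is bash-specific (it relies on `PIPESTATUS`) and requires `xxd`; `fake_launcher` is a made-up stand-in for `org.apache.spark.launcher.Main`.

```shell
#!/usr/bin/env bash
# Hex-encode the NUL-separated output so it survives command substitution,
# and recover the inner command's exit code from PIPESTATUS rather than
# getting the exit code of xxd or tr.
fake_launcher() { printf 'arg one\0arg two\0'; return 3; }

_CMD=$(fake_launcher | xxd -p | tr -d '\n'; exit ${PIPESTATUS[0]})
_STATUS=$?

echo "launcher exit code: $_STATUS"        # 3, the launcher's code, not xxd's
# Decode the hex back into the original NUL-separated command words:
printf '%s' "$_CMD" | xxd -r -p | tr '\0' '\n'
```

This shows why the NUL separator matters: arguments containing spaces (or even newlines) round-trip intact through the hex encoding, which a plain command substitution would mangle.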
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13824 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62905/ Test PASSed.
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13824 **[Test build #62905 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62905/consoleFull)** for PR 13824 at commit [`f2c2e4a`](https://github.com/apache/spark/commit/f2c2e4a82ed44d367db67f9382024a619b688104). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13824 Merged build finished. Test PASSed.
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/14124 @cloud-fan
[GitHub] spark issue #14172: [SPARK-16516][SQL] Support for pushing down filters for ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14172 (cc @liancheng)
[GitHub] spark issue #14124: [SPARK-16472][SQL] Inconsistent nullability in schema af...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14124 gentle ping @marmbrus
[GitHub] spark issue #14102: [SPARK-16434][SQL] Avoid per-record type dispatch in JSO...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/14102 @yhuai I addressed the comments!
[GitHub] spark issue #13988: [SPARK-16101][SQL] Refactoring CSV data source to be con...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/13988 @rxin Could you take a look please?
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13824 **[Test build #62905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62905/consoleFull)** for PR 13824 at commit [`f2c2e4a`](https://github.com/apache/spark/commit/f2c2e4a82ed44d367db67f9382024a619b688104).
[GitHub] spark issue #13824: [SPARK-16110][YARN][PYSPARK] Fix allowing python version...
Github user KevinGrealish commented on the issue: https://github.com/apache/spark/pull/13824 Created https://issues.apache.org/jira/browse/SPARK-16744 for the override/append issue, linked to SPARK-16110. This fix remains just about being able to run Python 3.
[GitHub] spark issue #14374: [SPARK-16735][SQL] `map` should create a decimal key or ...
Github user biglobster commented on the issue: https://github.com/apache/spark/pull/14374 @dongjoon-hyun thank you, and I have just updated the title of this pull request with the JIRA id SPARK-16735
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14270 **[Test build #62904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62904/consoleFull)** for PR 14270 at commit [`8923c58`](https://github.com/apache/spark/commit/8923c58d324b8083ffb423d165f4707ec4395db2).
[GitHub] spark issue #14373: [SPARK-16740][SQL] Fix Long overflow in LongToUnsafeRowM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14373 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62901/ Test PASSed.
[GitHub] spark issue #14373: [SPARK-16740][SQL] Fix Long overflow in LongToUnsafeRowM...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14373 Merged build finished. Test PASSed.
[GitHub] spark issue #14373: [SPARK-16740][SQL] Fix Long overflow in LongToUnsafeRowM...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14373 **[Test build #62901 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62901/consoleFull)** for PR 14373 at commit [`a30ca9f`](https://github.com/apache/spark/commit/a30ca9f4cfde295a811cbe144d6cf165be1227c2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14370: [SPARK-16713][SQL] Check codegen method size ≤ 8K on c...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/14370 @davies would you also take a look? Thanks!
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14270 **[Test build #62903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62903/consoleFull)** for PR 14270 at commit [`b9c9a7a`](https://github.com/apache/spark/commit/b9c9a7aa2b831247ae04d655f537223a02bc8440).
[GitHub] spark issue #14270: [SPARK-5847][CORE] Allow for configuring MetricsSystem's...
Github user markgrover commented on the issue: https://github.com/apache/spark/pull/14270 Fixed the nits, resolved the merge conflict. Shortened some test names.
[GitHub] spark pull request #14270: [SPARK-5847][CORE] Allow for configuring MetricsS...
Github user markgrover commented on a diff in the pull request: https://github.com/apache/spark/pull/14270#discussion_r72357797

    --- Diff: core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala ---
    @@ -125,19 +126,26 @@ private[spark] class MetricsSystem private (
        * application, executor/driver and metric source.
        */
       private[spark] def buildRegistryName(source: Source): String = {
    -    val appId = conf.getOption("spark.app.id")
    +    val metricsNamespace = conf.get(METRICS_NAMESPACE).map(Some(_))

--- End diff --

Good point, thanks.
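For readers following along, the idea under discussion is that a configurable `spark.metrics.namespace` should take precedence over `spark.app.id` when building a metric registry name. The sketch below is illustrative only: the `Map`-based `conf` and the helper name are assumptions for this example, not Spark's actual `MetricsSystem` implementation.

```scala
// Illustrative sketch of namespace precedence when building a registry name.
// A plain Map stands in for SparkConf; this is not Spark's real API.
object NamespaceSketch {
  def buildRegistryName(conf: Map[String, String], sourceName: String): String = {
    // Prefer an explicit spark.metrics.namespace; fall back to spark.app.id.
    val namespace = conf.get("spark.metrics.namespace").orElse(conf.get("spark.app.id"))
    (namespace, conf.get("spark.executor.id")) match {
      case (Some(ns), Some(execId)) => s"$ns.$execId.$sourceName"
      case _                        => sourceName // no namespace or executor id: bare source name
    }
  }
}
```

With this shape, a driver/executor that sets neither key still gets a usable (unprefixed) metric name, which matches the fallback behavior the existing suite asserts.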
[GitHub] spark pull request #14270: [SPARK-5847][CORE] Allow for configuring MetricsS...
Github user markgrover commented on a diff in the pull request: https://github.com/apache/spark/pull/14270#discussion_r72357828

    --- Diff: core/src/test/scala/org/apache/spark/metrics/MetricsSystemSuite.scala ---
    @@ -183,4 +184,89 @@ class MetricsSystemSuite extends SparkFunSuite with BeforeAndAfter with PrivateM
         assert(metricName != s"$appId.$executorId.${source.sourceName}")
         assert(metricName === source.sourceName)
       }
    +
    +  test("MetricsSystem with Executor instance, with custom namespace") {
    +    val source = new Source {
    +      override val sourceName = "dummySource"
    +      override val metricRegistry = new MetricRegistry()
    +    }
    +
    +    val appId = "testId"
    +    val appName = "testName"
    +    val executorId = "1"
    +    conf.set("spark.app.id", appId)
    +    conf.set("spark.app.name", appName)
    +    conf.set("spark.executor.id", executorId)
    +    conf.set(METRICS_NAMESPACE, "${spark.app.name}")
    +
    +    val instanceName = "executor"
    +    val driverMetricsSystem = MetricsSystem.createMetricsSystem(instanceName, conf, securityMgr)
    +
    +    val metricName = driverMetricsSystem.buildRegistryName(source)
    +    assert(metricName === s"$appName.$executorId.${source.sourceName}")
    +  }
    +
    +  test("MetricsSystem with Executor instance and custom namespace which is not set") {
    +    val source = new Source {
    +      override val sourceName = "dummySource"
    +      override val metricRegistry = new MetricRegistry()
    +    }
    +
    +    val executorId = "1"
    +    val namespaceToResolve = "${spark.doesnotexist}"
    +    conf.set("spark.executor.id", executorId)
    +    conf.set(METRICS_NAMESPACE, namespaceToResolve)
    +
    +    val instanceName = "executor"
    +    val driverMetricsSystem = MetricsSystem.createMetricsSystem(instanceName, conf, securityMgr)
    +
    +    val metricName = driverMetricsSystem.buildRegistryName(source)
    +    // If the user set the spark.metrics.namespace property to an expansion of another property
    +    // (say ${spark.doesnotexist}), the unresolved name (i.e. literally ${spark.doesnotexist})

--- End diff --

Appreciate your thoroughness!
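The second test above relies on `${...}` substitution that leaves an unresolvable reference in place literally. A minimal sketch of that behavior, assuming a `Map`-backed config (this helper is an assumption for illustration, not Spark's actual substitution utility):

```scala
import scala.util.matching.Regex

// Illustrative sketch of ${property} expansion: resolvable references are
// replaced with their configured value; unresolvable ones stay literal.
object SubstituteSketch {
  private val Ref: Regex = """\$\{([^}]+)\}""".r

  def substitute(value: String, conf: Map[String, String]): String =
    Ref.replaceAllIn(value, m =>
      // quoteReplacement so a literal "${...}" survives untouched in the output
      Regex.quoteReplacement(conf.getOrElse(m.group(1), m.matched)))
}
```

Under this model, `"${spark.app.name}"` expands to the configured app name, while `"${spark.doesnotexist}"` passes through unchanged, which is exactly what the new test asserts.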