[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-166178142 **[Test build #48091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48091/consoleFull)** for PR 9593 at commit [`2d750c4`](https://github.com/apache/spark/commit/2d750c4c1cedaff9849137710b58242bcd15bef9). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-166184502 **[Test build #48094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48094/consoleFull)** for PR 10283 at commit [`2500de3`](https://github.com/apache/spark/commit/2500de3ba716ad93dca8001f5fde6c670c898416).
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10399#discussion_r48114340 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ScalaReflectionRelationSuite.scala --- @@ -138,4 +144,16 @@ class ScalaReflectionRelationSuite extends SparkFunSuite with SharedSQLContext { Map(10 -> 100L, 20 -> 200L, 30 -> null), Row(null, "abc"))) } + + test("decimal type with ScalaReflection") { --- End diff -- Can we write the test in `ExpressionEncoderSuite`? Just add a line: `encodeDecodeTest(Decimal("32131413.211321313"), "catalyst decimal")`
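The suggested `encodeDecodeTest` helper boils down to asserting that a value survives an encode/decode round trip. A minimal sketch of that pattern, using a hypothetical `Codec` stand-in rather than Spark's actual `ExpressionEncoder` API:

```scala
// Sketch of the round-trip assertion pattern behind encodeDecodeTest.
// Codec and BigDecimalCodec are illustrative stand-ins, not Spark APIs.
trait Codec[T] {
  def encode(value: T): Array[Byte]
  def decode(bytes: Array[Byte]): T
}

object BigDecimalCodec extends Codec[BigDecimal] {
  // Encode via the string form so arbitrary precision is preserved exactly.
  def encode(value: BigDecimal): Array[Byte] = value.toString.getBytes("UTF-8")
  def decode(bytes: Array[Byte]): BigDecimal = BigDecimal(new String(bytes, "UTF-8"))
}

// Returns true when the value survives an encode/decode round trip.
def encodeDecodeOk[T](value: T, codec: Codec[T]): Boolean =
  codec.decode(codec.encode(value)) == value
```

Centralizing tests this way means each new type needs only one line naming the value to round-trip, as in the suggestion above.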
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-166185197 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48115321 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -131,6 +131,7 @@ class DataFrameSuite extends QueryTest with SharedSQLContext { df.explode('letters) { case Row(letters: String) => letters.split(" ").map(Tuple1(_)).toSeq } +assert(!df2.queryExecution.toString.contains("!")) --- End diff -- how about `assert(df2.queryExecution.executedPlan.resolved)`?
[GitHub] spark pull request: [SPARK-12437][SQL] [WIP] Encapsulate the table...
Github user naveenminchu commented on the pull request: https://github.com/apache/spark/pull/10403#issuecomment-166203008 @rxin Agree 100%
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48117154 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -131,6 +131,7 @@ class DataFrameSuite extends QueryTest with SharedSQLContext { df.explode('letters) { case Row(letters: String) => letters.split(" ").map(Tuple1(_)).toSeq } +assert(!df2.queryExecution.toString.contains("!")) --- End diff -- @cloud-fan I like your suggestion! However, `resolved` is not defined in `SparkPlan`; it is only available in `LogicalPlan`. Are you suggesting that I define it in `SparkPlan` and override it where necessary, like what we did in `LogicalPlan`? Thank you!
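For context, `resolved` on `LogicalPlan` is essentially a recursive flag over the plan tree: a node is resolved when its own state is valid and all of its children are resolved. A toy sketch of that shape (the `Plan` classes here are simplified stand-ins, not Spark's actual plan classes):

```scala
// Toy model of the `resolved` flag on a plan tree; not Spark's real classes.
sealed trait Plan {
  def children: Seq[Plan]
  def selfResolved: Boolean
  // A node is resolved only if it and all of its children are resolved.
  lazy val resolved: Boolean = selfResolved && children.forall(_.resolved)
}

final case class Leaf(selfResolved: Boolean) extends Plan {
  def children: Seq[Plan] = Nil
}

final case class Node(selfResolved: Boolean, children: Seq[Plan]) extends Plan
```

Defining the flag on the base trait with a `lazy val` means subclasses only override the node-local condition, which is the pattern being discussed for `SparkPlan`.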
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10398#issuecomment-166212601 **[Test build #48095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48095/consoleFull)** for PR 10398 at commit [`4c745f5`](https://github.com/apache/spark/commit/4c745f5256700b160a10f0077be49e77a10e758b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-6624][SQL]Convert filters into CNF for ...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/8200#issuecomment-166212614 It sounds like multiple PRs are blocked by this PR. I will submit a PR for fixing it tomorrow. Thanks!
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10398#issuecomment-166212830 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48095/ Test PASSed.
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/10152#issuecomment-166215753 Pinging @jkbradley @mengxr @MechCoder again for a final review - could you give this a look and confirm you're in agreement with my comments above? Also, any thoughts on whether this should target `1.6.1`, as it is actually a fairly major yet subtle bug in the implementation? Or even be backported to `1.5.3`?
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166215771 Yeah, let's use this PR for the runtime check.
[GitHub] spark pull request: [SPARK-12232] New R API for read.table to avoi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10406#issuecomment-166169310 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48090/ Test PASSed.
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166170888 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48089/ Test PASSed.
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166170840 **[Test build #48089 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48089/consoleFull)** for PR 10405 at commit [`0a46559`](https://github.com/apache/spark/commit/0a4655999772eed9296de438a61319765389e588). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
GitHub user echoTomei opened a pull request: https://github.com/apache/spark/pull/10407 [SPARK-12396][Core]Once driver connect to a master successfully, stop it connect to master again. You can merge this pull request into a Git repository by running: $ git pull https://github.com/echoTomei/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10407.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10407 commit 5bee290e357d8f855bcf22393fd076a8301f1001 Author: echo2mei <534384...@qq.com> Date: 2015-12-17T08:28:31Z Once driver register successfully, stop it to connect master again. commit 7959c1f75cd34e46ceda011ec11ce56e8e166fd1 Author: echo2mei <534384...@qq.com> Date: 2015-12-21T01:57:25Z [SPARK-12396][Core] Cancel the driver retry thread once it register successfull.
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10399#issuecomment-166181467 cc @cloud-fan @marmbrus @davies
[GitHub] spark pull request: [SPARK-12439][SQL] Fix toCatalystArray and Map...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10391#issuecomment-166181451 cc @cloud-fan @marmbrus @davies
[GitHub] spark pull request: [SPARK-12438][SQL] Add SQLUserDefinedType supp...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10390#discussion_r48113653 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala --- @@ -89,6 +94,23 @@ class UserDefinedTypeSuite extends QueryTest with SharedSQLContext with ParquetT assert(featuresArrays.contains(new MyDenseVector(Array(0.2, 2.0)))) } + test("user type with ScalaReflection") { +val points = Seq( + MyLabeledPoint(1.0, new MyDenseVector(Array(0.1, 1.0))), + MyLabeledPoint(0.0, new MyDenseVector(Array(0.2, 2.0)))) + +val schema = ScalaReflection.schemaFor[MyLabeledPoint].dataType.asInstanceOf[StructType] +val attributeSeq = schema.toAttributes + +val pointEncoder = encoderFor[MyLabeledPoint] +val unsafeRows = points.map(pointEncoder.toRow(_).copy()) --- End diff -- can we also test `encoder.fromRow`? We can just create a `MyLabeledPoint`, encode it to an `InternalRow`, decode it back with the encoder, and check that the decoded `MyLabeledPoint` is the same as the original one.
[GitHub] spark pull request: [SPARK-12327][SPARKR] fix code for lintr warni...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10408#issuecomment-166184144 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166195414 I'm with @markhamstra here. It is unclear what `!` or `?` mean. They are unintuitive, and are not general symbols for schema construction or nullability.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166195514 hi @markhamstra, defining a schema is very common when writing tests; do you think it's a good idea to put this functionality only in test scope?
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10398#issuecomment-166201463 Thank you! @cloud-fan
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/10331#discussion_r48116699 --- Diff: sql/core/src/test/resources/log4j.properties --- @@ -33,7 +33,7 @@ log4j.appender.FA.layout=org.apache.log4j.PatternLayout log4j.appender.FA.layout.ConversionPattern=%d{HH:mm:ss.SSS} %t %p %c{1}: %m%n # Set the logger level of File Appender to WARN -log4j.appender.FA.Threshold = INFO +log4j.appender.FA.Threshold = TRACE --- End diff -- Revert these?
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48117166 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Generate.scala --- @@ -51,9 +52,12 @@ case class Generate( join: Boolean, outer: Boolean, output: Seq[Attribute], +generatorOutput: Seq[Attribute], child: SparkPlan) extends UnaryNode { + override def missingInput: AttributeSet = super.missingInput -- generatorOutput + --- End diff -- Thank you, @viirya and @cloud-fan ! Just did the change.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10393#issuecomment-166203846 **[Test build #48096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48096/consoleFull)** for PR 10393 at commit [`6b4ba74`](https://github.com/apache/spark/commit/6b4ba7458398ecd74c394fba0b062b2d8bfa8752).
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48119148 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -131,6 +131,7 @@ class DataFrameSuite extends QueryTest with SharedSQLContext { df.explode('letters) { case Row(letters: String) => letters.split(" ").map(Tuple1(_)).toSeq } +assert(!df2.queryExecution.toString.contains("!")) --- End diff -- Thank you! @cloud-fan I did the change as you suggested.
[GitHub] spark pull request: [SPARK-12085] [SQL] The join condition hidden ...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/10087#issuecomment-166212178 @flyson Great work, though you'd better coordinate with #8200 .
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/10152#discussion_r48119953 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala --- @@ -77,6 +77,20 @@ class Word2Vec extends Serializable with Logging { private var numIterations = 1 private var seed = Utils.random.nextLong() private var minCount = 5 + private var maxSentenceLength = 1000 + + /** + * sets the maxSentenceLength, maxSentenceLength is used as the threshold for cutting sentence --- End diff -- One final thing - can you address the comment above? And I think we can actually remove the `@param` and `@return` to match the comments for the other setters in this class.
[GitHub] spark pull request: [SPARK-12153][SPARK-7617][MLlib]add support of...
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/10152#issuecomment-166215472 @ygcao just one final comment on the `setMaxSentenceLength` setter to address, thanks!
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
GitHub user ajbozarth opened a pull request: https://github.com/apache/spark/pull/10405 [SPARK-12339] [WebUI] Added a null check that was removed in SPARK-11206 Updates made in SPARK-11206 missed an edge case which causes a NullPointerException when a task is killed. In some cases when a task ends in failure, taskMetrics is initialized as null (see JobProgressListener.onTaskEnd()). To address this, a null check was added. Before the changes in SPARK-11206, this null check was called at the start of the updateTaskAccumulatorValues() function. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ajbozarth/spark spark12339 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10405.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10405 commit 0a4655999772eed9296de438a61319765389e588 Author: Alex Bozarth Date: 2015-12-20T02:54:53Z Added null check that was removed in SPARK-11206
[GitHub] spark pull request: [SPARK-12232] New R API for read.table to avoi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10406#issuecomment-166169309 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12327][SPARKR] fix code for lintr warni...
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/10408 [SPARK-12327][SPARKR] fix code for lintr warning for commented code @shivaram You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark rcodecomment Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10408.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10408 commit a4f47a2e31d908a1214e3a680cbe34b28e5f6049 Author: felixcheung Date: 2015-12-21T02:13:44Z fix code for lintr warning for commented code
[GitHub] spark pull request: [SPARK-12327][SPARKR] fix code for lintr warni...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10408#issuecomment-166184145 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48092/ Test PASSed.
[GitHub] spark pull request: [SPARK-12327][SPARKR] fix code for lintr warni...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10408#issuecomment-166184083 **[Test build #48092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48092/consoleFull)** for PR 10408 at commit [`a4f47a2`](https://github.com/apache/spark/commit/a4f47a2e31d908a1214e3a680cbe34b28e5f6049). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166195733 Note that we also have the "ColumnName" implicit. Using that you can already define a struct field using: `'fieldName.int`.
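For readers unfamiliar with the pattern rxin refers to, the symbol-based field syntax comes from enriching `Symbol` with type-named methods. The following is a hedged, self-contained sketch of that idea only — it is NOT Spark's actual `ColumnName` implementation, and `Field`/`FieldBuilder` are hypothetical names:

```scala
// Minimal sketch of a ColumnName-style DSL: an implicit class enriches Symbol
// so that 'fieldName.int (i.e. Symbol("fieldName").int) yields a typed field.
// Field and FieldBuilder are illustrative names, not Spark API.
case class Field(name: String, dataType: String)

implicit class FieldBuilder(sym: Symbol) {
  def int: Field  = Field(sym.name, "int")
  def long: Field = Field(sym.name, "long")
  // Nested fields compose into a struct type description.
  def struct(fields: Field*): Field =
    Field(sym.name,
      fields.map(f => s"${f.name}: ${f.dataType}").mkString("struct<", ", ", ">"))
}

// Usage in the style discussed in this thread:
val f = Symbol("b").struct(Symbol("a").int, Symbol("b").long)
```

With Scala 2's symbol literals this reads as `'b.struct('a.int, 'b.long)`, which is the shorthand the thread is debating.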
[GitHub] spark pull request: [SPARK-12392][Core] Optimize a location order ...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10346#issuecomment-166199376

**[Test build #48093 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48093/consoleFull)** for PR 10346 at commit [`d962f15`](https://github.com/apache/spark/commit/d962f15e186bfe77d3fb3e5e4ec44d10b5523c0f).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12392][Core] Optimize a location order ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10346#issuecomment-166199432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48093/ Test PASSed.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166197805 @cloud-fan You're kind of hinting at my point: To me, this DSL seems to make life easier for Spark developers, not Spark users. In that kind of trade-off, we should always opt for making things easier for users. Putting this functionality in test scope or otherwise hiding it from the public API isn't as troubling, but having secret or unintuitive shortcuts that only Spark developers use will make getting up to speed more difficult for people looking to contribute to Spark.
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10393#discussion_r48118001

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ---
    @@ -131,6 +131,7 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
           df.explode('letters) {
             case Row(letters: String) => letters.split(" ").map(Tuple1(_)).toSeq
           }
    +    assert(!df2.queryExecution.toString.contains("!"))
    --- End diff --

ah, I think we should not add `resolved` to `SparkPlan` for this purpose, how about `assert(df2.queryExecution.executedPlan.missingInput.isEmpty)`?
[GitHub] spark pull request: [SPARK-12446][SQL] Add unit tests for JDBCRDD ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10409#issuecomment-166211388 **[Test build #48097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48097/consoleFull)** for PR 10409 at commit [`ed94623`](https://github.com/apache/spark/commit/ed94623cb01e36e790824903b9e937495cae3942).
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10393#issuecomment-166213409 **[Test build #48098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48098/consoleFull)** for PR 10393 at commit [`63058e3`](https://github.com/apache/spark/commit/63058e32ebe178616af54702852a9e83fa025df9).
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166165475 FYI the line in JobProgressListener.onTaskEnd that initializes the null value is 387.
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user ajbozarth commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166165392 Looping in those involved with SPARK-11206: @carsonwang @JoshRosen @vanzin
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user junhaoMg commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9593#discussion_r48112803

    --- Diff: docs/configuration.md ---
    @@ -1523,6 +1523,15 @@ Apart from these, the following properties are also available, and may be useful
    + spark.streaming.backpressure.initialRate
    + not set
    +
    + Initial rate for backpressure mechanism (since 1.5). This provides maximum receiving rate of
    + receivers in the first batch when enables the backpressure mechanism, then the maximum receiving
    --- End diff --

thank you, I have modified it.
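For context, the property under review in this diff would sit alongside the existing backpressure switch in a Spark configuration file. A hedged sketch — the values below are purely illustrative, not recommendations from this PR:

```
# spark-defaults.conf (illustrative values)
spark.streaming.backpressure.enabled      true
# Cap on the receiving rate for the first batch, before the backpressure
# algorithm has any processing-rate feedback to work with:
spark.streaming.backpressure.initialRate  1000
```

After the first batch completes, the backpressure mechanism takes over and adjusts the rate from observed batch processing times, so `initialRate` only matters at startup.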
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user carsonwang commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166184298 Thanks for catching this. I think the null check here is necessary, and it seems the code that actually passes a null taskMetrics is in `TaskSetManager` at line 796, when a task is resubmitted because an executor was lost.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166195988 @rxin you are right, an example is [`'b.struct('a.int, 'b.long)`](https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderResolutionSuite.scala#L67)
[GitHub] spark pull request: [SPARK-12232] New R API for read.table to avoi...
GitHub user felixcheung opened a pull request:

    https://github.com/apache/spark/pull/10406

[SPARK-12232] New R API for read.table to avoid name conflict

@shivaram sorry it took longer to fix some conflicts, this is the change to add an alias for `table`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/felixcheung/spark readtable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10406.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10406

commit 042da7403c8cd289f0c9881b014cdacf84705421
Author: felixcheung
Date:   2015-12-09T01:54:08Z

    read.table

commit 85d54790e63266dc5390ffc73631236e48d91ba8
Author: felixcheung
Date:   2015-12-09T05:21:24Z

    test and revert change

commit 86a12f607ab1a3f843e777c03f508fc29fccf8a5
Author: felixcheung
Date:   2015-12-14T23:28:24Z

    update name as per suggestion

commit f1cd057ac8988607334db84f7d712c16c8133d28
Author: felixcheung
Date:   2015-12-15T01:49:31Z

    update test

commit 2e5c46bc9e4a45fd7662ea9924c62fed6207dbf9
Author: felixcheung
Date:   2015-12-21T00:17:41Z

    fix test

commit 2e4b0908f5fc6d46aa41d64047a347fa62fbf0e7
Author: felixcheung
Date:   2015-12-21T00:21:06Z

    fix export in namespace
[GitHub] spark pull request: [SPARK-12232] New R API for read.table to avoi...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10406#issuecomment-166169270

**[Test build #48090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48090/consoleFull)** for PR 10406 at commit [`2e4b090`](https://github.com/apache/spark/commit/2e4b0908f5fc6d46aa41d64047a347fa62fbf0e7).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10398#issuecomment-166191466 LGTM except a minor comment.
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10398#discussion_r48115141

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
    @@ -253,6 +254,18 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
           (1, 1))
       }
    +
    +  test("MapPartitions can process unsafe rows") {
    +    // InMemoryColumnarTableScan's outputsUnsafeRows is unsafe
    +    val ds = sparkContext.makeRDD(Seq("a", "b", "c"), 3).toDS().cache()
    +    val dsMapPartitions = ds.mapPartitions(_ => Iterator(1))
    +    val preparedPlan = dsMapPartitions.queryExecution.executedPlan
    +    // unsafe->safe convertor is not inserted between Generate and InMemoryColumnarTableScan
    +    assert(preparedPlan.children.head.isInstanceOf[InMemoryColumnarTableScan])
    --- End diff --

how about `assert(preparedPlan.find(_.isInstanceOf[InMemoryColumnarTableScan]).isEmpty)`?
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166201679 Should we just keep the runtime part of changes?
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166213937 @yhuai Do you think we should move the analysis-phase checking into another PR, or just drop that part? This check does find other nullability bugs (revealed by the Jenkins build failure), and I think the nullability of the Dataset schema should conform to the underlying logical plan.
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10331#discussion_r48119682

    --- Diff: sql/catalyst/src/test/resources/log4j.properties ---
    @@ -16,9 +16,9 @@
     #
     
     # Set everything to be logged to the file target/unit-tests.log
    -log4j.rootCategory=INFO, file
    +log4j.rootCategory=TRACE, file
     log4j.appender.file=org.apache.log4j.FileAppender
    -log4j.appender.file.append=true
    +log4j.appender.file.append=false
    --- End diff --

Oh yeah, thanks!
[GitHub] spark pull request: [SPARK-12400][Shuffle] Avoid generating temp s...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/10376#issuecomment-166170360 Hi @JoshRosen, from a performance point of view I don't think there's a big difference with this patch, since at most we will only open `200 * Cores` files simultaneously. But at least we can avoid generating a file when the related partition is empty.
[GitHub] spark pull request: [SPARK-12168][SPARKR] Add automated tests for ...
Github user felixcheung commented on the pull request: https://github.com/apache/spark/pull/10171#issuecomment-166189995 @shivaram what do you think about adding `--vanilla` to `RRunner` [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/RRunner.scala#L81)? It'd be consistent, since the worker/daemon is already running with `--vanilla` [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/r/RRDD.scala#L401), and users would still be able to have their desired environment/profile/init file/workspace when starting SparkR programmatically (i.e. with `sparkR.init()`, but not with `sparkR` or `spark-submit something.R`).
[GitHub] spark pull request: [SPARK-11807] Remove support for Hadoop < 2.2
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10404#issuecomment-166193507 **[Test build #2242 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2242/consoleFull)** for PR 10404 at commit [`6c9fb80`](https://github.com/apache/spark/commit/6c9fb800ea5d3ed2dcaba8cbbdb24bd4d32f0b65).
[GitHub] spark pull request: [SPARK-12392][Core] Optimize a location order ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10346#issuecomment-166199431 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11807] Remove support for Hadoop < 2.2
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10404#issuecomment-166210297

**[Test build #2242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2242/consoleFull)** for PR 10404 at commit [`6c9fb80`](https://github.com/apache/spark/commit/6c9fb800ea5d3ed2dcaba8cbbdb24bd4d32f0b65).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12232] New R API for read.table to avoi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10406#issuecomment-166168381 **[Test build #48090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48090/consoleFull)** for PR 10406 at commit [`2e4b090`](https://github.com/apache/spark/commit/2e4b0908f5fc6d46aa41d64047a347fa62fbf0e7).
[GitHub] spark pull request: [SPARK-12396][Core]Once driver connect to a ma...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10407#issuecomment-166175075 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-12232][SPARKR] New R API for read.table...
Github user sun-rui commented on the pull request: https://github.com/apache/spark/pull/10406#issuecomment-166180732 How about "tableToDF"? There are some API methods with "table" in their names, like "createExternalTable", "saveAsTable", and "tables". "tableToDF" is shorter and consistent.
[GitHub] spark pull request: [SPARK-12327][SPARKR] fix code for lintr warni...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10408#issuecomment-166181204 **[Test build #48092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48092/consoleFull)** for PR 10408 at commit [`a4f47a2`](https://github.com/apache/spark/commit/a4f47a2e31d908a1214e3a680cbe34b28e5f6049).
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9593#issuecomment-166185198 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48091/ Test PASSed.
[GitHub] spark pull request: [SPARK-11627] Add initial input rate limit for...
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9593#issuecomment-166185146

**[Test build #48091 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48091/consoleFull)** for PR 9593 at commit [`2d750c4`](https://github.com/apache/spark/commit/2d750c4c1cedaff9849137710b58242bcd15bef9).

 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12443][SQL] encoderFor should support D...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10399#discussion_r48114262

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -61,6 +61,7 @@ object ScalaReflection extends ScalaReflection {
         case t if t <:< definitions.ByteTpe => ByteType
         case t if t <:< definitions.BooleanTpe => BooleanType
         case t if t <:< localTypeOf[Array[Byte]] => BinaryType
    +    case t if t <:< localTypeOf[Decimal] => DecimalType.SYSTEM_DEFAULT
    --- End diff --

Should we add a TODO to say that we can remove this line after we hide `Decimal`? Logically `Decimal` is an internal concept and we should not expose it to users. cc @marmbrus @rxin
[GitHub] spark pull request: [SPARK-12371][SQL] Dataset nullability check
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10331#discussion_r48116475

    --- Diff: sql/catalyst/src/test/resources/log4j.properties ---
    @@ -16,9 +16,9 @@
     #
     
     # Set everything to be logged to the file target/unit-tests.log
    -log4j.rootCategory=INFO, file
    +log4j.rootCategory=TRACE, file
     log4j.appender.file=org.apache.log4j.FileAppender
    -log4j.appender.file.append=true
    +log4j.appender.file.append=false
    --- End diff --

remove these?
[GitHub] spark pull request: [SPARK-6624][SQL]Convert filters into CNF for ...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/8200#issuecomment-166213204 @gatorsmile +1 and great work :))
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166165926 **[Test build #48089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48089/consoleFull)** for PR 10405 at commit [`0a46559`](https://github.com/apache/spark/commit/0a4655999772eed9296de438a61319765389e588).
[GitHub] spark pull request: [SPARK-11807] Remove support for Hadoop < 2.2
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10404#issuecomment-166168817

**[Test build #48088 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48088/consoleFull)** for PR 10404 at commit [`6c9fb80`](https://github.com/apache/spark/commit/6c9fb800ea5d3ed2dcaba8cbbdb24bd4d32f0b65).

 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11807] Remove support for Hadoop < 2.2
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10404#issuecomment-166168831 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48088/ Test FAILed.
[GitHub] spark pull request: [SPARK-11807] Remove support for Hadoop < 2.2
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10404#issuecomment-166168830 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12339] [WebUI] Added a null check that ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10405#issuecomment-166170887 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12392][Core] Optimize a location order ...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/10346#issuecomment-166179434 @andrewor14 Fixed.
[GitHub] spark pull request: [SPARK-12392][Core] Optimize a location order ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10346#issuecomment-166181392 **[Test build #48093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48093/consoleFull)** for PR 10346 at commit [`d962f15`](https://github.com/apache/spark/commit/d962f15e186bfe77d3fb3e5e4ec44d10b5523c0f).
[GitHub] spark pull request: [SPARK-12438][SQL] Add SQLUserDefinedType supp...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10390#issuecomment-166181434 cc @cloud-fan @marmbrus @davies
[GitHub] spark pull request: [SPARK-12293][SQL] Support UnsafeRow in LocalT...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/10283#issuecomment-166184035 @cloud-fan I think I have addressed all your comments. The bugs found while implementing UnsafeRow support in LocalTableScan have been submitted as separate PRs with their own tests, so they are easier to review.
[GitHub] spark pull request: [SPARK-12399] Display correct error message wh...
Github user carsonwang commented on a diff in the pull request: https://github.com/apache/spark/pull/10352#discussion_r48114578 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -115,7 +117,17 @@ class HistoryServer( } def getSparkUI(appKey: String): Option[SparkUI] = { -Option(appCache.get(appKey)) --- End diff -- `appCache.getIfPresent` returns null if there is no cached value for the appKey. But `appCache.get` will try to obtain that value from a `CacheLoader`, cache it, and return it. So I think we still need to use `appCache.get` here and handle the exception.
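For readers unfamiliar with the Guava cache API being discussed, a minimal sketch of the `get` vs. `getIfPresent` distinction (hypothetical example; the `loadAppUi` helper and string value type stand in for HistoryServer's actual loader and `SparkUI`):

```scala
import java.util.concurrent.ExecutionException
import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

// Hypothetical loader standing in for HistoryServer's application loader.
def loadAppUi(key: String): String = s"ui-for-$key" // may throw in real code

val appCache: LoadingCache[String, String] = CacheBuilder.newBuilder()
  .maximumSize(50)
  .build(new CacheLoader[String, String] {
    override def load(key: String): String = loadAppUi(key)
  })

// getIfPresent: returns null on a cache miss and never invokes the loader.
val cached = Option(appCache.getIfPresent("app-1")) // None here: nothing cached yet

// get: invokes the CacheLoader on a miss, caches the result, and wraps any
// loader failure in an ExecutionException -- hence the need to handle it.
val ui = try {
  Some(appCache.get("app-1"))
} catch {
  case _: ExecutionException => None // loader failed
}
```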
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48114994 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Generate.scala --- @@ -51,9 +52,12 @@ case class Generate( join: Boolean, outer: Boolean, output: Seq[Attribute], +generatorOutput: Seq[Attribute], child: SparkPlan) extends UnaryNode { + override def missingInput: AttributeSet = super.missingInput -- generatorOutput + --- End diff -- You can use the same approach as in `logical.Generate`, i.e., `override def expressions: Seq[Expression] = generator :: Nil`, to solve this issue. Then you don't need to modify `SparkStrategies`.
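A minimal sketch of the suggested alternative (hypothetical placement inside the physical `Generate` node; signatures follow the Catalyst API as described in the comment, not the final merged change):

```scala
case class Generate(
    generator: Generator,
    join: Boolean,
    outer: Boolean,
    output: Seq[Attribute],
    child: SparkPlan) extends UnaryNode {

  // Registering the generator as an expression of this node lets Catalyst's
  // attribute bookkeeping see the generator's output as produced here, so
  // missingInput no longer reports those attributes -- no extra
  // generatorOutput field or SparkStrategies change needed.
  override def expressions: Seq[Expression] = generator :: Nil
}
```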
[GitHub] spark pull request: [SPARK-12439][SQL] Fix toCatalystArray and Map...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/10391#issuecomment-166189382 Good catch! One minor comment: can we write the test in `ExpressionEncoderSuite`?
[GitHub] spark pull request: [SPARK-12441] [SQL] Fixing missingInput in Gen...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10393#discussion_r48115249 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/Generate.scala --- @@ -51,9 +52,12 @@ case class Generate( join: Boolean, outer: Boolean, output: Seq[Attribute], +generatorOutput: Seq[Attribute], child: SparkPlan) extends UnaryNode { + override def missingInput: AttributeSet = super.missingInput -- generatorOutput + --- End diff -- +1
[GitHub] spark pull request: [SPARK-12287] [SQL] Support UnsafeRow in MapPa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10398#issuecomment-166202288 **[Test build #48095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48095/consoleFull)** for PR 10398 at commit [`4c745f5`](https://github.com/apache/spark/commit/4c745f5256700b160a10f0077be49e77a10e758b).
[GitHub] spark pull request: [SPARK-12292] [SQL] Support UnsafeRow in Gener...
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/10396#issuecomment-166203268 After a few tries, I am unable to create a test case that triggers the issue. I think I am not the right person to fix it. Thus, to avoid wasting the reviewers' time, I am closing this PR. Thank you!
[GitHub] spark pull request: [SPARK-12292] [SQL] Support UnsafeRow in Gener...
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/10396
[GitHub] spark pull request: [SPARK-12446][SQL] Add unit tests for JDBCRDD ...
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/10409 [SPARK-12446][SQL] Add unit tests for JDBCRDD internal functions No tests exist for JDBCRDD#compileFilter. You can merge this pull request into a Git repository by running: $ git pull https://github.com/maropu/spark AddTestsInJdbcRdd Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10409.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10409 commit 30a01c9ec2f44339511cf3fb816d91650e0f7ebb Author: Takeshi YAMAMURO Date: 2015-12-18T05:21:57Z Add tests in JDBCSuite commit ed94623cb01e36e790824903b9e937495cae3942 Author: Takeshi YAMAMURO Date: 2015-12-21T05:34:45Z fix minor bugs
[GitHub] spark pull request: [SPARK-12446][SQL] Add unit tests for JDBCRDD ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10409#issuecomment-166211512 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12446][SQL] Add unit tests for JDBCRDD ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10409#issuecomment-166211511 **[Test build #48097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48097/consoleFull)** for PR 10409 at commit [`ed94623`](https://github.com/apache/spark/commit/ed94623cb01e36e790824903b9e937495cae3942). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12446][SQL] Add unit tests for JDBCRDD ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/10409#issuecomment-166211513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48097/ Test FAILed.
[GitHub] spark pull request: [SPARK-5479] [yarn] Handle --py-files correctl...
Github user zjffdu commented on the pull request: https://github.com/apache/spark/pull/6360#issuecomment-166224830 @vanzin I am reading the YARN-related code, especially org.apache.spark.deploy.yarn.Client.scala. Do you know where LOCAL_SCHEME ("local") comes from? As far as I know, we use file:// to represent a local resource, so I am not sure where "local" comes from. Another question: if I specify spark.yarn.jar as an HDFS location, the YARN client will still copy it to the staging directory, and I don't know why we do this. Wouldn't it be easier to just use the HDFS file as a LocalResource without copying?
[GitHub] spark pull request: [SPARK-12371][SQL] Runtime nullability check f...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166227395 @yhuai Narrowed down the scope of this PR. As we discussed offline, will open another one for the analysis phase check.
[GitHub] spark pull request: [SPARK-12102][SQL] Cast a non-nullable struct ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/10156#discussion_r48123275 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala --- @@ -274,4 +274,12 @@ class AnalysisSuite extends AnalysisTest { assert(lits(1) >= min && lits(1) <= max) assert(lits(0) == lits(1)) } + + test("SPARK-12102: Ignore nullablity when comparing two sides of case") { +val caseBranches = Seq((Literal(1) > Literal(0)), + CreateStruct(Seq(Cast(Floor(Literal(10)), IntegerType))), + CreateStruct(Seq(Literal(10)))) +val plan = OneRowRelation.select(Alias(CaseWhen(caseBranches), "val")()) +assertAnalysisSuccess(plan) --- End diff -- we can simplify this test to:

```scala
val relation = LocalRelation('a.struct('x.int), 'b.struct('x.int.withNullability(false)))
val plan = relation.select(CaseWhen(Seq(Literal(true), 'a, 'b)).as("val"))
assertAnalysisSuccess(plan)
```
[GitHub] spark pull request: [SPARK-2331] SparkContext.emptyRDD should retu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/10394#issuecomment-166158720 **[Test build #2241 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2241/consoleFull)** for PR 10394 at commit [`6c3df28`](https://github.com/apache/spark/commit/6c3df287eec016df93df02f2f8715fe24355cc65). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12349] [ML] Make spark.ml PCAModel load...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10327#discussion_r48101296 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/PCA.scala --- @@ -167,14 +167,37 @@ object PCAModel extends MLReadable[PCAModel] { private val className = classOf[PCAModel].getName +/** + * Loads a [[PCAModel]] from data the input path. Note that the model includes an --- End diff -- Oops, will fix
[GitHub] spark pull request: [SPARK-10158] [PySpark] [MLlib] ALS better err...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9361
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/10400 [SPARK-12444][SQL] A lightweight Scala DSL for schema construction

This PR introduces a lightweight Scala DSL for constructing complex Spark SQL schemas without introducing implicit conversions or any new types. Two DSL methods, `!` and `?`, are added to `DataType` to help indicate the nullability of struct fields, array element types, and map value types:

- `!` means non-nullable (or required), while
- `?` means nullable (or optional).

With the help of these two methods, and three more constructors, we can now construct a schema like this:

```scala
StructType(
  "f0" -> IntegerType.!,
  "f1" -> ArrayType(IntegerType.?).!,
  "f2" -> MapType(
    IntegerType,
    StructType(
      "f20" -> DoubleType.!,
      "f21" -> StringType.?
    ).!
  ).?
)
```

which is more concise and arguably more readable than the equivalent existing approaches:

```scala
StructType(Seq(
  StructField("f0", IntegerType, nullable = false),
  StructField("f1", ArrayType(IntegerType, containsNull = true), nullable = false),
  StructField("f2", MapType(
    IntegerType,
    StructType(Seq(
      StructField("f20", DoubleType, nullable = false),
      StructField("f21", StringType, nullable = true)
    )),
    valueContainsNull = false
  ), nullable = true)
))

new StructType()
  .add("f0", IntegerType, nullable = false)
  .add("f1", ArrayType(IntegerType, containsNull = true), nullable = false)
  .add("f2", MapType(
    IntegerType,
    new StructType()
      .add("f20", DoubleType, nullable = false)
      .add("f21", StringType, nullable = true),
    valueContainsNull = false
  ), nullable = true)
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/liancheng/spark schema-dsl Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/10400.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #10400 commit a6ea8e7ef7a8a30ebe6bc7bc931649f32e1bb7f0
Author: Cheng Lian Date: 2015-12-20T09:14:18Z A lightweight DSL for schema construction
[GitHub] spark pull request: [SPARK-12292] [SQL] Support UnsafeRow in Gener...
Github user hvanhovell commented on the pull request: https://github.com/apache/spark/pull/10396#issuecomment-16611 @gatorsmile I think this needs a bit more work. Generate produces new rows, this means we also need to add a code path for generating ```UnsafeRow```s. I think we need to add/change code in ```Generate.execute``` and also to the ```UserDefinedGenerator``` and ```Explode``` generators.
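One shape the missing code path could take, sketched with Catalyst's standard row-conversion utility (hypothetical placement inside `Generate.execute`; the actual change discussed above may differ):

```scala
// Hypothetical sketch: Generate builds new (generic) rows, so before handing
// them downstream we could convert each one to an UnsafeRow with an
// UnsafeProjection over the plan's output attributes.
val toUnsafe = UnsafeProjection.create(output, output)
iter.map(row => toUnsafe(row))
```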
[GitHub] spark pull request: [SPARK-12369][SQL]DataFrameReader fails on glo...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10379#issuecomment-166088383 @yanakad Thanks for your explanation, now I understand your use case. I agree that this is somewhat inconvenient for that use case. But I still tend to say this shouldn't be an issue, because:

1. At the application level, this issue can be worked around by globbing the lowest directories first, and then passing the resulting path(s) to the `DataFrameReader.parquet()` method.
2. Changes made in this PR negatively impact the public API:
   - As mentioned above, the behavior becomes more error-prone and dangerous.
   - The behavior becomes inconsistent with other data sources. For example, ORC, JSON, and JDBC all throw an exception when the input path/JDBC URL is invalid or doesn't exist.
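The application-level workaround in point 1 might look like the following sketch (assuming an active `SQLContext`; the directory layout and glob pattern are made up for illustration):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

// Expand the glob ourselves first, instead of passing it to the reader...
val fs = FileSystem.get(sqlContext.sparkContext.hadoopConfiguration)
val matched = Option(fs.globStatus(new Path("/data/logs/*/parquet")))
  .getOrElse(Array.empty)
  .map(_.getPath.toString)

// ...and only invoke the reader when at least one path actually exists,
// so an empty glob never reaches DataFrameReader.parquet().
if (matched.nonEmpty) {
  val df = sqlContext.read.parquet(matched: _*)
}
```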
[GitHub] spark pull request: [SPARK-12010][SQL] Spark JDBC requires support...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/10380#discussion_r48101417 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala --- @@ -60,20 +60,6 @@ object JdbcUtils extends Logging { } /** - * Returns a PreparedStatement that inserts a row into table via conn. --- End diff -- Hm, the only problem here is that this is a public method, and while it feels like it was intended to be a Spark-only utility method, I'm not sure it's marked as such. It's not a big deal to retain it and implement in terms of the new method. However it's now a function of a dialect, which is not an argument here. I suppose any dialect will do since they all behave the same now. This method could then be deprecated. However: yeah, the behavior is actually the same for all dialects now. Really, this has come full circle and can just be a modification to this method, which was already the same for all dialects. Is there reason to believe the insert statement might vary later? Then I could see keeping the current structure here and just deprecating this method.
[GitHub] spark pull request: [SPARK-12444][SQL] A lightweight Scala DSL for...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10400#issuecomment-166112698 test this please
[GitHub] spark pull request: [SPARK-12371][SQL] Checks Dataset nullability ...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/10331#issuecomment-166142301 @yhuai Thanks a lot for the explanation, I misunderstood the scope of the JIRA ticket. Updated this PR according to @marmbrus's [comment][1] in #10296. A new expression `AssertNotNull` is added to assert non-nullable constructor arguments are indeed non-null.