[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739434
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739441
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21632/consoleFull)
 for   PR 2761 at commit 
[`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-11 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18739656
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
       case a: BinaryArithmetic if a.right.dataType == StringType =>
         a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+      // we should cast all timestamp/date/string compare into string compare
+      case p: BinaryPredicate if p.left.dataType == StringType
+        && p.right.dataType == DateType =>
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType
+        && p.right.dataType == StringType =>
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
       case p: BinaryPredicate if p.left.dataType == StringType
         && p.right.dataType == TimestampType =>
-        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
       case p: BinaryPredicate if p.left.dataType == TimestampType
         && p.right.dataType == StringType =>
-        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
+      case p: BinaryPredicate if p.left.dataType == TimestampType
+        && p.right.dataType == DateType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType
+        && p.right.dataType == TimestampType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
--- End diff --

OK... verified this behavior with Hive, I've no idea about this :(
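
For reference, the rule being discussed can be pictured with a minimal standalone Scala sketch (simplified stand-ins for the Catalyst classes, not the PR's actual code): whenever a Date or Timestamp is compared with a String, or a Date with a Timestamp, both sides are rewritten so the comparison happens on strings.

```scala
// Minimal sketch, assuming simplified stand-ins for Catalyst's types and expressions.
sealed trait DataType
case object StringType    extends DataType
case object DateType      extends DataType
case object TimestampType extends DataType

sealed trait Expr { def dataType: DataType }
case class Attr(name: String, dataType: DataType) extends Expr
case class Cast(child: Expr, dataType: DataType)  extends Expr
case class GreaterThan(left: Expr, right: Expr)

// Rewrite date/timestamp vs. string (and date vs. timestamp) comparisons so
// that both sides are compared as strings, mirroring the diff above.
def coerce(p: GreaterThan): GreaterThan = (p.left.dataType, p.right.dataType) match {
  case (StringType, DateType) | (StringType, TimestampType) =>
    GreaterThan(p.left, Cast(p.right, StringType))
  case (DateType, StringType) | (TimestampType, StringType) =>
    GreaterThan(Cast(p.left, StringType), p.right)
  case (DateType, TimestampType) | (TimestampType, DateType) =>
    GreaterThan(Cast(p.left, StringType), Cast(p.right, StringType))
  case _ => p
}

// coerce(GreaterThan(Attr("d", DateType), Attr("s", StringType)))
//   == GreaterThan(Cast(Attr("d", DateType), StringType), Attr("s", StringType))
```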





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739472
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21632/consoleFull)
 for   PR 2761 at commit 
[`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker `
  * `class SparkRunnerSettings(error: String => Unit) extends Settings(error) `
  * `trait ActorHelper extends Logging `






[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739473
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21632/
Test FAILed.





[GitHub] spark pull request: [Docs] logNormalGraph missing partition parame...

2014-10-11 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2523#issuecomment-58739482
  
@elmalto It looks like GitHub says that this PR was opened from an unknown 
repository, which might explain why you're not able to update its code. If 
that's the case, could you close this PR and open a new one?





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread sarutak
Github user sarutak commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739516
  
Oh, I didn't run scalastyle for yarn-alpha.





[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support

2014-10-11 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2344#discussion_r18739665
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
       case a: BinaryArithmetic if a.right.dataType == StringType =>
         a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
 
+      // we should cast all timestamp/date/string compare into string compare
+      case p: BinaryPredicate if p.left.dataType == StringType
+        && p.right.dataType == DateType =>
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType
+        && p.right.dataType == StringType =>
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
       case p: BinaryPredicate if p.left.dataType == StringType
         && p.right.dataType == TimestampType =>
-        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
       case p: BinaryPredicate if p.left.dataType == TimestampType
         && p.right.dataType == StringType =>
-        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
+      case p: BinaryPredicate if p.left.dataType == TimestampType
+        && p.right.dataType == DateType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType
+        && p.right.dataType == TimestampType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
--- End diff --

So Michael agreed to leave the whole ordering and comparing stuff to a 
separate PR :)





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739525
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21633/consoleFull)
 for   PR 2761 at commit 
[`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739546
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21633/
Test FAILed.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739545
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21633/consoleFull)
 for   PR 2761 at commit 
[`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker `
  * `class SparkRunnerSettings(error: String => Unit) extends Settings(error) `
  * `trait ActorHelper extends Logging `






[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2574#issuecomment-58739664
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21624/consoleFull)
 for   PR 2574 at commit 
[`4fee5a8`](https://github.com/apache/spark/commit/4fee5a8400e87f7bb33363194cc3039feb3dbed6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2574#issuecomment-58739666
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21624/
Test PASSed.





[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...

2014-10-11 Thread lirui-intel
Github user lirui-intel commented on the pull request:

https://github.com/apache/spark/pull/2760#issuecomment-58739690
  
Looks great! I think it's very useful to have these async APIs in Java :-)





[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...

2014-10-11 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/2675#issuecomment-58739685
  
@marmbrus This should be ready to go once Jenkins nods.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739709
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21634/consoleFull)
 for   PR 2761 at commit 
[`64b2c46`](https://github.com/apache/spark/commit/64b2c46474a48fc0906f140edf310c46eb63).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2675#issuecomment-58739744
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21626/consoleFull)
 for   PR 2675 at commit 
[`1c384b7`](https://github.com/apache/spark/commit/1c384b7bc8b0b8d5b9b6bf294f399de5bb8a9976).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2675#issuecomment-58739745
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21626/
Test PASSed.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739778
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21634/
Test FAILed.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58739777
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21634/consoleFull)
 for   PR 2761 at commit 
[`64b2c46`](https://github.com/apache/spark/commit/64b2c46474a48fc0906f140edf310c46eb63).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker `
  * `class SparkRunnerSettings(error: String => Unit) extends Settings(error) `
  * `trait ActorHelper extends Logging `






[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-11 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2570#issuecomment-58739817
  
test this please.





[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-11 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2570#issuecomment-58739815
  
Seems the failure is not related to this PR.





[GitHub] spark pull request: [SPARK-3867] ./python/run-tests failed when it...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2759#issuecomment-58739867
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21625/consoleFull)
 for   PR 2759 at commit 
[`f068eb5`](https://github.com/apache/spark/commit/f068eb508c7f0e6991d296f4473eb754c7d5090f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3867] ./python/run-tests failed when it...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2759#issuecomment-58739870
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21625/
Test PASSed.





[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2570#issuecomment-58739903
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21635/consoleFull)
 for   PR 2570 at commit 
[`3774bd4`](https://github.com/apache/spark/commit/3774bd4617cb4dec3f78a08bdf42653b682102fd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58740083
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21636/consoleFull)
 for   PR 2761 at commit 
[`d80d71a`](https://github.com/apache/spark/commit/d80d71abc4cf3d85a2585729719b35a5eca84551).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2760#issuecomment-58740225
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21628/consoleFull)
 for   PR 2760 at commit 
[`ff28e49`](https://github.com/apache/spark/commit/ff28e49d990577635fa148bd57461a387bd3466d).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaFutureActionWrapper[S, T](futureAction: FutureAction[S], converter: S => T)`






[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2760#issuecomment-58740227
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21628/
Test PASSed.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request:

https://github.com/apache/spark/pull/2762

[SPARK-3904] [SQL] add constant objectinspector support for udfs

In HQL, we convert all of the data types into normal `ObjectInspector`s for 
UDFs. In most cases this works; however, some UDFs actually require the input 
`ObjectInspector` to be a `ConstantObjectInspector`, which causes an 
exception, e.g.
select named_struct(x, str) from src limit 1;
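
As a rough illustration of the idea (a simplified standalone sketch with hypothetical stand-in types, not the actual Hive/Catalyst wrapping code): literal arguments should be handed to the UDF as constant object inspectors carrying their folded value, while non-literal arguments keep a plain inspector.

```scala
// Minimal sketch, assuming hypothetical stand-ins for Hive's inspector classes.
sealed trait ObjectInspector
case class PlainOI(typeName: String)                extends ObjectInspector
case class ConstantOI(typeName: String, value: Any) extends ObjectInspector

sealed trait Expr { def typeName: String }
case class Literal(value: Any, typeName: String)  extends Expr
case class Column(name: String, typeName: String) extends Expr

// UDFs such as named_struct read some arguments (e.g. field names) when they
// are initialized, so those arguments must arrive as constant inspectors.
def inspectorFor(e: Expr): ObjectInspector = e match {
  case Literal(v, t) => ConstantOI(t, v) // the folded value travels with the inspector
  case other         => PlainOI(other.typeName)
}

// inspectorFor(Literal("x", "string"))  == ConstantOI("string", "x")
// inspectorFor(Column("str", "string")) == PlainOI("string")
```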

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenghao-intel/spark udf_coi

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2762.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2762


commit 06581e31aaef055c89a0d89ddaac657a9609d571
Author: Cheng Hao hao.ch...@intel.com
Date:   2014-10-11T06:34:24Z

add constant objectinspector support for udfs







[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58740431
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58740447
  
test this please.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58740444
  
test this please.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58740484
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21637/consoleFull)
 for   PR 2762 at commit 
[`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2760#issuecomment-58740594
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21630/consoleFull)
 for   PR 2760 at commit 
[`6f8f6ac`](https://github.com/apache/spark/commit/6f8f6ac668d74a3164bcf037f09c8353134b53f6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class JavaFutureActionWrapper[S, T](futureAction: FutureAction[S], converter: S => T)`






[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2760#issuecomment-58740597
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21630/
Test PASSed.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58740581
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21638/consoleFull)
 for   PR 2762 at commit 
[`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58740676
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/350/consoleFull)
 for   PR 2538 at commit 
[`6db00da`](https://github.com/apache/spark/commit/6db00da9595e38eccff7bfb5683b32cee3ac6263).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class StreamingContext(object):`
  * `class DStream(object):`
  * `class TransformedDStream(DStream):`
  * `class TransformFunction(object):`
  * `class TransformFunctionSerializer(object):`






[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread tianyi
Github user tianyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/2762#discussion_r18739877
  
--- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala ---
@@ -578,6 +578,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     "multi_join_union",
     "multiMapJoin1",
     "multiMapJoin2",
+    "udf_named_struct",
--- End diff --

I think you should put it after udf_month





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58740718
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21631/
Test PASSed.





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58740717
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21631/consoleFull)
 for   PR 2538 at commit 
[`64561e4`](https://github.com/apache/spark/commit/64561e4e503eafb958f6769383ba3b37edbe5fa2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class StreamingContext(object):`
  * `class DStream(object):`
  * `class TransformedDStream(DStream):`
  * `class TransformFunction(object):`
  * `class TransformFunctionSerializer(object):`






[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2570#issuecomment-58740765
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21635/
Test PASSed.





[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2570#issuecomment-58740758
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21635/consoleFull)
 for   PR 2570 at commit 
[`3774bd4`](https://github.com/apache/spark/commit/3774bd4617cb4dec3f78a08bdf42653b682102fd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class CreateTableAsSelect[T](`
  * `  logDebug(s"Found class for $serdeName")`






[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-10-11 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/2471#discussion_r18739911
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis
       }
     }
 
-    val newIterator = logInfos.iterator.buffered
-    val oldIterator = applications.values.iterator.buffered
-    while (newIterator.hasNext && oldIterator.hasNext) {
-      if (newIterator.head.endTime > oldIterator.head.endTime) {
-        addIfAbsent(newIterator.next)
-      } else {
-        addIfAbsent(oldIterator.next)
+    applications.synchronized {
--- End diff --

I think the two tasks must never run concurrently. If the order is: 
1. check task gets applications 
2. clean task gets applications
3. clean task gets its result and replaces applications
4. check task gets its result and replaces applications
then the clean task's result is overwritten by the check task's result.
Using a ScheduledExecutorService with a single worker thread is a good way to 
solve this.
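
A minimal sketch of that suggestion (placeholder task bodies, not FsHistoryProvider's real code): putting both periodic tasks on a single-threaded ScheduledExecutorService guarantees they never overlap, so each run always sees the previous run's result.

```scala
// Minimal sketch, assuming placeholder task bodies.
import java.util.concurrent.{Executors, TimeUnit}

object SingleThreadedMaintenance {
  // One worker thread means the two periodic tasks can never run concurrently.
  private val pool = Executors.newSingleThreadScheduledExecutor()

  private def checkForLogs(): Unit = println("checking for new event logs")
  private def cleanLogs(): Unit    = println("cleaning expired event logs")

  private def task(body: => Unit): Runnable = new Runnable { def run(): Unit = body }

  def start(): Unit = {
    pool.scheduleWithFixedDelay(task(checkForLogs()), 0, 10, TimeUnit.SECONDS)
    pool.scheduleWithFixedDelay(task(cleanLogs()), 0, 60, TimeUnit.SECONDS)
  }

  def stop(): Unit = pool.shutdown()
}
```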





[GitHub] spark pull request: [SPARK-3888] [PySpark] limit the memory used b...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2743#issuecomment-58740824
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/351/consoleFull)
 for   PR 2743 at commit 
[`623c8a7`](https://github.com/apache/spark/commit/623c8a76c2e91bd4f80193a0d7c4813d1cb3bc7a).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58741122
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21637/consoleFull)
 for   PR 2762 at commit 
[`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58741124
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21637/
Test FAILed.





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58741172
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/349/consoleFull)**
 for PR 2538 at commit 
[`6db00da`](https://github.com/apache/spark/commit/6db00da9595e38eccff7bfb5683b32cee3ac6263)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58741206
  
**[Tests timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21627/consoleFull)**
 for PR 2538 at commit 
[`331ecce`](https://github.com/apache/spark/commit/331ecced6f61ad5183da5830f94f584bcc74e479)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-2377] Python API for Streaming

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2538#issuecomment-58741207
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21627/
Test FAILed.





[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58741232
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21638/consoleFull)
 for   PR 2762 at commit 
[`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  protected case class Keyword(str: String)`
  * `class SqlLexical(val keywords: Seq[String]) extends StdLexical `
  * `  case class FloatLit(chars: String) extends Token `
  * `class SqlParser extends AbstractSparkSQLParser `
  * `case class SetCommand(kv: Option[(String, Option[String])]) extends 
Command `
  * `case class ShellCommand(cmd: String) extends Command`
  * `case class SourceCommand(filePath: String) extends Command`
  * `case class SetCommand(kv: Option[(String, Option[String])], output: 
Seq[Attribute])(`






[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2762#issuecomment-58741234
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21638/
Test FAILed.





[GitHub] spark pull request: The keys for sorting the columns of Executor p...

2014-10-11 Thread witgo
GitHub user witgo opened a pull request:

https://github.com/apache/spark/pull/2763

The keys for sorting the columns of the Executor page, Stage page and Storage 
page are incorrect



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/witgo/spark SPARK-3905

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2763.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2763


commit 17d79904dc80e960145db216d2de9ab8884458dd
Author: GuoQiang Li wi...@qq.com
Date:   2014-10-11T07:11:58Z

The keys for sorting the columns of the Executor page, Stage page and Storage 
page are incorrect







[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2763#issuecomment-58741803
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2763#issuecomment-58741865
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21639/consoleFull)
 for   PR 2763 at commit 
[`17d7990`](https://github.com/apache/spark/commit/17d79904dc80e960145db216d2de9ab8884458dd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58741924
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21636/consoleFull)
 for   PR 2761 at commit 
[`d80d71a`](https://github.com/apache/spark/commit/d80d71abc4cf3d85a2585729719b35a5eca84551).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker `
  * `class SparkRunnerSettings(error: String => Unit) extends Settings(error) `
  * `trait ActorHelper extends Logging `






[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58741926
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21636/
Test PASSed.





[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/2764

[SQL] Refactors data type pattern matching

Refactors/adds extractors for `DataType` and `Binary*` types to ease and 
simplify data type related (nested) pattern matching.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark datatype-patmat

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2764.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2764


commit f391be51ee91da4c12146c90aad9f63d06f0ac34
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-10-11T07:37:25Z

Refactors data type pattern matching







[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2764#issuecomment-58742036
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2764#issuecomment-58742085
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21640/consoleFull)
 for   PR 2764 at commit 
[`f391be5`](https://github.com/apache/spark/commit/f391be51ee91da4c12146c90aad9f63d06f0ac34).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-10-11 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/2471#discussion_r18740124
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: 
SparkConf) extends ApplicationHis
   }
 }
 
-val newIterator = logInfos.iterator.buffered
-val oldIterator = applications.values.iterator.buffered
-while (newIterator.hasNext && oldIterator.hasNext) {
-  if (newIterator.head.endTime > oldIterator.head.endTime) {
-addIfAbsent(newIterator.next)
-  } else {
-addIfAbsent(oldIterator.next)
+applications.synchronized {
+  val newIterator = logInfos.iterator.buffered
+  val oldIterator = applications.values.iterator.buffered
+  while (newIterator.hasNext && oldIterator.hasNext) {
+if (newIterator.head.endTime > oldIterator.head.endTime) {
+  addIfAbsent(newIterator.next)
+} else {
+  addIfAbsent(oldIterator.next)
+}
   }
+  newIterator.foreach(addIfAbsent)
+  oldIterator.foreach(addIfAbsent)
+
+  applications = newApps
 }
-newIterator.foreach(addIfAbsent)
-oldIterator.foreach(addIfAbsent)
+  }
+} catch {
+  case t: Throwable => logError("Exception in checking for event log updates", t)
+}
+  }
+
+  /**
+   *  Deleting apps if setting cleaner.
+   */
+  private def cleanLogs() = {
+lastLogCleanTimeMs = getMonotonicTimeMs()
+logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs))
+try {
+  val logStatus = fs.listStatus(new Path(resolvedLogDir))
+  val logDirs = if (logStatus != null) logStatus.filter(_.isDir).toSeq 
else Seq[FileStatus]()
+  val maxAge = conf.getLong("spark.history.fs.maxAge.seconds",
+DEFAULT_SPARK_HISTORY_FS_MAXAGE_S) * 1000
+
+  val now = System.currentTimeMillis()
+  fs.synchronized {
+// scan all logs from the log directory.
+// Only directories older than this many seconds will be deleted .
+logDirs.foreach { dir =>
+  // history file older than this many seconds will be deleted 
+  // when the history cleaner runs.
+  if (now - getModificationTime(dir) > maxAge) {
+fs.delete(dir.getPath, true)
+  }
+}
+  }
+  
+  val newApps = new mutable.LinkedHashMap[String, 
FsApplicationHistoryInfo]()
+  def addIfNotExpire(info: FsApplicationHistoryInfo) = {
+if(now - info.lastUpdated <= maxAge) {
+  newApps += (info.id -> info)
--- End diff --

info.lastUpdated is the timestamp of the directory, and it is always greater than the timestamps of the files inside it.





[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...

2014-10-11 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/2761#issuecomment-58742157
  
I quite like standardizing style, but doesn't this have the same problem 
mentioned before, that it's going to break a lot of potential merge commits? If 
it's bite-the-bullet time, there are other micro changes that may actually have 
a little positive impact on execution that might be good to get in too.
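
For readers who have not seen the rule, a minimal illustration of the spacing this Scalastyle check is meant to enforce (illustrative examples only, not the checker's actual test cases):

```scala
// Flagged by a "space before left brace" check: no space before `{`
def withoutSpace(){
  println("no space before the brace")
}

// Accepted: a single space before `{`
def withSpace() {
  println("space before the brace")
}
```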





[GitHub] spark pull request: [SPARK-3888] [PySpark] limit the memory used b...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2743#issuecomment-58742243
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/351/consoleFull)
 for   PR 2743 at commit 
[`623c8a7`](https://github.com/apache/spark/commit/623c8a76c2e91bd4f80193a0d7c4813d1cb3bc7a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-10-11 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/2471#discussion_r18740169
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: 
SparkConf) extends ApplicationHis
   }
 }
 
-val newIterator = logInfos.iterator.buffered
-val oldIterator = applications.values.iterator.buffered
-while (newIterator.hasNext && oldIterator.hasNext) {
-  if (newIterator.head.endTime > oldIterator.head.endTime) {
-addIfAbsent(newIterator.next)
-  } else {
-addIfAbsent(oldIterator.next)
+applications.synchronized {
+  val newIterator = logInfos.iterator.buffered
+  val oldIterator = applications.values.iterator.buffered
+  while (newIterator.hasNext && oldIterator.hasNext) {
+if (newIterator.head.endTime > oldIterator.head.endTime) {
+  addIfAbsent(newIterator.next)
+} else {
+  addIfAbsent(oldIterator.next)
+}
   }
+  newIterator.foreach(addIfAbsent)
+  oldIterator.foreach(addIfAbsent)
+
+  applications = newApps
 }
-newIterator.foreach(addIfAbsent)
-oldIterator.foreach(addIfAbsent)
+  }
+} catch {
+  case t: Throwable => logError("Exception in checking for event log updates", t)
+}
+  }
+
+  /**
+   *  Deleting apps if setting cleaner.
+   */
+  private def cleanLogs() = {
+lastLogCleanTimeMs = getMonotonicTimeMs()
+logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs))
+try {
+  val logStatus = fs.listStatus(new Path(resolvedLogDir))
+  val logDirs = if (logStatus != null) logStatus.filter(_.isDir).toSeq 
else Seq[FileStatus]()
+  val maxAge = conf.getLong("spark.history.fs.maxAge.seconds",
+DEFAULT_SPARK_HISTORY_FS_MAXAGE_S) * 1000
+
+  val now = System.currentTimeMillis()
+  fs.synchronized {
+// scan all logs from the log directory.
+// Only directories older than this many seconds will be deleted .
+logDirs.foreach { dir =>
+  // history file older than this many seconds will be deleted 
+  // when the history cleaner runs.
+  if (now - getModificationTime(dir) > maxAge) {
+fs.delete(dir.getPath, true)
--- End diff --

Can you explain the detailed reason for adding a try..catch around fs.delete? I think the exception would already be caught by the try..catch at line 271.
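
For context, a minimal sketch of the per-directory try..catch being discussed, reusing the names from the quoted diff (illustrative only, not the patch itself): a failure while deleting one directory is logged and skipped instead of aborting the whole cleanup loop via the outer try..catch.

```scala
logDirs.foreach { dir =>
  if (now - getModificationTime(dir) > maxAge) {
    try {
      fs.delete(dir.getPath, true)
    } catch {
      // Only this directory is skipped; the remaining ones are still cleaned.
      case e: java.io.IOException =>
        logError("Failed to delete expired event log " + dir.getPath, e)
    }
  }
}
```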





[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2764#issuecomment-58743015
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21640/Test 
PASSed.





[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2764#issuecomment-58743014
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21640/consoleFull)
 for   PR 2764 at commit 
[`f391be5`](https://github.com/apache/spark/commit/f391be51ee91da4c12146c90aad9f63d06f0ac34).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class InSet(value: Expression, hset: HashSet[Any], child: 
Seq[Expression])`






[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread wangxiaojing
GitHub user wangxiaojing opened a pull request:

https://github.com/apache/spark/pull/2765

[spark-3586][streaming]Support nested directories in Spark Streaming

For text files, data is ingested with streamingContext.textFileStream(dataDirectory).
This improvement lets Spark Streaming support subdirectories: it can monitor the
subdirectories of dataDirectory and process any files created in them.
e.g.:
streamingContext.textFileStream("/test")
Given the directory contents:
/test/file1
/test/file2
/test/dr/file1
if a new file file2 appears in /test/dr/, Spark Streaming can process that file.
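
A minimal usage sketch of the behaviour described above, using the example paths from this description (application name and batch interval are arbitrary):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("NestedDirStreamingExample")
val ssc = new StreamingContext(conf, Seconds(10))

// With this change, files created in immediate subdirectories such as
// /test/dr/ are picked up as well, not only files created directly in /test.
val lines = ssc.textFileStream("/test")
lines.print()

ssc.start()
ssc.awaitTermination()
```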



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangxiaojing/spark spark-3586

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2765.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2765


commit 98ead547f90520819b421b0f4436bfe7d8a3d4f4
Author: wangxiaojing u9j...@gmail.com
Date:   2014-10-11T08:22:31Z

Support nested directories in Spark Streaming







[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2765#issuecomment-58743235
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2765#issuecomment-58743237
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...

2014-10-11 Thread james64
Github user james64 commented on the pull request:

https://github.com/apache/spark/pull/2712#issuecomment-58743254
  
Could it be that the Flume test failed due to upstream changes? It is passing for me locally now.





[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2763#issuecomment-58743282
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21639/consoleFull)
 for   PR 2763 at commit 
[`17d7990`](https://github.com/apache/spark/commit/17d79904dc80e960145db216d2de9ab8884458dd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2763#issuecomment-58743286
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21639/Test 
PASSed.





[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2712#issuecomment-58743306
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21641/consoleFull)
 for   PR 2712 at commit 
[`1b20d51`](https://github.com/apache/spark/commit/1b20d5193fa149347f9c8c05bb25298992324d4a).
 * This patch merges cleanly.





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740431
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 
 def accept(path: Path): Boolean = {
   try {
+if (fs.getFileStatus(path).isDirectory()){
--- End diff --

Nit: space before brace





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740429
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -230,6 +246,10 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 if (minNewFileModTime < 0 || modTime < minNewFileModTime) {
   minNewFileModTime = modTime
 }
+if(path.getName().startsWith("_")){
+  System.out.println("startsWith:" + path.getName())
--- End diff --

Remove this System.out





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740430
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 
 def accept(path: Path): Boolean = {
   try {
+if (fs.getFileStatus(path).isDirectory()){
+  return false
+}
 if (!filter(path)) {  // Reject file if it does not satisfy filter
   logDebug("Rejected by filter " + path)
   return false
--- End diff --

You don't need `return` anywhere
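
To illustrate the point, a sketch of the same check written expression-style, without `return`. Here `fs` and `filter` are passed in as stand-ins for the members of the surrounding FileInputDStream, so this is illustrative rather than the patch code:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}

def accept(fs: FileSystem, path: Path, filter: Path => Boolean): Boolean = {
  if (fs.getFileStatus(path).isDirectory()) {
    false                                  // the last expression of a branch is its result
  } else if (!filter(path)) {
    println("Rejected by filter " + path)  // logDebug in the real code
    false
  } else {
    true
  }
}
```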





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740436
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
   true
 }
   }
+
+  private[streaming]
+  class SubPathFilter extends PathFilter {
+
+def accept(path: Path): Boolean = {
+  try {
+if(fs.getFileStatus(path).isDirectory()){
+  val modTime = getFileModTime(path)
+  logDebug("Mod time for " + path + " is " + modTime)
+  if (modTime < ignoreTime) {
+// Reject file if it was created before the ignore time (or, before last interval)
+logDebug("Mod time " + modTime + " less than ignore time " + ignoreTime)
+return false
+  }
+  return true
+}
+  } catch {
+case fnfe: java.io.FileNotFoundException =>
--- End diff --

Why not import this, and what about more general `IOException`?
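
A sketch of what that suggestion could look like: import the exception types at the top of the file and also handle the broader `IOException`. The method name and parameters below are hypothetical stand-ins for the dstream's own state, not the patch code.

```scala
import java.io.{FileNotFoundException, IOException}

import org.apache.hadoop.fs.{FileSystem, Path}

def isAcceptableDirectory(fs: FileSystem, path: Path, ignoreTime: Long): Boolean = {
  try {
    val status = fs.getFileStatus(path)
    // Keep only directories modified at or after the ignore time.
    status.isDirectory() && status.getModificationTime >= ignoreTime
  } catch {
    case _: FileNotFoundException => false
    case _: IOException => false  // e.g. a transient HDFS error
  }
}
```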





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740443
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
   true
 }
   }
+
+  private[streaming]
+  class SubPathFilter extends PathFilter {
+
+def accept(path: Path): Boolean = {
+  try {
+if(fs.getFileStatus(path).isDirectory()){
+  val modTime = getFileModTime(path)
+  logDebug("Mod time for " + path + " is " + modTime)
--- End diff --

Nit: you can use string interpolation to make it a little simpler
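
For example, the two concatenations in this block could be written as interpolated strings (a sketch of the suggestion, not the committed code):

```scala
logDebug(s"Mod time for $path is $modTime")
logDebug(s"Mod time $modTime less than ignore time $ignoreTime")
```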





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740446
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
   true
 }
   }
+
+  private[streaming]
+  class SubPathFilter extends PathFilter {
+
+def accept(path: Path): Boolean = {
+  try {
+if(fs.getFileStatus(path).isDirectory()){
+  val modTime = getFileModTime(path)
+  logDebug("Mod time for " + path + " is " + modTime)
+  if (modTime < ignoreTime) {
+// Reject file if it was created before the ignore time (or, before last interval)
+logDebug("Mod time " + modTime + " less than ignore time " + ignoreTime)
--- End diff --

Log message is inconsistent with conditional





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740452
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -118,6 +119,18 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 (newFiles, filter.minNewFileModTime)
   }
 
+  def getPathList( path:Path, fs:FileSystem):List[Path]={
+val filter = new SubPathFilter()
+var pathList = List[Path]()
+fs.listStatus(path,filter).map(x=>{
+  if(x.isDirectory()){
--- End diff --

Doesn't this only list immediate subdirectories?





[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2471#discussion_r18740457
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: 
SparkConf) extends ApplicationHis
   }
 }
 
-val newIterator = logInfos.iterator.buffered
-val oldIterator = applications.values.iterator.buffered
-while (newIterator.hasNext && oldIterator.hasNext) {
-  if (newIterator.head.endTime > oldIterator.head.endTime) {
-addIfAbsent(newIterator.next)
-  } else {
-addIfAbsent(oldIterator.next)
+applications.synchronized {
+  val newIterator = logInfos.iterator.buffered
+  val oldIterator = applications.values.iterator.buffered
+  while (newIterator.hasNext && oldIterator.hasNext) {
+if (newIterator.head.endTime > oldIterator.head.endTime) {
+  addIfAbsent(newIterator.next)
+} else {
+  addIfAbsent(oldIterator.next)
+}
   }
+  newIterator.foreach(addIfAbsent)
+  oldIterator.foreach(addIfAbsent)
+
+  applications = newApps
 }
-newIterator.foreach(addIfAbsent)
-oldIterator.foreach(addIfAbsent)
+  }
+} catch {
+  case t: Throwable => logError("Exception in checking for event log updates", t)
+}
+  }
+
+  /**
+   *  Deleting apps if setting cleaner.
+   */
+  private def cleanLogs() = {
+lastLogCleanTimeMs = getMonotonicTimeMs()
+logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs))
--- End diff --

Nit: string interpolation is probably clearer: `s"Cleaning ... now $lastLogCleanTimeMs"`





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread wangxiaojing
Github user wangxiaojing commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740834
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -118,6 +119,18 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 (newFiles, filter.minNewFileModTime)
   }
 
+  def getPathList( path:Path, fs:FileSystem):List[Path]={
+val filter = new SubPathFilter()
+var pathList = List[Path]()
+fs.listStatus(path,filter).map(x=>{
+  if(x.isDirectory()){
--- End diff --

Yes, this only supports immediate subdirectories; recursing through all nested directories would make the processing time too long.
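
For context, a rough sketch of the trade-off being discussed, written against the Hadoop FileSystem API (names and structure are illustrative, not the patch code): the one-level scan is what this PR does, while a full recursion would visit every nested level on every batch interval.

```scala
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}

// One level deep: the root directory plus its immediate subdirectories.
def listOneLevel(fs: FileSystem, root: Path): Seq[FileStatus] = {
  val top = fs.listStatus(root)
  val subDirs = top.filter(_.isDirectory).map(_.getPath)
  (top ++ subDirs.flatMap(d => fs.listStatus(d))).filter(!_.isDirectory)
}

// Fully recursive: visits every nested level, which the author avoided
// because the scan runs once per batch interval.
def listRecursive(fs: FileSystem, root: Path): Seq[FileStatus] = {
  fs.listStatus(root).flatMap { status =>
    if (status.isDirectory) listRecursive(fs, status.getPath) else Seq(status)
  }
}
```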





[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2712#issuecomment-58744782
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21641/consoleFull)
 for   PR 2712 at commit 
[`1b20d51`](https://github.com/apache/spark/commit/1b20d5193fa149347f9c8c05bb25298992324d4a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2712#issuecomment-58744783
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21641/Test 
PASSed.





[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread wangxiaojing
Github user wangxiaojing commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740849
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 
 def accept(path: Path): Boolean = {
   try {
+if (fs.getFileStatus(path).isDirectory()){
+  return false
+}
 if (!filter(path)) {  // Reject file if it does not satisfy filter
   logDebug("Rejected by filter " + path)
   return false
--- End diff --

Why? If the path is a directory, it should not be considered.





[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2388#issuecomment-58744928
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21642/consoleFull)
 for   PR 2388 at commit 
[`b0734b8`](https://github.com/apache/spark/commit/b0734b86ab95774aec79af55d9de48b363fe243b).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx

2014-10-11 Thread witgo
Github user witgo commented on a diff in the pull request:

https://github.com/apache/spark/pull/2388#discussion_r18740868
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/feature/TopicModeling.scala ---
@@ -0,0 +1,674 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.feature
+
+import java.util.Random
+
+import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV, sum => brzSum}
+
+import org.apache.spark.Logging
+import org.apache.spark.SparkContext._
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.broadcast.Broadcast
+import org.apache.spark.graphx._
+import org.apache.spark.mllib.linalg.distributed.{MatrixEntry, RowMatrix}
+import org.apache.spark.mllib.linalg.{DenseVector => SDV, SparseVector => SSV, Vector => SV}
+import org.apache.spark.rdd.RDD
+import org.apache.spark.serializer.KryoRegistrator
+import org.apache.spark.storage.StorageLevel
+
+import TopicModeling._
+
+class TopicModeling private[mllib](
+  @transient var corpus: Graph[VD, ED],
+  val numTopics: Int,
+  val numTerms: Int,
+  val alpha: Double,
+  val beta: Double,
+  @transient val storageLevel: StorageLevel)
+  extends Serializable with Logging {
+
+  def this(docs: RDD[(TopicModeling.DocId, SSV)],
+numTopics: Int,
+alpha: Double,
+beta: Double,
+storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK,
+computedModel: Broadcast[TopicModel] = null) {
+this(initializeCorpus(docs, numTopics, storageLevel, computedModel),
+  numTopics, docs.first()._2.size, alpha, beta, storageLevel)
+  }
+
+
+  /**
+   * The number of documents in the corpus
+   */
+  val numDocs = docVertices.count()
+
+  /**
+   * The number of terms in the corpus
+   */
+  private val sumTerms = corpus.edges.map(e => e.attr.size.toDouble).sum().toLong
+
+  /**
+   * The total counts for each topic
+   */
+  @transient private var globalTopicCounter: BV[Count] = 
collectGlobalCounter(corpus, numTopics)
+  assert(brzSum(globalTopicCounter) == sumTerms)
+  @transient private val sc = corpus.vertices.context
+  @transient private val seed = new Random().nextInt()
+  @transient private var innerIter = 1
+  @transient private var cachedEdges: EdgeRDD[ED, VD] = null
+  @transient private var cachedVertices: VertexRDD[VD] = null
+
+  private def termVertices = corpus.vertices.filter(t => t._1 >= 0)
+
+  private def docVertices = corpus.vertices.filter(t => t._1 < 0)
+
+  private def gibbsSampling(cachedEdges: EdgeRDD[ED, VD],
+cachedVertices: VertexRDD[VD]): (EdgeRDD[ED, VD], VertexRDD[VD]) = {
+
+val corpusTopicDist = collectTermTopicDist(corpus, globalTopicCounter,
+  sumTerms, numTerms, numTopics, alpha, beta)
+
+val corpusSampleTopics = sampleTopics(corpusTopicDist, 
globalTopicCounter,
+  sumTerms, innerIter + seed, numTerms, numTopics, alpha, beta)
+corpusSampleTopics.edges.setName(s"edges-$innerIter").cache().count()
+Option(cachedEdges).foreach(_.unpersist())
+val edges = corpusSampleTopics.edges
+
+corpus = updateCounter(corpusSampleTopics, numTopics)
+corpus.vertices.setName(s"vertices-$innerIter").cache()
+globalTopicCounter = collectGlobalCounter(corpus, numTopics)
+assert(brzSum(globalTopicCounter) == sumTerms)
+Option(cachedVertices).foreach(_.unpersist())
+val vertices = corpus.vertices
+
+if (innerIter % 10 == 0 && sc.getCheckpointDir.isDefined) {
--- End diff --

This is only a temporary solution; see the related PR #2631.
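
For readers following along, a generic sketch of the periodic-checkpoint pattern the quoted code relies on: truncating the lineage every N iterations so a long iterative job does not accumulate an unbounded DAG. The directory and iteration counts are arbitrary, and this is not the PR's code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("periodic-checkpoint-sketch"))
sc.setCheckpointDir("/tmp/spark-checkpoints")  // assumed path

var rdd = sc.parallelize(1 to 1000)
for (iter <- 1 to 100) {
  rdd = rdd.map(_ + 1).cache()
  // Every 10 iterations, cut the lineage; otherwise each step keeps a
  // reference to all previous steps.
  if (iter % 10 == 0 && sc.getCheckpointDir.isDefined) {
    rdd.checkpoint()
  }
  rdd.count()  // materialize so the checkpoint actually happens
}
```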



[GitHub] spark pull request: [spark-3586][streaming]Support nested director...

2014-10-11 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2765#discussion_r18740883
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
 ---
@@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F : 
NewInputFormat[K,V] : Clas
 
 def accept(path: Path): Boolean = {
   try {
+if (fs.getFileStatus(path).isDirectory()){
+  return false
+}
 if (!filter(path)) {  // Reject file if it does not satisfy filter
   logDebug("Rejected by filter " + path)
   return false
--- End diff --

I mean you can write `false`, not `return false`





[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...

2014-10-11 Thread cocoatomo
GitHub user cocoatomo opened a pull request:

https://github.com/apache/spark/pull/2766

[SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and 
building warnings

The Sphinx documents contain a corrupted ReST format and produce some warnings.

The purpose of this issue is the same as
https://issues.apache.org/jira/browse/SPARK-3773.

commit: 0e8203f4fb721158fb27897680da476174d24c4b

output
```
$ cd ./python/docs
$ make clean html
rm -rf _build/*
sphinx-build -b html -d _build/doctrees   . _build/html
Making output directory...
Running Sphinx v1.2.3
loading pickled environment... not yet created
building [html]: targets for 4 source files that are out of date
updating environment: 4 added, 0 changed, 0 removed
reading sources... [100%] pyspark.sql   

  
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring 
of pyspark.mllib.feature.Word2VecModel.findSynonyms:4: WARNING: Field list ends 
without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring 
of pyspark.mllib.feature.Word2VecModel.transform:3: WARNING: Field list ends 
without a blank line; unexpected unindent.
/Users/user/MyRepos/Scala/spark/python/pyspark/sql.py:docstring of 
pyspark.sql:4: WARNING: Bullet list ends without a blank line; unexpected 
unindent.
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] pyspark.sql

  
writing additional files... (12 module code pages) _modules/index search
copying static files... WARNING: html_static_path entry 
u'/Users/user/MyRepos/Scala/spark/python/docs/_static' does not exist
done
copying extra files... done
dumping search index... done
dumping object inventory... done
build succeeded, 4 warnings.

Build finished. The HTML pages are in _build/html.
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cocoatomo/spark 
issues/3909-sphinx-build-warnings

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2766.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2766


commit 2c7faa8ca05820edd9936fdacc69e551059fc532
Author: cocoatomo cocoatom...@gmail.com
Date:   2014-10-11T10:20:24Z

[SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and 
building warnings







[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2766#issuecomment-58745151
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21643/consoleFull)
 for   PR 2766 at commit 
[`2c7faa8`](https://github.com/apache/spark/commit/2c7faa8ca05820edd9936fdacc69e551059fc532).
 * This patch merges cleanly.





[GitHub] spark pull request: [SQL] Refactors data type pattern matching

2014-10-11 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/2764#discussion_r18740906
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -107,20 +107,20 @@ trait HiveTypeCoercion {
 case e if !e.childrenResolved => e
 
 /* Double Conversions */
-case b: BinaryExpression if b.left == stringNaN && b.right.dataType == DoubleType =>
-  b.makeCopy(Array(b.right, Literal(Double.NaN)))
-case b: BinaryExpression if b.left.dataType == DoubleType && b.right == stringNaN =>
-  b.makeCopy(Array(Literal(Double.NaN), b.left))
-case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-  b.makeCopy(Array(Literal(Double.NaN), b.left))
+case b @ BinaryExpression(StringNaN, DoubleType(r)) =>
+  b.makeCopy(Array(r, Literal(Double.NaN)))
+case b @ BinaryExpression(DoubleType(l), StringNaN) =>
+  b.makeCopy(Array(Literal(Double.NaN), l))
+case b @ BinaryExpression(l @ StringNaN, StringNaN) =>
+  b.makeCopy(Array(Literal(Double.NaN), l))
 
 /* Float Conversions */
-case b: BinaryExpression if b.left == stringNaN && b.right.dataType == FloatType =>
-  b.makeCopy(Array(b.right, Literal(Float.NaN)))
-case b: BinaryExpression if b.left.dataType == FloatType && b.right == stringNaN =>
-  b.makeCopy(Array(Literal(Float.NaN), b.left))
-case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-  b.makeCopy(Array(Literal(Float.NaN), b.left))
+case b @ BinaryExpression(StringNaN, FloatType(r)) =>
+  b.makeCopy(Array(r, Literal(Float.NaN)))
+case b @ BinaryExpression(FloatType(l), StringNaN) =>
+  b.makeCopy(Array(Literal(Float.NaN), l))
+case b @ BinaryExpression(l @ StringNaN, StringNaN) =>
+  b.makeCopy(Array(Literal(Float.NaN), l))
--- End diff --

This case branch can never be reached since line 114 supersedes it. As a result, NaN is always treated as `Double`. Is this a bug?
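
A tiny standalone illustration of the point (plain Scala, not Catalyst): in a pattern match the first matching case wins, so a later case with an identical pattern is dead code, which is why the Float branch here can never fire.

```scala
def classify(x: Any): String = x match {
  case s: String if s == "NaN" => "double NaN"  // always wins for "NaN"
  case s: String if s == "NaN" => "float NaN"   // unreachable, mirrors the shadowed Float case
  case _ => "other"
}

// classify("NaN") always returns "double NaN"
```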





[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...

2014-10-11 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/2767

[SPARK-3906][SQL] Adds multiple join support for SQLContext



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark multi-join

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2767.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2767


commit c78c944ccf20bd53c685e2be72cc6622c8b8e7ff
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-10-11T10:00:44Z

Adds multiple join support for SQLContext







[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2767#issuecomment-58745472
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2767#issuecomment-58745535
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21644/consoleFull)
 for   PR 2767 at commit 
[`c78c944`](https://github.com/apache/spark/commit/c78c944ccf20bd53c685e2be72cc6622c8b8e7ff).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-10-11 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/2471#discussion_r18741084
  
--- Diff: 
core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: 
SparkConf) extends ApplicationHis
   }
 }
 
-val newIterator = logInfos.iterator.buffered
-val oldIterator = applications.values.iterator.buffered
-while (newIterator.hasNext && oldIterator.hasNext) {
-  if (newIterator.head.endTime > oldIterator.head.endTime) {
-addIfAbsent(newIterator.next)
-  } else {
-addIfAbsent(oldIterator.next)
+applications.synchronized {
+  val newIterator = logInfos.iterator.buffered
+  val oldIterator = applications.values.iterator.buffered
+  while (newIterator.hasNext && oldIterator.hasNext) {
+if (newIterator.head.endTime > oldIterator.head.endTime) {
+  addIfAbsent(newIterator.next)
+} else {
+  addIfAbsent(oldIterator.next)
+}
   }
+  newIterator.foreach(addIfAbsent)
+  oldIterator.foreach(addIfAbsent)
+
+  applications = newApps
 }
-newIterator.foreach(addIfAbsent)
-oldIterator.foreach(addIfAbsent)
+  }
+} catch {
+  case t: Throwable => logError("Exception in checking for event log updates", t)
--- End diff --

Do you mean we should not catch Throwable? What should we do instead?
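
One common alternative in Spark code, offered here only as an illustration of what the reviewer may be hinting at: catch scala.util.control.NonFatal instead of Throwable, so fatal errors such as OutOfMemoryError still propagate. The method name below is a hypothetical stand-in for the surrounding log-checking logic.

```scala
import scala.util.control.NonFatal

try {
  checkForEventLogUpdates()  // stand-in for the scanning code in this method
} catch {
  case NonFatal(e) => logError("Exception in checking for event log updates", e)
}
```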





[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2388#issuecomment-58746403
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21642/consoleFull)
 for   PR 2388 at commit 
[`b0734b8`](https://github.com/apache/spark/commit/b0734b86ab95774aec79af55d9de48b363fe243b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class TopicModelingKryoRegistrator extends KryoRegistrator `






[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2388#issuecomment-58746407
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21642/Test 
PASSed.





[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2767#issuecomment-58746545
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21644/Test 
PASSed.





[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2767#issuecomment-58746544
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21644/consoleFull)
 for   PR 2767 at commit 
[`c78c944`](https://github.com/apache/spark/commit/c78c944ccf20bd53c685e2be72cc6622c8b8e7ff).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...

2014-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2766#issuecomment-58746606
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21643/Test 
PASSed.





[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...

2014-10-11 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2766#issuecomment-58746604
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21643/consoleFull)
 for   PR 2766 at commit 
[`2c7faa8`](https://github.com/apache/spark/commit/2c7faa8ca05820edd9936fdacc69e551059fc532).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.




