[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739434 Jenkins, retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739441 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21632/consoleFull) for PR 2761 at commit [`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2344#discussion_r18739656

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
       case a: BinaryArithmetic if a.right.dataType == StringType =>
         a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
+      // we should cast all timestamp/date/string compare into string compare
+      case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType == DateType =>
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType && p.right.dataType == StringType =>
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
       case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType == TimestampType =>
-        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
       case p: BinaryPredicate if p.left.dataType == TimestampType && p.right.dataType == StringType =>
-        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
+      case p: BinaryPredicate if p.left.dataType == TimestampType && p.right.dataType == DateType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType && p.right.dataType == TimestampType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
--- End diff --

OK... verified this behavior with Hive, I've no idea about this :(
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739472 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21632/consoleFull) for PR 2761 at commit [`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker` * `class SparkRunnerSettings(error: String => Unit) extends Settings(error)` * `trait ActorHelper extends Logging`
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739473 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21632/ Test FAILed.
[GitHub] spark pull request: [Docs] logNormalGraph missing partition parame...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2523#issuecomment-58739482 @elmalto It looks like GitHub says that this PR was opened from an unknown repository, which might explain why you're not able to update its code. If that's the case, could you close this PR and open a new one?
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739516 Oh, I didn't run scalastyle for yarn-alpha.
[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/2344#discussion_r18739665

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -220,20 +220,44 @@ trait HiveTypeCoercion {
       case a: BinaryArithmetic if a.right.dataType == StringType =>
         a.makeCopy(Array(a.left, Cast(a.right, DoubleType)))
+      // we should cast all timestamp/date/string compare into string compare
+      case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType == DateType =>
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType && p.right.dataType == StringType =>
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
       case p: BinaryPredicate if p.left.dataType == StringType && p.right.dataType == TimestampType =>
-        p.makeCopy(Array(Cast(p.left, TimestampType), p.right))
+        p.makeCopy(Array(p.left, Cast(p.right, StringType)))
       case p: BinaryPredicate if p.left.dataType == TimestampType && p.right.dataType == StringType =>
-        p.makeCopy(Array(p.left, Cast(p.right, TimestampType)))
+        p.makeCopy(Array(Cast(p.left, StringType), p.right))
+      case p: BinaryPredicate if p.left.dataType == TimestampType && p.right.dataType == DateType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
+      case p: BinaryPredicate if p.left.dataType == DateType && p.right.dataType == TimestampType =>
+        p.makeCopy(Array(Cast(p.left, StringType), Cast(p.right, StringType)))
--- End diff --

So Michael agreed to leave the whole ordering and comparing stuff in a separate PR :)
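The rules under discussion in the diff above coerce every mixed date/timestamp/string comparison down to a string comparison by casting one or both sides to `StringType`. A minimal standalone sketch of that coercion decision (the `DataType` objects and the `coerce` helper here are simplified stand-ins for illustration, not Catalyst's actual classes):

```scala
// Simplified stand-ins for Catalyst's type objects (hypothetical, for illustration only).
sealed trait DataType
case object StringType extends DataType
case object DateType extends DataType
case object TimestampType extends DataType

object CoercionSketch {
  // Mirrors the rule shape in the diff: any comparison mixing
  // String/Date/Timestamp is resolved by comparing both sides as strings.
  def coerce(left: DataType, right: DataType): (DataType, DataType) = (left, right) match {
    case (StringType, DateType) | (DateType, StringType)         => (StringType, StringType)
    case (StringType, TimestampType) | (TimestampType, StringType) => (StringType, StringType)
    case (DateType, TimestampType) | (TimestampType, DateType)   => (StringType, StringType)
    case other                                                   => other // untouched pairs
  }
}
```

In the real rule each match arm also rebuilds the `BinaryPredicate` with `makeCopy` and `Cast`; the sketch only captures which type pairs get coerced and to what.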
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739525 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21633/consoleFull) for PR 2761 at commit [`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739546 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21633/ Test FAILed.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739545 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21633/consoleFull) for PR 2761 at commit [`86c63e0`](https://github.com/apache/spark/commit/86c63e04c392b97a0b629e719bb42424992cffd1). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker` * `class SparkRunnerSettings(error: String => Unit) extends Settings(error)` * `trait ActorHelper extends Logging`
[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2574#issuecomment-58739664 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21624/consoleFull) for PR 2574 at commit [`4fee5a8`](https://github.com/apache/spark/commit/4fee5a8400e87f7bb33363194cc3039feb3dbed6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3719][CORE][UI]:complete/failed stages...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2574#issuecomment-58739666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21624/ Test PASSed.
[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...
Github user lirui-intel commented on the pull request: https://github.com/apache/spark/pull/2760#issuecomment-58739690 Looks great! I think it's very useful to have these async APIs in Java :-)
[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2675#issuecomment-58739685 @marmbrus This should be ready to go once Jenkins nods.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739709 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21634/consoleFull) for PR 2761 at commit [`64b2c46`](https://github.com/apache/spark/commit/64b2c46474a48fc0906f140edf310c46eb63). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2675#issuecomment-58739744 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21626/consoleFull) for PR 2675 at commit [`1c384b7`](https://github.com/apache/spark/commit/1c384b7bc8b0b8d5b9b6bf294f399de5bb8a9976). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3809][SQL] Fixes test suites in hive-th...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2675#issuecomment-58739745 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21626/ Test PASSed.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739778 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21634/ Test FAILed.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58739777 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21634/consoleFull) for PR 2761 at commit [`64b2c46`](https://github.com/apache/spark/commit/64b2c46474a48fc0906f140edf310c46eb63). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker` * `class SparkRunnerSettings(error: String => Unit) extends Settings(error)` * `trait ActorHelper extends Logging`
[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2570#issuecomment-58739817 test this please.
[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2570#issuecomment-58739815 Seems the failure is not related to this PR.
[GitHub] spark pull request: [SPARK-3867] ./python/run-tests failed when it...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2759#issuecomment-58739867 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21625/consoleFull) for PR 2759 at commit [`f068eb5`](https://github.com/apache/spark/commit/f068eb508c7f0e6991d296f4473eb754c7d5090f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3867] ./python/run-tests failed when it...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2759#issuecomment-58739870 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21625/ Test PASSed.
[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2570#issuecomment-58739903 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21635/consoleFull) for PR 2570 at commit [`3774bd4`](https://github.com/apache/spark/commit/3774bd4617cb4dec3f78a08bdf42653b682102fd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58740083 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21636/consoleFull) for PR 2761 at commit [`d80d71a`](https://github.com/apache/spark/commit/d80d71abc4cf3d85a2585729719b35a5eca84551). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2760#issuecomment-58740225 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21628/consoleFull) for PR 2760 at commit [`ff28e49`](https://github.com/apache/spark/commit/ff28e49d990577635fa148bd57461a387bd3466d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaFutureActionWrapper[S, T](futureAction: FutureAction[S], converter: S => T)`
[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2760#issuecomment-58740227 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21628/ Test PASSed.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/2762 [SPARK-3904] [SQL] add constant objectinspector support for udfs In HQL, we convert all of the data types into normal `ObjectInspector`s for UDFs. In most cases this works; however, some UDFs actually require the input `ObjectInspector` to be a `ConstantObjectInspector`, which causes an exception, e.g. select named_struct(x, str) from src limit 1; You can merge this pull request into a Git repository by running: $ git pull https://github.com/chenghao-intel/spark udf_coi Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2762.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2762 commit 06581e31aaef055c89a0d89ddaac657a9609d571 Author: Cheng Hao hao.ch...@intel.com Date: 2014-10-11T06:34:24Z add constant objectinspector support for udfs
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58740431 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58740447 test this please.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58740444 test this please.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58740484 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21637/consoleFull) for PR 2762 at commit [`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2760#issuecomment-58740594 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21630/consoleFull) for PR 2760 at commit [`6f8f6ac`](https://github.com/apache/spark/commit/6f8f6ac668d74a3164bcf037f09c8353134b53f6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class JavaFutureActionWrapper[S, T](futureAction: FutureAction[S], converter: S => T)`
[GitHub] spark pull request: [SPARK-3902] Stabilize AsynRDDActions and add ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2760#issuecomment-58740597 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21630/ Test PASSed.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58740581 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21638/consoleFull) for PR 2762 at commit [`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58740676 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/350/consoleFull) for PR 2538 at commit [`6db00da`](https://github.com/apache/spark/commit/6db00da9595e38eccff7bfb5683b32cee3ac6263). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class StreamingContext(object):` * `class DStream(object):` * `class TransformedDStream(DStream):` * `class TransformFunction(object):` * `class TransformFunctionSerializer(object):`
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user tianyi commented on a diff in the pull request: https://github.com/apache/spark/pull/2762#discussion_r18739877
--- Diff: sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveCompatibilitySuite.scala ---
@@ -578,6 +578,7 @@ class HiveCompatibilitySuite extends HiveQueryFileTest with BeforeAndAfter {
     "multi_join_union",
     "multiMapJoin1",
     "multiMapJoin2",
+    "udf_named_struct",
--- End diff --
I think you should put it after udf_month
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58740718 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21631/ Test PASSed.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58740717 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21631/consoleFull) for PR 2538 at commit [`64561e4`](https://github.com/apache/spark/commit/64561e4e503eafb958f6769383ba3b37edbe5fa2). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class StreamingContext(object):` * `class DStream(object):` * `class TransformedDStream(DStream):` * `class TransformFunction(object):` * `class TransformFunctionSerializer(object):`
[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2570#issuecomment-58740765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21635/ Test PASSed.
[GitHub] spark pull request: [SPARK-3343] [SQL] Add serde support for CTAS
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2570#issuecomment-58740758 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21635/consoleFull) for PR 2570 at commit [`3774bd4`](https://github.com/apache/spark/commit/3774bd4617cb4dec3f78a08bdf42653b682102fd). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class CreateTableAsSelect[T](` * `logDebug(s"Found class for $serdeName")`
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user viper-kun commented on a diff in the pull request: https://github.com/apache/spark/pull/2471#discussion_r18739911
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis
       }
     }
-    val newIterator = logInfos.iterator.buffered
-    val oldIterator = applications.values.iterator.buffered
-    while (newIterator.hasNext && oldIterator.hasNext) {
-      if (newIterator.head.endTime > oldIterator.head.endTime) {
-        addIfAbsent(newIterator.next)
-      } else {
-        addIfAbsent(oldIterator.next)
+    applications.synchronized {
--- End diff --
I think the two tasks need to never run concurrently. If the order is:
1. the check task gets applications
2. the clean task gets applications
3. the clean task gets its result and replaces applications
4. the check task gets its result and replaces applications
then the clean task's result is overwritten by the check task's result. Using a ScheduledExecutorService with a single worker thread is a good way to solve it.
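The single-worker scheduling viper-kun suggests can be sketched in isolation. This is a hedged illustration, not the FsHistoryProvider code: the task bodies, the map contents, and the `demo` helper are made up, and a real provider would use `scheduleWithFixedDelay` for periodic runs. The point is only that one worker thread serializes the two tasks, so the lost-update interleaving described above cannot happen.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object SingleThreadScheduling {
  // One worker thread: the check task and the clean task run strictly
  // one after the other, never interleaved.
  def demo(): String = {
    val pool = Executors.newScheduledThreadPool(1)
    var applications = Map("app-1" -> "stale")

    val checkTask = new Runnable {
      def run(): Unit = applications += ("app-1" -> "updated") // read-modify-write
    }
    val cleanTask = new Runnable {
      def run(): Unit = applications = applications.filter { case (_, v) => v.nonEmpty }
    }

    // One-shot scheduling here for determinism; a real provider would use
    // pool.scheduleWithFixedDelay(checkTask, 0, interval, TimeUnit.SECONDS).
    pool.schedule(checkTask, 0, TimeUnit.MILLISECONDS)
    pool.schedule(cleanTask, 0, TimeUnit.MILLISECONDS)
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
    applications("app-1")
  }

  def main(args: Array[String]): Unit = println(demo()) // updated
}
```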
[GitHub] spark pull request: [SPARK-3888] [PySpark] limit the memory used b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2743#issuecomment-58740824 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/351/consoleFull) for PR 2743 at commit [`623c8a7`](https://github.com/apache/spark/commit/623c8a76c2e91bd4f80193a0d7c4813d1cb3bc7a). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58741122 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21637/consoleFull) for PR 2762 at commit [`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58741124 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21637/ Test FAILed.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58741172 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/349/consoleFull)** for PR 2538 at commit [`6db00da`](https://github.com/apache/spark/commit/6db00da9595e38eccff7bfb5683b32cee3ac6263) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58741206 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21627/consoleFull)** for PR 2538 at commit [`331ecce`](https://github.com/apache/spark/commit/331ecced6f61ad5183da5830f94f584bcc74e479) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-2377] Python API for Streaming
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2538#issuecomment-58741207 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21627/ Test FAILed.
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58741232 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21638/consoleFull) for PR 2762 at commit [`06581e3`](https://github.com/apache/spark/commit/06581e31aaef055c89a0d89ddaac657a9609d571). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `protected case class Keyword(str: String)` * `class SqlLexical(val keywords: Seq[String]) extends StdLexical` * `case class FloatLit(chars: String) extends Token` * `class SqlParser extends AbstractSparkSQLParser` * `case class SetCommand(kv: Option[(String, Option[String])]) extends Command` * `case class ShellCommand(cmd: String) extends Command` * `case class SourceCommand(filePath: String) extends Command` * `case class SetCommand(kv: Option[(String, Option[String])], output: Seq[Attribute])(`
[GitHub] spark pull request: [SPARK-3904] [SQL] add constant objectinspecto...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2762#issuecomment-58741234 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21638/ Test FAILed.
[GitHub] spark pull request: The keys for sorting the columns of Executor p...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2763
The keys for sorting the columns of the Executor page, Stage page and Storage page are incorrect.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/witgo/spark SPARK-3905
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2763.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2763
commit 17d79904dc80e960145db216d2de9ab8884458dd
Author: GuoQiang Li wi...@qq.com
Date: 2014-10-11T07:11:58Z
The keys for sorting the columns of the Executor page, Stage page and Storage page are incorrect
[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2763#issuecomment-58741803 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2763#issuecomment-58741865 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21639/consoleFull) for PR 2763 at commit [`17d7990`](https://github.com/apache/spark/commit/17d79904dc80e960145db216d2de9ab8884458dd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58741924 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21636/consoleFull) for PR 2761 at commit [`d80d71a`](https://github.com/apache/spark/commit/d80d71abc4cf3d85a2585729719b35a5eca84551). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SparkSpaceBeforeLeftBraceChecker extends ScalariformChecker` * `class SparkRunnerSettings(error: String => Unit) extends Settings(error)` * `trait ActorHelper extends Logging`
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58741926 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21636/ Test PASSed.
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/2764
[SQL] Refactors data type pattern matching
Refactors/adds extractors for `DataType` and `Binary*` types to ease and simplify data type related (nested) pattern matching.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/liancheng/spark datatype-patmat
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2764.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2764
commit f391be51ee91da4c12146c90aad9f63d06f0ac34
Author: Cheng Lian lian.cs@gmail.com
Date: 2014-10-11T07:37:25Z
Refactors data type pattern matching
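As background on the technique this PR uses, a custom extractor is just an object with an `unapply` method. The sketch below is a stand-alone toy: the type names only mimic Catalyst's, and the real extractors in the PR differ in detail. It shows how one extractor can stand in for a whole family of types inside a pattern match.

```scala
// Toy stand-in for a data type hierarchy (names mimic Catalyst's, but
// this is not the real org.apache.spark.sql.catalyst code).
sealed trait DataType
case object IntType extends DataType
case object DoubleType extends DataType
case object StringType extends DataType

// Extractor: `case Numeric(t)` matches any numeric type, so callers
// don't have to enumerate IntType | DoubleType at every use site.
object Numeric {
  def unapply(t: DataType): Option[DataType] = t match {
    case IntType | DoubleType => Some(t)
    case _                    => None
  }
}

object ExtractorDemo {
  def describe(t: DataType): String = t match {
    case Numeric(_) => "numeric"
    case _          => "non-numeric"
  }

  def main(args: Array[String]): Unit = {
    println(describe(DoubleType)) // numeric
    println(describe(StringType)) // non-numeric
  }
}
```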
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2764#issuecomment-58742036 Can one of the admins verify this patch?
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2764#issuecomment-58742085 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21640/consoleFull) for PR 2764 at commit [`f391be5`](https://github.com/apache/spark/commit/f391be51ee91da4c12146c90aad9f63d06f0ac34). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user viper-kun commented on a diff in the pull request: https://github.com/apache/spark/pull/2471#discussion_r18740124
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis
       }
     }
-    val newIterator = logInfos.iterator.buffered
-    val oldIterator = applications.values.iterator.buffered
-    while (newIterator.hasNext && oldIterator.hasNext) {
-      if (newIterator.head.endTime > oldIterator.head.endTime) {
-        addIfAbsent(newIterator.next)
-      } else {
-        addIfAbsent(oldIterator.next)
+    applications.synchronized {
+      val newIterator = logInfos.iterator.buffered
+      val oldIterator = applications.values.iterator.buffered
+      while (newIterator.hasNext && oldIterator.hasNext) {
+        if (newIterator.head.endTime > oldIterator.head.endTime) {
+          addIfAbsent(newIterator.next)
+        } else {
+          addIfAbsent(oldIterator.next)
+        }
       }
+      newIterator.foreach(addIfAbsent)
+      oldIterator.foreach(addIfAbsent)
+
+      applications = newApps
     }
-    newIterator.foreach(addIfAbsent)
-    oldIterator.foreach(addIfAbsent)
+      }
+    } catch {
+      case t: Throwable => logError("Exception in checking for event log updates", t)
+    }
+  }
+
+  /**
+   * Deleting apps if setting cleaner.
+   */
+  private def cleanLogs() = {
+    lastLogCleanTimeMs = getMonotonicTimeMs()
+    logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs))
+    try {
+      val logStatus = fs.listStatus(new Path(resolvedLogDir))
+      val logDirs = if (logStatus != null) logStatus.filter(_.isDir).toSeq else Seq[FileStatus]()
+      val maxAge = conf.getLong("spark.history.fs.maxAge.seconds",
+        DEFAULT_SPARK_HISTORY_FS_MAXAGE_S) * 1000
+
+      val now = System.currentTimeMillis()
+      fs.synchronized {
+        // scan all logs from the log directory.
+        // Only directories older than this many seconds will be deleted.
+        logDirs.foreach { dir =>
+          // history files older than this many seconds will be deleted
+          // when the history cleaner runs.
+          if (now - getModificationTime(dir) > maxAge) {
+            fs.delete(dir.getPath, true)
+          }
+        }
+      }
+
+      val newApps = new mutable.LinkedHashMap[String, FsApplicationHistoryInfo]()
+      def addIfNotExpire(info: FsApplicationHistoryInfo) = {
+        if (now - info.lastUpdated <= maxAge) {
+          newApps += (info.id -> info)
--- End diff --
info.lastUpdated is the timestamp of the directory, and it is always greater than the timestamps of the files inside it.
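The retention test in the patch (keep an app only when `now - info.lastUpdated <= maxAge`) reduces to a simple age filter. A minimal self-contained sketch, using a placeholder `AppInfo` type rather than the real FsApplicationHistoryInfo:

```scala
object LogRetention {
  // Placeholder for FsApplicationHistoryInfo: just an id and a timestamp.
  case class AppInfo(id: String, lastUpdated: Long)

  // Keep only entries updated within maxAgeMs of `now`, keyed by id,
  // mirroring the addIfNotExpire check in the patch.
  def retain(apps: Seq[AppInfo], now: Long, maxAgeMs: Long): Map[String, AppInfo] =
    apps.filter(info => now - info.lastUpdated <= maxAgeMs)
        .map(info => info.id -> info)
        .toMap

  def main(args: Array[String]): Unit = {
    val apps = Seq(AppInfo("old", 0L), AppInfo("fresh", 9000L))
    // "old" is 10000 ms stale (> 5000), so only "fresh" survives.
    println(retain(apps, now = 10000L, maxAgeMs = 5000L).keySet) // Set(fresh)
  }
}
```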
[GitHub] spark pull request: [SPARK-3854] Scala style: require spaces befor...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2761#issuecomment-58742157 I quite like standardizing style, but doesn't this have the same problem mentioned before, that it's going to break a lot of potential merge commits? If it's bite-the-bullet time, there are other micro changes that may actually have a little positive impact on execution that might be good to get in too.
[GitHub] spark pull request: [SPARK-3888] [PySpark] limit the memory used b...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2743#issuecomment-58742243 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/351/consoleFull) for PR 2743 at commit [`623c8a7`](https://github.com/apache/spark/commit/623c8a76c2e91bd4f80193a0d7c4813d1cb3bc7a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user viper-kun commented on a diff in the pull request: https://github.com/apache/spark/pull/2471#discussion_r18740169
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis
       }
     }
-    val newIterator = logInfos.iterator.buffered
-    val oldIterator = applications.values.iterator.buffered
-    while (newIterator.hasNext && oldIterator.hasNext) {
-      if (newIterator.head.endTime > oldIterator.head.endTime) {
-        addIfAbsent(newIterator.next)
-      } else {
-        addIfAbsent(oldIterator.next)
+    applications.synchronized {
+      val newIterator = logInfos.iterator.buffered
+      val oldIterator = applications.values.iterator.buffered
+      while (newIterator.hasNext && oldIterator.hasNext) {
+        if (newIterator.head.endTime > oldIterator.head.endTime) {
+          addIfAbsent(newIterator.next)
+        } else {
+          addIfAbsent(oldIterator.next)
+        }
       }
+      newIterator.foreach(addIfAbsent)
+      oldIterator.foreach(addIfAbsent)
+
+      applications = newApps
     }
-    newIterator.foreach(addIfAbsent)
-    oldIterator.foreach(addIfAbsent)
+      }
+    } catch {
+      case t: Throwable => logError("Exception in checking for event log updates", t)
+    }
+  }
+
+  /**
+   * Deleting apps if setting cleaner.
+   */
+  private def cleanLogs() = {
+    lastLogCleanTimeMs = getMonotonicTimeMs()
+    logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs))
+    try {
+      val logStatus = fs.listStatus(new Path(resolvedLogDir))
+      val logDirs = if (logStatus != null) logStatus.filter(_.isDir).toSeq else Seq[FileStatus]()
+      val maxAge = conf.getLong("spark.history.fs.maxAge.seconds",
+        DEFAULT_SPARK_HISTORY_FS_MAXAGE_S) * 1000
+
+      val now = System.currentTimeMillis()
+      fs.synchronized {
+        // scan all logs from the log directory.
+        // Only directories older than this many seconds will be deleted.
+        logDirs.foreach { dir =>
+          // history files older than this many seconds will be deleted
+          // when the history cleaner runs.
+          if (now - getModificationTime(dir) > maxAge) {
+            fs.delete(dir.getPath, true)
--- End diff --
Can you tell me the exact reason for adding a try..catch around fs.delete? I think the exception would be caught by the try..catch at line 271.
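The design question in this exchange is where the catch lives: relying only on the outer try..catch means the whole cleanup scan aborts on the first failed delete, while a per-directory catch logs the failure and keeps going. A hedged sketch of that difference, with a made-up `deleteAll` helper and a fake delete function rather than the real Hadoop FileSystem calls:

```scala
object PerItemCatch {
  // Try to delete each directory; the per-item catch means one failing
  // delete does not abort the remaining ones.
  def deleteAll(dirs: Seq[String], delete: String => Unit): List[String] = {
    val deleted = scala.collection.mutable.ListBuffer.empty[String]
    dirs.foreach { d =>
      try {
        delete(d)
        deleted += d
      } catch {
        case _: Exception => () // a real cleaner would logError and continue
      }
    }
    deleted.toList
  }

  def main(args: Array[String]): Unit = {
    def fakeDelete(d: String): Unit =
      if (d == "bad") throw new RuntimeException(s"cannot delete $d")
    // "bad" fails, but "c" is still processed afterwards.
    println(deleteAll(Seq("a", "bad", "c"), fakeDelete)) // List(a, c)
  }
}
```

With only an outer catch, the exception thrown on "bad" would skip "c" entirely, which is the behavior trade-off being questioned above.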
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2764#issuecomment-58743015 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21640/ Test PASSed.
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2764#issuecomment-58743014 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21640/consoleFull) for PR 2764 at commit [`f391be5`](https://github.com/apache/spark/commit/f391be51ee91da4c12146c90aad9f63d06f0ac34). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class InSet(value: Expression, hset: HashSet[Any], child: Seq[Expression])`
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
GitHub user wangxiaojing opened a pull request: https://github.com/apache/spark/pull/2765 [spark-3586][streaming]Support nested directories in Spark Streaming For text files, Spark Streaming uses streamingContext.textFileStream(dataDirectory). This improvement lets Spark Streaming also monitor subdirectories of dataDirectory and process any files created in them. E.g. with streamingContext.textFileStream("/test") and directory contents /test/file1, /test/file2, /test/dr/file1: if the directory /test/dr/ receives a new file file2, Spark Streaming can process that file. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangxiaojing/spark spark-3586 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2765.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2765 commit 98ead547f90520819b421b0f4436bfe7d8a3d4f4 Author: wangxiaojing u9j...@gmail.com Date: 2014-10-11T08:22:31Z Support nested directories in Spark Streaming
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-58743235 Can one of the admins verify this patch?
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-58743237 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...
Github user james64 commented on the pull request: https://github.com/apache/spark/pull/2712#issuecomment-58743254 Could the Flume test have failed due to upstream changes? It is passing for me locally now.
[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2763#issuecomment-58743282 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21639/consoleFull) for PR 2763 at commit [`17d7990`](https://github.com/apache/spark/commit/17d79904dc80e960145db216d2de9ab8884458dd). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3905][Web UI]The keys for sorting the c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2763#issuecomment-58743286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21639/ Test PASSed.
[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2712#issuecomment-58743306 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21641/consoleFull) for PR 2712 at commit [`1b20d51`](https://github.com/apache/spark/commit/1b20d5193fa149347f9c8c05bb25298992324d4a). * This patch merges cleanly.
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740431 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas def accept(path: Path): Boolean = { try { +if (fs.getFileStatus(path).isDirectory()){ --- End diff -- Nit: space before brace
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740429 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -230,6 +246,10 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas if (minNewFileModTime < 0 || modTime < minNewFileModTime) { minNewFileModTime = modTime } +if(path.getName().startsWith("_")){ + System.out.println("startsWith: " + path.getName()) --- End diff -- Remove this System.out
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740430 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas def accept(path: Path): Boolean = { try { +if (fs.getFileStatus(path).isDirectory()){ + return false +} if (!filter(path)) { // Reject file if it does not satisfy filter logDebug("Rejected by filter " + path) return false --- End diff -- You don't need `return` anywhere
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740436 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas true } } + + private[streaming] + class SubPathFilter extends PathFilter { + +def accept(path: Path): Boolean = { + try { +if(fs.getFileStatus(path).isDirectory()){ + val modTime = getFileModTime(path) + logDebug("Mod time for " + path + " is " + modTime) + if (modTime <= ignoreTime) { +// Reject file if it was created before the ignore time (or, before last interval) +logDebug("Mod time " + modTime + " less than ignore time " + ignoreTime) +return false + } + return true +} + } catch { +case fnfe: java.io.FileNotFoundException => --- End diff -- Why not import this, and what about the more general `IOException`?
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740443 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas true } } + + private[streaming] + class SubPathFilter extends PathFilter { + +def accept(path: Path): Boolean = { + try { +if(fs.getFileStatus(path).isDirectory()){ + val modTime = getFileModTime(path) + logDebug("Mod time for " + path + " is " + modTime) --- End diff -- Nit: you can use string interpolation to make it a little simpler
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740446 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -240,6 +260,31 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas true } } + + private[streaming] + class SubPathFilter extends PathFilter { + +def accept(path: Path): Boolean = { + try { +if(fs.getFileStatus(path).isDirectory()){ + val modTime = getFileModTime(path) + logDebug("Mod time for " + path + " is " + modTime) + if (modTime <= ignoreTime) { +// Reject file if it was created before the ignore time (or, before last interval) +logDebug("Mod time " + modTime + " less than ignore time " + ignoreTime) --- End diff -- Log message is inconsistent with the conditional
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740452 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -118,6 +119,18 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas (newFiles, filter.minNewFileModTime) } + def getPathList( path:Path, fs:FileSystem):List[Path]={ +val filter = new SubPathFilter() +var pathList = List[Path]() +fs.listStatus(path,filter).map(x => { + if(x.isDirectory()){ --- End diff -- Doesn't this only list immediate subdirectories?
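For contrast with the single-level listing questioned above, a fully recursive walk is short. This is a hedged sketch using plain `java.io.File` rather than Hadoop's `FileSystem` API; the object and method names are illustrative, not from the PR:

```scala
import java.io.File

// Recursively collect all regular files under a root directory,
// which is what a truly "nested directories" source would require.
object RecursiveListing {
  def listFilesRecursively(root: File): List[File] = {
    val entries = Option(root.listFiles()).map(_.toList).getOrElse(Nil)
    val (dirs, files) = entries.partition(_.isDirectory)
    files ++ dirs.flatMap(listFilesRecursively)
  }
}
```

The cost concern raised later in the thread is real: each extra directory level adds a listing call per batch interval, so deep trees make every scan more expensive.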
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2471#discussion_r18740457 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis } } -val newIterator = logInfos.iterator.buffered -val oldIterator = applications.values.iterator.buffered -while (newIterator.hasNext && oldIterator.hasNext) { - if (newIterator.head.endTime > oldIterator.head.endTime) { -addIfAbsent(newIterator.next) - } else { -addIfAbsent(oldIterator.next) +applications.synchronized { + val newIterator = logInfos.iterator.buffered + val oldIterator = applications.values.iterator.buffered + while (newIterator.hasNext && oldIterator.hasNext) { +if (newIterator.head.endTime > oldIterator.head.endTime) { + addIfAbsent(newIterator.next) +} else { + addIfAbsent(oldIterator.next) +} } + newIterator.foreach(addIfAbsent) + oldIterator.foreach(addIfAbsent) + + applications = newApps } -newIterator.foreach(addIfAbsent) -oldIterator.foreach(addIfAbsent) + } +} catch { + case t: Throwable => logError("Exception in checking for event log updates", t) +} + } + + /** + * Delete applications if the cleaner is configured. + */ + private def cleanLogs() = { +lastLogCleanTimeMs = getMonotonicTimeMs() +logDebug("Cleaning logs. Time is now %d.".format(lastLogCleanTimeMs)) --- End diff -- Nit: string interpolation is probably clearer: `s"Cleaning ... now $lastLogCleanTimeMs"`
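On the interpolation nit above: the `s` interpolator produces the same string as a `%d` format call while keeping the value inline. A tiny self-contained comparison (the object name and values are made up for illustration):

```scala
// Both methods return the same string; the interpolated form avoids
// positional %d placeholders entirely.
object InterpolationDemo {
  def withFormat(t: Long): String = "Cleaning logs. Time is now %d.".format(t)
  def withInterpolation(t: Long): String = s"Cleaning logs. Time is now $t."
}
```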
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user wangxiaojing commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740834 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -118,6 +119,18 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas (newFiles, filter.minNewFileModTime) } + def getPathList( path:Path, fs:FileSystem):List[Path]={ +val filter = new SubPathFilter() +var pathList = List[Path]() +fs.listStatus(path,filter).map(x => { + if(x.isDirectory()){ --- End diff -- Yes, this only supports immediate subdirectories; recursing into all nested directories would make the processing time too long.
[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2712#issuecomment-58744782 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21641/consoleFull) for PR 2712 at commit [`1b20d51`](https://github.com/apache/spark/commit/1b20d5193fa149347f9c8c05bb25298992324d4a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3121] Wrong implementation of implicit ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2712#issuecomment-58744783 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21641/ Test PASSed.
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user wangxiaojing commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740849 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas def accept(path: Path): Boolean = { try { +if (fs.getFileStatus(path).isDirectory()){ + return false +} if (!filter(path)) { // Reject file if it does not satisfy filter logDebug("Rejected by filter " + path) return false --- End diff -- Why? If the path is a directory, it should not be considered.
[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-58744928 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21642/consoleFull) for PR 2388 at commit [`b0734b8`](https://github.com/apache/spark/commit/b0734b86ab95774aec79af55d9de48b363fe243b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/2388#discussion_r18740868 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/TopicModeling.scala --- @@ -0,0 +1,674 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.mllib.feature + +import java.util.Random + +import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV, sum => brzSum} + +import org.apache.spark.Logging +import org.apache.spark.SparkContext._ +import org.apache.spark.annotation.Experimental +import org.apache.spark.broadcast.Broadcast +import org.apache.spark.graphx._ +import org.apache.spark.mllib.linalg.distributed.{MatrixEntry, RowMatrix} +import org.apache.spark.mllib.linalg.{DenseVector => SDV, SparseVector => SSV, Vector => SV} +import org.apache.spark.rdd.RDD +import org.apache.spark.serializer.KryoRegistrator +import org.apache.spark.storage.StorageLevel + +import TopicModeling._ + +class TopicModeling private[mllib]( + @transient var corpus: Graph[VD, ED], + val numTopics: Int, + val numTerms: Int, + val alpha: Double, + val beta: Double, + @transient val storageLevel: StorageLevel) + extends Serializable with Logging { + + def this(docs: RDD[(TopicModeling.DocId, SSV)], +numTopics: Int, +alpha: Double, +beta: Double, +storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK, +computedModel: Broadcast[TopicModel] = null) { +this(initializeCorpus(docs, numTopics, storageLevel, computedModel), + numTopics, docs.first()._2.size, alpha, beta, storageLevel) + } + + + /** + * The number of documents in the corpus + */ + val numDocs = docVertices.count() + + /** + * The number of terms in the corpus + */ + private val sumTerms = corpus.edges.map(e => e.attr.size.toDouble).sum().toLong + + /** + * The total counts for each topic + */ + @transient private var globalTopicCounter: BV[Count] = collectGlobalCounter(corpus, numTopics) + assert(brzSum(globalTopicCounter) == sumTerms) + @transient private val sc = corpus.vertices.context + @transient private val seed = new Random().nextInt() + @transient private var innerIter = 1 + @transient private var cachedEdges: EdgeRDD[ED, VD] = null + @transient private var cachedVertices: VertexRDD[VD] = null + + private def
termVertices = corpus.vertices.filter(t => t._1 >= 0) + + private def docVertices = corpus.vertices.filter(t => t._1 < 0) + + private def gibbsSampling(cachedEdges: EdgeRDD[ED, VD], +cachedVertices: VertexRDD[VD]): (EdgeRDD[ED, VD], VertexRDD[VD]) = { + +val corpusTopicDist = collectTermTopicDist(corpus, globalTopicCounter, + sumTerms, numTerms, numTopics, alpha, beta) + +val corpusSampleTopics = sampleTopics(corpusTopicDist, globalTopicCounter, + sumTerms, innerIter + seed, numTerms, numTopics, alpha, beta) +corpusSampleTopics.edges.setName(s"edges-$innerIter").cache().count() +Option(cachedEdges).foreach(_.unpersist()) +val edges = corpusSampleTopics.edges + +corpus = updateCounter(corpusSampleTopics, numTopics) +corpus.vertices.setName(s"vertices-$innerIter").cache() +globalTopicCounter = collectGlobalCounter(corpus, numTopics) +assert(brzSum(globalTopicCounter) == sumTerms) +Option(cachedVertices).foreach(_.unpersist()) +val vertices = corpus.vertices + +if (innerIter % 10 == 0 && sc.getCheckpointDir.isDefined) { --- End diff -- This is only a temporary solution. The related PR is #2631
[GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2765#discussion_r18740883 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala --- @@ -207,6 +220,9 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas def accept(path: Path): Boolean = { try { +if (fs.getFileStatus(path).isDirectory()){ + return false +} if (!filter(path)) { // Reject file if it does not satisfy filter logDebug("Rejected by filter " + path) return false --- End diff -- I mean you can write `false`, not `return false`
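The point about dropping `return` is that in Scala the last expression of each branch is the method's value. A standalone illustration of the style being suggested, using a simplified predicate rather than the PR's actual filter (names here are hypothetical):

```scala
// Simplified stand-in for the accept() logic under review: reject
// directories and names failing a filter, using expressions, no `return`.
object AcceptDemo {
  def accept(isDirectory: Boolean, name: String, filter: String => Boolean): Boolean =
    if (isDirectory) false // directories are never accepted
    else filter(name)      // last expression is the result
}
```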
[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...
GitHub user cocoatomo opened a pull request: https://github.com/apache/spark/pull/2766 [SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and building warnings Sphinx documents contains a corrupted ReST format and have some warnings. The purpose of this issue is same as https://issues.apache.org/jira/browse/SPARK-3773. commit: 0e8203f4fb721158fb27897680da476174d24c4b output ``` $ cd ./python/docs $ make clean html rm -rf _build/* sphinx-build -b html -d _build/doctrees . _build/html Making output directory... Running Sphinx v1.2.3 loading pickled environment... not yet created building [html]: targets for 4 source files that are out of date updating environment: 4 added, 0 changed, 0 removed reading sources... [100%] pyspark.sql /Users/user/MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring of pyspark.mllib.feature.Word2VecModel.findSynonyms:4: WARNING: Field list ends without a blank line; unexpected unindent. /Users/user/MyRepos/Scala/spark/python/pyspark/mllib/feature.py:docstring of pyspark.mllib.feature.Word2VecModel.transform:3: WARNING: Field list ends without a blank line; unexpected unindent. /Users/user/MyRepos/Scala/spark/python/pyspark/sql.py:docstring of pyspark.sql:4: WARNING: Bullet list ends without a blank line; unexpected unindent. looking for now-outdated files... none found pickling environment... done checking consistency... done preparing documents... done writing output... [100%] pyspark.sql writing additional files... (12 module code pages) _modules/index search copying static files... WARNING: html_static_path entry u'/Users/user/MyRepos/Scala/spark/python/docs/_static' does not exist done copying extra files... done dumping search index... done dumping object inventory... done build succeeded, 4 warnings. Build finished. The HTML pages are in _build/html. 
``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/cocoatomo/spark issues/3909-sphinx-build-warnings Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2766.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2766 commit 2c7faa8ca05820edd9936fdacc69e551059fc532 Author: cocoatomo cocoatom...@gmail.com Date: 2014-10-11T10:20:24Z [SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and building warnings
[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2766#issuecomment-58745151 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21643/consoleFull) for PR 2766 at commit [`2c7faa8`](https://github.com/apache/spark/commit/2c7faa8ca05820edd9936fdacc69e551059fc532). * This patch merges cleanly.
[GitHub] spark pull request: [SQL] Refactors data type pattern matching
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2764#discussion_r18740906

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---

```
@@ -107,20 +107,20 @@ trait HiveTypeCoercion {
     case e if !e.childrenResolved => e

     /* Double Conversions */
-    case b: BinaryExpression if b.left == stringNaN && b.right.dataType == DoubleType =>
-      b.makeCopy(Array(b.right, Literal(Double.NaN)))
-    case b: BinaryExpression if b.left.dataType == DoubleType && b.right == stringNaN =>
-      b.makeCopy(Array(Literal(Double.NaN), b.left))
-    case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-      b.makeCopy(Array(Literal(Double.NaN), b.left))
+    case b @ BinaryExpression(StringNaN, DoubleType(r)) =>
+      b.makeCopy(Array(r, Literal(Double.NaN)))
+    case b @ BinaryExpression(DoubleType(l), StringNaN) =>
+      b.makeCopy(Array(Literal(Double.NaN), l))
+    case b @ BinaryExpression(l @ StringNaN, StringNaN) =>
+      b.makeCopy(Array(Literal(Double.NaN), l))

     /* Float Conversions */
-    case b: BinaryExpression if b.left == stringNaN && b.right.dataType == FloatType =>
-      b.makeCopy(Array(b.right, Literal(Float.NaN)))
-    case b: BinaryExpression if b.left.dataType == FloatType && b.right == stringNaN =>
-      b.makeCopy(Array(Literal(Float.NaN), b.left))
-    case b: BinaryExpression if b.left == stringNaN && b.right == stringNaN =>
-      b.makeCopy(Array(Literal(Float.NaN), b.left))
+    case b @ BinaryExpression(StringNaN, FloatType(r)) =>
+      b.makeCopy(Array(r, Literal(Float.NaN)))
+    case b @ BinaryExpression(FloatType(l), StringNaN) =>
+      b.makeCopy(Array(Literal(Float.NaN), l))
+    case b @ BinaryExpression(l @ StringNaN, StringNaN) =>
+      b.makeCopy(Array(Literal(Float.NaN), l))
```

--- End diff --

This case branch can never be reached, since line 114 supersedes it. As a result, NaN is always `Double`; is this a bug?
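The refactor in the diff above replaces guard-based matches (`case b: BinaryExpression if b.left.dataType == DoubleType ...`) with extractor patterns such as `DoubleType(l)`. A minimal, hedged sketch of how such a type-matching extractor can be written in plain Scala, using simplified stand-in types rather than Catalyst's actual `Expression`/`DataType` hierarchy:

```scala
// Simplified stand-ins for illustration only; Catalyst's real classes differ.
sealed trait DataType
case object DoubleT extends DataType
case object StringT extends DataType
final case class Expr(value: String, dataType: DataType)

// An extractor that matches an expression only when it carries the expected
// type, binding the expression itself -- the shape used as `DoubleType(l)`
// in the diff above.
object DoubleTyped {
  def unapply(e: Expr): Option[Expr] =
    if (e.dataType == DoubleT) Some(e) else None
}

def describe(e: Expr): String = e match {
  case DoubleTyped(d) => s"double-typed: ${d.value}"
  case _              => "other"
}
```

Compared with a guard, the extractor both tests the type and binds the operand in one pattern, which is what makes the rewritten case branches shorter.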
[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/2767

[SPARK-3906][SQL] Adds multiple join support for SQLContext

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark multi-join

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2767.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2767

commit c78c944ccf20bd53c685e2be72cc6622c8b8e7ff
Author: Cheng Lian lian.cs@gmail.com
Date: 2014-10-11T10:00:44Z

Adds multiple join support for SQLContext
[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2767#issuecomment-58745472 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2767#issuecomment-58745535 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21644/consoleFull) for PR 2767 at commit [`c78c944`](https://github.com/apache/spark/commit/c78c944ccf20bd53c685e2be72cc6622c8b8e7ff). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs
Github user viper-kun commented on a diff in the pull request: https://github.com/apache/spark/pull/2471#discussion_r18741084

--- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---

```
@@ -195,22 +241,68 @@ private[history] class FsHistoryProvider(conf: SparkConf) extends ApplicationHis
         }
       }

-    val newIterator = logInfos.iterator.buffered
-    val oldIterator = applications.values.iterator.buffered
-    while (newIterator.hasNext && oldIterator.hasNext) {
-      if (newIterator.head.endTime > oldIterator.head.endTime) {
-        addIfAbsent(newIterator.next)
-      } else {
-        addIfAbsent(oldIterator.next)
+    applications.synchronized {
+      val newIterator = logInfos.iterator.buffered
+      val oldIterator = applications.values.iterator.buffered
+      while (newIterator.hasNext && oldIterator.hasNext) {
+        if (newIterator.head.endTime > oldIterator.head.endTime) {
+          addIfAbsent(newIterator.next)
+        } else {
+          addIfAbsent(oldIterator.next)
+        }
       }
+      newIterator.foreach(addIfAbsent)
+      oldIterator.foreach(addIfAbsent)
+
+      applications = newApps
     }
-    newIterator.foreach(addIfAbsent)
-    oldIterator.foreach(addIfAbsent)
+  }
+} catch {
+  case t: Throwable => logError("Exception in checking for event log updates", t)
```

--- End diff --

You mean: don't catch Throwable? What should we do instead?
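The review comment above questions catching `Throwable` outright. A common alternative, sketched here under the assumption that the intent is "log and keep the polling thread alive" (hypothetical method shape, not the actual FsHistoryProvider code), is `scala.util.control.NonFatal`:

```scala
import scala.util.control.NonFatal

// Hypothetical sketch, not the actual FsHistoryProvider code. NonFatal
// matches ordinary exceptions but deliberately does NOT match fatal
// throwables (VirtualMachineError such as OutOfMemoryError, ThreadDeath,
// InterruptedException, LinkageError, ControlThrowable), so those propagate
// instead of being swallowed by the log-and-continue handler.
def checkForLogs(scan: () => Unit): Option[String] =
  try {
    scan()
    None
  } catch {
    case NonFatal(t) =>
      Some(s"Exception in checking for event log updates: ${t.getMessage}")
  }
```

With a plain `case t: Throwable`, an `OutOfMemoryError` during the scan would be logged and discarded, leaving the JVM running in an undefined state; `NonFatal` avoids that.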
[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-58746403 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21642/consoleFull) for PR 2388 at commit [`b0734b8`](https://github.com/apache/spark/commit/b0734b86ab95774aec79af55d9de48b363fe243b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class TopicModelingKryoRegistrator extends KryoRegistrator `
[GitHub] spark pull request: [SPARK-1405][MLLIB] topic modeling on Graphx
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-58746407 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21642/
[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2767#issuecomment-58746545 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21644/
[GitHub] spark pull request: [SPARK-3906][SQL] Adds multiple join support f...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2767#issuecomment-58746544 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21644/consoleFull) for PR 2767 at commit [`c78c944`](https://github.com/apache/spark/commit/c78c944ccf20bd53c685e2be72cc6622c8b8e7ff). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2766#issuecomment-58746606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21643/
[GitHub] spark pull request: [SPARK-3909][PySpark][Doc] A corrupted format ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2766#issuecomment-58746604 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21643/consoleFull) for PR 2766 at commit [`2c7faa8`](https://github.com/apache/spark/commit/2c7faa8ca05820edd9936fdacc69e551059fc532). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.