[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48276050 LGTM, +1 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: SPARK-2400 : fix spark.yarn.max.executor.failu...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1282#issuecomment-48276380 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14639677 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- This is actually pretty tricky because you can add escape characters to %
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14639691 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- I think this is taken care of below with the if guard.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14639717 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- What about ``` LIKE abcd\\% ```
[GitHub] spark pull request: Updated programming-guide.md
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1324#issuecomment-48280471 Thanks. Merging this in master.
[GitHub] spark pull request: Updated programming-guide.md
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1324
[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1241#issuecomment-48280584 Merged build triggered.
[GitHub] spark pull request: Resolve sbt warnings during build
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1153#issuecomment-48280658 Thanks. Merging this in master and branch-1.0.
[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1241#issuecomment-48280591 Merged build started.
[GitHub] spark pull request: Resolve sbt warnings during build
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1153
[GitHub] spark pull request: Resolve sbt warnings during build
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1069#issuecomment-48280716 Can one of you give me a tl;dr on the verdict about this one? :)
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14639873 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- We are conservative in that case and the optimization won't apply (but we'll still give a correct answer). I'm happy to update the regex if you have a better one... (this one is taken from Hive... though AFAICT they don't check for `\` at the end at all).
[GitHub] spark pull request: Fix (some of the) warnings in the test suite
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48280759 I just merged #1153 and unfortunately this one is no longer mergeable. Will - mind updating it?
[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-48280910 Merged build triggered.
[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-48280916 Merged build started.
[GitHub] spark pull request: [RFC] Disable local execution of Spark jobs by...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1321#issuecomment-48280965 Maybe we should also solve the problem that local execution should not transfer the whole in-memory block (as a matter of fact, perhaps local execution should just bypass the in-memory data)?
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14640036 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- Actually this sounds good (it's just not a case to optimize for). Can we add a comment inline to explain that? i.e. This doesn't match cases like abcd\\%, but that doesn't affect correctness.
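Editor's sketch of the conservative behaviour discussed in the thread above: a pattern of the form "prefix%" is turned into a starts-with check, unless the prefix ends in a backslash that might be escaping the trailing %. This is a minimal illustration, not Spark's actual rule. The names `LikeSketch`, `Simplified`, `StartsWith` and `FullLike` are hypothetical, the quotes around the regex are assumed (the archived diff shows it unquoted), and the exact form of the guard is an assumption based on the "if guard" mentioned in the comments.

```scala
// Hypothetical stand-in for the optimizer rule; it works on plain strings
// instead of Catalyst expressions so that it is self-contained.
object LikeSketch {
  sealed trait Simplified
  final case class StartsWith(prefix: String) extends Simplified
  final case class FullLike(pattern: String) extends Simplified

  // The regex from the diff, with the string quotes assumed:
  // one or more characters that are neither _ nor %, followed by a single %.
  private val startsWith = "([^_%]+)%".r

  def simplify(pattern: String): Simplified = pattern match {
    // Guard: if the prefix ends in a backslash, the trailing % may be escaped
    // (or the backslash itself may be escaped). Either way we stay conservative
    // and keep the full LIKE evaluation, which is still correct.
    case startsWith(prefix) if !prefix.endsWith("\\") => StartsWith(prefix)
    case _ => FullLike(pattern)
  }
}
```

With this sketch, `LikeSketch.simplify("abcd%")` yields `StartsWith("abcd")`, while a pattern whose prefix ends in one or more backslashes, or one that contains `_`, falls through to `FullLike`. The missed optimization costs only speed, which matches the point made in the thread.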
[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1241#issuecomment-48281236 Sorry will take a look tomorrow!
[GitHub] spark pull request: [SPARK-2391][SQL] Custom take() for LIMIT quer...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1318#issuecomment-48281300 LGTM. Merging in master and branch-1.0.
[GitHub] spark pull request: [SPARK-2391][SQL] Custom take() for LIMIT quer...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1318
[GitHub] spark pull request: Resolve sbt warnings during build
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1069#issuecomment-48281479 @rxin It looks like Scala is requiring developers to be more explicit about their intention to use these features; these warnings become errors in Scala 2.11 actually. So my opinion is that use of the feature should not be enabled globally with a compiler flag, but enabled per compilation unit with an import. The latter is currently how the codebase works. Since this PR just swaps the latter for the former, I personally would argue that this not be merged. Actually, an earlier PR *also* added the compiler flag though. I think it should be removed and any extra warnings fixed. If there is any support for that, and no objections, I can raise a PR for that.
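Editor's sketch of the per-compilation-unit approach described above. The thread does not name the specific language features involved, so implicit conversions are used here purely as an assumed example; `FeatureImportSketch`, `Meters` and `describe` are hypothetical names. The import enables the feature for this file only, as opposed to a global compiler option such as `-language:implicitConversions`.

```scala
// Opt in to the feature for this compilation unit only.
// Without this import, scalac emits a feature warning for the implicit def below.
import scala.language.implicitConversions

object FeatureImportSketch {
  final case class Meters(value: Double)

  // An implicit conversion: the kind of feature that needs an explicit opt-in.
  implicit def doubleToMeters(d: Double): Meters = Meters(d)

  def describe(m: Meters): String = s"${m.value} m"

  // The Double literal is converted to Meters by doubleToMeters.
  val example: String = describe(2.5)
}
```

Keeping the opt-in next to the code that uses the feature means a new, unintended use elsewhere in the build still produces a warning, which is the concern raised about a global flag.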
[GitHub] spark pull request: [SPARK-2384] Add tooltips to UI.
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1314#issuecomment-48281532 You can exclude the licensing check by editing the .rat-excludes file. One thing I'm not sure about is whether it is worth it to introduce jquery and other libraries just for the sake of a tooltip. We can probably implement the hover tooltip thing in a few lines of JavaScript code ...
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1325#discussion_r14640188 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -112,6 +113,26 @@ object ColumnPruning extends Rule[LogicalPlan] { } /** + * Simplifies LIKE expressions that do not need full regular expressions to evaluate the condition. + * For example, when the expression is just checking to see if a string starts with a given + * pattern. + */ +object LikeSimplification extends Rule[LogicalPlan] { + val startsWith = ([^_%]+)%.r --- End diff -- added.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48281644 Merged build started.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48281631 Merged build triggered.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48281746 LGTM. Let's merge once Jenkins comes back happy.
[GitHub] spark pull request: Resolve sbt warnings during build
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1069#issuecomment-48281718 Thanks for summarizing that. I agree that we should not have a global flag that just disables certain warnings, since that could hide potential problems in the future.
[GitHub] spark pull request: [SPARK-2384] Add tooltips to UI.
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1314#issuecomment-48281747 @rxin what if we merged this as-is and then we create a starter jira to write a small library for doing tooltips and remove jquery? I think @kayousterhout probably doesn't have enough time/javascript foo to do that.
[GitHub] spark pull request: Delete the useless import
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1284#issuecomment-48281934 @XuTingjun Do you mind clicking the close button to close this pull request? We actually don't have the permission to do that.
[GitHub] spark pull request: [EC2] Add default history server port to ec2 s...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1296
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48282385 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16398/
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48282384 Merged build finished.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48282477 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-2152] fix bin offset in DecisionTree no...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1316#issuecomment-48282497 @johnnywalleye mind adding `[MLlib]` to the title. Then it's more likely to catch @mengxr's eye :)
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48282671 Jenkins, retest this please.
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48282776 Build started.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48282775 Merged build started.
[GitHub] spark pull request: [SPARK-2384] Add tooltips to UI.
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1314#issuecomment-48282767 Definitely we can merge this first. I can submit a PR later to remove jquery.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48282759 Merged build triggered.
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48282761 Build triggered.
[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48283271 LGTM pending tests, this is something that has confused people before, so I think it's best to just leave it out.
[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48283315 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48283504 Merged build triggered.
[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1241#issuecomment-48284014 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16396/
[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-48284602 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-1291: Link the spark UI to RM ui in yarn...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1112#issuecomment-48284603 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16397/
[GitHub] spark pull request: Resolve sbt warnings during build
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1069#issuecomment-48285811 @srowen If so, I will close this PR and submit a new one that does what you suggested.
[GitHub] spark pull request: Resolve sbt warnings during build
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1069
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1039#issuecomment-48286947 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1039#issuecomment-48286964 Hey @GregOwen this has fallen out of date, mind updating it? I'd like to get it merged soon.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642636 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head --- End diff -- Maybe call this `lastDep` and just use `rdd.dependencies.last`.
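The suggestion above rests on `last` returning the same element as `takeRight(1).head` for any non-empty sequence. A minimal Scala sketch of that equivalence follows; it uses a plain `Seq[String]` as a stand-in for `rdd.dependencies`, and the object name and element values are made up for illustration.

```scala
object LastDepExample {
  // Stand-in for rdd.dependencies; the element names are hypothetical.
  val deps: Seq[String] = Seq("narrowDep", "shuffleDepA", "shuffleDepB")

  // Form used in the diff under review.
  val endDep: String = deps.takeRight(1).head

  // Form suggested in the review comment.
  val lastDep: String = deps.last

  // Everything except the last element; same result as deps.take(deps.length - 1).
  val frontDeps: Seq[String] = deps.init

  def main(args: Array[String]): Unit = {
    println(s"endDep=$endDep lastDep=$lastDep frontDeps=$frontDeps")
  }
}
```

Both forms throw on an empty sequence, so they are only safe in a branch that has already ruled out the zero-dependency case, as the `case 0` arm in the diff does.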
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642705 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head --- End diff -- might be slightly cleaner to do: ``` val depRdd = rdd.dependencies.head.rdd debugString(depRdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) ```
[GitHub] spark pull request: Resolve sbt warnings during build
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1069#issuecomment-48287243 @witgo Go ahead. Right now I actually don't see any warnings appear when the compiler flag is removed, so it looks like all other warnings are suppressed locally already. So it can just be removed.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1039#issuecomment-48287316 Build triggered.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642773 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head + (frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) --- End diff -- It would be better to avoid this complex expression: ``` val frontDeps = rdd.dependencies.take(len - 1) val frontDepStrings = frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) val lastDep = rdd.dependencies.last val lastDepString = debugString(lastDep.rdd, prefix, lastDep.isInstanceOf[ShuffleDependency[_,_]], true) (frontDepStrings ++ lastDepString) ```
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1039#issuecomment-48287336 Build started.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642790 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head + (frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) +++ debugString(endDep.rdd, prefix, endDep.isInstanceOf[ShuffleDependency[_,_]], true)) + } +} +// The first RDD in the dependency stack has no parents, so no need for a +- +def firstDebugString(rdd: RDD[_]): Seq[String] = { + val partitionStr = ( + rdd.partitions.size + ) + val leftOffset = (partitionStr.length - 1)/2 --- End diff -- The `/` operator should be surrounded by spaces.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1039#issuecomment-48287559 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16402/
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642874 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head + (frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) +++ debugString(endDep.rdd, prefix, endDep.isInstanceOf[ShuffleDependency[_,_]], true)) + } +} +// The first RDD in the dependency stack has no parents, so no need for a +- +def firstDebugString(rdd: RDD[_]): Seq[String] = { + val partitionStr = ( + rdd.partitions.size + ) + val leftOffset = (partitionStr.length - 1)/2 + val nextPrefix = ( * leftOffset) + | + ( * (partitionStr.length - leftOffset)) + Seq(partitionStr + + rdd) ++ debugChildren(rdd, nextPrefix) +} +def shuffleDebugString(rdd: RDD[_], prefix: String = , isLastChild: Boolean): Seq[String] = { + val partitionStr = ( + rdd.partitions.size + ) + val thisPrefix = prefix.replaceAll(\\|\\s+$, ) + val leftOffset = (partitionStr.length - 1)/2 + val nextPrefix = ( +thisPrefix ++ (if (isLastChild)else | ) ++ ( * leftOffset) + | + ( * (partitionStr.length - leftOffset))) + Seq(thisPrefix + +- + partitionStr + + rdd) ++ debugChildren(rdd, nextPrefix) +} +def debugString(rdd: RDD[_], prefix: String = , isShuffle: Boolean = true, isLastChild: Boolean = false): Seq[String] = { + if (isShuffle) { shuffleDebugString(rdd, prefix, isLastChild) } --- End diff -- This if/else should be broken onto multiple lines.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642915 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head + (frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) +++ debugString(endDep.rdd, prefix, endDep.isInstanceOf[ShuffleDependency[_,_]], true)) + } +} +// The first RDD in the dependency stack has no parents, so no need for a +- +def firstDebugString(rdd: RDD[_]): Seq[String] = { --- End diff -- It might be good to make all these inner functions private to avoid having to deal with binary checker changes in the future.
[GitHub] spark pull request: [SPARK-2086] Improve output of toDebugString t...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1039#discussion_r14642925 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1235,11 +1235,43 @@ abstract class RDD[T: ClassTag]( /** A description of this RDD and its recursive dependencies for debugging. */ def toDebugString: String = { -def debugString(rdd: RDD[_], prefix: String = ): Seq[String] = { - Seq(prefix + rdd + ( + rdd.partitions.size + partitions)) ++ -rdd.dependencies.flatMap(d => debugString(d.rdd, prefix + )) +// Apply a different rule to the last child +def debugChildren(rdd: RDD[_], prefix: String): Seq[String] = { + val len = rdd.dependencies.length + len match { +case 0 => Seq.empty +case 1 => + val d = rdd.dependencies.head + debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]], true) +case _ => + val frontDeps = rdd.dependencies.take(len - 1) + val endDep = rdd.dependencies.takeRight(1).head + (frontDeps.flatMap(d => debugString(d.rdd, prefix, d.isInstanceOf[ShuffleDependency[_,_]])) +++ debugString(endDep.rdd, prefix, endDep.isInstanceOf[ShuffleDependency[_,_]], true)) + } +} +// The first RDD in the dependency stack has no parents, so no need for a +- +def firstDebugString(rdd: RDD[_]): Seq[String] = { --- End diff -- Btw, I'm not sure Scala allows the `private` modifier on inner functions, so you'll have to check.
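On the question raised above: Scala rejects access modifiers such as `private` on definitions nested inside a method body, and such local functions are already invisible to callers of the enclosing class. The sketch below illustrates the scoping with a hypothetical `Node` class and a local `render` helper; neither comes from the pull request.

```scala
// Hypothetical tree type used only to illustrate local-function scoping.
class Node(val name: String, val children: Seq[Node]) {
  def toDebugString: String = {
    // Local helper: it exists only inside toDebugString. Writing
    // `private def render` here would be rejected by the compiler,
    // because modifiers are not permitted on local definitions.
    def render(n: Node, prefix: String): Seq[String] =
      Seq(prefix + n.name) ++ n.children.flatMap(c => render(c, prefix + "  "))

    render(this, "").mkString("\n")
  }
}

object NodeExample {
  val tree: Node =
    new Node("a", Seq(new Node("b", Seq(new Node("c", Nil))), new Node("d", Nil)))

  def main(args: Array[String]): Unit = println(tree.toDebugString)
}
```

Because `render` cannot be referenced as `tree.render(...)`, only `toDebugString` forms part of the class's source-level API in this sketch.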
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/1327 [SPARK-2402] Update the initial position when reuse DiskBlockObjectWriter Minor fix: `initialPosition` is not updated after `close()` and a subsequent `open()`, which leads to errors when the object is reused. Details can be seen in [SPARK-2402](https://issues.apache.org/jira/browse/SPARK-2402). You can merge this pull request into a Git repository by running: $ git pull https://github.com/jerryshao/apache-spark SPARK-2402 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1327.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1327 commit ba8098f5c284b6803a98e5a3b3b8115488bb34cc Author: jerryshao saisai.s...@intel.com Date: 2014-07-08T08:33:16Z Update the initial position when reuse DiskBlockObjectWriter to open the file
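The problem described above can be illustrated with a small stand-alone sketch. `ReusableWriter` below is a hypothetical class, not Spark's `DiskBlockObjectWriter`, and refreshing the offset from `file.length()` inside `open()` is an assumption about one way to handle reuse, not a copy of the patch.

```scala
import java.io.{File, FileOutputStream}

// Hypothetical append-only writer that can be closed and reopened.
class ReusableWriter(file: File) {
  private var out: FileOutputStream = null
  private var initialPosition: Long = file.length()
  private var written: Long = 0L

  def open(): this.type = {
    // Refresh the start offset on every open. Without this line a second
    // open() would still report the offset captured at construction time.
    initialPosition = file.length()
    written = 0L
    out = new FileOutputStream(file, true) // append mode
    this
  }

  def write(bytes: Array[Byte]): Unit = {
    out.write(bytes)
    written += bytes.length
  }

  def close(): Unit = {
    out.close()
    out = null
  }

  // (offset, length) of the data written since the most recent open().
  def segment: (Long, Long) = (initialPosition, written)
}

object ReusableWriterExample {
  // Writes two batches through the same writer and returns both segments.
  def run(): ((Long, Long), (Long, Long)) = {
    val file = File.createTempFile("reusable-writer", ".bin")
    file.deleteOnExit()
    val writer = new ReusableWriter(file)
    writer.open().write(Array[Byte](1, 2, 3))
    writer.close()
    val first = writer.segment
    writer.open().write(Array[Byte](4, 5))
    writer.close()
    (first, writer.segment)
  }
}
```

If the offset were captured only once, the second segment in this sketch would claim to start at byte 0 and overlap the first.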
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1327#issuecomment-48289644 Merged build triggered.
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1327#issuecomment-48289657 Merged build started.
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48290945 @mengxr Thanks! I've addressed all your comments. Btw, I'm working on one-vs-all decomposition for multi-label training and hope to share the code in the near future...
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48291038 Merged build triggered.
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48291055 Merged build started.
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48291669 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16399/
[GitHub] spark pull request: [SPARK-2395][SQL] Optimize common LIKE pattern...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1325#issuecomment-48291668 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48292618 Jenkins, retest this please.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48292958 Merged build triggered.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48292980 Merged build started.
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48298665 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16400/
[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48298659 Merged build finished.
[GitHub] spark pull request: [SPARK-2290] Worker should directly use its ow...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1244#issuecomment-48298667 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16401/
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48298656 Build finished.
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1327#issuecomment-48301689 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16403/
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1327#issuecomment-48301683 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48318580 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16404/
[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48318579 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48320871 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16405/
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48320869 Merged build finished.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48321427 Merged build started.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48321419 Merged build triggered.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48324832 Merged build finished. All automated tests passed.
[GitHub] spark pull request: SPARK-1949. Servlet 2.5 vs 3.0 conflict in SBT...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/906#issuecomment-48324834 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16406/
[GitHub] spark pull request: Fix (some of the) warnings in the test suite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48326263 Merged build triggered.
[GitHub] spark pull request: SPARK-2387: remove stage barrier
GitHub user lirui-intel opened a pull request: https://github.com/apache/spark/pull/1328 SPARK-2387: remove stage barrier This PR is a PoC implementation of [SPARK-2387](https://issues.apache.org/jira/browse/SPARK-2387). When a ShuffleMapTask finishes, DAGScheduler checks resource usage. If there are free slots, DAGScheduler chooses a stage from the waiting list whose parent stages have all started, and pre-starts this waiting stage. All the in-progress parent stages then register their map outputs progressively with MapOutputTrackerMaster. A flag is added to MapOutputTracker to indicate whether the map statuses for a shuffle are partial, so that partial registration can be distinguished from a failed shuffle map stage. When a downstream task tries to fetch shuffle blocks, it gets an array of map outputs that has "holes" (unfinished map tasks) in it. We created PartialBlockFetcherIterator to handle this map output array. PartialBlockFetcherIterator keeps an array of conventional iterators (BasicBlockFetcherIterator or NettyBlockFetcherIterator). When new map outputs become available, PartialBlockFetcherIterator delegates these outputs to a new conventional iterator and relies on the conventional iterators for the `hasNext` and `next` methods. When all the delegated map statuses run out, PartialBlockFetcherIterator contacts the local MapOutputTrackerWorker for updated map outputs. MapOutputTrackerWorker uses an updater thread to communicate with MapOutputTrackerMaster to update the map statuses, and informs the downstream tasks to continue when the map statuses are updated. This PoC feature is mainly intended for, and tested against, a standalone cluster. I used a 7-node cluster for the performance test. Each node runs an executor with 32 CPUs and 90GB memory. 
I used graphx.SynthBenchmark for the test and the testcase used is: graphx.SynthBenchmark -partStrategy=EdgePartition2D -numEPart=112 -nverts=1000 -niter=3 The feature improves the whole job by roughly 10% (reduces the creation time from 128s to 116s and run time from 126s to 115s). You can merge this pull request into a Git repository by running: $ git pull https://github.com/lirui-intel/spark removeStageBarrier Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1328.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1328 commit 163302d26af6ab4b780e7047c417d6199f6c3020 Author: lirui rui...@intel.com Date: 2014-05-05T05:21:05Z minor fix Signed-off-by: lirui rui...@intel.com commit f81476d0460afabdb5f6a83c6542f080be81a58e Author: lirui rui...@intel.com Date: 2014-05-07T03:24:00Z Merge branch 'master' of https://github.com/lirui-intel/spark commit 3124380ccfd19a38878e05f0af29f80c279a897b Author: lirui rui...@intel.com Date: 2014-05-08T07:06:57Z try to locate the point to remove the barrier commit 8e625c0c87a4f5f3d35433f2eb1aca5d1cf09549 Author: lirui rui...@intel.com Date: 2014-05-08T07:57:52Z apply upstream hot fix commit 1d5d0f0be263a98387ba763b2298a8eccf3e2c65 Author: lirui rui...@intel.com Date: 2014-05-09T13:33:36Z RemoveStageBarrier: support partial map outputs commit c4f405446739280a3e02e70854d88f08498d8447 Author: lirui rui...@intel.com Date: 2014-05-11T06:50:40Z RemoveStageBarrier: build fix commit 444d2d96de39edc5399276fa2efec57638f10462 Author: lirui rui...@intel.com Date: 2014-05-11T13:54:54Z RemoveStageBarrier: register map outputs progressively commit 2df1d4e5cab7ba36443425a6f7b54a6ed06f519f Author: lirui rui...@intel.com Date: 2014-05-12T02:30:39Z RemoveStageBarrier: increment epoch for progressive registration commit 9f18dc74a25d34dfda923e3f1064cc339110e9a9 Author: lirui rui...@intel.com Date: 
2014-05-12T08:39:10Z RemoveStageBarrier: fix check free CPUs commit 7af23c0fa134ac329d2ee4fa8813e1deb13b1ddd Author: lirui rui...@intel.com Date: 2014-05-13T08:30:41Z RemoveStageBarrier: make reducers refresh map outputs less often commit 9a32a17d0f2620ef807aa3d1ed26df39038bf3af Author: lirui rui...@intel.com Date: 2014-05-13T13:00:20Z RemoveStageBarrier: start reducers earlier commit 9ffb208e0c32990f7919589c091124d41de51f7e Author: lirui rui...@intel.com Date: 2014-05-14T03:21:59Z RemoveStageBarrier: add log info commit ef3b04323bcdeadab58d02b9076a14b08e130436 Author: lirui rui...@intel.com Date: 2014-05-14T08:45:03Z RemoveStageBarrier: adjust sleep interval commit 4213d63be930a7f39fca5667855b2862e52d3487 Author: lirui rui...@intel.com Date: 2014-05-15T11:24:25Z RemoveStageBarrier: add a new iterator to manage partial map outputs commit 376230a20822e797558530e67ee9df3619476d78 Author: lirui rui...@intel.com Date: 2014-05-16T05:13:08Z RemoveStageBarrier:
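The delegation scheme described in the PR text above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `PartialIterator`, `refresh`, and `fetch` are hypothetical names, `refresh` stands in for asking MapOutputTrackerWorker for newly finished map outputs, and `fetch` stands in for creating a conventional block-fetcher iterator over those outputs.

```scala
import scala.collection.mutable

// Hypothetical stand-in for PartialBlockFetcherIterator (names are assumptions).
// It drains a delegate iterator and, when that runs dry, asks `refresh` for map
// outputs that have finished since the last call.
class PartialIterator[T](
    totalMaps: Int,
    refresh: () => Seq[Int],          // newly available map ids (assumed API)
    fetch: Seq[Int] => Iterator[T])   // builds a "conventional" iterator (assumed API)
  extends Iterator[T] {

  private val delegated = mutable.Set.empty[Int]
  private var current: Iterator[T] = Iterator.empty

  override def hasNext: Boolean = {
    // Keep asking for new map outputs until there is data or every map output
    // has been delegated. A real implementation would block or sleep between
    // refreshes instead of spinning.
    while (!current.hasNext && delegated.size < totalMaps) {
      val fresh = refresh().distinct.filterNot(delegated.contains)
      delegated ++= fresh
      if (fresh.nonEmpty) current = fetch(fresh)
    }
    current.hasNext
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("no more blocks")
    current.next()
  }
}

object PartialIteratorDemo {
  def run(): List[String] = {
    // Map outputs become available in three waves: {0}, {2, 3}, {1}.
    val waves = mutable.Queue(Seq(0), Seq(2, 3), Seq(1))
    val it = new PartialIterator[String](
      totalMaps = 4,
      refresh = () => if (waves.nonEmpty) waves.dequeue() else Seq.empty,
      fetch = ids => ids.iterator.map(id => s"block-from-map-$id"))
    it.toList
  }
}
```

In this sketch the reducer sees blocks in the order the map outputs finish, which is why the "holes" in the map output array do not stall it until every delegated output has been consumed.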
[GitHub] spark pull request: Fix (some of the) warnings in the test suite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48326270 Merged build started.
[GitHub] spark pull request: Fix (some of the) warnings in the test suite
Github user willb commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48326403 @rxin Done; I also updated the comment to reflect the narrower focus after eliminating overlap with #1153. Thanks!
[GitHub] spark pull request: SPARK-2387: remove stage barrier
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1328#issuecomment-48326594 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2152][MLlib] fix bin offset in Decision...
Github user johnnywalleye commented on the pull request: https://github.com/apache/spark/pull/1316#issuecomment-48326815 Sure, updated.
[GitHub] spark pull request: [SPARK-2402] Update the initial position when ...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/1327#issuecomment-48329176 It is all right if this class does not support reopening, but there seems to be no obvious guard that prevents a user from reusing the object, so at least this modification will not lead to an error. If we really want to prevent reopening of this object, `open()` should not be exposed to the user.
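For illustration only, the kind of guard the comment above refers to could look like the sketch below. `OneShotWriter` is a hypothetical class, not the writer touched by this PR; it only shows one way to make a reopen attempt fail fast instead of silently reusing stale state.

```scala
// Hypothetical writer (not Spark's actual class): open() is allowed until the
// writer has been closed, and any attempt to reopen after close() fails fast.
class OneShotWriter {
  private var opened = false
  private var closed = false

  def open(): this.type = {
    if (closed) {
      throw new IllegalStateException("Writer already closed; cannot reopen")
    }
    opened = true
    this
  }

  def close(): Unit = {
    // Only an opened writer transitions to the closed state.
    if (opened) closed = true
  }

  def isOpen: Boolean = opened && !closed
}
```

The alternative raised in the comment, not exposing `open()` at all, would move this check out of the public API entirely.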
[GitHub] spark pull request: Fix postfixOps warnings in the test suite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48329998 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16407/
[GitHub] spark pull request: Fix postfixOps warnings in the test suite
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1323#issuecomment-48329997 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [SPARK-2403] Catch all errors during serializa...
GitHub user darabos opened a pull request: https://github.com/apache/spark/pull/1329 [SPARK-2403] Catch all errors during serialization in DAGScheduler https://issues.apache.org/jira/browse/SPARK-2403 Spark hangs for us whenever we forget to register a class with Kryo. This should be a simple fix for that. But let me know if you have a better suggestion. I did not write a new test for this. It would be pretty complicated and I'm not sure it's worthwhile for such a simple change. Let me know if you disagree. You can merge this pull request into a Git repository by running: $ git pull https://github.com/darabos/spark spark-2403 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1329.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1329 commit 361e96225bad6497009d75c74778336105854a6c Author: Daniel Darabos darabos.dan...@gmail.com Date: 2014-07-08T13:07:30Z Catch all errors during serialization in DAGScheduler.
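The idea in the PR description above, catching every non-fatal serialization error rather than only one exception type, might be sketched as below. This is an assumption-laden illustration, not the patch itself: `SerializationGuard`, `trySerialize`, and `strictSerializer` are hypothetical names, and the `IllegalArgumentException` merely simulates a serializer rejecting an unregistered class.

```scala
import java.io.NotSerializableException
import scala.util.control.NonFatal

object SerializationGuard {
  // Hypothetical stand-in for the scheduler's "abort the stage" path: returns
  // Left(reason) instead of letting a serialization error escape the caller.
  def trySerialize(
      value: AnyRef,
      serialize: AnyRef => Array[Byte]): Either[String, Array[Byte]] =
    try {
      Right(serialize(value))
    } catch {
      case e: NotSerializableException =>
        Left(s"Task not serializable: $e")
      // Without this case, any other exception thrown by the serializer would
      // propagate out of the caller instead of failing the job cleanly.
      case NonFatal(e) =>
        Left(s"Task serialization failed: $e")
    }

  // Simulates a serializer that rejects unregistered classes with an exception
  // that is not a NotSerializableException (assumed behaviour, for illustration).
  def strictSerializer(value: AnyRef): Array[Byte] =
    throw new IllegalArgumentException(
      s"Class is not registered: ${value.getClass.getName}")
}
```

`NonFatal` deliberately does not match errors such as `OutOfMemoryError` or `InterruptedException`, so those still propagate.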
[GitHub] spark pull request: [SPARK-2403] Catch all errors during serializa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1329#issuecomment-48333968 Can one of the admins verify this patch?
[GitHub] spark pull request: [SQL]Update MultiInstanceRelation.scala
Github user baishuo commented on the pull request: https://github.com/apache/spark/pull/1312#issuecomment-48352678 Can one of the admins verify this patch? :)
[GitHub] spark pull request: SPARK-1667 re-fetch fails occasionally
Github user sarutak closed the pull request at: https://github.com/apache/spark/pull/604