[GitHub] spark pull request: [SPARK-12736][CORE][DEPLOY] Standalone Master ...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10674#discussion_r49266731

    --- Diff: network/common/pom.xml ---
    @@ -55,6 +55,7 @@
         com.google.guava
         guava
    +    compile
    --- End diff --

    Thought so, but the line fixes standalone Master startup (it stopped starting after the line was removed in https://github.com/apache/spark/commit/659fd9d04b988d48960eac4f352ca37066f43f5c).

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-12736][CORE][DEPLOY] Standalone Master ...
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/10674#issuecomment-170268746

    I'm using the following commands to do the build:

    ```
    $ ./dev/change-scala-version.sh 2.11
    $ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean install
    ```

    (I've been using sbt, but somehow it got "broken" lately, i.e. it doesn't build all modules - I'm going to report it later.)
[GitHub] spark pull request: [MINOR] Fix for BUILD FAILURE for Scala 2.11
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10636#discussion_r49051693

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONRelation.scala ---
    @@ -68,29 +68,12 @@ private[sql] class JSONRelation(
         val maybeDataSchema: Option[StructType],
         val maybePartitionSpec: Option[PartitionSpec],
         override val userDefinedPartitionColumns: Option[StructType],
    -    override val bucketSpec: Option[BucketSpec],
    +    override val bucketSpec: Option[BucketSpec] = None,
         override val paths: Array[String] = Array.empty[String],
         parameters: Map[String, String] = Map.empty[String, String])
         (@transient val sqlContext: SQLContext)
       extends HadoopFsRelation(maybePartitionSpec, parameters) {

    -  def this(
    -      inputRDD: Option[RDD[String]],
    -      maybeDataSchema: Option[StructType],
    -      maybePartitionSpec: Option[PartitionSpec],
    -      userDefinedPartitionColumns: Option[StructType],
    -      paths: Array[String] = Array.empty[String],
    -      parameters: Map[String, String] = Map.empty[String, String])(sqlContext: SQLContext) = {
    --- End diff --

    The error message was:

    ```
    [error] /Users/jacek/dev/oss/spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONRelation.scala:66: in class JSONRelation, multiple overloaded alternatives of constructor JSONRelation define default arguments.
    [error] private[sql] class JSONRelation(
    [error]                    ^
    ```
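The restriction the compiler is enforcing here applies to any Scala class, not just `JSONRelation`. The sketch below uses a hypothetical `Relation` class (not Spark's API) to show the failing shape in comments and the fix that the diff takes: keep the default arguments on the primary constructor only and drop the auxiliary one.

```scala
// Hypothetical, simplified mirror of the JSONRelation situation.
// Two constructor alternatives that BOTH declare default arguments
// do not compile:
//
//   class Relation(paths: Array[String] = Array.empty) {
//     def this(name: String, paths: Array[String] = Array.empty) =
//       this(paths)
//   }
//   // error: multiple overloaded alternatives of constructor Relation
//   //        define default arguments.
//
// The fix: defaults live on the primary constructor only.
class Relation(
    val paths: Array[String] = Array.empty[String],
    val parameters: Map[String, String] = Map.empty[String, String])

object RelationDemo {
  def main(args: Array[String]): Unit = {
    val r = new Relation()                          // all defaults apply
    assert(r.paths.isEmpty && r.parameters.isEmpty)
    val r2 = new Relation(paths = Array("/tmp/a"))  // override one argument
    assert(r2.paths.sameElements(Array("/tmp/a")))
    println("defaults ok")
  }
}
```

Callers that previously relied on the removed auxiliary constructor still compile, since the remaining defaults cover the omitted arguments.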
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/10595#issuecomment-169633803 On it. I'll rebase and push update. Thanks @srowen for code review.
[GitHub] spark pull request: [MINOR] Fix for BUILD FAILURE for Scala 2.11
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10636

    [MINOR] Fix for BUILD FAILURE for Scala 2.11

    It was introduced in 917d3fc069fb9ea1c1487119c9c12b373f4f9b77

    /cc @cloud-fan @rxin

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark fix-for-build-failure-2.11

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10636.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10636

commit 142fea579dd97feb8a9914720e9c70ce99037571
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2016-01-07T07:35:36Z

    [MINOR] Fix for BUILD FAILURE for Scala 2.11

    It was introduced in 917d3fc069fb9ea1c1487119c9c12b373f4f9b77

    /cc @cloud-fan @rxin
[GitHub] spark pull request: [STREAMING][DOCS][EXAMPLES] Minor fixes
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10603#discussion_r49048172

    --- Diff: docs/streaming-custom-receivers.md ---
    @@ -257,9 +257,9 @@ The following table summarizes the characteristics of both types of receivers

     ## Implementing and Using a Custom Actor-based Receiver

    -Custom [Akka Actors](http://doc.akka.io/docs/akka/2.2.4/scala/actors.html) can also be used to
    +Custom [Akka Actors](http://doc.akka.io/docs/akka/current/scala/actors.html) can also be used to
    --- End diff --

    It's about actors, which are the fundamental concept of Akka, so it's of less concern (like RDDs in Spark), but I've changed it to the Akka version Spark uses, i.e. `2.3.11`.
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10595

    [STREAMING][MINOR] More contextual information in logs + minor code improvements

    Please review and merge at your convenience. Thanks!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark streaming-minor-fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10595.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10595

commit 62129336a171479b37edc347255a7be226fd2d22
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2016-01-05T08:25:00Z

    [STREAMING][MINOR] More contextual information in logs + minor code improvements
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10595#discussion_r48830059

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobSet.scala ---
    @@ -59,17 +59,15 @@ case class JobSet(

       // Time taken to process all the jobs from the time they were submitted
       // (i.e. including the time they wait in the streaming scheduler queue)
    -  def totalDelay: Long = {
    -    processingEndTime - time.milliseconds
    -  }
    +  def totalDelay: Long = processingEndTime - time.milliseconds
    --- End diff --

    Noted & thanks!
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10595#discussion_r48830663

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobSet.scala ---
    @@ -59,17 +59,15 @@ case class JobSet(

       // Time taken to process all the jobs from the time they were submitted
       // (i.e. including the time they wait in the streaming scheduler queue)
    -  def totalDelay: Long = {
    -    processingEndTime - time.milliseconds
    -  }
    +  def totalDelay: Long = processingEndTime - time.milliseconds

       def toBatchInfo: BatchInfo = {
         BatchInfo(
           time,
           streamIdToInputInfo,
           submissionTime,
    -      if (processingStartTime >= 0) Some(processingStartTime) else None,
    -      if (processingEndTime >= 0) Some(processingEndTime) else None,
    +      if (hasStarted) Some(processingStartTime) else None,
    --- End diff --

    Tested it locally (and can't wait to see the results from Jenkins). The current code assumes the times can be `0`, which they never can be. It is also clearer that by `hasCompleted` the `processingEndTime` is already set. The code is over-complicated as it is now, IMHO.
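The predicate-based rewrite discussed in this diff can be sketched outside Spark as follows. The field and predicate names follow the diff; the `-1` sentinel for an unset time is an assumption for illustration, not taken from `JobSet` itself.

```scala
// Sketch of replacing raw `>= 0` checks with named predicates, as the
// diff proposes. BatchTimes is a stand-in for JobSet's timing fields.
case class BatchTimes(
    submissionTime: Long,
    processingStartTime: Long = -1L, // -1 = not yet set (assumed sentinel)
    processingEndTime: Long = -1L) {

  // Named predicates state the intent that the raw comparisons hid.
  def hasStarted: Boolean = processingStartTime > 0
  def hasCompleted: Boolean = processingEndTime > 0

  // The Option conversions read the same way the patched toBatchInfo does.
  def startedAt: Option[Long] =
    if (hasStarted) Some(processingStartTime) else None
  def completedAt: Option[Long] =
    if (hasCompleted) Some(processingEndTime) else None
}

object BatchTimesDemo {
  def main(args: Array[String]): Unit = {
    val pending = BatchTimes(submissionTime = 1000L)
    assert(pending.startedAt.isEmpty && pending.completedAt.isEmpty)

    val done = BatchTimes(1000L, 1100L, 1500L)
    assert(done.startedAt.contains(1100L))
    assert(done.completedAt.contains(1500L))
    println("predicates ok")
  }
}
```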
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10595#discussion_r48830750

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala ---
    @@ -286,7 +286,7 @@ abstract class DStream[T: ClassTag] (
         dependencies.foreach(_.validateAtStart())

         logInfo("Slide time = " + slideDuration)
    --- End diff --

    Thanks! I was thinking about it, but was hesitant to propose such changes as you might not like them :)
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/10595#issuecomment-168949749

    Yes, sure! I was worried that @srowen might cross a few lines out, which would completely devastate my mood today :) On to...
[GitHub] spark pull request: [STREAMING][DOCS][EXAMPLES] Minor fixes
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10603#discussion_r48852732

    --- Diff: docs/streaming-custom-receivers.md ---
    @@ -273,9 +273,9 @@ class CustomActor extends Actor with ActorHelper {
     And a new input stream can be created with this custom actor as

     {% highlight scala %}
    -// Assuming ssc is the StreamingContext
    -val lines = ssc.actorStream[String](Props(new CustomActor()), "CustomReceiver")
    +val ssc: StreamingContext = ...
    +val lines = ssc.actorStream[String](Props[CustomActor], "CustomReceiver")
     {% endhighlight %}

     See [ActorWordCount.scala](https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/ActorWordCount.scala)
    -for an end-to-end example.
    +for a complete example.
    --- End diff --

    It was discussed, but the changes are a result of my daily code reviews and I don't really know ahead of time where I'll end up. I'm now in Streaming, so I will...*batch*...more changes next time.
[GitHub] spark pull request: [STREAMING][MINOR] More contextual information...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/10595#issuecomment-168975843 Merged the other branches and ran build locally. Please review and merge at your convenience @srowen @rxin. Thanks!
[GitHub] spark pull request: [CORE][MINOR] scaladoc fixes
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/10591#issuecomment-168976471 Merged to #10595
[GitHub] spark pull request: [CORE][MINOR] scaladoc fixes
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/10591
[GitHub] spark pull request: [STREAMING][MINOR] Scaladoc fixes...mostly
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/10592#issuecomment-168976132 Merged to #10595.
[GitHub] spark pull request: [STREAMING][MINOR] Scaladoc fixes...mostly
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/10592
[GitHub] spark pull request: [STREAMING][DOCS][EXAMPLES] Minor fixes
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10603

    [STREAMING][DOCS][EXAMPLES] Minor fixes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark streaming-actor-custom-receiver

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10603.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10603

commit 5629feb2f43df706f0664b67c098d11a3c0b7185
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2016-01-05T12:55:25Z

    [STREAMING][DOCS][EXAMPLES] Minor fixes
[GitHub] spark pull request: [STREAMING][MINOR] Scaladoc fixes...mostly
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10592

    [STREAMING][MINOR] Scaladoc fixes...mostly

    Please review and merge at your convenience. Thanks.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark streaming-minor-fixes-scaladoc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10592.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10592

commit fa65c0d69ca8ec97edb63353c34dfc5cdd04dacf
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2016-01-05T07:41:24Z

    [STREAMING][MINOR] Scaladoc fixes...mostly
[GitHub] spark pull request: [CORE][MINOR] scaladoc fixes
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10591

    [CORE][MINOR] scaladoc fixes

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark core-scaladoc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10591.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10591

commit a23cfcf8375c132c8d79c3c0ead3d0c317966f16
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2016-01-05T07:32:10Z

    [CORE][MINOR] scaladoc fixes
[GitHub] spark pull request: [SPARK-9026] Refactor SimpleFutureAction.onCom...
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/7385#issuecomment-168384426

    @zsxwing @JoshRosen Does the comment still need attention now that the PR is closed? See https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L438
[GitHub] spark pull request: [CORE] Refactoring: Use pattern matching and d...
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/10433
[GitHub] spark pull request: Minor corrections, i.e. typo fixes and follow ...
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10432

    Minor corrections, i.e. typo fixes and follow deprecated

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark minor-corrections

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10432.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10432

commit efa4f43058247f89fe2ec5297a9915b4cdd1a3c6
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2015-12-22T13:08:10Z

    Minor corrections, i.e. typo fixes and follow deprecated
[GitHub] spark pull request: [CORE] Refactoring: Use pattern matching and d...
GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/10433

    [CORE] Refactoring: Use pattern matching and dedicated method

    It should be slightly easier to see what really happens (without questions like what other states a task can be in here). BTW, I was thinking about introducing `TaskState.isFailed` or a similar method so it's even clearer that the order of cases in the pattern match matters, as the only case left out of `TaskState.isFinished` is `FINISHED`. With such a method, `TaskState.isFinished` would be `TaskState.isFailed` + the `FINISHED` state. Perhaps `TaskState.isCompleted` would be more appropriate.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark taskschedulerimpl-pattern-matching

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10433.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #10433

commit 734095269660da9b6db454ebe49466ffd5cd1b04
Author: Jacek Laskowski <ja...@japila.pl>
Date: 2015-12-22T13:14:05Z

    [CORE] Refactoring: Use pattern matching and dedicated method
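The `isFailed`/`isFinished` split suggested in the description can be sketched as follows. The state values mirror Spark's `TaskState` enumeration, but `isFailed` is the proposed helper from the PR text, not existing API, and the surrounding object is a self-contained stand-in.

```scala
// Stand-in for Spark's TaskState, extended with the helper the PR
// description suggests (isFailed is hypothetical, not Spark API).
object TaskStateSketch extends Enumeration {
  val LAUNCHING, RUNNING, FINISHED, FAILED, KILLED, LOST = Value

  // Terminal-but-unsuccessful states.
  def isFailed(state: Value): Boolean =
    state == FAILED || state == KILLED || state == LOST

  // Any terminal state: failed, or successfully FINISHED.
  def isFinished(state: Value): Boolean =
    isFailed(state) || state == FINISHED
}

object TaskStateDemo {
  import TaskStateSketch._

  // Pattern matching makes the handled cases explicit, and the helper
  // makes it clear why FINISHED must be matched before the failed cases.
  def describe(state: TaskStateSketch.Value): String = state match {
    case FINISHED             => "completed"
    case s if isFailed(s)     => "failed"
    case _                    => "in progress"
  }

  def main(args: Array[String]): Unit = {
    assert(describe(FINISHED) == "completed")
    assert(describe(KILLED) == "failed")
    assert(describe(RUNNING) == "in progress")
    assert(isFinished(FINISHED) && isFinished(LOST) && !isFinished(RUNNING))
    println("states ok")
  }
}
```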
[GitHub] spark pull request: [CORE] Refactoring: Use pattern matching and d...
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/10433#issuecomment-166618819

    I disagree, since Spark core is hard due to what it does, and making it harder can hinder contributions. It is just such little things that change how welcoming the code is. Please reconsider your vote (I've spent the time on it already, so if it's lost aka "I don't think this is worth it", let's bring it back when the change hits master). You've just agreed that the logic is convoluted.
[GitHub] spark pull request: [CORE] Refactoring: Use pattern matching and d...
Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/10433#issuecomment-166628562

    Doh, you're *again* inviting me to make more involved changes (which is great, but I don't think it helps others to contribute). What I'm going to do is improve the change with the `TaskState` changes. Would that make sense? Would you approve such a change? I know the code quite well now, and believe it needs such small changes to ultimately be more open to newcomers. There's simply far too much to learn in Spark Core, and naming conventions where `isFinished` should have been `isCompleted` (following the `CompleteEvent` event sent later) don't make it any easier. Thanks a lot for helping me find the right way! I'm greatly indebted for the effort.
[GitHub] spark pull request: [SPARK-12302][EXAMPLES] Example for basic auth...
Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10273#discussion_r47426522

    --- Diff: examples/src/main/java/org/apache/spark/examples/BasicAuthFilter.java ---
    @@ -0,0 +1,108 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.examples;
    +
    +import com.sun.jersey.core.util.Base64;
    +
    +import java.io.IOException;
    +import java.io.UnsupportedEncodingException;
    +import java.util.StringTokenizer;
    +import javax.servlet.FilterConfig;
    +import javax.servlet.ServletException;
    +import javax.servlet.ServletRequest;
    +import javax.servlet.ServletResponse;
    +import javax.servlet.http.HttpServletRequest;
    +import javax.servlet.http.HttpServletResponse;
    +import javax.servlet.Filter;
    +import javax.servlet.FilterChain;
    +
    +/**
    + * Servlet filter example for Spark web UI.
    + * BasisAuthFilter provides a functionality of basic authentication.
    --- End diff --

    A typo - s/Basis/Basic
[GitHub] spark pull request: [SPARK-12302][EXAMPLES] Example for basic auth...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/10273#discussion_r47426529 --- Diff: examples/src/main/java/org/apache/spark/examples/BasicAuthFilter.java --- @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.examples; + +import com.sun.jersey.core.util.Base64; + +import java.io.IOException; +import java.io.UnsupportedEncodingException; +import java.util.StringTokenizer; +import javax.servlet.FilterConfig; +import javax.servlet.ServletException; +import javax.servlet.ServletRequest; +import javax.servlet.ServletResponse; +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import javax.servlet.Filter; +import javax.servlet.FilterChain; + +/** + * Servlet filter example for Spark web UI. + * BasisAuthFilter provides a functionality of basic authentication. + * The credential of this filter is for "spark-user:spark-password". --- End diff -- The credentials of the filter are... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
[GitHub] spark pull request: [SPARK-11631][Scheduler] Adding 'Starting DAGS...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9603#issuecomment-157309968

I don't agree with @andrewor14. It does add value by being consistent with how Spark reports its status - if it says "Stopping..." at INFO, there should be a corresponding "Starting..." at INFO. That was my initial goal. Consistency is the goal and the value here. Why does Spark report on services being stopped at all? What's the rationale? I would change it to DEBUG, but since all the other logs about starting services are at INFO, I'd leave it as is until someone reports that it should be changed.
[GitHub] spark pull request: SPARK-11112 Fix Scala 2.11 compilation error i...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9538#issuecomment-154988152

Knock, knock. Can we do something with the patch? The build procedure for Scala 2.11 has grown by another step - `git merge pr-9538` - besides `./dev/change-scala-version.sh 2.11` and `./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean install` :(
[GitHub] spark pull request: Fix Scala 2.11 compilation error in RDDInfo.sc...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9538#issuecomment-154707778

It worked fine for me.

```
➜  spark git:(master) ✗ ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -Phive -Phive-thriftserver -DskipTests clean install
...
[INFO] BUILD SUCCESS
```

Thanks!
[GitHub] spark pull request: Typo fixes + code readability improvements
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9501#issuecomment-154572569

Thanks again @srowen! My next pull request will be from the starter tag.
[GitHub] spark pull request: Typo fixes + code readability improvements
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/9501

Typo fixes + code readability improvements

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark typos-with-style

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9501.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9501

commit aa911d580421b0025ca32ecce814117287853799
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-11-05T05:28:58Z

Typo fixes + code readability improvements
[GitHub] spark pull request: Typo fixes + code readability improvements
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9501#issuecomment-154199308

Thanks @srowen for reviewing the changes. They indeed are small since they're the outcome of me going through the code and learning Spark this way. When I run across a typo or similar inconsistency, I fix it and move on. If you've got a task at hand I could work on - something small(ish) - I'd love to hear about it. Guide me.
[GitHub] spark pull request: Typo fixes + code readability improvements
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/9501#discussion_r44071949

--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -195,9 +196,6 @@ class HadoopRDD[K, V](
 // add the credentials here as this can be called before SparkContext initialized
 SparkHadoopUtil.get.addCredentials(jobConf)
 val inputFormat = getInputFormat(jobConf)
-if (inputFormat.isInstanceOf[Configurable]) {

--- End diff --

It's a duplication since `getInputFormat` does exactly this.
[GitHub] spark pull request: Fix typo in WebUI
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/9444

Fix typo in WebUI

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark TImely-fix

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9444.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9444

commit f926c61c97b84f24a5638c52498a130cb3d53d9c
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-11-03T20:26:05Z

s/TIme/Time
[GitHub] spark pull request: Fix typo in WebUI
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9444#issuecomment-153579178

Thanks @rxin! I knew it might've been too small a change, but since it's in the UI, I thought I'd *not* wait till I find other typos.
[GitHub] spark pull request: [SPARK-11218] [Core] show help messages for st...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9432#issuecomment-153362113

Could you change `Usage: Worker [options]` and `Usage: Master [options]` to use the scripts' own names rather than the cryptic Master and Worker?
[GitHub] spark pull request: Fix typos
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9250#issuecomment-150852862

All for now. Merge at will. Thanks.
[GitHub] spark pull request: Fix typos
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/9250

Fix typos

Two typos squashed. BTW, let me know how to proceed with other typos if I run across any. I don't feel good about leaving them aside, but neither about sending pull requests with such tiny changes. Guide me.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark typos-hunting

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9250.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9250

commit 7d1b20d2346b42dac9268bdba6b1ef8933489a3c
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-10-23T12:41:01Z

Fix typos
[GitHub] spark pull request: Fix typos
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9250#issuecomment-150606925

Ok, deal. I can run a spell-checker and see what I can fix within a half-hour timeframe. Should I go and create a JIRA task for it? Any particular module/package to look at during the timeframe? Thanks @srowen for the help!
[GitHub] spark pull request: Fix typos
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/9250#issuecomment-150659267

Thanks @rxin @srowen for the support! It took me 30 minutes to find and use `aspell` on a single Scala package, and it drove me crazy with tons of false positives - every Scala class name was flagged as wrong :( I'd rather spend my time fixing typos in the docs, as I know where a few typos lurk. Thanks again!
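The false positives happen because a dictionary-based checker like `aspell` sees whole identifiers such as `ShuffleManager` as single unknown words. One common workaround (my own sketch, not something the thread proposes) is to split CamelCase identifiers into plain words before feeding them to the spell-checker:

```python
import re

# Hypothetical pre-processing step for spell-checking source code:
# split CamelCase identifiers into plain words so a dictionary-based
# checker sees "Shuffle" and "Manager" instead of "ShuffleManager".
def split_camel_case(identifier):
    # Handles lowerCamelCase, UpperCamelCase, acronym runs, and digits.
    return re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)

print(split_camel_case("ShuffleManager"))     # ['Shuffle', 'Manager']
print(split_camel_case("HTTPServer"))         # ['HTTP', 'Server']
print(split_camel_case("preferredNodeData"))  # ['preferred', 'Node', 'Data']
```

Even with splitting, acronyms and domain terms still need a personal word list, so for a half-hour effort restricting the check to the Markdown docs (as decided above) is the pragmatic choice.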
[GitHub] spark pull request: Fix a (very tiny) typo
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/9230

Fix a (very tiny) typo

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark utils-seconds-typo

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9230.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9230

commit 23ba7324bd2ec7db9fb806911ad0c720e00b6f17
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-10-22T19:31:41Z

Fix a typo
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-148512484

@srowen Mind looking at the PR once again? I'd appreciate it.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-147602196

Hey @srowen @jerryshao How does the change look now?
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-146474747

Fully agree. I'm, however, not skilled enough to work on the feature to enable locality preference and haven't heard of anyone working on it, and the code as it is now is *not* used and is misleading. That said, I'd remove it and bring it back when it's needed (the comments do more harm than good). BTW, it's all versioned in the repo, and one can easily revert to any past version when needed.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-146462053

At some point in the future, very likely, but it's not happening now, so I'd remove it now (to make the code clearer); once it's needed, it can be added back as part of the feature-related changes. I'm working on the MiMa fix.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-146506978

Sure, but how do you know **now** what the public API is going to look like? It is **currently** causing trouble in understanding the code, and we only **wish** / **hope** to have uses in the future. I'm all for this and understand the reasoning very well. Unless there are short-term plans to work on using `preferredNodeLocationData` (there's a task and plan for 1.6), I'll keep advocating for removing it, as it simply causes confusion and misunderstanding of how Spark really works.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-146574566

SPARK-2089 is marked as resolved, so any discussion there is done (or am I mistaken?). Also, I'm saying that the comments claim it's used in YARN while it's not. My change *merely* removes "the noise". I fully agree with your other points.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/8976#discussion_r41303229

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -147,23 +139,41 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
 * @param jars Collection of JARs to send to the cluster. These can be paths on the local file
 * system or HDFS, HTTP, HTTPS, or FTP URLs.
 * @param environment Environment variables to set on worker nodes.
- * @param preferredNodeLocationData used in YARN mode to select nodes to launch containers on.
- * Can be generated using [[org.apache.spark.scheduler.InputFormatInfo.computePreferredLocations]]
- * from a list of input files or InputFormats for the application.
+ * @param preferredNodeLocationData not used. Left for backward compatibility.
 */
+ @deprecated("Passing in preferred locations has no effect at all, see SPARK-10921", "1.5.1")

--- End diff --

I'd been thinking about it, but couldn't decide on (and didn't ask about) which version to pick. I'm going to fix it. Thanks!
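The pattern in the diff above - keep a now-useless constructor parameter for binary compatibility but mark it deprecated with the version since which it has no effect - is language-agnostic. A hedged Python analogue (class and parameter names are mine, purely illustrative):

```python
import warnings

# Python analogue of the Scala `@deprecated` pattern in the diff above:
# the parameter is kept only for backward compatibility, and the warning
# records the version since which passing it has had no effect.
class SparkContextLike:
    def __init__(self, preferred_node_location_data=None):
        if preferred_node_location_data is not None:
            warnings.warn(
                "preferred_node_location_data has no effect, see SPARK-10921 "
                "(deprecated since 1.5.1)",
                DeprecationWarning,
                stacklevel=2,
            )
        # The argument is deliberately ignored from here on.

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    SparkContextLike(preferred_node_location_data={"host1": []})
print(len(caught), caught[0].category.__name__)  # 1 DeprecationWarning
```

The "which version to pick" question in the comment maps to the second argument of Scala's `@deprecated`: it should name the first release in which callers could observe the deprecation, not the release where the behavior silently stopped working.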
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-145958130

Thanks @vanzin @srowen! That helped a lot. I'm going to fix it. How can I run the compatibility check locally? Is it `./build/sbt core/mima-report-binary-issues` or something else? How can I ask @SparkQA to execute the check after I `rebase` and `push` a new version of the pull request? Thanks a lot for helping me out with the pull request and the process.
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8976#issuecomment-145910412

@srowen @SparkQA says "This patch fails MiMa tests.", but I can't seem to decipher what exactly the pull request has broken. I remember you mentioned the change should be fine with MiMa, and given the message I'm a bit concerned. Would you help me out if there's anything I should do to get the pull request merged cleanly?
[GitHub] spark pull request: [SPARK-10921] [YARN] Completely remove the use...
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8976

[SPARK-10921] [YARN] Completely remove the use of SparkContext.preferredNodeLocationData

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark SPARK-10921

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8976.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8976

commit 1353686d6f963f6316a279f3425effd17973f61a
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-10-05T04:12:05Z

[SPARK-10921] [YARN] Completely remove the use of SparkContext.preferredNodeLocationData
[GitHub] spark pull request: Introduce config constants object
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/8753
[GitHub] spark pull request: [SPARK-10662][DOCS] Code snippets are not prop...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141894712

> Still, probably best to just merge this, as it's unlikely to cause much if any trouble.

Would you? I'd greatly appreciate it (and propose new changes :-))
[GitHub] spark pull request: [SPARK-10662][DOCS] Code snippets are not prop...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141779426

Do you want me to split the pull request into two - one with the code formatting in the table and another for the *unfortunate* excessive spaces? And what JIRA would that be - removing excessive spaces in docs? I can do that, but want to be clear on your intent (which I however disagree with).
[GitHub] spark pull request: [SPARK-10662][DOCS] Code snippets are not prop...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141827044

I disagree with not accepting this change in this version **with** the superfluous spaces at the end of lines removed -- they're simply garbage (and should not have been merged in the first place). They're quite likely leftovers from the time when it was decided to make the lines shorter. Can you show me a line where the space(s) are important? If there's one, I'll fix it right away.
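There is one case where trailing spaces in these docs could matter: in Markdown, two or more spaces at the end of a line force a hard line break, so a blanket cleanup can change rendering. A hypothetical cleanup pass (my own sketch, not part of the PR) could strip trailing whitespace while preserving that one intentional case:

```python
# Trailing whitespace is usually noise, but in Markdown two or more
# trailing spaces at the end of a line are significant: they force a
# hard line break. This hypothetical cleanup keeps that case, normalized
# to exactly two spaces, and strips everything else.
def strip_trailing_spaces(line, keep_markdown_breaks=True):
    stripped = line.rstrip(" \t")
    if keep_markdown_breaks and line.endswith("  ") and stripped:
        return stripped + "  "  # preserve the intentional hard break
    return stripped

print(repr(strip_trailing_spaces("text ")))                               # 'text'
print(repr(strip_trailing_spaces("text   ")))                             # 'text  '
print(repr(strip_trailing_spaces("text  ", keep_markdown_breaks=False)))  # 'text'
```

In practice almost no line in the Spark docs relies on the hard-break syntax, which is why the reviewer's objection here is conservative rather than a rendering concern.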
[GitHub] spark pull request: [SPARK-10662][DOCS] Code snippets are not prop...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141675546

It should be better now. While I was fixing the docs, Atom fixed the extra spaces at the ends of lines (which I remember you said not to fix, but since it was done automatically and they are indeed an issue, please consider merging them, too). Thanks!
[GitHub] spark pull request: [SPARK-10662][DOCS] Code snippets are not prop...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141569374

So you want me to review the other documents for the no-backtick-code-formatted-in-table issue? I'm fine with it, but just need to confirm my thinking.
[GitHub] spark pull request: [MINOR][DOCS] Fixes markup so code is properly...
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8795

[MINOR][DOCS] Fixes markup so code is properly formatted

* Backticks are processed properly in Spark Properties table
* Removed unnecessary spaces
* See http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/running-on-yarn.html

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark docs-yarn-formatting

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8795.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8795

commit 90c9a96720f97be012e975406c2f861af3c7c6a0
Author: Jacek Laskowski <jacek.laskow...@deepsense.io>
Date: 2015-09-17T07:54:20Z

[MINOR][DOCS] Fixes markup so code is properly formatted

* Backticks are processed properly in Spark Properties table
* Removed unnecessary spaces
* See http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/running-on-yarn.html
[GitHub] spark pull request: [MINOR][DOCS] Fixes markup so code is properly...
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8795#issuecomment-141048359

Is [SPARK-10662](https://issues.apache.org/jira/browse/SPARK-10662) what you were thinking about? What should the next steps be? Guide me in the JIRA, please.
[GitHub] spark pull request: [DOCS] Small fixes to Spark on Yarn doc
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8762 [DOCS] Small fixes to Spark on Yarn doc * a follow-up to 16b6d18613e150c7038c613992d80a7828413e66 as `--num-executors` flag is not supported. * links + formatting You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark docs-spark-on-yarn Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8762.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8762 commit 00e36b2760f424db2e2b41933b066e7a2600ba30 Author: Jacek Laskowski <jacek.laskow...@deepsense.io> Date: 2015-09-15T07:59:40Z [DOCS] Small fixes to Spark on Yarn doc * a follow-up to 16b6d18613e150c7038c613992d80a7828413e66 as `--num-executors` flag is not supported. * links + formatting
[GitHub] spark pull request: Small fixes to docs
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8759 Small fixes to docs Links now work properly + consistent use of *Spark standalone cluster* (Spark uppercase + the rest lowercase -- as seems agreed elsewhere in the docs). You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark docs-submitting-apps Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8759.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8759 commit ca6357e6e7f2f7aa03ee83ced963e1f463697063 Author: Jacek Laskowski <jacek.laskow...@deepsense.io> Date: 2015-09-15T06:23:38Z Small fixes to docs
[GitHub] spark pull request: [DOCS] Small fixes to Spark on Yarn doc
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/8762#discussion_r39517063 --- Diff: docs/running-on-yarn.md --- @@ -54,8 +54,8 @@ In `yarn-cluster` mode, the driver runs on a different machine than the client, # Preparations -Running Spark-on-YARN requires a binary distribution of Spark which is built with YARN support. -Binary distributions can be downloaded from the Spark project website. +Running Spark on YARN requires a binary distribution of Spark which is built with YARN support. --- End diff -- See http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/running-on-yarn.html and search for `./bin/spark-submit --class path.to.your.Class`. It renders incorrectly since there's an indent + backticks that make the docs go awry. That's why I fixed that, too. What do you propose as *a different approach going forward*? When I see a change without changes to the docs, what's the *proper* approach?
[GitHub] spark pull request: [DOCS] Small fixes to Spark on Yarn doc
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/8762#discussion_r39504260 --- Diff: docs/running-on-yarn.md --- @@ -54,8 +54,8 @@ In `yarn-cluster` mode, the driver runs on a different machine than the client, # Preparations -Running Spark-on-YARN requires a binary distribution of Spark which is built with YARN support. -Binary distributions can be downloaded from the Spark project website. +Running Spark on YARN requires a binary distribution of Spark which is built with YARN support. --- End diff -- What about lines 30 and 24? They were the most important, with the others just amendments as I was reviewing the entire doc. The reason for the change was 16b6d18613e150c7038c613992d80a7828413e66, where I learnt the command-line option is no longer supported, and hence the change in the doc.
[GitHub] spark pull request: Introduce config constants object
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8753#issuecomment-140414166 Should we then move the discussion to the mailing list and/or create a task in JIRA?
[GitHub] spark pull request: Introduce config constants object
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/8753#issuecomment-140193323 Is this better discussed on the user or dev mailing lists? I disagree on the use of another dependency hop using such an object, but can agree with you on doing it for just a single property. I'm surprised that such a large code base doesn't have a central place for all property names. Their names are everywhere and kudos to the maintainers for no typos (I at least haven't spotted one so far)!
[GitHub] spark pull request: Introduce config constants object
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8753 Introduce config constants object A small refactoring to introduce a Scala object to keep property/environment names in a single place, for YARN cluster deployment first (as I hate seeing strings all over the code and, most importantly, want to avoid typos in the future). I couldn't resist after reviewing c0052d8d09eebadadb5ed35ac512caaf73919551. If it gets accepted I'm going to move all the configuration names to the file. I did the first step and want to be told I'm right or get corrected. (It's also to verify my understanding of how the pull request process works in the project.) You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark yarn-config-params Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8753.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8753 commit 0bb05999a736dac1b6350d075a4ea95d2db3ef32 Author: Jacek Laskowski <jacek.laskow...@deepsense.io> Date: 2015-09-14T18:46:00Z Introduce config constants object
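For context, a minimal sketch of the kind of constants object the PR proposes. The object and member names here are illustrative, not the ones from the actual patch; the property keys themselves are real Spark configuration names:

```scala
// Hypothetical central home for configuration keys, so each string
// literal appears exactly once in the code base instead of being
// scattered (and potentially mistyped) across many files.
object ConfigConstants {
  val SparkYarnQueue        = "spark.yarn.queue"
  val SparkDriverMemory     = "spark.driver.memory"
  val SparkExecutorInstances = "spark.executor.instances"
}
```

Call sites would then reference `ConfigConstants.SparkDriverMemory` rather than repeating the raw `"spark.driver.memory"` string, letting the compiler catch typos.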
[GitHub] spark pull request: Space out
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/8630
[GitHub] spark pull request: Docs small fixes
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/8629#discussion_r39024259 --- Diff: docs/building-spark.md --- @@ -163,11 +164,9 @@ the `spark-parent` module). Thus, the full flow for running continuous-compilation of the `core` submodule may look more like: -``` - $ mvn install - $ cd core - $ mvn scala:cc -``` +$ mvn install --- End diff -- It's merged already, but for the sake of completeness I'm answering now - yes, it's *properly* code-formatted.
[GitHub] spark pull request: Docs small fixes
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/8629#discussion_r39024388 --- Diff: docs/cluster-overview.md --- @@ -33,9 +34,9 @@ There are several useful things to note about this architecture: 2. Spark is agnostic to the underlying cluster manager. As long as it can acquire executor processes, and these communicate with each other, it is relatively easy to run it even on a cluster manager that also supports other applications (e.g. Mesos/YARN). -3. The driver program must listen for and accept incoming connections from its executors throughout - its lifetime (e.g., see [spark.driver.port and spark.fileserver.port in the network config - section](configuration.html#networking)). As such, the driver program must be network +3. The driver program must listen for and accept incoming connections from its executors throughout --- End diff -- The additional spaces at the end. Thanks a lot for reviewing the change and accepting it!
[GitHub] spark pull request: Space out
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8630 Space out You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark build-space-out Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8630.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8630 commit 1b353d868238eb9bab686f645f95e037ce1a6320 Author: Jacek Laskowski <ja...@japila.pl> Date: 2015-09-06T20:29:13Z Space out
[GitHub] spark pull request: Docs small fixes
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8629 Docs small fixes You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark docs-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8629.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8629 commit 66a137d5d28d756b319f5625a799225ef57fb5c6 Author: Jacek Laskowski <ja...@japila.pl> Date: 2015-09-06T20:25:46Z Docs small fixes
[GitHub] spark pull request: [SPARK-9613] [HOTFIX] Fix usage of JavaConvert...
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/8479 [SPARK-9613] [HOTFIX] Fix usage of JavaConverters removed in Scala 2.11 Fix for [JavaConverters.asJavaListConverter](http://www.scala-lang.org/api/2.10.5/index.html#scala.collection.JavaConverters$) being removed in 2.11.7 and hence the build fails with the 2.11 profile enabled. Tested with the default 2.10 and 2.11 profiles. BUILD SUCCESS in both cases. Build for 2.10: ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -DskipTests clean install and 2.11: ./dev/change-scala-version.sh 2.11 ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -DskipTests clean install You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark SPARK-9613-hotfix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8479.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8479 commit b6c17e7daad09096e0bed94e677226b61d349bc1 Author: Jacek Laskowski ja...@japila.pl Date: 2015-08-27T08:38:59Z [SPARK-9613] [HOTFIX] Fix usage of JavaConverters removed in Scala 2.11 Fix for JavaConverters.asJavaListConverter being removed in 2.11.7 and hence the build fails. Tested with the default 2.10 and 2.11 profiles. BUILD SUCCESS in both cases. Build for 2.10: ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -DskipTests clean install and 2.11: ./dev/change-scala-version.sh 2.11 ./build/mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.1 -Dscala-2.11 -DskipTests clean install
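For context on the hotfix above: the collection-conversion style that survives the 2.10/2.11 split is the decorator style, importing the implicits and calling `.asJava`/`.asScala`, rather than naming converter methods such as `asJavaListConverter` directly. A small sketch (not the actual Spark patch):

```scala
import scala.collection.JavaConverters._

object ConvertersDemo extends App {
  // Decorator style: works on both Scala 2.10.x and 2.11.x,
  // unlike invoking JavaConverters.asJavaListConverter(...) by name.
  val scalaList = List(1, 2, 3)
  val javaList: java.util.List[Int] = scalaList.asJava

  // And back again, round-tripping through the Java collection.
  val roundTripped: Seq[Int] = javaList.asScala.toList
}
```

The decorators are lazy wrappers, so converting back and forth does not copy the underlying data.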
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/830#issuecomment-54374855 No worries. Do what you think is going to be the best solution for the project. I don't mind closing the PR.
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski closed the pull request at: https://github.com/apache/spark/pull/830
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/830#issuecomment-43816303 Please review the changes that were introduced after @tdas's comments.
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/830#discussion_r12869294 --- Diff: docs/streaming-programming-guide.md --- @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._ val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1)) {% endhighlight %} -Using this context, we then create a new DStream -by specifying the IP address and port of the data server. +Using this context, we can create a DStream that represents streaming data from TCP +source hostname (`localhost`) and port (``). --- End diff -- That was an exact copy from the scaladoc for [org.apache.spark.streaming.StreamingContext#socketTextStream](http://people.apache.org/~pwendell/spark-1.0.0-rc9-docs/api/scala/index.html#org.apache.spark.streaming.StreamingContext) as I thought it could've built...a consistent learning environment. I'll think about that and the scaladoc.
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/830#discussion_r12870111 --- Diff: docs/streaming-programming-guide.md --- @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._ val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1)) {% endhighlight %} -Using this context, we then create a new DStream -by specifying the IP address and port of the data server. +Using this context, we can create a DStream that represents streaming data from TCP +source hostname (`localhost`) and port (``). {% highlight scala %} // Create a DStream that will connect to serverIP:serverPort, like localhost: -val lines = ssc.socketTextStream("localhost", ) +import org.apache.spark.streaming.dstream._ +val lines: DStream[String] = ssc.socketTextStream("localhost", ) --- End diff -- I fully agree and I do follow the rule while developing Scala applications, but since Scala is a statically typed language, knowing the type while reading the docs helps comprehend what types are in play. That was the only reason to include them: to let users open the scaladoc and search for more information with the types explicitly described. I myself was wondering what types I should be reading about, and although I had started with `ssc` and followed along, I found it a bit troublesome for newcomers to Spark and Scala. *The easier the better* was the idea behind the change. In Spark's [Quick Start](http://spark.apache.org/docs/latest/quick-start.html) it's quite different, where the types are presented with the results. In either case, I needed types while reading along without access to Spark's shell/REPL. Would you agree with the reasoning?
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/830#discussion_r12870184 --- Diff: docs/streaming-programming-guide.md --- @@ -105,23 +104,22 @@ generating multiple new records from each record in the source DStream. In this each line will be split into multiple words and the stream of words is represented as the `words` DStream. Next, we want to count these words. +The `words` DStream is further mapped (one-to-one transformation) to a DStream of `(word, +1)` pairs, which is then reduced to get the frequency of words in each batch of data. +Finally, `wordCounts.print()` will print the first ten counts generated every second. + {% highlight scala %} -import org.apache.spark.streaming.StreamingContext._ // Count each word in each batch -val pairs = words.map(word => (word, 1)) -val wordCounts = pairs.reduceByKey(_ + _) +val pairs: DStream[(String, Int)] = words.map((_, 1)) +val wordCounts: DStream[(String, Int)] = pairs.reduceByKey(_ + _) -// Print a few of the counts to the console +// Print the first ten elements of each RDD generated in this DStream to the console --- End diff -- Done
[GitHub] spark pull request: Small updates to Streaming Programming Guide
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/830#discussion_r12870325 --- Diff: docs/streaming-programming-guide.md --- @@ -306,12 +304,16 @@ need to know to write your streaming applications. ## Linking To write your own Spark Streaming program, you will have to add the following dependency to your - SBT or Maven project: + sbt or Maven project: --- End diff -- It should be all lowercase, as it is on [the website](http://www.scala-sbt.org/): "sbt is a build tool for Scala, Java, and more."
[GitHub] spark pull request: Small updates to Streaming Programming Guide
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/830 Small updates to Streaming Programming Guide Please merge. More updates will come soon. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark docs-streaming-guide Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/830.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #830 commit 3ccc9ce7e07310dddb937a3a2df685e6f77f27f0 Author: Jacek Laskowski ja...@japila.pl Date: 2014-05-19T21:56:09Z Small updates to Streaming Programming Guide
[GitHub] spark pull request: Simplify the build with sbt 0.13.2 features
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/706#issuecomment-42659515 Hey @pwendell I'd love to be engaged in the effort if possible. Where could we discuss how much I could do regarding sbt (I think I might be quite helpful here and there)? Is there a JIRA issue where I could follow the progress of the task? Please guide me so the project can benefit from some of the spare time I could devote.
[GitHub] spark pull request: [WIP] Simplify the build with sbt 0.13.2 featu...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/706#discussion_r12490430 --- Diff: project/SparkBuild.scala --- @@ -16,17 +16,18 @@ */ import sbt._ -import sbt.Classpaths.publishTask -import sbt.Keys._ +import Keys._ +import Classpaths.publishTask import sbtassembly.Plugin._ import AssemblyKeys._ -import scala.util.Properties +import util.Properties --- End diff -- Thanks! Will keep that in mind when sending patches in the future.
[GitHub] spark pull request: String interpolation + some other small change...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/748#discussion_r12665202 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -296,18 +297,15 @@ object SparkEnv extends Logging { // System properties that are not java classpaths val systemProperties = System.getProperties.iterator.toSeq -val otherProperties = systemProperties.filter { case (k, v) => +val otherProperties = systemProperties.filter { case (k, _) => k != "java.class.path" && !k.startsWith("spark.") }.sorted // Class paths including all added jars and files -val classPathProperty = systemProperties.find { case (k, v) => - k == "java.class.path" -}.getOrElse(("", "")) -val classPathEntries = classPathProperty._2 +val classPathEntries = javaClassPath .split(File.pathSeparator) --- End diff -- Yes, I did, though not within Spark but outside, in the REPL. As a matter of fact, [Properties.javaClassPath](https://github.com/scala/scala/blob/2.12.x/src/library/scala/util/Properties.scala#L126) ends up calling [System.getProperty(name, alt)](https://github.com/scala/scala/blob/2.12.x/src/library/scala/util/Properties.scala#L52), so the change is truly a refactoring (to ease reading the class later on).
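The claimed equivalence is easy to probe outside Spark. A standalone sketch (this is a check of the Scala standard library, not part of the Spark patch):

```scala
import scala.util.Properties

object ClassPathCheck extends App {
  // Properties.javaClassPath is a thin wrapper that ultimately reads the
  // "java.class.path" system property (with "" as the fallback), so
  // swapping one for the other changes only how the value is looked up,
  // not what value is returned.
  val viaScala: String = Properties.javaClassPath
  val viaJava: String  = System.getProperty("java.class.path", "")

  // The two lookups agree on any JVM where the property is set.
  println(viaScala == viaJava)
}
```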
[GitHub] spark pull request: Simplify the build with sbt 0.13.2 features
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/706 Simplify the build with sbt 0.13.2 features It's a WIP, but I'm pull-requesting the current changes hoping that someone from the dev team will have a look at the changes and guide me on how to merge them into the project. Ultimately I'd like to **refactor** the build to make it easier to understand. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark wip/build-refactoring Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/706.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #706 commit 9b3d2bceda2f662948430a8d7bdd5436c20a12a0 Author: Jacek Laskowski ja...@japila.pl Date: 2014-05-09T02:18:20Z Simplify the build with sbt 0.13.2 features
[GitHub] spark pull request: [WIP] Simplify the build with sbt 0.13.2 featu...
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/706#discussion_r12508408 --- Diff: project/SparkBuild.scala --- @@ -297,7 +273,7 @@ object SparkBuild extends Build { val chillVersion = "0.3.6" val codahaleMetricsVersion = "3.0.0" val jblasVersion = "1.2.3" - val jets3tVersion = if ("^2\\.[3-9]+".r.findFirstIn(hadoopVersion).isDefined) "0.9.0" else "0.7.1" + val jets3tVersion = "^2\\.[3-9]+".r.findFirstIn(hadoopVersion).fold("0.7.1")(_ => "0.9.0") --- End diff -- I myself learnt it not so long ago and noticed it spurred some discussions in the Scala community (with [Martin Odersky himself](http://stackoverflow.com/a/5332657/1305344)). [Functional Programming in Scala](http://www.manning.com/bjarnason/) reads on page 69 (in the pdf version): *It's fine to use pattern matching, though you should be able to implement all the functions besides map and getOrElse without resorting to pattern matching.* So I followed the advice and applied Option.map(...).getOrElse(...) quite intensively, but...IntelliJ IDEA, just before I committed the change, suggested replacing it with fold. I thought I'd need to change my habits once more and sent the PR. With all that said, it's not clear what the most idiomatic approach is, but [pattern matching is in my opinion a step back from map/getOrElse](https://github.com/scala/scala/blob/2.12.x/src/library/scala/Option.scala#L156) and there's no need for it. I'd appreciate being corrected and could even replace `Option.fold` with `map`/`getOrElse` if that would make the PR better for more committers.
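The three styles under discussion are behaviorally equivalent; a small standalone comparison, using the same regex as the diff (illustrative only, not the build code itself):

```scala
object OptionStylesDemo extends App {
  val hadoopVersion = "2.4.0"
  // Same pattern as in SparkBuild.scala: matches Hadoop 2.3+.
  val matched: Option[String] = "^2\\.[3-9]+".r.findFirstIn(hadoopVersion)

  // 1. Pattern matching — explicit, but more ceremony.
  val v1 = matched match {
    case Some(_) => "0.9.0"
    case None    => "0.7.1"
  }

  // 2. map + getOrElse — the style the book recommends practicing.
  val v2 = matched.map(_ => "0.9.0").getOrElse("0.7.1")

  // 3. fold — note the default comes first, the mapping function second.
  val v3 = matched.fold("0.7.1")(_ => "0.9.0")

  println(Seq(v1, v2, v3))  // all three agree
}
```

`fold` collapses the two-step `map`/`getOrElse` into one call, at the cost of the slightly surprising argument order; that trade-off is exactly what the comment debates.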
[GitHub] spark pull request: [WIP] Simplify the build with sbt 0.13.2 features
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/706#issuecomment-42785356

Thanks! I've said it before (when @pwendell asked me to hold off) and I'll say it again, since my changes don't look like they'll find a home before the *"We are still experimenting"* phase is over. Where is the experimentation happening? Is there a branch for it? Is there a discussion on the mailing list(s) about how it's going to be done? I'd appreciate more openness in this regard (to avoid Kafka's case, where they moved to Gradle for no apparent reason other than not having cared to learn sbt well enough).
[GitHub] spark pull request: [WIP] Simplify the build with sbt 0.13.2 features
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/706#issuecomment-42700893

As to the changes I proposed in the PR, I think that whatever the future steps with sbt-pom-reader turn out to be, these changes are easily applicable to the build. They (are supposed to) simplify the build definition format to leverage sbt 0.13.2 macros, and are expected not to interfere with the upcoming changes. They're slated for 1.1.0, after all. I'm writing all this hoping to convince you guys, the committers, to accept the PR so I can propose others ;-) I'd like to play a bit with optional project inclusion, as I think there are too many env vars involved in working with the different versions of Hadoop.
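For illustration only (not taken from this PR, and the setting values are made up), a hypothetical before/after showing the kind of simplification the sbt 0.13 setting macros allow in a build definition:

```scala
// build.sbt fragment (sketch, assumed values)

// Pre-0.13 style: dependencies between settings wired explicitly with <<=
//   version <<= scalaVersion { sv => "1.1.0-SNAPSHOT-" + sv }

// sbt 0.13 macro style: := is a macro, and .value reads another setting's
// value directly, so the explicit wiring above collapses to:
version := "1.1.0-SNAPSHOT-" + scalaVersion.value
```

The macro form reads like plain assignment, which is the "simpler build definition format" the comment refers to.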
[GitHub] spark pull request: sbt assembly and environment variables
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/671 sbt assembly and environment variables

You can merge this pull request into a Git repository by running:

```
$ git pull https://github.com/jaceklaskowski/spark docs-index
```

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/671.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #671

commit c4099ee8fe54e76a53b36eedb65348d508bc748e
Author: Jacek Laskowski ja...@japila.pl
Date: 2014-05-06T21:58:15Z

    sbt assembly and environment variables