[GitHub] spark pull request: SPARK-1756: Add missing description to spark-e...

2014-05-11 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/646#issuecomment-42763279 A better solution [PR 730](https://github.com/apache/spark/pull/730) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request: The org.datanucleus:* should not be packaged i...

2014-05-11 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/688 The org.datanucleus:* should not be packaged into spark-assembly-*.jar You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1644

[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...

2014-05-11 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-42771828 @pwendell @marmbrus @rxin typesafehub/scala-logging#4 has been solved by typesafehub/scala-logging#15. The PR should be able to merge into master.

[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...

2014-05-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/332#discussion_r12512415 --- Diff: core/src/main/scala/org/apache/spark/Logging.scala --- @@ -116,7 +121,8 @@ trait Logging { val log4jInitialized

[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...

2014-05-11 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/332#discussion_r12512463 --- Diff: project/SparkBuild.scala --- @@ -317,6 +317,7 @@ object SparkBuild extends Build { val excludeFastutil = ExclusionRule(organization

[GitHub] spark pull request: Improve build configuration

2014-05-11 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/590#issuecomment-42792564 @pwendell Big changes have been removed. The PR can be merged into master and branch-1.0.

[GitHub] spark pull request: Improve build configuration

2014-05-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/590#issuecomment-42809856 @srowen It has been removed.

[GitHub] spark pull request: Improve build configuration

2014-05-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/590#issuecomment-42809563 @srowen In some cases, `commons-lang` is pulled in at multiple versions. `fairscheduler.xml` and `hive-site.xml` should be ignored.

[GitHub] spark pull request: Improve build configuration

2014-05-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/590#issuecomment-42811589 @srowen I will submit a new Pull Request to solve this problem.

[GitHub] spark pull request: Improve build configuration

2014-05-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/590#issuecomment-42809975 ``` [INFO] | +- org.apache.hadoop:hadoop-client:jar:1.0.4:compile [INFO] | | \- org.apache.hadoop:hadoop-core:jar:1.0.4:compile [INFO

[GitHub] spark pull request: fix different versions of commons-lang depende...

2014-05-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/754#issuecomment-42911672 @srowen Do you have time to review the code?

[GitHub] spark pull request: [WIP]SPARK-1712: TaskDescription instance is t...

2014-05-13 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/677

[GitHub] spark pull request: [WIP][SPARK-1712]: TaskDescription instance is...

2014-05-13 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/694#issuecomment-43038842 @mateiz What do you think of [this demo](https://github.com/witgo/spark/compare/SPARK-1712_new3)?

[GitHub] spark pull request: [WIP][SPARK-1712]: TaskDescription instance is...

2014-05-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/694#issuecomment-43044976 @mateiz Unit tests have been added.

[GitHub] spark pull request: [SPARK-1712]: TaskDescription instance is too ...

2014-05-14 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/694

[GitHub] spark pull request: [WIP]SPARK-1712: TaskDescription instance is t...

2014-05-14 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/677#discussion_r12366698 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -414,6 +415,14 @@ private[spark] class TaskSetManager( // we

[GitHub] spark pull request: [SPARK-1712]: TaskDescription instance is too ...

2014-05-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/694#issuecomment-43046219 @mateiz The problem seems to be in the current master branch. The tests pass locally.

[GitHub] spark pull request: [SPARK-1712]: TaskDescription instance is too ...

2014-05-14 Thread witgo
GitHub user witgo reopened a pull request: https://github.com/apache/spark/pull/694 [SPARK-1712]: TaskDescription instance is too big causes Spark to hang You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK

[GitHub] spark pull request: [SPARK-1712]: TaskDescription instance is too ...

2014-05-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/694#issuecomment-43064571 I do not know what causes the error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2.0:0 failed 1 times, most recent failure: Exception

[GitHub] spark pull request: SPARK-1712: TaskDescription instance is too bi...

2014-05-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/694#issuecomment-42975919 There is [another solution](https://github.com/witgo/spark/compare/SPARK-1712_new3)

[GitHub] spark pull request: Fix: sbt test throw an java.lang.OutOfMemoryEr...

2014-05-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/773#issuecomment-43106294 log: ``` [info] ReplSuite: [info] - propagation of local properties (4 seconds, 979 milliseconds) [info] - simple foreach with accumulator (4 seconds, 150

[GitHub] spark pull request: SPARK-1712: TaskDescription instance is too bi...

2014-05-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/694 SPARK-1712: TaskDescription instance is too big causes Spark to hang You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1712_new

[GitHub] spark pull request: [SPARK-1837] NumericRange should be partitione...

2014-05-15 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/776#discussion_r12675655 --- Diff: core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala --- @@ -128,18 +137,17 @@ private object ParallelCollectionRDD

[GitHub] spark pull request: Fix: sbt test throw an java.lang.OutOfMemoryEr...

2014-05-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/773 Fix: sbt test throw an java.lang.OutOfMemoryError: PermGen space You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark sbt_javaOptions

[GitHub] spark pull request: Improve maven plugin configuration

2014-05-15 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/786 Improve maven plugin configuration You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark maven_plugin Alternatively you can review

[GitHub] spark pull request: [WIP] [SPARK-1841]: update scalatest to versio...

2014-05-16 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/713#issuecomment-43298969 @pwendell Code style issues have been fixed.

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-16 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43309863 In your code: `sc.parallelize(1L to 2L,4).zip(sc.parallelize(11 to 12,4)).collect` = `Array[(Long, Int)] = Array((1,11), (2,12))`   This is the right

[GitHub] spark pull request: Remove compile-scoped junit dependency.

2014-05-16 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/794#issuecomment-43288330 Related work: #713

[GitHub] spark pull request: Improve maven plugin configuration

2014-05-16 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/590

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-16 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43394756 @kanzhang ``` scala> sc.parallelize((1D to 2D).by(0.2),4).collect res0: Array[Double] = Array(1.0, 1.2, 1.6, 1.8) ``` ``` scala> sc.parallelize

[GitHub] spark pull request: [SPARK-1712]: TaskDescription instance is too ...

2014-05-16 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/694#discussion_r12765545 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -140,8 +141,29 @@ class

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-16 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43396698 @kanzhang All told, we should fix the following code: `slices += r.take(sliceSize).asInstanceOf[Seq[T]]`.

[GitHub] spark pull request: Convert spark.cleaner.ttl.* to lowercase

2014-05-17 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/811 Convert spark.cleaner.ttl.* to lowercase You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark MetadataCleanerType Alternatively you can

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-18 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43433310 A simple solution ```scala object ParallelCollectionRDD { /** * Slice a collection into numSlices sub-collections. One extra thing we do here

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-18 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43461175 @kanzhang ``` scala> val d=(1D to 2D).by(0.2) d: scala.collection.immutable.NumericRange[Double] = NumericRange(1.0, 1.2, 1.4, 1.5999

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-18 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/820 [WIP][SPARK-1875]:NoClassDefFoundError: StringUtils when building against Hadoop 1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-18 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/820#issuecomment-43465158 @pwendel Do you have time to review the code?

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-18 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/820#issuecomment-43465677 Oh, I'm sorry.

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/820#issuecomment-43468949 @mateiz This problem only occurs in spark-assembly_2.10 and will not affect user testing. ``` [INFO] --- maven-dependency-plugin:2.8:tree (default-cli

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/820#issuecomment-43470508 @mateiz `spark-examples_2.10` is consistent with the situation you describe ``` [INFO] [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ spark

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/820#issuecomment-43471608 @srowen I agree with what @mateiz said. We should not exclude commons-lang.

[GitHub] spark pull request: SPARK-1875:NoClassDefFoundError: StringUtils w...

2014-05-19 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/824 SPARK-1875:NoClassDefFoundError: StringUtils when building against Hadoop 1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark

[GitHub] spark pull request: [SPARK-1875]NoClassDefFoundError: StringUtils ...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/824#issuecomment-43475257 @srowen Is the following code in line with your thoughts? https://github.com/witgo/spark/compare/SPARK-1875_new

[GitHub] spark pull request: [SPARK-1875]NoClassDefFoundError: StringUtils ...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/824#issuecomment-43476568 It has been modified.

[GitHub] spark pull request: [WIP]Improve ALS resource usage

2014-05-19 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/828 [WIP]Improve ALS resource usage Currently, in the ALS algorithm, RDDs cannot be cleaned up. You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark

[GitHub] spark pull request: [SPARK-1817] RDD.zip() should verify partition...

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/760#issuecomment-43579042 I mean that different methods of `NumericRange[Double]` give different results, so we only need to guarantee that the `slice` method returns consistent results.

[GitHub] spark pull request: [WIP][SPARK-1875]:NoClassDefFoundError: String...

2014-05-19 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/820

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43581291 @tdas CheckpointRDD is not properly cleaned.

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43581755 @mateiz Why must the checkpoint data be written to the file system?

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-19 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43583620 @mateiz It is not necessary to write it to the file system. After all, no other RDD reads it. I think the checkpoint data should be put into blockManager, so

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-20 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43589674 [The code](https://github.com/witgo/spark/commit/6d7f2408a40bf4bb2889bf66fa61bced782cdefc#diff-2b593e0b4bd6eddab37f04968baa826c) will make the checkpoint directory larger

[GitHub] spark pull request: Convert spark.cleaner.ttl.* to lowercase

2014-05-20 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/811

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-20 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43608181 @mateiz @mengxr I added a new RDD operation, `cachePoint`.

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-20 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43656940 Another [solution](https://github.com/witgo/spark/compare/cachePoint).

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-21 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43790944 @mateiz, @mengxr I am using [the code](https://github.com/witgo/spark/compare/cachePoint) to test ALS. A brief description of the test: | Item

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-21 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-43840745 @tdas You're right. The code breaks the fault-tolerance properties of RDDs. The ideal solution is automatic cleanup and rebuilding of shuffle data.

[GitHub] spark pull request: Automatically cleanup checkpoint date

2014-05-22 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/855 Automatically cleanup checkpoint date You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark cleanup_checkpoint_date Alternatively you can

[GitHub] spark pull request: Automatically cleanup checkpoint date

2014-05-22 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/855#issuecomment-43969051 @tdas Optional? Default is off?

[GitHub] spark pull request: Automatically cleanup checkpoint date

2014-05-22 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/855#issuecomment-43971489 @mridulm @tdas The code has been updated. Automatic cleanup of checkpoint data is now optional.

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-25 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-44122991 I am using [the code](https://github.com/witgo/spark/compare/cleanup_checkpoint_date_als) to test ALS. A brief description of the test: | Item | Description

[GitHub] spark pull request: SPARK-1935: Explicitly add commons-codec 1.4 a...

2014-05-26 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/889#issuecomment-44235434 spark-hive = commons-codec 1.4 spark-sql = commons-codec 1.5 ``` [INFO] [INFO

[GitHub] spark pull request: [SPARK-1930] Container memory beyond limit, we...

2014-05-27 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/894 [SPARK-1930] Container memory beyond limit, were killed You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1930 Alternatively you

[GitHub] spark pull request: [WIP][SPARK-1930] The Container is running bey...

2014-05-28 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/894#discussion_r13131123 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala --- @@ -90,6 +90,12 @@ private[yarn] class YarnAllocationHandler

[GitHub] spark pull request: [WIP][SPARK-1930] The Container is running bey...

2014-05-28 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/894#issuecomment-44424182 I agree with @sryza. It is better for Spark to handle these automatically. Of course, we can still allow users to specify a value manually.

[GitHub] spark pull request: Pluggable Diskstore for BlockManager

2014-05-28 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/907#issuecomment-44487759 @colorant This is a big change. Can you explain the reason for it?

[GitHub] spark pull request: SPARK-1714. Take advantage of AMRMClient APIs ...

2014-05-30 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/655#discussion_r13227671 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala --- @@ -105,278 +96,222 @@ private[yarn] class

[GitHub] spark pull request: In some cases, yarn does not automatically res...

2014-05-30 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/921 In some cases, yarn does not automatically restart the container You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark allocateExecutors

[GitHub] spark pull request: In some cases, yarn does not automatically res...

2014-05-31 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/921#issuecomment-44719589 @sryza When the return value of `yarnAllocator.getNumExecutorsFailed` is greater than zero, `yarnAllocator.getNumExecutorsRunning < args.numExecutors` is true forever

[GitHub] spark pull request: [WIP][SPARK-1930] The Container is running bey...

2014-05-31 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/894#discussion_r13259895 --- Diff: yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala --- @@ -92,21 +92,22 @@ class ExecutorLauncher(args

[GitHub] spark pull request: Improve ALS algorithm resource usage

2014-05-31 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/929 Improve ALS algorithm resource usage You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark improve_als Alternatively you can review

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-31 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/828

[GitHub] spark pull request: [WIP]Improve ALS algorithm resource usage

2014-05-31 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/828#issuecomment-44742037 This solution is not perfect, so I am closing it for now. The new PR is #929.

[GitHub] spark pull request: [WIP][SPARK-1930] The Container is running bey...

2014-06-01 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/894#issuecomment-4410 @mridulm Is the following code in line with your thoughts? https://github.com/witgo/spark/compare/SPARK-1930_different

[GitHub] spark pull request: update breeze to version 0.8.1

2014-06-02 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/940 update breeze to version 0.8.1 `breeze 0.8.1` depends on `scala-logging-slf4j 2.1.1`. The relevant code is in #332. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-1997] update breeze to version 0.8.1

2014-06-02 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/940#issuecomment-44911857 @markhamstra, `breeze 0.7` does not support `scala 2.11`.

[GitHub] spark pull request: [WIP][SPARK-1997] update breeze to version 0.8...

2014-06-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/940#issuecomment-44965251 I think `breeze` has no big changes from `0.7` to `0.8.1`. Of course, this conclusion has not been tested much.

[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/952#issuecomment-44983923 @CrazyJvm I think we should also modify [SparkSubmitArguments.scala#L99](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy

[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/952#issuecomment-44985234 [SparkSubmitArguments.scala#L127](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L127) We can

[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/952#issuecomment-45043437 ```scala // Global defaults. These should be keep to minimum to avoid confusing behavior. master = Option(master).getOrElse

[GitHub] spark pull request: Improve ALS algorithm resource usage

2014-06-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/929#issuecomment-45069637 @mengxr By calling `RDD.checkpoint`, `ContextCleaner` can clean up the shuffle data and reduce disk usage, as described in the table below

[GitHub] spark pull request: Improve ALS algorithm resource usage

2014-06-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/929#issuecomment-45070672 As for the data above: one iteration writes `160G` of shuffle data, so three iterations occupy `480G` of disk.

[GitHub] spark pull request: Improve ALS algorithm resource usage

2014-06-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/929#issuecomment-45071297 @mengxr Since I only have three test servers, I need more time to test your ideas.

[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...

2014-06-04 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/969 [WIP] In yarn.ClientBase spark.yarn.dist.* do not work You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark yarn_ClientBase Alternatively

[GitHub] spark pull request: [SPARK-1978] In some cases, spark-yarn does no...

2014-06-04 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/921#discussion_r13425322 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ExecutorLauncher.scala --- @@ -204,9 +204,17 @@ class ExecutorLauncher(args

[GitHub] spark pull request: [WIP] In yarn.ClientBase spark.yarn.dist.* do ...

2014-06-05 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/969#issuecomment-45224988 Spark configuration `conf/spark-defaults.conf` = ``` spark.yarn.dist.archives /toona/conf spark.executor.extraClassPath ./conf

[GitHub] spark pull request: [WIP][SPARK-1930] The Container is running bey...

2014-06-05 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/894#discussion_r13473687 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -65,6 +65,18 @@ trait ClientBase extends Logging { val

[GitHub] spark pull request: [SPARK-1978] In some cases, spark-yarn does no...

2014-06-05 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/921#discussion_r13474143 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala --- @@ -252,16 +252,12 @@ class ApplicationMaster(args

[GitHub] spark pull request: [WIP][SPARK-1477]: Add the lifecycle interface...

2014-06-05 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/379

[GitHub] spark pull request: [WIP][SPARK-1477]: Add the lifecycle interface...

2014-06-05 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/991 [WIP][SPARK-1477]: Add the lifecycle interface You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1477 Alternatively you can

[GitHub] spark pull request: SPARK-1719: spark.executor.extraLibraryPath is...

2014-06-09 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1022 SPARK-1719: spark.executor.extraLibraryPath isn't applied on yarn You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-1719

[GitHub] spark pull request: [SPARK-2947] DAGScheduler resubmit the stage i...

2014-12-02 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/1877

[GitHub] spark pull request: [SPARK-4161]Spark shell class path is not corr...

2014-12-04 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3050#issuecomment-65749267 @JoshRosen The code has been updated.

[GitHub] spark pull request: [SPARK-4763] All-pairs shortest paths algorith...

2014-12-05 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/3619#discussion_r21377222 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala --- @@ -139,6 +146,14 @@ object Pregel extends Logging { // get to send messages

[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-05 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2631#issuecomment-65880798 @ankurdave I have removed the modifications related to Spark core.

[GitHub] spark pull request: [SPARK-3623][GraphX] GraphX should support the...

2014-12-05 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/2631#issuecomment-65887724 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-4161]Spark shell class path is not corr...

2014-12-09 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3050#issuecomment-66392860 OK, I'll try. But [CoarseMesosSchedulerBackend.scala#L156](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos

[GitHub] spark pull request: [SPARK-4161]Spark shell class path is not corr...

2014-12-09 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3050#issuecomment-66399472 In my local test, it works.

[GitHub] spark pull request: [SPARK-4161]Spark shell class path is not corr...

2014-12-10 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3051#issuecomment-66557136 I'm sorry, I forgot to update this PR. The code has been updated.

[GitHub] spark pull request: [SPARK-4161]Spark shell class path is not corr...

2014-12-10 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3051#issuecomment-66563972 That seems to be an unrelated test failure.

[GitHub] spark pull request: [SPARK-4526][MLLIB]Gradient should be added ba...

2014-12-11 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/3677 [SPARK-4526][MLLIB]Gradient should be added batch computing interface. You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-4526
