[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...

2016-06-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the issue: https://github.com/apache/spark/pull/13651 LGTM thank you --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #13678: [SPARK-15824][SQL] Execute WITH .... INSERT ... statemen...

2016-06-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the issue: https://github.com/apache/spark/pull/13678 LGTM

[GitHub] spark issue #13524: [SPARK-15776][SQL] Type coercion incorrect

2016-06-12 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the issue: https://github.com/apache/spark/pull/13524 @rxin Done. Please help review, thank you.

[GitHub] spark pull request #13561: [SPARK-15824][SQL] Run 'with ... insert ... selec...

2016-06-08 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/13561 [SPARK-15824][SQL] Run 'with ... insert ... select' failed when use spark thriftserver ## What changes were proposed in this pull request? Dataset.collect will call

[GitHub] spark pull request #13524: [SPARK-15776] Type coercion incorrect

2016-06-06 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/13524 [SPARK-15776] Type coercion incorrect ## What changes were proposed in this pull request? Update type coercion order, details see https://issues.apache.org/jira/browse/SPARK-15776

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-10-21 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/7417

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-10-21 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-150096989 @cloud-fan OK.

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-10-13 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7417#discussion_r41956822 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala --- @@ -274,12 +275,30 @@ private[sql] abstract class

[GitHub] spark pull request: [SPARK-9596][SQL]treat hadoop classes as share...

2015-10-13 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7931#discussion_r41853694 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala --- @@ -124,6 +124,7 @@ private[hive] class

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-10-13 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7417#discussion_r41855351 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/CartesianProduct.scala --- @@ -28,9 +28,17 @@ import

[GitHub] spark pull request: [SPARK-9522][SQL] SparkSubmit process can not ...

2015-09-17 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7853#issuecomment-141082236 @andrewor14 I have set stopped to private[spark], @liancheng @yhuai any thoughts?

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-09-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-138554397 @scwf done. @zsxwing updated code.

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-09-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7417#discussion_r38504238 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/CartesianProduct.scala --- @@ -27,16 +27,27 @@ import

[GitHub] spark pull request: [SPARK-9519][Yarn] Confirm stop sc successfull...

2015-08-04 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7846#issuecomment-127815406 @vanzin @srowen Updated, thank you!

[GitHub] spark pull request: [SPARK-9519][Yarn] Confirm stop sc successfull...

2015-08-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7846#issuecomment-126879883 @srowen We need call interrupt in YarnClientSchedulerBackend.stop(), details see PR #5305 and PR #3143, so even if we call sc.stop() in the finally block

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-08-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-126880682 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-9522][SQL] SparkSubmit process can not ...

2015-08-01 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/7853 [SPARK-9522][SQL] SparkSubmit process can not exit if kill application when HiveThriftServer was starting When we start HiveThriftServer, we will start SparkContext first, then start

[GitHub] spark pull request: [SPARK-9519][Yarn] Confirm stop sc successfull...

2015-08-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7846#issuecomment-12629 Yes, this change doesn't stop this sequence from happening. As monitor thread is daemon thread, we don't need call interrupt as sc.stop(). Below I am not very

[GitHub] spark pull request: [SPARK-9519][Yarn] Confirm stop sc successfull...

2015-07-31 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/7846 [SPARK-9519][Yarn] Confirm stop sc successfully when application was killed Currently, when we kill application on Yarn, then will call sc.stop() at Yarn application state monitor thread

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-31 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-126856094 @hvanhovell Good suggestion, thank you, updated.

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-22 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-123925858 @hvanhovell I use tpc-ds to test, for below SQL clause: ``` with single_value as ( select 1 tpcds_val from date_dim ) select sum(ss_quantity

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-21 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7417#discussion_r35180395 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastCartesianProduct.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/7417#discussion_r34754893 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/CartesianProduct.scala --- @@ -34,7 +34,15 @@ case class CartesianProduct(left

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-15 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/7417 [SPARK-9066][SQL] Improve cartesian performance see jira https://issues.apache.org/jira/browse/SPARK-9066 You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-9066][SQL] Improve cartesian performanc...

2015-07-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7417#issuecomment-121588200 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-8811][SQL] Read array struct data from ...

2015-07-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7209#issuecomment-119504817 @liancheng OK, no problem. Thank you!

[GitHub] spark pull request: [SPARK-8811][SQL] Read array struct data from ...

2015-07-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/7209

[GitHub] spark pull request: [SPARK-8811][SQL] Read array struct data from ...

2015-07-06 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7209#issuecomment-119064179 @liancheng I have updated, please help to review, thank you!

[GitHub] spark pull request: [SPARK-8811][SQL] Read array struct data from ...

2015-07-05 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/7209#issuecomment-118699916 @liancheng OK, good, thank you.

[GitHub] spark pull request: [SPARK-8811][SQL] Read array struct data from ...

2015-07-03 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/7209 [SPARK-8811][SQL] Read array struct data from parquet error JIRA:https://issues.apache.org/jira/browse/SPARK-8811 For example: we have a table: ``` t1(c1 string, c2

[GitHub] spark pull request: [SPARK-8162][BUILD] Run spark-shell cause Null...

2015-06-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6704#issuecomment-110231290 Close it first as PR #6711 can fix NPE, if we find the root cause of why the `@VisibleForTesting` annotation causes a NPE in the shell then reopen

[GitHub] spark pull request: [SPARK-8162][BUILD] Run spark-shell cause Null...

2015-06-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/6704

[GitHub] spark pull request: Run spark-shell cause NullPointerException

2015-06-08 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/6704 Run spark-shell cause NullPointerException see jira https://issues.apache.org/jira/browse/SPARK-8162 JDK: 1.8.0_40 Hadoop: 2.7.0

[GitHub] spark pull request: [SPARK-8162][BUILD] Run spark-shell cause Null...

2015-06-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6704#issuecomment-109965178 @srowen I built Spark with the command **`mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.7.0 -Phive -Phive-thriftserver -Psparkr -DskipTests package`** and run spark

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-06-07 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-109820635 @srowen @vanzin This PR can cleanup correctly. I just mean without this PR even if we add KILLED status on ApplicationMaster to check, then it can not cleanup

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-06-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/6409#discussion_r31500453 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -91,51 +91,54 @@ private[spark] class Client( * available

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-06-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-108165847 @vanzin I have tested again, and below is the result of final status when we use yarn to kill the application: \ | YARN UI | Driver Log | AppMaster

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-06-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-107399469 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-06-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/6409#discussion_r31490416 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -825,6 +813,9 @@ private[spark] class Client( * throw

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-05-31 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/6409#discussion_r31397611 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -849,6 +852,27 @@ private[spark] class Client

[GitHub] spark pull request: [SPARK-7026] [SQL] fix left semi join with equ...

2015-05-29 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5643#discussion_r31304130 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastLeftSemiJoinHash.scala --- @@ -32,36 +32,59 @@ case class

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-05-28 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-106286738 @tgravescs yes, if yarn do it is better, but now it didn't, so as @vanzin said may be we can do it when launcher, thank you!

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-05-26 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6409#issuecomment-105715873 @tgravescs I have tested below: max retries is default, use yarn -kill to kill application when application start running, run SparkPi with parameter 2

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-26 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5887#issuecomment-105516789 ping

[GitHub] spark pull request: [SPARK-7705][Yarn] Cleanup of .sparkStaging di...

2015-05-26 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/6409 [SPARK-7705][Yarn] Cleanup of .sparkStaging directory fails if application is killed As I have tested, if we cancel or kill the app then the final status may be undefined, killed

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-17 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5887#issuecomment-102756602 @andrewor14 what's your opinion?

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-14 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5887#issuecomment-101951195 @davies what's your opinion now?

[GitHub] spark pull request: [SPARK-7595][SQL] Window will cause resolve fa...

2015-05-13 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6114#issuecomment-10184 @scwf @yhuai Done, thank you!

[GitHub] spark pull request: [SPARK-7595][SQL] Window will cause resolve fa...

2015-05-13 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/6114 [SPARK-7595][SQL] Window will cause resolve failed with self join for example: table: src(key string, value string) sql: with v1 as(select key, count(value) over (partition by key

[GitHub] spark pull request: [SPARK-7526][SparkR] Specify ip of RBackend, M...

2015-05-11 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/6053 [SPARK-7526][SparkR] Specify ip of RBackend, MonitorServer and RRDD Socket server These R process only used to communicate with JVM process on local, so binding to localhost is more

[GitHub] spark pull request: [SPARK-7526][SparkR] Specify ip of RBackend, M...

2015-05-11 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/6053#issuecomment-101105615 @shivaram Yes, I also think there should be no problems, as it is not system dependent. I will test this on Windows, thank you!

[GitHub] spark pull request: [Minor][PySpark] Set PYTHONPATH to python/lib/...

2015-05-10 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/6047 [Minor][PySpark] Set PYTHONPATH to python/lib/pyspark.zip rather than python/pyspark As PR#5580 we have create pyspark.zip on building and set PYTHONPATH to python/lib/pyspark.zip, so

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-05 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5887#discussion_r29729314 --- Diff: python/pyspark/shuffle.py --- @@ -362,7 +362,9 @@ def _spill(self): self.spills += 1 gc.collect() # release

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-04 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5887 [SPARK-7339][PySpark] PySpark shuffle spill memory sometimes are not correct

[GitHub] spark pull request: [SPARK-7339][PySpark] PySpark shuffle spill me...

2015-05-04 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5887#issuecomment-98901992 Jenkins retest this please.

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-29 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5580#issuecomment-97346388 If user don't use make-distribution.sh and just compile Spark use maven or sbt, then don't have pyspark.zip. So we really don't need to do the zip in the code

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-27 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-96867560 @tgravescs yes

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-25 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-96145643 @andrewor14 @sryza how about your opinions? thanks. @lianhuiwang please help me review this, thanks.

[GitHub] spark pull request: [PySpark][Minor] Update sql example, so that c...

2015-04-24 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5684 [PySpark][Minor] Update sql example, so that can read file correctly To run Spark, default will read file from HDFS if we don't set the schema.

[GitHub] spark pull request: [SPARK-5689][Doc] Document what can be run in ...

2015-04-23 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/5490

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-22 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-95102969 @andrewor14 Sorry, I have been busy these days; I have now updated the code. ^-^

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-19 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-94331295 @lianhuiwang OK.

[GitHub] spark pull request: [SPARK-6604][PySpark]Specify ip of python serv...

2015-04-17 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5256#issuecomment-93915251 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-16 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-93705830 @andrewor14 @sryza Done, thanks.

[GitHub] spark pull request: [SPARK-6869][PySpark] Add pyspark archives pat...

2015-04-16 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-93724717 @andrewor14 @sryza @WangTaoTheTonic As I have test again, if we install Spark on each node, then we can set spark.executorEnv.PYTHONPATH=${SPARK_HOME}/python

[GitHub] spark pull request: [SPARK-6604][PySpark]Specify ip of python serv...

2015-04-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5256#issuecomment-93650104 @srowen OK, thanks. Jenkins, test this please.

[GitHub] spark pull request: [SPARK-6869][PySpark] Pass PYTHONPATH to execu...

2015-04-15 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5478#issuecomment-93270239 @andrewor14 @sryza Yes, to assume that the python files will already be present on the slave machines is not very reasonable. But if user want to use PySpark

[GitHub] spark pull request: [SPARK-6870][Yarn] Catch InterruptedException ...

2015-04-13 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5479#discussion_r28231958 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -128,10 +128,14 @@ private[spark] class

[GitHub] spark pull request: [SPARK-5689][Doc] Document what can be run in ...

2015-04-13 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5490 [SPARK-5689][Doc] Document what can be run in different YARN modes

[GitHub] spark pull request: [SPARK-6870][Yarn] Catch InterruptedException ...

2015-04-12 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5479#discussion_r28210698 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -128,10 +128,14 @@ private[spark] class

[GitHub] spark pull request: [SPARK-6869][PySpark] Pass PYTHONPATH to execu...

2015-04-12 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5478 [SPARK-6869][PySpark] Pass PYTHONPATH to executor, so that executor can read pyspark file from local file system on executor node From SPARK-1920 and SPARK-1520 we know PySpark on Yarn can

[GitHub] spark pull request: [SPARK-6870][Yarn] Catch InterruptedException ...

2015-04-12 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5479 [SPARK-6870][Yarn] Catch InterruptedException when yarn application state monitor thread been interrupted On PR #5305 we interrupt the monitor thread but forget to catch

[GitHub] spark pull request: [SPARK-4346][SPARK-3596][YARN] Commonize the m...

2015-04-07 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5305#discussion_r27939765 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -127,23 +127,11 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4346][SPARK-3596][YARN] Commonize the m...

2015-04-05 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5305#discussion_r2372 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -559,50 +560,56 @@ private[spark] class Client( var lastState

[GitHub] spark pull request: [SPARK-4346][SPARK-3596][YARN] Commonize the m...

2015-04-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5305#issuecomment-88838919 @srowen unit tests failed when running the Python app in yarn-cluster mode; I don't think this was caused by this PR, please ask jenkins to retest, thank you.

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-04-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5292#discussion_r27636093 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -125,6 +125,7 @@ private[spark] class

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-04-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5292#discussion_r27642711 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -125,6 +125,7 @@ private[spark] class

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-04-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5292#discussion_r27647902 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -125,6 +125,7 @@ private[spark] class

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-04-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/5292

[GitHub] spark pull request: [SPARK-4346][SPARK-3596][YARN] Commonize the m...

2015-04-01 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5305 [SPARK-4346][SPARK-3596][YARN] Commonize the monitor logic 1. YarnClientSchedulerBackend.asyncMonitorApplication use Client.monitorApplication so that commonize the monitor logic 2. Support

[GitHub] spark pull request: [SPARK-1502][YARN]Add config option to not inc...

2015-04-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5294#issuecomment-88708773 @tgravescs @srowen @sryza As I have retested, if we don't populate the hadoop classpath, then in all cases it doesn't work. This PR can't solve this issue, i

[GitHub] spark pull request: [SPARK-1502][YARN]Add config option to not inc...

2015-04-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/5294

[GitHub] spark pull request: [SPARK-4346][SPARK-3596][YARN] Commonize the m...

2015-04-01 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/5305#issuecomment-88752700 Jenkins, retest please

[GitHub] spark pull request: [SPARK-1502][YARN]Add config option to not inc...

2015-03-31 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5294 [SPARK-1502][YARN]Add config option to not include yarn/mapred cluster classpath You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-03-31 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5292 [SPARK-3596][YARN]Support changing the yarn client monitor interval You can merge this pull request into a Git repository by running: $ git pull https://github.com/Sephiroth-Lin/spark

[GitHub] spark pull request: [SPARK-3596][YARN]Support changing the yarn cl...

2015-03-31 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/5292#discussion_r27540657 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala --- @@ -125,6 +125,7 @@ private[spark] class

[GitHub] spark pull request: Specify ip of python server scoket

2015-03-30 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/5256 Specify ip of python server scoket The driver currently starts a server socket bound to a wildcard IP; binding to 127.0.0.0 is more reasonable, since only the local Python process uses it. /cc @davies
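
The idea in this pull request, binding a local-only server socket to a loopback address rather than the wildcard address, can be sketched in a few lines. This is a minimal illustration in Python, not the Spark driver code changed by the PR. The function name open_local_server is hypothetical, and 127.0.0.1 is assumed here as the conventional IPv4 loopback address (the message above writes 127.0.0.0).

```python
import socket


def open_local_server(host: str = "127.0.0.1") -> socket.socket:
    """Return a listening TCP socket bound to the loopback interface only.

    Hypothetical helper for illustration; the default host is an assumption.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS to pick any free port.
    server.bind((host, 0))
    # A backlog of 1 is enough when a single local client is expected.
    server.listen(1)
    return server


server = open_local_server()
bound_host, bound_port = server.getsockname()
print(f"listening on {bound_host}:{bound_port}")
server.close()
```

A socket bound this way accepts connections only from processes on the same machine, whereas binding to the wildcard address ("0.0.0.0" or an empty host string) would accept connections on every network interface.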

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-03-05 Thread Sephiroth-Lin
Github user Sephiroth-Lin closed the pull request at: https://github.com/apache/spark/pull/4620

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-03-02 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-76895189 @srowen OK, please help close this.

[GitHub] spark pull request: [SPARK-5801] [core] Avoid creating nested dire...

2015-02-24 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/4747#discussion_r25322767 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -728,6 +746,11 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-24 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-75919413 @srowen Since PR #4747 will cache the local root directories, we can close this PR first. For PR #4747, I think we also need to remove the local root directories

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-18 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74864163 @srowen OK, thank you. If this subdirectory is really needed, maybe we can add code to delete it after JVM exit or sc.stop().

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-18 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74860104 @srowen The function getOrCreateLocalRootDirs creates a subdirectory for the root local dir, so if we call getLocalDir, it will create a subdirectory for root

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-16 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4620#issuecomment-74604408 @srowen Yes, this is the same as SPARK-5801. In standalone mode, the worker creates temp directories for the executor, so if we create an unnecessary directory for local root

[GitHub] spark pull request: [SPARK-5830][Core]Don't create unnecessary dir...

2015-02-15 Thread Sephiroth-Lin
GitHub user Sephiroth-Lin opened a pull request: https://github.com/apache/spark/pull/4620 [SPARK-5830][Core]Don't create unnecessary directory for local root dir Currently an unnecessary directory is created for the local root directory, and this directory is not deleted after

[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-10 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4412#issuecomment-73682144 @srowen Thank you, please check again.

[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-10 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/4412#discussion_r24404265 --- Diff: core/src/main/scala/org/apache/spark/HttpFileServer.scala --- @@ -50,6 +50,15 @@ private[spark] class HttpFileServer( def stop

[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-09 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on the pull request: https://github.com/apache/spark/pull/4412#issuecomment-73655978 @srowen Thank you. I have now added a member that stores a reference to the tmp dir if it was created; please check again.

[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/4412#discussion_r24285284 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -93,6 +93,14 @@ class SparkEnv ( // actorSystem.awaitTermination

[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-08 Thread Sephiroth-Lin
Github user Sephiroth-Lin commented on a diff in the pull request: https://github.com/apache/spark/pull/4412#discussion_r24305832 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -93,6 +93,19 @@ class SparkEnv ( // actorSystem.awaitTermination
