[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-30 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191685596 --- Diff: python/pyspark/sql/dataframe.py --- @@ -351,8 +352,62 @@ def show(self, n=20, truncate=True, vertical=False): else

[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...

2018-05-28 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370 @viirya @gatorsmile @ueshin @felixcheung @HyukjinKwon The refactor that moves the HTML generation code out of `Dataset.scala` was done in 94f3414. Please help check whether it is appropriate

[GitHub] spark issue #21445: [SPARK-24404][SS] Increase currentEpoch when meet a Epoc...

2018-05-28 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21445 ``` Looks like the patch is needed only with #21353 #21332 #21293 as of now, right? ``` @HeartSaVioR Yes, sorry for the late explanation. The background is we are running POC based

[GitHub] spark pull request #21385: [SPARK-24234][SS] Support multiple row writers in...

2018-05-28 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21385#discussion_r191149214 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/UnsafeRowReceiver.scala --- @@ -41,11 +50,15 @@ private

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080316 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080194 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,9 +238,13 @@ class Dataset[T] private[sql]( * @param truncate

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080082 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -358,6 +357,43 @@ class Dataset[T] private[sql]( sb.toString

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080049 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080066 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080057 --- Diff: python/pyspark/sql/tests.py --- @@ -3040,6 +3040,50 @@ def test_csv_sampling_ratio(self): .csv(rdd, samplingRatio=0.5

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080044 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080037 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,30 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r191080026 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370 ``` Can we also do something a bit more generic that works for non-Jupyter notebooks as well? ``` Can we accept `spark.sql.repl.eagerEval.enabled` to control both \_\_repr

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190244648 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -358,6 +357,43 @@ class Dataset[T] private[sql]( sb.toString

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190154145 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,9 +238,13 @@ class Dataset[T] private[sql]( * @param truncate

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190154231 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -358,6 +357,43 @@ class Dataset[T] private[sql]( sb.toString

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190153907 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190153833 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r190153812 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189614136 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189614067 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189613358 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -292,31 +297,25 @@ class Dataset[T] private[sql

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189611792 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -358,6 +357,43 @@ class Dataset[T] private[sql]( sb.toString

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189603851 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +347,26 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370 Thanks for all the reviewers' comments; I addressed all of them in this commit. Please have a look. --- - To unsubscribe, e-mail

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189574938 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,9 +238,13 @@ class Dataset[T] private[sql]( * @param truncate

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189570764 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -358,6 +357,43 @@ class Dataset[T] private[sql]( sb.toString

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189570479 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -292,31 +297,25 @@ class Dataset[T] private[sql

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189569952 --- Diff: python/pyspark/sql/dataframe.py --- @@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx): self.is_cached = False

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189569437 --- Diff: python/pyspark/sql/dataframe.py --- @@ -347,13 +353,18 @@ def show(self, n=20, truncate=True, vertical=False): name | Bob

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189567614 --- Diff: python/pyspark/sql/dataframe.py --- @@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx): self.is_cached = False

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189567350 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189567315 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189567259 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189483903 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189483894 --- Diff: docs/configuration.md --- @@ -456,6 +456,29 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370 ``` this will need to escape the values to make sure it is legal html too right? ``` Yes, you're right, thanks for your guidance. The new patch handles the escaping and adds a new UT
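The escaping concern raised in this comment can be illustrated with a minimal sketch. It is not the code from the pull request: `TinyFrame` is a hypothetical stand-in for a DataFrame, and the table markup is an assumption. It only shows that cell values should pass through an HTML escaper (here the standard-library `html.escape`) before being placed in the string that `_repr_html_` returns to the notebook.

```python
import html


class TinyFrame:
    """Hypothetical stand-in for a DataFrame; not PySpark's implementation."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def _repr_html_(self):
        # Jupyter looks for _repr_html_ and renders the returned string as HTML,
        # so every header and cell value is escaped before it is embedded.
        head = "".join("<th>%s</th>" % html.escape(str(c)) for c in self.columns)
        body = "".join(
            "<tr>%s</tr>"
            % "".join("<td>%s</td>" % html.escape(str(v)) for v in row)
            for row in self.rows
        )
        return "<table border='1'><tr>%s</tr>%s</table>" % (head, body)


# A value containing markup is shown literally instead of being interpreted.
frame = TinyFrame(["name", "note"], [["Bob", "<b>bold</b> & more"]])
rendered = frame._repr_html_()
```

Without the escaping step, a cell such as `<b>bold</b> & more` would be interpreted as markup by the notebook rather than displayed as data.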

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189463652 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -237,9 +236,13 @@ class Dataset[T] private[sql]( * @param truncate

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189463098 --- Diff: python/pyspark/sql/dataframe.py --- @@ -78,6 +78,12 @@ def __init__(self, jdf, sql_ctx): self.is_cached = False

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21370#discussion_r189463079 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3056,7 +3059,6 @@ class Dataset[T] private[sql]( * view, e.g

[GitHub] spark issue #21370: [SPARK-24215][PySpark] Implement _repr_html_ for datafra...

2018-05-19 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21370 Not sure who is the right reviewer, maybe @rdblue @gatorsmile ? Could you help me check whether it is the right implementation for the discussion in the dev list

[GitHub] spark pull request #21370: [SPARK-24215][PySpark] Implement _repr_html_ for ...

2018-05-19 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/21370 [SPARK-24215][PySpark] Implement _repr_html_ for dataframes in PySpark ## What changes were proposed in this pull request? Implement _repr_html_ for PySpark while in notebook and add

[GitHub] spark pull request #21353: [SPARK-24036][SS] Scheduler changes for continuou...

2018-05-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21353#discussion_r188975680 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21353: [SPARK-24036][SS] Scheduler changes for continuou...

2018-05-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21353#discussion_r188974718 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -140,6 +140,7 @@ object SparkEnv extends Logging { private[spark] val

[GitHub] spark pull request #21353: [SPARK-24036][SS] Scheduler changes for continuou...

2018-05-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21353#discussion_r188974568 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -213,6 +213,12 @@ private[spark] sealed trait MapOutputTrackerMessage

[GitHub] spark pull request #21353: [SPARK-24036][SS] Scheduler changes for continuou...

2018-05-17 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21353#discussion_r188974319 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -88,14 +96,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag

[GitHub] spark pull request #21353: [SPARK-24036][SS] Scheduler changes for continuou...

2018-05-17 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/21353 [SPARK-24036][SS] Scheduler changes for continuous processing shuffle support ## What changes were proposed in this pull request? This is the last part of the preview PRs, the mainly

[GitHub] spark issue #21332: [SPARK-24236][SS] Continuous replacement for ShuffleExch...

2018-05-17 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21332 > As discussed in the other PR, I'm not sure about how we're integrating with the scheduler here, so I can't really give a more detailed review at this point. My bad, I'm prepar

[GitHub] spark issue #21114: [SPARK-22371][CORE] Return None instead of throwing an e...

2018-05-16 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21114 retest this please

[GitHub] spark pull request #21337: [SPARK-24234][SS] Reader for continuous processin...

2018-05-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21337#discussion_r188604001 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/ContinuousShuffleReadRDD.scala --- @@ -0,0 +1,64

[GitHub] spark pull request #21337: [SPARK-24234][SS] Reader for continuous processin...

2018-05-16 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21337#discussion_r188601016 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/shuffle/UnsafeRowReceiver.scala --- @@ -0,0 +1,56

[GitHub] spark issue #21332: [SPARK-24236][SS] Continuous replacement for ShuffleExch...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21332 cc @jose-torres As we discussed in #21293, the main difference between us is whether we can reuse the current implementation of the scheduler and shuffle, but in this part about

[GitHub] spark pull request #21332: [SPARK-24236][SS] Continuous replacement for Shuf...

2018-05-15 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/21332 [SPARK-24236][SS] Continuous replacement for ShuffleExchangeExec ## What changes were proposed in this pull request? 1. New RDD named ContinuousShuffleRowRDD 2. New case class

[GitHub] spark issue #21293: [SPARK-24237][SS] Continuous shuffle dependency and map ...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21293 @jose-torres Many thanks for your advice and guidance! I found that the main difference between us is whether we can reuse the current implementation of the scheduler and shuffle. I marked in your

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r188277683 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r188273722 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -769,6 +796,43 @@ private[spark] class MapOutputTrackerWorker(conf

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r188270290 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -88,14 +90,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-15 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r188269208 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -65,15 +65,17 @@ abstract class NarrowDependency[T](_rdd: RDD[T]) extends

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-05-14 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r188152980 --- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala --- @@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite

[GitHub] spark issue #21114: [SPARK-22371][CORE] Return None instead of throwing an e...

2018-05-13 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21114 cc @cloud-fan

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-05-13 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r187823469 --- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala --- @@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-05-12 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r187763308 --- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala --- @@ -237,6 +236,65 @@ class AccumulatorSuite extends SparkFunSuite

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-05-12 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r187763285 --- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala --- @@ -209,10 +209,8 @@ class AccumulatorSuite extends SparkFunSuite

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187599741 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -769,6 +796,43 @@ private[spark] class MapOutputTrackerWorker(conf

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187598787 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ContinuousShuffleMapTask.scala --- @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187598365 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -227,6 +228,7 @@ object SparkEnv extends Logging

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187598100 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -233,6 +239,28 @@ private[spark] class MapOutputTrackerMasterEndpoint

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187597922 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -88,14 +90,53 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-11 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21293#discussion_r187596748 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -65,15 +65,17 @@ abstract class NarrowDependency[T](_rdd: RDD[T]) extends

[GitHub] spark issue #21293: [SPARK-24237][SS] Continuous shuffle dependency and map ...

2018-05-10 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21293 cc @jose-torres @zsxwing

[GitHub] spark pull request #21293: [SPARK-24237][SS] Continuous shuffle dependency a...

2018-05-10 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/21293 [SPARK-24237][SS] Continuous shuffle dependency and map output tracker ## What changes were proposed in this pull request? As per our discussion in [jira comment](https

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-05-08 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r186764630 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,304

[GitHub] spark pull request #21199: [SPARK-24127][SS] Continuous text socket source

2018-05-08 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21199#discussion_r186765402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala --- @@ -0,0 +1,304

[GitHub] spark issue #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <= rampU...

2018-05-03 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21188 @maasg as I commented in #21194, I think we should not change the behavior when `seconds > rampUpTimeSeconds`. Maybe that is more important than smo

[GitHub] spark pull request #21188: [SPARK-24046][SS] Fix rate source rowsPerSecond <...

2018-05-03 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21188#discussion_r185852663 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala --- @@ -107,14 +107,25 @@ object

[GitHub] spark pull request #21194: [SPARK-24046][SS] Fix rate source when rowsPerSec...

2018-05-03 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21194#discussion_r185851172 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProviderSuite.scala --- @@ -173,55 +173,154 @@ class

[GitHub] spark pull request #21194: [SPARK-24046][SS] Fix rate source when rowsPerSec...

2018-05-01 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21194#discussion_r185252544 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala --- @@ -101,25 +101,10 @@ object

[GitHub] spark pull request #21194: [SPARK-24046][SS] Fix rate source when rowsPerSec...

2018-05-01 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21194#discussion_r185252360 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RateStreamProvider.scala --- @@ -101,25 +101,10 @@ object

[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...

2018-04-29 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184882338 --- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala --- @@ -20,12 +20,12 @@ package org.apache.spark.io import

[GitHub] spark pull request #21175: [SPARK-24107][CORE] ChunkedByteBuffer.writeFully ...

2018-04-29 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21175#discussion_r184882396 --- Diff: core/src/test/scala/org/apache/spark/io/ChunkedByteBufferSuite.scala --- @@ -20,12 +20,12 @@ package org.apache.spark.io import

[GitHub] spark pull request #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) qu...

2018-04-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21177#discussion_r184725980 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala --- @@ -78,7 +81,7 @@ object TPCDSQueryBenchmark

[GitHub] spark pull request #21177: [SPARK-24111][SQL] Add the TPCDS v2.7 (latest) qu...

2018-04-27 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21177#discussion_r184724132 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala --- @@ -87,10 +90,20 @@ object

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-26 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 > Have you applied this patch: #17955 ? No, this happened on Spark 2.1. Thanks xingbo & wenchen, I'll backport this patch to our internal Spark 2.1. > Tha

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-25 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r184276403 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-25 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r184274946 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-25 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r184260597 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-25 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r184260210 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -2399,6 +2399,84 @@ class DAGSchedulerSuite extends

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-25 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r184109204 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1266,6 +1266,9 @@ class DAGScheduler

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-04-24 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r183772627 --- Diff: core/src/test/scala/org/apache/spark/AccumulatorSuite.scala --- @@ -209,10 +209,8 @@ class AccumulatorSuite extends SparkFunSuite

[GitHub] spark pull request #21114: [SPARK-22371][CORE] Return None instead of throwi...

2018-04-24 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21114#discussion_r183770468 --- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala --- @@ -258,14 +258,8 @@ private[spark] object AccumulatorContext

[GitHub] spark issue #21136: [SPARK-24061][SS]Add TypedFilter support for continuous ...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/21136 +1 for this. We found this when a CP app used filter with functions; the current implementation can support it. cc @jose-torres @zsxwing @tdas

[GitHub] spark pull request #21136: [SPARK-24061][SS]Add TypedFilter support for cont...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21136#discussion_r183604217 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/UnsupportedOperationsSuite.scala --- @@ -771,7 +778,16 @@ class

[GitHub] spark pull request #20946: [SPARK-23565] [SQL] New error message for structu...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20946#discussion_r183447816 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala --- @@ -39,7 +39,9 @@ case class OffsetSeq(offsets: Seq

[GitHub] spark pull request #20946: [SPARK-23565] [SQL] New error message for structu...

2018-04-23 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20946#discussion_r183447988 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/OffsetSeqLogSuite.scala --- @@ -125,6 +125,19 @@ class OffsetSeqLogSuite

[GitHub] spark pull request #21116: [SPARK-24038][SS] Refactor continuous writing to ...

2018-04-21 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21116#discussion_r183224838 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/WriteToContinuousDataSourceExec.scala --- @@ -0,0 +1,126

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-21 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 ![image](https://user-images.githubusercontent.com/4833765/39091106-ff11d0a6-461f-11e8-968f-7fcbe6652bb3.png) Stages 0\1\2\3 are the same as 20\21\22\23 in this screenshot; stage 2's shuffleId

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @Ngone51 Ah, maybe I know how the description misled you: in item 5 of the description, 'this stage' refers to 'Stage 2' in the screenshot. Thanks for checking, I modified the description

[GitHub] spark issue #20930: [SPARK-23811][Core] FetchFailed comes before Success of ...

2018-04-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/20930 @Ngone51 You can check the screenshot in detail: stage 2's shuffleID is 1, but stage 3 failed because of a missing output for shuffle '0'! So here the skip of stage 2 caused stage 3 to get an error

[GitHub] spark pull request #20930: [SPARK-23811][Core] FetchFailed comes before Succ...

2018-04-20 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/20930#discussion_r183198368 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1266,6 +1266,9 @@ class DAGScheduler
