[GitHub] spark issue #18554: [SPARK-21306][ML] OneVsRest should cache weightCol if ne...

2017-07-06 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/18554 I guess we also need to update the python part: https://github.com/apache/spark/blob/v2.2.0-rc6/python/pyspark/ml/classification.py#L1563 --- If your project is set up for it, you can reply

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-06-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r122612344 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -502,6 +526,23

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-06-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r122611850 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -291,6 +300,19

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-06-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r122615540 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -502,6 +526,23

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-06-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r122612179 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala --- @@ -502,6 +526,23

[GitHub] spark pull request #18098: [SPARK-16944][Mesos] Improve data locality when l...

2017-06-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/18098#discussion_r122615907 --- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackendSuite.scala --- @@ -586,6

[GitHub] spark issue #17750: [SPARK-4899][MESOS] Support for Checkpointing on Coarse ...

2017-06-08 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17750 ping @srowen, I think this PR is ready to merge

[GitHub] spark pull request #17750: [SPARK-4899][MESOS] Support for Checkpointing on ...

2017-05-29 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17750#discussion_r119009317 --- Diff: docs/running-on-mesos.md --- @@ -516,6 +516,16 @@ See the [configuration page](configuration.html) for information on Spark config

[GitHub] spark issue #17750: [SPARK-4899][MESOS] Support for checkpointing on Coarse ...

2017-05-12 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17750 > Do you then think it would be a viable option to enable it by default on Coarse grained and have it not used in Fine-grained. SGTM, especially considering fine-grained mode is alre

[GitHub] spark issue #17750: [SPARK-4899][MESOS] Support for checkpointing on Coarse ...

2017-05-05 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17750 > Yes, It is true that there is an associated overhead in both modes, that's why the defaults have not been changed. i.e. Default behavior is not to checkpoint. The overhead in fine-grai

[GitHub] spark issue #17519: [SPARK-15352][Doc] follow-up: add configuration docs for...

2017-05-05 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17519 Thanks @shubhamchopra. Could you please help review & merge this one, @cloud-fan?

[GitHub] spark issue #17750: [SPARK-4899][MESOS] Support for checkpointing on Coarse ...

2017-05-03 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17750 IMO we should not enable checkpointing in fine-grained mode, because with checkpointing enabled, Mesos agents would persist all status updates to disk, which means a high I/O cost because fine-grained

[GitHub] spark issue #17519: [SPARK-15352][Doc] follow-up: add configuration docs for...

2017-05-02 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17519 ping @shubhamchopra @cloud-fan

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-04-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r109680926 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -394,6 +394,68 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-04-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r109681031 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -394,6 +394,68 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-04-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17088#discussion_r109682527 --- Diff: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala --- @@ -394,6 +394,68 @@ class DAGSchedulerSuite extends SparkFunSuite

[GitHub] spark issue #17519: [SPARK-15352][Doc] follow-up: add configuration docs for...

2017-04-03 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17519 @shubhamchopra @cloud-fan PTAL

[GitHub] spark pull request #17519: [SPARK-15352][Doc] follow-up: add configuration d...

2017-04-03 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/17519 [SPARK-15352][Doc] follow-up: add configuration docs for topology-aware block replication ## What changes were proposed in this pull request? Add configuration docs for topology-aware block

[GitHub] spark issue #17051: [SPARK-17075][SQL] Follow up: fix file line ending and i...

2017-02-24 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17051 jenkins retest this please

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-23 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17051#discussion_r10239 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -398,6 +398,27 @@ class

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-23 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17051#discussion_r102888024 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -398,6 +398,27 @@ class

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-23 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17051#discussion_r102887655 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -1,511 +1,509

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-23 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/17051#discussion_r102887355 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -1,511 +1,509

[GitHub] spark pull request #17051: [SPARK-17075][SQL] Follow up: fix file line endin...

2017-02-23 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/17051 [SPARK-17075][SQL] Follow up: fix file line ending and improve the tests ## What changes were proposed in this pull request? Fixed the line ending of `FilterEstimation.scala`. Also improved

[GitHub] spark issue #17051: [SPARK-17075][SQL] Follow up: fix file line ending and i...

2017-02-23 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/17051 cc @ron8hu @cloud-fan

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-23 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r102707494 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -0,0 +1,389

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-19 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r101912584 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,623

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-19 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r101911700 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,531

[GitHub] spark pull request #16395: [SPARK-17075][SQL] implemented filter estimation

2017-02-19 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16395#discussion_r101912258 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -0,0 +1,531

[GitHub] spark issue #16984: [SPARK-19550] Follow-up: fixed a typo that fails the dev...

2017-02-18 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/16984 That's how the shell's "default value" works. FYI http://www.tldp.org/LDP/abs/html/parameter-substitution.html

[GitHub] spark issue #16984: [SPARK-19550] Follow-up: fixed a typo that fails the dev...

2017-02-18 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/16984 cc @srowen

[GitHub] spark pull request #16984: [SPARK-19550] Follow-up: fixed a typo that fails ...

2017-02-18 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/16984 [SPARK-19550] Follow-up: fixed a typo that fails the dev/make-distribution.sh script. ## What changes were proposed in this pull request? Fixed a typo in `dev/make-distribution.sh` script

[GitHub] spark pull request #16893: [SPARK-19555][SQL] Improve the performance of Str...

2017-02-10 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/16893 [SPARK-19555][SQL] Improve the performance of StringUtils.escapeLikeRegex method ## What changes were proposed in this pull request? Copied from [SPARK-19555](https://issues.apache.org

[GitHub] spark issue #16533: [SPARK-19160][PYTHON][SQL] Add udf decorator

2017-01-30 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/16533 What about also supporting the type name without the parentheses, as syntactic sugar? e.g.: ```python @udf(returnType=IntegerType) # instead of IntegerType() def f

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

2017-01-19 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/16593 I just found "create table using hive " (without "select ... from", i.e. the non-CTAS form) is handled by `CreateTableCommand` ([source](https://github.com/apache/spark/blob/bcc

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-18 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96774234 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -87,8 +101,8 @@ case class

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348515 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -1343,17 +1343,41 @@ class HiveDDLSuite sql

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348791 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -45,6 +46,25 @@ case class

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96349044 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -183,9 +183,15 @@ case class CatalogTable

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96349144 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -183,9 +183,15 @@ case class CatalogTable

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348696 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -88,7 +108,9 @@ case class

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16593#discussion_r96348933 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -1343,17 +1343,41 @@ class HiveDDLSuite sql

[GitHub] spark pull request #16189: [SPARK-18761][CORE] Introduce "task reaper" to ov...

2016-12-11 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16189#discussion_r91856289 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -432,6 +458,78 @@ private[spark] class Executor

[GitHub] spark pull request #16189: [SPARK-18761][CORE] Introduce "task reaper" to ov...

2016-12-11 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16189#discussion_r91855204 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -432,6 +458,78 @@ private[spark] class Executor

[GitHub] spark pull request #16189: [SPARK-18761][CORE] Introduce "task reaper" to ov...

2016-12-11 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16189#discussion_r91855053 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -432,6 +458,78 @@ private[spark] class Executor

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774898 --- Diff: python/setup.py --- @@ -69,10 +69,13 @@ EXAMPLES_PATH = os.path.join(SPARK_HOME, "examples/src/main/python") SC

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774685 --- Diff: python/MANIFEST.in --- @@ -17,6 +17,8 @@ global-exclude *.py[cod] __pycache__ .DS_Store recursive-include deps/jars *.jar graft deps

[GitHub] spark pull request #16082: [SPARK-18652][PYTHON] Include the example data an...

2016-12-04 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/16082#discussion_r90774620 --- Diff: python/setup.py --- @@ -69,10 +69,15 @@ EXAMPLES_PATH = os.path.join(SPARK_HOME, "examples/src/main/python") SC

[GitHub] spark issue #16082: [SPARK-18652] Include the example data and third-party l...

2016-12-02 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/16082 @holdenk @rxin Can we merge this?

[GitHub] spark pull request #16082: [SPARK-18652] Include the data in pyspark package...

2016-11-30 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/16082 [SPARK-18652] Include the data in pyspark package. ## What changes were proposed in this pull request? Since we already include the python examples in the pyspark package, we should

[GitHub] spark pull request #16049: [SPARK-16282][SQL] Follow-up: remove "percentile"...

2016-11-28 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/16049 [SPARK-16282][SQL] Follow-up: remove "percentile" from temp function detection after implementing it natively ## What changes were proposed in this pull request? In #1576

[GitHub] spark pull request #14136: [SPARK-16282][SQL] Implement percentile SQL funct...

2016-11-27 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/14136#discussion_r89686604 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala --- @@ -0,0 +1,262 @@ +/* + * Licensed

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-24 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r89476391 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -17,19 +17,22 @@ package org.apache.spark.storage

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-24 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r89487599 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -305,40 +312,82 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-24 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r89486838 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -108,6 +113,9 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-24 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r89475841 --- Diff: core/src/main/scala/org/apache/spark/shuffle/BlockStoreShuffleReader.scala --- @@ -42,24 +42,21 @@ private[spark] class BlockStoreShuffleReader[K

[GitHub] spark issue #15684: [SPARK-18171][MESOS] Show correct framework address in m...

2016-11-06 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15684 @zsxwing @mgummelt could you help review this PR?

[GitHub] spark pull request #15541: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-11-01 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15541#discussion_r85924864 --- Diff: docs/configuration.md --- @@ -1350,6 +1350,20 @@ Apart from these, the following properties are also available, and may be useful Should

[GitHub] spark pull request #15541: [SPARK-17637][Scheduler]Packed scheduling for Spa...

2016-11-01 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15541#discussion_r85956857 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -305,12 +307,8 @@ private[spark] class TaskSchedulerImpl

[GitHub] spark issue #15684: [SPARK-18171][MESOS] Show correct framework address in m...

2016-10-30 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15684 /cc @zsxwing, who worked on [SPARK-4563](https://issues.apache.org/jira/browse/SPARK-4563), and @mgummelt.

[GitHub] spark pull request #15684: [SPARK-18171][MESOS] Show correct framework addre...

2016-10-30 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/15684 [SPARK-18171][MESOS] Show correct framework address in mesos master web ui when the advertised address is used ## What changes were proposed in this pull request? In [SPARK-4563](https

[GitHub] spark pull request #15487: [SPARK-17940][SQL] Fixed a typo in LAST function ...

2016-10-27 Thread lins05
Github user lins05 closed the pull request at: https://github.com/apache/spark/pull/15487

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-27 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15487 @HyukjinKwon OK, please fix all these in your PR. I'll close this small one.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-25 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @srowen Could we get this merged since the tests are now green? Not sure why it failed previously; it just turned green without me doing anything.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-19 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @weiqingy Emm, then we would also add the logic of checking "hadoop.caller.context.enabled" in the test code, which makes the test code simply duplicate the

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-19 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @weiqingy I agree that's a problem. But I don't see how to unit test the `callerContextSupported` method without repeating the same logic in the test code. Do you have any suggestions?

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-18 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15487 @HyukjinKwon I've updated the usage string. Now it looks like this: ``` spark-sql> describe function first; Function: first Cl

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-18 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 Thanks @jerryshao @srowen. I've updated the code as you suggested.

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-17 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15487 @HyukjinKwon thanks, I'll update the PR accordingly.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @srowen done.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-16 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 @weiqingy @srowen I see. So do you suggest avoiding `Utils.classForName` to get this one merged, or rather waiting for SPARK-17714?

[GitHub] spark pull request #15487: [SPARK-17940][SQL] Fixed a typo in LAST function ...

2016-10-16 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15487#discussion_r83551685 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Last.scala --- @@ -29,15 +29,18 @@ import

[GitHub] spark pull request #15487: [SPARK-17940][SQL] Fixed a typo in LAST function ...

2016-10-14 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/15487 [SPARK-17940][SQL] Fixed a typo in LAST function and improved its usage string ## What changes were proposed in this pull request? * Fixed a typo in the LAST function error message

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r83206341 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2479,20 +2483,35 @@ private[spark] class CallerContext

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 > Another thing, do you verify it locally? Since there's no unit test to cover it. @jerryshao Yeah, I did test it locally to ensure the error is only logged once.

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r83179622 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2478,42 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r83179598 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2478,42 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r83179592 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2478,42 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-13 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r83178194 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2432,6 +2432,10 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #15319: [SPARK-17733][SQL] InferFiltersFromConstraints ru...

2016-10-08 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15319#discussion_r82503911 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -74,14 +74,26 @@ abstract class QueryPlan[PlanType

[GitHub] spark pull request #15319: [SPARK-17733][SQL] InferFiltersFromConstraints ru...

2016-10-08 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15319#discussion_r82503891 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -74,14 +74,26 @@ abstract class QueryPlan[PlanType

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-07 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82492566 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2474,36 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-07 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82492524 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2432,6 +2432,10 @@ private[spark] object Utils extends Logging

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-06 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82224216 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2474,36 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-06 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82139291 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2474,36 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-06 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r82139283 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -2474,25 +2474,36 @@ private[spark] class CallerContext( val context = "S

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-06 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15377 cc @weiqingy (who worked on [SPARK-16757](https://issues.apache.org/jira/browse/SPARK-16757).)

[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.

2016-10-06 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/15377 [SPARK-17802] Improved caller context logging. ## What changes were proposed in this pull request? [SPARK-16757](https://issues.apache.org/jira/browse/SPARK-16757) sets the hadoop

[GitHub] spark pull request #15089: [SPARK-15621] [SQL] Support spilling for Python U...

2016-09-29 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15089#discussion_r81208953 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/BatchEvalPythonExec.scala --- @@ -17,18 +17,21 @@ package

[GitHub] spark pull request #15089: [SPARK-15621] [SQL] Support spilling for Python U...

2016-09-29 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/15089#discussion_r81212877 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/RowQueue.scala --- @@ -0,0 +1,276 @@ +/* +* Licensed to the Apache Software

[GitHub] spark issue #15254: [SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConv...

2016-09-26 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15254 I guess we can also remove another workaround [here](https://github.com/apache/spark/blob/v2.0.0/python/pyspark/rdd.py#L2320-L2328) ?

[GitHub] spark issue #15236: [SPARK-17017][ML][MLLIB][ML][DOC] Updated the ml/mllib f...

2016-09-26 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15236 @srowen I saw there was a proposal to change `setAlpha` to `setFpr` in #15214, but it was not changed when the PR was merged. So I think this PR is up to date.

[GitHub] spark issue #15236: [SPARK-17017][ML][MLLIB][ML][DOC] Updated the ml/mllib f...

2016-09-25 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15236 Just found #15214 and #15212; I think this one needs to wait until those are merged.

[GitHub] spark pull request #15236: [SPARK-17017][ML][MLLIB][ML][DOC] Updated the ml ...

2016-09-25 Thread lins05
GitHub user lins05 opened a pull request: https://github.com/apache/spark/pull/15236 [SPARK-17017][ML][MLLIB][ML][DOC] Updated the ml feature selection doc for ChiSqSelector ## What changes were proposed in this pull request? A follow up for #14597 to update feature

[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...

2016-09-10 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15043 I ran a simple test and it does fix the bug. One interesting thing is that while records.count() returns a smaller number than the actual count, the Spark UI still shows the correct number of records, in my

[GitHub] spark issue #14628: [SPARK-17033][Follow-up][ML][MLLib] Improve kmean aggreg...

2016-08-13 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/14628 A grep shows there is also a call to `RDD.aggregate` in `LDAModel`; should we fix it here as well? ```sh mllib/src/main/scala/org/apache/spark/mllib/clustering]$ git grep -E '\<aggreg

[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-02 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/11157#discussion_r73214595 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -357,4 +375,191 @@ private[mesos] trait

[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-02 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/11157#discussion_r73214231 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -357,4 +375,191 @@ private[mesos] trait

[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-02 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/11157#discussion_r73211731 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -357,4 +375,191 @@ private[mesos] trait

[GitHub] spark pull request #11157: [SPARK-11714][Mesos] Make Spark on Mesos honor po...

2016-08-02 Thread lins05
Github user lins05 commented on a diff in the pull request: https://github.com/apache/spark/pull/11157#discussion_r73208305 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala --- @@ -357,4 +375,191 @@ private[mesos] trait
