[GitHub] spark pull request #21985: [SPARK-24884][SQL] add regexp_extract_all support

2018-08-03 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/spark/pull/21985

[SPARK-24884][SQL] add regexp_extract_all support

## What changes were proposed in this pull request?
This PR adds regexp_extract_all support in Catalyst as RegExpExtractAll.

It finds all occurrences of the regular expression pattern in the input string 
and returns, for each occurrence, the text matched by the given capturing group number.
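
A minimal sketch of the semantics described above, written in plain Python rather than Catalyst. The function name, argument order, and the default group index of 1 are assumptions for illustration only; this is not the RegExpExtractAll expression from the PR, and Python's `re` syntax differs from Java regex in some details.

```python
import re


def regexp_extract_all(text, pattern, group_idx=1):
    """Return the text captured by group `group_idx` for every match.

    Illustrative model only (hypothetical signature), not Spark's code.
    """
    return [m.group(group_idx) for m in re.finditer(pattern, text)]


# Every occurrence of the pattern contributes one element to the result.
first_groups = regexp_extract_all("100-200, 300-400", r"(\d+)-(\d+)", 1)
second_groups = regexp_extract_all("100-200, 300-400", r"(\d+)-(\d+)", 2)
print(first_groups)   # ['100', '300']
print(second_groups)  # ['200', '400']
```

Group index 0 would return the whole match for each occurrence, and a string with no match yields an empty list in this sketch.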

## How was this patch tested?

unit test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/spark RegExpExtractAll

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21985.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21985


commit 2a9623879d91a9b7f33e1f4d252b8633de2c9e8b
Author: xueyu <278006819@...>
Date:   2018-08-03T13:22:14Z

RegExpExtractAll




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21624: [SPARK-24639][DOC] Add three config in the doc

2018-06-29 Thread xueyumusic
Github user xueyumusic commented on a diff in the pull request:

https://github.com/apache/spark/pull/21624#discussion_r199145469
  
--- Diff: docs/configuration.md ---
@@ -456,6 +456,13 @@ Apart from these, the following properties are also available, and may be useful
 from JVM to Python worker for every task.
   
 
+
+  spark.python.task.killTimeout
+  2s
+  
+How long to wait before killing the python worker if a task cannot be interrupted.
--- End diff --

Updated and fixed the conflict; please take a look. Thanks, @zsxwing 
@jiangxb1987 


---




[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-29 Thread xueyumusic
Github user xueyumusic commented on a diff in the pull request:

https://github.com/apache/spark/pull/21575#discussion_r199143954
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -74,17 +75,17 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
 
   // "spark.network.timeout" uses "seconds", while `spark.storage.blockManagerSlaveTimeoutMs` uses
   // "milliseconds"
-  private val slaveTimeoutMs =
-    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs", "120s")
   private val executorTimeoutMs =
-    sc.conf.getTimeAsSeconds("spark.network.timeout", s"${slaveTimeoutMs}ms") * 1000
+    sc.conf.getTimeAsSeconds("spark.network.timeout",
--- End diff --

Updated; please review. Thank you, @zsxwing @jiangxb1987 


---




[GitHub] spark pull request #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs a...

2018-06-23 Thread xueyumusic
Github user xueyumusic closed the pull request at:

https://github.com/apache/spark/pull/21567


---




[GitHub] spark pull request #21624: [SPARK-24639][DOC] Add three config in the doc

2018-06-23 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/spark/pull/21624

[SPARK-24639][DOC] Add three config in the doc

## What changes were proposed in this pull request?
Add three configs that were mentioned in PR #21567: 
`spark.python.task.killTimeout`, `spark.worker.driverTerminateTimeout` and 
`spark.ui.consoleProgress.update.interval`.

## How was this patch tested?
doc build

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/spark addconfig1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21624.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21624


commit 16c256d8206123df9487c57c6779ad2b0e0211d0
Author: xueyu <278006819@...>
Date:   2018-06-23T13:49:37Z

add some configs




---




[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-21 Thread xueyumusic
Github user xueyumusic commented on a diff in the pull request:

https://github.com/apache/spark/pull/21575#discussion_r197028516
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
   // "spark.network.timeout" uses "seconds", while `spark.storage.blockManagerSlaveTimeoutMs` uses
   // "milliseconds"
   private val slaveTimeoutMs =
-    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs", "120s")
+    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs",
--- End diff --

I have removed the temp val `slaveTimeout`. `timeoutIntervalMs` is the 
same case, so I removed it too. Thanks, @zsxwing @jiangxb1987 


---




[GitHub] spark pull request #21575: [SPARK-24566][CORE] spark.storage.blockManagerSla...

2018-06-19 Thread xueyumusic
Github user xueyumusic commented on a diff in the pull request:

https://github.com/apache/spark/pull/21575#discussion_r196635402
  
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -75,16 +76,18 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
   // "spark.network.timeout" uses "seconds", while `spark.storage.blockManagerSlaveTimeoutMs` uses
   // "milliseconds"
   private val slaveTimeoutMs =
-    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs", "120s")
+    sc.conf.getTimeAsMs("spark.storage.blockManagerSlaveTimeoutMs",
--- End diff --

I looked at this carefully and I think you are right, thanks @jiangxb1987. One 
case that is not related to this PR is the following: if 
spark.storage.blockManagerSlaveTimeoutMs=900ms is set and 
spark.network.timeout is not configured, then `executorTimeoutMs` will be 0, 
because getTimeAsSeconds loses precision for ms values. Such a config may not be 
reasonable. If it needs a fix, how about ensuring the value is > 0, or making 1 
the minimum value of executorTimeoutMs? @jiangxb1987 @zsxwing 
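
The precision loss described above can be reproduced with a small model. This is a sketch under stated assumptions: `ms_to_whole_seconds` and `executor_timeout_ms` are hypothetical helpers that imitate converting a millisecond value to whole seconds and multiplying back by 1000; they are not Spark's implementation. The `minimum` parameter models one of the two guards floated in the comment.

```python
def ms_to_whole_seconds(value_ms):
    """Convert milliseconds to whole seconds, discarding the remainder."""
    return value_ms // 1000


def executor_timeout_ms(slave_timeout_ms, minimum=None):
    """Model of 'read as seconds, then multiply by 1000'.

    `minimum` is a hypothetical guard: when given, the result is clamped
    so that it never drops below that value.
    """
    result = ms_to_whole_seconds(slave_timeout_ms) * 1000
    if minimum is not None:
        result = max(result, minimum)
    return result


print(executor_timeout_ms(900))             # 0: the sub-second value is lost
print(executor_timeout_ms(120000))          # 120000: whole seconds survive
print(executor_timeout_ms(900, minimum=1))  # 1: clamped by the guard
```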


---




[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...

2018-06-19 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21567
  
I see. Thanks for your review and guidance, @jiangxb1987 @maropu. I will 
try to add the related configs to the doc and close this PR. Thank you.


---




[GitHub] spark issue #21575: [SPARK-24566][CORE] spark.storage.blockManagerSlaveTimeo...

2018-06-15 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21575
  
I added the tests, thanks @maropu 


---




[GitHub] spark issue #21575: [SPARK-24566][CORE] spark.storage.blockManagerSlaveTimeo...

2018-06-15 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21575
  
It seems that "spark.core.connection.ack.wait.timeout" and 
"spark.shuffle.io.connectionTimeout" are used only in tests, which might be 
legacy, and do not affect normal code. "spark.rpc.lookupTimeout" 
does not have the same issue. 
The only place I am not sure about is the use of "spark.rpc.askTimeout" at 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/Client.scala#L229.
 I am not sure whether it is a special case that forces this config to 10s when 
it is not configured.


---




[GitHub] spark issue #21575: [SPARK-24566][CORE] spark.storage.blockManagerSlaveTimeo...

2018-06-15 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21575
  
I have made the modification, @maropu please review the code, thank you


---




[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...

2018-06-15 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21567
  
I have made some modifications, @maropu please review the code, thanks


---




[GitHub] spark pull request #21575: spark.storage.blockManagerSlaveTimeoutMs default ...

2018-06-14 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/spark/pull/21575

spark.storage.blockManagerSlaveTimeoutMs default config

## What changes were proposed in this pull request?
This PR uses spark.network.timeout in place of 
spark.storage.blockManagerSlaveTimeoutMs when the latter is not configured, as 
the configuration doc says.
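
The fallback described above can be sketched as a plain lookup over a dictionary of settings. This is an illustration only: `resolve_slave_timeout` is a hypothetical helper, the dictionary stands in for the Spark configuration object, and the "120s" fallback for an unset spark.network.timeout is an assumption taken from the default that appears in the diffs in this thread.

```python
SLAVE_TIMEOUT_KEY = "spark.storage.blockManagerSlaveTimeoutMs"
NETWORK_TIMEOUT_KEY = "spark.network.timeout"


def resolve_slave_timeout(conf, network_default="120s"):
    """Return the raw timeout string that should apply.

    Prefer the explicit block-manager setting; otherwise fall back to the
    network timeout, and only then to `network_default` (an assumption).
    """
    if SLAVE_TIMEOUT_KEY in conf:
        return conf[SLAVE_TIMEOUT_KEY]
    return conf.get(NETWORK_TIMEOUT_KEY, network_default)


print(resolve_slave_timeout({}))                                  # 120s
print(resolve_slave_timeout({NETWORK_TIMEOUT_KEY: "300s"}))       # 300s
print(resolve_slave_timeout({NETWORK_TIMEOUT_KEY: "300s",
                             SLAVE_TIMEOUT_KEY: "60000ms"}))      # 60000ms
```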

## How was this patch tested?
manual test


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/spark slaveTimeOutConfig

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21575.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21575


commit f5943410efd2f8f0cc82493eee5c5a4c30f7ebe3
Author: xueyu <278006819@...>
Date:   2018-06-15T05:32:33Z

blockManagerSlaveTimeoutMs default config




---




[GitHub] spark pull request #21567: [SPARK-24560][SS][MESOS] Fix some getTimeAsMs as ...

2018-06-14 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/spark/pull/21567

[SPARK-24560][SS][MESOS] Fix some getTimeAsMs as getTimeAsSeconds

## What changes were proposed in this pull request?

This PR replaces some "getTimeAsMs" calls with "getTimeAsSeconds". The existing 
"getTimeAsMs" calls return a wrong value when the user specifies a value without a 
time unit, because the bare number is read as milliseconds rather than seconds.
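
A simplified model of why a missing unit matters. The helpers below are hypothetical stand-ins written for this illustration, not Spark's parser: they assume that a value without a suffix takes the unit implied by the accessor name, and they support only a handful of suffixes.

```python
import re

# Milliseconds per supported unit suffix (a deliberately small subset).
_UNIT_TO_MS = {"ms": 1, "s": 1000, "m": 60_000, "min": 60_000, "h": 3_600_000}


def _parse_to_ms(value, default_unit):
    """Parse strings such as '120', '120s' or '2min' into milliseconds."""
    match = re.fullmatch(r"(\d+)\s*([a-z]*)", value.strip().lower())
    if match is None:
        raise ValueError(f"invalid time string: {value!r}")
    unit = match.group(2) or default_unit  # bare number: use the default
    if unit not in _UNIT_TO_MS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return int(match.group(1)) * _UNIT_TO_MS[unit]


def get_time_as_ms(value):
    """A bare number is taken as milliseconds."""
    return _parse_to_ms(value, "ms")


def get_time_as_seconds(value):
    """A bare number is taken as seconds; the result is in whole seconds."""
    return _parse_to_ms(value, "s") // 1000


# With an explicit unit both accessors agree on the duration.
print(get_time_as_ms("120s"), get_time_as_seconds("120s"))  # 120000 120
# Without a unit they disagree by a factor of 1000.
print(get_time_as_ms("120"), get_time_as_seconds("120"))    # 120 120
```

In the last line the first result is 120 milliseconds while the second is 120 seconds, which is the mismatch a user hits when a setting documented in seconds is read through the millisecond accessor.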

## How was this patch tested?
manual test



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/spark fixGetTimeAs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21567.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21567


commit 10bf41ec86c0af59a791fa02b5efaedc7a164a3c
Author: xueyu <278006819@...>
Date:   2018-06-14T11:01:29Z

fix getTimeAs method




---




[GitHub] spark issue #21485: [SPARK-24455][CORE] fix typo in TaskSchedulerImpl commen...

2018-06-03 Thread xueyumusic
Github user xueyumusic commented on the issue:

https://github.com/apache/spark/pull/21485
  
I took another look and found some more typos; please review them, @HyukjinKwon. 
Thank you for the reminder.


---




[GitHub] spark pull request #21485: [SPARK-24455][CORE] fix typo in TaskSchedulerImpl...

2018-06-02 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/spark/pull/21485

[SPARK-24455][CORE] fix typo in TaskSchedulerImpl comment

Change runTasks to submitTasks in a comment in TaskSchedulerImpl.scala.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/spark fixtypo1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21485


commit 97df135e7af26191cbbe3e5c54afe79d94aa43f8
Author: xueyu 
Date:   2018-06-02T07:09:18Z

fix typo in TaskSchedulerImpl




---
