[GitHub] spark issue #19819: [SPARK-22606][Streaming]Add threadId to the CachedKafkaC...

2018-07-17 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/19819
  
I've seen your PR, https://github.com/apache/spark/pull/20997; a good solution. @gaborgsomogyi





[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2018-01-23 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
Excuse me, has the concept of a default value been introduced to the schema on the master branch? Thank you, @gatorsmile.





[GitHub] spark issue #20356: [SPARK-23185][SQL] Make the configuration "spark.default...

2018-01-23 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/20356
  
Thank you very much for your review. I have read the discussion and your PR, and learned a lot. But I just want to solve the problem when executing "insert into ... values ...", which does not involve file sources. Maybe we can solve this first, since it has troubled my team for a long time? @maropu





[GitHub] spark pull request #20356: [SPARK-23185][SQL] Make the configuration "spark....

2018-01-22 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/20356

[SPARK-23185][SQL] Make the configuration "spark.default.parallelism" changeable per SQL session to decrease empty files

## What changes were proposed in this pull request?
Make the configuration "spark.default.parallelism" changeable per SQL session, to decrease the number of empty files.
When executing "insert into ... values ...", many empty files are generated. We can change the configuration "spark.default.parallelism" to decrease the number of empty files, but often we want to change the configuration within a single session so as not to influence other SQL statements; for example, we may use the Thrift server to execute many SQL statements in one SQL session.
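For illustration, the intended usage is roughly the following sketch (the session-scoped behavior is what this patch proposes; driving it through spark.sql(...) and the table name t are assumptions for illustration):

    // Sketch: lower the parallelism for this session only, so a small
    // VALUES-based insert does not fan out into many empty files, while
    // other sessions on the same Thrift server keep their own setting.
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("demo").getOrCreate()
    spark.sql("SET spark.default.parallelism=1")          // session-level override (per this patch)
    spark.sql("INSERT INTO t VALUES (1, 'a'), (2, 'b')")  // t is a hypothetical table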

## How was this patch tested?
unit tests,  manual tests


Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lvdongr/spark SPARK-23185

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20356


commit 01af8ce69afeade8bb034c6965de0f3738f12fd5
Author: lvdongr <lv.dongdong@...>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application.

commit b6daeec664d757999e257e56fed3844db51515e2
Author: lvdongr <lv.dongdong@...>
Date:   2017-03-11T06:35:57Z

Merge remote-tracking branch 'apache/master'

commit e0e47b1da93b90210e44abc6e90655d3028555ec
Author: lvdongr <lv.dongdong@...>
Date:   2017-04-12T07:20:01Z

Merge remote-tracking branch 'apache/master'

commit f4ab88111c5b8e9700eacc1acfa3858aed45124e
Author: lvdongr <lv.dongdong@...>
Date:   2017-07-27T01:54:56Z

Merge remote-tracking branch 'apache/master'

commit 463e570f9e05f785834e27bd535cfbb3b7cb7dfb
Author: lvdongr <lv.dongdong@...>
Date:   2017-07-27T12:09:47Z

Merge remote-tracking branch 'apache/master'

commit 0e1b7f6d8e436ca243f78e3cbf064f591557b6c0
Author: lvdongr <lv.dongdong@...>
Date:   2017-07-28T01:34:48Z

Merge remote-tracking branch 'apache/master'

commit 9a9972125ae8f7d90f5567f5b561f2c0ca16cfe7
Author: lvdongr <lv.dongdong@...>
Date:   2017-07-28T02:50:23Z

refresh the master branch for kafkaconsumer

commit 637900b576b8c4d9e04a808a078e481a99751d03
Author: lvdongr <lv.dongdong@...>
Date:   2017-07-31T03:08:29Z

Merge remote-tracking branch 'apache/master'

commit 04aafed076cb704a100eb7dc45b5cfda6438193b
Author: lvdongr <lv.dongdong@...>
Date:   2017-08-17T11:41:58Z

Merge remote-tracking branch 'apache/master'

commit 9f90ab5356b74dfc63dc9c80ff336ef2c2847e72
Author: root <root@...>
Date:   2017-11-10T03:32:54Z

Merge branch 'master' of https://github.com/apache/spark

commit 8b94711b7fb6cfa72aa06d9e009b73b73ccda36f
Author: root <root@...>
Date:   2017-11-13T00:56:22Z

Merge branch 'master' of https://github.com/apache/spark

commit 70699e3d80d853f7105d967544378c5c342d2ce6
Author: 10171592 <lv.dongdong@...>
Date:   2017-12-07T03:13:24Z

Merge remote-tracking branch 'apache/master'

commit 9e7c0c7d0f8bae30bc07abbedf4c110ec82f1cf3
Author: root <root@...>
Date:   2017-12-07T05:48:04Z

Merge remote-tracking branch 'apache/master'

commit 393730415bcebdef125364be3eb3a64320cac3c9
Author: root <root@...>
Date:   2018-01-09T03:16:36Z

Merge branch 'master' of https://github.com/lvdongr/spark

commit 5db407930d4802b6075036961688192a3039d95a
Author: root <root@...>
Date:   2018-01-09T03:30:43Z

Merge branch 'master' of https://github.com/apache/spark

commit 46672ddaf53b9ed1e97e404753fa14bd3406821a
Author: 10171592 <lv.dongdong@...>
Date:   2018-01-22T08:20:47Z

Merge remote-tracking branch 'apache/master'

commit 884eaee9f2d7782bceae73806da9b65f1119977e
Author: 10171592 <lv.dongdong@...>
Date:   2018-01-22T09:30:14Z

Merge branch 'master' of https://github.com/lvdongr/spark

commit 49641920727f426e88ac32a9c1381f7876eaf7c9
Author: 10171592 <lv.dongdong@...>
Date:   2018-01-23T02:57:54Z

Merge remote-tracking branch 'apache/master'

commit e1aeff8c0cb1358d0c77b0e729ecdfd1a07313dc
Author: lvdongr <lv.dongdong@...>
Date:   2018-01-23T03:18:28Z

[SPARK-23185][SQL] Make the configuration "spark.default.parallelism" changeable per SQL session to decrease empty files







[GitHub] spark issue #19819: [SPARK-22606][Streaming]Add threadId to the CachedKafkaC...

2017-11-26 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/19819
  
Will the number of cached consumers for the same partition keep increasing when different tasks consume the same partition, with no place to remove them?





[GitHub] spark issue #18987: [SPARK-21775][Core]Dynamic Log Level Settings for execut...

2017-08-23 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18987
  
OK. Thank you all the same for your review, @srowen @jerryshao @ajbozarth.





[GitHub] spark pull request #18987: [SPARK-21775][Core]Dynamic Log Level Settings for...

2017-08-23 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/18987





[GitHub] spark issue #18987: [SPARK-21775][Core]Dynamic Log Level Settings for execut...

2017-08-21 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18987
  
Dynamic log level setting is a very useful function. Our team runs a Spark application, and when we want to see the debug log we have to restart the application every time, so we developed this function.
The complexity lies in the UI display and in setting the log level. But we can choose not to show the setting button on the UI and just provide an API or RESTful interface for users to access this function; then the change to Spark is not large.
Storm (http://storm.apache.org/) has the same function, too. @srowen @jerryshao @ajbozarth
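As a rough sketch of the executor-side mechanics (the names and the delivery path below are illustrative assumptions, not taken from this PR):

    // Sketch: change the root log4j level at runtime, as an executor might do
    // on receiving a "set log level" request. The RPC or RESTful plumbing that
    // delivers the request is assumed and not shown.
    import org.apache.log4j.{Level, LogManager}

    def setExecutorLogLevel(level: String): Unit = {
      val parsed = Level.toLevel(level, Level.INFO) // case-insensitive; falls back to INFO
      LogManager.getRootLogger.setLevel(parsed)
    }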





[GitHub] spark pull request #18987: [SPARK-21775][Core]Dynamic Log Level Settings for...

2017-08-17 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/18987

[SPARK-21775][Core]Dynamic Log Level Settings for executors

## What changes were proposed in this pull request?

Sometimes we want to change the log level of an executor after our application is already deployed, to see detailed information or to reduce the volume of log output. Changing the log4j configuration file is not convenient, so we add the ability to change the log level of a running executor.

## How was this patch tested?
manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark SPARK-21775

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18987.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18987


commit 01af8ce69afeade8bb034c6965de0f3738f12fd5
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application.

commit b6daeec664d757999e257e56fed3844db51515e2
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-11T06:35:57Z

Merge remote-tracking branch 'apache/master'

commit e0e47b1da93b90210e44abc6e90655d3028555ec
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-04-12T07:20:01Z

Merge remote-tracking branch 'apache/master'

commit f4ab88111c5b8e9700eacc1acfa3858aed45124e
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T01:54:56Z

Merge remote-tracking branch 'apache/master'

commit 463e570f9e05f785834e27bd535cfbb3b7cb7dfb
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T12:09:47Z

Merge remote-tracking branch 'apache/master'

commit 0e1b7f6d8e436ca243f78e3cbf064f591557b6c0
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-28T01:34:48Z

Merge remote-tracking branch 'apache/master'

commit 9a9972125ae8f7d90f5567f5b561f2c0ca16cfe7
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-28T02:50:23Z

refresh the master branch for kafkaconsumer

commit 637900b576b8c4d9e04a808a078e481a99751d03
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-31T03:08:29Z

Merge remote-tracking branch 'apache/master'

commit 04aafed076cb704a100eb7dc45b5cfda6438193b
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-08-17T11:41:58Z

Merge remote-tracking branch 'apache/master'

commit f88debcfccf9d1cd5c436321ff8cf444539dfd6c
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-08-18T03:21:50Z

[SPARK-21775][Core]Dynamic Log Level Settings for executors







[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-13 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
OK, I will solve the remaining problems first and put this PR on hold. @gatorsmile





[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-09 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
You mean we could give different types different default values, like 0 for int and "" for string? Or that we should set the default values when defining the table? @gatorsmile @maropu I set the default to NULL because the "insert into ..." statement in Hive handles it this way, and I want to stay consistent with Hive.
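To make the intended semantics concrete, a sketch under this PR's proposed syntax (assuming a SparkSession named spark; stock Spark of this era rejects the column list):

    // Sketch of the proposed behavior: columns omitted from the column list
    // are filled with NULL, matching Hive's handling as described above.
    spark.sql("CREATE TABLE t (a INT, b STRING, c DOUBLE)")
    spark.sql("INSERT INTO t (a, c) VALUES (1, 0.8)")
    // The inserted row has a = 1, b = NULL, c = 0.8.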





[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-09 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
You can see in this picture: my table has three columns, and I insert only two columns, so the last column is null. @maropu @gatorsmile

![insertinto](https://user-images.githubusercontent.com/25652150/29109253-f9b852a8-7d14-11e7-9a9c-b6aa76314a04.PNG)






[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-08 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
The target of this PR is to support inserting into specified columns; listing all columns is not required, e.g. insert into t(a, c) values (1, 0.8).





[GitHub] spark issue #18756: [SPARK-21548][SQL] "Support insert into serial columns o...

2017-08-08 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/18756
  
Thank you for the review; I will finish the tests as soon as possible.





[GitHub] spark pull request #18753: [SPARK-21548] [SQL] "Support insert into serial c...

2017-07-28 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/18753





[GitHub] spark pull request #18756: [SPARK-21548][SQL] "Support insert into serial co...

2017-07-28 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/18756

[SPARK-21548][SQL] "Support insert into serial columns of table"

## What changes were proposed in this pull request?
When we use the 'insert into ...' statement, we can only insert all of the columns into the table. But in some cases our table has many columns and we are only interested in some of them, so we want to support the statement "insert into table tbl (column1, column2, ...) values (value1, value2, value3, ...)".
https://issues.apache.org/jira/browse/SPARK-21548

## How was this patch tested?
unit tests, integration tests, manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark SPARK-21548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18756.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18756


commit 01af8ce69afeade8bb034c6965de0f3738f12fd5
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application.

commit b6daeec664d757999e257e56fed3844db51515e2
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-11T06:35:57Z

Merge remote-tracking branch 'apache/master'

commit e0e47b1da93b90210e44abc6e90655d3028555ec
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-04-12T07:20:01Z

Merge remote-tracking branch 'apache/master'

commit f4ab88111c5b8e9700eacc1acfa3858aed45124e
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T01:54:56Z

Merge remote-tracking branch 'apache/master'

commit 463e570f9e05f785834e27bd535cfbb3b7cb7dfb
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T12:09:47Z

Merge remote-tracking branch 'apache/master'

commit 2a40d64bcad6613892a54bc3052a634f59c14c65
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-28T06:56:15Z

[SPARK-21548][SQL]Support insert into serial columns of table







[GitHub] spark pull request #18751: [SPARK-21548][SQL]Support insert into serial colu...

2017-07-27 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/18751





[GitHub] spark pull request #18753: [SPARK-21548] [SQL] Support insert into serial co...

2017-07-27 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/18753

[SPARK-21548] [SQL] Support insert into serial columns of table

## What changes were proposed in this pull request?

When we use the 'insert into ...' statement, we can only insert all of the columns into the table. But in some cases our table has many columns and we are only interested in some of them, so we want to support the statement "insert into table tbl (column1, column2, ...) values (value1, value2, value3, ...)".

## How was this patch tested?

manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark SPARK--21548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18753.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18753


commit 01af8ce69afeade8bb034c6965de0f3738f12fd5
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application.

commit b6daeec664d757999e257e56fed3844db51515e2
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-11T06:35:57Z

Merge remote-tracking branch 'apache/master'

commit e0e47b1da93b90210e44abc6e90655d3028555ec
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-04-12T07:20:01Z

Merge remote-tracking branch 'apache/master'

commit f4ab88111c5b8e9700eacc1acfa3858aed45124e
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T01:54:56Z

Merge remote-tracking branch 'apache/master'

commit 463e570f9e05f785834e27bd535cfbb3b7cb7dfb
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T12:09:47Z

Merge remote-tracking branch 'apache/master'

commit da882ea569d451b3f2af550b0976a6a059900f6a
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-28T02:56:23Z

[SPARK-21548][SQL]Support insert into serial columns of table

commit a65be1605865a1159532ba148434d3bb207da64c
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-28T03:03:23Z

refresh last commit







[GitHub] spark pull request #17203: [SPARK-19863][DStream] Whether or not use CachedK...

2017-07-27 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/17203





[GitHub] spark pull request #18751: [SPARK-21548][SQL]Support insert into serial colu...

2017-07-27 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/18751

[SPARK-21548][SQL]Support insert into serial columns of table

## What changes were proposed in this pull request?

When we use the 'insert into ...' statement, we can only insert all of the columns into the table. But in some cases our table has many columns and we are only interested in some of them, so we want to support the statement "insert into table tbl (column1, column2, ...) values (value1, value2, value3, ...)".
https://issues.apache.org/jira/browse/SPARK-21548

## How was this patch tested?
manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark spark21548

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18751.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18751


commit 01af8ce69afeade8bb034c6965de0f3738f12fd5
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application.

commit b6daeec664d757999e257e56fed3844db51515e2
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-11T06:35:57Z

Merge remote-tracking branch 'apache/master'

commit e0e47b1da93b90210e44abc6e90655d3028555ec
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-04-12T07:20:01Z

Merge remote-tracking branch 'apache/master'

commit f4ab88111c5b8e9700eacc1acfa3858aed45124e
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T01:54:56Z

Merge remote-tracking branch 'apache/master'

commit 463e570f9e05f785834e27bd535cfbb3b7cb7dfb
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T12:09:47Z

Merge remote-tracking branch 'apache/master'

commit 0be180991d87a82d3075b6d63f28486799fc872d
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-07-27T13:25:24Z

[SPARK-21548][SQL]Support insert into serial columns of table







[GitHub] spark pull request #17620: [SPARK-20305][Spark Core]Master may keep in the s...

2017-06-20 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/17620





[GitHub] spark issue #17620: [SPARK-20305][Spark Core]Master may keep in the state of...

2017-04-19 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17620
  
You can see the main method in Master.scala. 

  def main(argStrings: Array[String]) {
    Utils.initDaemon(log)
    val conf = new SparkConf
    val args = new MasterArguments(argStrings, conf)
    val (rpcEnv, _, _) = startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, conf)
    rpcEnv.awaitTermination()  // blocks until the RpcEnv is shut down
  }

When the rpcEnv is shut down, the main method finishes and the Master process stops, as I have already tested. I chose this way because the onStop method is called before the master stops, so the services inside the master, such as the web UI, metrics, and the persistenceEngine, are also closed. I think it is safer. Thank you for your last reply @jerryshao





[GitHub] spark issue #17620: [SPARK-20305][Spark Core]Master may keep in the state of...

2017-04-19 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17620
  
This happened when the previous master leader removed the dead worker and cleared the worker's node from the persistence engine (we use ZooKeeper), but the leadership changed before the worker's node was actually removed from ZooKeeper. The new master leader recovered from ZooKeeper and read the dead worker's node. The new leader then found the worker dead and tried to remove it, including clearing its node in ZooKeeper; but the node had already been removed by the previous leader, so an exception was thrown and the recovery failed. The leader then stays in the COMPLETING_RECOVERY state forever, and none of the registered applications can get resources.


![failfetchresource](https://cloud.githubusercontent.com/assets/25652150/25209181/f7e31528-25ab-11e7-9eb2-e2f15db2dcac.png)






[GitHub] spark pull request #17620: [SPARK-20305][Spark Core]Master may keep in the s...

2017-04-17 Thread lvdongr
Github user lvdongr commented on a diff in the pull request:

https://github.com/apache/spark/pull/17620#discussion_r111732189
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -561,6 +561,11 @@ private[deploy] class Master(
     state = RecoveryState.ALIVE
     schedule()
     logInfo("Recovery complete - resuming operations!")
+  } catch {
--- End diff --

Thank you very much. I've changed the commit; please see if there are any other problems.





[GitHub] spark issue #17620: [SPARK-20305][Spark Core]Master may keep in the state of...

2017-04-16 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17620
  
Excuse me, can this issue be closed, or are there some other problems? @jerryshao





[GitHub] spark pull request #17620: [SPARK-20305][Spark Core]Master may keep in the s...

2017-04-13 Thread lvdongr
Github user lvdongr commented on a diff in the pull request:

https://github.com/apache/spark/pull/17620#discussion_r111337583
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -539,7 +539,7 @@ private[deploy] class Master(
 
   private def completeRecovery() {
     // Ensure "only-once" recovery semantics using a short synchronization period.
-    if (state != RecoveryState.RECOVERING) { return }
+    if (state != RecoveryState.RECOVERING && state != RecoveryState.COMPLETING_RECOVERY) { return }
--- End diff --

It seems better to close the master, as you say, if an exception happens during recovery. So I have changed the last commit.
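In outline, the fix direction reads something like this (a simplified sketch of the shape of the change, not the actual Master.scala code):

    // Sketch: if recovery throws (e.g. the worker's ZooKeeper node was already
    // removed by the previous leader), shut the master down rather than leave
    // it stuck in COMPLETING_RECOVERY.
    try {
      // ... remove dead workers/apps, reschedule ...
      state = RecoveryState.ALIVE
      schedule()
      logInfo("Recovery complete - resuming operations!")
    } catch {
      case e: Exception =>
        logError("Recovery failed - shutting down master", e)
        System.exit(1)
    }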





[GitHub] spark pull request #17620: [SPARK-20305][Spark Core]Master may keep in the s...

2017-04-13 Thread lvdongr
Github user lvdongr commented on a diff in the pull request:

https://github.com/apache/spark/pull/17620#discussion_r111337249
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala 
---
@@ -539,7 +539,7 @@ private[deploy] class Master(
 
   private def completeRecovery() {
     // Ensure "only-once" recovery semantics using a short synchronization period.
-    if (state != RecoveryState.RECOVERING) { return }
+    if (state != RecoveryState.RECOVERING && state != RecoveryState.COMPLETING_RECOVERY) { return }
--- End diff --

Thank you for your review and suggestion. The last change cannot work, as I tested today. I thought completeRecovery would be called again when some workers or drivers responded to MasterChanged, so a master whose state is RecoveryState.COMPLETING_RECOVERY would get a chance to finish the completeRecovery method and change its state to ALIVE. I tested it, but found it is not called again after the exception (maybe the workers or drivers had already responded to MasterChanged before the exception).





[GitHub] spark pull request #17620: [SPARK-20305][Spark Core]Master may keep in the s...

2017-04-12 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/17620

[SPARK-20305][Spark Core]Master may keep in the state of "COMPELETING…

## What changes were proposed in this pull request?
The Master may stay in the "COMPLETING_RECOVERY" state when the master leader changes, and then none of the registered applications can get resources.
This happens when an exception is thrown while the Master is trying to recover (the completeRecovery method in Master.scala). The leader then stays in the COMPLETING_RECOVERY state forever, since the leader can only change to ALIVE from the RecoveryState.RECOVERING state.

## How was this patch tested?
manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark SPARK20305

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17620


commit 44b9415dd1c6ac854a9debddd67c9dcb00e8df69
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-04-12T07:34:03Z

[SPARK-20305][Spark Core] Master may stay in the "COMPLETING_RECOVERY" state, and then none of the registered applications can get resources, when the master leader changes.







[GitHub] spark issue #17203: [SPARK-19863][DStream] Whether or not use CachedKafkaCon...

2017-03-10 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17203
  
You can see this issue, which is a problem with the cached KafkaConsumer: https://issues.apache.org/jira/browse/SPARK-19185. A commentator there suggests the same approach of not using the cached Kafka consumer. Besides, only if users can choose between the different methods can they pick the best way for their own situation.





[GitHub] spark issue #17203: [SPARK-19863][DStream] Whether or not use CachedKafkaCon...

2017-03-08 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17203
  
In our case, we deployed a streaming application whose data sources are 20 topics with 30 partitions in a Kafka cluster (3 brokers). The number of connections to Kafka was then very large, up to a thousand, and the consumers sometimes got no messages from Kafka, which could cause some jobs to fail. But when we replaced the consumers with uncached ones, the number of connections decreased, and no more jobs failed. We are still not sure whether the large number of connections to Kafka caused the job failures or not, but we have tested the result, and we want to use the uncached consumers so that we can keep our streaming jobs running successfully first. So we think there are occasions not to use the cached consumer, and the developer should be able to choose.





[GitHub] spark pull request #17203: [SPARK-19863][DStream] Whether or not use CachedK...

2017-03-07 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/17203

[SPARK-19863][DStream] Whether or not use CachedKafkaConsumer need to be 
configured, when you use DirectKafkaInputDStream t


## What changes were proposed in this pull request?
Whether or not to use CachedKafkaConsumer needs to be configurable when you use DirectKafkaInputDStream to connect to Kafka in a Spark Streaming application. In Spark 2.x, the Kafka consumer was replaced by CachedKafkaConsumer (some KafkaConsumers keep their connections to the Kafka cluster open), and there is no way to change this behavior. In fact, KafkaRDD (used by DirectKafkaInputDStream to connect to Kafka) provides the parameter useConsumerCache to choose whether to use the CachedKafkaConsumer, but DirectKafkaInputDStream always sets the parameter to true.
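A configuration-driven toggle of the kind proposed might look roughly like this (a sketch; the property name is an assumption for illustration, not necessarily what this PR uses):

    // Sketch: read an on/off switch from SparkConf and pass it through to
    // KafkaRDD's existing useConsumerCache parameter instead of hard-coding
    // true in DirectKafkaInputDStream.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
    val useConsumerCache =
      conf.getBoolean("spark.streaming.kafka.consumer.cache.enabled", defaultValue = true)
    // ... new KafkaRDD(..., useConsumerCache = useConsumerCache)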

## How was this patch tested?
manual tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark SPARK-19863

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17203.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17203


commit 5d13e4e75845acabb9a11b0618669e9f51ba55fd
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-03-08T04:09:40Z

[SPARK-19863][DStream] Whether or not to use CachedKafkaConsumer needs to be configurable, when you use DirectKafkaInputDStream t







[GitHub] spark pull request #16879: [SPARK-19541][SQL] High Availability support for ...

2017-02-26 Thread lvdongr
Github user lvdongr closed the pull request at:

https://github.com/apache/spark/pull/16879





[GitHub] spark issue #17010: [SPARK-19673][SQL] "ThriftServer default app name is cha...

2017-02-22 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17010
  
Excuse me, may this issue be merged and closed?





[GitHub] spark issue #17010: [SPARK-19673][SQL] "ThriftServer default app name is cha...

2017-02-20 Thread lvdongr
Github user lvdongr commented on the issue:

https://github.com/apache/spark/pull/17010
  
Before Spark 1.4.x, the ThriftServer name was "SparkSQL:localhostname", but https://issues.apache.org/jira/browse/SPARK-8650 changed this rule as a side effect. The ThriftServer now shows the class name of HiveThriftServer2, which is not appropriate.





[GitHub] spark pull request #17010: [SPARK-19673][SQL] "ThriftServer default app name...

2017-02-20 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/17010

[SPARK-19673][SQL] "ThriftServer default app name is changed wrong"

## What changes were proposed in this pull request?
In Spark 1.x, the name of the ThriftServer was SparkSQL:localHostName, while now the ThriftServer default name has been changed to the class name of HiveThriftServer2, which is not appropriate.

## How was this patch tested?
manual tests


Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark ThriftserverName

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/17010.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17010


commit c4a02bca4594ca10473050a85165b4bf96a4ba4e
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-02-21T04:37:12Z

[SPARK-19673][SQL] "ThriftServer default app name is changed wrong"







[GitHub] spark pull request #16879: [SPARK-19541][SQL] High Availability support for ...

2017-02-09 Thread lvdongr
GitHub user lvdongr opened a pull request:

https://github.com/apache/spark/pull/16879

[SPARK-19541][SQL] High Availability support for ThriftServer

JIRA Issue:  https://issues.apache.org/jira/browse/SPARK-19541

## What changes were proposed in this pull request?

Currently, we use the Spark ThriftServer frequently, and there are many connections between clients and the single ThriftServer. When the ThriftServer is down, we cannot get the service again, so we need to consider ThriftServer HA as well as master HA.
For the ThriftServer, we want to import the HiveServer2 HA pattern to provide ThriftServer HA. Therefore, we start multiple Thrift servers which register themselves in ZooKeeper. The client can then find a Thrift server just by connecting to ZooKeeper, so beeline can get the service from another Thrift server when one is down.
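For reference, HiveServer2-style ZooKeeper service discovery, the pattern this proposal borrows, is driven from the client through the JDBC URL; a sketch (hosts and namespace are placeholders):

    // Sketch: the client asks the ZooKeeper ensemble for a live Thrift server
    // instead of naming a single host, so failover is transparent to JDBC and
    // beeline clients.
    import java.sql.DriverManager

    val url = "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;" +
      "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=sparkthriftserver"
    val conn = DriverManager.getConnection(url, "user", "")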

## How was this patch tested?

manual tests





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lvdongr/spark spark-issue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16879.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16879


commit cf4b67f20922c3df494ad68db75c4ace18494116
Author: lvdongr <lv.dongd...@zte.com.cn>
Date:   2017-02-10T01:46:51Z

[SPARK-19541] - High Availability support for ThriftServer



