[jira] [Commented] (HBASE-20526) multithreads bulkload performance

2018-05-06 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465343#comment-16465343
 ] 

Key Hutu commented on HBASE-20526:
--

Thanks, Ted Yu.
I see that the implementation on the master branch differs from branch-1.2/1.3.
Can a submission based on branch-1.3 be accepted?

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Zookeeper
>Affects Versions: 1.2.5, 1.3.2
> Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
> Fix For: 1.3.2
>
> Attachments: HBASE-20526-branch-1.3.V1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20526) multithreads bulkload performance

2018-05-06 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-20526:
-
Fix Version/s: (was: 1.2.5)
   1.3.2
Affects Version/s: 1.3.2
   Attachment: HBASE-20526-branch-1.3.V1.patch
   Status: Patch Available  (was: In Progress)

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Zookeeper
>Affects Versions: 1.3.2, 1.2.5
> Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
> Fix For: 1.3.2
>
> Attachments: HBASE-20526-branch-1.3.V1.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>





[jira] [Updated] (HBASE-20526) multithreads bulkload performance

2018-05-04 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-20526:
-
Affects Version/s: (was: 1.2.0)
   1.2.5
Fix Version/s: 1.2.5

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Zookeeper
>Affects Versions: 1.2.5
> Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
> Fix For: 1.2.5
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>





[jira] [Commented] (HBASE-20526) multithreads bulkload performance

2018-05-04 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464628#comment-16464628
 ] 

Key Hutu commented on HBASE-20526:
--

The application calls the doBulkload(hpath, admin, table, regionLocator)
method.
To keep latency close to real time, it loads many small files at a high
frequency.

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>





[jira] [Commented] (HBASE-20526) multithreads bulkload performance

2018-05-04 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464627#comment-16464627
 ] 

Key Hutu commented on HBASE-20526:
--

Thank you for your attention, Ted Yu.
The executor log looks like this:

{panel:title=executor stderr}
2018-05-05 12:19:41,948- WARN -330831[Executor task launch worker for task 
187159]-(HBaseConfiguration.java:195)-Config option 
"hbase.regionserver.lease.period" is deprecated. Instead, use 
"hbase.client.scanner.timeout.period"
2018-05-05 12:19:41,948-DEBUG -330831[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 199,8  replyHeader:: 199,197642441638,0  request:: 
'/hbase,F  response:: 
v{'replication,'schema,'meta-region-server,'rs,'splitWAL,'backup-masters,'table-lock,'flush-table-proc,'region-in-transition,'online-snapshot,'master,'running,'balancer,'recovering-regions,'draining,'namespace,'hbaseid,'table}
 
2018-05-05 12:19:41,949-DEBUG -330832[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 200,4  replyHeader:: 200,197642441638,0  request:: 
'/hbase/meta-region-server,F  response:: 
#0001a726567696f6e7365727665723a3630303230ffb6ffac57ffadff80ff80ffa8b50425546a17aa686f73742d382d31323810fff4ffd4318ffd2ff8affe7ffd9ffaf2c100183,s{197568498964,197568498964,1524633515423,1524633515423,0,0,0,0,64,0,197568498964}
 
2018-05-05 12:19:41,950-DEBUG -330833[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 201,8  replyHeader:: 201,197642441638,0  request:: 
'/hbase,F  response:: 
v{'replication,'schema,'meta-region-server,'rs,'splitWAL,'backup-masters,'table-lock,'flush-table-proc,'region-in-transition,'online-snapshot,'master,'running,'balancer,'recovering-regions,'draining,'namespace,'hbaseid,'table}
 
2018-05-05 12:19:41,950-DEBUG -330833[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 202,4  replyHeader:: 202,197642441638,0  request:: 
'/hbase/meta-region-server,F  response:: 
#0001a726567696f6e7365727665723a3630303230ffb6ffac57ffadff80ff80ffa8b50425546a17aa686f73742d382d31323810fff4ffd4318ffd2ff8affe7ffd9ffaf2c100183,s{197568498964,197568498964,1524633515423,1524633515423,0,0,0,0,64,0,197568498964}
 
2018-05-05 12:19:41,950-DEBUG -330833[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 203,8  replyHeader:: 203,197642441638,0  request:: 
'/hbase,F  response:: 
v{'replication,'schema,'meta-region-server,'rs,'splitWAL,'backup-masters,'table-lock,'flush-table-proc,'region-in-transition,'online-snapshot,'master,'running,'balancer,'recovering-regions,'draining,'namespace,'hbaseid,'table}
 
2018-05-05 12:19:41,951-DEBUG -330834[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 204,4  replyHeader:: 204,197642441638,0  request:: 
'/hbase/meta-region-server,F  response:: 
#0001a726567696f6e7365727665723a3630303230ffb6ffac57ffadff80ff80ffa8b50425546a17aa686f73742d382d31323810fff4ffd4318ffd2ff8affe7ffd9ffaf2c100183,s{197568498964,197568498964,1524633515423,1524633515423,0,0,0,0,64,0,197568498964}
 
2018-05-05 12:19:42,002-DEBUG -330885[Executor task launch worker for task 
201898]-(TaskMemoryManager.java:221)-Task 201898 acquired 256.0 KB for 
org.apache.spark.shuffle.sort.ShuffleExternalSorter@18f196e
2018-05-05 12:19:42,003-DEBUG -330886[Executor task launch worker for task 
201898]-(TaskMemoryManager.java:230)-Task 201898 release 128.0 KB from 
org.apache.spark.shuffle.sort.ShuffleExternalSorter@18f196e
2018-05-05 12:19:42,053-DEBUG -330936[Executor task launch worker for task 
187159-SendThread(host-8-2:2181)]-(ClientCnxn.java:818)-Reading reply 
sessionid:0x162fb3760b1ea01, packet:: clientPath:null serverPath:null 
finished:false header:: 205,8  replyHeader:: 205,197642441638,0  request:: 
'/hbase,F  response:: 
v{'replication,'schema,'meta-region-server,'rs,'splitWAL,'backup-masters,'table-lock,'flush-table-proc,'region-in-transition,'online-snapshot,'master,'running,'balancer,'recovering-

[jira] [Updated] (HBASE-20526) multithreads bulkload performance

2018-05-03 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-20526:
-
Affects Version/s: (was: 2.0.0)
   1.2.0
  Environment: 
hbase-server-1.2.0-cdh5.12.1 

spark version 1.6
  Component/s: Zookeeper

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-server-1.2.0-cdh5.12.1 
> spark version 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>





[jira] [Work started] (HBASE-20526) multithreads bulkload performance

2018-05-03 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-20526 started by Key Hutu.

> multithreads bulkload performance
> -
>
> Key: HBASE-20526
> URL: https://issues.apache.org/jira/browse/HBASE-20526
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 2.0.0
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Minor
>  Labels: performance
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> During a bulk load, the ZooKeeper interaction needed to fetch the region key
> range can take a long time.
> In a multithreaded environment, it may take 5 minutes or more.
> In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
> packet:: clientPath:null server ...' appear many times.
>
> It would help to provide a new bulk load method that lets the key range be
> cached outside the bulk load call.
>





[jira] [Created] (HBASE-20526) multithreads bulkload performance

2018-05-03 Thread Key Hutu (JIRA)
Key Hutu created HBASE-20526:


 Summary: multithreads bulkload performance
 Key: HBASE-20526
 URL: https://issues.apache.org/jira/browse/HBASE-20526
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Affects Versions: 2.0.0
Reporter: Key Hutu
Assignee: Key Hutu


During a bulk load, the ZooKeeper interaction needed to fetch the region key
range can take a long time.

In a multithreaded environment, it may take 5 minutes or more.

In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 ,
packet:: clientPath:null server ...' appear many times.

 

It would help to provide a new bulk load method that lets the key range be
cached outside the bulk load call.
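
The caching idea proposed here could look roughly like the Java sketch below.
It is not taken from the attached patch and does not use the HBase API:
KeyRangeCache, its loader function, invalidate() and loadCount() are
hypothetical names, and the loader only stands in for whatever call fetches
the region start/end keys. The point it illustrates is that many threads
loading into the same table can share one lookup instead of repeating it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: keeps one key-range lookup result per table name. */
final class KeyRangeCache<R> {
    private final ConcurrentMap<String, R> ranges = new ConcurrentHashMap<>();
    // Assumption: stands in for the expensive region key-range lookup.
    private final Function<String, R> loader;
    private final AtomicInteger loads = new AtomicInteger();

    KeyRangeCache(Function<String, R> loader) {
        this.loader = loader;
    }

    /** Returns the cached range, running the loader only when no entry exists. */
    R get(String table) {
        return ranges.computeIfAbsent(table, t -> {
            loads.incrementAndGet();
            return loader.apply(t);
        });
    }

    /** Drops a cached entry, for example after a region split makes it stale. */
    void invalidate(String table) {
        ranges.remove(table);
    }

    /** Number of times the loader has actually been invoked. */
    int loadCount() {
        return loads.get();
    }
}
```

A caller would create one KeyRangeCache per application and pass the cached
range to each bulk load thread. A cached range can go stale when regions
split or merge, so a real implementation would need an invalidation policy.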

 





[jira] [Commented] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-02-02 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351007#comment-16351007
 ] 

Key Hutu commented on HBASE-19848:
--

Thanks for your help, Ted Yu and Huaxiang Sun.

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Attachments: HBASE-19848-V2.patch, HBASE-19848-V3.patch, 
> HBaseContext.patch, HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-02-02 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Attachment: HBASE-19848-V3.patch

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBASE-19848-V3.patch, 
> HBaseContext.patch, HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Commented] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-31 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346718#comment-16346718
 ] 

Key Hutu commented on HBASE-19848:
--

Hi Ted Yu, [~huaxiang],

I fixed the compile error locally.

The patch only makes sure the connection is closed, so I don't think an
additional test is needed. What do you think? Thank you.

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBaseContext.patch, 
> HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-31 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Comment: was deleted

(was: The patch just closes the connection, so no additional test is needed.)

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBaseContext.patch, 
> HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-31 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Status: Patch Available  (was: Open)

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBaseContext.patch, 
> HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Commented] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-31 Thread Key Hutu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346690#comment-16346690
 ] 

Key Hutu commented on HBASE-19848:
--

The patch just closes the connection, so no additional test is needed.

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBaseContext.patch, 
> HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-31 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Attachment: HBASE-19848-V2.patch

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBASE-19848-V2.patch, HBaseContext.patch, 
> HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-30 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Status: Patch Available  (was: Open)

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBaseContext.patch, HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-30 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Attachment: HBaseContext.patch

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Assignee: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBaseContext.patch, HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-25 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Status: Patch Available  (was: Open)

The fix is to call close() in the bulkLoad and hbaseBulkLoadThinRows methods.

I have attached a patch file.
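
As an illustration of the close() fix described here, the following Java
sketch shows the general pattern rather than the contents of the attached
patch. TrackedConnection and BulkLoadSketch are hypothetical stand-ins and do
not use the HBase or hbase-spark API. The sketch assumes only that the
connection type is closeable, and uses try-with-resources so that close()
runs even when the load fails.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a client connection that owns background threads. */
final class TrackedConnection implements AutoCloseable {
    /** Number of connections opened and not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    TrackedConnection() {
        OPEN.incrementAndGet();
    }

    /** Placeholder for the real bulk load work; rejects an empty path. */
    void load(String path) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("path must not be empty");
        }
    }

    @Override
    public void close() {
        OPEN.decrementAndGet();
    }
}

final class BulkLoadSketch {
    /** Opens a connection per call and always closes it, even if load() throws. */
    static void bulkLoadOnce(String path) {
        try (TrackedConnection conn = new TrackedConnection()) {
            conn.load(path);
        }
    }
}
```

Calling BulkLoadSketch.bulkLoadOnce repeatedly leaves TrackedConnection.OPEN
at zero, whereas the same loop without close() would leave one open
connection, and its threads, behind per call.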

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Updated] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-25 Thread Key Hutu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Key Hutu updated HBASE-19848:
-
Attachment: HBaseContext.scala

> Zookeeper thread leaks in hbase-spark bulkLoad method
> -
>
> Key: HBASE-19848
> URL: https://issues.apache.org/jira/browse/HBASE-19848
> Project: HBase
>  Issue Type: Bug
>  Components: spark, Zookeeper
>Affects Versions: 1.2.0
> Environment: hbase-spark-1.2.0-cdh5.12.1 version
> spark 1.6
>Reporter: Key Hutu
>Priority: Major
>  Labels: performance
> Fix For: 1.2.0
>
> Attachments: HBaseContext.scala
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> In the hbase-spark project, HBaseContext provides a bulkload method for
> loading Spark RDD data into HBase easily. But when I use it frequently, the
> program throws a "cannot create native thread" exception.
> Running pstack against the Spark driver process shows that the thread count
> keeps increasing.
> jstack shows a large number of threads named "main-SendThread" and
> "main-EventThread".
> It seems that a connection is created before the bulk load but its close
> method is never invoked afterwards.





[jira] [Created] (HBASE-19848) Zookeeper thread leaks in hbase-spark bulkLoad method

2018-01-23 Thread Key Hutu (JIRA)
Key Hutu created HBASE-19848:


 Summary: Zookeeper thread leaks in hbase-spark bulkLoad method
 Key: HBASE-19848
 URL: https://issues.apache.org/jira/browse/HBASE-19848
 Project: HBase
  Issue Type: Bug
  Components: spark, Zookeeper
Affects Versions: 1.2.0
 Environment: hbase-spark-1.2.0-cdh5.12.1 version

spark 1.6
Reporter: Key Hutu
 Fix For: 1.2.0


In the hbase-spark project, HBaseContext provides a bulkload method for loading
Spark RDD data into HBase easily. But when I use it frequently, the program
throws a "cannot create native thread" exception.

Running pstack against the Spark driver process shows that the thread count
keeps increasing.

jstack shows a large number of threads named "main-SendThread" and
"main-EventThread".

It seems that a connection is created before the bulk load but its close
method is never invoked afterwards.
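
The thread growth described here can also be checked from inside the JVM. The
Java sketch below is a minimal example of that check and is not part of the
report or the patch: ThreadNameCounter and countLive are hypothetical names,
while Thread.getAllStackTraces() is the standard JDK call that returns the
live threads. Logging the count for a marker such as "SendThread" before and
after each bulk load would show whether threads accumulate.

```java
/** Hypothetical diagnostic helper for spotting leaked client threads. */
final class ThreadNameCounter {
    /** Counts live JVM threads whose names contain the given marker. */
    static long countLive(String marker) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().contains(marker))
                .count();
    }
}
```

A count that rises by a fixed amount per bulk load call and never falls is
consistent with connections that are opened but not closed.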


