[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-21 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113151#comment-17113151
 ] 

Hudson commented on HBASE-23938:


Results for branch master
[build #1733 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1733/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1733/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1663//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1733/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1733/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 3. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/master/1733//artifact/output-integration/hadoop-3.log].
 (note that this means we didn't check the Hadoop 3 shaded client)


> Replicate slow/large RPC calls to HDFS
> --
>
> Key: HBASE-23938
> URL: https://issues.apache.org/jira/browse/HBASE-23938
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: Screen Shot 2020-05-20 at 4.23.06 PM.png
>
>
> We should provide capability to replicate complete slow and large RPC logs to 
> HDFS or create new system table in addition to Ring Buffer. This way we don't 
> lose any of slow logs and operator can retrieve all the slow/large logs. 
> Replicating logs to HDFS / creating new system table should be configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112770#comment-17112770
 ] 

Hudson commented on HBASE-23938:


Results for branch branch-2.3
[build #99 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/99/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/99/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/99//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/99//console].


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/99/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112679#comment-17112679
 ] 

Hudson commented on HBASE-23938:


Results for branch branch-2
[build #2670 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2670/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2670/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2670//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2670//console].


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2670/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112602#comment-17112602
 ] 

Hudson commented on HBASE-23938:


Results for branch branch-2.3
[build #98 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/98/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/98/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/98//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/98//console].


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/98/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-15 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108444#comment-17108444
 ] 

Viraj Jasani commented on HBASE-23938:
--

*Heads up:*

I have received one +1 from [~zhangduo] on the PR. Please let me know if anyone 
else would like to review. Keeping the PR open for a few days.

Thanks to everyone involved.

> Replicate slow/large RPC calls to HDFS
> --
>
> Key: HBASE-23938
> URL: https://issues.apache.org/jira/browse/HBASE-23938
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.0
>
> Attachments: Screen Shot 2020-05-07 at 12.01.26 AM.png
>
>
> We should provide capability to replicate complete slow and large RPC logs to 
> HDFS or create new system table in addition to Ring Buffer. This way we don't 
> lose any of slow logs and operator can retrieve all the slow/large logs. 
> Replicating logs to HDFS / creating new system table should be configurable.





[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-15 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108004#comment-17108004
 ] 

Viraj Jasani commented on HBASE-23938:
--

Oh yes, rsgroup could be an added advantage on top of what we have considered 
so far.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-14 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107765#comment-17107765
 ] 

Sean Busbey commented on HBASE-23938:
-

fwiw I like the idea of the system table for taking care of the file management 
stuff. batching and best effort at doing the persistence writes sounds like a 
reasonable way to avoid pouring more load onto a loaded system, especially 
given existing options to segregate tables if needed, i.e. with rs groups.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107076#comment-17107076
 ] 

Viraj Jasani commented on HBASE-23938:
--

FYI [~busbey] [~dmanning]



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107073#comment-17107073
 ] 

Viraj Jasani commented on HBASE-23938:
--

HDFS maintenance might be an extra cost: rolling files, file size limits, etc. 
would have to be handled manually. With a system table, not only are these 
factors taken care of, but we are also not strictly dependent on the HDFS 
FileSystem. For HBase running in the cloud, if HFiles are stored directly in S3, 
Azure Blob storage, etc., we would rather store these logs there and have the 
HBase compute layer take care of all HFile maintenance. That's the reason why 
I also turned off WAL writes for this system table, as per your suggestion to 
avoid multiple HDFS writes, which is quite valid. Thoughts?



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-14 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106955#comment-17106955
 ] 

Anoop Sam John commented on HBASE-23938:


IMHO, for persisting, we should not consider another table within HBase at all. 
(I believe in your latest patch you turned the WAL off for writes to this system 
table.) An HDFS write should be enough. It will take a bit more parsing work if 
we want to go over the past details, but the actual intent of this ask was to 
persist the complete log info, and HDFS writes can satisfy that.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-13 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106518#comment-17106518
 ] 

Viraj Jasani commented on HBASE-23938:
--

Hello Stack/Andrew,

As Andrew said, yes, the main purpose of optionally storing 
responseTooSlow/responseTooLarge RPC logs in HDFS (system table) on 
top of the ring buffer is to persist them forever. While I understand that not 
all users might want to store them permanently, having this option might be 
useful to provide historical system performance with actual RPC data. Moreover, 
the logs stored in the ring buffer and system table are going to be the complete 
request data, as opposed to the trimmed request data logged by RpcServer, e.g.:

"param":"region { type: REGION_NAME value: 
\"t1,\\000\\000\\215\\f)o\\024\\302\\220\\000\\000\\000\\000\\000\\001\\000\\000\\000\\000\\000\\006\\000\\000\\000\\000\\000\\005\\000\\000"

 
{quote}They don’t include environmental or other details beyond the details of 
the request that is too slow. Yet we find value in them now. Adding such detail 
might be possible (some kind of derived load indicator? Like Unix load?) and 
could be pursued in addition to the present goals.
{quote}
I agree, this might be useful as we are planning to persist actual request data 
in system table.

 

There is one implementation difference I have considered in the PR:

While the RingBuffer gets filled asynchronously using the LMAX 
Disruptor as soon as RpcServer identifies a particular RPC call as slow/large, 
writing the same request immediately to the system table might not 
be a preferred option because the system is most likely already slow. Hence, in 
the latest patch, I have a cron running every 10 min that persists the 
slow/large logs held in memory so far (a list of 100 puts in one go). While 
it might increase the load on the system momentarily, the cron runs only every 
10 min (not continuously as and when we get a slow log). Please let me know what 
you think at your convenience: [https://github.com/apache/hbase/pull/1681]
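
To make the batching idea above concrete, here is a minimal sketch in plain Java. It is not the code from the PR: the class name, the String payload type, and the method names are all hypothetical, and the actual table write is left out. It only shows the shape of "queue cheaply on the RPC path, drain periodically in lists of 100".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch, not HBase code: slow/large RPC payloads are queued in
// memory and drained by a periodic task in batches of at most 100.
final class SlowLogBatcherSketch {
    static final int BATCH_SIZE = 100; // "list of 100 puts in one go"

    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    // Called when an RPC is flagged slow/large; only enqueues, performs no I/O.
    void record(String slowLogPayload) {
        pending.add(slowLogPayload);
    }

    // One periodic run: drains everything queued so far and groups it into
    // batches. A real implementation would turn each batch into one table write.
    List<List<String>> drainInBatches() {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String next;
        while ((next = pending.poll()) != null) {
            current.add(next);
            if (current.size() == BATCH_SIZE) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        SlowLogBatcherSketch batcher = new SlowLogBatcherSketch();
        for (int i = 0; i < 250; i++) {
            batcher.record("rpc-" + i);
        }
        // 250 queued entries become 3 batches: 100, 100 and 50.
        System.out.println(batcher.drainInBatches().size());
    }
}
```

In a running server, drainInBatches() could be invoked from a ScheduledExecutorService with a fixed 10 minute delay, so the write load arrives in short bursts rather than on every slow call.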



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-13 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106426#comment-17106426
 ] 

Andrew Kyle Purtell commented on HBASE-23938:
-

Viraj has taken over this project and so may have a different point of view, but 
when I filed the first issue for this, the answer to both your questions was: 
this is a best-effort recording of responseTooSlow warnings by way of a ring 
buffer suitable for online retrieval. An operator can list them with a 
convenient tool (e.g. a shell command) and does not need to scrape logs. Later, 
Busbey and others wanted an option to also persist to HDFS so the recording’s 
history is not subject to ring buffer limits. The value of the responseTooSlow 
warnings does not change from the present. They don’t include environmental or 
other details beyond the details of the request that is too slow. Yet we find 
value in them now. Adding such detail might be possible (some kind of derived 
load indicator? Like Unix load?) and could be pursued in addition to the 
present goals. 
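
For readers unfamiliar with why the history is "subject to ring buffer limits", a minimal sketch of a bounded buffer follows. It is an illustration only, built on java.util.ArrayDeque with hypothetical names, and is not the data structure HBase actually uses: once the capacity is reached, each new entry evicts the oldest one, which is exactly the history loss that persisting elsewhere is meant to avoid.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: a fixed-capacity buffer that keeps only
// the most recent entries.
final class BoundedSlowLogBufferSketch {
    private final int capacity;
    private final ArrayDeque<String> entries = new ArrayDeque<>();

    BoundedSlowLogBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds an entry, evicting the oldest one when the buffer is full.
    synchronized void add(String entry) {
        if (entries.size() == capacity) {
            entries.removeFirst();
        }
        entries.addLast(entry);
    }

    // Returns a copy of the current contents, oldest first.
    synchronized List<String> snapshot() {
        return new ArrayList<>(entries);
    }

    public static void main(String[] args) {
        BoundedSlowLogBufferSketch buffer = new BoundedSlowLogBufferSketch(3);
        for (String s : new String[] {"a", "b", "c", "d"}) {
            buffer.add(s);
        }
        // "a" has been evicted; prints [b, c, d]
        System.out.println(buffer.snapshot());
    }
}
```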



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-12 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105980#comment-17105980
 ] 

Michael Stack commented on HBASE-23938:
---

Thanks for the pointer [~apurtell]

Not answered in the parent:

 * What will an operator 'do' w/ the info? I can see querying ring buffer/table 
content but am interested in how it is used to change loadings; I'm sure there 
is a good use case. I ask because while a bunch of massive writes/reads might 
provoke too slow, usually I see HDFS turn soggy because of rate rather than 
particular reads/writes, so I'm interested in others' findings. How do we know 
an entry in the ring buffer is due to the explicit query rather than because of 
surrounding reads/writes? Is there a compelling case for recording in a table, 
perhaps built on findings so far where use of ring buffers frustrated some 
because the history kept proved too short?
 * Also unanswered is recording of the environment at the time of the slow call: 
state of concurrent handlers, etc. (Perhaps external tooling for machine state 
and perhaps hadoop+hbase metrics are sufficient?)

Just wondering about the benefit/rigging ratio. Thanks.






[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-12 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105874#comment-17105874
 ] 

Andrew Kyle Purtell commented on HBASE-23938:
-

Most of those questions are answered by the parent?



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-12 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105829#comment-17105829
 ] 

Michael Stack commented on HBASE-23938:
---

What is a slow log? Why are we recording them and how will they be processed?  
What will an operator 'do' with the info?

We record 'large' RPCs? What will we do w/ them? If the log is slow, what is 
being written is a factor but the context at the time -- what other handlers 
are doing at the time, what is going on in HDFS, write rate, background 
replication, other process i/o -- are also worthy of factoring; is this 
happening and the missing piece is this recording of RPC?  We are already 
recording RPCs in a circular buffer? This is not enough? We want to keep the 
history for a longer time?  

Thanks.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104095#comment-17104095
 ] 

Viraj Jasani commented on HBASE-23938:
--

Yes, that is correct.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-10 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104094#comment-17104094
 ] 

ramkrishna.s.vasudevan commented on HBASE-23938:


[~vjasani]
Thanks for the info. So if 'hbase.regionserver.slowlog.systable.enabled' 
is disabled, we won't write directly to HDFS either, correct? 




[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-10 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104088#comment-17104088
 ] 

Viraj Jasani commented on HBASE-23938:
--

Yes, with this Jira, two configs are of interest:

1. *hbase.regionserver.slowlog.buffer.enabled* (already present in 
master/branch-2): Default false. If enabled, the in-memory ring buffer will 
store all slow/large RPC calls and users can query this ring buffer with various 
filters.

2. *hbase.regionserver.slowlog.systable.enabled* (in progress as part of this 
Jira): Default false. If enabled, slow/large logs will also flow into the system 
table in addition to the ring buffer. The above config should also be enabled.
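
The dependency between the two flags can be sketched as follows. This is only an illustration of the rule described above: the configuration is modelled as a plain Map rather than HBase's own Configuration class, and the class and helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not HBase code: models the two flags as map entries and
// shows that system table persistence is active only when both are enabled.
final class SlowLogConfigSketch {
    static final String BUFFER_ENABLED = "hbase.regionserver.slowlog.buffer.enabled";
    static final String SYSTABLE_ENABLED = "hbase.regionserver.slowlog.systable.enabled";

    // Both flags default to false when unset, as stated above.
    static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    static boolean ringBufferActive(Map<String, String> conf) {
        return flag(conf, BUFFER_ENABLED);
    }

    // The system table only receives logs if the ring buffer is enabled too.
    static boolean sysTableActive(Map<String, String> conf) {
        return ringBufferActive(conf) && flag(conf, SYSTABLE_ENABLED);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SYSTABLE_ENABLED, "true");
        System.out.println(sysTableActive(conf)); // false: ring buffer flag is still off
        conf.put(BUFFER_ENABLED, "true");
        System.out.println(sysTableActive(conf)); // true: both flags are on
    }
}
```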



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-10 Thread ramkrishna.s.vasudevan (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104081#comment-17104081
 ] 

ramkrishna.s.vasudevan commented on HBASE-23938:


[~vjasani]
Writing the slow/large logs - is it configurable like write to system table or 
to HDFS? If not just turn it off? 



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-08 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102427#comment-17102427
 ] 

Viraj Jasani commented on HBASE-23938:
--

Memstore + flush + compaction are enough. I can use the Skip WAL option.



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-08 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102411#comment-17102411
 ] 

Anoop Sam John commented on HBASE-23938:


Sometimes mutations can stay slow for a longer period (especially when you are 
using remote cloud storage). My point is that we would then be rubbing salt in 
the wound by also writing this log into another table. (Are you planning to 
turn the WAL off for this table?)
When we write only to HDFS, it is just one write. With an HBase table write, if 
the WAL is enabled, that is already one write, followed by a flush and then 
compactions. I am concerned about this write amplification.
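
The concern above can be put into rough numbers. The following is a minimal 
sketch, not HBase code: the class and method names are hypothetical, and it 
assumes each stage (WAL append, flush, each compaction rewrite) writes the 
payload once. HDFS block replication is ignored because it applies to both 
paths.

```java
final class WriteAmplification {
    // Bytes written when a log entry goes straight to a single HDFS file.
    static long directHdfsBytes(long payloadBytes) {
        return payloadBytes;
    }

    // Bytes written when the same entry goes through an HBase table:
    // an optional WAL append, one memstore flush, and `compactionRewrites`
    // later rewrites of the flushed data (an assumed, simplified model).
    static long tableBytes(long payloadBytes, boolean walEnabled, int compactionRewrites) {
        long writes = (walEnabled ? 1 : 0) + 1 + compactionRewrites;
        return payloadBytes * writes;
    }
}
```

Under this model, skipping the WAL removes one of the writes but leaves the 
flush and compaction rewrites in place.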




[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-08 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102395#comment-17102395
 ] 

Viraj Jasani commented on HBASE-23938:
--

While I agree that HDFS-level persistence should be good, I prefer a system 
table because users can easily retrieve all records from the shell and use 
ColumnValue filters, and HBase will take care of StoreFile management.

In both cases (system table and direct HDFS), persisting the complete data is 
going to require extra IO anyway: at the HBase and HDFS levels, or at the HDFS 
level only. With a system table, we do not have to deal with the FileSystem 
layer ourselves.

On the other hand, Mutate calls might be too slow, but they can't be slow 
indefinitely, right? Maybe instead of persisting each entry right after adding 
it to the ring buffer, we can have a cron running every 20 min that looks at 
the in-memory ring buffer entries and persists them all to the system table. 
What do you think?
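
The periodic-persist idea could be sketched roughly as below. This is an 
illustration under assumptions, not the patch: `SlowLogBuffer`, `drain`, and 
`startPeriodicPersist` are hypothetical names, entries are plain strings, and 
the sink stands in for the write to the system table. For simplicity the 
sketch empties the buffer on each run, which a real implementation may not 
want if operators still query the ring buffer directly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class SlowLogBuffer {
    private final int capacity;
    private final ArrayDeque<String> entries = new ArrayDeque<>();

    SlowLogBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Add an entry, evicting the oldest one when the buffer is full.
    synchronized void add(String entry) {
        if (entries.size() == capacity) {
            entries.pollFirst();
        }
        entries.addLast(entry);
    }

    // Remove and return everything currently buffered, oldest first.
    synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(entries);
        entries.clear();
        return batch;
    }

    // Hand the drained batch to `sink` every `periodMinutes` minutes.
    ScheduledExecutorService startPeriodicPersist(Consumer<List<String>> sink, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
            () -> sink.accept(drain()), periodMinutes, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Note that in this sketch any entry evicted from the buffer before the next 
run is never persisted, so the buffer capacity and the period have to be 
chosen together.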

[~anoop.hbase] [~apurtell] [~busbey]



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-07 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102179#comment-17102179
 ] 

Anoop Sam John commented on HBASE-23938:


When Mutate ops are suffering and giving responseTooSlow, writing those logs to 
a system table (memstore) puts more load on the cluster, right? Isn't 
persisting to an HDFS file enough?



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102022#comment-17102022
 ] 

Viraj Jasani commented on HBASE-23938:
--

Please review: [https://github.com/apache/hbase/pull/1681]



[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-06 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101081#comment-17101081
 ] 

Viraj Jasani commented on HBASE-23938:
--

Attaching a sample scan of the hbase:slowlog table: !Screen Shot 2020-05-07 at 
12.01.26 AM.png!

> Replicate slow/large RPC calls to HDFS
> --
>
> Key: HBASE-23938
> URL: https://issues.apache.org/jira/browse/HBASE-23938
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
> Attachments: Screen Shot 2020-05-07 at 12.01.26 AM.png
>
>
> We should provide capability to replicate complete slow and large RPC logs to 
> HDFS or create new system table in addition to Ring Buffer. This way we don't 
> lose any of slow logs and operator can retrieve all the slow/large logs. 
> Replicating logs to HDFS / creating new system table should be configurable.





[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-05-01 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097334#comment-17097334
 ] 

Viraj Jasani commented on HBASE-23938:
--

Oh yes, these query patterns are definitely useful. Since there would not be 
any Get queries and users would want to use Scan with filters (op type, user, 
client etc.), we have no specific requirements for rowkeys. Maybe 
*TooSlowLog.hashCode() (protobuf generated)* as byte[] could be used as the 
rowkey, with a single CF holding all qualifiers (region, op type, client etc.), 
so that clients can use Scan with single/multiple ColumnValueFilter to get the 
relevant data.
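
A rough, self-contained model of that layout, using only the JDK: the row key 
is the 4-byte hash of the payload (standing in for the protobuf-generated 
hash), all attributes live as qualifiers of one family, and a scan applies one 
equality check per requested qualifier. `SlowLogRow` and `scan` are 
hypothetical names and the qualifier names are assumptions; no HBase client 
API is used here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SlowLogRow {
    final byte[] rowKey;
    final Map<String, String> qualifiers;

    SlowLogRow(Map<String, String> qualifiers) {
        this.qualifiers = new LinkedHashMap<>(qualifiers);
        // Stand-in for the payload's hashCode() serialized as a 4-byte key.
        this.rowKey = ByteBuffer.allocate(Integer.BYTES)
            .putInt(qualifiers.hashCode())
            .array();
    }

    // Emulates a full scan with one equality filter per (qualifier, value) pair.
    static List<SlowLogRow> scan(List<SlowLogRow> table, Map<String, String> wanted) {
        List<SlowLogRow> hits = new ArrayList<>();
        for (SlowLogRow row : table) {
            if (row.qualifiers.entrySet().containsAll(wanted.entrySet())) {
                hits.add(row);
            }
        }
        return hits;
    }
}
```

One thing the sketch makes visible: a 32-bit hash can collide, and two entries 
with the same key would overwrite each other in a real table, so the key may 
need an additional component such as a timestamp.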

> Replicate slow/large RPC calls to HDFS
> --
>
> Key: HBASE-23938
> URL: https://issues.apache.org/jira/browse/HBASE-23938
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> We should provide capability to replicate complete slow and large RPC logs to 
> HDFS or create new system table in addition to Ring Buffer. This way we don't 
> lose any of slow logs and operator can retrieve all the slow/large logs. 
> Replicating logs to HDFS / creating new system table should be configurable.





[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-04-30 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17097044#comment-17097044
 ] 

Andrew Kyle Purtell commented on HBASE-23938:
-

A suggestion: keep the most likely query patterns in mind when thinking about 
row key and CF partitioning. I think the probable order of preference will be:
* Slow logs by table and/or namespace
* Slow logs by operation type
* Slow logs by user
* Slow logs by client address
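
To illustrate why the order of preference matters for the row key: rows are 
stored sorted by key, so whatever comes first in the key can be queried with a 
cheap prefix scan, while later fields need filters. The sketch below models 
this with a `TreeMap`. It is only one possible layout, assuming a 
`table|opType|user|sequence` key with `|` never appearing in the fields; 
`SlowLogIndex` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class SlowLogIndex {
    // Sorted by key, like rows in a table ordered by row key.
    private final TreeMap<String, String> rows = new TreeMap<>();
    private long seq = 0;

    // Key fields follow the assumed order of preference: table, op type, user.
    void put(String table, String opType, String user, String payload) {
        String key = table + '|' + opType + '|' + user + '|' + String.format("%010d", seq++);
        rows.put(key, payload);
    }

    // Return the payloads of all rows whose key starts with `prefix`, in key order.
    List<String> prefixScan(String prefix) {
        return new ArrayList<>(rows.subMap(prefix, prefix + Character.MAX_VALUE).values());
    }
}
```

With this layout, "slow logs by table" and "by table and operation type" are 
prefix scans, whereas "by user" alone or "by client address" would still need 
a filtered full scan.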

> Replicate slow/large RPC calls to HDFS
> --
>
> Key: HBASE-23938
> URL: https://issues.apache.org/jira/browse/HBASE-23938
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.3.0, 1.7.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> We should provide capability to replicate complete slow and large RPC logs to 
> HDFS or create new system table in addition to Ring Buffer. This way we don't 
> lose any of slow logs and operator can retrieve all the slow/large logs. 
> Replicating logs to HDFS / creating new system table should be configurable.





[jira] [Commented] (HBASE-23938) Replicate slow/large RPC calls to HDFS

2020-04-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17096409#comment-17096409
 ] 

Viraj Jasani commented on HBASE-23938:
--

[~apurtell] [~stack] [~ndimiduk]  

As part of this Jira, I am planning to create a new, optional (configurable) 
system table that stores all slow/large RPC logs when the user turns it on (in 
addition to the ring buffer). The common details of the logs would be: client 
address, call details, param, responseSize, userName etc. Since all of these 
details are expected to be present for any slow/large RPC log, I think it would 
be better to store all these attributes as qualifiers of a single column 
family.
