[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-10 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160331#comment-16160331
 ] 

Anastasia Braginsky commented on HBASE-18748:
-

Here is a summary of the design points so far:

1. A cache block load will produce a new type of WAL entry, a cache-in WAL 
entry, interleaved among the other, write-caused WAL entries.
2. No coprocessor observer is defined for the cache-in event, so no coprocessor 
code is going to be used. The suggestion is to modify the HBase internal code.
3. The idea is to create a cache-in WAL entry only after (1) X cache-in events 
have accumulated or (2) Y units of time have passed, whichever comes first. 
This means a WAL update with a cache-in WAL entry should be a rare case. 
Thus the change in WAL size (on top of the write WAL updates) should be 
minimal, hopefully not noticeable, subject to benchmarking, of course.
4. When a cache-in WAL entry arrives at the other (secondary) cluster, it 
triggers a read on that remote cluster, which hopefully loads the block into 
its cache. A possible feature extension is to trigger only the cache-in itself.
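
The "X events or Y time units, whichever comes first" condition in point 3 could be sketched roughly as below. This is only an illustration, not HBase code: the class name `CacheInBatcher`, the string block keys, and the `walAppender` callback (a stand-in for appending one cache-in WAL entry) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical batcher (not an HBase class): emits one batch after maxEvents
 * cache-ins or after maxAgeMillis, whichever comes first.
 */
final class CacheInBatcher {
    private final int maxEvents;        // "X" in the design summary
    private final long maxAgeMillis;    // "Y" in the design summary
    // Assumed stand-in for "append one cache-in WAL entry".
    private final Consumer<List<String>> walAppender;
    private final List<String> pending = new ArrayList<>();
    private long oldestPendingMillis;

    CacheInBatcher(int maxEvents, long maxAgeMillis, Consumer<List<String>> walAppender) {
        if (maxEvents < 1 || maxAgeMillis < 1) {
            throw new IllegalArgumentException("thresholds must be positive");
        }
        this.maxEvents = maxEvents;
        this.maxAgeMillis = maxAgeMillis;
        this.walAppender = walAppender;
    }

    /** Records one cache-in. The clock value is passed in to keep the logic testable. */
    synchronized void onCacheIn(String blockKey, long nowMillis) {
        if (pending.isEmpty()) {
            oldestPendingMillis = nowMillis;
        }
        pending.add(blockKey);
        if (pending.size() >= maxEvents || nowMillis - oldestPendingMillis >= maxAgeMillis) {
            flush();
        }
    }

    /** Meant to be called periodically so that old events are flushed during quiet periods. */
    synchronized void tick(long nowMillis) {
        if (!pending.isEmpty() && nowMillis - oldestPendingMillis >= maxAgeMillis) {
            flush();
        }
    }

    private void flush() {
        walAppender.accept(new ArrayList<>(pending)); // hand over a copy of the batch
        pending.clear();
    }
}
```

With thresholds like these, a burst of cache-ins costs one WAL entry per X events, and a trickle costs at most one entry per Y time units, which is what would keep the extra WAL volume small.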


> Cache pre-warming upon replication
> --
>
> Key: HBASE-18748
> URL: https://issues.apache.org/jira/browse/HBASE-18748
> Project: HBase
> Issue Type: New Feature
> Reporter: Anastasia Braginsky
>
> HBase's cluster replication is a very important and widely used feature. Let's 
> assume the primary cluster is replicated to a secondary (backup) cluster using 
> the WAL of the primary cluster to propagate the changes. Let's also assume the 
> secondary cluster is the failover target and should become primary when 
> needed.
> We suggest improving the way HBase cluster failover works today. Namely, 
> upon failover, the backup RS's cache is cold. Warming it up to the right 
> working set takes many minutes. The suggested solution is to selectively 
> replay read requests at the backup, namely those reads that caused 
> cache-ins at the primary. We intend to use WAL replication as the transport 
> protocol (hopefully as a black box) and, of course, add custom replay 
> callbacks. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-10 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160329#comment-16160329
 ] 

Anastasia Braginsky commented on HBASE-18748:
-

bq. Our cache is pluggable. You could reference a cache that overloads our 
default that does a write to WAL (async) whenever we pull in a block? Something 
like that? 

[~stack], I think it is an excellent idea :)



[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-10 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160328#comment-16160328
 ] 

Anastasia Braginsky commented on HBASE-18748:
-

Hey, [~zyork]!

Sorry for the late reply, and thanks for your suggestions and references.

bq. Do you guys already enable this config?
No, it is yet to be implemented.

bq. Configuration key to prefetch all blocks of a given file into the block 
cache when the file is opened.
This is not exactly what we are talking about. We want to load only some blocks 
on the secondary, and only in response to the corresponding blocks being loaded 
into the cache on the primary.

bq. Also you are mentioning multiple clusters here, have you taken a look at 
https://issues.apache.org/jira/browse/HBASE-18477?
Thanks for the reference. Again, this is not exactly what we are talking about, 
but it is a nice reference to look at.



[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-05 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154602#comment-16154602
 ] 

Zach York commented on HBASE-18748:
---

[~anastas] Do you guys already enable this config? This can be used to limit 
latency spikes, but it seems like you want it to already be prewarmed.

  /**
   * Configuration key to prefetch all blocks of a given file into the block cache
   * when the file is opened.
   */
  public static final String PREFETCH_BLOCKS_ON_OPEN_KEY =
      "hbase.rs.prefetchblocksonopen";

Also you are mentioning multiple clusters here, have you taken a look at 
https://issues.apache.org/jira/browse/HBASE-18477? It might not be exactly what 
you are looking for, but it could help.



[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16152923#comment-16152923
 ] 

stack commented on HBASE-18748:
---

What are you thinking, [~anastas]? The WAL hosts all writes to HBase. It does 
not do reads. Our cache is pluggable. You could reference a cache that 
overloads our default and does a write to the WAL (async) whenever we pull in a 
block? Something like that? I don't think we currently have any hooks other 
than the choice of which cache implementation to load.
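
The pluggable-cache idea above might look roughly like the following wrapper. It is a sketch under stated assumptions: `SimpleBlockCache`, `MapBlockCache`, and `NotifyingBlockCache` are hypothetical names invented for this example, HBase's actual block cache interface differs, and `cacheInListener` merely stands in for whatever would eventually write to the WAL.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical, minimal cache interface used only for this sketch. */
interface SimpleBlockCache {
    void cacheBlock(String key, byte[] block);
    byte[] getBlock(String key);
}

/** Trivial in-memory cache standing in for the default implementation. */
final class MapBlockCache implements SimpleBlockCache {
    private final Map<String, byte[]> blocks = new ConcurrentHashMap<>();

    @Override
    public void cacheBlock(String key, byte[] block) {
        blocks.put(key, block);
    }

    @Override
    public byte[] getBlock(String key) {
        return blocks.get(key);
    }
}

/** Wraps another cache and reports every cache-in asynchronously. */
final class NotifyingBlockCache implements SimpleBlockCache {
    private final SimpleBlockCache delegate;
    private final Executor executor;                // decides how "async" the notification is
    private final Consumer<String> cacheInListener; // assumed stand-in for the WAL write

    NotifyingBlockCache(SimpleBlockCache delegate, Executor executor,
                        Consumer<String> cacheInListener) {
        this.delegate = delegate;
        this.executor = executor;
        this.cacheInListener = cacheInListener;
    }

    @Override
    public void cacheBlock(String key, byte[] block) {
        delegate.cacheBlock(key, block);
        // Hand the notification to the executor so the caching path does not wait for it.
        executor.execute(() -> cacheInListener.accept(key));
    }

    @Override
    public byte[] getBlock(String key) {
        return delegate.getBlock(key);
    }
}
```

Passing a thread-pool executor would keep the notification off the read path, while passing `Runnable::run` makes it synchronous, which is convenient for tests.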



[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

2017-09-03 Thread Anastasia Braginsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151792#comment-16151792
 ] 

Anastasia Braginsky commented on HBASE-18748:
-

As explained in the description, we would like to add a feature to the HBase 
replication methodology. Failover from the primary cluster to the secondary 
should have zero effect on read latency. Currently there is a spike in read 
latency upon failover because the cache on the secondary is cold. Simply 
redirecting reads to the secondary prior to failover (duplication by the user 
application) resolves this issue. However, making the secondary process all 
the reads is a waste of resources. Therefore, the suggestion is to redirect only 
the "relevant" reads. In other words, the suggested solution is to selectively 
replay read requests at the backup, namely those reads that caused cache-ins 
at the primary. 

We intend to use WAL replication as the transport protocol (hopefully as a 
black box) and, of course, add custom replay callbacks. That means adding a new 
"read type" of WAL entry, which is going to be rare because it is written only 
upon cache-in. Those read WAL entries are going to be replicated to the 
secondary cluster. Of course, the cached blocks on the primary and the 
secondary may diverge, but this is a good heuristic.
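
To illustrate the replay side, here is a minimal sketch of what a replay callback on the secondary could do with one replicated read entry. Everything in it is an assumption made for illustration: the class `SecondaryCacheWarmer`, the string block keys, the `storageReader` function standing in for a real read, and the small LRU map standing in for the block cache.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical replay target: warms a small LRU cache from replicated cache-in keys. */
final class SecondaryCacheWarmer {
    private final Map<String, byte[]> cache;
    private final Function<String, byte[]> storageReader; // assumed stand-in for a real read

    SecondaryCacheWarmer(int capacity, Function<String, byte[]> storageReader) {
        this.storageReader = storageReader;
        // An access-ordered LinkedHashMap gives simple LRU eviction.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Replays one replicated entry; returns how many blocks were newly loaded. */
    synchronized int replay(List<String> blockKeys) {
        int loaded = 0;
        for (String key : blockKeys) {
            if (cache.get(key) == null) { // get() also refreshes LRU order on a hit
                byte[] block = storageReader.apply(key);
                if (block != null) {      // the block may not exist on the secondary
                    cache.put(key, block);
                    loaded++;
                }
            }
        }
        return loaded;
    }

    synchronized boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

Because the secondary evicts according to its own capacity and access order, its cache contents can differ from the primary's, which is the divergence mentioned above.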

What do you think about this suggestion? [~stack] and everybody, we would like 
to hear from you! Maybe this is already implemented and we are not aware of it?
