[jira] Commented: (HBASE-2223) Handle 10min+ network partitions between clusters

HBase Review Board (JIRA) Fri, 25 Jun 2010 17:37:16 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882787#action_12882787
 ]

HBase Review Board commented on HBASE-2223:
-------------------------------------------

Message from: "Jean-Daniel Cryans" <[email protected]>

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 508
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1580#file1580line508>
bq.  >
bq.  >     Is this a leak from another patch?

Like I said in the description of this patch: "It also has the patch from 
HBASE-2707 applied because without it it's too easy to fail TestReplication. "

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java, 
line 114
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1582#file1582line114>
bq.  >
bq.  >     This is a leak from another patch?

Like I said in the description of this patch: "It also has the patch from 
HBASE-2707 applied because without it it's too easy to fail TestReplication. "

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/ServerManager.java, line 153
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1583#file1583line153>
bq.  >
bq.  >     Whats going on here?  We're setting into config the logcleaner to 
use but then in the lines after this we go ahead and create OldLogsCleaner 
anyways?

OldLogsCleaner uses the the class that was specified in the configuration.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, 
line 243
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1584#file1584line243>
bq.  >
bq.  >     Just call this 'replication' or 'replicationEnabled'

Side note, I think every reviewer had a different preference for the name of 
this attribute.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 
219
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1585#file1585line219>
bq.  >
bq.  >     If the above mentioned Replication class were a singleton, it could 
be shared here with HRS

I created a visitor list to inspect log entries before they get appended. 
Replication is now out of HLog.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 
1043
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1585#file1585line1043>
bq.  >
bq.  >     It'd be good if we didn't have to create this byte [] per edit.  I 
should reinstitute the comparator I had that took KVs but only compared family 
portion of the KV... no need to create family byte [] then.

Should we open an issue?

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java, line 
1208
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1585#file1585line1208>
bq.  >
bq.  >     YOu might say why its being moved.  Should be INFO level?  WALs are 
kinda important to track?

ok

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeperWrapper.java,
 line 207
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1587#file1587line207>
bq.  >
bq.  >     Why do it this way?  Why an if/else?  Why not do if (...length != 3) 
throw .... no need of an else.

ooops yes

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeperWrapper.java,
 line 238
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1587#file1587line238>
bq.  >
bq.  >     This is an error? Is it critical fail (We dropped recording a WAL)  
Are we just logging and moving on?

https://issues.apache.org/jira/browse/HBASE-2791

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java,
 line 48
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1588#file1588line48>
bq.  >
bq.  >     Why this implementation have to know about other implementations?  
Can't we do a chain of decision classes? Any class can say no?  As soon as any 
decision class says no, we exit the chain.... So in this case, first on the 
chain would be the ttl decision... then would be this one... and third would be 
the snapshotting decision. You don't have to do the chain as part of this patch 
but please open an issue to implement.

https://issues.apache.org/jira/browse/HBASE-2792

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/replication/package.html, line 25
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1589#file1589line25>
bq.  >
bq.  >     Its not just MDC, right?

That'll be Multi Cluster Replication then!

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/replication/package.html, line 47
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1589#file1589line47>
bq.  >
bq.  >     Is it user tables or CF in user tables?

true, CF

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 52
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line52>
bq.  >
bq.  >     What is the 'main thread'?

stale

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 55
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line55>
bq.  >
bq.  >     Can you make this clearer?

that was stale too

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 57
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line57>
bq.  >
bq.  >     Whats this mean?  Lost edits?

stale stale

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 59
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line59>
bq.  >
bq.  >     Maybe say something about how it works... pool of HTables to write 
remote peer, etc.

sure thing

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 93
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line93>
bq.  >
bq.  >     Why again does replication keep its own set of logs rather than just 
keep positions on HRS logs (whether local HRS or the HRS of others?)  So it 
prevails across HRS failures?  Aren't logs archived now rather than thrown away 
so this should be fine?  Otherwise, we are writing edits to FS twice, right?

It's totally different in this context, but I currently removed the logs in 
ReplicationSink until some refactoring happens.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 119
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line119>
bq.  >
bq.  >     This should be inside a finally block?

good catch

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 149
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line149>
bq.  >
bq.  >     Duplicated code -- see up about ten lines.

Not it's different, one inserts and the other deletes

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java,
 line 162
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1590#file1590line162>
bq.  >
bq.  >     You are stopping the regionserver because replication failed?

is it too bold?

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 66
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line66>
bq.  >
bq.  >     One second seems short?

It's actually more like 55 seconds

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 79
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line79>
bq.  >
bq.  >     We carry edits in memory?  And its outside of our normal accounting 
-- e.g. cache and max for memstores?  How we get it into the mix?  Else we 
could OOME if we don't account for this memory payload.

You think ReplicationSource should implement HeapSize?

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 84
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line84>
bq.  >
bq.  >     Ratio?

eh

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 177
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line177>
bq.  >
bq.  >     Whats this about?  IN particular, the 'handling story'...

I'll talk about this more in the doc

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 367
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line367>
bq.  >
bq.  >     When would it make sense to ever replicate catalog table entries?
bq.  >     
bq.  >     Why we even in here reading a file if replicating == false

- When would it make sense to ever replicate catalog table entries?
Never I'd say, why do you ask?

- Why we even in here reading a file if replicating == false
A user can switch that at anytime

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 372
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line372>
bq.  >
bq.  >     Are you checking size of these edits?  You'll read 25k entries of 
any size?

That's exactly what this does:

      if ((this.reader.getPosition() - this.position)
          >= this.replicationQueueSizeCapacity ||

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java,
 line 623
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1591#file1591line623>
bq.  >
bq.  >     Where do you explain replication hlog filenaming convention?

Replication doesn't have a special naming convention.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceInterface.java,
 line 79
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1592#file1592line79>
bq.  >
bq.  >     Can only do one sink, right?  Might want to say so.  How does the 
source know the sink to replicate to?  Should that be part of the Interface?
bq.  >     
bq.  >     The comment for the Interface could say how the Interface is used... 
init, then set sink, then per file archived, call ... etc.

will add that

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 43
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line43>
bq.  >
bq.  >     Make this "This class manages all the replication sources"
bq.  >     
bq.  >     "There are two classes of them"... what you mean to say here?  Two 
classes of source?

Yes, recovered or not, and it's even explained the 2 lines after.

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 46
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line46>
bq.  >
bq.  >     I don't follow this commentary.

Changing it to:
Old sources, they were recovered from a failed region server and our only goal 
is to finish replicating the HLog queue it had up in ZK

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 79
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line79>
bq.  >
bq.  >     You should say more here about whats going on.. either here or up in 
the class comment.  It has watchers on all other RSs in current cluster and 
will try to take on the workloads of others on crash (?).

ok

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 116
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line116>
bq.  >
bq.  >     You reporting status or updating updated position into zk?  There is 
a running status being kept up in zk?  Shouldn't this log message have the log 
path in it too?
bq.  >     
bq.  >     This method does more than update position or report status... looks 
like it changes state in this class updating set of logs?

will change name and add doc

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 194
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line194>
bq.  >
bq.  >     Shouldn't this be nonmodifiable List that you give out here since it 
seems you have it synchronized internally here.

yep

bq.  On 2010-06-25 15:07:35, stack wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java,
 line 321
bq.  > <http://review.hbase.org/r/76/diff/7/?file=1593#file1593line321>
bq.  >
bq.  >     What other regionserver?  There is alot that goes unexplained in 
here?  Could point to the doc. on how replication works.  Same for the transer 
method above (I have an idea because you white boarded it for me).

k

- Jean-Daniel

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/76/#review281
-----------------------------------------------------------

> Handle 10min+ network partitions between clusters
> -------------------------------------------------
>
>                 Key: HBASE-2223
>                 URL: https://issues.apache.org/jira/browse/HBASE-2223
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2223.patch
>
>
> We need a nice way of handling long network partitions without impacting a 
> master cluster (which pushes the data). Currently it will just retry over and 
> over again.
> I think we could:
>  - Stop replication to a slave cluster if it didn't respond for more than 10 
> minutes
>  - Keep track of the duration of the partition
>  - When the slave cluster comes back, initiate a MR job like HBASE-2221 
> Maybe we want less than 10 minutes, maybe we want this to be all automatic or 
> just the first 2 parts. Discuss.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-2223) Handle 10min+ network partitions between clusters

Reply via email to