Re: Replication

Sebastian Bauer Tue, 13 Jul 2010 15:03:31 -0700

 On 13.07.2010 23:50, Jean-Daniel Cryans wrote:

Yeah, an experimental feature can be "odd" to use :D

I love bleeding edge technologies :D

So one of the following is happening:


  1) You aren't using a version of hadoop patched enough to get
replication working fully. Trunk uses a special jar that I patched
myself. CDH3b2 also has everything needed. What this means is that
it's trying to open the log file but the first block isn't available
(it's actually a very small patch for the Namenode).

I'm using Hadoop from the 0.20-append branch released with hbase-0.89.xxxx

  2) The file is empty, because nothing was written to the log file.
What this means is that it's trying to open the log file but there's
not even a single block in it, so it fails on EOF.

This could be the cause, because when everything is running I see fewer of these traces.
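To illustrate case 2, here is a minimal, self-contained sketch of why an empty file ends in an EOFException. The stack trace further down shows SequenceFile$Reader.init calling DataInputStream.readFully; the sketch assumes that call is reading a small fixed-size header at the start of the file. The class name EmptyLogDemo, the method canReadHeader and the 4-byte HEADER_LENGTH are made up for the illustration and are not taken from Hadoop or HBase.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class EmptyLogDemo {
    // Assumed header size, for illustration only.
    static final int HEADER_LENGTH = 4;

    /** Returns true if a full header can be read, false if the data ends first. */
    static boolean canReadHeader(byte[] fileContents) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents));
        try {
            // readFully throws EOFException when fewer than HEADER_LENGTH bytes remain.
            in.readFully(new byte[HEADER_LENGTH]);
            return true;
        } catch (EOFException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // An in-memory byte array stands in for the log file on HDFS.
        System.out.println("empty file:     " + canReadHeader(new byte[0]));
        System.out.println("file with data: " + canReadHeader(new byte[] {1, 2, 3, 4}));
    }
}
```

If this is what is happening, the exception would be expected to go away once something has been written to the log, which fits seeing fewer traces while the cluster is busy.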

J-D

Thanks for your help :)

On Tue, Jul 13, 2010 at 2:37 PM, Sebastian Bauer<[email protected]>  wrote:

  After trying to set up replication I got many of these errors:

2010-07-13 23:35:26,498 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Waited
too long for this file, considering dumping
2010-07-13 23:35:26,498 DEBUG
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unable
to open a reader, sleeping 100 times 10
2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.Store:
Completed compaction of 3 file(s) in c of
CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd.;
new storefile is
hdfs://db2a:50001/hbase/CampaignToUsers/6504d518fb224efe1530e79c198994cd/c/226233377281334567;
store size is 19.6m
2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.HRegion:
compaction completed on region
CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd.
in 1sec
2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Starting compaction on region
UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d.
2010-07-13 23:35:27,112 DEBUG org.apache.hadoop.hbase.regionserver.Store:
Compaction size of c: 31.4m; Skipped 0 file(s), size: 0
2010-07-13 23:35:27,112 INFO org.apache.hadoop.hbase.regionserver.Store:
Started compaction of 3 file(s) in c of
UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d.  into
hdfs://db2a:50001/hbase/UsersToCampaign/ecb7605434967e247ce14d525849495d/.tmp, seqid=65302505
2010-07-13 23:35:27,498 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening
log for replication 85.232.237.234%3A60020.1279056880911 at 0
2010-07-13 23:35:27,499 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test
Got:
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
        at
org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
        at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
        at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)
2010-07-13 23:35:27,499 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Waited
too long for this file, considering dumping
2010-07-13 23:35:27,499 DEBUG
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unable
to open a reader, sleeping 100 times 10
2010-07-13 23:35:28,499 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening
log for replication 85.232.237.234%3A60020.1279056880911 at 0
2010-07-13 23:35:28,500 WARN
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test
Got:
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
        at
org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
        at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
        at
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)

On 13.07.2010 20:18, Jean-Daniel Cryans wrote:

No, but you can use the new mapreduce utility
org.apache.hadoop.hbase.mapreduce.CopyTable to copy whole tables
between clusters. It's like distcp for HBase.

Oh and looking at the documentation I just figured that I changed the
name of the configuration that enables replication just before
committing and forgot to update the package.html file, it's now simply
hbase.replication (and it should stay like that). I'll fix that in the
scope of HBASE-2808.

J-D

On Tue, Jul 13, 2010 at 11:12 AM, Sebastian Bauer<[email protected]>
  wrote:

  I have one more question: can I first create the master and connect the
slave after loading data, or turn on replication for existing tables that
already contain data?

On 13.07.2010 19:56, Jean-Daniel Cryans wrote:

Thanks for the info. Where can I find some documentation? There is a note
about ZooKeeper saying that it needs to run in standalone mode; is that
true?

Well, you can run add_peer.rb when the clusters are running, but they
won't pick up the change live (that part isn't done yet). So if you run
the script while the cluster is running, restart it. Also take a look
at the region server log, it should output something like this when
starting up:

     LOG.info("This cluster (" + thisCluster + ") is a "
           + (this.replicationMaster ? "master" : "slave") + " for replication"
           + ", compared with (" + address + ")");

This will tell you if you used the right address for ZooKeeper. If
your region server on the master cluster thinks it's a slave, then the
addresses are wrong. Also, currently there's no reporting for
replication, since it's not done yet!
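As a rough aid for that check, the sketch below pulls the role out of a region server log line. It assumes the logged text matches the LOG.info call quoted above exactly; the wording in a given build may differ. The class name ReplicationRoleCheck, the method roleFrom and the sample line (including its addresses) are hypothetical and exist only for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplicationRoleCheck {
    // Pattern derived from the LOG.info call quoted above (an assumption
    // about the exact message text, not a documented log format).
    private static final Pattern ROLE_LINE = Pattern.compile(
        "This cluster \\((.*?)\\) is a (master|slave) for replication, compared with \\((.*?)\\)");

    /** Returns "master" or "slave" if the line is the startup message, otherwise empty. */
    static Optional<String> roleFrom(String logLine) {
        Matcher m = ROLE_LINE.matcher(logLine);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Invented sample line; the timestamp, logger name and addresses are placeholders.
        String sample = "2010-07-13 23:35:00,000 INFO SomeLogger: "
            + "This cluster (zk-a:2181) is a slave for replication, compared with (zk-b:2181)";
        System.out.println(roleFrom(sample).orElse("not a replication startup line"));
    }
}
```

Running something like this over the master cluster's region server log and getting "slave" back would point to a wrong ZooKeeper address, as described above.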

For a more in-depth documentation, check out
https://issues.apache.org/jira/browse/HBASE-2808

Thanks for trying this out, as the author of most of that part of the
code I'm thrilled!

J-D
