Re: Replication

Jean-Daniel Cryans Tue, 13 Jul 2010 15:10:01 -0700

I just looked at the head of 0.20-append and I see it contains the
missing patch (it was committed as part of HDFS-1057).


So that would mean that the file is just empty :) If you insert a few
rows in the shell on the master cluster, do you see them some seconds
later on the slave?
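
To illustrate the empty-file case, here is a minimal sketch, not the actual
HBase or Hadoop code. It assumes only that a SequenceFile starts with a
4-byte header (the bytes 'S', 'E', 'Q' plus a version byte) that the reader
fetches with DataInputStream.readFully, as the stack trace suggests. The
class and method names (EmptyLogDemo, canReadHeader) are made up for this
example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class EmptyLogDemo {
    // Assumed header size: 'S', 'E', 'Q' followed by one version byte.
    static final int HEADER_LENGTH = 4;

    /**
     * Returns true if a SequenceFile-style header can be read from the
     * given bytes, false if the stream ends before the header is complete.
     */
    public static boolean canReadHeader(byte[] fileContents) {
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(fileContents));
        byte[] header = new byte[HEADER_LENGTH];
        try {
            // readFully throws EOFException when fewer bytes are available
            // than requested, which is what happens on an empty file.
            in.readFully(header);
        } catch (EOFException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
    }

    public static void main(String[] args) {
        // An empty log: nothing has been written yet.
        System.out.println(canReadHeader(new byte[0]));                  // prints false
        // A log that holds at least the header.
        System.out.println(canReadHeader(new byte[] {'S', 'E', 'Q', 6})); // prints true
    }
}
```

If this is what is going on, the EOFException should stop appearing once
something has actually been written to the log.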

J-D

On Tue, Jul 13, 2010 at 3:01 PM, Sebastian Bauer <[email protected]> wrote:
> On 13.07.2010 23:50, Jean-Daniel Cryans wrote:
>>
>> Yeah, using an experimental feature can be "odd" :D
>
> I love bleeding edge technologies :D
>>
>> So one of the following is happening:
>>
>>  1) You aren't using a version of hadoop patched enough to get
>> replication working fully. Trunk uses a special jar that I patched
>> myself. CDH3b2 also has everything needed. What this means is that
>> it's trying to open the log file but the first block isn't available
>> (it's actually a very small patch for the Namenode).
>
> I'm using Hadoop from the 0.20-append branch released with hbase-0.89.xxxx
>
>>  2) The file is empty, because nothing was written to the log file.
>> What this means is that it's trying to open the log file but there's
>> not even a single block in it, so it fails on EOF.
>
> That could well be the cause, because when everything is running I see
> fewer of these traces.
>
>> J-D
>>
> Thanks for your help :)
>
>> On Tue, Jul 13, 2010 at 2:37 PM, Sebastian Bauer<[email protected]>
>>  wrote:
>>>
>>> After trying to set up replication I got many of these errors:
>>>
>>> 2010-07-13 23:35:26,498 WARN
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Waited
>>> too long for this file, considering dumping
>>> 2010-07-13 23:35:26,498 DEBUG
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Unable
>>> to open a reader, sleeping 100 times 10
>>> 2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.Store:
>>> Completed compaction of 3 file(s) in c of
>>>
>>> CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd.;
>>> new
>>>  storefile is
>>>
>>> hdfs://db2a:50001/hbase/CampaignToUsers/6504d518fb224efe1530e79c198994cd/c/226233377281334567;
>>> store size is 19.6m
>>> 2010-07-13 23:35:27,111 INFO
>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>> compaction completed on region
>>>
>>> CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd.
>>> in 1sec
>>> 2010-07-13 23:35:27,111 INFO
>>> org.apache.hadoop.hbase.regionserver.HRegion:
>>> Starting compaction on region
>>> UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d.
>>> 2010-07-13 23:35:27,112 DEBUG org.apache.hadoop.hbase.regionserver.Store:
>>> Compaction size of c: 31.4m; Skipped 0 file(s), size: 0
>>> 2010-07-13 23:35:27,112 INFO org.apache.hadoop.hbase.regionserver.Store:
>>> Started compaction of 3 file(s) in c of
>>> UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d.  into
>>> hdfs://db2a:50001/hbase/UsersToCampaign/ecb7605434967e247ce14d525849495d/.tmp,
>>> seqid=65302505
>>> 2010-07-13 23:35:27,498 INFO
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Opening
>>> log for replication 85.232.237.234%3A60020.1279056880911 at 0
>>> 2010-07-13 23:35:27,499 WARN
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test
>>> Got:
>>> java.io.EOFException
>>>        at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>        at java.io.DataInputStream.readFully(DataInputStream.java:152)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>>        at
>>>
>>> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
>>>        at
>>>
>>> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
>>>        at
>>> org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
>>>        at
>>>
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
>>>        at
>>>
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)
>>> 2010-07-13 23:35:27,499 WARN
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Waited
>>> too long for this file, considering dumping
>>> 2010-07-13 23:35:27,499 DEBUG
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Unable
>>> to open a reader, sleeping 100 times 10
>>> 2010-07-13 23:35:28,499 INFO
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
>>> Opening
>>> log for replication 85.232.237.234%3A60020.1279056880911 at 0
>>> 2010-07-13 23:35:28,500 WARN
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test
>>> Got:
>>> java.io.EOFException
>>>        at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>        at java.io.DataInputStream.readFully(DataInputStream.java:152)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>>        at
>>> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>>        at
>>>
>>> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
>>>        at
>>>
>>> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
>>>        at
>>> org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
>>>        at
>>>
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
>>>        at
>>>
>>> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)
>>>
>>> On 13.07.2010 20:18, Jean-Daniel Cryans wrote:
>>>>
>>>> No, but you can use the new mapreduce utility
>>>> org.apache.hadoop.hbase.mapreduce.CopyTable to copy whole tables
>>>> between clusters. It's like distcp for HBase.
>>>>
>>>> Oh and looking at the documentation I just figured that I changed the
>>>> name of the configuration that enables replication just before
>>>> committing and forgot to update the package.html file, it's now simply
>>>> hbase.replication (and it should stay like that). I'll fix that in the
>>>> scope of HBASE-2808.
>>>>
>>>> J-D
>>>>
>>>> On Tue, Jul 13, 2010 at 11:12 AM, Sebastian Bauer<[email protected]>
>>>>  wrote:
>>>>>
>>>>> I have one more question: can I create the master first and connect
>>>>> the slave after loading data, or turn on replication on existing
>>>>> tables that already contain data?
>>>>>
>>>>> On 13.07.2010 19:56, Jean-Daniel Cryans wrote:
>>>>>>>
>>>>>>> Thanks for the info. Where can I find some documentation? It says
>>>>>>> that ZooKeeper needs to be running in standalone mode; is that
>>>>>>> true?
>>>>>>>
>>>>>> Well you can run add_peer.rb when the clusters are running, but they
>>>>>> won't pickup the change live (that part isn't done yet). So if you run
>>>>>> the script while the cluster is running, restart it. Also take a look
>>>>>> at the region server log, it should output something like this when
>>>>>> starting up:
>>>>>>
>>>>>>     LOG.info("This cluster (" + thisCluster + ") is a "
>>>>>>           + (this.replicationMaster ? "master" : "slave")
>>>>>>           + " for replication" + ", compared with (" + address + ")");
>>>>>>
>>>>>> This will tell you if you used the right address for zookeeper. If
>>>>>> your region server on the master cluster thinks it's a slave, then the
>>>>>> addresses are wrong. Also currently there's no reporting for
>>>>>> replication, since it's not done yet!
>>>>>>
>>>>>> For a more in-depth documentation, check out
>>>>>> https://issues.apache.org/jira/browse/HBASE-2808
>>>>>>
>>>>>> Thanks for trying this out, as the author of most of that part of the
>>>>>> code I'm thrilled!
>>>>>>
>>>>>> J-D
>>>>>>
>>>
>
>
