This is not related to replication; it's about a new feature added by https://issues.apache.org/jira/browse/HBASE-2306.
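For a pseudo-distributed setup, the fix usually comes down to a single property in hbase-site.xml. Here is a minimal sketch, assuming that the setting the documentation describes is `dfs.replication` and that only one datanode is running; please check both assumptions against the overview page linked in this message before relying on it:

```xml
<!-- hbase-site.xml (sketch): assumes a single-datanode, pseudo-distributed HDFS -->
<configuration>
  <property>
    <!-- Assumption: dfs.replication is the property HBase consults for the expected replica count. -->
    <name>dfs.replication</name>
    <!-- Should not be greater than the number of datanodes. -->
    <value>1</value>
  </property>
</configuration>
```

The region servers most likely need a restart to pick this up, after which the "Found 1 replicas but expecting 3 replicas" warnings should stop if the assumption holds.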
https://issues.apache.org/jira/browse/HBASE-2382 added the required documentation. Basically, if you are running in pseudo-distributed mode, you need to tell HBase not to expect the default number of replicas:

http://hbase.apache.org/docs/r0.89.20100621/apidocs/overview-summary.html#pseudo-distrib

See the discussion in HBASE-2382. Do you think it could be more user friendly?

J-D

On Wed, Jul 14, 2010 at 3:33 AM, Sebastian Bauer <[email protected]> wrote:
> So replication is working, but after the Hadoop update I see many of these on
> the slave:
>
> 2010-07-14 12:30:51,941 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
> 2010-07-14 12:30:51,955 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> 2010-07-14 12:30:51,955 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/db2b.goldenline.pl,60020,1279102137010/85.232.237.235%3A60020.1279103451914, entries=1, filesize=555. New hlog /hbase/.logs/db2b.goldenline.pl,60020,1279102137010/85.232.237.235%3A60020.1279103451944
> 2010-07-14 12:30:51,957 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
> 2010-07-14 12:30:51,966 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> 2010-07-14 12:30:51,967 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/db2b.goldenline.pl,60020,1279102137010/85.232.237.235%3A60020.1279103451944, entries=1, filesize=1195. New hlog /hbase/.logs/db2b.goldenline.pl,60020,1279102137010/85.232.237.235%3A60020.1279103451959
>
> and something like this on the master:
>
> 2010-07-14 12:25:10,939 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
> 2010-07-14 12:25:10,940 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
> 2010-07-14 12:25:10,940 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
> 2010-07-14 12:25:11,399 INFO org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs -- HDFS-200
> 2010-07-14 12:25:11,400 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/db2a.goldenline.pl,60020,1279102568601/85.232.237.234%3A60020.1279103110860, entries=81, filesize=22075. New hlog /hbase/.logs/db2a.goldenline.pl,60020,1279102568601/85.232.237.234%3A60020.1279103111379
> 2010-07-14 12:25:11,451 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <db2a:/hbase,db2a.goldenline.pl,60020,1279102568601>Created /hbase/replication/rs/db2a.goldenline.pl,60020,1279102568601/test/85.232.237.234%3A60020.1279103111379 with data
> 2010-07-14 12:25:11,454 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 3 replicas. Requesting close of hlog.
>
> On 14.07.2010 00:08, Jean-Daniel Cryans wrote:
>>
>> Just looked at the head of 0.20-append and I see it contains the
>> missing patch (was committed as part of HDFS-1057).
>>
>> So that would mean that the file is just empty :) If you insert a few
>> rows in the shell on the master cluster, do you see them some seconds
>> later on the slave?
>>
>> J-D
>>
>> On Tue, Jul 13, 2010 at 3:01 PM, Sebastian Bauer <[email protected]> wrote:
>>>
>>> On 13.07.2010 23:50, Jean-Daniel Cryans wrote:
>>>>
>>>> Yeah using an experimental feature can be "odd" to use :D
>>>
>>> I love bleeding edge technologies :D
>>>>
>>>> So one of the following is happening:
>>>>
>>>> 1) You aren't using a version of hadoop patched enough to get
>>>> replication working fully. Trunk uses a special jar that I patched
>>>> myself. CDH3b2 also has everything needed. What this means is that
>>>> it's trying to open the log file but the first block isn't available
>>>> (it's actually a very small patch for the Namenode).
>>>
>>> I'm using hadoop from the 0.20-append branch released with hbase-0.89.xxxx
>>>
>>>> 2) The file is empty, because nothing was written to the log file.
>>>> What this means is that it's trying to open the log file but there's
>>>> not even a single block in it, so it fails on EOF.
>>>
>>> The problem may be caused by this, because when everything is running I
>>> see fewer of these traces.
>>>
>>>> J-D
>>>>
>>> Thanks for your help :)
>>>
>>>> On Tue, Jul 13, 2010 at 2:37 PM, Sebastian Bauer <[email protected]> wrote:
>>>>>
>>>>> After trying to set up replication I got many of these errors:
>>>>>
>>>>> 2010-07-13 23:35:26,498 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Waited too long for this file, considering dumping
>>>>> 2010-07-13 23:35:26,498 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unable to open a reader, sleeping 100 times 10
>>>>> 2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 3 file(s) in c of CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd.; new storefile is
>>>>> hdfs://db2a:50001/hbase/CampaignToUsers/6504d518fb224efe1530e79c198994cd/c/226233377281334567; store size is 19.6m
>>>>> 2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region CampaignToUsers,43-m_2010_5_750D70A83162FF54389D2CA67ADA0B86,1278610126054.6504d518fb224efe1530e79c198994cd. in 1sec
>>>>> 2010-07-13 23:35:27,111 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d.
>>>>> 2010-07-13 23:35:27,112 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compaction size of c: 31.4m; Skipped 0 file(s), size: 0
>>>>> 2010-07-13 23:35:27,112 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 3 file(s) in c of UsersToCampaign,,1278609821058.ecb7605434967e247ce14d525849495d. into hdfs://db2a:50001/hbase/UsersToCampaign/ecb7605434967e247ce14d525849495d/.tmp, seqid=65302505
>>>>> 2010-07-13 23:35:27,498 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication 85.232.237.234%3A60020.1279056880911 at 0
>>>>> 2010-07-13 23:35:27,499 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test Got:
>>>>> java.io.EOFException
>>>>>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>>>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>>>>     at
>>>>>     org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
>>>>>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
>>>>>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
>>>>>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
>>>>>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)
>>>>> 2010-07-13 23:35:27,499 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Waited too long for this file, considering dumping
>>>>> 2010-07-13 23:35:27,499 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unable to open a reader, sleeping 100 times 10
>>>>> 2010-07-13 23:35:28,499 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication 85.232.237.234%3A60020.1279056880911 at 0
>>>>> 2010-07-13 23:35:28,500 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: test Got:
>>>>> java.io.EOFException
>>>>>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>>>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1457)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1435)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>>>>>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>>>>>     at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:51)
>>>>>     at
>>>>>     org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:103)
>>>>>     at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:511)
>>>>>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:422)
>>>>>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262)
>>>>>
>>>>> On 13.07.2010 20:18, Jean-Daniel Cryans wrote:
>>>>>>
>>>>>> No, but you can use the new mapreduce utility
>>>>>> org.apache.hadoop.hbase.mapreduce.CopyTable to copy whole tables
>>>>>> between clusters. It's like distcp for HBase.
>>>>>>
>>>>>> Oh, and looking at the documentation I just realized that I changed the
>>>>>> name of the configuration that enables replication just before
>>>>>> committing and forgot to update the package.html file. It's now simply
>>>>>> hbase.replication (and it should stay like that). I'll fix that in the
>>>>>> scope of HBASE-2808.
>>>>>>
>>>>>> J-D
>>>>>>
>>>>>> On Tue, Jul 13, 2010 at 11:12 AM, Sebastian Bauer <[email protected]> wrote:
>>>>>>>
>>>>>>> I have one more question: can I first create the master and, after
>>>>>>> loading data, connect the slave or turn on replication on existing
>>>>>>> tables with data?
>>>>>>>
>>>>>>> On 13.07.2010 19:56, Jean-Daniel Cryans wrote:
>>>>>>>>>
>>>>>>>>> Thanks for info where I can find some documentation. There is info
>>>>>>>>> about ZooKeeper that it needs to run in standalone mode; is that true?
>>>>>>>>>
>>>>>>>> Well you can run add_peer.rb when the clusters are running, but they
>>>>>>>> won't pick up the change live (that part isn't done yet). So if you run
>>>>>>>> the script while the cluster is running, restart it.
>>>>>>>> Also take a look
>>>>>>>> at the region server log, it should output something like this when
>>>>>>>> starting up:
>>>>>>>>
>>>>>>>> LOG.info("This cluster (" + thisCluster + ") is a "
>>>>>>>>     + (this.replicationMaster ? "master" : "slave") + " for replication" +
>>>>>>>>     ", compared with (" + address + ")");
>>>>>>>>
>>>>>>>> This will tell you if you used the right address for zookeeper. If
>>>>>>>> your region server on the master cluster thinks it's a slave, then the
>>>>>>>> addresses are wrong. Also currently there's no reporting for
>>>>>>>> replication, since it's not done yet!
>>>>>>>>
>>>>>>>> For more in-depth documentation, check out
>>>>>>>> https://issues.apache.org/jira/browse/HBASE-2808
>>>>>>>>
>>>>>>>> Thanks for trying this out, as the author of most of that part of the
>>>>>>>> code I'm thrilled!
>>>>>>>>
>>>>>>>> J-D
