-metaOnly returns a summary, which I presume is what the other flags are
meant to return.
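
For reference, the invocation that produced the summary below was along
these lines (paths may differ per install):

  bin/hbase hbck -metaOnly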

Summary:
  -ROOT- is okay.
    Number of regions: 1
    Deployed on:  datanode006.si.lan,60020,1341357895747
  .META. is okay.
    Number of regions: 1
    Deployed on:  datanode013.si.lan,60020,1341357896016
0 inconsistencies detected.
Status: OK

In terms of logs, the namenode shows no record of hbck activity, and
neither do the nodes.

The only indication we have of an error is that there is no summary.

We do, however, see this further up in the hbck output.

12/07/05 16:24:45 DEBUG util.HBaseFsck: HRegionInfo read: {NAME => 'thefinalfrontier,com|1|1245884400|agencedupontdugard.com,1339788822565.838b9c14a918a97e584dc35537c50b22.', STARTKEY => 'com|1|1245884400|agencedupontdugard.com', ENDKEY => 'com|1|1263427200|jbheart.com', ENCODED => 838b9c14a918a97e584dc35537c50b22,}
Exception in thread "main" java.util.concurrent.RejectedExecutionException
 at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
 at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
 at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
 at org.apache.hadoop.hbase.util.HBaseFsck.loadHdfsRegionInfos(HBaseFsck.java:633)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:354)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:382)
 at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3120)
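
In case it helps pin this down: ThreadPoolExecutor's default AbortPolicy
throws RejectedExecutionException when a task is submitted to a pool that
has already been shut down (or whose bounded queue is full). A minimal
standalone example of the mechanism (not HBase code, just an illustration):

  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  public class RejectedDemo {
      public static void main(String[] args) {
          // Single-threaded pool using the default AbortPolicy handler.
          ExecutorService pool = Executors.newFixedThreadPool(1);
          pool.shutdown();
          // Submitting work after shutdown() is rejected by AbortPolicy,
          // which throws java.util.concurrent.RejectedExecutionException.
          pool.execute(new Runnable() {
              public void run() { System.out.println("never runs"); }
          });
      }
  }

So it looks like hbck's executor was already shut down or saturated when
loadHdfsRegionInfos tried to submit more region work.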





We were also getting the xceiver errors below at around the same times we
were running hbck, so we increased the xceiver limit to 4096.
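
For reference, this is the property we bumped in hdfs-site.xml (the
property name genuinely is spelled 'xcievers' in Hadoop; the datanodes
need a restart to pick it up):

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>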



2012-07-05 15:07:34,392 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(188.94.23.23:50010, storageID=DS-1551164377-188.94.23.23-50010-1335800300163, infoPort=50075, ipcPort=50020):Got exception while serving blk_-3876969825338062337_132429 to /188.94.23.26:
java.io.IOException: Block blk_-3876969825338062337_132429 is not valid.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:1072)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:1035)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getVisibleLength(FSDataset.java:1045)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:94)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
        at java.lang.Thread.run(Thread.java:662)

2012-07-05 15:07:34,392 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(188.94.23.23:50010, storageID=DS-1551164377-188.94.23.23-50010-1335800300163, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Block blk_-3876969825338062337_132429 is not valid.
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:1072)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:1035)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.getVisibleLength(FSDataset.java:1045)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:94)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
        at java.lang.Thread.run(Thread.java:662)





Those errors now seem to have gone, replaced by the following at random
intervals.

2012-07-05 17:04:02,501 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(188.94.23.23:50010, storageID=DS-1551164377-188.94.23.23-50010-1335800300163, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/188.94.23.23:50010 remote=/188.94.23.20:56619]
        at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
        at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
        at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:350)
        at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:436)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:197)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
        at java.lang.Thread.run(Thread.java:662)
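
If it's relevant, the 480000 ms in that message appears to match the
default dfs.datanode.socket.write.timeout (8 minutes). For clusters with
slow readers it can be raised in hdfs-site.xml, or set to 0 to disable the
datanode write timeout entirely, e.g.:

  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>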




On 05/07/2012 08:39, "Jonathan Hsieh" <[email protected]> wrote:

>Jay,
>
>Have you tried the -metaOnly hbck option (possibly in conjunction with
>-fixAssignments/-fix)?  It could be that meta is out of whack which
>prevents everything else from making progress.
>
>If that doesn't work please share more logs -- it will help us figure out
>where it got stuck.
>
>Thanks,
>Jon.
>
>On Wed, Jul 4, 2012 at 8:38 AM, Jay Whittaker <[email protected]> wrote:
>
>> Hey,
>>
>> I have been getting the following in thrift logs.
>>
>> 2012-07-04 15:41:05,903 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Encountered problems when prefetch META table:
>> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for table: finalfrontier, row=finalfrontier,,99999999999999
>>
>> Which made us think it's a META table error. So we ran 'bin/hbase hbck'
>> and 'bin/hbase hbck -details', and both seem to hang after an 'HRegionInfo
>> read' line before dropping back to the CLI with no error or debug info.
>>
>> We presume an HRegionInfo read is hanging but cannot find it logged
>> anywhere. Is there a way to see where it is hanging?
>>
>> It may also be worth pointing out that we have tried the -fix, -fixMeta
>> and -repair flags with no change.
>>
>> Thanks,
>>
>> Jay
>>
>
>
>
>-- 
>// Jonathan Hsieh (shay)
>// Software Engineer, Cloudera
>// [email protected]
