dfs.datanode.max.xcievers is set to 4096 and the soft limit of nofile is set to 32768 (it is the default in the package).
However, when I log in as hdfs it's set to 1024, and I can't find where else it might be set...

Cyril SCETBON
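A minimal sketch for checking which limit actually applies, assuming a Linux host with the CDH4 packages in their default locations (the /etc/hadoop/conf path and the pgrep pattern below are assumptions, adjust to your layout):

shell> su - hdfs -s /bin/bash -c 'ulimit -Sn; ulimit -Hn'   # limit a login shell for the hdfs user gets (PAM / limits.conf); -s in case the account has no login shell
shell> pid=$(pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode)
shell> grep 'open files' /proc/$pid/limits                  # limit the running datanode JVM actually got
shell> grep -r nofile /etc/security/limits.conf /etc/security/limits.d/ 2>/dev/null   # where a 1024 value could be coming from
shell> grep -A1 xcievers /etc/hadoop/conf/hdfs-site.xml     # dfs.datanode.max.xcievers as the datanode reads it

The value in /proc/<pid>/limits is the one the xceiver threads actually run under; if it shows 1024 there, the session that started the datanode did not pick up the 32768 from the package.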
On Jul 6, 2012, at 12:19 PM, N Keywal wrote:

> Hi Cyril,
>
> BTW, have you checked dfs.datanode.max.xcievers and ulimit -n? When
> underconfigured they can cause this type of error, even if it seems
> it's not the case here...
>
> Cheers,
>
> N.
>
> On Fri, Jul 6, 2012 at 11:31 AM, Cyril Scetbon <[email protected]> wrote:
>> The file is now missing, but I have tried with another one and you can
>> see the error:
>>
>> shell> hdfs dfs -ls "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
>> Found 1 items
>> -rw-r--r--   4 hbase supergroup   0 2012-07-04 17:06 /hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446
>> shell> hdfs dfs -cat "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
>> 12/07/06 09:27:51 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 3 times
>> 12/07/06 09:27:55 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 2 times
>> 12/07/06 09:27:59 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 1 times
>> cat: Could not obtain the last block locations.
>>
>> I'm using Hadoop 2.0 from the Cloudera package (CDH4) with HBase 0.92.1.
>>
>> Regards
>> Cyril SCETBON
>>
>> On Jul 5, 2012, at 11:44 PM, Jean-Daniel Cryans wrote:
>>
>>> Interesting... Can you read the file? Try a "hadoop dfs -cat" on it
>>> and see if it goes to the end of it.
>>>
>>> It could also be useful to see a bigger portion of the master log; for
>>> all I know maybe it handles it somehow and there's a problem
>>> elsewhere.
>>>
>>> Finally, which Hadoop version are you using?
>>>
>>> Thx,
>>>
>>> J-D
>>>
>>> On Thu, Jul 5, 2012 at 1:58 PM, Cyril Scetbon <[email protected]> wrote:
>>>> yes:
>>>>
>>>> /hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.134143064971
>>>>
>>>> I did a fsck and here is the report:
>>>>
>>>> Status: HEALTHY
>>>> Total size: 618827621255 B (Total open files size: 868 B)
>>>> Total dirs: 4801
>>>> Total files: 2825 (Files currently being written: 42)
>>>> Total blocks (validated): 11479 (avg. block size 53909541 B) (Total open file blocks (not validated): 41)
>>>> Minimally replicated blocks: 11479 (100.0 %)
>>>> Over-replicated blocks: 1 (0.008711561 %)
>>>> Under-replicated blocks: 0 (0.0 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 4
>>>> Average block replication: 4.0000873
>>>> Corrupt blocks: 0
>>>> Missing replicas: 0 (0.0 %)
>>>> Number of data-nodes: 12
>>>> Number of racks: 1
>>>> FSCK ended at Thu Jul 05 20:56:35 UTC 2012 in 795 milliseconds
>>>>
>>>> The filesystem under path '/hbase' is HEALTHY
>>>>
>>>> Cyril SCETBON
>>>>
>>>> On Jul 5, 2012, at 7:59 PM, Jean-Daniel Cryans wrote:
>>>>
>>>>> Does this file really exist in HDFS?
>>>>>
>>>>> hdfs://hb-zk1:54310/hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.1341430649711
>>>>>
>>>>> If so, did you run fsck in HDFS?
>>>>>
>>>>> It would be weird if HDFS doesn't report anything bad but somehow the
>>>>> clients (like HBase) can't read it.
>>>>>
>>>>> J-D
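The "Could not obtain the last block locations" failure and the 42 "Files currently being written" in the fsck report above typically point to the same thing: WAL files that are still open for write, whose last block is still under construction. A sketch for listing exactly which files those are, using standard fsck options (fsck normally tags such files OPENFORWRITE; the paths are taken from the thread):

shell> hdfs fsck /hbase/.logs -openforwrite -files | grep -i openforwrite
shell> hdfs fsck "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446" -files -blocks -locations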
>>>>> On Thu, Jul 5, 2012 at 12:45 AM, Cyril Scetbon <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I can no longer start my cluster correctly and get messages like
>>>>>> http://pastebin.com/T56wrJxE (taken on one region server).
>>>>>>
>>>>>> I suppose HBase is not designed to be stopped entirely, only to have
>>>>>> some nodes going down??? HDFS is not complaining; it's only HBase that
>>>>>> can't start correctly :(
>>>>>>
>>>>>> I suppose some data has not been flushed, and it's not really important
>>>>>> for me. Is there a way to fix these errors even if it means losing data?
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> Cyril SCETBON
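Since the question above explicitly accepts losing data, here is a sketch of one last-resort workaround sometimes used for stuck -splitting WALs; it is not confirmed anywhere in this thread, and it permanently discards whatever unflushed edits those logs contain. The target directory name is arbitrary; the -splitting path is the one from the thread:

shell> hdfs dfs -mkdir /tmp/sidelined-wals
shell> hdfs dfs -mv "/hbase/.logs/hb-d12,60020,1341429679981-splitting" /tmp/sidelined-wals/
# after restarting HBase, check table consistency:
shell> hbase hbck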
