Thank you very much Andy. Yes, it is really a difficult issue.

Schubert

On Fri, Mar 27, 2009 at 1:13 AM, Andrew Purtell <[email protected]> wrote:
> Hi Schubert,
>
> I set dfs.datanode.max.xcievers=4096 in my config. This was the
> only way I was able to bring > 7000 regions online on 25 nodes
> during cluster restart without DFS errors. Definitely the
> default is too low for HBase. HFile in 0.20 will have material
> impact here, which should help the situation. Also perhaps more
> can/will be done with regards to HBASE-24 to relieve the load on
> the DataNodes:
>
> https://issues.apache.org/jira/browse/HBASE-24?focusedCommentId=12613104&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12613104
>
> The root cause of this is HADOOP-3856:
> https://issues.apache.org/jira/browse/HADOOP-3856
>
> I looked at helping out on this issue. There is so much
> reimplementation of such a fundamental component (to Hadoop)
> involved that it's difficult for a part-time volunteer to make
> progress on it. Even if the code can be changed, there is
> follow-up shepherding through Core review and release processes
> to consider... I hold out hope that a commercial user of Hadoop
> will have pain in this area and commit sponsored resources to
> address the issue of I/O scalability in DFS. I think when DFS
> was written the expectation was that 10,000 nodes would have
> only a few open files each -- very large mapreduce inputs,
> intermediates, and outputs -- not that 100s of nodes might
> have 1,000s of files open each. In any case, the issue is well
> known.
>
> I have found "dfs.datanode.socket.write.timeout=0" is not
> necessary for HBase 0.19.1 on Hadoop 0.19.1 in my testing.
>
> Best regards,
>
>   -Andy
>
>
> > From: schubert zhang <[email protected]>
> > Subject: Re: Data lost during intensive writes
> > To: [email protected], [email protected]
> > Date: Thursday, March 26, 2009, 4:58 AM
> >
> > I will set "dfs.datanode.max.xcievers=1024" (default is 256)
> >
> > I am using branch-0.19.
> > Do you think "dfs.datanode.socket.write.timeout=0" is
> > necessary in release-0.19?
> >
> > Schubert
>
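For reference, a minimal sketch of the settings discussed in this thread, as they might appear in hadoop-site.xml on the DataNodes (Hadoop 0.19-era config file name assumed; the 4096 value follows Andy's report and is a starting point, not a tuned recommendation):

  <!-- hadoop-site.xml on each DataNode; restart DataNodes after changing -->

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <!-- Default is 256; Andy reports 4096 was needed to bring
         > 7000 regions online on 25 nodes without DFS errors. -->
    <value>4096</value>
  </property>

  <!-- Per Andy's testing this one is NOT needed on HBase 0.19.1 /
       Hadoop 0.19.1; shown only because it is discussed above.
       A value of 0 disables the DataNode socket write timeout. -->
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>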
