I'm using Linux, the Amazon beta version they recently released. I'm not
very familiar with Linux, but I think the kernel version
is 2.6.34.7-56.40.amzn1.x86_64. The Hadoop version is 0.20.2 and the HBase
version is 0.20.6. Hadoop and HBase have 2 GB of heap each and they are not
swapping.

Besides all the other questions I posed, I have one more. How can I calculate
the maximum number of xcievers? Is there a formula?
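
In case it helps, this is the back-of-envelope arithmetic I've been
attempting. I'm assuming that each block being read or written concurrently
ties up one DataXceiver thread on the DataNode, and every number below is a
made-up placeholder for my cluster, so please correct me if the assumption or
the shape of the estimate is wrong:

public class XceiverEstimate {
    public static void main(String[] args) {
        // All numbers are guesses, not measured values.
        int regionsPerRegionServer = 100; // hypothetical region count
        int storeFilesPerRegion = 4;      // hypothetical store files that may be open for reads
        int concurrentWriters = 20;       // hypothetical flush/compaction/WAL write pipelines

        // Assumption: one DataXceiver thread per concurrently open block
        // read or write on the DataNode co-located with the region server.
        int estimate = regionsPerRegionServer * storeFilesPerRegion + concurrentWriters;

        System.out.println("dfs.datanode.max.xcievers >= " + estimate + " ?");
    }
}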

Lucas



On Wed, Nov 17, 2010 at 2:12 PM, Lars George <lars.geo...@gmail.com> wrote:

> Hi Lucas,
>
> What OS are you on? What kernel version? What is your Hadoop and HBase
> version? How much heap do you assign to each Java process?
>
> Lars
>
> On Wed, Nov 17, 2010 at 3:05 PM, Lucas Nazário dos Santos
> <nazario.lu...@gmail.com> wrote:
> > Hi,
> >
> > This problem is widely known, but I'm not able to come up with a decent
> > solution for it.
> >
> > I'm scanning 1,000,000+ rows from one table in order to index their
> > content. Each row has around 100 KB. The problem is that I keep getting
> > the exception:
> >
> > Exception in thread
> > "org.apache.hadoop.dfs.datanode$dataxceiveser...@82d37"
> > java.lang.OutOfMemoryError: unable to create new native thread
> >
> > This is a Hadoop exception and it causes the DataNode to go down, so I
> > decreased dfs.datanode.max.xcievers from 4048 to 512. Well, that led me
> > to another problem:
> >
> > java.io.IOException: xceiverCount 513 exceeds the limit of concurrent
> > xcievers 512
> >
> > This time neither the DataNode nor HBase dies, but my scan, and the whole
> > indexing process, suffers a lot.
> >
> > After reading different posts about this issue, I have the impression that
> > HBase can't handle these limits transparently for the user. The scanner is
> > a sequential process, so I thought it would free Hadoop resources already
> > used in order to make room for new requests for data under HDFS. What am I
> > missing? Should I slow down the scanning process? Should I scan portions of
> > the table sequentially instead of doing a full scan of all 1,000,000+ rows?
> > Is there a timeout so unused Hadoop resources can be released?
> >
> > Thanks in advance,
> > Lucas
> >
>
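
By the way, to make the question quoted above concrete: this is roughly what
I mean by scanning the table in portions instead of holding one scanner open
across all 1,000,000+ rows. It's only a sketch and untested on 0.20.6; the
table name, column family, and chunk boundaries are placeholders:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ChunkedScan {

    public static void main(String[] args) throws Exception {
        // "docs" and "content" are placeholders for my real table and family.
        HTable table = new HTable(new HBaseConfiguration(), "docs");

        // Hypothetical chunk boundaries; they could instead be taken from the
        // table's region start keys so each chunk maps to one region.
        String[] boundaries = {"", "row-0250000", "row-0500000", "row-0750000", null};

        for (int i = 0; i < boundaries.length - 1; i++) {
            byte[] start = Bytes.toBytes(boundaries[i]);
            Scan scan = (boundaries[i + 1] == null)
                ? new Scan(start)
                : new Scan(start, Bytes.toBytes(boundaries[i + 1]));
            scan.addFamily(Bytes.toBytes("content"));
            scan.setCaching(10); // ~100 KB rows, so roughly 1 MB per RPC

            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result row : scanner) {
                    index(row); // hand the row off to the indexer (not shown)
                }
            } finally {
                scanner.close(); // release the scanner before opening the next chunk
            }
        }
    }

    private static void index(Result row) {
        // indexing logic would go here
    }
}

The idea is that each chunk gets its own scanner that is closed before the
next one opens, and the small caching value keeps each RPC small given the
~100 KB rows. Does that sound like the right direction?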
