Hi Lucas,

What OS are you on? What kernel version? What is your Hadoop and HBase
version? How much heap do you assign to each Java process?

Lars

On Wed, Nov 17, 2010 at 3:05 PM, Lucas Nazário dos Santos
<nazario.lu...@gmail.com> wrote:
> Hi,
>
> This problem is widely known, but I'm not able to come up with a decent
> solution for it.
>
> I'm scanning 1,000,000+ rows from one table in order to index their content.
> Each row is around 100 KB. The problem is that I keep getting this
> exception:
>
> Exception in thread "org.apache.hadoop.dfs.datanode$dataxceiveser...@82d37"
> java.lang.OutOfMemoryError: unable to create new native thread
>
> This is a Hadoop exception and it causes the DataNode to go down, so I
> decreased dfs.datanode.max.xcievers from 4048 to 512. Well, that led me
> to another problem:
>
> java.io.IOException: xceiverCount 513 exceeds the limit of concurrent
> xcievers 512
>
> This time neither the DataNode nor HBase dies, but my scan, and with it the
> whole indexing process, suffers a lot.
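>
> For reference, the entry on my DataNodes now looks roughly like this in
> hdfs-site.xml (the property name really is spelled "xcievers", and as far
> as I know the DataNodes need a restart for the change to take effect):
>
>   <property>
>     <name>dfs.datanode.max.xcievers</name>
>     <value>512</value>
>   </property>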
>
> After reading different posts about this issue, I have the impression that
> HBase can't handle these limits transparently for the user. The scanner is a
> sequential process, so I thought it would free Hadoop resources it had
> already used in order to make room for new requests for data from HDFS. What
> am I missing? Should I slow down the scanning process? Should I scan portions
> of the table sequentially instead of doing a full scan over all 1,000,000+
> rows? Is there a timeout so that unused Hadoop resources can be released?
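>
> To make the question about scanning in portions concrete, this is the kind
> of chunked scan I have in mind (just a rough sketch, assuming a 0.90-style
> client; older clients would use new HBaseConfiguration() instead of
> HBaseConfiguration.create(). The table name "documents", the split
> boundaries and indexDocument() are placeholders, and table.getStartKeys()
> could probably supply real region boundaries):
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.HTable;
>   import org.apache.hadoop.hbase.client.Result;
>   import org.apache.hadoop.hbase.client.ResultScanner;
>   import org.apache.hadoop.hbase.client.Scan;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   public class ChunkedIndexer {
>     public static void main(String[] args) throws Exception {
>       Configuration conf = HBaseConfiguration.create();
>       HTable table = new HTable(conf, "documents");      // placeholder table name
>
>       // hypothetical row-key boundaries splitting the table into ranges;
>       // table.getStartKeys() could provide real region boundaries instead
>       byte[][] bounds = { Bytes.toBytes("a"), Bytes.toBytes("m"), Bytes.toBytes("z") };
>
>       for (int i = 0; i < bounds.length - 1; i++) {
>         Scan scan = new Scan(bounds[i], bounds[i + 1]);  // [start, stop) for this chunk
>         scan.setCaching(50);                             // modest batch per RPC, rows are ~100 KB
>         ResultScanner scanner = table.getScanner(scan);
>         try {
>           for (Result row : scanner) {
>             indexDocument(row);                          // placeholder for the indexing step
>           }
>         } finally {
>           scanner.close();                               // release the server-side scanner
>         }
>       }
>     }
>
>     private static void indexDocument(Result row) {
>       // indexing logic would go here
>     }
>   }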
>
> Thanks in advance,
> Lucas
>
