You haven't answered all the questions yet :) Are you running this on EC2? What instance types?
On Thu, Nov 18, 2010 at 12:12 AM, Lucas Nazário dos Santos
<nazario.lu...@gmail.com> wrote:
> It seems that newer Linux versions don't have the file
> /proc/sys/fs/epoll/max_user_instances, but instead
> /proc/sys/fs/epoll/max_user_watches. I'm not quite sure what to do.
>
> Can I favor max_user_watches over max_user_instances? With what value?
>
> I also tried to play with the Xss argument and decreased it to 128K with no
> luck (xcievers at 4096).
>
> Lucas
>
>
> On Wed, Nov 17, 2010 at 8:02 PM, Lars George <lars.geo...@gmail.com> wrote:
>
>> That is what I was also thinking about, thanks for jumping in Todd.
>>
>> I was simply not sure if that applies just to .27 or to all kernels after
>> it, and whether the defaults have never been increased.
>>
>> On Wed, Nov 17, 2010 at 8:24 PM, Todd Lipcon <t...@cloudera.com> wrote:
>>
>> > On that new of a kernel you'll also need to increase your epoll limit.
>> > Some tips about that here:
>> >
>> > http://www.cloudera.com/blog/2009/03/configuration-parameters-what-can-you-just-ignore/
>> >
>> > Thanks
>> > -Todd
>> >
>> > On Wed, Nov 17, 2010 at 9:10 AM, Lars George <lars.geo...@gmail.com> wrote:
>> >
>> >> Are you running on EC2? Couldn't you simply up the heap size for the
>> >> Java processes?
>> >>
>> >> I do not think there is a hard and fast rule for how many xcievers you
>> >> need; trial and error is common. Or, if you have enough heap, simply set
>> >> it high, like 4096, and that usually works fine. It all depends on how
>> >> many regions and column families you have on each server.
>> >>
>> >> Lars
>> >>
>> >> On Wed, Nov 17, 2010 at 5:31 PM, Lucas Nazário dos Santos
>> >> <nazario.lu...@gmail.com> wrote:
>> >>
>> >> > I'm using Linux, the Amazon beta version that they recently released.
>> >> > I'm not very familiar with Linux, so I think the kernel version is
>> >> > 2.6.34.7-56.40.amzn1.x86_64. Hadoop version is 0.20.2 and HBase version
>> >> > is 0.20.6. Hadoop and HBase have 2 GB each and they are not swapping.
>> >> >
>> >> > Besides all the other questions I posed, I have one more. How can I
>> >> > calculate the maximum number of xcievers? Is there a formula?
>> >> >
>> >> > Lucas
>> >> >
>> >> > On Wed, Nov 17, 2010 at 2:12 PM, Lars George <lars.geo...@gmail.com> wrote:
>> >> >
>> >> >> Hi Lucas,
>> >> >>
>> >> >> What OS are you on? What kernel version? What are your Hadoop and
>> >> >> HBase versions? How much heap do you assign to each Java process?
>> >> >>
>> >> >> Lars
>> >> >>
>> >> >> On Wed, Nov 17, 2010 at 3:05 PM, Lucas Nazário dos Santos
>> >> >> <nazario.lu...@gmail.com> wrote:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > This problem is widely known, but I'm not able to come up with a
>> >> >> > decent solution for it.
>> >> >> >
>> >> >> > I'm scanning 1,000,000+ rows from one table in order to index their
>> >> >> > content. Each row has around 100 KB. The problem is that I keep
>> >> >> > getting the exception:
>> >> >> >
>> >> >> > Exception in thread "org.apache.hadoop.dfs.datanode$dataxceiveser...@82d37"
>> >> >> > java.lang.OutOfMemoryError: unable to create new native thread
>> >> >> >
>> >> >> > This is a Hadoop exception and it causes the DataNode to go down, so
>> >> >> > I decreased dfs.datanode.max.xcievers from 4048 to 512. Well, that
>> >> >> > led me to another problem:
>> >> >> >
>> >> >> > java.io.IOException: xceiverCount 513 exceeds the limit of concurrent
>> >> >> > xcievers 512
>> >> >> >
>> >> >> > This time the DataNode doesn't die, nor does HBase, but my scan, and
>> >> >> > the whole indexing process, suffers a lot.
>> >> >> >
>> >> >> > After reading different posts about this issue, I have the impression
>> >> >> > that HBase can't handle these limits transparently for the user. The
>> >> >> > scanner is a sequential process, so I thought it would free Hadoop
>> >> >> > resources already used in order to make room for new requests for
>> >> >> > data under HDFS. What am I missing? Should I slow down the scanning
>> >> >> > process? Should I scan portions of the table sequentially instead of
>> >> >> > doing a full scan of all 1,000,000+ rows? Is there a timeout so
>> >> >> > unused Hadoop resources can be released?
>> >> >> >
>> >> >> > Thanks in advance,
>> >> >> > Lucas
>> >
>> >
>> > --
>> > Todd Lipcon
>> > Software Engineer, Cloudera
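
For reference, a minimal sketch of where the knobs discussed in this thread
live on a Hadoop/HBase 0.20 setup. The concrete values are illustrative
assumptions only, apart from the 4096 xcievers and 128K stack size mentioned
above; tune them for your own cluster.

    # Raise the epoll watch limit -- on newer kernels only
    # fs.epoll.max_user_watches exists (max_user_instances was removed).
    # 32768 is an arbitrary example value.
    sysctl -w fs.epoll.max_user_watches=32768
    echo 'fs.epoll.max_user_watches = 32768' >> /etc/sysctl.conf

    # Shrink the per-thread stack so more native threads fit into the
    # process address space (128K is the value Lucas experimented with).
    # Typically added to conf/hadoop-env.sh and conf/hbase-env.sh.
    export HADOOP_OPTS="$HADOOP_OPTS -Xss128k"
    export HBASE_OPTS="$HBASE_OPTS -Xss128k"

    # Raise the DataNode transceiver limit in hdfs-site.xml (the property
    # name really is spelled "xcievers" in Hadoop 0.20) and restart the
    # DataNodes afterwards:
    #
    #   <property>
    #     <name>dfs.datanode.max.xcievers</name>
    #     <value>4096</value>
    #   </property>

Whether 4096 is enough still comes down to trial and error, as Lars notes,
since it depends on how many regions and column families each server holds.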