Re: Too many open files

2018-01-22 Thread Jeff Jirsa
Typically, long-lived connections are better, so use a global session.
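
For reference, a minimal sketch of the global-session pattern with the DataStax Python driver (the OP's client is Python); the contact point, keyspace, and query are illustrative:

    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])          # one Cluster per application
    session = cluster.connect("my_keyspace")  # one long-lived Session, reused everywhere

    def handle_request(user_id):
        # Reuse the shared session; creating a Cluster/Session per request
        # leaks TCP connections until the OS runs out of file descriptors.
        return session.execute("SELECT * FROM users WHERE id = %s", (user_id,))

    def on_app_exit():
        cluster.shutdown()  # close all pooled connections exactly once, at exit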



-- 
Jeff Jirsa


> On Jan 22, 2018, at 3:28 AM, Andreou, Arys (Nokia - GR/Athens) 
> <arys.andr...@nokia.com> wrote:
> 
> It turns out it was a mistake in the client’s implementation.
> The session was created for each request but it was never shut down, so all 
> the connections were left open.
> I only needed to execute a cluster.shutdown() once the request was over.
>  
> I do have a follow up question though.
> Is it better to have a global session object or to create it and shut it down 
> for every request?
>  
>  
> From: n...@photonhost.com [mailto:n...@photonhost.com] On Behalf Of Nikolay 
> Mihaylov
> Sent: Monday, January 22, 2018 11:47 AM
> To: user@cassandra.apache.org
> Subject: Re: Too many open files
>  
> You can increase the system open-files limit;
> also, if you compact, the number of open files will go down.
>  
> On Mon, Jan 22, 2018 at 10:19 AM, Dor Laor <d...@scylladb.com> wrote:
> It's a high number; your compaction may be running behind, and thus
> many small sstables exist. However, you're also taking the
> number of network connections into the calculation (everything
> in *nix is a file). If it makes you feel better, my laptop
> has 40k open files for Chrome.
>  
> On Sun, Jan 21, 2018 at 11:59 PM, Andreou, Arys (Nokia - GR/Athens) 
> <arys.andr...@nokia.com> wrote:
> Hi,
>  
> I keep getting a “Last error: Too many open files” followed by a list of node 
> IPs.
> The output of “lsof -n|grep java|wc -l” is about 674970 on each node.
>  
> What is a normal number of open files?
>  
> Thank you.
>  
>  
>  


RE: Too many open files

2018-01-22 Thread Andreou, Arys (Nokia - GR/Athens)
It turns out it was a mistake in the client’s implementation.
The session was created for each request but it was never shut down, so all the 
connections were left open.
I only needed to execute a cluster.shutdown() once the request was over.

I do have a follow up question though.
Is it better to have a global session object or to create it and shut it down 
for every request?


From: n...@photonhost.com [mailto:n...@photonhost.com] On Behalf Of Nikolay 
Mihaylov
Sent: Monday, January 22, 2018 11:47 AM
To: user@cassandra.apache.org
Subject: Re: Too many open files

You can increase the system open-files limit;
also, if you compact, the number of open files will go down.

On Mon, Jan 22, 2018 at 10:19 AM, Dor Laor 
<d...@scylladb.com<mailto:d...@scylladb.com>> wrote:
It's a high number; your compaction may be running behind, and thus
many small sstables exist. However, you're also taking the
number of network connections into the calculation (everything
in *nix is a file). If it makes you feel better, my laptop
has 40k open files for Chrome.

On Sun, Jan 21, 2018 at 11:59 PM, Andreou, Arys (Nokia - GR/Athens) 
<arys.andr...@nokia.com<mailto:arys.andr...@nokia.com>> wrote:
Hi,

I keep getting a “Last error: Too many open files” followed by a list of node 
IPs.
The output of “lsof -n|grep java|wc -l” is about 674970 on each node.

What is a normal number of open files?

Thank you.





Re: Too many open files

2018-01-22 Thread Nikolay Mihaylov
You can increase the system open-files limit;
also, if you compact, the number of open files will go down.

On Mon, Jan 22, 2018 at 10:19 AM, Dor Laor <d...@scylladb.com> wrote:

> It's a high number; your compaction may be running behind, and thus
> many small sstables exist. However, you're also taking the
> number of network connections into the calculation (everything
> in *nix is a file). If it makes you feel better, my laptop
> has 40k open files for Chrome.
>
> On Sun, Jan 21, 2018 at 11:59 PM, Andreou, Arys (Nokia - GR/Athens) <
> arys.andr...@nokia.com> wrote:
>
>> Hi,
>>
>>
>>
>> I keep getting a “Last error: Too many open files” followed by a list of
>> node IPs.
>>
>> The output of “lsof -n|grep java|wc -l” is about 674970 on each node.
>>
>>
>>
>> What is a normal number of open files?
>>
>>
>>
>> Thank you.
>>
>>
>>
>
>


Re: Too many open files

2018-01-22 Thread Dor Laor
It's a high number; your compaction may be running behind, and thus
many small sstables exist. However, you're also taking the
number of network connections into the calculation (everything
in *nix is a file). If it makes you feel better, my laptop
has 40k open files for Chrome.

On Sun, Jan 21, 2018 at 11:59 PM, Andreou, Arys (Nokia - GR/Athens) <
arys.andr...@nokia.com> wrote:

> Hi,
>
>
>
> I keep getting a “Last error: Too many open files” followed by a list of
> node IPs.
>
> The output of “lsof -n|grep java|wc -l” is about 674970 on each node.
>
>
>
> What is a normal number of open files?
>
>
>
> Thank you.
>
>
>


Too many open files

2018-01-22 Thread Andreou, Arys (Nokia - GR/Athens)
Hi,

I keep getting a "Last error: Too many open files" followed by a list of node 
IPs.
The output of "lsof -n|grep java|wc -l" is about 674970 on each node.

What is a normal number of open files?

Thank you.



Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Jason Lewis
cat /proc/5980/limits
Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        0            unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             2063522      2063522      processes
Max open files            100000       100000       files
Max locked memory         unlimited    unlimited    bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       2063522      2063522      signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us


On Fri, Nov 6, 2015 at 4:01 PM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> You probably need to configure ulimits correctly
> <http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html>.
>
> What does this give you?
>
> /proc/<pid>/limits
>
>
> All the best,
>
>
> Sebastián Estévez
>
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
>
> On Fri, Nov 6, 2015 at 1:56 PM, Branton Davis <branton.da...@spanning.com>
> wrote:
>
>> We recently went down the rabbit hole of trying to understand the output
>> of lsof.  lsof -n has a lot of duplicates (files opened by multiple
>> threads).  Use 'lsof -p $PID' or 'lsof -u cassandra' instead.
>>
>> On Fri, Nov 6, 2015 at 12:49 PM, Bryan Cheng <br...@blockcypher.com>
>> wrote:
>>
>>> Is your compaction progressing as expected? If not, this may cause an
>>> excessive number of tiny db files. Had a node refuse to start recently
>>> because of this, had to temporarily remove limits on that process.
>>>
>>> On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis <jle...@packetnexus.com>
>>> wrote:
>>>
>>>> I'm getting too many open files errors and I'm wondering what the
>>>> cause may be.
>>>>
>>>> lsof -n | grep java shows 1.4M files
>>>>
>>>> ~90k are inodes
>>>> ~70k are pipes
>>>> ~500k are cassandra services in /usr
>>>> ~700K are the data files.
>>>>
>>>> What might be causing so many files to be open?
>>>>
>>>> jas
>>>>
>>>
>>>
>>
>


Re: Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread 郝加来
Many connections?





郝加来

From: Jason Lewis
Date: 2015-11-07 10:38
To: user@cassandra.apache.org
Subject: Re: Too many open files Cassandra 2.1.11.872
cat /proc/5980/limits
Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        0            unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             2063522      2063522      processes
Max open files            100000       100000       files
Max locked memory         unlimited    unlimited    bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       2063522      2063522      signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us




On Fri, Nov 6, 2015 at 4:01 PM, Sebastian Estevez 
<sebastian.este...@datastax.com> wrote:

You probably need to configure ulimits correctly.


What does this give you?


/proc/<pid>/limits


All the best,



Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com









On Fri, Nov 6, 2015 at 1:56 PM, Branton Davis <branton.da...@spanning.com> 
wrote:

We recently went down the rabbit hole of trying to understand the output of 
lsof.  lsof -n has a lot of duplicates (files opened by multiple threads).  Use 
'lsof -p $PID' or 'lsof -u cassandra' instead.


On Fri, Nov 6, 2015 at 12:49 PM, Bryan Cheng <br...@blockcypher.com> wrote:

Is your compaction progressing as expected? If not, this may cause an excessive 
number of tiny db files. Had a node refuse to start recently because of this, 
had to temporarily remove limits on that process.


On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis <jle...@packetnexus.com> wrote:

I'm getting too many open files errors and I'm wondering what the
cause may be.

lsof -n | grep java shows 1.4M files

~90k are inodes
~70k are pipes
~500k are cassandra services in /usr
~700K are the data files.

What might be causing so many files to be open?

jas




Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Branton Davis
We recently went down the rabbit hole of trying to understand the output of
lsof.  lsof -n has a lot of duplicates (files opened by multiple threads).
Use 'lsof -p $PID' or 'lsof -u cassandra' instead.
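
On Linux you can also count a process's descriptors directly by reading /proc/<pid>/fd, which avoids lsof -n's per-thread duplication. A rough Python sketch (run it as the cassandra user or root; the PID is an example):

    import os

    def fd_count(pid):
        # Each entry in /proc/<pid>/fd is one open descriptor (file, socket,
        # or pipe), counted once no matter how many threads share it.
        return len(os.listdir("/proc/%d/fd" % pid))

    print(fd_count(5980))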

On Fri, Nov 6, 2015 at 12:49 PM, Bryan Cheng <br...@blockcypher.com> wrote:

> Is your compaction progressing as expected? If not, this may cause an
> excessive number of tiny db files. Had a node refuse to start recently
> because of this, had to temporarily remove limits on that process.
>
> On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis <jle...@packetnexus.com>
> wrote:
>
>> I'm getting too many open files errors and I'm wondering what the
>> cause may be.
>>
>> lsof -n | grep java shows 1.4M files
>>
>> ~90k are inodes
>> ~70k are pipes
>> ~500k are cassandra services in /usr
>> ~700K are the data files.
>>
>> What might be causing so many files to be open?
>>
>> jas
>>
>
>


Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Bryan Cheng
Is your compaction progressing as expected? If not, this may cause an
excessive number of tiny db files. Had a node refuse to start recently
because of this, had to temporarily remove limits on that process.

On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis <jle...@packetnexus.com> wrote:

> I'm getting too many open files errors and I'm wondering what the
> cause may be.
>
> lsof -n | grep java shows 1.4M files
>
> ~90k are inodes
> ~70k are pipes
> ~500k are cassandra services in /usr
> ~700K are the data files.
>
> What might be causing so many files to be open?
>
> jas
>


Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Jason Lewis
I'm getting too many open files errors and I'm wondering what the
cause may be.

lsof -n | grep java shows 1.4M files

~90k are inodes
~70k are pipes
~500k are cassandra services in /usr
~700K are the data files.

What might be causing so many files to be open?

jas


Re: Too many open files Cassandra 2.1.11.872

2015-11-06 Thread Sebastian Estevez
You probably need to configure ulimits correctly
<http://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html>.

What does this give you?

/proc/<pid>/limits
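
The same row can also be read programmatically; a small sketch (note it reports the limit of the process running the script, not Cassandra's, so use /proc/<pid>/limits for the server itself):

    import resource

    # RLIMIT_NOFILE corresponds to the "Max open files" row of /proc/<pid>/limits.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("open files: soft=%s hard=%s" % (soft, hard))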


All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com


On Fri, Nov 6, 2015 at 1:56 PM, Branton Davis <branton.da...@spanning.com>
wrote:

> We recently went down the rabbit hole of trying to understand the output
> of lsof.  lsof -n has a lot of duplicates (files opened by multiple
> threads).  Use 'lsof -p $PID' or 'lsof -u cassandra' instead.
>
> On Fri, Nov 6, 2015 at 12:49 PM, Bryan Cheng <br...@blockcypher.com>
> wrote:
>
>> Is your compaction progressing as expected? If not, this may cause an
>> excessive number of tiny db files. Had a node refuse to start recently
>> because of this, had to temporarily remove limits on that process.
>>
>> On Fri, Nov 6, 2015 at 10:09 AM, Jason Lewis <jle...@packetnexus.com>
>> wrote:
>>
>>> I'm getting too many open files errors and I'm wondering what the
>>> cause may be.
>>>
>>> lsof -n | grep java shows 1.4M files
>>>
>>> ~90k are inodes
>>> ~70k are pipes
>>> ~500k are cassandra services in /usr
>>> ~700K are the data files.
>>>
>>> What might be causing so many files to be open?
>>>
>>> jas
>>>
>>
>>
>


Re: too many open files

2014-08-09 Thread Jack Krupansky
Maybe the drivers should have two modes: few sessions, and lots of sessions. 
The former would give you a developer-friendly driver error if you leave more 
than say a dozen or two dozen sessions open (or whatever is considered a best 
practice for parallel threads in a client), on the theory that you probably 
used the anti-pattern of failing to reuse sessions. The latter would be more 
for expert apps that have some good reason for having hundreds or thousands of 
simultaneous sessions open. Whether the latter also has some (configurable) 
limit that is simply a lot higher than the former or is unlimited, is probably 
not so important. Or, maybe, simply have a single limit, without the modes and 
default it to 10 or 25 or some other relatively low number for “normal” apps.

This would be more developer-friendly, for both new and “normal” developers... 
I think.
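
A rough sketch of that idea as a client-side guard, for illustration only; this is a hypothetical wrapper, not an existing driver feature, and the threshold is arbitrary:

    MAX_SESSIONS = 25      # a "relatively low number for normal apps"
    _open_sessions = 0

    def guarded_connect(cluster, keyspace=None):
        # Fail fast when the session-per-request anti-pattern shows up.
        # (A real version would also decrement the count on shutdown.)
        global _open_sessions
        _open_sessions += 1
        if _open_sessions > MAX_SESSIONS:
            raise RuntimeError("%d sessions open; are you creating one per "
                               "request instead of reusing it?" % _open_sessions)
        return cluster.connect(keyspace)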

-- Jack Krupansky

From: Marcelo Elias Del Valle 
Sent: Saturday, August 9, 2014 12:41 AM
To: user@cassandra.apache.org 
Subject: Re: too many open files

Indeed, that was my mistake, that was exactly what we were doing in the code. 
[]s



2014-08-09 0:56 GMT-03:00 Brian Zhang yikebo...@gmail.com:

  For the Cassandra driver, a session is just like a database connection pool: it 
may contain many TCP connections. If you create a new session every time, more 
and more TCP connections will be created, until you surpass the max file 
descriptor limit of the OS.

  You should create one session and use it repeatedly; a session manages 
connections automatically, creating new connections or closing old ones for 
your requests.


  On Aug 9, 2014, at 6:52, Redmumba redmu...@gmail.com wrote:


Just to chime in, I also ran into this issue when I was migrating to the 
Datastax client. Instead of reusing the session, I was opening a new session 
each time. For some reason, even though I was still closing the session on the 
client side, I was getting the same error.

Plus, the only way I could recover was by restarting Cassandra. I did not 
really see the connections timeout over a period of a few minutes.

Andrew

On Aug 8, 2014 3:19 PM, Andrey Ilinykh ailin...@gmail.com wrote:

  You may have this problem if your client doesn't reuse the connection but 
opens a new one every time. So, run netstat and check the number of established 
connections. This number should not be big.

  Thank you,
Andrey 



  On Fri, Aug 8, 2014 at 12:35 PM, Marcelo Elias Del Valle 
marc...@s1mbi0se.com.br wrote:

Hi, 

I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having 
too many open files exceptions when I try to perform a large number of 
operations in my 10 node cluster.

I saw the documentation 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting the 
errors.

In the documentation it says: Another, much less likely possibility, 
is a file descriptor leak in Cassandra. Run lsof -n | grep java to check that 
the number of file descriptors opened by Java is reasonable and reports the 
error if the number is greater than a few thousand.

I guess it's not the case, or else a lot of people would be complaining 
about it, but I am not sure what I could do to solve the problem.

Any hint about how to solve it?

My client is written in python and uses Cassandra Python Driver. Here 
are the exceptions I am having in the client:
[s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.151, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.142, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.143, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.142, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.145, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.144, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.148, scheduling retry in 600.0 seconds: 
[Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error 
attempting to reconnect to 200.200.200.146, scheduling retry in 600.0 seconds: 
[Errno 24] Too

Re: too many open files

2014-08-09 Thread Andrew
Tyler,

I’ll see if I can reproduce this on a local instance, but just in case, the 
error was basically—instead of storing the session in my connection factory, I 
stored a cluster and called “connect” each time I requested a Session.  I had 
defined a max/min number of connections for the connection itself, maxing out 
at 128 local/remote.  I’m not sure if a Session results in a new file handle on 
the server side, but I saw the same issue (hundreds of thousands of sockets 
opened on the server).

The cluster was also using hsha; most of the other settings were default in 
2.0.7.

Andrew

On August 8, 2014 at 4:08:50 PM, Tyler Hobbs (ty...@datastax.com) wrote:


On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:
Just to chime in, I also ran into this issue when I was migrating to the 
Datastax client. Instead of reusing the session, I was opening a new session 
each time. For some reason, even though I was still closing the session on the 
client side, I was getting the same error.

Which driver?  If you can still reproduce this, would you mind opening a 
ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all)


--
Tyler Hobbs
DataStax


Re: too many open files

2014-08-09 Thread Andrew
I just had a generator that (in the incorrect way) had a cluster as a member 
variable, and would call .connect() repeatedly.  I _thought_, incorrectly, that 
the Session was thread unsafe, and so I should request a separate Session each 
time—obviously wrong in hind sight.

There was no special logic; I had a restriction of about 128 connections per 
host, but the connections were in the 100s of thousands, like the OP mentioned. 
 Again, I’ll see about reproducing it on Monday, but just wanted the repro 
steps (overall) to live somewhere in case I can’t. :)

Andrew

On August 8, 2014 at 4:08:50 PM, Tyler Hobbs (ty...@datastax.com) wrote:


On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:
Just to chime in, I also ran into this issue when I was migrating to the 
Datastax client. Instead of reusing the session, I was opening a new session 
each time. For some reason, even though I was still closing the session on the 
client side, I was getting the same error.

Which driver?  If you can still reproduce this, would you mind opening a 
ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all)


--
Tyler Hobbs
DataStax


Re: too many open files

2014-08-09 Thread Marcelo Elias Del Valle
IMHO, I think the drivers are fine. It was a dumb mistake of mine to use
sessions as connections and not as connection pools.
What was harder to figure out, in my opinion, was that too many connections
from the client would increase the number of file descriptors used by the
server. I didn't know Linux opens an FD for each connection received, and
honestly I still don't know much about the details of this. When I got a
too many open files error it took a good while to think about checking
the connections.
I think the documentation could point out this fact; it would help other people
with the same problem.
There could be something talking about it here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html

[]s



2014-08-09 12:55 GMT-03:00 Jack Krupansky j...@basetechnology.com:

   Maybe the drivers should have two modes: few sessions, and lots of
 sessions. The former would give you a developer-friendly driver error if
 you leave more than say a dozen or two dozen sessions open (or whatever is
 considered a best practice for parallel threads in a client), on the theory
 that you probably used the anti-pattern of failing to reuse sessions. The
 latter would be more for expert apps that have some good reason for having
 hundreds or thousands of simultaneous sessions open. Whether the latter
 also has some (configurable) limit that is simply a lot higher than the
 former or is unlimited, is probably not so important. Or, maybe, simply
 have a single limit, without the modes and default it to 10 or 25 or some
 other relatively low number for “normal” apps.

 This would be more developer-friendly, for both new and “normal”
 developers... I think.

 -- Jack Krupansky

  *From:* Marcelo Elias Del Valle marc...@s1mbi0se.com.br
 *Sent:* Saturday, August 9, 2014 12:41 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: too many open files

  Indeed, that was my mistake, that was exactly what we were doing in the
 code.
 []s


 2014-08-09 0:56 GMT-03:00 Brian Zhang yikebo...@gmail.com:

 For the Cassandra driver, a session is just like a database connection pool:
 it may contain many TCP connections. If you create a new session every time,
 more and more TCP connections will be created, until you surpass the max file
 descriptor limit of the OS.

 You should create one session and use it repeatedly; a session manages
 connections automatically, creating new connections or closing old ones for
 your requests.

 On Aug 9, 2014, at 6:52, Redmumba redmu...@gmail.com wrote:

  Just to chime in, I also ran into this issue when I was migrating to
 the Datastax client. Instead of reusing the session, I was opening a new
 session each time. For some reason, even though I was still closing the
 session on the client side, I was getting the same error.

 Plus, the only way I could recover was by restarting Cassandra. I did not
 really see the connections timeout over a period of a few minutes.

 Andrew
 On Aug 8, 2014 3:19 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 You may have this problem if your client doesn't reuse the connection
 but opens a new one every time. So, run netstat and check the number of
 established connections. This number should not be big.

 Thank you,
   Andrey


 On Fri, Aug 8, 2014 at 12:35 PM, Marcelo Elias Del Valle 
 marc...@s1mbi0se.com.br wrote:

  Hi,

 I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having
 too many open files exceptions when I try to perform a large number of
 operations in my 10 node cluster.

 I saw the documentation
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting
 the errors.

 In the documentation it says: Another, much less likely possibility,
 is a file descriptor leak in Cassandra. Run lsof -n | grep java to
 check that the number of file descriptors opened by Java is reasonable and
 reports the error if the number is greater than a few thousand.

 I guess it's not the case, or else a lot of people would be complaining
 about it, but I am not sure what I could do to solve the problem.

 Any hint about how to solve it?

 My client is written in python and uses Cassandra Python Driver. Here
 are the exceptions I am having in the client:
 [s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds

Re: too many open files

2014-08-09 Thread Kevin Burton
Another idea: detect this when the number of open sessions exceeds the
number of threads.
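
A sketch of that heuristic, assuming the application already tracks how many sessions it has opened (the counter argument here is hypothetical):

    import threading

    def sessions_look_leaky(open_session_count):
        # A well-behaved app rarely needs more sessions than live threads;
        # more sessions than threads usually means they are not being reused.
        return open_session_count > threading.active_count()
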
On Aug 9, 2014 10:59 AM, Andrew redmu...@gmail.com wrote:

 I just had a generator that (in the incorrect way) had a cluster as a
 member variable, and would call .connect() repeatedly.  I _thought_,
 incorrectly, that the Session was thread unsafe, and so I should request a
 separate Session each time—obviously wrong in hind sight.

 There was no special logic; I had a restriction of about 128 connections
 per host, but the connections were in the 100s of thousands, like the OP
 mentioned.  Again, I’ll see about reproducing it on Monday, but just wanted
 the repro steps (overall) to live somewhere in case I can’t. :)

 Andrew

 On August 8, 2014 at 4:08:50 PM, Tyler Hobbs (ty...@datastax.com) wrote:


 On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:

 Just to chime in, I also ran into this issue when I was migrating to the
 Datastax client. Instead of reusing the session, I was opening a new
 session each time. For some reason, even though I was still closing the
 session on the client side, I was getting the same error.


 Which driver?  If you can still reproduce this, would you mind opening a
 ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all
 )


 --
 Tyler Hobbs
 DataStax http://datastax.com/




Re: too many open files

2014-08-09 Thread Jonathan Haddad
It really doesn't need to be this complicated.  You only need 1
session per application.  It's thread safe and manages the connection
pool for you.

http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/Session.html
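
The same point as a sketch with the Python driver, with many threads sharing one Session (contact point and query are illustrative):

    from concurrent.futures import ThreadPoolExecutor
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()  # one Session for the whole application

    def worker(_):
        # Every thread shares the same session and its connection pool.
        return session.execute("SELECT release_version FROM system.local")

    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(worker, range(100)))

    cluster.shutdown()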



On Sat, Aug 9, 2014 at 1:29 PM, Kevin Burton bur...@spinn3r.com wrote:
 Another idea to detect this is when the number of open sessions exceeds the
 number of threads.

 On Aug 9, 2014 10:59 AM, Andrew redmu...@gmail.com wrote:

 I just had a generator that (in the incorrect way) had a cluster as a
 member variable, and would call .connect() repeatedly.  I _thought_,
 incorrectly, that the Session was thread unsafe, and so I should request a
 separate Session each time—obviously wrong in hind sight.

 There was no special logic; I had a restriction of about 128 connections
 per host, but the connections were in the 100s of thousands, like the OP
 mentioned.  Again, I’ll see about reproducing it on Monday, but just wanted
 the repro steps (overall) to live somewhere in case I can’t. :)

 Andrew

 On August 8, 2014 at 4:08:50 PM, Tyler Hobbs (ty...@datastax.com) wrote:


 On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:

 Just to chime in, I also ran into this issue when I was migrating to the
 Datastax client. Instead of reusing the session, I was opening a new session
 each time. For some reason, even though I was still closing the session on
 the client side, I was getting the same error.


 Which driver?  If you can still reproduce this, would you mind opening a
 ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all)


 --
 Tyler Hobbs
 DataStax



-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Re: too many open files

2014-08-09 Thread Andrew
Yes, that was the problem—I actually knew better, but had briefly overlooked 
this when I was doing some refactoring.  I am not the OP (although he 
himself realized his mistake).

If you follow the thread, I was explaining that the Datastax Java driver 
allowed me to open a very large number of connections until 
the Cassandra server ran out of connections.  Tyler was asking for a repro case 
and requesting that I file a possible bug, if this was something that SHOULD 
have been caught on the client side (via the max connections client 
configuration).

Andrew

On August 9, 2014 at 2:17:57 PM, Jonathan Haddad (j...@jonhaddad.com) wrote:

It really doesn't need to be this complicated. You only need 1  
session per application. It's thread safe and manages the connection  
pool for you.  

http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/Session.html  



On Sat, Aug 9, 2014 at 1:29 PM, Kevin Burton bur...@spinn3r.com wrote:  
 Another idea to detect this is when the number of open sessions exceeds the  
 number of threads.  
  
 On Aug 9, 2014 10:59 AM, Andrew redmu...@gmail.com wrote:  
  
 I just had a generator that (in the incorrect way) had a cluster as a  
 member variable, and would call .connect() repeatedly. I _thought_,  
 incorrectly, that the Session was thread unsafe, and so I should request a  
 separate Session each time—obviously wrong in hind sight.  
  
 There was no special logic; I had a restriction of about 128 connections  
 per host, but the connections were in the 100s of thousands, like the OP  
 mentioned. Again, I’ll see about reproducing it on Monday, but just wanted  
 the repro steps (overall) to live somewhere in case I can’t. :)  
  
 Andrew  
  
 On August 8, 2014 at 4:08:50 PM, Tyler Hobbs (ty...@datastax.com) wrote:  
  
  
 On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:  
  
 Just to chime in, I also ran into this issue when I was migrating to the  
 Datastax client. Instead of reusing the session, I was opening a new 
 session  
 each time. For some reason, even though I was still closing the session on  
 the client side, I was getting the same error.  
  
  
 Which driver? If you can still reproduce this, would you mind opening a  
 ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all)  
  
  
 --  
 Tyler Hobbs  
 DataStax  



--  
Jon Haddad  
http://www.rustyrazorblade.com  
skype: rustyrazorblade  


too many open files

2014-08-08 Thread Marcelo Elias Del Valle
Hi,

I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having too
many open files exceptions when I try to perform a large number of
operations in my 10 node cluster.

I saw the documentation
http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
and I have set everything to the recommended settings, but I keep getting
the errors.

In the documentation it says: "Another, much less likely possibility, is a
file descriptor leak in Cassandra. Run lsof -n | grep java to check that
the number of file descriptors opened by Java is reasonable and reports the
error if the number is greater than a few thousand."

I guess it's not the case, or else a lot of people would be complaining
about it, but I am not sure what I could do to solve the problem.

Any hint about how to solve it?

My client is written in python and uses Cassandra Python Driver. Here are
the exceptions I am having in the client:
[s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.145, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.144, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.148, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.146, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,733 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.77, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.76, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.75, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,735 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,736 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.185, scheduling retry in 600.0
seconds: [Errno 24] Too many open files
[s1log] 2014-08-08 12:16:09,942 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.144, scheduling retry in 512.0
seconds: Timed out connecting to 200.200.200.144
[s1log] 2014-08-08 12:16:09,998 - cassandra.pool - WARNING - Error
attempting to reconnect to 200.200.200.77, scheduling retry in 512.0
seconds: Timed out connecting to 200.200.200.77


And here is the exception I am having in the server:

 WARN [Native-Transport-Requests:163] 2014-08-08 14:27:30,499
BatchStatement.java (line 223) Batch of prepared statements for
[identification.entity_lookup, identification.entity] is of size 25216,
exceeding specified threshold of 5120 by 20096.
ERROR [Native-Transport-Requests:150] 2014-08-08 14:27:31,611
ErrorMessage.java (line 222) Unexpected exception during request
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
at
org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142

Re: too many open files

2014-08-08 Thread Shane Hansen
Are you using apache or Datastax cassandra?

The DataStax distribution ups the file handle limit to 100000. That
number's hard to exceed.



On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle 
marc...@s1mbi0se.com.br wrote:

 Hi,

 I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having too
 many open files exceptions when I try to perform a large number of
 operations in my 10 node cluster.

 I saw the documentation
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting
 the errors.

 In the documentation it says: Another, much less likely possibility, is
 a file descriptor leak in Cassandra. Run lsof -n | grep java to check
 that the number of file descriptors opened by Java is reasonable and
 reports the error if the number is greater than a few thousand.

 I guess it's not the case, or else a lot of people would be complaining
 about it, but I am not sure what I could do to solve the problem.

 Any hint about how to solve it?

 My client is written in python and uses Cassandra Python Driver. Here are
 the exceptions I am having in the client:
 [s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.145, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.148, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.146, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,733 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.76, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.75, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,735 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,736 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.185, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,942 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.144
 [s1log] 2014-08-08 12:16:09,998 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.77


 And here is the exception I am having in the server:

  WARN [Native-Transport-Requests:163] 2014-08-08 14:27:30,499
 BatchStatement.java (line 223) Batch of prepared statements for
 [identification.entity_lookup, identification.entity] is of size 25216,
 exceeding specified threshold of 5120 by 20096.
 ERROR [Native-Transport-Requests:150] 2014-08-08 14:27:31,611
 ErrorMessage.java (line 222) Unexpected exception during request
 java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 at sun.nio.ch.IOUtil.read(IOUtil.java:192)
 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:375)
 at
 org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64)
 at
 org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
 at
 org.jboss.netty.channel.socket.nio.AbstractNioSelector.run

Re: too many open files

2014-08-08 Thread Marcelo Elias Del Valle
I am using DataStax Community, the packaged version for Debian. I am also
using the latest version of OpsCenter and datastax-agent.

However, I just listed open files here:

sudo lsof -n | grep java | wc -l
1096599

It seems it has exceeded that. Should I just increase the limit? Or is it
possible that it's a memory leak?

Best regards,
Marcelo.



2014-08-08 17:06 GMT-03:00 Shane Hansen shanemhan...@gmail.com:

 Are you using apache or Datastax cassandra?

 The DataStax distribution ups the file handle limit to 100000. That
 number's hard to exceed.



 On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle 
 marc...@s1mbi0se.com.br wrote:

 Hi,

 I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having too
 many open files exceptions when I try to perform a large number of
 operations in my 10 node cluster.

 I saw the documentation
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting
 the errors.

 In the documentation it says: Another, much less likely possibility, is
 a file descriptor leak in Cassandra. Run lsof -n | grep java to check
 that the number of file descriptors opened by Java is reasonable and
 reports the error if the number is greater than a few thousand.

 I guess it's not the case, or else a lot of people would be complaining
 about it, but I am not sure what I could do to solve the problem.

 Any hint about how to solve it?

 My client is written in python and uses Cassandra Python Driver. Here are
 the exceptions I am having in the client:
 [s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.145, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.148, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.146, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,733 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.76, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.75, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,735 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,736 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.185, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,942 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.144
 [s1log] 2014-08-08 12:16:09,998 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.77


 And here is the exception I am having in the server:

  WARN [Native-Transport-Requests:163] 2014-08-08 14:27:30,499
 BatchStatement.java (line 223) Batch of prepared statements for
 [identification.entity_lookup, identification.entity] is of size 25216,
 exceeding specified threshold of 5120 by 20096.
 ERROR [Native-Transport-Requests:150] 2014-08-08 14:27:31,611
 ErrorMessage.java (line 222) Unexpected exception during request
 java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223

Re: too many open files

2014-08-08 Thread Kevin Burton
You may want to look at the actual filenames.  You might have an app
leaving them open.  Also, remember, sockets use FDs so they are in the list
too.


On Fri, Aug 8, 2014 at 1:13 PM, Marcelo Elias Del Valle 
marc...@s1mbi0se.com.br wrote:

 I am using DataStax Community, the packaged version for Debian. I am also
 using the latest version of OpsCenter and datastax-agent.

 However, I just listed open files here:

 sudo lsof -n | grep java | wc -l
 1096599

 It seems it has exceeded that. Should I just increase the limit? Or is it
 possible that it's a memory leak?

 Best regards,
 Marcelo.



 2014-08-08 17:06 GMT-03:00 Shane Hansen shanemhan...@gmail.com:

 Are you using apache or Datastax cassandra?

 The DataStax distribution ups the file handle limit to 100000. That
 number's hard to exceed.



 On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle 
 marc...@s1mbi0se.com.br wrote:

 Hi,

 I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having
 too many open files exceptions when I try to perform a large number of
 operations in my 10 node cluster.

 I saw the documentation
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting
 the errors.

 In the documentation it says: Another, much less likely possibility,
 is a file descriptor leak in Cassandra. Run lsof -n | grep java to
 check that the number of file descriptors opened by Java is reasonable and
 reports the error if the number is greater than a few thousand.

 I guess it's not the case, or else a lot of people would be complaining
 about it, but I am not sure what I could do to solve the problem.

 Any hint about how to solve it?

 My client is written in python and uses Cassandra Python Driver. Here
 are the exceptions I am having in the client:
 [s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.145, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.148, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.146, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,733 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.76, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.75, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,735 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,736 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.185, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,942 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.144
 [s1log] 2014-08-08 12:16:09,998 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.77


 And here is the exception I am having in the server:

  WARN [Native-Transport-Requests:163] 2014-08-08 14:27:30,499
 BatchStatement.java (line 223) Batch of prepared statements for
 [identification.entity_lookup, identification.entity] is of size 25216,
 exceeding specified threshold of 5120 by 20096.
 ERROR [Native-Transport-Requests:150] 2014-08-08 14:27:31,611
 ErrorMessage.java (line 222) Unexpected exception during

Re: too many open files

2014-08-08 Thread Marcelo Elias Del Valle
I just solved the issue. It was indeed the Cassandra process that was holding
too many FDs, but the problem was the number of client connections being
opened. The client was opening more connections than needed.
Thanks for the help.
[]s


2014-08-08 17:17 GMT-03:00 Kevin Burton bur...@spinn3r.com:

 You may want to look at the actual filenames.  You might have an app
 leaving them open.  Also, remember, sockets use FDs so they are in the list
 too.


 On Fri, Aug 8, 2014 at 1:13 PM, Marcelo Elias Del Valle 
 marc...@s1mbi0se.com.br wrote:

 I am using DataStax Community, the packaged version for Debian. I am also
 using the latest version of OpsCenter and datastax-agent.

 However, I just listed open files here:

 sudo lsof -n | grep java | wc -l
 1096599

 It seems it has exceeded that. Should I just increase the limit? Or is it
 possible that it's a memory leak?

 Best regards,
 Marcelo.



 2014-08-08 17:06 GMT-03:00 Shane Hansen shanemhan...@gmail.com:

 Are you using apache or Datastax cassandra?

 The DataStax distribution ups the file handle limit to 100000. That
 number's hard to exceed.



 On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle 
 marc...@s1mbi0se.com.br wrote:

 Hi,

 I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having
 too many open files exceptions when I try to perform a large number of
 operations in my 10 node cluster.

 I saw the documentation
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/troubleshooting/trblshootTooManyFiles_r.html
 and I have set everything to the recommended settings, but I keep getting
 the errors.

 In the documentation it says: Another, much less likely possibility,
 is a file descriptor leak in Cassandra. Run lsof -n | grep java to
 check that the number of file descriptors opened by Java is reasonable and
 reports the error if the number is greater than a few thousand.

 I guess it's not the case, or else a lot of people would be complaining
 about it, but I am not sure what I could do to solve the problem.

 Any hint about how to solve it?

 My client is written in python and uses Cassandra Python Driver. Here
 are the exceptions I am having in the client:
 [s1log] 2014-08-08 12:16:09,631 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.151, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,632 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,633 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.143, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,634 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.145, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,635 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.148, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,732 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.146, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,733 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.76, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,734 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.75, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,735 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.142, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,736 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.185, scheduling retry in 600.0
 seconds: [Errno 24] Too many open files
 [s1log] 2014-08-08 12:16:09,942 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.144, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.144
 [s1log] 2014-08-08 12:16:09,998 - cassandra.pool - WARNING - Error
 attempting to reconnect to 200.200.200.77, scheduling retry in 512.0
 seconds: Timed out connecting to 200.200.200.77


 And here is the exception I am having in the server:

  WARN [Native-Transport-Requests:163] 2014-08-08 14:27:30,499
 BatchStatement.java (line 223) Batch of prepared statements for
 [identification.entity_lookup, identification.entity] is of size 25216,
 exceeding specified threshold of 5120 by 20096.
 ERROR [Native-Transport-Requests:150] 2014-08-08 14:27:31,611
 ErrorMessage.java (line 222) Unexpected exception during request
 java.io.IOException: Connection reset by peer

Re: too many open files

2014-08-08 Thread Andrey Ilinykh
You may have this problem if your client doesn't reuse the connection but
opens a new one every time. So, run netstat and check the number of
established connections. This number should not be big.

Thank you,
  Andrey
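
A throwaway sketch of that check, wrapping netstat (the ports are the stock
defaults -- 9042 for the native protocol, 9160 for Thrift -- and may differ
in your cassandra.yaml):

import subprocess

def established_connections(port=9042):
    # Counts ESTABLISHED TCP connections involving the given port,
    # on either the local or the remote side of the connection.
    out = subprocess.check_output(['netstat', '-tn']).decode()
    return sum(1 for line in out.splitlines()
               if 'ESTABLISHED' in line and (':%d ' % port) in line)

print(established_connections())

A well-behaved client pool holds a small, stable number of connections per
node; a steadily growing count is the leak Andrey describes.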


On Fri, Aug 8, 2014 at 12:35 PM, Marcelo Elias Del Valle 
marc...@s1mbi0se.com.br wrote:

 ...

Re: too many open files

2014-08-08 Thread Redmumba
Just to chime in, I also ran into this issue when I was migrating to the
Datastax client. Instead of reusing the session, I was opening a new
session each time. For some reason, even though I was still closing the
session on the client side, I was getting the same error.

Plus, the only way I could recover was by restarting Cassandra. I did not
really see the connections time out over a period of a few minutes.

Andrew
On Aug 8, 2014 3:19 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 ...

Re: too many open files

2014-08-08 Thread Tyler Hobbs
On Fri, Aug 8, 2014 at 5:52 PM, Redmumba redmu...@gmail.com wrote:

 Just to chime in, I also ran into this issue when I was migrating to the
 Datastax client. Instead of reusing the session, I was opening a new
 session each time. For some reason, even though I was still closing the
 session on the client side, I was getting the same error.


Which driver?  If you can still reproduce this, would you mind opening a
ticket? (https://datastax-oss.atlassian.net/secure/BrowseProjects.jspa#all)


-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: too many open files

2014-08-08 Thread J. Ryan Earl
Yes, definitely look at how many open files are actual file handles versus
network sockets.  We found a file handle leak in 2.0 but it was patched in
2.0.3 or .5 I think.  A million open files is way too high.


On Fri, Aug 8, 2014 at 5:19 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 ...

Re: too many open files

2014-08-08 Thread Brian Zhang
For the Cassandra driver, a session is just like a database connection pool: it
may contain many TCP connections. If you create a new session every time, more
and more TCP connections will be created, until you surpass the max file
descriptor limit of the OS.

You should create one session and use it repeatedly; the session can manage
connections automatically, creating new connections or closing old ones for
your requests.

On Aug 9, 2014, at 6:52, Redmumba redmu...@gmail.com wrote:

 ...

Re: too many open files

2014-08-08 Thread Marcelo Elias Del Valle
Indeed, that was my mistake, that was exactly what we were doing in the
code.
[]s


2014-08-09 0:56 GMT-03:00 Brian Zhang yikebo...@gmail.com:

 ...

Re: Too Many Open Files (sockets) - VNodes - Map/Reduce Job

2014-06-04 Thread Michael Shuler

(this is probably a better question for the user list - cc/reply-to set)

Allow more files to be open  :)

http://www.datastax.com/documentation/cassandra/1.2/cassandra/install/installRecommendSettings.html

--
Kind regards,
Michael


On 06/04/2014 12:15 PM, Florian Dambrine wrote:

Hi everybody,

We are running Elastic MapReduce jobs from Amazon on a 25 node Cassandra
cluster (with vnodes). Since we have increased the size of the cluster, we
are facing a too many open files (due to sockets) exception when creating
the splits. Does anyone have an idea?

Thanks,

Here is the stacktrace:


14/06/04 03:23:24 INFO mapred.JobClient: Default number of map tasks: null
14/06/04 03:23:24 INFO mapred.JobClient: Setting default number of map
tasks based on cluster size to : 80
14/06/04 03:23:24 INFO mapred.JobClient: Default number of reduce tasks: 26
14/06/04 03:23:25 INFO security.ShellBasedUnixGroupsMapping: add
hadoop to shell userGroupsCache
14/06/04 03:23:25 INFO mapred.JobClient: Setting group to hadoop
14/06/04 03:23:41 ERROR transport.TSocket: Could not configure socket.
java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:447)
at java.net.Socket.getImpl(Socket.java:510)
at java.net.Socket.setSoLinger(Socket.java:984)
at org.apache.thrift.transport.TSocket.initSocket(TSocket.java:118)
at org.apache.thrift.transport.TSocket.init(TSocket.java:109)
at org.apache.thrift.transport.TSocket.init(TSocket.java:94)
at 
org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:39)
at 
org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:558)
at 
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSubSplits(AbstractColumnFamilyInputFormat.java:286)
at 
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.access$200(AbstractColumnFamilyInputFormat.java:61)
at 
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:236)
at 
org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat$SplitCallable.call(AbstractColumnFamilyInputFormat.java:221)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Re: Getting into Too many open files issues

2013-11-20 Thread J. Ryan Earl
There was a bug introduced in 2.0.0-beta1 related to TTL; a patch just became
available in: https://issues.apache.org/jira/browse/CASSANDRA-6275


On Thu, Nov 7, 2013 at 5:15 AM, Murthy Chelankuri kmurt...@gmail.com wrote:

 ...



Re: Getting into Too many open files issues

2013-11-11 Thread Aaron Morton
 Some reason with in less than an hour cassandra node is opening 32768 files 
 and cassandra is not responding after that. 
Are you using Levelled Compaction?
If so, what value did you set for min_sstable_size? The default has changed
from 5MB to 160MB.

Increasing the file handles is the right thing to do but 32K files is a lot. 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 8/11/2013, at 8:09 am, Arindam Barua aba...@247-inc.com wrote:

 ...



Getting into Too many open files issues

2013-11-07 Thread Murthy Chelankuri
I have been experimenting with the latest Cassandra version for storing huge
data volumes in our application.

Writes are doing well, but when it comes to reads I have observed that
Cassandra is getting into too many open files issues. When I check the logs,
it is not able to open the Cassandra data files any more because of the file
descriptor limits.

Can someone suggest what I am doing wrong, and what issues could be causing
the read operations to lead to the Too many open files error?


RE: Getting into Too many open files issues

2013-11-07 Thread Pieter Callewaert
Hi Murthy,

Did you do a package install (.deb?) or did you download the tar?
If the latter, you have to adjust the limits.conf file
(/etc/security/limits.conf) to raise the nofile limit (number of open files)
for the cassandra user.

If you are using the .deb package, the limit is already raised to 100 000
files (it can be found in /etc/init.d/cassandra, FD_LIMIT).
However, with 2.0.x I had to raise it to 1 000 000 because 100 000 was too
low.

Kind regards,
Pieter Callewaert

From: Murthy Chelankuri [mailto:kmurt...@gmail.com]
Sent: donderdag 7 november 2013 12:15
To: user@cassandra.apache.org
Subject: Getting into Too many open files issues

...


Re: Getting into Too many open files issues

2013-11-07 Thread Murthy Chelankuri
Thanks Pieter for the quick reply.

I have downloaded the tar ball and have changed limits.conf as per the
documentation, like below.

* soft nofile 32768
* hard nofile 32768
root soft nofile 32768
root hard nofile 32768
* soft memlock unlimited
* hard memlock unlimited
root soft memlock unlimited
root hard memlock unlimited
* soft as unlimited
* hard as unlimited
root soft as unlimited
root hard as unlimited

root soft/hard nproc 32000



For some reason, within less than an hour the Cassandra node opens 32768
files and Cassandra is not responding after that.

It is still not clear why Cassandra is opening that many files and not
closing them properly (does the latest Cassandra 2.0.1 version have some
bugs?).

What I have been experimenting with is 300 writes per sec and 500 reads per
sec, on a 2 node cluster with 8 core CPUs and 32GB RAM (virtual machines).

Do we need to increase the nofile limits to more than 32768?
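
One thing worth checking before raising it further: limits.conf only applies
to sessions started after the change (and only if PAM applies it), so the
running JVM may not have the limit you think it has. A small sketch that reads
what the process actually got, the same check another message in this digest
does with cat /proc/<pid>/limits:

import re

def max_open_files(pid):
    # /proc/<pid>/limits reports the limits in force for that process,
    # regardless of what limits.conf says now.
    with open('/proc/%d/limits' % pid) as f:
        for line in f:
            if line.startswith('Max open files'):
                soft, hard = re.findall(r'(\d+|unlimited)', line)[:2]
                return soft, hard

print(max_open_files(12345))  # pid is hypothetical; e.g. ('32768', '32768')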
On Thu, Nov 7, 2013 at 4:55 PM, Pieter Callewaert 
pieter.callewa...@be-mobile.be wrote:

 ...



RE: Getting into Too many open files issues

2013-11-07 Thread Pieter Callewaert
Hi Murthy,

32768 is a bit low (I know the DataStax docs recommend this). But our
production env is now running with 1kk (1,000,000), or you can even put it on
unlimited.

Pieter

From: Murthy Chelankuri [mailto:kmurt...@gmail.com]
Sent: donderdag 7 november 2013 12:46
To: user@cassandra.apache.org
Subject: Re: Getting into Too many open files issues

...



RE: Getting into Too many open files issues

2013-11-07 Thread Arindam Barua

I see 100 000 recommended in the DataStax documentation for the nofile limit
since Cassandra 1.2:

http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/install/installRecommendSettings.html

-Arindam

From: Pieter Callewaert [mailto:pieter.callewa...@be-mobile.be]
Sent: Thursday, November 07, 2013 4:22 AM
To: user@cassandra.apache.org
Subject: RE: Getting into Too many open files issues

...



Re: Too many open files with Cassandra 1.2.11

2013-10-31 Thread Aaron Morton
What’s in /etc/security/limits.conf ? 

and just for fun what does lsof -n | grep java | wc -l  say ? 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 12:21 am, Oleg Dulin oleg.du...@gmail.com wrote:

 ...



Too many open files with Cassandra 1.2.11

2013-10-29 Thread Oleg Dulin

Got this error:

WARN [Thread-8] 2013-10-29 02:58:24,565 CustomTThreadPoolServer.java
(line 122) Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException:
java.net.SocketException: Too many open files
at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:109)
at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:36)
at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:110)
at org.apache.cassandra.thrift.ThriftServer$ThriftServerThread.run(ThriftServer.java:111)



I haven't seen this since 1.0 days. 1.1.11 had it all fixed I thought.

ulimit outputs unlimited

What could cause this ?

Any help is greatly appreciated.

--
Regards,
Oleg Dulin
http://www.olegdulin.com




Too many open files (Cassandra 2.0.1)

2013-10-29 Thread Pieter Callewaert
Hi,

I've noticed some nodes in our cluster are dying after some period of time.

WARN [New I/O server boss #17] 2013-10-29 12:22:20,725 Slf4JLogger.java (line 
76) Failed to accept a connection.
java.io.IOException: Too many open files
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
at 
org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at 
org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

And other exceptions related to the same cause.
Now, as we use the Cassandra package, the nofile limit is raised to 100000.
To double check if this correct:

root@de-cass09 ~ # cat /proc/18332/limits
Limit                     Soft Limit           Hard Limit           Units
...
Max open files            100000               100000               files
...

Now I check how many files are open:
root@de-cass09 ~ # lsof -n -p 18332 | wc -l
100038

This seems like an awful lot for size tiered compaction...?
Now I noticed, when I checked the list, that one (deleted) file appeared a lot:

...
java18332 cassandra 4704r   REG8,1  10911921661 2147483839 
/data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
java18332 cassandra 4705r   REG8,1  10911921661 2147483839 
/data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
...

Actually, if I count specifically for this file:
root@de-cass09 ~ # lsof -n -p 18332 | grep mapdata040-hos-jb-7648-Data.db | wc -l
52707

Other nodes are at around 350 open files in total... Any idea why this nofile
count is so high?
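
One way to quantify how much of that is the deleted file: descriptors whose
target has been unlinked show up with a ' (deleted)' suffix in /proc, just as
they do in the lsof output above. A small sketch (Linux; the pid is the one
from the listing above):

import os

def deleted_file_fds(pid):
    # A descriptor held on an unlinked file keeps both the fd and the
    # disk space alive until the last reference is closed.
    count = 0
    for fd in os.listdir('/proc/%d/fd' % pid):
        try:
            if os.readlink('/proc/%d/fd/%s' % (pid, fd)).endswith(' (deleted)'):
                count += 1
        except OSError:
            pass  # fd vanished while scanning
    return count

print(deleted_file_fds(18332))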

The first exception I see is this:
WARN [New I/O worker #8] 2013-10-29 12:09:34,440 Slf4JLogger.java (line 76) 
Unexpected exception in the selector loop.
java.lang.NullPointerException
at 
sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:178)
at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:227)
at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:164)
at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:133)
at 
java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:209)
at 
org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:151)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)

Several minutes later I get Too many open files.

Specs:
12-node cluster with Ubuntu 12.04 LTS, Cassandra 2.0.1 (datastax packages), 
using JBOD of 2 disks.
JNA enabled.

Any suggestions?

Kind regards,
Pieter Callewaert

   Pieter Callewaert
   Web & IT engineer

   Web:   www.be-mobile.be
   Email: pieter.callewa...@be-mobile.be
   Tel:   + 32 9 330 51 80

Re: too many open files

2013-07-15 Thread Paul Ingalls
Also, looking through the log, it appears a lot of the files end with ic-###,
which I assume is associated with a secondary index I have on the table.  Are
secondary indexes really expensive from a file descriptor standpoint?  That
particular table uses the default compaction scheme...

On Jul 15, 2013, at 12:00 AM, Paul Ingalls paulinga...@gmail.com wrote:

 I have one table that is using leveled.  It was set to 10MB, I will try 
 changing it to 256MB.  Is there a good way to merge the existing sstables?
 
 On Jul 14, 2013, at 5:32 PM, Jonathan Haddad j...@jonhaddad.com wrote:
 
 ...



Re: too many open files

2013-07-15 Thread Michał Michalski
It doesn't tell you anything if a file ends with ic-###, except
pointing out the SSTable version it uses (ic in this case).

Files related to a secondary index contain something like this in the
filename: KS-CF.IDX-NAME, while filenames in regular CFs do not contain
any dots except the one just before the file extension.


M.
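
That rule is easy to turn into a quick check. A sketch that walks a data
directory and splits files by Michał's dot rule (the flat pre-2.1
KS-CF-version-#-Component.db layout is assumed):

import os
from collections import Counter

def classify_sstable_files(data_dir):
    # Secondary-index components carry a dot inside the CF part of the
    # name (KS-CF.IDX_NAME-...), so they have more than one dot in total;
    # regular CF files only have the one before the extension.
    counts = Counter()
    for _root, _dirs, files in os.walk(data_dir):
        for name in files:
            counts['index' if name.count('.') > 1 else 'regular'] += 1
    return counts

print(classify_sstable_files('/var/lib/cassandra/data'))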

On 15.07.2013 09:38, Paul Ingalls wrote:

...


Re: too many open files

2013-07-15 Thread Brian Tarbox
Odd that this discussion happens now, as I'm also getting this error.  I get
a burst of error messages and then the system continues... with no apparent
ill effect.

I can't tell what the system was doing at the time; here is the stack.
BTW, OpsCenter says I only have 4 or 5 SSTables in each of my 6 CFs.

ERROR [ReadStage:62384] 2013-07-14 18:04:26,062
AbstractCassandraDaemon.java (line 135) Exception in thread
Thread[ReadStage:62384,5,main]
java.io.IOError: java.io.FileNotFoundException:
/tmp_vol/cassandra/data/dev_a/portfoliodao/dev_a-portfoliodao-hf-166-Data.db
(Too many open files)
at
org.apache.cassandra.io.util.CompressedSegmentedFile.getSegment(CompressedSegmentedFile.java:69)
at
org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:898)
at
org.apache.cassandra.db.columniterator.SSTableNamesIterator.init(SSTableNamesIterator.java:63)
at
org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:61)
at
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
at
org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:124)
at
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
at org.apache.cassandra.db.Table.getRow(Table.java:378)
at
org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
at
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException:
/tmp_vol/cassandra/data/dev_a/portfoliodao/dev_a-portfoliodao-hf-166-Data.db
(Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.init(RandomAccessFile.java:216)
at
org.apache.cassandra.io.util.RandomAccessReader.init(RandomAccessReader.java:67)
at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(CompressedRandomAccessReader.java:64)
at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:46)
at
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:41)
at
org.apache.cassandra.io.util.CompressedSegmentedFile.getSegment(CompressedSegmentedFile.java:63)
... 16 more



On Mon, Jul 15, 2013 at 7:23 AM, Michał Michalski mich...@opera.com wrote:

 ...







too many open files

2013-07-14 Thread Paul Ingalls
I'm running into a problem where instances of my cluster are hitting over 450K 
open files.  Is this normal for a 4 node 1.2.6 cluster with replication factor 
of 3 and about 50GB of data on each node?  I can push the file descriptor limit 
up, but I plan on having a much larger load so I'm wondering if I should be 
looking at something else….

Let me know if you need more info…

Paul




Re: too many open files

2013-07-14 Thread Jonathan Haddad
Are you using leveled compaction?  If so, what do you have the file size
set at?  If you're using the defaults, you'll have a ton of really small
files.  I believe Albert Tobey recommended using 256MB for the
table sstable_size_in_mb to avoid this problem.
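
In CQL3 on Cassandra 1.2 the change looks something like the statement
below (the keyspace/table names are placeholders, and 256 is the value
suggested above rather than a tested recommendation):

  ALTER TABLE my_ks.my_table
    WITH compaction = {'class': 'LeveledCompactionStrategy',
                       'sstable_size_in_mb': 256};

Newly compacted sstables are then written at the larger target size;
existing small files are only merged up as they take part in later
compactions.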


On Sun, Jul 14, 2013 at 5:10 PM, Paul Ingalls paulinga...@gmail.com wrote:

 I'm running into a problem where instances of my cluster are hitting over
 450K open files.  Is this normal for a 4 node 1.2.6 cluster with
 replication factor of 3 and about 50GB of data on each node?  I can push
 the file descriptor limit up, but I plan on having a much larger load so
 I'm wondering if I should be looking at something else….

 Let me know if you need more info…

 Paul





-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Too many open files and stopped compaction with many pending compaction tasks

2013-06-27 Thread Desimpel, Ignace
On a test with 3 cassandra servers version 1.2.5 with replication factor 1 and 
leveled compaction, I did a store last night and I did not see any problem with 
Cassandra. On all 3 machines compaction has been stopped for several hours 
already. However, one machine reports 650 pending compaction tasks (via jmx).
compaction_throughput_mb_per_sec is 0.
Concurrent_compactors is 3.
multithreaded_compaction = false.
No other load on these machines.

And when I start querying (using thrift), I get a 'too many open files' error 
on the machine with pending compaction tasks.

Limits.conf setting for nofile is 65536
Using 'lsof'  and  'wc -l' I get a count of  59577 files for Cassandra.
Total count of keyspace files on disk : 20464.

The 3 machines have an equal (+/-) data load of about 60 GB. I see that 2 
machines have no unleveled or just 1 sstable on any keyspace, but on the 
machine with troubles there is one keyspace having 670 unleveled sstables. 
Level sstable histo [670,28,106,14], thus 818 sstables. An 'ls' on that 
directory counts 5729 files, which corresponds to the 818 sstables (7 files 
per sstable).

After a restart of that machine I get 4037 open files for Cassandra, and 
compaction has restarted. Once finished I get SSTableCountPerLevel = [0, 10, 
109, 644].
Also, compaction reports speeds of 2.5 MB per sec. Seems slow to me. CPU less 
than 10%, disk 15% with peaks to 45% (15000 rpm scsi). 14 GB free memory.

So I am puzzled about the number of open files, the number of unleveled 
sstables, and the not-so-fast compaction.

Anything that can be done? Or to be done so that next time I can get more 
useful information?

Regards,
Ignace

Example output of lsof is :
java10968 root  483r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  484u   REG8,1  33554432 29229231 
/home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568123.log
java10968 root  485r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  486r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  487r   REG   8,17  39967253 14158943 
/media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Data.db
java10968 root  488r   REG   8,17  58641524 14158942 
/media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Index.db
java10968 root  489r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  490r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  491r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  492u   REG8,1  33554432 29230501 
/home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568134.log
java10968 root  493r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  494r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  495r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  497u   REG8,1  33554432 29242455 
/home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568126.log
java10968 root  498r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  499r   REG   8,17  39725539 14160146 
/media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-1019-Data.db
java10968 root  500r   REG   8,17  56369841 14160005 
/media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-1019-Index.db
java10968 root  502r   REG   8,17  10507031 14156174 
/media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
java10968 root  504r   REG   8,17   1989198 14163384 
/media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-922-Data.db
java10968 root  505r   REG   8,17  40679209 14161763

Re: Too many open files and stopped compaction with many pending compaction tasks

2013-06-27 Thread Jeremy Hanna
Are you on SSDs?

On 27 Jun 2013, at 14:24, Desimpel, Ignace ignace.desim...@nuance.com wrote:

 On a test with 3 cassandra servers version 1.2.5 with replication factor 1 
 and leveled compaction, I did a store last night and I did not see any 
 problem with Cassandra. On all 3 machines compaction has been stopped for 
 several hours already. However, one machine reports 650 pending compaction 
 tasks (via jmx).
 compaction_throughput_mb_per_sec is 0.
 Concurrent_compactors is 3.
 multithreaded_compaction = false.
 No other load on these machines.
  
 And when I start querying (using thrift), I get a ’too many open files’ error 
 on the machine with pending compaction tasks.
  
 Limits.conf setting for nofile is 65536
 Using ‘lsof’ and ‘wc -l’ I get a count of 59577 files for Cassandra.
 Total count of keyspace files on disk : 20464.
  
 The 3 machines have an equal (+/-) data load of about 60 GB. I see that 2 
 machines have no unleveled or just 1 sstable on any keyspace, but on the 
 machine with troubles there is one keyspace having 670 unleveled sstables. 
 Level sstable histo [670,28,106,14], thus 818 sstables. An ‘ls’ on that 
 directory counts 5729 files, which corresponds to the 818 sstables (7 
 files per sstable).
  
 After a restart of that machine I get 4037 open files for Cassandra, and 
 compaction has restarted. Once finished I get SSTableCountPerLevel = [0, 10, 
 109, 644].
 Also, compaction reports speeds of 2.5 MB per sec. Seems slow to me. CPU 
 less than 10%, disk 15% with peaks to 45% (15000 rpm scsi). 14 GB free memory.
  
 So I am puzzled about the number of open files, the number of unleveled 
 sstables, and the not-so-fast compaction.
  
 Anything that can be done? Or to be done so that next time I can get more 
 useful information?
  
 Regards,
 Ignace
  
 Example output of lsof is :
 java10968 root  483r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  484u   REG8,1  33554432 29229231 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568123.log
 java10968 root  485r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  486r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  487r   REG   8,17  39967253 14158943 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Data.db
 java10968 root  488r   REG   8,17  58641524 14158942 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Index.db
 java10968 root  489r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  490r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  491r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  492u   REG8,1  33554432 29230501 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568134.log
 java10968 root  493r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  494r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  495r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  497u   REG8,1  33554432 29242455 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568126.log
 java10968 root  498r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  499r   REG   8,17  39725539 14160146 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-1019-Data.db
 java10968 root  500r   REG   8,17  56369841 14160005 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-1019-Index.db
 java10968 root  502r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  504r   REG   8,17   1989198

RE: Too many open files and stopped compaction with many pending compaction tasks

2013-06-27 Thread Desimpel, Ignace
No: just two 15000 rpm scsi disks per machine. Each disk can handle more than 
100MB/sec of streaming data (tested). Iostat reports service times of 2 or 3 
milliseconds.
Ubuntu 12.04 LTS, 48 GB memory, 24 CPUs (Xeon X5670).
Cassandra is started with 8GB.

-Original Message-
From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com] 
Sent: donderdag 27 juni 2013 15:36
To: user@cassandra.apache.org
Subject: Re: Too many open files and stopped compaction with many pending 
compaction tasks

Are you on SSDs?

On 27 Jun 2013, at 14:24, Desimpel, Ignace ignace.desim...@nuance.com wrote:

 On a test with 3 cassandra servers version 1.2.5 with replication factor 1 
 and leveled compaction, I did a store last night and I did not see any 
 problem with Cassandra. On all 3 machines compaction has been stopped for 
 several hours already. However, one machine reports 650 pending compaction 
 tasks (via jmx).
 compaction_throughput_mb_per_sec is 0.
 Concurrent_compactors is 3.
 multithreaded_compaction = false.
 No other load on these machines.
  
 And when I start querying (using thrift), I get a 'too many open files' error 
 on the machine with pending compaction tasks.
  
 Limits.conf setting for nofile is 65536. Using 'lsof' and 'wc -l' I 
 get a count of 59577 files for Cassandra.
 Total count of keyspace files on disk : 20464.
  
 The 3 machines have an equal (+/-) data load of about 60 GB. I see that 2 
 machines have no unleveled or just 1 sstable on any keyspace, but on the 
 machine with troubles there is one keyspace having 670 unleveled sstables. 
 Level sstable histo [670,28,106,14], thus 818 sstables. An 'ls' on that 
 directory counts 5729 files, which corresponds to the 818 sstables (7 
 files per sstable).
  
 After a restart of that machine I get 4037 open files for Cassandra, and 
 compaction has restarted. Once finished I get SSTableCountPerLevel = [0, 10, 
 109, 644].
 Also, compaction reports speeds of 2.5 MB per sec. Seems slow to me. CPU 
 less than 10%, disk 15% with peaks to 45% (15000 rpm scsi). 14 GB free memory.
  
 So I am puzzled about the number of open files, the number of unleveled 
 sstables, and the not-so-fast compaction.
  
 Anything that can be done? Or to be done so that next time I can get more 
 useful information?
  
 Regards,
 Ignace
  
 Example output of lsof is :
 java10968 root  483r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  484u   REG8,1  33554432 29229231 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568123.log
 java10968 root  485r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  486r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  487r   REG   8,17  39967253 14158943 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Data.db
 java10968 root  488r   REG   8,17  58641524 14158942 
 /media/datadrive1/dbdatafile/Ks100K/ReverseStringFunction/Ks100K-ReverseStringFunction-ic-481-Index.db
 java10968 root  489r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  490r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  491r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  492u   REG8,1  33554432 29230501 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568134.log
 java10968 root  493r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  494r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  495r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  497u   REG8,1  33554432 29242455 
 /home/cassandra/deployed/data/cdi.cassandra.cdi/dbcommitlog/CommitLog-2-1372260568126.log
 java10968 root  498r   REG   8,17  10507031 14156174 
 /media/datadrive1/dbdatafile/Ks100K/ReverseLabelFunction/Ks100K-ReverseLabelFunction-ic-1512-Data.db
 java10968 root  499r   REG   8,17  39725539 14160146 
 /media/datadrive1

Too Many Open files error

2012-12-20 Thread santi kumar
While running nodetool repair, we are running into a
FileNotFoundException with a too many open files error. We increased the
ulimit value to 32768, and we still see this issue.

The number of files in the data directory is around 29500+.

If we further increase the ulimit, would it help?

While tracking the log for the specific file that threw the
FileNotFoundException, we observed that it was part of a compaction. Does
that have anything to do with it?

We are using 1.1.4.


Re: Too Many Open files error

2012-12-20 Thread Andrey Ilinykh
This bug is fixed in 1.1.5

Andrey


On Thu, Dec 20, 2012 at 12:01 AM, santi kumar santi.ku...@gmail.com wrote:

 While running nodetool repair, we are running into a
 FileNotFoundException with a too many open files error. We increased the
 ulimit value to 32768, and we still see this issue.

 The number of files in the data directory is around 29500+.

 If we further increase the ulimit, would it help?

 While tracking the log for the specific file that threw the
 FileNotFoundException, we observed that it was part of a compaction. Does
 that have anything to do with it?

 We are using 1.1.4.




Re: Too Many Open files error

2012-12-20 Thread santi kumar
Can you please give more details about this bug? A bug ID or something?

Now if I want to upgrade, is there any specific process or best practices?

Thanks
Santi



On Thu, Dec 20, 2012 at 1:44 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 This bug is fixed in 1.1.5

 Andrey


 On Thu, Dec 20, 2012 at 12:01 AM, santi kumar santi.ku...@gmail.com wrote:

 While running nodetool repair, we are running into a
 FileNotFoundException with a too many open files error. We increased the
 ulimit value to 32768, and we still see this issue.

 The number of files in the data directory is around 29500+.

 If we further increase the ulimit, would it help?

 While tracking the log for the specific file that threw the
 FileNotFoundException, we observed that it was part of a compaction. Does
 that have anything to do with it?

 We are using 1.1.4.





Re: Too Many Open files error

2012-12-20 Thread Andrey Ilinykh
On Thu, Dec 20, 2012 at 1:17 AM, santi kumar santi.ku...@gmail.com wrote:

 Can you please give more details about this bug? A bug ID or something?

https://issues.apache.org/jira/browse/CASSANDRA-4571


 Now if I want to upgrade, is there any specific process or best practices.

Migration from 1.1.4 to 1.1.5 is straightforward: install 1.1.5, stop 1.1.4
(nodetool drain), start 1.1.5.
http://www.datastax.com/docs/1.0/install/upgrading#completing-upgrade

Andrey



 Thanks
 Santi




 On Thu, Dec 20, 2012 at 1:44 PM, Andrey Ilinykh ailin...@gmail.com wrote:

 This bug is fixed in 1.1.5

 Andrey


 On Thu, Dec 20, 2012 at 12:01 AM, santi kumar santi.ku...@gmail.com wrote:

 While running nodetool repair, we are running into a
 FileNotFoundException with a too many open files error. We increased the
 ulimit value to 32768, and we still see this issue.

 The number of files in the data directory is around 29500+.

 If we further increase the ulimit, would it help?

 While tracking the log for the specific file that threw the
 FileNotFoundException, we observed that it was part of a compaction. Does
 that have anything to do with it?

 We are using 1.1.4.






Re: Too Many Open files error

2012-12-20 Thread aaron morton
 The number of files in the data directory is around 29500+. 
If you are using Levelled Compaction it is probably easier to set the ulimit to 
unlimited. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/12/2012, at 6:34 AM, Andrey Ilinykh ailin...@gmail.com wrote:

 
 
 
 On Thu, Dec 20, 2012 at 1:17 AM, santi kumar santi.ku...@gmail.com wrote:
 Can you please give more details about this bug? A bug ID or something?
 https://issues.apache.org/jira/browse/CASSANDRA-4571 
 
 Now if I want to upgrade, is there any specific process or best practices.
 Migration from 1.1.4 to 1.1.5 is straightforward: install 1.1.5, stop 1.1.4 
 (nodetool drain), start 1.1.5
 http://www.datastax.com/docs/1.0/install/upgrading#completing-upgrade
 
 Andrey
  
 
 Thanks
 Santi
 
 
 
 
 On Thu, Dec 20, 2012 at 1:44 PM, Andrey Ilinykh ailin...@gmail.com wrote:
 This bug is fixed in 1.1.5
 
 Andrey
 
 
 On Thu, Dec 20, 2012 at 12:01 AM, santi kumar santi.ku...@gmail.com wrote:
 While running nodetool repair, we are running into a FileNotFoundException 
 with a too many open files error. We increased the ulimit value to 32768, and 
 we still see this issue.
  
 The number of files in the data directory is around 29500+. 
  
 If we further increase the ulimit, would it help? 
  
 While tracking the log for the specific file that threw the 
 FileNotFoundException, we observed that it was part of a compaction. Does 
 that have anything to do with it?
 
 We are using 1.1.4.
 
 
 
 



Re: cassandra hit a wall: Too many open files (98567!)

2012-01-19 Thread Thorsten von Eicken
Ah, that explains part of the problem indeed. The whole situation still
doesn't make a lot of sense to me, unless the answer is that the default
sstable size with level compaction is just no good for large datasets. I
restarted cassandra a few hours ago and it had to open about 32k files
at start-up. Took about 15 minutes. That just can't be good...

I also noticed that when using compression the sstable size specified is
uncompressed, so the actual files tend to be smaller. I now upped the
sstable size to 100MB, which should result in about 40MB files in my
case. Is there a way I can compact some of the existing sstables that
are small? For example, I have a level-4 sstable that is 56KB in size
and many more that are rather small. Does nodetool compact do anything
with leveled compaction?

On 1/18/2012 2:39 AM, Janne Jalkanen wrote:

 1.0.6 has a file leak problem, fixed in 1.0.7. Perhaps this is the reason?

 https://issues.apache.org/jira/browse/CASSANDRA-3616

 /Janne

 On Jan 18, 2012, at 03:52 , dir dir wrote:

 Very interesting. Why do you have so many files open? What kind of
 system are you building that opens so many files? Would you tell us?
 Thanks...


 On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken
 t...@rightscale.com wrote:

 I'm running a single node cassandra 1.0.6 server which hit a wall
 yesterday:

 ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[CompactionExecutor:2918,1,main] java.io.IOError:
 java.io.FileNotFoundException:
 /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db
 (Too many
 open files in system)

 After that it stopped working and just sat there with this error
 (understandable). I did an lsof and saw that it had 98567 open files,
 yikes! An ls in the data directory shows 234011 files. After restarting
 it spent about 5 hours compacting, then quieted down. About 173k files
 left in the data directory. I'm using leveled compaction (with
 compression). I looked into the json of the two large CFs and gen 0 is
 empty, most sstables are gen 3 & 4. I have a total of about 150GB of data
 (compressed). Almost all the SStables are around 3MB in size. Aren't
 they supposed to get 10x bigger at higher gen's?

 This situation can't be healthy, can it? Suggestions?





Re: cassandra hit a wall: Too many open files (98567!)

2012-01-18 Thread Sylvain Lebresne
On Fri, Jan 13, 2012 at 8:01 PM, Thorsten von Eicken t...@rightscale.com 
wrote:
 I'm running a single node cassandra 1.0.6 server which hit a wall yesterday:

 ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[CompactionExecutor:2918,1,main] java.io.IOError:
 java.io.FileNotFoundException:
 /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many
 open files in system)

 After that it stopped working and just sat there with this error
 (understandable). I did an lsof and saw that it had 98567 open files,
 yikes! An ls in the data directory shows 234011 files. After restarting
 it spent about 5 hours compacting, then quieted down. About 173k files
 left in the data directory. I'm using leveled compaction (with compression). I
 looked into the json of the two large CFs and gen 0 is empty, most
 sstables are gen 3 & 4. I have a total of about 150GB of data
 (compressed). Almost all the SStables are around 3MB in size. Aren't
 they supposed to get 10x bigger at higher gen's?

No, with leveled compaction, the (max) size of sstables is fixed
whatever the generation is (the default is 5MB, but it's 5MB of
uncompressed data (we may change that though) so 3MB sounds about
right).
What changes between generations is the number of sstables it can
contain. Gen 1 can have 10 sstables (it can have more but only
temporarily), Gen 2 can have 100, Gen 3 can have 1000 etc.. So again,
that most sstables are in gen 3 and 4 is expected too.

 This situation can't be healthy, can it? Suggestions?

Leveled compaction uses lots of files (the number is proportional to
the amount of data). It is not necessarily a big problem as modern OSes
deal with large numbers of open files fairly well (as far as I know at
least). I would just up the file descriptor ulimit and not worry too
much about it, unless you have reasons to believe that it's an actual
descriptor leak (but given the number of files you have, the number of
open ones doesn't seem off so I don't think there is one here) or that
this has performance impacts.

--
Sylvain
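
As a back-of-the-envelope check of Sylvain's point against the numbers
reported above (assuming ~3MB per compressed sstable and roughly 5-7
component files per sstable, both of which vary by version and settings):

  150 GB / ~3 MB per sstable   ≈ 50,000 sstables
  50,000 sstables x ~5 files   ≈ 250,000 files

which is the same order of magnitude as the 234k files observed in the
data directory, so the raw file count is consistent with leveled
compaction at the default size rather than with a descriptor leak.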


Re: cassandra hit a wall: Too many open files (98567!)

2012-01-18 Thread Janne Jalkanen

1.0.6 has a file leak problem, fixed in 1.0.7. Perhaps this is the reason?

https://issues.apache.org/jira/browse/CASSANDRA-3616

/Janne

On Jan 18, 2012, at 03:52 , dir dir wrote:

 Very interesting. Why do you have so many files open? What kind of
 system are you building that opens so many files? Would you tell us?
 Thanks...
 
 
 On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken t...@rightscale.com 
 wrote:
 I'm running a single node cassandra 1.0.6 server which hit a wall yesterday:
 
 ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[CompactionExecutor:2918,1,main] java.io.IOError:
 java.io.FileNotFoundException:
 /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many
 open files in system)
 
 After that it stopped working and just sat there with this error
 (understandable). I did an lsof and saw that it had 98567 open files,
 yikes! An ls in the data directory shows 234011 files. After restarting
 it spent about 5 hours compacting, then quieted down. About 173k files
 left in the data directory. I'm using leveled compaction (with compression). I
 looked into the json of the two large CFs and gen 0 is empty, most
 sstables are gen 3 & 4. I have a total of about 150GB of data
 (compressed). Almost all the SStables are around 3MB in size. Aren't
 they supposed to get 10x bigger at higher gen's?
 
 This situation can't be healthy, can it? Suggestions?
 



Re: cassandra hit a wall: Too many open files (98567!)

2012-01-17 Thread dir dir
Very interesting. Why do you have so many files open? What kind of
system are you building that opens so many files? Would you tell us?
Thanks...


On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken t...@rightscale.com wrote:

 I'm running a single node cassandra 1.0.6 server which hit a wall
 yesterday:

 ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[CompactionExecutor:2918,1,main] java.io.IOError:
 java.io.FileNotFoundException:
 /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many
 open files in system)

 After that it stopped working and just sat there with this error
 (understandable). I did an lsof and saw that it had 98567 open files,
 yikes! An ls in the data directory shows 234011 files. After restarting
 it spent about 5 hours compacting, then quieted down. About 173k files
 left in the data directory. I'm using leveled compaction (with compression). I
 looked into the json of the two large CFs and gen 0 is empty, most
 sstables are gen 3 & 4. I have a total of about 150GB of data
 (compressed). Almost all the SStables are around 3MB in size. Aren't
 they supposed to get 10x bigger at higher gen's?

 This situation can't be healthy, can it? Suggestions?



Re: cassandra hit a wall: Too many open files (98567!)

2012-01-15 Thread aaron morton
That sounds like too many sstables. 

Out of interest, were you using multithreaded compaction? Just wondering about 
this: 
https://issues.apache.org/jira/browse/CASSANDRA-3711

Can you set the file handles to unlimited? 

Can you provide some more info on what you see in the data dir, in case it is 
a bug in leveled compaction. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/01/2012, at 8:01 AM, Thorsten von Eicken wrote:

 I'm running a single node cassandra 1.0.6 server which hit a wall yesterday:
 
 ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
 AbstractCassandraDaemon.java (line 133) Fatal exception in thread
 Thread[CompactionExecutor:2918,1,main] java.io.IOError:
 java.io.FileNotFoundException:
 /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many
 open files in system)
 
 After that it stopped working and just sat there with this error
 (understandable). I did an lsof and saw that it had 98567 open files,
 yikes! An ls in the data directory shows 234011 files. After restarting
 it spent about 5 hours compacting, then quieted down. About 173k files
 left in the data directory. I'm using leveled compaction (with compression). I
 looked into the json of the two large CFs and gen 0 is empty, most
 sstables are gen 3 & 4. I have a total of about 150GB of data
 (compressed). Almost all the SStables are around 3MB in size. Aren't
 they supposed to get 10x bigger at higher gen's?
 
 This situation can't be healthy, can it? Suggestions?



cassandra hit a wall: Too many open files (98567!)

2012-01-13 Thread Thorsten von Eicken
I'm running a single node cassandra 1.0.6 server which hit a wall yesterday:

ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
AbstractCassandraDaemon.java (line 133) Fatal exception in thread
Thread[CompactionExecutor:2918,1,main] java.io.IOError:
java.io.FileNotFoundException:
/mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many
open files in system)

After that it stopped working and just sat there with this error
(understandable). I did an lsof and saw that it had 98567 open files,
yikes! An ls in the data directory shows 234011 files. After restarting
it spent about 5 hours compacting, then quieted down. About 173k files
left in the data directory. I'm using leveled compaction (with compression). I
looked into the json of the two large CFs and gen 0 is empty, most
sstables are gen 3 & 4. I have a total of about 150GB of data
(compressed). Almost all the SStables are around 3MB in size. Aren't
they supposed to get 10x bigger at higher gen's?

This situation can't be healthy, can it? Suggestions?


Too many open files

2011-07-27 Thread Donna Li
All:

What does the following error mean? One of my cassandra servers prints
this error, and nodetool shows the state of the server as down. Netstat
shows only a few open sockets.

 

WARN [main] 2011-07-27 16:14:04,872 CustomTThreadPoolServer.java (line
104) Transport error occurred during acceptance of message.

org.apache.thrift.transport.TTransportException:
java.net.SocketException: Too many open files

 at
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:
124)

 at
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:
35)

 at
org.apache.thrift.transport.TServerTransport.accept(TServerTransport.jav
a:31)

 at
org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadP
oolServer.java:98)

 at
org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:1
83)

 at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:22
4)

Caused by: java.net.SocketException: Too many open files

 at java.net.PlainSocketImpl.socketAccept(Native Method)

 at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)

 at java.net.ServerSocket.implAccept(ServerSocket.java:453)

 at java.net.ServerSocket.accept(ServerSocket.java:421)

 at
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:
119)

 ... 5 more

 

Best Regards

Donna li



Re: Too many open files

2011-07-27 Thread Peter Schuller
 What does the following error mean? One of my cassandra servers prints this
 error, and nodetool shows the state of the server as down. Netstat
 shows only a few open sockets.

The operating system enforced limits have been hit, so Cassandra is
unable to create additional file descriptors (so it can't open files,
TCP connections, etc).

The correct fix is to ensure that Cassandra is running with higher
operating system enforced limits (see ulimit,
/etc/security/limits.conf, etc).

Cassandra is not expected to deal with this type of error gracefully
and you will want to restart nodes that run into this.

-- 
/ Peter Schuller (@scode on twitter)
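
For example (with pam_limits enabled; the user name and the value 100000
are placeholders to adapt), lines like these in /etc/security/limits.conf
raise the limit for the user Cassandra runs as:

  cassandra  soft  nofile  100000
  cassandra  hard  nofile  100000

A fresh login session (or service restart) is required before the new
limits take effect.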


Re: Too many open files during Repair operation

2011-07-19 Thread Sameer Farooqui
I'm guessing you've seen this already?
http://www.datastax.com/docs/0.8/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files

Check out the # of file descriptors opened with the lsof -n | grep java
command.
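
If you'd rather poll this over JMX than shell out to lsof, the JVM itself
exposes the same counters on Unix through the OperatingSystem MXBean. A
minimal sketch, assuming Cassandra's default JMX port 7199 and no JMX
authentication:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class FdCheck {
    public static void main(String[] args) throws Exception {
        // Host and port are assumptions; 7199 is Cassandra's default JMX port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // These attributes are exposed by the JVM on Unix platforms.
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            long open = (Long) mbs.getAttribute(os, "OpenFileDescriptorCount");
            long max = (Long) mbs.getAttribute(os, "MaxFileDescriptorCount");
            System.out.println("open fds: " + open + " of " + max);
        } finally {
            jmxc.close();
        }
    }
}

Unlike lsof output, which also lists memory-mapped files, this counts only
real descriptors held by the JVM, so the two numbers can differ.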



On Tue, Jul 19, 2011 at 8:30 AM, cbert...@libero.it wrote:

 Hi all.
 In production we want to run nodetool repair, but each time we do it we get
 the too many open files error.
 We've increased the number of available FDs for Cassandra to 8192, but we
 still get the same error after a few seconds.
 Should I increase it more?

 WARN [Thread-7] 2011-07-19 12:34:00,348 CustomTThreadPoolServer.java (line
 131) Transport error occurred during acceptance of message.
 org.apache.thrift.transport.TTransportException: java.net.SocketException:
 Too
 many open files
at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.
 java:124)
at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl
 (TCustomServerSocket.java:68)
at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl
 (TCustomServerSocket.java:39)
at org.apache.thrift.transport.TServerTransport.accept
 (TServerTransport.java:31)
at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve
 (CustomTThreadPoolServer.java:121)
at org.apache.cassandra.thrift.CassandraDaemon$ThriftServer.run
 (CassandraDaemon.java:155)
 Caused by: java.net.SocketException: Too many open files
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
at java.net.ServerSocket.implAccept(ServerSocket.java:462)
at java.net.ServerSocket.accept(ServerSocket.java:430)
at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.
 java:119)
... 5 more


 nodetool repair keyspacename -h host

 Cassandra 0.7.5, 1 cluster, 5 nodes. Each node give the same output.
 One more question: when repair starts throwing this kind of exception (very
 fast) and we stop the repair process ... is it dangerous for the data?

 Best Regards

 Carlo



Re: Too many open files during Repair operation

2011-07-19 Thread Attila Babo
If you are using Linux, especially Ubuntu, check the linked document
below. This is my favorite: Using sudo has side effects in terms of
open file limits. On Ubuntu they’ll be reset to 1024, no matter what’s
set in /etc/security/limits.conf

http://wiki.basho.com/Open-Files-Limit.html

/Attila


Re: too many open files - maybe a fd leak in indexslicequeries

2011-04-05 Thread Jonathan Ellis
sounds like they haven't been munmapped yet.  try forcing a GC.
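
For anyone wanting to try that without bouncing the node: a GC can be
triggered through the same JMX endpoint that nodetool uses. A minimal
sketch (host and port are assumptions; 7199 is the default JMX port):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceGc {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            // Invokes the standard java.lang:type=Memory gc() operation,
            // i.e. the equivalent of System.gc() inside the Cassandra JVM.
            mbs.invoke(new ObjectName("java.lang:type=Memory"),
                    "gc", null, null);
        } finally {
            jmxc.close();
        }
    }
}

If the deleted sstables' descriptors are only being held pending cleanup,
they should be released once the collection completes.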

On Sat, Apr 2, 2011 at 5:38 AM, Roland Gude roland.g...@yoochoose.com wrote:

 Hi,

 The open file limit is 1024.
 SSTable count is somewhere around 20 or so; thread count is in the same order 
 of magnitude, I guess.
 But lsof shows that deleted sstables still have open file handles. This seems 
 to be the issue, as this number keeps growing.
 Any ideas?

 Roland.

 -Original Message-
 From: Jonathan Ellis [mailto:jbel...@gmail.com]
 Sent: Friday, 1 April 2011 06:07
 To: user@cassandra.apache.org
 Cc: Roland Gude; Juergen Link; Johannes Hoerle
 Subject: Re: too many open files - maybe a fd leak in indexslicequeries

 Index queries (ColumnFamilyStore.scan) don't do any low-level i/o
 themselves, they go through CFS.getColumnFamily, which is what normal
 row fetches also go through.  So if there is a leak there it's
 unlikely to be specific to indexes.

 What is your open-file limit (remember that sockets count towards
 this), thread count, sstable count?

 On Thu, Mar 31, 2011 at 4:15 PM, Roland Gude roland.g...@yoochoose.com 
 wrote:
 I experience something that looks exactly like
 https://issues.apache.org/jira/browse/CASSANDRA-1178

 On cassandra 0.7.3 when using index slice queries (lots of them)

 It crashes multiple nodes, rendering the cluster useless. But I have no
 clue where to look to see if index queries still leak fds



 Does anybody know about it?

 Where could I look?



 Greetings,

 roland



 --

 YOOCHOOSE GmbH



 Roland Gude

 Software Engineer



 Im Mediapark 8, 50670 Köln



 +49 221 4544151 (Tel)

 +49 221 4544159 (Fax)

 +49 171 7894057 (Mobil)





 Email: roland.g...@yoochoose.com

 WWW: www.yoochoose.com



 YOOCHOOSE GmbH

 Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann

 Handelsregister: Amtsgericht Köln HRB 65275

 Ust-Ident-Nr: DE 264 773 520

 Sitz der Gesellschaft: Köln





 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com






-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


RE: too many open files - maybe a fd leak in indexslicequeries

2011-04-02 Thread Roland Gude

Hi,

The open file limit is 1024.
SSTable count is somewhere around 20 or so; thread count is in the same order of 
magnitude, I guess.
But lsof shows that deleted sstables still have open file handles. This seems 
to be the issue, as this number keeps growing.
Any ideas?

Roland.

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Friday, 1 April 2011 06:07
To: user@cassandra.apache.org
Cc: Roland Gude; Juergen Link; Johannes Hoerle
Subject: Re: too many open files - maybe a fd leak in indexslicequeries

Index queries (ColumnFamilyStore.scan) don't do any low-level i/o
themselves, they go through CFS.getColumnFamily, which is what normal
row fetches also go through.  So if there is a leak there it's
unlikely to be specific to indexes.

What is your open-file limit (remember that sockets count towards
this), thread count, sstable count?

On Thu, Mar 31, 2011 at 4:15 PM, Roland Gude roland.g...@yoochoose.com wrote:
 I experience something that looks exactly like
 https://issues.apache.org/jira/browse/CASSANDRA-1178

 On cassandra 0.7.3 when using index slice queries (lots of them)

 It crashes multiple nodes, rendering the cluster useless. But I have no
 clue where to look to see if index queries still leak fds



 Does anybody know about it?

 Where could I look?



 Greetings,

 roland



 --

 YOOCHOOSE GmbH



 Roland Gude

 Software Engineer



 Im Mediapark 8, 50670 Köln



 +49 221 4544151 (Tel)

 +49 221 4544159 (Fax)

 +49 171 7894057 (Mobil)





 Email: roland.g...@yoochoose.com

 WWW: www.yoochoose.com



 YOOCHOOSE GmbH

 Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann

 Handelsregister: Amtsgericht Köln HRB 65275

 Ust-Ident-Nr: DE 264 773 520

 Sitz der Gesellschaft: Köln





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com




too many open files - maybe a fd leak in indexslicequeries

2011-03-31 Thread Roland Gude
I experience something that looks exactly like 
https://issues.apache.org/jira/browse/CASSANDRA-1178
On cassandra 0.7.3 when using index slice queries (lots of them)
It crashes multiple nodes, rendering the cluster useless. But I have no clue 
where to look to see if index queries still leak fds.

Does anybody know about it?
Where could I look?

Greetings,
roland

--
YOOCHOOSE GmbH

Roland Gude
Software Engineer

Im Mediapark 8, 50670 Köln

+49 221 4544151 (Tel)
+49 221 4544159 (Fax)
+49 171 7894057 (Mobil)


Email: roland.g...@yoochoose.com
WWW: www.yoochoose.comhttp://www.yoochoose.com/

YOOCHOOSE GmbH
Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann
Handelsregister: Amtsgericht Köln HRB 65275
Ust-Ident-Nr: DE 264 773 520
Sitz der Gesellschaft: Köln



Re: too many open files - maybe a fd leak in indexslicequeries

2011-03-31 Thread Jonathan Ellis
Index queries (ColumnFamilyStore.scan) don't do any low-level i/o
themselves, they go through CFS.getColumnFamily, which is what normal
row fetches also go through.  So if there is a leak there it's
unlikely to be specific to indexes.

What is your open-file limit (remember that sockets count towards
this), thread count, sstable count?

On Thu, Mar 31, 2011 at 4:15 PM, Roland Gude roland.g...@yoochoose.com wrote:
 I experience something that looks exactly like
 https://issues.apache.org/jira/browse/CASSANDRA-1178

 On cassandra 0.7.3 when using index slice queries (lots of them)

 It crashes multiple nodes, rendering the cluster useless. But I have no
 clue where to look to see if index queries still leak fds



 Does anybody know about it?

 Where could I look?



 Greetings,

 roland



 --

 YOOCHOOSE GmbH



 Roland Gude

 Software Engineer



 Im Mediapark 8, 50670 Köln



 +49 221 4544151 (Tel)

 +49 221 4544159 (Fax)

 +49 171 7894057 (Mobil)





 Email: roland.g...@yoochoose.com

 WWW: www.yoochoose.com



 YOOCHOOSE GmbH

 Geschäftsführer: Dr. Uwe Alkemper, Michael Friedmann

 Handelsregister: Amtsgericht Köln HRB 65275

 Ust-Ident-Nr: DE 264 773 520

 Sitz der Gesellschaft: Köln





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Amin Sakka, Novapost
I increased the amount of allowed file descriptors to unlimited.
Now, I get exactly the same exception after 3.500.000 rows:

CustomTThreadPoolServer.java (line 104) Transport error occurred during
acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Too many open files

What worries me is this / by zero exception when I try to restart
cassandra! At least, I want to back up the 3.500.000 rows to then continue
my insertion. Is there a way to do this?

 Exception encountered during startup.
java.lang.ArithmeticException: / by zero
 at
org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)


Thanks.





2010/12/15 Jake Luciani jak...@gmail.com


 http://www.riptano.com/docs/0.6/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files



 On Wed, Dec 15, 2010 at 11:13 AM, Amin Sakka, Novapost 
 amin.sa...@novapost.fr wrote:

 Hello,
 I'm using cassandra 0.7.0 rc1, a single node configuration, replication
 factor 1, random partitioner, 2 GB heap size.
 I ran my hector client to insert 5.000.000 rows but after a couple of
 hours, the following Exception occurs:


  WARN [main] 2010-12-15 16:38:53,335 CustomTThreadPoolServer.java (line
 104) Transport error occurred during acceptance of message.
 org.apache.thrift.transport.TTransportException: java.net.SocketException:
 Too many open files
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
 at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:67)
  at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:38)
 at
 org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
  at
 org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
 at
 org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:120)
  at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:229)
 at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
 Caused by: java.net.SocketException: Too many open files
 at java.net.PlainSocketImpl.socketAccept(Native Method)
 at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
  at java.net.ServerSocket.implAccept(ServerSocket.java:453)
 at java.net.ServerSocket.accept(ServerSocket.java:421)
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)


 When I try to restart Cassandra, I have the following exception:


 ERROR 16:42:26,573 Exception encountered during startup.
 java.lang.ArithmeticException: / by zero
 at
 org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)
  at
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:284)
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:200)
  at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:225)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
  at
 org.apache.cassandra.db.ColumnFamilyStore.addIndex(ColumnFamilyStore.java:306)
 at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:246)
  at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:437)
  at org.apache.cassandra.db.Table.initCf(Table.java:341)
 at org.apache.cassandra.db.Table.init(Table.java:283)
  at org.apache.cassandra.db.Table.open(Table.java:114)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:138)
  at
 org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:216)
  at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)


 I am looking for advice on how to debug this.

 Thanks,

 --

 Amin








-- 

Amin SAKKA
Research and Development Engineer
32 rue de Paradis, 75010 Paris
*Tel:* +33 (0)6 34 14 19 25
*Mail:* amin.sa...@novapost.fr
*Web:* www.novapost.fr / www.novapost-rh.fr


Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Germán Kondolf
Be careful with the unlimited value on ulimit; you could end up with an
unresponsive server... I mean, you could not even connect via ssh if you
don't have enough handles.

On Thu, Dec 16, 2010 at 9:59 AM, Amin Sakka, Novapost 
amin.sa...@novapost.fr wrote:


 I increased the amount of allowed file descriptors to unlimited.
 Now, I get exactly the same exception after 3.500.000 rows:

 CustomTThreadPoolServer.java (line 104) Transport error occurred during
 acceptance of message.
 org.apache.thrift.transport.TTransportException:
 java.net.SocketException: Too many open files

 What worries me is this / by zero exception when I try to restart
 cassandra! At least, I want to back up the 3.500.000 rows to then continue
 my insertion. Is there a way to do this?

  Exception encountered during startup.
 java.lang.ArithmeticException: / by zero
  at
 org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)


 Thanks.





 2010/12/15 Jake Luciani jak...@gmail.com


 http://www.riptano.com/docs/0.6/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files



 On Wed, Dec 15, 2010 at 11:13 AM, Amin Sakka, Novapost 
 amin.sa...@novapost.fr wrote:

 Hello,
 I'm using cassandra 0.7.0 rc1, a single node configuration, replication
 factor 1, random partitioner, 2 GB heap size.
 I ran my hector client to insert 5.000.000 rows but after a couple of
 hours, the following Exception occurs:


  WARN [main] 2010-12-15 16:38:53,335 CustomTThreadPoolServer.java (line
 104) Transport error occurred during acceptance of message.
 org.apache.thrift.transport.TTransportException:
 java.net.SocketException: Too many open files
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
 at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:67)
  at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:38)
 at
 org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
  at
 org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
 at
 org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:120)
  at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:229)
 at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
 Caused by: java.net.SocketException: Too many open files
 at java.net.PlainSocketImpl.socketAccept(Native Method)
 at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
  at java.net.ServerSocket.implAccept(ServerSocket.java:453)
 at java.net.ServerSocket.accept(ServerSocket.java:421)
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)


 When I try to restart Cassandra, I have the following exception:


 ERROR 16:42:26,573 Exception encountered during startup.
 java.lang.ArithmeticException: / by zero
 at
 org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)
  at
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:284)
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:200)
  at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:225)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
  at
 org.apache.cassandra.db.ColumnFamilyStore.addIndex(ColumnFamilyStore.java:306)
 at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:246)
  at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:437)
  at org.apache.cassandra.db.Table.initCf(Table.java:341)
 at org.apache.cassandra.db.Table.init(Table.java:283)
  at org.apache.cassandra.db.Table.open(Table.java:114)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:138)
  at
 org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:216)
  at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)


 I am looking for advice on how to debug this.

 Thanks,

 --

 Amin








 --

 Amin SAKKA
 Research and Development Engineer
 32 rue de Paradis, 75010 Paris
 *Tel:* +33 (0)6 34 14 19 25
 *Mail:* amin.sa...@novapost.fr
 *Web:* www.novapost.fr / www.novapost-rh.fr







-- 
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf
http://www.facebook.com/germanklf
http://ar.linkedin.com/in/germankondolf


Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Jake Luciani
how many sstable Data.db files do you see in your system and how big are
they?

Also, how big are the rows you are inserting?


On Thu, Dec 16, 2010 at 7:59 AM, Amin Sakka, Novapost 
amin.sa...@novapost.fr wrote:


 I increased the amount of allowed file descriptors to unlimited.
 Now, I get exactly the same exception after 3.500.000 rows:

 CustomTThreadPoolServer.java (line 104) Transport error occurred during
 acceptance of message.
 org.apache.thrift.transport.TTransportException:
 java.net.SocketException: Too many open files

 What worries me is this / by zero exception when I try to restart
 cassandra! At least, I want to back up the 3.500.000 rows to then continue
 my insertion. Is there a way to do this?

  Exception encountered during startup.
 java.lang.ArithmeticException: / by zero
  at
 org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)


 Thanks.





 2010/12/15 Jake Luciani jak...@gmail.com


 http://www.riptano.com/docs/0.6/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files



 On Wed, Dec 15, 2010 at 11:13 AM, Amin Sakka, Novapost 
 amin.sa...@novapost.fr wrote:

 Hello,
 I'm using cassandra 0.7.0 rc1, a single node configuration, replication
 factor 1, random partitioner, 2 GB heap size.
 I ran my hector client to insert 5.000.000 rows but after a couple of
 hours, the following Exception occurs:


  WARN [main] 2010-12-15 16:38:53,335 CustomTThreadPoolServer.java (line
 104) Transport error occurred during acceptance of message.
 org.apache.thrift.transport.TTransportException:
 java.net.SocketException: Too many open files
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
 at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:67)
  at
 org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:38)
 at
 org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
  at
 org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
 at
 org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:120)
  at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:229)
 at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
 Caused by: java.net.SocketException: Too many open files
 at java.net.PlainSocketImpl.socketAccept(Native Method)
 at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
  at java.net.ServerSocket.implAccept(ServerSocket.java:453)
 at java.net.ServerSocket.accept(ServerSocket.java:421)
  at
 org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)


 When I try to restart Cassandra, I have the following exception:


 ERROR 16:42:26,573 Exception encountered during startup.
 java.lang.ArithmeticException: / by zero
 at
 org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)
  at
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:284)
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:200)
  at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:225)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
  at
 org.apache.cassandra.db.ColumnFamilyStore.addIndex(ColumnFamilyStore.java:306)
 at
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:246)
  at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
 at
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:437)
  at org.apache.cassandra.db.Table.initCf(Table.java:341)
 at org.apache.cassandra.db.Table.init(Table.java:283)
  at org.apache.cassandra.db.Table.open(Table.java:114)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:138)
  at
 org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
 at
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:216)
  at
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)


 I am looking for advice on how to debug this.

 Thanks,

 --

 Amin








 --

 Amin SAKKA
 Research and Development Engineer
 32 rue de Paradis, 75010 Paris
Tel: +33 (0)6 34 14 19 25
Mail: amin.sa...@novapost.fr
Web: www.novapost.fr / www.novapost-rh.fr







Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Ryan King
Are you creating a new connection for each row you insert (and if so
are you closing it)?

-ryan
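
For reference, the leak Ryan is describing, and the fix, look roughly like
this against the raw Thrift API. This is a minimal sketch: host/port and the
insert call are placeholders, and framed transport matches the 0.7 default
(unframed setups differ).

    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    // Leaky pattern: a new connection per row, never closed.
    // Each iteration pins a socket fd on both client and server:
    //
    //   for each row:
    //       TTransport t = new TFramedTransport(new TSocket("127.0.0.1", 9160));
    //       t.open();
    //       new Cassandra.Client(new TBinaryProtocol(t)).insert(...);
    //       // missing t.close() -> fd leak
    //
    // Reuse one connection for the whole run instead:
    TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
    transport.open();
    Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
    try {
        // loop over all rows here, calling client.insert(...) or batch_mutate(...)
    } finally {
        transport.close();   // release the fd deterministically
    }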

On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost
amin.sa...@novapost.fr wrote:
 [snip]







Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Nate McCall
You probably want to switch to using mutator#addInsertion for some
number of iterations (start with 1000 and adjust as needed), then
calling execute(). This will be much more efficient.
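
For the archive: with Hector's HFactory-style API, the batching Nate
describes looks roughly like the sketch below. The cluster, keyspace, column
family, and column names are illustrative, and the exact calls depend on the
Hector release in use.

    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    Cluster cluster = HFactory.getOrCreateCluster("test-cluster", "localhost:9160");
    Keyspace keyspace = HFactory.createKeyspace("MyKeyspace", cluster);
    Mutator<String> mutator = HFactory.createMutator(keyspace, StringSerializer.get());

    for (int i = 1; i <= 5000000; i++) {
        // Queue the insertion client-side; nothing is sent yet.
        mutator.addInsertion("row-" + i, "MyColumnFamily",
                HFactory.createStringColumn("name", "value-" + i));
        if (i % 1000 == 0) {
            mutator.execute();   // one batch_mutate call for the last 1000 rows
        }
    }
    mutator.execute();           // flush any remainder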

On Thu, Dec 16, 2010 at 11:39 AM, Amin Sakka, Novapost
amin.sa...@novapost.fr wrote:

 I'm using a single client instance (using Hector) and a single connection to
 Cassandra.
 For each insertion I'm using a new mutator and then I release it.
 I have 473 SSTable Data.db files; the average size of each is 30 MB.



 2010/12/16 Ryan King r...@twitter.com

 Are you creating a new connection for each row you insert (and if so
 are you closing it)?

 -ryan

 On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost
 amin.sa...@novapost.fr wrote:
  [snip]



 --
 Amin






Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Germán Kondolf
Indeed, Hector has a connection pool behind it; I think it uses 50
connections per node. It also uses a node to discover the others, I assume,
since I saw connections from my app to nodes that I didn't configure in
Hector.

So you may want to check the fds at the OS level to see if there is a
bottleneck there.

On Thu, Dec 16, 2010 at 2:39 PM, Amin Sakka, Novapost
amin.sa...@novapost.fr wrote:

 I'm using a single client instance (using Hector) and a single connection to
 Cassandra.
 For each insertion I'm using a new mutator and then I release it.
 I have 473 SSTable Data.db files; the average size of each is 30 MB.



 2010/12/16 Ryan King r...@twitter.com

 Are you creating a new connection for each row you insert (and if so
 are you closing it)?

 -ryan

 On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost
 amin.sa...@novapost.fr wrote:
  [snip]



 --
 Amin







-- 
//GK
german.kond...@gmail.com
// sites
http://twitter.com/germanklf
http://www.facebook.com/germanklf
http://ar.linkedin.com/in/germankondolf


Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-15 Thread Amin Sakka, Novapost
Hello,
I'm using Cassandra 0.7.0 rc1, a single node configuration, replication
factor 1, random partitioner, 2 GB heap size.
I ran my Hector client to insert 5.000.000 rows, but after a couple of
hours the following exception occurs:


 WARN [main] 2010-12-15 16:38:53,335 CustomTThreadPoolServer.java (line 104)
Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException:
Too many open files
 at
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
at
org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:67)
 at
org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:38)
at
org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
 at
org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
at
org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:120)
 at
org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:229)
at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
Caused by: java.net.SocketException: Too many open files
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
 at java.net.ServerSocket.implAccept(ServerSocket.java:453)
at java.net.ServerSocket.accept(ServerSocket.java:421)
 at
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)


When I try to restart Cassandra, I get the following exception:


ERROR 16:42:26,573 Exception encountered during startup.
java.lang.ArithmeticException: / by zero
at
org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)
 at
org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:284)
at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:200)
 at
org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:225)
at
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
 at
org.apache.cassandra.db.ColumnFamilyStore.addIndex(ColumnFamilyStore.java:306)
at
org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:246)
 at
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
at
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:437)
 at org.apache.cassandra.db.Table.initCf(Table.java:341)
at org.apache.cassandra.db.Table.init(Table.java:283)
 at org.apache.cassandra.db.Table.open(Table.java:114)
at
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:138)
 at
org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
at
org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:216)
 at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)


I am looking for advice on how to debug this. Any ideas, please?

Thanks,
-- 

Amin


Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-15 Thread Jake Luciani
http://www.riptano.com/docs/0.6/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files



On Wed, Dec 15, 2010 at 11:13 AM, Amin Sakka, Novapost 
amin.sa...@novapost.fr wrote:

 [snip]







Re: too many open files 0.7.0 beta1

2010-08-26 Thread Aaron Morton
That looks like it. I've pushed the limits up to 65k and turned down the
testing for now. Otherwise machines were dropping like flies.

Thanks.
Aaron

On 26 Aug, 2010, at 04:16 PM, Dan Washusen d...@reactive.org wrote:

 Maybe you're seeing this:
 https://issues.apache.org/jira/browse/CASSANDRA-1416

 On Thu, Aug 26, 2010 at 2:05 PM, Aaron Morton aa...@thelastpickle.com wrote:

  [snip]



too many open files 0.7.0 beta1

2010-08-25 Thread Aaron Morton
Under 0.7.0 beta1 am seeing cassandra run out of file handles...

Caused by: java.io.FileNotFoundException:
/local1/junkbox/cassandra/data/junkbox.wetafx.co.nz/ObjectIndex-e-31-Index.db
(Too many open files)
   at java.io.RandomAccessFile.open(Native Method)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:98)
   at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:142)

If I look at the file descriptors for the process I can see it already has
1,958 open to the file:

sudo ls -l /proc/20862/fd | grep "ObjectIndex-e-31-Data.db" | wc -l
1958

Out of a total of 2044.

Other nodes in the cluster have a similar number of fd's - around 2k with
the majority to one SSTable.

I did not experience this under 0.6, so just checking: does this sound OK
and I should just increase the number of handles, or is it a bug?

Thanks
Aaron

Re: too many open files 0.7.0 beta1

2010-08-25 Thread Dan Washusen
Maybe you're seeing this:
https://issues.apache.org/jira/browse/CASSANDRA-1416

On Thu, Aug 26, 2010 at 2:05 PM, Aaron Morton aa...@thelastpickle.com wrote:

 [snip]






Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jonathan Ellis
SocketException means this is coming from the network, not the sstables.

Knowing the full error message would be nice, but just about any
problem on that end should be fixed by adding connection pooling to
your client.

(moving to user@)
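
To make "connection pooling" concrete, here is a minimal bounded pool over
raw Thrift clients. All names are illustrative and real clients (Hector et
al.) ship their own pools; the point is only that the number of sockets, and
hence fds, stays fixed no matter how many requests are in flight.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class SimplePool {
        private final BlockingQueue<Cassandra.Client> clients;

        // Opens `size` connections up front; that is the fd ceiling.
        public SimplePool(String host, int port, int size) throws Exception {
            clients = new ArrayBlockingQueue<Cassandra.Client>(size);
            for (int i = 0; i < size; i++) {
                TTransport t = new TFramedTransport(new TSocket(host, port));
                t.open();
                clients.add(new Cassandra.Client(new TBinaryProtocol(t)));
            }
        }

        public Cassandra.Client borrow() throws InterruptedException {
            return clients.take();   // blocks rather than opening a new socket
        }

        public void release(Cassandra.Client client) {
            clients.add(client);     // hand the connection back for reuse
        }
    }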

On Wed, Jul 14, 2010 at 5:09 AM, Thomas Downing
tdown...@proteus-technologies.com wrote:
 On 7/13/2010 9:20 AM, Jonathan Ellis wrote:

 On Tue, Jul 13, 2010 at 4:19 AM, Thomas Downing
 tdown...@proteus-technologies.com  wrote:


 On a related note:  I am running some feasibility tests looking for
 high ingest rate capabilities.  While testing Cassandra the problem
 I've encountered is that it runs out of file handles during compaction.


 This usually just means increase the allowed fh via ulimit/etc.

 Increasing the memtable thresholds so that you create less sstables,
 but larger ones, is also a good idea.  The defaults are small so
 Cassandra can work on a 1GB heap which is much smaller than most
 production ones.  Reasonable rule of thumb: if you have a heap of N
 GB, increase both the throughput and count thresholds by N times.



 Thanks for the suggestion.  I gave it a whirl, but no go.  The file handles
 in use stayed at around 500 for the first 30M or so mutates, then within
 4 seconds they jumped to about 800, stayed there for about 30 seconds,
 then within 5 seconds went over 2022, at which point the server entered
 the cycle of SocketException: Too many open files.  Interestingly, the
 file limit for this process is 32768.  Note the numbers below as well.

 If there is anything specific you would like me to try, let me know.

 Seems like there's some sort of non-linear behavior here.  This behavior is
 the same as before I multiplied the Cassandra params by 4 (the number of GB
 of heap), which leads me to think that increasing limits, whether file
 handles or Cassandra parameters, is likely to be a tail-chasing exercise.

 This causes time-out exceptions at the client.  On this exception, my client
 closes the connection, waits a bit, then retries.  After a few hours of this
 the server still had not recovered.

 I killed the clients, and watched the server after that.  The open file
 handles dropped by 8, and have stayed there.  The server is, of course, not
 throwing SocketException any more.  On the other hand, the server is not
 doing anything at all.

 When there is no client activity, and the server is idle, there are 155
 threads running in the JVM.  They are all in one of three states: almost
 all blocked at futex(), a few blocked at accept(), and a few cycling over a
 timeout on futex(), gettimeofday(), futex() ... None are blocked on IO.  I
 can't attach a debugger (I get IO exceptions trying either socket or local
 connections, no surprise), so I don't know of a way to find the Java code
 where the threads are blocking.

 More than one fd can be open on a given file, and many of the open fd's are
 on files that have been deleted.  The stale fd's are all on Data.db files
 in the data directory, which I have separate from the commit log directory.

 I haven't had a chance to look at the code handling files, and I am not any
 sort of Java expert, but could this be due to Java's lazy resource cleanup?
 I wonder whether, when considering writing your own file handling classes
 for O_DIRECT or posix_fadvise or whatever, an explicit close(2) might help.

 A restart of the client causes immediate SocketExceptions at the server and
 timeouts at the client.

 I noted on the restart that the open fd's jumped by 32, despite only making
 4 connections.  At this point, there were 2028 open files - more than there
 were when the exceptions began at 2002 open files.  So it seems like the
 exception is not caused by the OS returning EMFILE - unless it was returning
 EMFILE for some strange reason, and the bump in open files is due to an
 increase in duplicate open files.  (BTW, it's not ENFILE!)

 I also noted that although the TimeoutExceptions did not  occur immediately
 on the client, the SocketExceptions began immediately on the server.  This
 does not seem to match up.  I am using the org.apache.cassandra.thrift API
 directly, not any higher level wrapper.

 Finally, this jump to 2028 on the restart caused a new symptom.  I only had
 the client running a few seconds, but after 15 minutes, the server is still
 throwing
 exceptions, even though the open file handles immediately dropped from
 2028 down to 1967.

 Thanks for your attention, and all your work,

 Thomas Downing




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Peter Schuller
 [snip]
 I'm not sure that is the case.

 When the server gets into the unrecoverable state, the repeating exceptions
 are indeed SocketException: Too many open files.
[snip]
 Although this is unquestionably a network error,  I don't think it is
 actually a
 network problem per se, as the maximum number of sockets open by the
 Cassandra server is at this point is about 8.  When I kill the client,
 sockets
 held are just the listening sockets - no sockets in ESTABLISHED or
 TIMED_WAIT.

Is this based on netstat or lsof or similar? When the node is in the
state of giving these errors, try inspecting /proc/pid/fd or use
lsof. Presumably you'll see thousands of fds of some category; either
sockets or files.

(If you already did this, sorry!)

-- 
/ Peter Schuller
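
On Linux the classification Peter suggests can also be done from inside the
JVM, which is handy for logging fd usage over time. A minimal sketch,
assuming a /proc filesystem and Java 7+ for readSymbolicLink:

    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class FdCount {
        public static void main(String[] args) throws Exception {
            int sockets = 0, files = 0, other = 0;
            File[] fds = new File("/proc/self/fd").listFiles();
            if (fds == null) return;   // no procfs on this platform
            for (File fd : fds) {
                // Each entry is a symlink naming the fd's target,
                // e.g. "socket:[12345]" or "/var/lib/cassandra/...-Data.db".
                String target =
                        Files.readSymbolicLink(Paths.get(fd.getPath())).toString();
                if (target.startsWith("socket:")) sockets++;
                else if (target.startsWith("/")) files++;
                else other++;
            }
            System.out.printf("sockets=%d files=%d other=%d%n", sockets, files, other);
        }
    }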


Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jorge Barrios
Thomas, I had a similar problem a few weeks back. I changed my code to make
sure that each thread only creates and uses one Hector connection. It seems
that client sockets are not being released properly, but I didn't have the
time to dig into it.

Jorge
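
One way to enforce the one-connection-per-thread discipline Jorge describes
is a ThreadLocal holder, sketched below; connect() stands in for however
your client opens a connection. Nested calls on the same thread then share
one connection instead of allocating a second.

    // One client per thread: initialValue() runs at most once per thread.
    private static final ThreadLocal<Cassandra.Client> CLIENT =
            new ThreadLocal<Cassandra.Client>() {
                @Override protected Cassandra.Client initialValue() {
                    return connect();   // placeholder: open one Thrift connection
                }
            };

    static Cassandra.Client client() {
        return CLIENT.get();   // every call site on this thread gets the same client
    }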

On Wed, Jul 14, 2010 at 8:28 AM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  [snip]



Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]

2010-07-14 Thread Jorge Barrios
Each of my top-level functions was allocating a Hector client connection at
the top, and releasing it when returning. The problem arose when a top-level
function had to call another top-level function, which led to the same
thread allocating two connections. Hector was not releasing one of them even
though I was explicitly requesting that it be released. This might have been
fixed since then, and like I said, I didn't dig into why it was happening. I
just made sure to pass along the connection instances as necessary, and the
problem went away.

On Wed, Jul 14, 2010 at 11:40 AM, shimi shim...@gmail.com wrote:

 do you mean that you don't release the connection back to the pool?

 On 2010 7 14 20:51, Jorge Barrios jo...@tapulous.com wrote:

  [snip]

 Jorge



 On Wed, Jul 14, 2010 at 8:28 AM, Peter Schuller 
 peter.schul...@infidyne.com wrote:
 
   [snip]
 ...