AW: The changing clustering key

2017-04-06 Thread j.kesten
Hi,

your primary goal is to fetch a user by dept_id and user_id and additionally 
keep versions of the user data?

CREATE TABLE users (
   dept_id text,
   user_id text,
   mod_date timestamp,
   user_name text,
   PRIMARY KEY ((dept_id, user_id), mod_date)
) WITH CLUSTERING ORDER BY (mod_date DESC);

There is a difference between partition keys and clustering keys. My suggestion 
will end up with all versions of a particular (dept_id, user_id) in one partition 
(and therefore on one node), with all versions of your data in that partition stored 
in descending order by mod_date.

For a normal lookup you do not need to know mod_date; a simple SELECT * FROM 
users WHERE dept_id='foo' AND user_id='bar' LIMIT 1 will do.
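A minimal sketch of how that plays out against the table above (the values are 
placeholders; toTimestamp(now()) assumes Cassandra 2.2 or later, a client-supplied 
timestamp works just as well):

-- every name change is simply a new row with a fresh mod_date
INSERT INTO users (dept_id, user_id, mod_date, user_name)
VALUES ('dept_id1', 'user_id1', toTimestamp(now()), 'baz');

-- latest version only (rows are stored newest-first because of the DESC clustering order)
SELECT * FROM users WHERE dept_id='dept_id1' AND user_id='user_id1' LIMIT 1;

-- full version history, newest first
SELECT * FROM users WHERE dept_id='dept_id1' AND user_id='user_id1';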

http://datascale.io/cassandra-partitioning-and-clustering-keys-explained/



Sent from my Windows 10 Phone

From: Monmohan Singh
Sent: Thursday, 6 April 2017 13:54
To: user@cassandra.apache.org
Subject: The changing clustering key

Dear Cassandra experts,
I have a data modeling question for cases where data needs to be sorted by keys 
which can be modified.
So , say we have a user table
{
   dept_id text,
   user_id text,
   user_name text,
   mod_date timestamp
   PRIMARY KEY (dept_id,user_id)
}
Now I can query Cassandra to get all users by a dept_id.
What if I wanted to query to get all users in a dept, sorted by mod_date?
So, one way would be to
{
   dept_id text,
   user_id text,
   mod_date timestamp,
   user_name text,
   PRIMARY KEY (dept_id,user_id, mod_date)
}
But mod_date changes every time the user name is updated, so it can't be part of 
the clustering key.

Attempt 1:  Don't update the row but instead create a new record for every 
update. So, say the record for user foo is like 
{'dept_id1','user_id1','TimeStamp1','foo'}, and then the name was changed to 
'bar' and then to 'baz'. In that case we add another row to the table each time, so 
the table data would look like

{'dept_id1','user_id1','TimeStamp3','baz'}
{'dept_id1','user_id1','TimeStamp2','bar'}
{'dept_id1','user_id1','TimeStamp1','foo'}

Now we can get all users in a dept, sorted by mod_date, but it presents a 
different problem: the data returned is duplicated (one row per version).

Attempt 2: Add another column to identify the head record, much like a linked 
list
{
   dept_id text,
   user_id text,
   mod_date timestamp,
   user_name text,
   next_record text,
   PRIMARY KEY (dept_id, user_id, mod_date)
}
Every time an update happens we add a row and store the PK of the new record 
in the previous record; only the latest record carries no pointer (it is marked 'HEAD').

{'dept_id1','user_id1','TimeStamp3','baz','HEAD'}
{'dept_id1','user_id1','TimeStamp2','bar','dept_id1#user_id1#TimeStamp3'}
{'dept_id1','user_id1','TimeStamp1','foo','dept_id1#user_id1#TimeStamp2'}

We also add a secondary index on the 'next_record' column.

Now I can get all users in a dept, sorted by mod_date, with:
SELECT * FROM users WHERE dept_id=':dept' AND next_record='HEAD' ORDER BY 
mod_date;

But this looks like a fairly involved solution, and perhaps I am missing something 
simpler.

The other option is delete-and-insert, but for high-frequency changes I think 
Cassandra has issues with tombstones.

Thanks for helping on this.
Regards
Monmohan




Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-06 Thread Thakrar, Jayesh
I had asked a similar/related question (on how to carry out repairs, etc.) and 
got some useful pointers.
I would highly recommend the YouTube video or the SlideShare link below (both 
are for the same presentation).

https://www.youtube.com/watch?v=1Sz_K8UID6E

http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016

https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/

https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html

https://www.datastax.com/dev/blog/repair-in-cassandra




From: eugene miretsky 
Date: Thursday, April 6, 2017 at 3:35 PM
To: 
Subject: Why are automatic anti-entropy repairs required when hinted hand-off 
is enabled?

Hi,

As I see it, if hinted handoff is enabled, the only time data can be 
inconsistent is when:

  1.  A node is down for longer than the max_hint_window
  2.  The coordinator node crashes before all the hints have been replayed
Why is it still recommended to perform frequent automatic repairs, as well as 
enable read repair? Can't I just run a repair after one of the nodes is down? 
The only problem I see with this approach is a long repair job (instead of 
small incremental repairs). But other than that, are there any other 
issues/corner-cases?

Cheers,
Eugene


Re: The changing clustering key

2017-04-06 Thread Monmohan Singh
Thanks for the pointer. Let me read up more on materialized views and see if
that helps solve our problem. I do know that it's not supported in our
current version (2.2.x), but I can explore moving to Cassandra 3.

On Fri, 7 Apr 2017 at 04:22 Eric Stevens  wrote:

> Just curious if you've looked at materialized views.  Something like:
>
> CREATE MATERIALIZED VIEW users_by_mod_date AS
>SELECT dept_id,mod_date,user_id,user_name FROM users
>WHERE mod_date IS NOT NULL
>PRIMARY KEY (dept_id,mod_date,user_id)
>WITH CLUSTERING ORDER BY (mod_date desc)
>
> On Thu, Apr 6, 2017 at 5:54 AM Monmohan Singh  wrote:
>
> Dear Cassandra experts,
> I have a data modeling question for cases where data needs to be sorted by
> keys which can be modified.
> So , say we have a user table
> {
>dept_id text,
>user_id text,
>user_name text,
>mod_date timestamp
>PRIMARY KEY (dept_id,user_id)
> }
> Now I can query cassandra to get all users by a dept_id
> What if I wanted to query to get all users in a dept, sorted by mod_date.
> So, one way would be to
> {
>dept_id text,
>user_id text,
>mod_date timestamp,
>user_name text,
>PRIMARY KEY (dept_id,user_id, mod_date)
> }
> But, mod_date changes every time user name is updated. So it can't be part
> of clustering key.
>
> Attempt 1:  Don't update the row but instead create new record for every
> update. So, say the record for user foo is like below
> {'dept_id1','user_id1',TimeStamp1','foo'} and then the name was changed to
> 'bar' and then to 'baz' . In that case we add another row to table, so the
> table data would look like
>
> {'dept_id1','user_id1',TimeStamp3','baz'}
> {'dept_id1','user_id1',TimeStamp2','bar'}
> {'dept_id1','user_id1',TimeStamp1','foo'}
>
> Now we can get all users in a dept, sorted by mod_date but it presents a
> different problem. The data returned is duplicated.
>
> Attempt 2 : Add another column to identify the head record much like a
> linked list
> {
>dept_id text,
>user_id text,
>mod_date timestamp,
>user_name text,
>next_record text
>PRIMARY KEY (user_id,user_id, mod_date)
> }
> Every time an update happens it adds a row and also adds the PK of new
> record except in the latest record.
>
> {'dept_id1','user_id1',TimeStamp3','baz','HEAD'}
> {'dept_id1','user_id1',TimeStamp2','bar','dept_id1#user_id1#TimeStamp3'}
> {'dept_id1','user_id1',TimeStamp1','foo','dept_id1#user_id1#TimeStamp2'}
> and also add a secondary index to 'next_record' column.
>
> Now I can support get all users in a dept, sorted by mod_date by
> SELECT * from USERS where dept_id=':dept' AND next_record='HEAD' order by
> mod_date.
>
> But it looks fairly involved solution and perhaps I am missing something ,
> a simpler solution ..
>
> The other option is delete and insert but for high frequency changes I
> think Cassandra has issues with tombstones.
>
> Thanks for helping on this.
> Regards
> Monmohan
>
>


Re: Unsubscribe

2017-04-06 Thread Nate McCall
Hi John,
Please send an email to user-unsubscr...@cassandra.apache.org to
unsubscribe from this list.

On Fri, Apr 7, 2017 at 8:58 AM, John Buczkowski  wrote:

> *From:* eugene miretsky [mailto:eugene.miret...@gmail.com]
> *Sent:* Thursday, April 06, 2017 4:36 PM
> *To:* user@cassandra.apache.org
> *Subject:* Why are automatic anti-entropy repairs required when hinted
> hand-off is enabled?
>
>
>
> Hi,
>
>
>
> As I see it, if hinted handoff is enabled, the only time data can be
> inconsistent is when:
>
>1. A node is down for longer than the max_hint_window
>2. The coordinator node crushes before all the hints have been replayed
>
> Why is it still recommended to perform frequent automatic repairs, as well
> as enable read repair? Can't I just run a repair after one of the nodes is
> down? The only problem I see with this approach is a long repair job
> (instead of small incremental repairs). But other than that, are there any
> other issues/corner-cases?
>
>
>
> Cheers,
>
> Eugene
>


Re: Copy from CSV on OS X problem with varint values <= -2^63

2017-04-06 Thread Boris Babic
Stefania

Downloading the package and simply running it from that folder, without Homebrew 
interference, the driver now matches what you described in your last email.
I will try writing varints again to confirm it works.
I will try writing variants again to confirm it works.

cqlsh --debug
Using CQL driver: 
Using connect timeout: 5 seconds
Using 'utf-8' encoding
Using ssl: False
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
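For reference, a minimal sketch of that re-test (keyspace, table and data.csv are the 
ones from the original report; the id/varint column types are an assumption based on 
the CSV values; PREPAREDSTATEMENTS = False is the CASSANDRA-13408 workaround quoted 
below and should not be needed once the matching bundled driver is in use):

CREATE TABLE IF NOT EXISTS mykeyspace.data (id int PRIMARY KEY, varint varint);

-- with the fixed/bundled driver this should just work
COPY mykeyspace.data (id, varint) FROM 'data.csv' WITH HEADER = true;

-- on an affected driver version, the workaround is to disable prepared statements
COPY mykeyspace.data (id, varint) FROM 'data.csv'
  WITH HEADER = true AND PREPAREDSTATEMENTS = false;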


> On Apr 6, 2017, at 11:58 AM, Stefania Alborghetti 
>  wrote:
> 
> It doesn't look like the embedded driver, it should come from a zip file 
> labeled with version 3.7.0.post0-2481531 for cassandra 3.10:
> 
> Using CQL driver:  '/home/stefi/git/cstar/cassandra/bin/../lib/cassandra-driver-internal-only-3.7.0.post0-2481531.zip/cassandra-driver-3.7.0.post0-2481531/cassandra/__init__.py'>
> 
> Sorry, I should have posted this example in my previous email, rather than an 
> example based on the non-embedded driver.
> 
> I don't know who to contact regarding homebrew installation, but you could 
> download the Cassandra package, unzip it, and run cqlsh and Cassandra from 
> that directory?
> 
> 
> On Thu, Apr 6, 2017 at 4:59 AM, Boris Babic  wrote:
> Stefania
> 
> This is the output of my --debug, I never touched CQLSH_NO_BUNDLED and did 
> not know about it.
> As you can see I have used homebrew to install Cassandra and looks like its 
> the embedded version as it sits under the Cassandra folder ? 
> 
> cqlsh --debug
> Using CQL driver:  '/usr/local/Cellar/cassandra/3.10_1/libexec/vendor/lib/python2.7/site-packages/cassandra/__init__.pyc'>
> Using connect timeout: 5 seconds
> Using 'utf-8' encoding
> Using ssl: False
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
> 
> 
>> On Apr 5, 2017, at 12:07 PM, Stefania Alborghetti 
>>  wrote:
>> 
>> You are welcome.
>> 
>> I traced the problem to a commit of the Python driver that shipped in 
>> version 3.8 of the driver. It is fixed in 3.8.1. More details on 
>> CASSANDRA-13408. I don't think it's related to the OS.
>> 
>> Since Cassandra 3.10 ships with an older version of the driver embedded in a 
>> zip file in the lib folder, and this version is not affected, I'm guessing 
>> that either the embedded version does not work on OS X, or you are manually 
>> using a different version of the driver by setting CQLSH_NO_BUNDLED (which 
>> is why I could reproduce it on my laptop). 
>> 
>> You can run cqlsh with --debug to see the version of the driver that cqlsh 
>> is using, for example:
>> 
>> cqlsh --debug
>> Using CQL driver: > '/usr/local/lib/python2.7/dist-packages/cassandra_driver-3.8.1-py2.7-linux-x86_64.egg/cassandra/__init__.pyc'>
>> 
>> Can you confirm if you were overriding the Python driver by setting 
>> CQLSH_NO_BUNDLED and the version of the driver?
>> 
>> 
>> 
>> On Tue, Apr 4, 2017 at 6:12 PM, Boris Babic  wrote:
>> Thanks Stefania, going from memory don't think I noticed this on windows but 
>> haven't got a machine handy to test it on at the moment. 
>> 
>> On Apr 4, 2017, at 19:44, Stefania Alborghetti 
>>  wrote:
>> 
>>> I've reproduced the same problem on Linux, and I've opened CASSANDRA-13408. 
>>> As a workaround, disable prepared statements and it will work (WITH HEADER 
>>> = TRUE AND PREPAREDSTATEMENTS = False).
>>> 
>>> On Tue, Apr 4, 2017 at 5:02 PM, Boris Babic  wrote:
>>> 
>>> On Apr 4, 2017, at 7:00 PM, Boris Babic  wrote:
>>> 
>>> Hi
>>> 
>>> I’m testing the write of various datatypes on OS X for fun running 
>>> cassandra 3.10 on a single laptop instance, and from what i can see varint 
>>> should map to java.math.BigInteger and have no problems with Long.MIN_VALE 
>>> , -9223372036854775808, but i can’t see what I’m doing wrong.
>>> 
>>> cqlsh: 5.0.1
>>> cassandra 3.10
>>> osx el capitan.
>>> 
>>> data.csv:
>>> 
>>> id,varint
>>> -2147483648,-9223372036854775808
>>> 2147483647,9223372036854775807
>>> 
>>> COPY mykeyspace.data (id,varint) FROM 'data.csv' WITH HEADER=true;
>>> 
>>>   Failed to make batch statement: Received an argument of invalid type 
>>> for column "varint". Expected: , 
>>> Got: ; (descriptor 'bit_length' requires a 'int' object but 
>>> received a 'long’)
>>> 
>>> If I directly type a similar insert in cqlsh no such problem occurs, in 
>>> fact I can make the value many orders of magnitude less and all is fine.
>>> 
>>> cqlsh> insert into mykeyspace.data (id,varint) 
>>> values(1,-9223372036854775808898989898) ;
>>> 
>>> Had not observed this before on other OS, is this something todo with the 
>>> way the copy from parser is interpreting varint for values <= -2^63 ?
>>> 
>>> Thanks for any input
>>> Boris
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> 
>>> STEFANIA ALBORGHETTI
>>> 

How does clustering key works with TimeWindowCompactionStrategy (TWCS)

2017-04-06 Thread Jerry Lam
Hi guys,

I'm a new and happy user of Cassandra. We are using Cassandra for time
series data, so we chose TWCS because of its predictability and its ease of
configuration.

My question is we have a table with the following schema:

CREATE TABLE IF NOT EXISTS customer_view (
customer_id bigint,
date_day Timestamp,
view_id bigint,
PRIMARY KEY (customer_id, date_day)
) WITH CLUSTERING ORDER BY (date_day DESC);

What I understand is that the data will be ordered by date_day within the
partition using the clustering key. However, the same customer_id can be
inserted into this partition several times during the day, and TWCS
only compacts the sstables within the window interval set in the
configuration (in our case, 1 hour).

How does Cassandra guarantee the clustering key order when the same
customer_id appears in several sstables? Does it need to do a merge and
then sort to find the latest view_id for the customer_id? Or is there
some magic happening behind the scenes?

Best Regards,

Jerry


Unsubscribe

2017-04-06 Thread John Buczkowski
From: eugene miretsky [mailto:eugene.miret...@gmail.com] 
Sent: Thursday, April 06, 2017 4:36 PM
To: user@cassandra.apache.org
Subject: Why are automatic anti-entropy repairs required when hinted hand-off 
is enabled?

 

Hi, 

 

As I see it, if hinted handoff is enabled, the only time data can be 
inconsistent is when:

1.  A node is down for longer than the max_hint_window
2.  The coordinator node crashes before all the hints have been replayed

Why is it still recommended to perform frequent automatic repairs, as well as 
enable read repair? Can't I just run a repair after one of the nodes is down? 
The only problem I see with this approach is a long repair job (instead of 
small incremental repairs). But other than that, are there any other 
issues/corner-cases? 

 

Cheers,

Eugene 



Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-06 Thread eugene miretsky
Hi,

As I see it, if hinted handoff is enabled, the only time data can be
inconsistent is when:

   1. A node is down for longer than the max_hint_window
   2. The coordinator node crashes before all the hints have been replayed

Why is it still recommended to perform frequent automatic repairs, as well
as enable read repair? Can't I just run a repair after one of the nodes is
down? The only problem I see with this approach is a long repair job
(instead of small incremental repairs). But other than that, are there any
other issues/corner-cases?
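(For reference, the knobs this question refers to live in cassandra.yaml; the values 
below are the usual defaults, shown here as an assumption rather than something stated 
in this thread:)

hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000  # 3 hours: stop generating hints for a node once it has been down this long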

Cheers,
Eugene


Re: The changing clustering key

2017-04-06 Thread Eric Stevens
Just curious if you've looked at materialized views.  Something like:

CREATE MATERIALIZED VIEW users_by_mod_date AS
   SELECT dept_id, mod_date, user_id, user_name FROM users
   WHERE dept_id IS NOT NULL AND user_id IS NOT NULL AND mod_date IS NOT NULL
   PRIMARY KEY (dept_id, mod_date, user_id)
   WITH CLUSTERING ORDER BY (mod_date DESC);
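With that view in place, the department-wide, recency-ordered listing from the original 
question becomes a single query against the view (a sketch; ':dept' is a placeholder, 
as in the original post):

SELECT user_id, user_name, mod_date
FROM users_by_mod_date
WHERE dept_id = ':dept';

Rows come back newest-first thanks to the mod_date DESC clustering order, and when a 
base row's mod_date changes, Cassandra updates the corresponding view row automatically, 
which is exactly the changing-clustering-key case this thread is about.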

On Thu, Apr 6, 2017 at 5:54 AM Monmohan Singh  wrote:

> Dear Cassandra experts,
> I have a data modeling question for cases where data needs to be sorted by
> keys which can be modified.
> So , say we have a user table
> {
>dept_id text,
>user_id text,
>user_name text,
>mod_date timestamp
>PRIMARY KEY (dept_id,user_id)
> }
> Now I can query cassandra to get all users by a dept_id
> What if I wanted to query to get all users in a dept, sorted by mod_date.
> So, one way would be to
> {
>dept_id text,
>user_id text,
>mod_date timestamp,
>user_name text,
>PRIMARY KEY (dept_id,user_id, mod_date)
> }
> But, mod_date changes every time user name is updated. So it can't be part
> of clustering key.
>
> Attempt 1:  Don't update the row but instead create new record for every
> update. So, say the record for user foo is like below
> {'dept_id1','user_id1',TimeStamp1','foo'} and then the name was changed to
> 'bar' and then to 'baz' . In that case we add another row to table, so the
> table data would look like
>
> {'dept_id1','user_id1',TimeStamp3','baz'}
> {'dept_id1','user_id1',TimeStamp2','bar'}
> {'dept_id1','user_id1',TimeStamp1','foo'}
>
> Now we can get all users in a dept, sorted by mod_date but it presents a
> different problem. The data returned is duplicated.
>
> Attempt 2 : Add another column to identify the head record much like a
> linked list
> {
>dept_id text,
>user_id text,
>mod_date timestamp,
>user_name text,
>next_record text
>PRIMARY KEY (user_id,user_id, mod_date)
> }
> Every time an update happens it adds a row and also adds the PK of new
> record except in the latest record.
>
> {'dept_id1','user_id1',TimeStamp3','baz','HEAD'}
> {'dept_id1','user_id1',TimeStamp2','bar','dept_id1#user_id1#TimeStamp3'}
> {'dept_id1','user_id1',TimeStamp1','foo','dept_id1#user_id1#TimeStamp2'}
> and also add a secondary index to 'next_record' column.
>
> Now I can support get all users in a dept, sorted by mod_date by
> SELECT * from USERS where dept_id=':dept' AND next_record='HEAD' order by
> mod_date.
>
> But it looks fairly involved solution and perhaps I am missing something ,
> a simpler solution ..
>
> The other option is delete and insert but for high frequency changes I
> think Cassandra has issues with tombstones.
>
> Thanks for helping on this.
> Regards
> Monmohan
>
>


Re: Node always dieing

2017-04-06 Thread Carlos Rolo
i3 instances seem to be having these issues more than the other instance types.
This is not the first report I've heard about.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
*linkedin.com/in/carlosjuzarterolo
*
Mobile: +351 918 918 100
www.pythian.com

On Thu, Apr 6, 2017 at 5:36 PM, Cogumelos Maravilha <
cogumelosmaravi...@sapo.pt> wrote:

> Yes but this time I going to give lots of time between killing and pickup.
> Thanks a lot.
>
>
> On 04/06/2017 05:31 PM, Avi Kivity wrote:
>
> Your disk is bad.  Kill that instance and hope someone else gets it.
>
> On 04/06/2017 07:27 PM, Cogumelos Maravilha wrote:
>
> Interesting
>
> [  720.693768] blk_update_request: I/O error, dev nvme0n1, sector
> 1397303056
> [  750.698840] blk_update_request: I/O error, dev nvme0n1, sector
> 1397303080
> [ 1416.202103] blk_update_request: I/O error, dev nvme0n1, sector
> 1397303080
>
> On 04/06/2017 05:26 PM, Avi Kivity wrote:
>
> Is there anything in dmesg?
>
> On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:
>
> Now dies and restart (systemd) without logging why
>
> system.log
>
> INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362
> AuthCache.java:172 - (Re)initializing RolesCache (validity period
> /update interval/max entries) (2000/2000/1000)
> INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 -
> Configuration location: file:/etc/cassandra/cassandra.
> yaml
>
> debug.log
> DEBUG [GossipStage:1] 2017-04-06 16:16:56,272 FailureDetector.java:457 -
> Ignoring interval time of 2496703934 for /10.100.120.52
> DEBUG [GossipStage:1] 2017-04-06 16:16:59,090 FailureDetector.java:457 -
> Ignoring interval time of 2818071981 for /10.100.120.161
> INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 -
> Configuration location: file:/etc/cassandra/cassandra.yaml
> DEBUG [main] 2017-04-06 16:17:42,540 YamlConfigurationLoader.java:108 -
> Loading settings from file:/etc/cassandra/cassandra.yaml
>
>
> On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:
>
> find /mnt/cassandra/ \! -user cassandra
> nothing
>
> I've found some "strange" solutions on Internet
> chmod -R 2777 /tmp
> chmod -R 2775 cassandra folder
>
> Lets give some time to see the result
>
>
> On 04/06/2017 03:14 PM, Michael Shuler wrote:
>
> All it takes is one frustrated `sudo cassandra` run. Checking only the
> top level directory ownership is insufficient, since root could own
> files/dirs created below the top level. Find all files not owned by user
> cassandra:  `find /mnt/cassandra/ \! -user cassandra`
>
> Just another thought.
>
> --
> Michael
>
>
> On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
>
> From cassandra.yaml:
>
> hints_directory: /mnt/cassandra/hints
> data_file_directories:
> - /mnt/cassandra/data
> commitlog_directory: /mnt/cassandra/commitlog
> saved_caches_directory: /mnt/cassandra/saved_caches
>
> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
>
> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
>
> cassand+  2267 1 99 10:18 ?00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
>
> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
>
> On /etc/security/limits.conf
>
> *   -   memlock  unlimited
> *   -  nofile  10
> *   -  nproc  32768
> *   -  as   unlimited
>
> On /etc/security/limits.d/cassandra.conf
>
> cassandra  -  memlock  unlimited
> cassandra  -  nofile   10
> cassandra  -  as   unlimited
> cassandra  -  nproc32768
>
> On /etc/sysctl.conf
>
> vm.max_map_count = 1048575
>
> On /etc/systcl.d/cassanda.conf
>
> vm.max_map_count = 1048575
> net.ipv4.tcp_keepalive_time=600
>
> On /etc/pam.d/su
> ...
> sessionrequired   pam_limits.so
> ...
>
> Distro is the currently Ubuntu LTS.
> Thanks
>
>
> On 04/06/2017 10:39 AM, benjamin roth wrote:
>
> Cassandra cannot write an SSTable to disk. Are you sure the
> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
> is writeable for the CS user and has enough free space?
> The CDC warning also implies that.
> The other warnings indicate you are probably not running CS as root
> and you did not set an appropriate limit for max open files. Running
> out of open files can also be a reason for the IO error.
>
> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>  
> >:
>
> Hi list,
>
> I'm using C* 3.10 in a 6 nodes cluster RF=2. All 

Re: Node always dieing

2017-04-06 Thread Cogumelos Maravilha
Yes, but this time I'm going to give it lots of time between killing and pickup.

Thanks a lot.

On 04/06/2017 05:31 PM, Avi Kivity wrote:
>
> Your disk is bad.  Kill that instance and hope someone else gets it.
>
>
> On 04/06/2017 07:27 PM, Cogumelos Maravilha wrote:
>>
>> Interesting
>>
>> [  720.693768] blk_update_request: I/O error, dev nvme0n1, sector
>> 1397303056
>> [  750.698840] blk_update_request: I/O error, dev nvme0n1, sector
>> 1397303080
>> [ 1416.202103] blk_update_request: I/O error, dev nvme0n1, sector
>> 1397303080
>>
>>
>> On 04/06/2017 05:26 PM, Avi Kivity wrote:
>>>
>>> Is there anything in dmesg?
>>>
>>>
>>> On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:

 Now dies and restart (systemd) without logging why

 system.log

 INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362
 AuthCache.java:172 - (Re)initializing RolesCache (validity period
 /update interval/max entries) (2000/2000/1000)
 INFO  [main] 2017-04-06 16:17:42,535
 YamlConfigurationLoader.java:89 - Configuration location:
 file:/etc/cassandra/cassandra.
 yaml


 debug.log
 DEBUG [GossipStage:1] 2017-04-06 16:16:56,272
 FailureDetector.java:457 - Ignoring interval time of 2496703934 for
 /10.100.120.52
 DEBUG [GossipStage:1] 2017-04-06 16:16:59,090
 FailureDetector.java:457 - Ignoring interval time of 2818071981 for
 /10.100.120.161
 INFO  [main] 2017-04-06 16:17:42,535
 YamlConfigurationLoader.java:89 - Configuration location:
 file:/etc/cassandra/cassandra.yaml
 DEBUG [main] 2017-04-06 16:17:42,540
 YamlConfigurationLoader.java:108 - Loading settings from
 file:/etc/cassandra/cassandra.yaml


 On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:
> find /mnt/cassandra/ \! -user cassandra
> nothing
>
> I've found some "strange" solutions on Internet
> chmod -R 2777 /tmp
> chmod -R 2775 cassandra folder
>
> Lets give some time to see the result
>
> On 04/06/2017 03:14 PM, Michael Shuler wrote:
>> All it takes is one frustrated `sudo cassandra` run. Checking only the
>> top level directory ownership is insufficient, since root could own
>> files/dirs created below the top level. Find all files not owned by user
>> cassandra:  `find /mnt/cassandra/ \! -user cassandra`
>>
>> Just another thought.
>>
>> -- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
>>> From cassandra.yaml:
>>>
>>> hints_directory: /mnt/cassandra/hints
>>> data_file_directories:
>>> - /mnt/cassandra/data
>>> commitlog_directory: /mnt/cassandra/commitlog
>>> saved_caches_directory: /mnt/cassandra/saved_caches
>>>
>>> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
>>>
>>> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
>>> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
>>> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
>>> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
>>> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
>>> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
>>>
>>> cassand+  2267 1 99 10:18 ?00:02:56 java
>>> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities 
>>> -XX:Threa...
>>>
>>> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
>>>
>>> On /etc/security/limits.conf
>>>
>>> *   -   memlock  unlimited
>>> *   -  nofile  10
>>> *   -  nproc  32768
>>> *   -  as   unlimited
>>>
>>> On /etc/security/limits.d/cassandra.conf
>>>
>>> cassandra  -  memlock  unlimited
>>> cassandra  -  nofile   10
>>> cassandra  -  as   unlimited
>>> cassandra  -  nproc32768
>>>
>>> On /etc/sysctl.conf
>>>
>>> vm.max_map_count = 1048575
>>>
>>> On /etc/systcl.d/cassanda.conf
>>>
>>> vm.max_map_count = 1048575
>>> net.ipv4.tcp_keepalive_time=600
>>>
>>> On /etc/pam.d/su
>>> ...
>>> sessionrequired   pam_limits.so
>>> ...
>>>
>>> Distro is the currently Ubuntu LTS.
>>> Thanks
>>>
>>>
>>> On 04/06/2017 10:39 AM, benjamin roth wrote:
 Cassandra cannot write an SSTable to disk. Are you sure the
 disk/volume where SSTables reside (normally /var/lib/cassandra/data)
 is writeable for the CS user and has enough free space?
 The CDC warning also implies that.
 The other warnings indicate you are probably not running CS as root
 and you did not set an appropriate limit for max open files. Running
 out of open files can also be a reason for the IO error.

 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
 >:

 

Re: Node always dieing

2017-04-06 Thread Avi Kivity

Your disk is bad.  Kill that instance and hope someone else gets it.


On 04/06/2017 07:27 PM, Cogumelos Maravilha wrote:


Interesting

[  720.693768] blk_update_request: I/O error, dev nvme0n1, sector 
1397303056
[  750.698840] blk_update_request: I/O error, dev nvme0n1, sector 
1397303080
[ 1416.202103] blk_update_request: I/O error, dev nvme0n1, sector 
1397303080



On 04/06/2017 05:26 PM, Avi Kivity wrote:


Is there anything in dmesg?


On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:


Now dies and restart (systemd) without logging why

system.log

INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362 
AuthCache.java:172 - (Re)initializing RolesCache (validity period

/update interval/max entries) (2000/2000/1000)
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 
- Configuration location: file:/etc/cassandra/cassandra.

yaml


debug.log
DEBUG [GossipStage:1] 2017-04-06 16:16:56,272 
FailureDetector.java:457 - Ignoring interval time of 2496703934 for 
/10.100.120.52
DEBUG [GossipStage:1] 2017-04-06 16:16:59,090 
FailureDetector.java:457 - Ignoring interval time of 2818071981 for 
/10.100.120.161
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 
- Configuration location: file:/etc/cassandra/cassandra.yaml
DEBUG [main] 2017-04-06 16:17:42,540 
YamlConfigurationLoader.java:108 - Loading settings from 
file:/etc/cassandra/cassandra.yaml



On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:

find /mnt/cassandra/ \! -user cassandra
nothing

I've found some "strange" solutions on Internet
chmod -R 2777 /tmp
chmod -R 2775 cassandra folder

Lets give some time to see the result

On 04/06/2017 03:14 PM, Michael Shuler wrote:

All it takes is one frustrated `sudo cassandra` run. Checking only the
top level directory ownership is insufficient, since root could own
files/dirs created below the top level. Find all files not owned by user
cassandra:  `find /mnt/cassandra/ \! -user cassandra`

Just another thought.

-- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:

 From cassandra.yaml:

hints_directory: /mnt/cassandra/hints
data_file_directories:
 - /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches

drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/

drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/

cassand+  2267 1 99 10:18 ?00:02:56 java
-Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

/dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt

On /etc/security/limits.conf

*   -   memlock  unlimited
*   -  nofile  10
*   -  nproc  32768
*   -  as   unlimited

On /etc/security/limits.d/cassandra.conf

cassandra  -  memlock  unlimited
cassandra  -  nofile   10
cassandra  -  as   unlimited
cassandra  -  nproc32768

On /etc/sysctl.conf

vm.max_map_count = 1048575

On /etc/systcl.d/cassanda.conf

vm.max_map_count = 1048575
net.ipv4.tcp_keepalive_time=600

On /etc/pam.d/su
...
sessionrequired   pam_limits.so
...

Distro is the currently Ubuntu LTS.
Thanks


On 04/06/2017 10:39 AM, benjamin roth wrote:

Cassandra cannot write an SSTable to disk. Are you sure the
disk/volume where SSTables reside (normally /var/lib/cassandra/data)
is writeable for the CS user and has enough free space?
The CDC warning also implies that.
The other warnings indicate you are probably not running CS as root
and you did not set an appropriate limit for max open files. Running
out of open files can also be a reason for the IO error.

2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>:

 Hi list,

 I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
 i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
 I have
 one node that is always dieing and I don't understand why. Can anyone
 give me some hints please. All nodes using the same configuration.

 Thanks in advance.

 INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
 IndexSummaryRedistribution.java:75 - Redistributing index summaries
 ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
 CassandraDaemon.java:229 - Exception in thread
 Thread[MemtablePostFlush:22,5,main]
 org.apache.cassandra.io
 .FSWriteError:
 java.io.IOException: Input/output
 error
 at
 
org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 

Re: Node always dieing

2017-04-06 Thread Cogumelos Maravilha
Interesting

[  720.693768] blk_update_request: I/O error, dev nvme0n1, sector 1397303056
[  750.698840] blk_update_request: I/O error, dev nvme0n1, sector 1397303080
[ 1416.202103] blk_update_request: I/O error, dev nvme0n1, sector 1397303080


On 04/06/2017 05:26 PM, Avi Kivity wrote:
>
> Is there anything in dmesg?
>
>
> On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:
>>
>> Now dies and restart (systemd) without logging why
>>
>> system.log
>>
>> INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362
>> AuthCache.java:172 - (Re)initializing RolesCache (validity period
>> /update interval/max entries) (2000/2000/1000)
>> INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89
>> - Configuration location: file:/etc/cassandra/cassandra.
>> yaml
>>
>>
>> debug.log
>> DEBUG [GossipStage:1] 2017-04-06 16:16:56,272
>> FailureDetector.java:457 - Ignoring interval time of 2496703934 for
>> /10.100.120.52
>> DEBUG [GossipStage:1] 2017-04-06 16:16:59,090
>> FailureDetector.java:457 - Ignoring interval time of 2818071981 for
>> /10.100.120.161
>> INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89
>> - Configuration location: file:/etc/cassandra/cassandra.yaml
>> DEBUG [main] 2017-04-06 16:17:42,540 YamlConfigurationLoader.java:108
>> - Loading settings from file:/etc/cassandra/cassandra.yaml
>>
>>
>> On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:
>>> find /mnt/cassandra/ \! -user cassandra
>>> nothing
>>>
>>> I've found some "strange" solutions on Internet
>>> chmod -R 2777 /tmp
>>> chmod -R 2775 cassandra folder
>>>
>>> Lets give some time to see the result
>>>
>>> On 04/06/2017 03:14 PM, Michael Shuler wrote:
 All it takes is one frustrated `sudo cassandra` run. Checking only the
 top level directory ownership is insufficient, since root could own
 files/dirs created below the top level. Find all files not owned by user
 cassandra:  `find /mnt/cassandra/ \! -user cassandra`

 Just another thought.

 -- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
> From cassandra.yaml:
>
> hints_directory: /mnt/cassandra/hints
> data_file_directories:
> - /mnt/cassandra/data
> commitlog_directory: /mnt/cassandra/commitlog
> saved_caches_directory: /mnt/cassandra/saved_caches
>
> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
>
> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
>
> cassand+  2267 1 99 10:18 ?00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities 
> -XX:Threa...
>
> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
>
> On /etc/security/limits.conf
>
> *   -   memlock  unlimited
> *   -  nofile  10
> *   -  nproc  32768
> *   -  as   unlimited
>
> On /etc/security/limits.d/cassandra.conf
>
> cassandra  -  memlock  unlimited
> cassandra  -  nofile   10
> cassandra  -  as   unlimited
> cassandra  -  nproc32768
>
> On /etc/sysctl.conf
>
> vm.max_map_count = 1048575
>
> On /etc/systcl.d/cassanda.conf
>
> vm.max_map_count = 1048575
> net.ipv4.tcp_keepalive_time=600
>
> On /etc/pam.d/su
> ...
> sessionrequired   pam_limits.so
> ...
>
> Distro is the currently Ubuntu LTS.
> Thanks
>
>
> On 04/06/2017 10:39 AM, benjamin roth wrote:
>> Cassandra cannot write an SSTable to disk. Are you sure the
>> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
>> is writeable for the CS user and has enough free space?
>> The CDC warning also implies that.
>> The other warnings indicate you are probably not running CS as root
>> and you did not set an appropriate limit for max open files. Running
>> out of open files can also be a reason for the IO error.
>>
>> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>> >:
>>
>> Hi list,
>>
>> I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
>> i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
>> I have
>> one node that is always dieing and I don't understand why. Can anyone
>> give me some hints please. All nodes using the same configuration.
>>
>> Thanks in advance.
>>
>> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
>> IndexSummaryRedistribution.java:75 - Redistributing index summaries
>> 

Re: Node always dieing

2017-04-06 Thread Avi Kivity

Is there anything in dmesg?


On 04/06/2017 07:25 PM, Cogumelos Maravilha wrote:


Now dies and restart (systemd) without logging why

system.log

INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362 
AuthCache.java:172 - (Re)initializing RolesCache (validity period

/update interval/max entries) (2000/2000/1000)
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 - 
Configuration location: file:/etc/cassandra/cassandra.

yaml


debug.log
DEBUG [GossipStage:1] 2017-04-06 16:16:56,272 FailureDetector.java:457 
- Ignoring interval time of 2496703934 for /10.100.120.52
DEBUG [GossipStage:1] 2017-04-06 16:16:59,090 FailureDetector.java:457 
- Ignoring interval time of 2818071981 for /10.100.120.161
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 - 
Configuration location: file:/etc/cassandra/cassandra.yaml
DEBUG [main] 2017-04-06 16:17:42,540 YamlConfigurationLoader.java:108 
- Loading settings from file:/etc/cassandra/cassandra.yaml



On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:

find /mnt/cassandra/ \! -user cassandra
nothing

I've found some "strange" solutions on Internet
chmod -R 2777 /tmp
chmod -R 2775 cassandra folder

Lets give some time to see the result

On 04/06/2017 03:14 PM, Michael Shuler wrote:

All it takes is one frustrated `sudo cassandra` run. Checking only the
top level directory ownership is insufficient, since root could own
files/dirs created below the top level. Find all files not owned by user
cassandra:  `find /mnt/cassandra/ \! -user cassandra`

Just another thought.

-- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:

 From cassandra.yaml:

hints_directory: /mnt/cassandra/hints
data_file_directories:
 - /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches

drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/

drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/

cassand+  2267 1 99 10:18 ?00:02:56 java
-Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

/dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt

On /etc/security/limits.conf

*   -   memlock  unlimited
*   -  nofile  10
*   -  nproc  32768
*   -  as   unlimited

On /etc/security/limits.d/cassandra.conf

cassandra  -  memlock  unlimited
cassandra  -  nofile   10
cassandra  -  as   unlimited
cassandra  -  nproc32768

On /etc/sysctl.conf

vm.max_map_count = 1048575

On /etc/systcl.d/cassanda.conf

vm.max_map_count = 1048575
net.ipv4.tcp_keepalive_time=600

On /etc/pam.d/su
...
sessionrequired   pam_limits.so
...

Distro is the currently Ubuntu LTS.
Thanks


On 04/06/2017 10:39 AM, benjamin roth wrote:

Cassandra cannot write an SSTable to disk. Are you sure the
disk/volume where SSTables reside (normally /var/lib/cassandra/data)
is writeable for the CS user and has enough free space?
The CDC warning also implies that.
The other warnings indicate you are probably not running CS as root
and you did not set an appropriate limit for max open files. Running
out of open files can also be a reason for the IO error.

2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>:

 Hi list,

 I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
 i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
 I have
 one node that is always dieing and I don't understand why. Can anyone
 give me some hints please. All nodes using the same configuration.

 Thanks in advance.

 INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
 IndexSummaryRedistribution.java:75 - Redistributing index summaries
 ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
 CassandraDaemon.java:229 - Exception in thread
 Thread[MemtablePostFlush:22,5,main]
 org.apache.cassandra.io
 .FSWriteError:
 java.io.IOException: Input/output
 error
 at
 
org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 
org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 org.apache.cassandra.io
 
.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 org.apache.cassandra.io
 

Re: Node always dieing

2017-04-06 Thread Cogumelos Maravilha
Now it dies and restarts (systemd) without logging why.

system.log

INFO  [Native-Transport-Requests-2] 2017-04-06 16:06:55,362
AuthCache.java:172 - (Re)initializing RolesCache (validity period
/update interval/max entries) (2000/2000/1000)
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 -
Configuration location: file:/etc/cassandra/cassandra.
yaml


debug.log
DEBUG [GossipStage:1] 2017-04-06 16:16:56,272 FailureDetector.java:457 -
Ignoring interval time of 2496703934 for /10.100.120.52
DEBUG [GossipStage:1] 2017-04-06 16:16:59,090 FailureDetector.java:457 -
Ignoring interval time of 2818071981 for /10.100.120.161
INFO  [main] 2017-04-06 16:17:42,535 YamlConfigurationLoader.java:89 -
Configuration location: file:/etc/cassandra/cassandra.yaml
DEBUG [main] 2017-04-06 16:17:42,540 YamlConfigurationLoader.java:108 -
Loading settings from file:/etc/cassandra/cassandra.yaml


On 04/06/2017 04:18 PM, Cogumelos Maravilha wrote:
> find /mnt/cassandra/ \! -user cassandra
> nothing
>
> I've found some "strange" solutions on Internet
> chmod -R 2777 /tmp
> chmod -R 2775 cassandra folder
>
> Lets give some time to see the result
>
> On 04/06/2017 03:14 PM, Michael Shuler wrote:
>> All it takes is one frustrated `sudo cassandra` run. Checking only the
>> top level directory ownership is insufficient, since root could own
>> files/dirs created below the top level. Find all files not owned by user
>> cassandra:  `find /mnt/cassandra/ \! -user cassandra`
>>
>> Just another thought.
>>
>> -- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
>>> From cassandra.yaml:
>>>
>>> hints_directory: /mnt/cassandra/hints
>>> data_file_directories:
>>> - /mnt/cassandra/data
>>> commitlog_directory: /mnt/cassandra/commitlog
>>> saved_caches_directory: /mnt/cassandra/saved_caches
>>>
>>> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
>>>
>>> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
>>> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
>>> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
>>> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
>>> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
>>> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
>>>
>>> cassand+  2267 1 99 10:18 ?00:02:56 java
>>> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
>>>
>>> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
>>>
>>> On /etc/security/limits.conf
>>>
>>> *   -   memlock  unlimited
>>> *   -  nofile  10
>>> *   -  nproc  32768
>>> *   -  as   unlimited
>>>
>>> On /etc/security/limits.d/cassandra.conf
>>>
>>> cassandra  -  memlock  unlimited
>>> cassandra  -  nofile   10
>>> cassandra  -  as   unlimited
>>> cassandra  -  nproc32768
>>>
>>> On /etc/sysctl.conf
>>>
>>> vm.max_map_count = 1048575
>>>
>>> On /etc/systcl.d/cassanda.conf
>>>
>>> vm.max_map_count = 1048575
>>> net.ipv4.tcp_keepalive_time=600
>>>
>>> On /etc/pam.d/su
>>> ...
>>> sessionrequired   pam_limits.so
>>> ...
>>>
>>> Distro is the currently Ubuntu LTS.
>>> Thanks
>>>
>>>
>>> On 04/06/2017 10:39 AM, benjamin roth wrote:
 Cassandra cannot write an SSTable to disk. Are you sure the
 disk/volume where SSTables reside (normally /var/lib/cassandra/data)
 is writeable for the CS user and has enough free space?
 The CDC warning also implies that.
 The other warnings indicate you are probably not running CS as root
 and you did not set an appropriate limit for max open files. Running
 out of open files can also be a reason for the IO error.

 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
 >:

 Hi list,

 I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
 i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
 I have
 one node that is always dieing and I don't understand why. Can anyone
 give me some hints please. All nodes using the same configuration.

 Thanks in advance.

 INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
 IndexSummaryRedistribution.java:75 - Redistributing index summaries
 ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
 CassandraDaemon.java:229 - Exception in thread
 Thread[MemtablePostFlush:22,5,main]
 org.apache.cassandra.io
 .FSWriteError:
 java.io.IOException: Input/output
 error
 at
 
 org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 
 org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
 ~[apache-cassandra-3.10.jar:3.10]
 at
 

Re: Node Gossiping Information.

2017-04-06 Thread Pranay akula
I am using Cassandra 2.1. Is it possible to manually make a node gossip
with a particular set of nodes, or to update a particular node's gossip info to
current? I don't think it's possible, I'm just checking whether that can be
done. As I mentioned, I occasionally see hinted handoff threads getting
hung, which delays delivery of the hints stored on that node and increases
hint compactions, GC and load issues.
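For reference, a sketch of the commands involved in that workaround (standard nodetool
subcommands available in 2.1; run them against the affected node, and note that the
rest of the ring will briefly mark it down):

# gossip state is kept in memory, not in a keyspace; this dumps the node's current view
nodetool gossipinfo

# the disable/re-enable cycle described above
nodetool disablegossip
nodetool enablegossip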


Thanks
Pranay.

On Tue, Apr 4, 2017 at 12:25 PM, Jeff Jirsa  wrote:

> Cassandra uses system.peers to record the list of peers for subsequent
> startups, but gossip state is only in memory
>
> You shouldn't ever manually need to disable/re-enable gossip unless you
> want the rest of the ring to believe a node went offline/online.
>
> What version are you using?
>
>
> On Tue, Apr 4, 2017 at 7:46 AM, Pranay akula 
> wrote:
>
>> @Jeff thanks for your reply,
>>
>> What i am trying to find is where Gossip data will be stored on, which
>> Keyspace ??  The nodes will Gossip at the time of their start or will get
>> Gossiping Data from seed nodes, What i wanted to do is can we refresh the
>> Gossiping Data with out restarting service, I often see Hinted Handoff's
>> getting hanged on Some nodes so what i am currently doing to handle that
>> situation is i am disabling and enabling Gossip  for that particular node.
>> which is currently helping but i am not  sure if it's right way to do it.
>>
>> Does  streaming_socket_timeout_in_ms parameter has any role to play for
>> Hinted Hand-off's ??
>>
>> Is there any setting in Yaml file i can change get through this
>> situation.
>>
>> Thanks
>> Pranay.
>>
>> On Tue, Apr 4, 2017 at 12:19 AM, Jeff Jirsa  wrote:
>>
>>>
>>>
>>> On 2017-04-02 11:27 (-0700), Pranay akula 
>>> wrote:
>>> > where can we check  gossip information of a node ??  I couldn't find
>>> > anything in System keyspace.
>>> >
>>> > Is it possible to update or refresh Gossiping information on a node
>>> without
>>> > restarting. Does enabling and disabling Gossip will help to refresh
>>> Gossip
>>> > information on that node.
>>> >
>>>
>>>
>>> "nodetool gossipinfo"
>>>
>>>
>>>
>>
>


Re: Node always dieing

2017-04-06 Thread Cogumelos Maravilha
find /mnt/cassandra/ \! -user cassandra
nothing

I've found some "strange" solutions on the Internet:
chmod -R 2777 /tmp
chmod -R 2775 on the cassandra folder

Let's give it some time to see the result.


On 04/06/2017 03:14 PM, Michael Shuler wrote:
> All it takes is one frustrated `sudo cassandra` run. Checking only the
> top level directory ownership is insufficient, since root could own
> files/dirs created below the top level. Find all files not owned by user
> cassandra:  `find /mnt/cassandra/ \! -user cassandra`
>
> Just another thought.
>
> -- Michael On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
>> From cassandra.yaml:
>>
>> hints_directory: /mnt/cassandra/hints
>> data_file_directories:
>> - /mnt/cassandra/data
>> commitlog_directory: /mnt/cassandra/commitlog
>> saved_caches_directory: /mnt/cassandra/saved_caches
>>
>> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
>>
>> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
>> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
>> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
>> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
>> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
>> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
>>
>> cassand+  2267 1 99 10:18 ?00:02:56 java
>> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
>>
>> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
>>
>> On /etc/security/limits.conf
>>
>> *   -   memlock  unlimited
>> *   -  nofile  10
>> *   -  nproc  32768
>> *   -  as   unlimited
>>
>> On /etc/security/limits.d/cassandra.conf
>>
>> cassandra  -  memlock  unlimited
>> cassandra  -  nofile   10
>> cassandra  -  as   unlimited
>> cassandra  -  nproc32768
>>
>> On /etc/sysctl.conf
>>
>> vm.max_map_count = 1048575
>>
>> On /etc/systcl.d/cassanda.conf
>>
>> vm.max_map_count = 1048575
>> net.ipv4.tcp_keepalive_time=600
>>
>> On /etc/pam.d/su
>> ...
>> sessionrequired   pam_limits.so
>> ...
>>
>> Distro is the currently Ubuntu LTS.
>> Thanks
>>
>>
>> On 04/06/2017 10:39 AM, benjamin roth wrote:
>>> Cassandra cannot write an SSTable to disk. Are you sure the
>>> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
>>> is writeable for the CS user and has enough free space?
>>> The CDC warning also implies that.
>>> The other warnings indicate you are probably not running CS as root
>>> and you did not set an appropriate limit for max open files. Running
>>> out of open files can also be a reason for the IO error.
>>>
>>> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>>> >:
>>>
>>> Hi list,
>>>
>>> I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
>>> i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
>>> I have
>>> one node that is always dieing and I don't understand why. Can anyone
>>> give me some hints please. All nodes using the same configuration.
>>>
>>> Thanks in advance.
>>>
>>> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
>>> IndexSummaryRedistribution.java:75 - Redistributing index summaries
>>> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
>>> CassandraDaemon.java:229 - Exception in thread
>>> Thread[MemtablePostFlush:22,5,main]
>>> org.apache.cassandra.io
>>> .FSWriteError:
>>> java.io.IOException: Input/output
>>> error
>>> at
>>> 
>>> org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> 
>>> org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> org.apache.cassandra.io
>>> 
>>> .compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> org.apache.cassandra.io
>>> 
>>> .compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> 
>>> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> 
>>> org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> org.apache.cassandra.io
>>> 
>>> .sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367)
>>> ~[apache-cassandra-3.10.jar:3.10]
>>> at
>>> 
>>> 

Re: Node always dieing

2017-04-06 Thread Michael Shuler
All it takes is one frustrated `sudo cassandra` run. Checking only the
top level directory ownership is insufficient, since root could own
files/dirs created below the top level. Find all files not owned by user
cassandra:  `find /mnt/cassandra/ \! -user cassandra`
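If that find turns anything up, a sketch of the usual fix (an assumption, not something
from this thread: stop the node first, then hand everything back to the cassandra user
and group):

sudo systemctl stop cassandra
sudo chown -R cassandra:cassandra /mnt/cassandra
sudo systemctl start cassandra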

Just another thought.

-- 
Michael


On 04/06/2017 05:23 AM, Cogumelos Maravilha wrote:
> From cassandra.yaml:
> 
> hints_directory: /mnt/cassandra/hints
> data_file_directories:
> - /mnt/cassandra/data
> commitlog_directory: /mnt/cassandra/commitlog
> saved_caches_directory: /mnt/cassandra/saved_caches
> 
> drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/
> 
> drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
> drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
> drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
> drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
> drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
> drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/
> 
> cassand+  2267 1 99 10:18 ?00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...
> 
> /dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt
> 
> On /etc/security/limits.conf
> 
> *   -   memlock  unlimited
> *   -  nofile  10
> *   -  nproc  32768
> *   -  as   unlimited
> 
> On /etc/security/limits.d/cassandra.conf
> 
> cassandra  -  memlock  unlimited
> cassandra  -  nofile   10
> cassandra  -  as   unlimited
> cassandra  -  nproc32768
> 
> On /etc/sysctl.conf
> 
> vm.max_map_count = 1048575
> 
> On /etc/systcl.d/cassanda.conf
> 
> vm.max_map_count = 1048575
> net.ipv4.tcp_keepalive_time=600
> 
> On /etc/pam.d/su
> ...
> sessionrequired   pam_limits.so
> ...
> 
> Distro is the currently Ubuntu LTS.
> Thanks
> 
> 
> On 04/06/2017 10:39 AM, benjamin roth wrote:
>> Cassandra cannot write an SSTable to disk. Are you sure the
>> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
>> is writeable for the CS user and has enough free space?
>> The CDC warning also implies that.
>> The other warnings indicate you are probably not running CS as root
>> and you did not set an appropriate limit for max open files. Running
>> out of open files can also be a reason for the IO error.
>>
>> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha
>> >:
>>
>> Hi list,
>>
>> I'm using C* 3.10 in a 6 nodes cluster RF=2. All instances type
>> i3.xlarge (AWS) with 32GB, 2 cores and SSD LVM XFS formated 885G.
>> I have
>> one node that is always dieing and I don't understand why. Can anyone
>> give me some hints please. All nodes using the same configuration.
>>
>> Thanks in advance.
>>
>> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
>> IndexSummaryRedistribution.java:75 - Redistributing index summaries
>> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
>> CassandraDaemon.java:229 - Exception in thread
>> Thread[MemtablePostFlush:22,5,main]
>> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
>> at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
>> at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
>> at

Re: Node always dying

2017-04-06 Thread Cogumelos Maravilha
We tested on c4 instances, but EBS is too slow, so we deployed to i3 for
production.

It was running with 5 nodes without problems, but we started running out
of space, so we added another node. It is this last node that is giving
problems. I've already terminated the instance and created a new one,
but the problem remains. The configuration is the same on all
instances. It runs fine for a while, but then:

ERROR [MemtablePostFlush:11] 2017-04-06 11:47:40,840
CassandraDaemon.java:229 - Exception in thread
Thread[MemtablePostFlush:11,5,main]
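
Since the underlying exception is java.io.IOException: Input/output error on
fsync, it may also be worth checking whether the kernel itself is reporting
errors for the instance-store volume. A minimal sketch (the device and
filesystem names are assumptions for an i3 instance with LVM/XFS, not something
confirmed in this thread):

    # Look for low-level I/O errors around the time of the Cassandra exception.
    dmesg -T | grep -iE 'i/o error|nvme|xfs'
    # Confirm which device actually backs the /mnt volume.
    lsblk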

Thanks.

On 04/06/2017 01:18 PM, Carlos Rolo wrote:
> There was some issue with the i3 instances and Cassandra. Has this
> cluster always been running on i3?
>
> On Apr 6, 2017 13:06, "Cogumelos Maravilha" wrote:
>
> Limit                     Soft Limit     Hard Limit     Units
> Max cpu time              unlimited      unlimited      seconds
> Max file size             unlimited      unlimited      bytes
> Max data size             unlimited      unlimited      bytes
> Max stack size            8388608        unlimited      bytes
> Max core file size        0              unlimited      bytes
> Max resident set          unlimited      unlimited      bytes
> Max processes             122575         122575         processes
> Max open files            10             10             files
> Max locked memory         unlimited      unlimited      bytes
> Max address space         unlimited      unlimited      bytes
> Max file locks            unlimited      unlimited      locks
> Max pending signals       122575         122575         signals
> Max msgqueue size         819200         819200         bytes
> Max nice priority         0              0
> Max realtime priority     0              0
> Max realtime timeout      unlimited      unlimited      us
>
> Please see if you can find something wrong there!
>
> Thanks.
>
> On 04/06/2017 11:50 AM, benjamin roth wrote:
>> Limits: You should check them in /proc/$pid/limits
>>
>> 2017-04-06 12:48 GMT+02:00 Cogumelos Maravilha:
>>
>> Yes C* is running as cassandra:
>>
>> cassand+  2267 1 99 10:18 ?00:02:56 java
>> -Xloggc:/var/log/cassandra/gc.log -ea
>> -XX:+UseThreadPriorities -XX:Threa...
>>
>> INFO  [main] 2017-04-06 10:35:42,956 Config.java:474 - Node
>> configuration:[allocate_tokens_for_keyspace=null;
>> authenticator=PasswordAuthenticator;
>> authorizer=CassandraAuthorizer; auto_bootstrap=true;
>> auto_snapshot=true; back_pressure_enabled=false;
>> back_pressure_strategy=org.apache.cassandra.net
>> .RateBasedBackPressure{high_ratio=0.9,
>> factor=5, flow=FAST}; batch_size_fail_threshold_in_kb=50;
>> batch_size_warn_threshold_in_kb=5;
>> batchlog_replay_throttle_in_kb=1024; broadcast_address=null;
>> broadcast_rpc_address=null;
>> buffer_pool_use_heap_if_exhausted=true;
>> cas_contention_timeout_in_ms=600; cdc_enabled=false;
>> cdc_free_space_check_interval_ms=250; cdc_raw_directory=null;
>> cdc_total_space_in_mb=0;
>> client_encryption_options=; cluster_name=company;
>> column_index_cache_size_in_kb=2; column_index_size_in_kb=64;
>> commit_failure_policy=ignore; commitlog_compression=null;
>> commitlog_directory=/mnt/cassandra/commitlog;
>> commitlog_max_compression_buffers_in_pool=3;
>> commitlog_periodic_queue_size=-1;
>> commitlog_segment_size_in_mb=32; commitlog_sync=periodic;
>> commitlog_sync_batch_window_in_ms=NaN;
>> commitlog_sync_period_in_ms=1;
>> commitlog_total_space_in_mb=null;
>> compaction_large_partition_warning_threshold_mb=100;
>> compaction_throughput_mb_per_sec=16;
>> concurrent_compactors=null; concurrent_counter_writes=32;
>> concurrent_materialized_view_writes=32; concurrent_reads=32;
>> concurrent_replicates=null; concurrent_writes=32;
>> counter_cache_keys_to_save=2147483647;
>> counter_cache_save_period=7200;
>> counter_cache_size_in_mb=null;
>> counter_write_request_timeout_in_ms=600;
>> credentials_cache_max_entries=1000;
>> credentials_update_interval_in_ms=-1;
>> credentials_validity_in_ms=2000; cross_node_timeout=false;
>> 

Re: Node always dying

2017-04-06 Thread Carlos Rolo
There was some issue with the i3 instances and Cassandra. Has this cluster
always been running on i3?

On Apr 6, 2017 13:06, "Cogumelos Maravilha" wrote:

> Limit                     Soft Limit     Hard Limit     Units
> Max cpu time              unlimited      unlimited      seconds
> Max file size             unlimited      unlimited      bytes
> Max data size             unlimited      unlimited      bytes
> Max stack size            8388608        unlimited      bytes
> Max core file size        0              unlimited      bytes
> Max resident set          unlimited      unlimited      bytes
> Max processes             122575         122575         processes
> Max open files            10             10             files
> Max locked memory         unlimited      unlimited      bytes
> Max address space         unlimited      unlimited      bytes
> Max file locks            unlimited      unlimited      locks
> Max pending signals       122575         122575         signals
> Max msgqueue size         819200         819200         bytes
> Max nice priority         0              0
> Max realtime priority     0              0
> Max realtime timeout      unlimited      unlimited      us
> Please see if you can find something wrong there!
>
> Thanks.
>
> On 04/06/2017 11:50 AM, benjamin roth wrote:
>
> Limits: You should check them in /proc/$pid/limits
>
> 2017-04-06 12:48 GMT+02:00 Cogumelos Maravilha:
>
>> Yes C* is running as cassandra:
>>
>> cassand+  2267 1 99 10:18 ?00:02:56 java
>> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities
>> -XX:Threa...
>>
>> INFO  [main] 2017-04-06 10:35:42,956 Config.java:474 - Node
>> configuration:[allocate_tokens_for_keyspace=null;
>> authenticator=PasswordAuthenticator; authorizer=CassandraAuthorizer;
>> auto_bootstrap=true; auto_snapshot=true; back_pressure_enabled=false;
>> back_pressure_strategy=org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9,
>> factor=5, flow=FAST}; batch_size_fail_threshold_in_kb=50;
>> batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024;
>> broadcast_address=null; broadcast_rpc_address=null;
>> buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=600;
>> cdc_enabled=false; cdc_free_space_check_interval_ms=250;
>> cdc_raw_directory=null; cdc_total_space_in_mb=0;
>> client_encryption_options=; cluster_name=company;
>> column_index_cache_size_in_kb=2; column_index_size_in_kb=64;
>> commit_failure_policy=ignore; commitlog_compression=null;
>> commitlog_directory=/mnt/cassandra/commitlog;
>> commitlog_max_compression_buffers_in_pool=3;
>> commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32;
>> commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN;
>> commitlog_sync_period_in_ms=1; commitlog_total_space_in_mb=null;
>> compaction_large_partition_warning_threshold_mb=100;
>> compaction_throughput_mb_per_sec=16; concurrent_compactors=null;
>> concurrent_counter_writes=32; concurrent_materialized_view_writes=32;
>> concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32;
>> counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200;
>> counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=600;
>> credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1;
>> credentials_validity_in_ms=2000; cross_node_timeout=false;
>> data_file_directories=[Ljava.lang.String;@223f3642;
>> disk_access_mode=auto; disk_failure_policy=ignore;
>> disk_optimization_estimate_percentile=0.95;
>> disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd;
>> dynamic_snitch=true; dynamic_snitch_badness_threshold=0.1;
>> dynamic_snitch_reset_interval_in_ms=60;
>> dynamic_snitch_update_interval_in_ms=100; 
>> enable_scripted_user_defined_functions=false;
>> enable_user_defined_functions=false; 
>> enable_user_defined_functions_threads=true;
>> encryption_options=null; endpoint_snitch=SimpleSnitch;
>> file_cache_size_in_mb=null; gc_log_threshold_in_ms=200;
>> gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[];
>> hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024;
>> hints_compression=null; hints_directory=/mnt/cassandra/hints;
>> hints_flush_period_in_ms=1; incremental_backups=false;
>> index_interval=null; index_summary_capacity_in_mb=null;
>> index_summary_resize_interval_in_minutes=60; initial_token=null;
>> inter_dc_stream_throughput_outbound_megabits_per_sec=200;
>> inter_dc_tcp_nodelay=false; internode_authenticator=null;
>> internode_compression=dc; internode_recv_buff_size_in_bytes=0;
>> internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647;
>> key_cache_save_period=14400; key_cache_size_in_mb=null;
>> listen_address=10.100.100.213; listen_interface=null;
>> listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false;
>> 

Re: Node always dying

2017-04-06 Thread Cogumelos Maravilha
Limit                     Soft Limit     Hard Limit     Units
Max cpu time              unlimited      unlimited      seconds
Max file size             unlimited      unlimited      bytes
Max data size             unlimited      unlimited      bytes
Max stack size            8388608        unlimited      bytes
Max core file size        0              unlimited      bytes
Max resident set          unlimited      unlimited      bytes
Max processes             122575         122575         processes
Max open files            10             10             files
Max locked memory         unlimited      unlimited      bytes
Max address space         unlimited      unlimited      bytes
Max file locks            unlimited      unlimited      locks
Max pending signals       122575         122575         signals
Max msgqueue size         819200         819200         bytes
Max nice priority         0              0
Max realtime priority     0              0
Max realtime timeout      unlimited      unlimited      us

Please see if you can find something wrong there!

Thanks.

On 04/06/2017 11:50 AM, benjamin roth wrote:
> Limits: You should check them in /proc/$pid/limits
>
> 2017-04-06 12:48 GMT+02:00 Cogumelos Maravilha:
>
> Yes C* is running as cassandra:
>
> cassand+  2267 1 99 10:18 ?00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities
> -XX:Threa...
>
> INFO  [main] 2017-04-06 10:35:42,956 Config.java:474 - Node
> configuration:[allocate_tokens_for_keyspace=null;
> authenticator=PasswordAuthenticator;
> authorizer=CassandraAuthorizer; auto_bootstrap=true;
> auto_snapshot=true; back_pressure_enabled=false;
> back_pressure_strategy=org.apache.cassandra.net
> .RateBasedBackPressure{high_ratio=0.9,
> factor=5, flow=FAST}; batch_size_fail_threshold_in_kb=50;
> batch_size_warn_threshold_in_kb=5;
> batchlog_replay_throttle_in_kb=1024; broadcast_address=null;
> broadcast_rpc_address=null;
> buffer_pool_use_heap_if_exhausted=true;
> cas_contention_timeout_in_ms=600; cdc_enabled=false;
> cdc_free_space_check_interval_ms=250; cdc_raw_directory=null;
> cdc_total_space_in_mb=0; client_encryption_options=;
> cluster_name=company; column_index_cache_size_in_kb=2;
> column_index_size_in_kb=64; commit_failure_policy=ignore;
> commitlog_compression=null;
> commitlog_directory=/mnt/cassandra/commitlog;
> commitlog_max_compression_buffers_in_pool=3;
> commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32;
> commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN;
> commitlog_sync_period_in_ms=1;
> commitlog_total_space_in_mb=null;
> compaction_large_partition_warning_threshold_mb=100;
> compaction_throughput_mb_per_sec=16; concurrent_compactors=null;
> concurrent_counter_writes=32;
> concurrent_materialized_view_writes=32; concurrent_reads=32;
> concurrent_replicates=null; concurrent_writes=32;
> counter_cache_keys_to_save=2147483647;
> counter_cache_save_period=7200; counter_cache_size_in_mb=null;
> counter_write_request_timeout_in_ms=600;
> credentials_cache_max_entries=1000;
> credentials_update_interval_in_ms=-1;
> credentials_validity_in_ms=2000; cross_node_timeout=false;
> data_file_directories=[Ljava.lang.String;@223f3642;
> disk_access_mode=auto; disk_failure_policy=ignore;
> disk_optimization_estimate_percentile=0.95;
> disk_optimization_page_cross_chance=0.1;
> disk_optimization_strategy=ssd; dynamic_snitch=true;
> dynamic_snitch_badness_threshold=0.1;
> dynamic_snitch_reset_interval_in_ms=60;
> dynamic_snitch_update_interval_in_ms=100;
> enable_scripted_user_defined_functions=false;
> enable_user_defined_functions=false;
> enable_user_defined_functions_threads=true;
> encryption_options=null; endpoint_snitch=SimpleSnitch;
> file_cache_size_in_mb=null; gc_log_threshold_in_ms=200;
> gc_warn_threshold_in_ms=1000;
> hinted_handoff_disabled_datacenters=[];
> hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024;
> hints_compression=null; hints_directory=/mnt/cassandra/hints;
> hints_flush_period_in_ms=1; incremental_backups=false;
> index_interval=null; index_summary_capacity_in_mb=null;
> index_summary_resize_interval_in_minutes=60; initial_token=null;
> inter_dc_stream_throughput_outbound_megabits_per_sec=200;
> inter_dc_tcp_nodelay=false; internode_authenticator=null;
> internode_compression=dc; internode_recv_buff_size_in_bytes=0;
> 

The changing clustering key

2017-04-06 Thread Monmohan Singh
Dear Cassandra experts,
I have a data modeling question for cases where data needs to be sorted by
keys which can be modified.
So , say we have a user table
{
   dept_id text,
   user_id text,
   user_name text,
   mod_date timestamp
   PRIMARY KEY (dept_id,user_id)
}
Now I can query cassandra to get all users by a dept_id
What if I wanted to query to get all users in a dept, sorted by mod_date.
So, one way would be to
{
   dept_id text,
   user_id text,
   mod_date timestamp,
   user_name text,
   PRIMARY KEY (dept_id,user_id, mod_date)
}
But, mod_date changes every time user name is updated. So it can't be part
of clustering key.

Attempt 1:  Don't update the row but instead create new record for every
update. So, say the record for user foo is like below
{'dept_id1','user_id1',TimeStamp1','foo'} and then the name was changed to
'bar' and then to 'baz' . In that case we add another row to table, so the
table data would look like

{'dept_id1','user_id1',TimeStamp3','baz'}
{'dept_id1','user_id1',TimeStamp2','bar'}
{'dept_id1','user_id1',TimeStamp1','foo'}

Now we can get all users in a dept, sorted by mod_date but it presents a
different problem. The data returned is duplicated.

Attempt 2 : Add another column to identify the head record much like a
linked list
{
   dept_id text,
   user_id text,
   mod_date timestamp,
   user_name text,
   next_record text
   PRIMARY KEY (user_id,user_id, mod_date)
}
Every time an update happens it adds a row and also adds the PK of new
record except in the latest record.

{'dept_id1','user_id1',TimeStamp3','baz','HEAD'}
{'dept_id1','user_id1',TimeStamp2','bar','dept_id1#user_id1#TimeStamp3'}
{'dept_id1','user_id1',TimeStamp1','foo','dept_id1#user_id1#TimeStamp2'}
and also add a secondary index to 'next_record' column.

Now I can support get all users in a dept, sorted by mod_date by
SELECT * from USERS where dept_id=':dept' AND next_record='HEAD' order by
mod_date.

But it looks fairly involved solution and perhaps I am missing something ,
a simpler solution ..

The other option is delete and insert but for high frequency changes I
think Cassandra has issues with tombstones.

Thanks for helping on this.
Regards
Monmohan


Re: Node always dying

2017-04-06 Thread benjamin roth
Limits: You should check them in /proc/$pid/limits
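
A minimal sketch of that check (the CassandraDaemon pattern for pgrep is an
assumption; match it to however your service is started):

    # Read the effective limits of the running Cassandra JVM.
    pid=$(pgrep -f CassandraDaemon | head -n 1)
    cat /proc/"$pid"/limits
    # Compare "Max open files" here with the nofile value in
    # /etc/security/limits.d/cassandra.conf.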

2017-04-06 12:48 GMT+02:00 Cogumelos Maravilha :

> Yes C* is running as cassandra:
>
> cassand+  2267 1 99 10:18 ?00:02:56 java
> -Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities
> -XX:Threa...
>
> INFO  [main] 2017-04-06 10:35:42,956 Config.java:474 - Node
> configuration:[allocate_tokens_for_keyspace=null; 
> authenticator=PasswordAuthenticator;
> authorizer=CassandraAuthorizer; auto_bootstrap=true; auto_snapshot=true;
> back_pressure_enabled=false; back_pressure_strategy=org.
> apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5,
> flow=FAST}; batch_size_fail_threshold_in_kb=50;
> batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024;
> broadcast_address=null; broadcast_rpc_address=null; 
> buffer_pool_use_heap_if_exhausted=true;
> cas_contention_timeout_in_ms=600; cdc_enabled=false;
> cdc_free_space_check_interval_ms=250; cdc_raw_directory=null;
> cdc_total_space_in_mb=0; client_encryption_options=;
> cluster_name=company; column_index_cache_size_in_kb=2;
> column_index_size_in_kb=64; commit_failure_policy=ignore;
> commitlog_compression=null; commitlog_directory=/mnt/cassandra/commitlog;
> commitlog_max_compression_buffers_in_pool=3;
> commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32;
> commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN;
> commitlog_sync_period_in_ms=1; commitlog_total_space_in_mb=null;
> compaction_large_partition_warning_threshold_mb=100;
> compaction_throughput_mb_per_sec=16; concurrent_compactors=null;
> concurrent_counter_writes=32; concurrent_materialized_view_writes=32;
> concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32;
> counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200;
> counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=600;
> credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1;
> credentials_validity_in_ms=2000; cross_node_timeout=false;
> data_file_directories=[Ljava.lang.String;@223f3642;
> disk_access_mode=auto; disk_failure_policy=ignore;
> disk_optimization_estimate_percentile=0.95; 
> disk_optimization_page_cross_chance=0.1;
> disk_optimization_strategy=ssd; dynamic_snitch=true;
> dynamic_snitch_badness_threshold=0.1; 
> dynamic_snitch_reset_interval_in_ms=60;
> dynamic_snitch_update_interval_in_ms=100; 
> enable_scripted_user_defined_functions=false;
> enable_user_defined_functions=false; 
> enable_user_defined_functions_threads=true;
> encryption_options=null; endpoint_snitch=SimpleSnitch;
> file_cache_size_in_mb=null; gc_log_threshold_in_ms=200;
> gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[];
> hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024;
> hints_compression=null; hints_directory=/mnt/cassandra/hints;
> hints_flush_period_in_ms=1; incremental_backups=false;
> index_interval=null; index_summary_capacity_in_mb=null;
> index_summary_resize_interval_in_minutes=60; initial_token=null;
> inter_dc_stream_throughput_outbound_megabits_per_sec=200;
> inter_dc_tcp_nodelay=false; internode_authenticator=null;
> internode_compression=dc; internode_recv_buff_size_in_bytes=0;
> internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647;
> key_cache_save_period=14400; key_cache_size_in_mb=null;
> listen_address=10.100.100.213; listen_interface=null;
> listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false;
> max_hint_window_in_ms=1080; max_hints_delivery_threads=2;
> max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null;
> max_streaming_retries=3; max_value_size_in_mb=256;
> memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null;
> memtable_flush_writers=0; memtable_heap_space_in_mb=null;
> memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50;
> native_transport_max_concurrent_connections=-1; native_transport_max_
> concurrent_connections_per_ip=-1; native_transport_max_frame_size_in_mb=256;
> native_transport_max_threads=128; native_transport_port=9042;
> native_transport_port_ssl=null; num_tokens=256; 
> otc_coalescing_strategy=TIMEHORIZON;
> otc_coalescing_window_us=200; 
> partitioner=org.apache.cassandra.dht.Murmur3Partitioner;
> permissions_cache_max_entries=1000; permissions_update_interval_in_ms=-1;
> permissions_validity_in_ms=2000; phi_convict_threshold=8.0;
> prepared_statements_cache_size_mb=null; range_request_timeout_in_ms=600;
> read_request_timeout_in_ms=600; request_scheduler=org.apache.
> cassandra.scheduler.NoScheduler; request_scheduler_id=null;
> request_scheduler_options=null; request_timeout_in_ms=600;
> role_manager=CassandraRoleManager; roles_cache_max_entries=1000;
> roles_update_interval_in_ms=-1; roles_validity_in_ms=2000;
> row_cache_class_name=org.apache.cassandra.cache.OHCProvider;
> row_cache_keys_to_save=2147483647; row_cache_save_period=0;
> 

Re: Node always dying

2017-04-06 Thread Cogumelos Maravilha
Yes C* is running as cassandra:

cassand+  2267 1 99 10:18 ?00:02:56 java
-Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

INFO  [main] 2017-04-06 10:35:42,956 Config.java:474 - Node
configuration:[allocate_tokens_for_keyspace=null;
authenticator=PasswordAuthenticator; authorizer=CassandraAuthorizer;
auto_bootstrap=true; auto_snapshot=true; back_pressure_enabled=false;
back_pressure_strategy=org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9,
factor=5, flow=FAST}; batch_size_fail_threshold_in_kb=50;
batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024;
broadcast_address=null; broadcast_rpc_address=null;
buffer_pool_use_heap_if_exhausted=true;
cas_contention_timeout_in_ms=600; cdc_enabled=false;
cdc_free_space_check_interval_ms=250; cdc_raw_directory=null;
cdc_total_space_in_mb=0; client_encryption_options=;
cluster_name=company; column_index_cache_size_in_kb=2;
column_index_size_in_kb=64; commit_failure_policy=ignore;
commitlog_compression=null;
commitlog_directory=/mnt/cassandra/commitlog;
commitlog_max_compression_buffers_in_pool=3;
commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32;
commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN;
commitlog_sync_period_in_ms=1; commitlog_total_space_in_mb=null;
compaction_large_partition_warning_threshold_mb=100;
compaction_throughput_mb_per_sec=16; concurrent_compactors=null;
concurrent_counter_writes=32; concurrent_materialized_view_writes=32;
concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32;
counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200;
counter_cache_size_in_mb=null;
counter_write_request_timeout_in_ms=600;
credentials_cache_max_entries=1000;
credentials_update_interval_in_ms=-1; credentials_validity_in_ms=2000;
cross_node_timeout=false;
data_file_directories=[Ljava.lang.String;@223f3642;
disk_access_mode=auto; disk_failure_policy=ignore;
disk_optimization_estimate_percentile=0.95;
disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd;
dynamic_snitch=true; dynamic_snitch_badness_threshold=0.1;
dynamic_snitch_reset_interval_in_ms=60;
dynamic_snitch_update_interval_in_ms=100;
enable_scripted_user_defined_functions=false;
enable_user_defined_functions=false;
enable_user_defined_functions_threads=true; encryption_options=null;
endpoint_snitch=SimpleSnitch; file_cache_size_in_mb=null;
gc_log_threshold_in_ms=200; gc_warn_threshold_in_ms=1000;
hinted_handoff_disabled_datacenters=[]; hinted_handoff_enabled=true;
hinted_handoff_throttle_in_kb=1024; hints_compression=null;
hints_directory=/mnt/cassandra/hints; hints_flush_period_in_ms=1;
incremental_backups=false; index_interval=null;
index_summary_capacity_in_mb=null;
index_summary_resize_interval_in_minutes=60; initial_token=null;
inter_dc_stream_throughput_outbound_megabits_per_sec=200;
inter_dc_tcp_nodelay=false; internode_authenticator=null;
internode_compression=dc; internode_recv_buff_size_in_bytes=0;
internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647;
key_cache_save_period=14400; key_cache_size_in_mb=null;
listen_address=10.100.100.213; listen_interface=null;
listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false;
max_hint_window_in_ms=1080; max_hints_delivery_threads=2;
max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null;
max_streaming_retries=3; max_value_size_in_mb=256;
memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null;
memtable_flush_writers=0; memtable_heap_space_in_mb=null;
memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50;
native_transport_max_concurrent_connections=-1;
native_transport_max_concurrent_connections_per_ip=-1;
native_transport_max_frame_size_in_mb=256;
native_transport_max_threads=128; native_transport_port=9042;
native_transport_port_ssl=null; num_tokens=256;
otc_coalescing_strategy=TIMEHORIZON; otc_coalescing_window_us=200;
partitioner=org.apache.cassandra.dht.Murmur3Partitioner;
permissions_cache_max_entries=1000;
permissions_update_interval_in_ms=-1; permissions_validity_in_ms=2000;
phi_convict_threshold=8.0; prepared_statements_cache_size_mb=null;
range_request_timeout_in_ms=600; read_request_timeout_in_ms=600;
request_scheduler=org.apache.cassandra.scheduler.NoScheduler;
request_scheduler_id=null; request_scheduler_options=null;
request_timeout_in_ms=600; role_manager=CassandraRoleManager;
roles_cache_max_entries=1000; roles_update_interval_in_ms=-1;
roles_validity_in_ms=2000;
row_cache_class_name=org.apache.cassandra.cache.OHCProvider;
row_cache_keys_to_save=2147483647; row_cache_save_period=0;
row_cache_size_in_mb=0; rpc_address=10.100.100.213; rpc_interface=null;
rpc_interface_prefer_ipv6=false; rpc_keepalive=true;
rpc_listen_backlog=50; rpc_max_threads=2147483647; rpc_min_threads=16;
rpc_port=9160; rpc_recv_buff_size_in_bytes=null;
rpc_send_buff_size_in_bytes=null; rpc_server_type=sync;

Re: Node always dying

2017-04-06 Thread benjamin roth
Have you checked the effective limits of a running CS process?
Is CS run as Cassandra? Just to rule out missing file perms.


On 06.04.2017 12:24, "Cogumelos Maravilha" <cogumelosmaravi...@sapo.pt> wrote:

From cassandra.yaml:
hints_directory: /mnt/cassandra/hints
data_file_directories:
- /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches

drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/

drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/

cassand+  2267 1 99 10:18 ?00:02:56 java
-Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

/dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt

On /etc/security/limits.conf

*   -   memlock  unlimited
*   -  nofile  10
*   -  nproc  32768
*   -  as   unlimited

On /etc/security/limits.d/cassandra.conf

cassandra  -  memlock  unlimited
cassandra  -  nofile   10
cassandra  -  as   unlimited
cassandra  -  nproc32768

On /etc/sysctl.conf

vm.max_map_count = 1048575

On /etc/sysctl.d/cassandra.conf

vm.max_map_count = 1048575
net.ipv4.tcp_keepalive_time=600
On /etc/pam.d/su
...
session    required   pam_limits.so
...

Distro is the current Ubuntu LTS.
Thanks



On 04/06/2017 10:39 AM, benjamin roth wrote:

Cassandra cannot write an SSTable to disk. Are you sure the disk/volume
where SSTables reside (normally /var/lib/cassandra/data) is writeable for
the CS user and has enough free space?
The CDC warning also implies that.
The other warnings indicate you are probably not running CS as root and you
did not set an appropriate limit for max open files. Running out of open
files can also be a reason for the IO error.

2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha :

> Hi list,
>
> I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are
> i3.xlarge (AWS) with 32GB RAM, 2 cores and an 885G SSD (LVM, XFS-formatted).
> I have one node that is always dying and I don't understand why. Can anyone
> give me some hints, please? All nodes use the same configuration.
>
> Thanks in advance.
>
> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
> IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
> CassandraDaemon.java:229 - Exception in thread
> Thread[MemtablePostFlush:22,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
> at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121]
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> 

Re: Node always dying

2017-04-06 Thread Cogumelos Maravilha
From cassandra.yaml:

hints_directory: /mnt/cassandra/hints
data_file_directories:
- /mnt/cassandra/data
commitlog_directory: /mnt/cassandra/commitlog
saved_caches_directory: /mnt/cassandra/saved_caches

drwxr-xr-x   3 cassandra cassandra   23 Apr  5 16:03 mnt/

drwxr-xr-x 6 cassandra cassandra  68 Apr  5 16:17 ./
drwxr-xr-x 3 cassandra cassandra  23 Apr  5 16:03 ../
drwxr-xr-x 2 cassandra cassandra  80 Apr  6 10:07 commitlog/
drwxr-xr-x 8 cassandra cassandra 124 Apr  5 16:17 data/
drwxr-xr-x 2 cassandra cassandra  72 Apr  5 16:20 hints/
drwxr-xr-x 2 cassandra cassandra  49 Apr  5 20:17 saved_caches/

cassand+  2267 1 99 10:18 ?00:02:56 java
-Xloggc:/var/log/cassandra/gc.log -ea -XX:+UseThreadPriorities -XX:Threa...

/dev/mapper/um_vg-xfs_lv  885G   27G  858G   4% /mnt

On /etc/security/limits.conf

*   -   memlock  unlimited
*   -  nofile  10
*   -  nproc  32768
*   -  as   unlimited

On /etc/security/limits.d/cassandra.conf

cassandra  -  memlock  unlimited
cassandra  -  nofile   10
cassandra  -  as   unlimited
cassandra  -  nproc32768

On /etc/sysctl.conf

vm.max_map_count = 1048575

On /etc/sysctl.d/cassandra.conf

vm.max_map_count = 1048575
net.ipv4.tcp_keepalive_time=600
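
A hedged way to confirm the kernel actually picked those two values up is
simply to read them back:

    sysctl vm.max_map_count net.ipv4.tcp_keepalive_time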

On /etc/pam.d/su
...
session    required   pam_limits.so
...

Distro is the current Ubuntu LTS.
Thanks


On 04/06/2017 10:39 AM, benjamin roth wrote:
> Cassandra cannot write an SSTable to disk. Are you sure the
> disk/volume where SSTables reside (normally /var/lib/cassandra/data)
> is writeable for the CS user and has enough free space?
> The CDC warning also implies that.
> The other warnings indicate you are probably not running CS as root
> and you did not set an appropriate limit for max open files. Running
> out of open files can also be a reason for the IO error.
>
> 2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha:
>
> Hi list,
>
> I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are
> i3.xlarge (AWS) with 32GB RAM, 2 cores and an 885G SSD (LVM, XFS-formatted).
> I have one node that is always dying and I don't understand why. Can anyone
> give me some hints, please? All nodes use the same configuration.
>
> Thanks in advance.
>
> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
> IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
> CassandraDaemon.java:229 - Exception in thread
> Thread[MemtablePostFlush:22,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output error
> at org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153) ~[apache-cassandra-3.10.jar:3.10]
> at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086) ~[apache-cassandra-3.10.jar:3.10]
> at

Re: Node always dying

2017-04-06 Thread benjamin roth
Cassandra cannot write an SSTable to disk. Are you sure the disk/volume
where SSTables reside (normally /var/lib/cassandra/data) is writeable for
the CS user and has enough free space?
The CDC warning also implies that.
The other warnings indicate you are probably not running CS as root and you
did not set an appropriate limit for max open files. Running out of open
files can also be a reason for the IO error.
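
A quick, hedged way to test both points from the shell (the /mnt/cassandra
paths are taken from the cassandra.yaml excerpts in this thread; adjust them if
your data_file_directories differ):

    # Can the cassandra user actually create a file in the data directory?
    sudo -u cassandra touch /mnt/cassandra/data/.write_test \
        && echo writable \
        && sudo -u cassandra rm /mnt/cassandra/data/.write_test
    # Is there free space on the volume?
    df -h /mnt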

2017-04-06 11:34 GMT+02:00 Cogumelos Maravilha :

> Hi list,
>
> I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are
> i3.xlarge (AWS) with 32GB RAM, 2 cores and an 885G SSD (LVM, XFS-formatted).
> I have one node that is always dying and I don't understand why. Can anyone
> give me some hints, please? All nodes use the same configuration.
>
> Thanks in advance.
>
> INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
> IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
> CassandraDaemon.java:229 - Exception in thread
> Thread[MemtablePostFlush:22,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output
> error
> at
> org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(
> SequentialWriter.java:173)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.util.SequentialWriter.syncInternal(
> SequentialWriter.java:185)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(
> CompressedSequentialWriter.java:38)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.compress.CompressedSequentialWriter$
> TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.
> prepareToCommit(Transactional.java:173)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(
> SequentialWriter.java:358)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.sstable.format.big.BigTableWriter$
> TransactionalProxy.doPrepare(BigTableWriter.java:367)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.
> prepareToCommit(Transactional.java:173)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.sstable.format.SSTableWriter.
> prepareToCommit(SSTableWriter.java:281)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(
> SimpleSSTableMultiWriter.java:101)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(
> ColumnFamilyStore.java:1153)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(
> ColumnFamilyStore.java:1086)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> ~[na:1.8.0_121]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> [na:1.8.0_121]
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.
> lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> [apache-cassandra-3.10.jar:3.10]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> Caused by: java.io.IOException: Input/output error
> at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
> at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
> ~[na:1.8.0_121]
> at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
> ~[na:1.8.0_121]
> at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(
> SequentialWriter.java:169)
> ~[apache-cassandra-3.10.jar:3.10]
> ... 15 common frames omitted
> INFO  [IndexSummaryManager:1] 2017-04-06 06:22:18,366
> IndexSummaryRedistribution.java:75 - Redistributing index summaries
> ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525
> CassandraDaemon.java:229 - Exception in thread
> Thread[MemtablePostFlush:31,5,main]
> org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output
> error
> at
> org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(
> SequentialWriter.java:173)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.util.SequentialWriter.syncInternal(
> SequentialWriter.java:185)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(
> CompressedSequentialWriter.java:38)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> org.apache.cassandra.io.compress.CompressedSequentialWriter$
> TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
> ~[apache-cassandra-3.10.jar:3.10]
> at
> 

Node always dying

2017-04-06 Thread Cogumelos Maravilha
Hi list,

I'm using C* 3.10 in a 6-node cluster with RF=2. All instances are
i3.xlarge (AWS) with 32GB RAM, 2 cores and an 885G SSD (LVM, XFS-formatted).
I have one node that is always dying and I don't understand why. Can anyone
give me some hints, please? All nodes use the same configuration.

Thanks in advance.

INFO  [IndexSummaryManager:1] 2017-04-06 05:22:18,352
IndexSummaryRedistribution.java:75 - Redistributing index summaries
ERROR [MemtablePostFlush:22] 2017-04-06 06:00:26,800
CassandraDaemon.java:229 - Exception in thread
Thread[MemtablePostFlush:22,5,main]
org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output
error
at
org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.prepareToCommit(SimpleSSTableMultiWriter.java:101)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1153)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1086)
~[apache-cassandra-3.10.jar:3.10]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
~[na:1.8.0_121]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[na:1.8.0_121]
at
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
[apache-cassandra-3.10.jar:3.10]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
Caused by: java.io.IOException: Input/output error
at sun.nio.ch.FileDispatcherImpl.force0(Native Method) ~[na:1.8.0_121]
at sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:76)
~[na:1.8.0_121]
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:388)
~[na:1.8.0_121]
at org.apache.cassandra.utils.SyncUtil.force(SyncUtil.java:158)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:169)
~[apache-cassandra-3.10.jar:3.10]
... 15 common frames omitted
INFO  [IndexSummaryManager:1] 2017-04-06 06:22:18,366
IndexSummaryRedistribution.java:75 - Redistributing index summaries
ERROR [MemtablePostFlush:31] 2017-04-06 06:39:19,525
CassandraDaemon.java:229 - Exception in thread
Thread[MemtablePostFlush:31,5,main]
org.apache.cassandra.io.FSWriteError: java.io.IOException: Input/output
error
at
org.apache.cassandra.io.util.SequentialWriter.syncDataOnlyInternal(SequentialWriter.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.util.SequentialWriter.syncInternal(SequentialWriter.java:185)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.compress.CompressedSequentialWriter.access$100(CompressedSequentialWriter.java:38)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.compress.CompressedSequentialWriter$TransactionalProxy.doPrepare(CompressedSequentialWriter.java:307)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.util.SequentialWriter.prepareToCommit(SequentialWriter.java:358)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.sstable.format.big.BigTableWriter$TransactionalProxy.doPrepare(BigTableWriter.java:367)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.prepareToCommit(Transactional.java:173)
~[apache-cassandra-3.10.jar:3.10]
at
org.apache.cassandra.io.sstable.format.SSTableWriter.prepareToCommit(SSTableWriter.java:281)
~[apache-cassandra-3.10.jar:3.10]
at