Re: Cassandra upgrade from 2.1 to 3.0

2018-05-12 Thread Jeff Jirsa
I haven't seen this before, but I have a guess.

What client/driver are you using?

Are you using a prepared statement that has every column listed for the
update, and leaving the un-set columns as null? If so, the null is being
translated into a delete, which is clearly not what you want.

The differentiation between UNSET and NULL went into 2.2
( https://issues.apache.org/jira/browse/CASSANDRA-7304 ), and most drivers
have been updated to know the difference
( https://github.com/gocql/gocql/issues/861 ,
https://datastax-oss.atlassian.net/browse/JAVA-777 , etc.). I haven't read
the patch for 7304, but I suspect there's some sort of mixup
along the way (maybe in your driver, or maybe you upgraded the driver to
support 3.0 and picked up a new feature you didn't realize you picked up,
etc.)
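
[Editor's note: a minimal sketch of the UNSET-vs-NULL difference as it surfaces in the
DataStax Java driver 3.x. The keyspace, table, and column names here are illustrative
placeholders, not taken from the original report.]

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class UnsetVsNull {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_ks")) {          // hypothetical keyspace

            PreparedStatement ps = session.prepare(
                "UPDATE users SET created_by = ?, industry = ? WHERE id = ?");

            BoundStatement bs = ps.bind()
                .setString("created_by", "someone")
                .setString("id", "12345");

            // Binding an explicit null writes a tombstone, deleting whatever was there:
            //   bs.setToNull("industry");
            // Leaving the value UNSET (protocol v4, Cassandra 2.2+) leaves the stored
            // value untouched:
            bs.unset("industry");

            session.execute(bs);
        }
    }
}

With driver 3.x and protocol v4, values you never bind are left unset by default, so
the tombstones only appear if the application (or an older driver) binds nulls explicitly.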


On Fri, May 11, 2018 at 11:26 AM, kooljava2 
wrote:

> After further analyzing the data, I see a pattern: for the rows which were
> updated in the last 2-3 weeks, the columns which were not part of that update
> have null values.
>
> Has anyone encountered this issue during the upgrade?
>
>
> Thank you,
>
>
> On Thursday, 10 May 2018, 19:49:50 GMT-7, kooljava2
>  wrote:
>
>
> Hello Jeff,
>
> 2.1.19 to 3.0.15.
>
> Thank you.
>
> On Thursday, 10 May 2018, 17:43:58 GMT-7, Jeff Jirsa 
> wrote:
>
>
> Which minor version of 3.0?
>
> --
> Jeff Jirsa
>
>
> On May 11, 2018, at 2:54 AM, kooljava2 
> wrote:
>
>
> Hello,
>
> Upgraded Cassandra 2.1 to 3.0. We see certain data in a few columns being
> set to "null". These columns were populated at row creation time.
>
> After looking at the data, I see a pattern where an update was done on these
> rows. Rows which were updated have data, but rows which were not part of the
> update are set to null.
>
>  created_on | created_by | id
> ------------+------------+-------
>        null |       null | 12345
>
>
>
> sstabledump:-
>
> WARN  20:47:38,741 Small cdc volume detected at
> /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 1278.  You can
> override this in cassandra.yaml
> [
>   {
> "partition" : {
>   "key" : [ "12345" ],
>   "position" : 5155159
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 5168738,
> "deletion_info" : { "marked_deleted" :
> "2018-03-28T20:38:08.05Z", "local_delete_time" : "2018-03-28T20:38:08Z"
> },
> "cells" : [
>   { "name" : "doc_type", "value" : false, "tstamp" :
> "2018-03-28T20:38:08.060Z" },
>   { "name" : "industry", "deletion_info" : { "local_delete_time" :
> "2018-03-28T20:38:08Z" },
> "tstamp" : "2018-03-28T20:38:08.060Z"
>   },
>   { "name" : "last_modified_by", "value" : "12345", "tstamp" :
> "2018-03-28T20:38:08.060Z" },
>   { "name" : "last_modified_date", "value" : "2018-03-28
> 20:38:08.059Z", "tstamp" : "2018-03-28T20:38:08.060Z" },
>   { "name" : "locale", "deletion_info" : { "local_delete_time" :
> "2018-03-28T20:38:08Z" },
> "tstamp" : "2018-03-28T20:38:08.060Z"
>   },
>   { "name" : "postal_code", "deletion_info" : {
> "local_delete_time" : "2018-03-28T20:38:08Z" },
> "tstamp" : "2018-03-28T20:38:08.060Z"
>   },
>   { "name" : "ticket", "deletion_info" : { "marked_deleted" :
> "2018-03-28T20:38:08.05Z", "local_delete_time" : "2018-03-28T20:38:08Z"
> } },
>   { "name" : "ticket", "path" : [ "TEMP_DATA" ], "value" :
> "{\"name\":\"TEMP_DATA\",\"ticket\":\"a42638dae8350e889f2603be1427ac
> 6f5dec5e486d4db164a76bf80820cdf68d635cff5e7d555e6d4eabb9b5b8
> 2597b68bec0fcd735fcca\",\"lastRenewedDate\":\"2018-03-28T20:38:08Z\"}",
> "tstamp" : "2018-03-28T20:38:08.060Z" },
>   { "name" : "ticket", "path" : [ "TEMP_TEMP2" ], "value" :
> "{\"name\":\"TEMP_TEMP2\",\"ticket\":\"a4263b7350d1f2683\"
> ,\"lastRenewedDate\":\"2018-03-28T20:38:07Z\"}", "tstamp" :
> "2018-03-28T20:38:08.060Z" },
>   { "name" : "ppstatus_pf", "deletion_info" : { "marked_deleted" :
> "2018-03-28T20:38:08.05Z", "local_delete_time" : "2018-03-28T20:38:08Z"
> } },
>   { "name" : "ppstatus_pers", "deletion_info" : { "marked_deleted"
> : "2018-03-28T20:38:08.05Z", "local_delete_time" :
> "2018-03-28T20:38:08Z" } }
> ]
>   }
> ]
>   }
> ]
> WARN  20:47:41,325 Small cdc volume detected at
> /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 1278.  You can
> override this in cassandra.yaml
> [
>   {
> "partition" : {
>   "key" : [ "12345" ],
>   "position" : 18743072
> },
> "rows" : [
>   {
> "type" : "row",
> "position" : 18751808,
> "liveness_info" : { "tstamp" : "2017-10-25T10:22:41.612Z" },
> "cells" : [
>   

Re: Error after 3.1.0 to 3.11.2 upgrade

2018-05-12 Thread Jeff Jirsa
RF of one means all auth requests go to the same node, so they’re more likely 
to time out if that host is overloaded or restarts.

Increasing it distributes the queries among more hosts.
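
[Editor's note: the usual fix is to raise system_auth's replication factor and then
repair it. A sketch assuming the DataStax Java driver 3.x; the contact point and the
datacenter name 'dc1' are placeholders for your own topology.]

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RaiseSystemAuthRf {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Spread auth data over more replicas so auth reads no longer all
            // hit a single node. 'dc1' is a placeholder datacenter name.
            session.execute("ALTER KEYSPACE system_auth WITH replication = "
                    + "{'class': 'NetworkTopologyStrategy', 'dc1': 3}");
        }
        // After the ALTER, run `nodetool repair system_auth` on each node so the
        // existing roles/permissions are actually copied to the new replicas.
    }
}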


-- 
Jeff Jirsa


> On May 12, 2018, at 6:11 AM, Abdul Patel  wrote:
> 
> Yeah, found that all had a replication factor of 3 and system_auth had 1; changed 
> it to 3 now. So was this issue due to the system_auth replication factor mismatch?
> 
>> On Saturday, May 12, 2018, Hannu Kröger  wrote:
>> Hi,
>> 
>> Did you check the replication strategy and the number of replicas of the system_auth 
>> keyspace?
>> 
>> Hannu
>> 
>>> Abdul Patel  wrote on 12.5.2018 at 5.21:
>>> 
>>> No, the application isn't impacted.. no complaints..
>>> Also it's a 4-node cluster in a lower, non-production environment and all are on the same 
>>> version.
>>> 
 On Friday, May 11, 2018, Jeff Jirsa  wrote:
 The read is timing out - is the cluster healthy? Is it fully upgraded or 
 mixed versions? Repeated isn’t great, but is the application impacted? 
 
 -- 
 Jeff Jirsa
 
 
> On May 12, 2018, at 6:17 AM, Abdul Patel  wrote:
> 
> Seems it's coming from 3.10; got a bunch of them today on 3.11.2, so if 
> this keeps recurring, what's the solution for this?
> 
> WARN  [Native-Transport-Requests-24] 2018-05-11 16:46:20,938 
> CassandraAuthorizer.java:96 - CassandraAuthorizer failed to authorize 
> # for 
> ERROR [Native-Transport-Requests-24] 2018-05-11 16:46:20,940 
> ErrorMessage.java:384 - Unexpected exception during request
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out 
> - received only 0 responses.
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) 
> ~[guava-18.0.jar:na]
> at com.google.common.cache.LocalCache.get(LocalCache.java:3937) 
> ~[guava-18.0.jar:na]
> at 
> com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) 
> ~[guava-18.0.jar:na]
> at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
>  ~[guava-18.0.jar:na]
> at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.auth.PermissionsCache.getPermissions(PermissionsCache.java:45)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.auth.AuthenticatedUser.getPermissions(AuthenticatedUser.java:104)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:439) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.ClientState.checkPermissionOnResourceChain(ClientState.java:368)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:345)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:332) 
> ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:310)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.checkAccess(SelectStatement.java:260)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:221)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:530)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:507)
>  ~[apache-cassandra-3.11.2.jar:3.11.2]
> 
>> On Fri, May 11, 2018 at 8:30 PM, Jeff Jirsa  wrote:
>> That looks like Cassandra 3.10 not 3.11.2
>> 
>> It’s also just the auth cache failing to refresh - if it’s transient 
>> it’s probably not a big deal. If it continues then there may be an issue 
>> with the cache refresher.
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On May 12, 2018, at 5:55 AM, Abdul Patel  wrote:
>>> 
>>> HI All,
>>> 
>>> Seen the below stack trace messages in the error log one day after the upgrade.
>>> One of the blogs said this might be due to old drivers, but not sure about 
>>> it.
>>> 
>>> FYI :
>>> 
>>> INFO  [HANDSHAKE-/10.152.205.150] 2018-05-09 10:22:27,160 
>>> OutboundTcpConnection.java:510 - Handshaking version with 
>>> /10.152.205.150
>>> 

Re: Error after 3.1.0 to 3.11.2 upgrade

2018-05-12 Thread Abdul Patel
Yeah, found that all had a replication factor of 3 and system_auth had 1;
changed it to 3 now. So was this issue due to the system_auth replication factor
mismatch?

On Saturday, May 12, 2018, Hannu Kröger  wrote:

> Hi,
>
> Did you check the replication strategy and the number of replicas of the system_auth
> keyspace?
>
> Hannu
>
> Abdul Patel  wrote on 12.5.2018 at 5.21:
>
> No, the application isn't impacted.. no complaints..
> Also it's a 4-node cluster in a lower, non-production environment and all are on the same
> version.
>
> On Friday, May 11, 2018, Jeff Jirsa  wrote:
>
>> The read is timing out - is the cluster healthy? Is it fully upgraded or
>> mixed versions? Repeated isn’t great, but is the application impacted?
>>
>> --
>> Jeff Jirsa
>>
>>
>> On May 12, 2018, at 6:17 AM, Abdul Patel  wrote:
>>
>> Seems it's coming from 3.10; got a bunch of them today on 3.11.2, so if
>> this keeps recurring, what's the solution for this?
>>
>> WARN  [Native-Transport-Requests-24] 2018-05-11 16:46:20,938
>> CassandraAuthorizer.java:96 - CassandraAuthorizer failed to authorize
>> # for 
>> ERROR [Native-Transport-Requests-24] 2018-05-11 16:46:20,940
>> ErrorMessage.java:384 - Unexpected exception during request
>> com.google.common.util.concurrent.UncheckedExecutionException:
>> java.lang.RuntimeException: 
>> org.apache.cassandra.exceptions.ReadTimeoutException:
>> Operation timed out - received only 0 responses.
>> at 
>> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
>> ~[guava-18.0.jar:na]
>> at com.google.common.cache.LocalCache.get(LocalCache.java:3937)
>> ~[guava-18.0.jar:na]
>> at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)
>> ~[guava-18.0.jar:na]
>> at 
>> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
>> ~[guava-18.0.jar:na]
>> at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.auth.PermissionsCache.getPermissions(PermissionsCache.java:45)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.auth.AuthenticatedUser.getPermissions(AuthenticatedUser.java:104)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.service.ClientState.authorize(ClientState.java:439)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at org.apache.cassandra.service.ClientState.checkPermissionOnRe
>> sourceChain(ClientState.java:368) ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:345)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:332)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:310)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.cql3.statements.SelectStatement.checkAccess(SelectStatement.java:260)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:221)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:530)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>> at 
>> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:507)
>> ~[apache-cassandra-3.11.2.jar:3.11.2]
>>
>> On Fri, May 11, 2018 at 8:30 PM, Jeff Jirsa  wrote:
>>
>>> That looks like Cassandra 3.10 not 3.11.2
>>>
>>> It’s also just the auth cache failing to refresh - if it’s transient
>>> it’s probably not a big deal. If it continues then there may be an issue
>>> with the cache refresher.
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On May 12, 2018, at 5:55 AM, Abdul Patel  wrote:
>>>
>>> HI All,
>>>
>>> Seen the below stack trace messages in the error log one day after the upgrade.
>>> One of the blogs said this might be due to old drivers, but not sure about
>>> it.
>>>
>>> FYI :
>>>
>>> INFO  [HANDSHAKE-/10.152.205.150] 2018-05-09 10:22:27,160
>>> OutboundTcpConnection.java:510 - Handshaking version with /
>>> 10.152.205.150
>>> DEBUG [MessagingService-Outgoing-/10.152.205.150-Gossip] 2018-05-09
>>> 10:22:27,160 OutboundTcpConnection.java:482 - Done connecting to /
>>> 10.152.205.150
>>> ERROR [Native-Transport-Requests-1] 2018-05-09 10:22:29,971
>>> ErrorMessage.java:384 - Unexpected exception during request
>>> com.google.common.util.concurrent.UncheckedExecutionException:
>>> com.google.common.util.concurrent.UncheckedExecutionException:
>>> java.lang.RuntimeException: 
>>> org.apache.cassandra.exceptions.UnavailableException:
>>> Cannot achieve consistency level LOCAL_ONE
>>> at 
>>> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
>>> 

Re: Insert-only application repair

2018-05-12 Thread Jeff Jirsa
In a TTL-only use case with no explicit deletes, if read CL + write CL > RF you 
can likely avoid repairs, with a few huge caveats:

1) read repair may mess up your TTL expiration if you’re using TWCS
2) if you lose a host you probably need to run repairs or you may not see some 
data after replacement (true in general)
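
[Editor's note: to make the overlap condition concrete, a small illustrative sketch.
The RF and consistency-level counts below are examples, not values from this thread.]

public class ConsistencyOverlap {
    // True when every read is guaranteed to include at least one replica that
    // acknowledged the write - the condition under which an upsert-only,
    // no-delete workload can largely skip anti-entropy repair.
    static boolean readsSeeWrites(int writeReplicas, int readReplicas, int rf) {
        return writeReplicas + readReplicas > rf;
    }

    public static void main(String[] args) {
        System.out.println(readsSeeWrites(2, 2, 3)); // QUORUM writes + QUORUM reads, RF=3 -> true
        System.out.println(readsSeeWrites(1, 1, 3)); // ONE writes + ONE reads, RF=3 -> false
    }
}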

-- 
Jeff Jirsa


> On May 12, 2018, at 5:27 AM, onmstester onmstester  
> wrote:
> 
> Thank you Nitan,
> That's exactly my case (RF > CL). But as long as there is no node outage, 
> shouldn't the hinted handoff handle data consistency?
> 
> Sent using Zoho Mail
> 
> 
> 
>  On Sat, 12 May 2018 16:26:13 +0430 Nitan Kainth  
> wrote 
> 
> If you have RF>CL then Repair needs to be run to make sure data is in sync. 
> 
> Sent from my iPhone
> 
> On May 12, 2018, at 3:54 AM, onmstester onmstester  
> wrote:
> 
> 
> In an insert-only use case with TTL (6 months), should I run this command 
> every 5-7 days on all the nodes of the production cluster (according to this: 
> http://cassandra.apache.org/doc/latest/operating/repair.html )?
> nodetool repair -pr --full
> When none of the nodes has been down in 4 months (ever since the cluster was 
> launched) and none of the rows has been deleted, why should I run nodetool repair?
> 
> 


Re: Insert-only application repair

2018-05-12 Thread onmstester onmstester
Thank you Nitan,

That's exactly my case (RF > CL). But as long as there is no node outage, 
shouldn't the hinted handoff handle data consistency?


Sent using Zoho Mail

 On Sat, 12 May 2018 16:26:13 +0430 Nitan Kainth 
nitankai...@gmail.com wrote 

If you have RF > CL then repair needs to be run to make sure data is in sync.

Sent from my iPhone

On May 12, 2018, at 3:54 AM, onmstester onmstester onmstes...@zoho.com 
wrote:

In an insert-only use case with TTL (6 months), should I run this command 
every 5-7 days on all the nodes of the production cluster (according to this: 
http://cassandra.apache.org/doc/latest/operating/repair.html )?

nodetool repair -pr --full

When none of the nodes has been down in 4 months (ever since the cluster was 
launched) and none of the rows has been deleted, why should I run nodetool repair?


Re: Insert-only application repair

2018-05-12 Thread Nitan Kainth
If you have RF>CL then Repair needs to be run to make sure data is in sync. 

Sent from my iPhone

> On May 12, 2018, at 3:54 AM, onmstester onmstester  
> wrote:
> 
> 
> In an insert-only use case with TTL (6 months), should I run this command 
> every 5-7 days on all the nodes of the production cluster (according to this: 
> http://cassandra.apache.org/doc/latest/operating/repair.html )?
> nodetool repair -pr --full
> When none of the nodes has been down in 4 months (ever since the cluster was 
> launched) and none of the rows has been deleted, why should I run nodetool repair?
> 


Insert-only application repair

2018-05-12 Thread onmstester onmstester


In an insert-only use case with TTL (6 months), should I run this command 
every 5-7 days on all the nodes of the production cluster (according to this: 
http://cassandra.apache.org/doc/latest/operating/repair.html )?

nodetool repair -pr --full

When none of the nodes has been down in 4 months (ever since the cluster was 
launched) and none of the rows has been deleted, why should I run nodetool repair?