Re: Cassandra 311x broken on Debian docker images

2020-05-18 Thread James Shaw
Go to the configuration directory (e.g. /var/lib/cassandra/conf) and look at
logback.xml to see whether the logs are configured to go to a different
directory. Also, do you see output.log or debug.log in /var/log/cassandra/?

Or you may use the Linux find command to search, as sketched below.
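
A minimal sketch, assuming the conf and log paths above (adjust them to match
your image):

  # see which files logback.xml actually writes to
  grep -n '<file>' /var/lib/cassandra/conf/logback.xml

  # search the container filesystem for any Cassandra log files
  find / -name 'system.log' -o -name 'debug.log' 2>/dev/null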


On Mon, May 18, 2020 at 10:51 PM Robert Snakard
 wrote:

> docker logs just outputs stderr and stdout. It doesn't show anything more
> than what I put in top email
>
> On Mon, May 18, 2020 at 7:42 PM James Shaw  wrote:
>
>> docker logs ... see any error in docker/container logs ?
>>
>> On Mon, May 18, 2020 at 10:27 PM Robert Snakard
>>  wrote:
>>
>>> # cat /var/log/cassandra/system.log
>>> =>  cat: system.log: No such file or directory
>>>
>>> I've also checked other possible locations. Since this is
>>> occurring before startup no logs are created
>>>
>>> On Mon, May 18, 2020 at 7:03 PM Erick Ramirez <
>>> erick.rami...@datastax.com> wrote:
>>>
>>>> Can you inspect the C* system.log? It might give clues for the startup
>>>> or it might point to another problem. Cheers!
>>>>
>>>
>>
>>


Re: Cassandra 311x broken on Debian docker images

2020-05-18 Thread James Shaw
docker logs ... see any error in docker/container logs ?

On Mon, May 18, 2020 at 10:27 PM Robert Snakard
 wrote:

> # cat /var/log/cassandra/system.log
> =>  cat: system.log: No such file or directory
>
> I've also checked other possible locations. Since this is occurring before
> startup no logs are created
>
> On Mon, May 18, 2020 at 7:03 PM Erick Ramirez 
> wrote:
>
>> Can you inspect the C* system.log? It might give clues for the startup
>> or it might point to another problem. Cheers!
>>
>


Re: TEST Cluster corrupt after removenode. how to restore

2020-05-18 Thread James Shaw
Do you mean that you want to fix the corrupt-SSTable errors and don't mind
losing the test data? You may run nodetool scrub on the affected keyspace and
table, or nodetool upgradesstables -a (-a re-writes the SSTables to the
current version).
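
A rough sketch of both options (keyspace and table names are placeholders for
whatever is reported in the corrupt-SSTable errors):

  # rebuild the SSTables, skipping rows that cannot be read
  nodetool scrub <keyspace> <table>

  # or re-write every SSTable of the table to the current format
  nodetool upgradesstables -a <keyspace> <table>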

Thanks,

James

On Mon, May 18, 2020 at 12:54 PM Leena Ghatpande 
wrote:

> Running cassandra 3.7
> our TEST cluster has 6 nodes, 3 in each data center
>
> replication factor 2 for keyspaces.
>
> we added 1 new node in each data center for testing making it 8 node
> cluster.
>
> We decided to remove the 2 new nodes from cluster, but instead of
> decommission, the admin just deleted the data folder by mistake.
>
> so we ran a nodetool removenode for the 2 nodes.
> ran a cleanup and full repair on all remaining nodes.
>
> But now the whole cluster is corrupt. data returns inconsistent results
> and we are getting corrupt sstable errors
>
> Is there a way to cleanly recover the data? we do not have a old snapshot.
>


Re: Truncate Materialized View

2020-05-15 Thread James Shaw
Surbhi:
  I don't think you can truncate a materialized view.
What exact error did you get? If you think it is the same as that bug, then
you may try to avoid the condition that triggers it. It mentions pending
hints, so you may let all hints be applied first and then try dropping the
view.
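
A quick way to check whether hints are still pending before attempting the
drop again (default package hints directory assumed; run on every node):

  # any files here are hints that have not been delivered yet
  ls /var/lib/cassandra/hints/

  # the hints-related thread pools should show 0 pending/active tasks
  nodetool tpstats | grep -i hint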

Thanks,

James

On Fri, May 15, 2020 at 1:35 PM Surbhi Gupta 
wrote:

> Anyone has truncated materialized views ?
>
> On Thu, 14 May 2020 at 11:59, Surbhi Gupta 
> wrote:
>
>> Hi,
>>
>> We are on 3.11.0 .
>> We have 11 Materialized view on a table.
>> After discussion with application team , we found out that they are using
>> only 4 out of 11 .
>> We tried to drop the materialized view and got hit by the bug
>> https://issues.apache.org/jira/browse/CASSANDRA-13696 which made our
>> whole cluster unstable and multiple nodes were down at the same time .
>>
>> We will upgrade this cluster but my question is , can we truncate the
>> Materialized view rather than dropping?
>>
>> How will it impact ?
>> Does update into base table , updates all materialized views (This is my
>> understanding) ?
>> If yes, then truncate the data from the materialized view will create too
>> many read repairs later, correct?
>> Please correct me if I am wrong .
>>
>> Thanks
>> Surbhi
>>
>>


Re: monitor and alert tool for open source cassandra

2018-09-24 Thread James Shaw
Adam:
   Thanks!
Very helpful. I will take a look.

James

On Mon, Sep 24, 2018 at 6:59 PM Adam Zegelin  wrote:

> Hi James,
>
> Prometheus is the most common monitoring solution for K8s-managed
> applications.
>
> There are a number of options to get Cassandra metrics into Prometheus.
> One of which, shameless plug, is something I've been working on for the
> past few months -- cassandra-exporter, a JVM agent that aims to be the
> easiest & fastest way to get Cassandra metrics into Prometheus.
>
> Check it out on GitHub: https://github.com/zegelin/cassandra-exporter
>
> cassandra-exporter is part of Instaclustrs current work-in-progress
> cassandra-operator for Kubernetes.
> If you're interested in running Cassandra on a K8s cluster and would like
> easy scale up/scale down, automatic Prometheus integration, automatic
> backups, etc, then an operator makes it a lot easier to manage.
> Check it out on GitHub: https://github.com/instaclustr/cassandra-operator
> The project is approaching MVP status, and we would certainly appreciate
> any feedback or contributions.
>
> Regards,
> Adam
>
>
>
> On Mon, 24 Sep 2018 at 14:32, James Shaw  wrote:
>
>> Hi, there:
>>What are latest good tools for monitoring open source cassandra ?
>> I was used to Datastax opscenter tool, felt all tasks quite easy. Now on
>> new project, open source cassandra, on Kubernetes container/docker, logs in
>> Splunk,  feel very challenge.
>> Most wanted metrics are read / write latency, read / write time out.
>>
>> Any advice is welcome.  I appreciate your help!
>>
>> Thanks,
>>
>> James
>>
>


monitor and alert tool for open source cassandra

2018-09-24 Thread James Shaw
Hi there:
   What are the latest good tools for monitoring open-source Cassandra?
I was used to the DataStax OpsCenter tool and found all tasks quite easy. Now
I am on a new project with open-source Cassandra on Kubernetes
containers/Docker and logs in Splunk, and it feels very challenging.
The metrics I want most are read/write latency and read/write timeouts.

Any advice is welcome. I appreciate your help!

Thanks,

James


Re: Cassandra 2.2.7 Compaction after Truncate issue

2018-08-23 Thread James Shaw
You may go to the OS level and delete the files; that's what I did before.
The truncate action frequently fails on some remote nodes in a
heavy-transaction environment.
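
A minimal sketch of the OS-level cleanup, assuming the default package data
path (the keyspace/table names and the table UUID suffix are placeholders);
stop Cassandra on the node first, or be certain the truncate already
completed locally, and leave the snapshots/ subdirectory alone:

  cd /var/lib/cassandra/data/<keyspace>/<table>-<uuid>/
  ls -lh                                       # confirm these SSTables only hold truncated data
  rm -f ./*.db ./*.txt ./*.crc32 ./*.adler32   # keeps the snapshots/ subdirectory intact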

Thanks,

James

On Thu, Aug 23, 2018 at 8:54 PM, Rahul Singh 
wrote:

> David ,
>
> What CL do you set when running this command?
>
> Rahul Singh
> Chief Executive Officer
> m 202.905.2818
>
> Anant Corporation
> 1010 Wisconsin Ave NW, Suite 250
> 
> Washington, D.C. 20007
>
> We build and manage digital business technology platforms.
> On Aug 14, 2018, 11:49 AM -0500, David Payne , wrote:
>
> Scenario: Cassandra 2.2.7, 3 nodes, RF=3 keyspace.
>
>
>
> Truncate a table.
>
> More than 24 hours later… FileCacheService is still reporting cold readers
> for sstables of truncated data for node 2 and 3, but not node 1.
>
> The output of nodeool compactionstats shows stuck compaction for the
> truncated table for node 2 and 3, but not node 1.
>
>
>
> This appears to be a defect that was fixed in 2.1.0.
> https://issues.apache.org/jira/browse/CASSANDRA-7803
>
>
>
> Any ideas?
>
>
>
> Thanks,
>
> David Payne
>
> | ̄ ̄|
> _☆☆☆_
> ( ´_⊃`)
>
> c. 303-717-0548
>
> dav...@cqg.com
>
>
>
>


Re: duplicate rows for partition

2018-08-22 Thread James Shaw
Can you run this:
select associate_degree, writetime(associate_degree) from user_data where ...
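
For example, in cqlsh (the key values are the ones from the rows you posted):

  SELECT "tDate", "tid3", "sid4", "pid5", associate_degree,
         writetime(associate_degree)
  FROM user_data
  WHERE "userid" = '090sdfdsf898' AND "secondaryid" = 'ab984564';

If the writetimes of the two apparently identical rows differ, it at least
shows they were written at different times, which helps narrow down how the
duplicate appeared.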


Thanks,

James

On Wed, Aug 22, 2018 at 7:13 PM, James Shaw  wrote:

> can you run this:
> select writetime( associate_degree ) from user_data where 
> see what are writetime
>
> On Wed, Aug 22, 2018 at 7:03 PM, James Shaw  wrote:
>
>> interesting. what are insert statement and select statement ?
>>
>> Thanks,
>>
>> James
>>
>> On Wed, Aug 22, 2018 at 6:55 PM, Gosar M 
>> wrote:
>>
>>> CREATE TABLE user_data (
>>> "userid" text,
>>> "secondaryid" text,
>>> "tDate" timestamp,
>>> "tid3" text,
>>> "sid4" text,
>>> "pid5" text,
>>> associate_degree text
>>>   PRIMARY KEY (("userid", "secondaryid"),"tDate", "tid3", "sid4",
>>> "pid5")
>>>   WITH CLUSTERING ORDER BY ("tDate" ASC, "tid3" ASC, "sid4" ASC, "pid5"
>>> ASC)
>>>
>>>
>>>
>>> On Wednesday, 22 August 2018, 15:08:03 GMT-7,
>>> dinesh.jo...@yahoo.com.INVALID  wrote:
>>>
>>>
>>> What is the schema of the table? Could your include the output of
>>> DESCRIBE?
>>>
>>> Dinesh
>>>
>>>
>>> On Wednesday, August 22, 2018, 2:22:31 PM PDT, Gosar M
>>>  wrote:
>>>
>>>
>>> Hello,
>>>
>>> Have a table with following partition and clustering keys
>>>
>>> partition key - ("userid", "secondaryid"),
>>> clustering key - "tDate", "tid3", "sid4", "pid5"
>>>
>>> Data is inserted based on above partition and clustering key. For 1
>>> record seeing 2 rows returned when queried by both partition and clustering
>>> key.
>>>
>>>
>>>  userid  | secondaryid  | tdate   | tid3  | sid4
>>> | pid5| associate_degree
>>>  --+
>>> -+
>>>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34
>>> | l34_listing |   123145979615694
>>>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34
>>> | l34_listing |   123145979615694989
>>>
>>>
>>> We did not had any node which was down longer than gc_grace_period.
>>>
>>>
>>> Thank you.
>>>
>>
>>
>


Re: duplicate rows for partition

2018-08-22 Thread James Shaw
Can you run this:
select writetime(associate_degree) from user_data where ...
and see what the writetimes are.

On Wed, Aug 22, 2018 at 7:03 PM, James Shaw  wrote:

> interesting. what are insert statement and select statement ?
>
> Thanks,
>
> James
>
> On Wed, Aug 22, 2018 at 6:55 PM, Gosar M 
> wrote:
>
>> CREATE TABLE user_data (
>> "userid" text,
>> "secondaryid" text,
>> "tDate" timestamp,
>> "tid3" text,
>> "sid4" text,
>> "pid5" text,
>> associate_degree text
>>   PRIMARY KEY (("userid", "secondaryid"),"tDate", "tid3", "sid4", "pid5")
>>   WITH CLUSTERING ORDER BY ("tDate" ASC, "tid3" ASC, "sid4" ASC, "pid5"
>> ASC)
>>
>>
>>
>> On Wednesday, 22 August 2018, 15:08:03 GMT-7,
>> dinesh.jo...@yahoo.com.INVALID  wrote:
>>
>>
>> What is the schema of the table? Could your include the output of
>> DESCRIBE?
>>
>> Dinesh
>>
>>
>> On Wednesday, August 22, 2018, 2:22:31 PM PDT, Gosar M
>>  wrote:
>>
>>
>> Hello,
>>
>> Have a table with following partition and clustering keys
>>
>> partition key - ("userid", "secondaryid"),
>> clustering key - "tDate", "tid3", "sid4", "pid5"
>>
>> Data is inserted based on above partition and clustering key. For 1
>> record seeing 2 rows returned when queried by both partition and clustering
>> key.
>>
>>
>>  userid  | secondaryid  | tdate   | tid3  | sid4
>> | pid5| associate_degree
>>  --+
>> -+
>>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34
>> | l34_listing |   123145979615694
>>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34
>> | l34_listing |   123145979615694989
>>
>>
>> We did not had any node which was down longer than gc_grace_period.
>>
>>
>> Thank you.
>>
>
>


Re: duplicate rows for partition

2018-08-22 Thread James Shaw
Interesting. What are the insert statement and the select statement?

Thanks,

James

On Wed, Aug 22, 2018 at 6:55 PM, Gosar M 
wrote:

> CREATE TABLE user_data (
> "userid" text,
> "secondaryid" text,
> "tDate" timestamp,
> "tid3" text,
> "sid4" text,
> "pid5" text,
> associate_degree text
>   PRIMARY KEY (("userid", "secondaryid"),"tDate", "tid3", "sid4", "pid5")
>   WITH CLUSTERING ORDER BY ("tDate" ASC, "tid3" ASC, "sid4" ASC, "pid5"
> ASC)
>
>
>
> On Wednesday, 22 August 2018, 15:08:03 GMT-7,
> dinesh.jo...@yahoo.com.INVALID  wrote:
>
>
> What is the schema of the table? Could your include the output of DESCRIBE?
>
> Dinesh
>
>
> On Wednesday, August 22, 2018, 2:22:31 PM PDT, Gosar M
>  wrote:
>
>
> Hello,
>
> Have a table with following partition and clustering keys
>
> partition key - ("userid", "secondaryid"),
> clustering key - "tDate", "tid3", "sid4", "pid5"
>
> Data is inserted based on above partition and clustering key. For 1 record
> seeing 2 rows returned when queried by both partition and clustering key.
>
>
>  userid  | secondaryid  | tdate   | tid3  | sid4 |
> pid5| associate_degree
>  --+
> -+
>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34 |
> l34_listing |   123145979615694
>   090sdfdsf898 | ab984564 | 2018-08-04 07:59:59+ | 0a5995672e3 | l34 |
> l34_listing |   123145979615694989
>
>
> We did not had any node which was down longer than gc_grace_period.
>
>
> Thank you.
>


Re: Alter table

2018-07-31 Thread James Shaw
In a heavy-transaction PROD environment it is risky, considering C* has a lot
of bugs.
The DDL has to be replicated to all nodes; use nodetool describecluster to
check that the schema version is the same on all nodes (see the sketch
below). If it is not, you may restart the node to which the DDL was not
replicated.
In newer versions DDL is all-or-nothing, so you may not get it to succeed.

It is similar to an RDBMS: altering a table in a heavy-transaction PROD
environment may hit a resource-busy error.

In non-prod we always apply new DDL without stopping the applications and
have never had an issue.
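
A quick agreement check before and after the ALTER:

  nodetool describecluster | grep -A 5 'Schema versions'
  # healthy: a single schema UUID listed, with every node's IP under it;
  # two or more UUIDs means some nodes have not picked up the DDL yet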

Thanks,

James


On Tue, Jul 31, 2018 at 1:37 AM, Jeff Jirsa  wrote:

> This is safe (and normal, and good) in all versions except those impacted
> by https://issues.apache.org/jira/browse/CASSANDRA-13004
>
> So if you're on 2.1, 2.2, or 3.11 you're fine
>
> If you're on 3.0 between 3.0.0 and 3.0.13, you should upgrade first (to
> newest 3.0, probably 3.0.17)
> If you're on a version between 3.1 and 3.10, you should upgrade first (to
> newest 3.11, probably 3.11.3)
>
> - Jeff
>
>
> On Mon, Jul 30, 2018 at 10:16 PM, Visa  wrote:
>
>> Hi all,
>>
>> I have one question about altering schema. If we only add columns, is it
>> ok to alter the schema while the writes to the table are happening at the
>> same time? We can control that the writes will not touch the new columns
>> until the schema change is done. Or better to stop the writes to that table
>> first.
>>
>> Thanks!
>>
>> Li
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>


Re: Re: Data model storage optimization

2018-07-30 Thread James Shaw
Things to consider:
whether the row size is large or not
whether there are a lot of updates (an update is actually an insert)
whether the workload is read-heavy or not
overall read performance

If the row size is large, you may consider a user_detail table and add an id
column to all tables. On the application side, merge/join by id (a sketch is
below). But you pay a read price: a second query to user_detail.
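
A minimal sketch of that layout, assuming the email/username case from your
message (all table and column names are illustrative):

  CREATE TABLE user_detail (
      id uuid PRIMARY KEY,
      email text,
      username text
      -- ... the wide/expensive columns live only here
  );

  CREATE TABLE user_by_email (
      email text PRIMARY KEY,
      id uuid
  );

  CREATE TABLE user_by_username (
      username text PRIMARY KEY,
      id uuid
  );

The application resolves email or username to id with a first query, then
fetches the detail row by id with a second query, instead of duplicating the
full row under both keys.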

Just my 2 cents. Hope it's helpful.

Thanks,

James


On Sun, Jul 29, 2018 at 11:20 PM, onmstester onmstester  wrote:

>
> How many rows in average per partition?
>
> around 10K.
>
>
> Let me get this straight : You are bifurcating your partitions on either
> email or username , essentially potentially doubling the data because you
> don’t have a way to manage a central system of record of users ?
>
> We are just analyzing output logs of a "perfectly" running application!,
> so no one let me change its data design, i thought maybe it would be a more
> general problem for cassandra users that someone both
> 1. needed to access a identical set of columns by multiple keys (all the
> keys should be present in rows)
> 2. there was a storage limit (due to TTL * input rate would be some TBs)
> I know that there is a strict rule in cassandra data modeling : "never use
> foreign keys and sacrifice disk instead", but anyone ever been forced to do
> such a thing and How?
>
>


Re: Infinite loop of single SSTable compactions

2018-07-25 Thread James Shaw
nodetool compactionstats --- see which table is compacting
nodetool cfstats keyspace_name.table_name --- check partition size and
tombstones

Go to the data file directories and look at the data file sizes and
timestamps --- compaction will write to a new temp file with _tmplink...

Use sstablemetadata ... and look at the largest or oldest file first.

Of course, other factors may matter, like disk space, etc.
Also check what compaction_throughput_mb_per_sec is in cassandra.yaml.
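
Roughly, as one pass (keyspace/table and the paths are placeholders; the
config path may differ on your install):

  nodetool compactionstats -H
  nodetool cfstats <keyspace>.<table>
  cd /var/lib/cassandra/data/<keyspace>/<table>-*/
  ls -lhtr *Data.db                              # sizes and timestamps
  sstablemetadata <largest>-Data.db | grep -i -e 'droppable' -e 'Repaired'
  grep compaction_throughput_mb_per_sec /etc/cassandra/cassandra.yaml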

Hope it is helpful.

Thanks,

James




On Wed, Jul 25, 2018 at 4:18 AM, Martin Mačura  wrote:

> Hi,
> we have a table which is being compacted all the time, with no change in
> size:
>
> Compaction History:
> compacted_atbytes_inbytes_out   rows_merged
> 2018-07-25T05:26:48.101 57248063878 57248063878 {1:11655}
>
>   2018-07-25T01:09:47.346 57248063878 57248063878
> {1:11655}
>  2018-07-24T20:52:48.652
> 57248063878 57248063878 {1:11655}
>
> 2018-07-24T16:36:01.828 57248063878 57248063878 {1:11655}
>
>   2018-07-24T12:11:00.026 57248063878 57248063878
> {1:11655}
>  2018-07-24T07:28:04.686
> 57248063878 57248063878 {1:11655}
>
> 2018-07-24T02:47:15.290 57248063878 57248063878 {1:11655}
>
>   2018-07-23T22:06:17.410 57248137921 57248063878
> {1:11655}
>
> We tried setting unchecked_tombstone_compaction to false, had no effect.
>
> The data is a time series, there will be only a handful of cell
> tombstones present. The table has a TTL, but it'll be least a month
> before it takes effect.
>
> Table properties:
>AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> 'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
> 'max_threshold': '32', 'min_threshold': '4',
> 'unchecked_tombstone_compaction': 'false'}
>AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>AND crc_check_chance = 1.0
>AND dclocal_read_repair_chance = 0.0
>AND default_time_to_live = 63072000
>AND gc_grace_seconds = 10800
>AND max_index_interval = 2048
>AND memtable_flush_period_in_ms = 0
>AND min_index_interval = 128
>AND read_repair_chance = 0.0
>AND speculative_retry = 'NONE';
>
> Thanks for any help
>
>
> Martin
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Timeout for only one keyspace in cluster

2018-07-23 Thread James Shaw
Does your application really need a counter? Just an option to consider.

Thanks,

James

On Mon, Jul 23, 2018 at 10:57 AM, learner dba  wrote:

> Thanks a lot Ben. This makes sense but feel bad that we don't have a
> solution yet. We can try consistency level one but that will be against
> general rule for having local_quorum for production. Also, consistency ONE
> will not guarantee 0 race condition.
>
> Is there any better solution?
>
> On Saturday, July 21, 2018, 8:27:57 PM CDT, Ben Slater <
> ben.sla...@instaclustr.com> wrote:
>
>
> Note that that writetimeout exception can be C*s way of telling you when
> there is contention on a LWT (rather than actually timing out). See
> https://issues.apache.org/jira/browse/CASSANDRA-9328
>
> Cheers
> Ben
>
> On Sun, 22 Jul 2018 at 11:20 Goutham reddy 
> wrote:
>
> Hi,
> As it is a single partition key, try to update the key with only partition
> key instead of passing other columns. And try to set consistency level ONE.
>
> Cheers,
> Goutham.
>
> On Fri, Jul 20, 2018 at 6:57 AM learner dba 
> wrote:
>
> Anybody has any ideas about this? This is happening in production and we
> really need to fix it.
>
> On Thursday, July 19, 2018, 10:41:59 AM CDT, learner dba <
> cassandra...@yahoo.com.INVALID> wrote:
>
>
> Our foreignid is unique idetifier and we did check for wide partitions;
> cfhistorgrams show all partitions are evenly sized:
>
> Percentile  SSTables Write Latency  Read LatencyPartition Size
>   Cell Count
>
>   (micros)  (micros)   (bytes)
>
>
> 50% 0.00 29.52  0.00  1916
>   12
>
> 75% 0.00 42.51  0.00  2299
>   12
>
> 95% 0.00 61.21  0.00  2759
>   14
>
> 98% 0.00 73.46  0.00  2759
>   17
>
> 99% 0.00 88.15  0.00  2759
>   17
>
> Min 0.00  9.89  0.00   150
> 2
>
> Max 0.00 88.15  0.00   7007506
> 42510
> any thing else that we can check?
>
> On Wednesday, July 18, 2018, 10:44:29 PM CDT, wxn...@zjqunshuo.com <
> wxn...@zjqunshuo.com> wrote:
>
>
> Your partition key is foreignid. You may have a large partition. Why not
> use foreignid+timebucket as partition key?
>
>
> *From:* learner dba 
> *Date:* 2018-07-19 01:48
> *To:* User cassandra.apache.org 
> *Subject:* Timeout for only one keyspace in cluster
> Hi,
>
> We have a cluster with multiple keyspaces. All queries are performing good
> but write operation on few tables in one specific keyspace gets write
> timeout. Table has counter column and counter update query times out
> always. Any idea?
>
> CREATE TABLE x.y (
>
> foreignid uuid,
>
> timebucket text,
>
> key text,
>
> timevalue int,
>
> value counter,
>
> PRIMARY KEY (foreignid, timebucket, key, timevalue)
>
> ) WITH CLUSTERING ORDER BY (timebucket ASC, key ASC, timevalue ASC)
>
> AND bloom_filter_fp_chance = 0.01
>
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
> AND comment = ''
>
> AND compaction = {'class': 'org.apache.cassandra.db.compa
> ction.SizeTieredCompactionStrategy', 'max_threshold': '32',
> 'min_threshold': '4'}
>
> AND compression = {'chunk_length_in_kb': '64', 'class': '
> org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND crc_check_chance = 1.0
>
> AND dclocal_read_repair_chance = 0.1
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 864000
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 0
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99PERCENTILE';
>
> Query and Error:
>
> UPDATE x.y SET value = value + 1 where foreignid = ? AND timebucket = ? AND 
> key = ? AND timevalue = ?, err = {s:\"gocql: no response 
> received from cassandra within timeout period
>
>
> I verified CL=local_serial
>
> We had been working on this issue for many days; any help will be much 
> appreciated.
>
>
>
> --
> Regards
> Goutham Reddy
>
> --
>
>
> *Ben Slater*
>
> *Chief Product Officer *
>
>    
>
>
> Read our latest technical blog posts here.
>

Re: Cassandra Upgrade with Different Protocol Version

2018-07-05 Thread James Shaw
Other concerns:
There is no replication between 2.1 and 3.11 nodes; writes are stored as
hints and the hints are replayed once the remote node is on the same version.
You will have to run repair if you exceed the hint window, and if you read at
quorum (2/3) you will get errors.

In case of a rollback to 2.1, it cannot read the new 3.11-format data files,
but during an online rolling upgrade some new data is already in the new
format.

If the hardlink snapshot is not copied to another device, a disk failure may
cause data loss (since some data may have only one copy during the upgrade
because of the lack of replication).


On Thu, Jul 5, 2018 at 8:13 PM, kooljava2 
wrote:

> Hello Anuj,
>
> The 2nd workaround should work. As app will auto discover all the other
> nodes. Its the first contact with the node that app makes determines the
> protocol version. So if you remove the newer version nodes from the app
> configuration after the startup, it will auto discover the newer nodes as
> well.
>
> Thank you,
> TS.
>
> On Thursday, 5 July 2018, 12:45:39 GMT-7, Anuj Wadehra <
> anujw_2...@yahoo.co.in.INVALID> wrote:
>
>
> Hi,
>
> I woud like to know how people are doing rolling upgrade of Casandra
> clustes when there is a change in native protocol version say from 2.1 to
> 3.11. During rolling upgrade, if client application is restarted on nodes,
> the client driver may first contact an upgraded Cassandra node with v4 and
> permanently mark all old Casandra nodes on v3 as down. This may lead to
> request failures. Datastax recommends two ways to deal with this:
>
> 1. Before upgrade, set protocol version to lower protocol version. And
> move to higher version once entire cluster is upgraded.
> 2. Make sure driver only contacts upraded Cassandra nodes during rolling
> upgrade.
>
> Second workaround will lead to failures as you may not be able to meet
> required consistency for some time.
>
> Lets consider first workaround. Now imagine an application where protocol
> version is not configurable and code uses default protocol version. You can
> not apply first workaroud because you have to upgrade your application on
> all nodes to first make the protocol version configurable. How would you
> upgrade such a cluster without downtime? Thoughts?
>
> Thanks
> Anuj
>
>
>


Re: Added a new node, now what repair is best?

2018-07-01 Thread James Shaw
nodetool repair -pr on every node --- that covers all token ranges of the data.
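
For example, from an admin host (host names are placeholders; run the nodes
one at a time rather than in parallel):

  for h in node1 node2 node3 node4; do
      ssh "$h" nodetool repair -pr
  done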

On Sun, Jul 1, 2018 at 7:03 AM, Riccardo Ferrari  wrote:

> Hi list,
>
> After long time  of operation we come to the need of growing our cluster.
> This cluster was born on 2.X and almos 2 years ago migrated to 3.0.6 ( I
> know we are bit prudent )
>
> The cluster was a 3 m1.xlarge (we are on AWS) and table RF was 3
>
> Thanks to your valuable hints we added a new node in the same AZ that
> joined flawlessy after making sure the streaming_socket_timeout_in_ms was
> high enough.
>
> I was used to run:
>
>- nodetool repair -pr (on each node and alterante days)
>
> QUESTION TIME:
>
> Q1: Should I keep it that way or should I run some "nodetool repair -full"
> on all the nodes the first time ?
>
> Reading today's documentation (http://cassandra.apache.org/
> doc/latest/operating/repair.html) it's not really clear to me what should
> be the best practice. Any pointer is much appreciated
>
> Thanks,
>


Re: nodetool repair and compact

2018-04-02 Thread James Shaw
You may use: nodetool upgradesstables -a keyspace_name table_name
It will re-write this table's SSTable files to the current version, and while
re-writing it will evict droppable tombstones (expired + gc_grace_seconds,
default 10 days). Tombstones whose partition crosses different files will
still be kept, but most droppable tombstones are gone and the size is
reduced.
It works well for us.
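
A rough before/after check of the droppable-tombstone estimate (paths and
names are placeholders):

  sstablemetadata /var/lib/cassandra/data/<ks>/<table>-*/*-Data.db | grep -i droppable
  nodetool upgradesstables -a <ks> <table>
  sstablemetadata /var/lib/cassandra/data/<ks>/<table>-*/*-Data.db | grep -i droppable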



On Mon, Apr 2, 2018 at 12:45 AM, Jon Haddad  wrote:

> You’ll find the answers to your questions (and quite a bit more) in this
> blog post from my coworker: http://thelastpickle.com/blog/2016/
> 07/27/about-deletes-and-tombstones.html
>
> Repair doesn’t clean up tombstones, they’re only removed through
> compaction.  I advise taking care with nodetool compact, most of the time
> it’s not a great idea for a variety of reasons.  Check out the above post,
> if you still have questions, ask away.
>
>
> On Apr 1, 2018, at 9:41 PM, Xiangfei Ni  wrote:
>
> Hi All,
>   I want to delete the expired tombstone, someone uses nodetool repair
> ,but someone uses compact,so I want to know which one is the correct way,
>   I have read the below pages from Datastax,but the page just tells us how
> to use the command,but doesn’t tell us what it is exactly dose,
>   https://docs.datastax.com/en/cassandra/3.0/cassandra/
> tools/toolsRepair.html
>could anybody tell me how to clean the tombstone and give me some
> materials include the detailed instruction about the nodetool command and
> options?Web link is also ok.
>   Thanks very much
> Best Regards,
>
> 倪项菲 / David Ni
> 中移德电网络科技有限公司
>
> Virtue Intelligent Network Ltd, co.
> Add: 2003,20F No.35 Luojia creative city,Luoyu Road,Wuhan,HuBei
> Mob: +86 13797007811 | Tel: +86 27 5024 2516
>
>
>


Re: Adding disk to operating C*

2018-03-10 Thread James Shaw
I think it's case by case, depending on whether you are chasing read
performance, write performance, or both.
Ours is used for an application where read requests are 10 times more
frequent than writes; the application wants read performance and doesn't care
about write performance. We use 4 SSDs of 380 GB each per node (about 1.5 TB
per node), and read latency is 0.4 ms/op.
Of course, if Cassandra is used for reporting, that is different.

Thanks,

James

On Sat, Mar 10, 2018 at 7:55 AM, Rahul Singh 
wrote:

> My 1.5T bound is for high throughput for read and write with hundreds of
> nodes — specifically with needs for quick bootstrap / repairs when adding /
> replacing nodes.
>
> Lower the density the faster it is to add nodes.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Mar 9, 2018, 11:30 AM -0500, Jon Haddad , wrote:
>
> I agree with Jeff - I usually advise teams to cap their density around
> 3TB, especially with TWCS.  Read heavy workloads tend to use smaller
> datasets and ring size ends up being a function of performance tuning.
>
> Since 2.2 bootstrap can now be resumed, which helps quite a bit with the
> streaming problem, see CASSANDRA-8838.
>
> Jon
>
>
> On Mar 9, 2018, at 7:39 AM, Jeff Jirsa  wrote:
>
> 1.5 TB sounds very very conservative - 3-4T is where I set the limit at
> past jobs. Have heard of people doing twice that (6-8T).
>
> --
> Jeff Jirsa
>
>
> On Mar 8, 2018, at 11:09 PM, Niclas Hedhman  wrote:
>
> I am curious about the side comment; "Depending on your usecase you may not
> want to have a data density over 1.5 TB per node."
>
> Why is that? I am planning much bigger than that, and now you give me
> pause...
>
>
> Cheers
> Niclas
>
> On Wed, Mar 7, 2018 at 6:59 PM, Rahul Singh 
> wrote:
>
>> Are you putting both the commitlogs and the Sstables on the adds?
>> Consider moving your snapshots often if that’s also taking up space. Maybe
>> able to save some space before you add drives.
>>
>> You should be able to add these new drives and mount them without an
>> issue. Try to avoid different number of data dirs across nodes. It makes
>> automation of operational processes a little harder.
>>
>> As an aside, Depending on your usecase you may not want to have a data
>> density over 1.5 TB per node.
>>
>> --
>> Rahul Singh
>> rahul.si...@anant.us
>>
>> Anant Corporation
>>
>> On Mar 7, 2018, 1:26 AM -0500, Eunsu Kim , wrote:
>>
>> Hello,
>>
>> I use 5 nodes to create a cluster of Cassandra. (SSD 1TB)
>>
>> I'm trying to mount an additional disk(SSD 1TB) on each node because each
>> disk usage growth rate is higher than I expected. Then I will add the the
>> directory to data_file_directories in cassanra.yaml
>>
>> Can I get advice from who have experienced this situation?
>> If we go through the above steps one by one, will we be able to complete
>> the upgrade without losing data?
>> The replication strategy is SimpleStrategy, RF 2.
>>
>> Thank you in advance
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
>
> --
> Niclas Hedhman, Software Developer
> http://zest.apache.org - New Energy for Java
>
>
>


Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread James Shaw
Per my testing, repair does not help.
Repair builds Merkle trees to compare data; it only writes to a new file
where there are differences, and that file is very small in the end (which of
course means most data are already in sync).

On Fri, Mar 9, 2018 at 10:31 PM, Madhu B  wrote:

> Yasir,
> I think you need to run full repair in off-peak hours
>
> Thanks,
> Madhu
>
>
> On Mar 9, 2018, at 7:20 AM, Kenneth Brotman 
> wrote:
>
> Yasir,
>
>
>
> How many nodes are in the cluster?
>
> What is num_tokens set to in the Cassandra.yaml file?
>
> Is it just this one node doing this?
>
> What replication factor do you use that affects the ranges on that disk?
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Kyrylo Lebediev [mailto:kyrylo_lebed...@epam.com
> ]
> *Sent:* Friday, March 09, 2018 4:14 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: uneven data movement in one of the disk in Cassandra
>
>
>
> Not sure where I heard this, but AFAIK data imbalance when multiple
> data_directories are in use is a known issue for older versions of
> Cassandra. This might be the root-cause of your issue.
>
> Which version of C* are you using?
>
> Unfortunately, don't remember in which version this imbalance issue was
> fixed.
>
>
>
> -- Kyrill
> --
>
> *From:* Yasir Saleem 
> *Sent:* Friday, March 9, 2018 1:34:08 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: uneven data movement in one of the disk in Cassandra
>
>
>
> Hi Alex,
>
>
>
> no active compaction, right now.
>
>
>
> 
>
>
>
> On Fri, Mar 9, 2018 at 3:47 PM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem 
> wrote:
>
> Thanks, Nicolas Guyomar
>
>
>
> I am new to cassandra, here is the properties which I can see in yaml
> file:
>
>
>
> # of compaction, including validation compaction.
>
> compaction_throughput_mb_per_sec: 16
>
> compaction_large_partition_warning_threshold_mb: 100
>
>
>
> To check currently active compaction please use this command:
>
>
>
> nodetool compactionstats -H
>
>
>
> on the host which shows the problem.
>
>
>
> --
>
> Alex
>
>
>
>
>
>


Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread James Shaw
We have a similar issue and I am working to solve it this weekend.
Our case is because STCS makes one huge table's SSTable file bigger and
bigger after each compaction (this is the nature of STCS compaction, nothing
wrong with it). Even though almost all of the data has a 30-day TTL, the
tombstones are not evicted, since the largest file is waiting for 3 other
files of similar size before it is compacted again. The largest file is
99.99% tombstones.

Use the command: nodetool upgradesstables -a keyspace table
It will re-write all existing SSTables and evict the droppable tombstones.

In your case, first do a few checks:
1. cd /data/disk03/cassandra/data_prod/data
du -ks * | sort -n
Find which tables use the most space.

2. Check the snapshots for the bigger tables found above;
it's possible that old snapshots are the cause.

3. cd into the table directory and run
sstablemetadata <sstable file>
to see whether a lot of the tombstones in those tables are droppable.

4.
ls -lhS /data/disk*/cassandra/data_prod/data/"that
keyspace"/"that_table"*/*Data.db
Look at all the SSTable files and you will see what the next compaction will
be.

From what I have watched, small compactions seem to go to random disks, but
when the size is large, the output goes to the disk which has more free
space.

5. If the biggest file is too big, it will wait a long time for its next
compaction. You may test (sorry, not my case, so I am not 100% sure):
1) on newer Cassandra 3.0, you may try nodetool compact -s (it will split the
output);
2) on an older Cassandra version, stop Cassandra and use sstablesplit.


Hope it helps

Thanks,

James


On Fri, Mar 9, 2018 at 7:14 AM, Kyrylo Lebediev 
wrote:

> Not sure where I heard this, but AFAIK data imbalance when multiple
> data_directories are in use is a known issue for older versions of
> Cassandra. This might be the root-cause of your issue.
>
> Which version of C* are you using?
>
> Unfortunately, don't remember in which version this imbalance issue was
> fixed.
>
>
> -- Kyrill
> --
> *From:* Yasir Saleem 
> *Sent:* Friday, March 9, 2018 1:34:08 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: uneven data movement in one of the disk in Cassandra
>
> Hi Alex,
>
> no active compaction, right now.
>
>
>
>
> On Fri, Mar 9, 2018 at 3:47 PM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem 
> wrote:
>
> Thanks, Nicolas Guyomar
>
> I am new to cassandra, here is the properties which I can see in yaml
> file:
>
> # of compaction, including validation compaction.
> compaction_throughput_mb_per_sec: 16
> compaction_large_partition_warning_threshold_mb: 100
>
>
> To check currently active compaction please use this command:
>
> nodetool compactionstats -H
>
> on the host which shows the problem.
>
> --
> Alex
>
>
>


Re: multiple tables vs. partitions and TTL

2018-02-01 Thread James Shaw
If it were me, I would go with 1 table; I just think it is too much labor to
manage many tables, and there is also the question of how reliable switching
tables would be.
Regarding tombstones, you may try some ways to fight them:
keep a reasonable partition size (a big partition with many tombstones will
be a problem);
avoid querying tombstones as much as possible: in the application code, put
timestamp > expiry time in the WHERE condition so the query will not touch
the tombstones; on the table side, make the timestamp column a clustering key
in descending order for better performance. A sketch is below.
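
A minimal sketch of that layout and the matching query (all names, the time
bucket, and the cutoff value are illustrative):

  CREATE TABLE blobs_by_month (
      deletion_month text,          -- one partition per deletion month
      created_at     timestamp,
      doc_id         uuid,
      payload        blob,
      PRIMARY KEY (deletion_month, created_at, doc_id)
  ) WITH CLUSTERING ORDER BY (created_at DESC, doc_id ASC);

  -- the lower bound on created_at keeps the read away from the expired
  -- (tombstoned) tail of the partition
  SELECT doc_id, payload
  FROM blobs_by_month
  WHERE deletion_month = '2018-05'
    AND created_at > '2018-02-01';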

James

On Thu, Feb 1, 2018 at 3:16 AM, Marcus Haarmann 
wrote:

> Hi experts,
>
> I have a design issue here:
> We want to store bigger amounts of data (> 30mio rows containing blobs)
> which will be deleted depending on the type
> of data on a monthly base (not in the same order as the data entered the
> system).
> Some data would survive for two month only, other data for 3-5 years.
>
> The choice now is to have one table only with TTL per partition and
> partitions per deletion month (when the data should be deleted)
> which will allow a single delete command, followed by a compaction
> or alternatively to have multiple tables (one per month when the deletion
> process would just drop the table).
> The logic to retrieve that data is per record, so we know both the
> retention period and the id (uuid) of the addressed record,
> so multiple tables can be handled.
>
> Since it would be one table per deletion month, I do not expect more than
> 1000-2000 tables, depending on the
> retention period of the data.
>
> The benefit creating multiple tables would be that there are no tombstones
> while more tables take more memory in the nodes.
> The one table approach would make the compaction process take longer and
> produce more I/O activity because
> the compaction would regenerate multiple tables internally.
>
> Any thoughts on this ?
> We want to use 9 nodes, cassandra 3.11 on Linux, total data amount
> expected ~15-20 TB.
>
> Thank you very much,
>
> Marcus Haarmann
>


Re: Old tombstones not being cleaned up

2018-02-01 Thread James Shaw
I see leveled compaction is used. If it is the last one, it will have to stay
until the next level compaction happens, and then it will be gone, right?

On Thu, Feb 1, 2018 at 2:33 AM, Bo Finnerup Madsen 
wrote:

> Hi,
>
> We are running a small 9 node Cassandra v2.1.17 cluster. The cluster
> generally runs fine, but we have one table that are causing OOMs because an
> enormous amount of tombstones.
> Looking at the data in the table (sstable2json), the first of the
> tombstones are almost a year old. The table was initially created with a
> gc_grace_period of 10 days, but I have now lowered it to 1 hour.
> I have run a full repair of the table across all nodes. I have forced
> several major compactions of the table by using "nodetool compact", and
> also tried to switch from LeveledCompaction to SizeTierCompaction and back.
>
> What could cause cassandra to keep these tombstones?
>
> sstable2json:
> {"key": "foo",
>  "cells": [["082f-25ef-4324-bb8a-8cf013c823c1:_","082f-
> 25ef-4324-bb8a-8cf013c823c1:!",1507819135148000,"t",1507819135],
>["10f3-c05d-4ab9-9b8a-e6ebd8f5818a:_","10f3-
> c05d-4ab9-9b8a-e6ebd8f5818a:!",1503661731697000,"t",1503661731],
>["1d7a-ce95-4c74-b67e-f8cdffec4f85:_","1d7a-
> ce95-4c74-b67e-f8cdffec4f85:!",1509542102909000,"t",1509542102],
>["1dd3-ae22-4f6e-944a-8cfa147cde68:_","1dd3-
> ae22-4f6e-944a-8cfa147cde68:!",1512418006838000,"t",1512418006],
>["22cc-d69c-4596-89e5-3e976c0cb9a8:_","22cc-
> d69c-4596-89e5-3e976c0cb9a8:!",1497377448737001,"t",1497377448],
>["2777-4b1a-4267-8efc-c43054e63170:_","2777-
> 4b1a-4267-8efc-c43054e63170:!",1491014691515001,"t",1491014691],
>["61e8-f48b-4484-96f1-f8b6a3ed8f9f:_","61e8-
> f48b-4484-96f1-f8b6a3ed8f9f:!",1500820300544000,"t",1500820300],
>["63da-f165-449b-b65d-2b7869368734:_","63da-
> f165-449b-b65d-2b7869368734:!",1512806634968000,"t",1512806634],
>["656f-f8b5-472b-93ed-1a893002f027:_","656f-
> f8b5-472b-93ed-1a893002f027:!",1514554716141000,"t",1514554716],
> ...
> {"key": "bar",
>  "metadata": {"deletionInfo": {"markedForDeleteAt":1517402198585982,"
> localDeletionTime":1517402198}},
>  "cells": [["000af8c2-ffe9-4217-9032-61a1cd21781d:_","000af8c2-
> ffe9-4217-9032-61a1cd21781d:!",1495094965916000,"t",1495094965],
>["005b96cb-7eb3-4ec3-bfa2-8573e46892f4:_","005b96cb-
> 7eb3-4ec3-bfa2-8573e46892f4:!",1516360186865000,"t",1516360186],
>["005ec167-aa61-4868-a3ae-a44b00099eb6:_","005ec167-
> aa61-4868-a3ae-a44b00099eb6:!",1516671840920002,"t",1516671840],
> 
>
> sstablemetadata:
> stablemetadata /data/cassandra/data/xxx/yyy-9ed502c0734011e6a128fdafd829b1
> c6/ddp-yyy-ka-2741-Data.db
> SSTable: /data/cassandra/data/xxx/yyy-9ed502c0734011e6a128fdafd829b1
> c6/ddp-yyy-ka-2741
> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> Bloom Filter FP chance: 0.10
> Minimum timestamp: 1488976211688000
> Maximum timestamp: 1517468644066000
> SSTable max local deletion time: 2147483647
> Compression ratio: 0.5121956624389545
> Estimated droppable tombstones: 18.00161766553587
> SSTable Level: 0
> Repaired at: 0
> ReplayPosition(segmentId=1517168739626, position=22690189)
> Estimated tombstone drop times:%n
> 1488976211: 1
> 1489906506:  4706
> 1490174752:  6111
> 1490449759:  6554
> 1490735410:  6559
> 1491016789:  6369
> 1491347982: 10216
> 1491680214: 13502
> ...
>
> desc:
> CREATE TABLE xxx.yyy (
> ti text,
> uuid text,
> json_data text,
> PRIMARY KEY (ti, uuid)
> ) WITH CLUSTERING ORDER BY (uuid ASC)
> AND bloom_filter_fp_chance = 0.1
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'class': 'org.apache.cassandra.db.compaction.
> LeveledCompactionStrategy'}
> AND compression = {'sstable_compression': 'org.apache.cassandra.io.
> compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 3600
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
>
> jmx props(picture):
> [image: image.png]
>