about cassandra..

2018-08-08 Thread Eunsu Kim
Hi all.

I’m concerned about the amount of disk space we use, so I’d like to understand 
compression better. We are currently on 3.11.0 and use the default LZ4Compressor 
('chunk_length_in_kb': 64).
Is there a setting that enables stronger compression?
Because most of our data is time series data with a TTL, we use 
TimeWindowCompactionStrategy.
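
For reference, compression is configured per table, so a stronger compressor can
be set with an ALTER TABLE. A minimal sketch of what such a change could look like
(the table name ks.sensor_data is hypothetical; DeflateCompressor is the built-in
option that trades extra CPU for a better compression ratio than LZ4):

-- Current default, as described above:
ALTER TABLE ks.sensor_data
  WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 64};

-- A possible heavier-compression alternative (higher CPU cost on reads and writes):
ALTER TABLE ks.sensor_data
  WITH compression = {'class': 'DeflateCompressor', 'chunk_length_in_kb': 64};

-- Only newly written SSTables pick up the new settings; existing ones can be
-- rewritten with: nodetool upgradesstables -a <keyspace> <table>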

Thank you in advance.



Extending Cassandra on AWS from single Region to Multi-Region

2018-08-08 Thread srinivasarao daruna
Hi All,

We have built a Cassandra cluster on AWS EC2 instances. When we initially created
the cluster we did not plan for a multi-region deployment, and we used the AWS
EC2Snitch.

We use EBS volumes for the data, and each of those volumes has filled to around
350 GB.
We now want to extend the cluster to multiple regions and would like to know the
recommended approach for doing so.

I agree that we made a mistake by not using EC2MultiRegionSnitch, but that is in
the past now; if anyone has faced or implemented something similar, I would like
to get some guidance.
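
For context, once the new data centre and a multi-region-aware snitch are in place,
the data-placement side of such a migration is a keyspace-level change. A minimal
sketch, assuming a hypothetical keyspace my_keyspace and data centre names
us-east / us-west as the snitch would report them (check with nodetool status):

ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east': 3, 'us-west': 3};

-- After changing replication, stream the existing data into the new data centre
-- by running on each new node: nodetool rebuild <name-of-existing-dc>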

Any help would be very much appreciated.

Thank You,
Regards,
Srini


Re: Paging in Cassandra

2018-08-08 Thread Ghazi Naceur
Hello everyone,

This is the solution:

@Autowired
CassandraOperations cassandraOperationsInstance;

...
...
Pageable request = CassandraPageRequest.first(1000);
Slice slice = null;
Query query = Query.empty().pageRequest(request);

do {
    slice = cassandraOperationsInstance.slice(query, clazz); // clazz = the mapped entity class
    if (slice.hasContent()) {
        slice.getContent().forEach(s -> {
            // treatment ...
        });
    }
    if (slice.hasNext()) {
        // reuse the pageable returned with the slice for the next query
        request = slice.getPageable();
        query = query.pageRequest(request);
    } else {
        break;
    }
} while (!slice.getContent().isEmpty());

Best regards.

2018-07-10 14:44 GMT+01:00 Alain RODRIGUEZ :

> Hello,
>
> It sounds like a client/coding issue. People here work with a variety of
> clients to connect to Cassandra, and it looks like there are not many
> 'spring-data-cassandra' users around ¯\_(ツ)_/¯.
>
> You could try asking there and see if you have more luck:
> https://spring.io/questions.
>
> C*heers,
>
> Alain
>
> 2018-07-05 6:21 GMT+01:00 Ghazi Naceur :
>
>> Hello everyone,
>>
>> I'm facing a problem with CassandraPageRequest and Slice.
>> In fact, I always obtain the same Slice and I'm not able to get the
>> next slice (or page) of data.
>> My code is based on this example:
>>
>> Link: https://github.com/spring-projects/spring-data-cassandra/pull/114
>>
>>
>> Query query = Query.empty().pageRequest(CassandraPageRequest.first(10));
>> Slice slice = template.slice(query, User.class);
>> do {
>>     // consume slice
>>     if (slice.hasNext()) {
>>         slice = template.select(query, slice.nextPageable(), User.class);
>>     } else {
>>         break;
>>     }
>> } while (!slice.getContent().isEmpty());
>>
>>
>>
>> I appreciate your help.
>>
>
>


Re: Compression Tuning Tutorial

2018-08-08 Thread Eric Plowe
Great post, Jonathan! Thank you very much.

~Eric

On Wed, Aug 8, 2018 at 2:34 PM Jonathan Haddad  wrote:

> Hey folks,
>
> We've noticed over the years that people usually create tables leaving the
> default compression parameters, and we've spent a lot of time helping teams
> figure out the right settings for their cluster based on their workload.  I
> finally managed to write some thoughts down, along with a high-level breakdown
> of how the internals function, which should help people pick better settings
> for their cluster.
>
> This post focuses on a mixed 50:50 read:write workload, but the same
> conclusions apply to a read-heavy workload.  Hopefully this helps some folks
> get better performance / save some money on hardware!
>
> http://thelastpickle.com/blog/2018/08/08/compression_performance.html
>
>
> --
> Jon Haddad
> Principal Consultant, The Last Pickle
>


Compression Tuning Tutorial

2018-08-08 Thread Jonathan Haddad
Hey folks,

We've noticed over the years that people usually create tables leaving the
default compression parameters, and we've spent a lot of time helping teams
figure out the right settings for their cluster based on their workload.  I
finally managed to write some thoughts down, along with a high-level breakdown
of how the internals function, which should help people pick better settings
for their cluster.

This post focuses on a mixed 50:50 read:write workload, but the same
conclusions apply to a read-heavy workload.  Hopefully this helps some folks
get better performance / save some money on hardware!

http://thelastpickle.com/blog/2018/08/08/compression_performance.html

-- 
Jon Haddad
Principal Consultant, The Last Pickle


Re: TWCS Compaction backed up

2018-08-08 Thread Brian Spindler
Hi Jeff/Jon et al, here is what I'm thinking of doing to clean up; please let me
know what you think.

This is precisely my problem, I believe:
http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html

Because of this I have a lot of wasted space due to a bad incremental repair, so
I am thinking of abandoning incremental repairs by:
- setting all repairedAt values to 0 on any/all *Data.db SSTables
- using either range_repair.py or Reaper to run subrange repairs

Will this clean everything up?


On Tue, Aug 7, 2018 at 9:18 PM Brian Spindler 
wrote:

> In fact all of them say Repaired at: 0.
>
> On Tue, Aug 7, 2018 at 9:13 PM Brian Spindler 
> wrote:
>
>> Hi, I spot-checked a couple of the files that were ~200 MB and they mostly
>> had "Repaired at: 0", so maybe that's not it?
>>
>> -B
>>
>>
>> On Tue, Aug 7, 2018 at 8:16 PM  wrote:
>>
>>> Everything is ttl’d
>>>
>>> I suppose I could use sstablemetadata to see the repaired bit. Could I just
>>> set that to unrepaired somehow, and would that fix it?
>>>
>>> Thanks!
>>>
>>> On Aug 7, 2018, at 8:12 PM, Jeff Jirsa  wrote:
>>>
>>> May be worth seeing if any of the sstables got promoted to repaired - if
>>> so they’re not eligible for compaction with unrepaired sstables and that
>>> could explain some higher counts
>>>
>>> Do you actually do deletes or is everything ttl’d?
>>>
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Aug 7, 2018, at 5:09 PM, Brian Spindler 
>>> wrote:
>>>
>>> Hi Jeff, mostly lots of little files: there will be 4-5 that are
>>> 1-1.5 GB or so, then many at 5-50 MB and many more at 40-50 MB each.
>>>
>>> Re incremental repair: yes, one of my engineers started an incremental
>>> repair on this column family that we had to abort.  In fact, the node that
>>> the repair was initiated on ran out of disk space and we ended up replacing
>>> that node as if it were a dead node.
>>>
>>> Oddly the new node is experiencing this issue as well.
>>>
>>> -B
>>>
>>>
>>> On Tue, Aug 7, 2018 at 8:04 PM Jeff Jirsa  wrote:
>>>
 You could toggle off the tombstone compaction to see if that helps, but
 that should be lower priority than normal compactions

 Are the lots-of-little-files from memtable flushes or
 repair/anticompaction?

 Do you do normal deletes? Did you try to run Incremental repair?

 --
 Jeff Jirsa


 On Aug 7, 2018, at 5:00 PM, Brian Spindler 
 wrote:

 Hi Jonathan, both I believe.

 The window size is 1 day, full settings:
 AND compaction = {'timestamp_resolution': 'MILLISECONDS',
 'unchecked_tombstone_compaction': 'true', 'compaction_window_size': '1',
 'compaction_window_unit': 'DAYS', 'tombstone_compaction_interval': '86400',
 'tombstone_threshold': '0.2', 'class':
 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}


 nodetool tpstats

 Pool Name                     Active   Pending     Completed   Blocked   All time blocked
 MutationStage                      0         0   68582241832         0                  0
 ReadStage                          0         0     209566303         0                  0
 RequestResponseStage               0         0   44680860850         0                  0
 ReadRepairStage                    0         0      24562722         0                  0
 CounterMutationStage               0         0             0         0                  0
 MiscStage                          0         0             0         0                  0
 HintedHandoff                      1 1203 0 0
 GossipStage                        0         0       8471784         0                  0
 CacheCleanupExecutor               0         0           122         0                  0
 InternalResponseStage              0         0        552125         0                  0
 CommitLogArchiver                  0         0             0         0                  0
 CompactionExecutor                 8421433715 0 0
 ValidationExecutor                 0         0          2521         0                  0
 MigrationStage                     0         0        527549         0                  0
 AntiEntropyStage                   0         0          7697         0                  0
 PendingRangeCalculator             0         0            17         0                  0
 Sampler                            0         0             0         0                  0
 MemtableFlushWriter                0         0        116966         0                  0
 MemtablePostFlush                  0         0        209103         0                  0
 MemtableReclaimMemory              0         0        116966         0                  0
 Native-Transport-Requests          1         0    1715937778         0

Re: dynamic_snitch=false, prioritisation/order of reads from replicas

2018-08-08 Thread Kyrylo Lebediev
Thank you for explaining, Alain!


Predetermining the nodes to query, then sending the 'data' request to one of them 
and the 'digest' request to another (for CL=QUORUM, RF=3), indeed explains the more 
effective use of the filesystem cache when dynamic snitching is disabled.


So, for each token range there will be one or more replicas that are never queried 
(2 replicas for CL=ONE, 1 replica for CL=QUORUM, with RF=3). But given that data is 
evenly distributed across all nodes in the cluster, it looks like there shouldn't be 
any issues with such load redistribution, except for the case you mentioned, when a 
node has performance issues but all requests are still being sent to it anyway.


Regards,

Kyrill



From: Alain RODRIGUEZ 
Sent: Wednesday, August 8, 2018 1:27:50 AM
To: user@cassandra.apache.org
Subject: Re: dynamic_snitch=false, prioritisation/order of reads from replicas

Hello Kyrill,

But in the case of CL=QUORUM/LOCAL_QUORUM, if I'm not wrong, a read request is sent 
to all replicas, waiting for the first 2 to reply.

My understanding is that this sentence is wrong. It is as you described for writes 
indeed: all the replicas get the write (and in all the data centers). It's not the 
case for reads. For reads, only x nodes are picked and used (x = ONE, QUORUM, ALL, ...).

Looks like the only change for dynamic_snitch=false is that the "data" request is 
sent to a predetermined node instead of the "currently fastest" one.

Indeed, the problem is that the 'currently fastest' node changes very often in 
certain cases, which defeats the filesystem cache without enough compensation in 
many cases.
The idea of avoiding 'bad' nodes is interesting for getting more predictable 
latencies when a node is slow for some reason. Yet one of the side effects of 
this (and of the scoring, which does not seem to be absolutely reliable) is that 
requests are often routed to different nodes when under pressure, due to GC 
pauses for example or any other pressure.
Saving disk reads in read-heavy workloads under pressure is more important than 
trying to save a few milliseconds by picking the 'best' node, I guess.
I can imagine that relieving these disks by reducing disk IO/throughput ends up 
lowering the latency on all the nodes, and thus the client application latency 
improves overall. That is my understanding of why it is so often a good idea to 
disable the dynamic_snitch.

Did you get improved response for CL=ONE only or for higher CL's as well?

I must admit I don't remember for sure, but many people are using 
'LOCAL_QUORUM' and I think I saw this improvement at that consistency level as 
well. Plus, this question might no longer stand, as reads in Cassandra work 
slightly differently than you thought.

I am not 100% comfortable with this 'dynamic_snitch theory' topic, so I hope 
someone else can correct me if I am wrong, or confirm and add information :). 
But I have certainly seen disabling it give some really nice improvements (as 
have many others here, as you mentioned). Sometimes it was not helpful, but I 
have never seen this change be really harmful.

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-08-06 22:27 GMT+01:00 Kyrylo Lebediev <kyrylo_lebed...@epam.com.invalid>:

Thank you for replying, Alain!


Better use of the cache for 'pinned' requests explains well the case when CL=ONE.


But in the case of CL=QUORUM/LOCAL_QUORUM, if I'm not wrong, a read request is sent 
to all replicas, waiting for the first 2 to reply.

When dynamic snitching is turned on, the "data" request is sent to "the fastest 
replica", and the "digest" requests to the rest of the replicas.

But a digest is still the same read operation [from SSTables through the filesystem 
cache] plus calculating and sending a hash to the coordinator. Looks like the only 
change for dynamic_snitch=false is that the "data" request is sent to a predetermined 
node instead of the "currently fastest" one.

So, if there are no mistakes in the above description, the improvement shouldn't be 
very visible for CL=*QUORUM...


Did you get improved response for CL=ONE only or for higher CL's as well?


Indeed an interesting thread in Jira.


Thanks,

Kyrill


From: Alain RODRIGUEZ <arodr...@gmail.com>
Sent: Monday, August 6, 2018 8:26:43 PM
To: user@cassandra.apache.org
Subject: Re: dynamic_snitch=false, prioritisation/order of reads from replicas

Hello,

There are reports (in this ML too) that disabling dynamic snitching decreases 
response time.

I confirm that I have seen this improvement on clusters under pressure.

What effects stand behind this improvement?

My understanding is that this is due to the fact that the clients are then 
'pinned', sticking more to specific nodes when dynamic snitching is off. I 
guess there is