Re: CREATE INDEX without IF NOT EXISTS when snapshotting

2017-10-03 Thread kurt greaves
Certainly would make sense and should be trivial. Here [link not preserved]
is where you want to look. Just create a ticket for it and prod here for a
reviewer once you've got a change.


Re: What is performance gain of clustering columns

2017-10-03 Thread kurt greaves
Clustering info is stored in the index of an SSTable, so if you are only
querying a subset of rows within the partition you don't necessarily have
to hit all SSTables, just the SSTables that contain the relevant clustering
columns. They make a big improvement, and can also be used quite effectively
in a time series use case, removing the need for time buckets in your
partition key.
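
To illustrate the point about not having to hit every SSTable, here is a toy model in Python. It is not Cassandra's actual read path: the `SSTable` class, its `min_ck`/`max_ck` fields and the `sstables_to_read` helper are hypothetical names, and the sketch assumes each SSTable records the range of clustering values it holds for the partition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SSTable:
    """Toy stand-in for one SSTable's view of a single partition."""
    name: str
    min_ck: int  # smallest clustering value stored for the partition (assumed metadata)
    max_ck: int  # largest clustering value stored for the partition (assumed metadata)


def sstables_to_read(sstables: List[SSTable], start: int, end: int) -> List[str]:
    """Return the names of SSTables whose clustering range overlaps [start, end]."""
    return [s.name for s in sstables if s.min_ck <= end and s.max_ck >= start]


# Time-series style data: each flush covers a later slice of timestamps.
tables = [
    SSTable("sstable-1", 0, 99),
    SSTable("sstable-2", 100, 199),
    SSTable("sstable-3", 200, 299),
]

# A query for clustering values 120..180 only needs the second SSTable.
hit = sstables_to_read(tables, 120, 180)
print(hit)
```

In this model a narrow clustering range touches one SSTable, while a range spanning several flushes still has to merge data from each of them.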

On 3 October 2017 at 15:30, eugene miretsky wrote:

> Hi,
>
> Clustering columns are used to order the data in a partition. However,
> since data is split into SSTables, the rows are ordered by clustering key
> only within each SSTable. Cassandra still needs to check all SSTables, and
> merge the data if it is found in several SSTables. The only scenario where
> I can imagine a big performance gain is super wide partitions, where each
> partition is within a single SSTable (time series data, where partition
> keys are time-buckets)
>
> Has anybody done benchmarks on that and can share the data model they have
> used?
>
> Cheers,
> Eugene
>


CREATE INDEX without IF NOT EXISTS when snapshotting

2017-10-03 Thread Javier Canillas
Hi everyone,

I came across something that bothers me a lot. I'm using snapshots to
back up data from my Cassandra cluster in case something really bad happens
(like dropping a table or a keyspace).

While exercising the recovery actions from those backups, I discovered that
the schema written to the "schema.cql" file by the snapshot uses
"CREATE TABLE IF NOT EXISTS" for the table, but plain "CREATE INDEX" (without
"IF NOT EXISTS") for the indexes.

When restoring from snapshots, and relying on the execution of these
schemas to build up the table structure, everything seems fine for tables
without secondary indexes, but for the ones that use them, the
execution of these statements fails miserably.

Here I paste a generated schema.cql content for a table with indexes:

CREATE TABLE IF NOT EXISTS keyspace1.table1 (
    id text PRIMARY KEY,
    content text,
    last_update_date date,
    last_update_date_time timestamp)
    WITH ID = f1045fc0-2f59-11e7-95ec-295c3c064920
    AND bloom_filter_fp_chance = 0.01
    AND dclocal_read_repair_chance = 0.1
    AND crc_check_chance = 1.0
    AND default_time_to_live = 864
    AND gc_grace_seconds = 864000
    AND min_index_interval = 128
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE'
    AND caching = { 'keys': 'NONE', 'rows_per_partition': 'NONE' }
    AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
    AND compression = { 'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor' }
    AND cdc = false
    AND extensions = {  };
CREATE INDEX table1_last_update_date_idx ON keyspace1.table1 (last_update_date);

I think the last part should be:

CREATE INDEX IF NOT EXISTS table1_last_update_date_idx ON keyspace1.table1 (last_update_date);
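
Until the generated file changes, one possible workaround is to patch schema.cql before replaying it. The following is a minimal sketch, assuming the file only contains index statements of the shape shown above; the function name `add_if_not_exists` is hypothetical and the regex is not a full CQL parser.

```python
import re

# Matches "CREATE INDEX" or "CREATE CUSTOM INDEX" that is not already
# followed by "IF NOT EXISTS". Case-insensitive, tolerant of extra whitespace.
_CREATE_INDEX = re.compile(
    r"\bCREATE\s+(CUSTOM\s+)?INDEX\b(?!\s+IF\s+NOT\s+EXISTS\b)\s*",
    re.IGNORECASE,
)


def add_if_not_exists(schema_cql: str) -> str:
    """Insert IF NOT EXISTS into every CREATE INDEX statement that lacks it."""
    def repl(match: "re.Match[str]") -> str:
        custom = "CUSTOM " if match.group(1) else ""
        return f"CREATE {custom}INDEX IF NOT EXISTS "
    return _CREATE_INDEX.sub(repl, schema_cql)


original = (
    "CREATE INDEX table1_last_update_date_idx ON keyspace1.table1\n"
    "(last_update_date);"
)
patched = add_if_not_exists(original)
print(patched)
```

Running the function a second time leaves the text unchanged, so it should be safe to apply to a file that is already partly patched.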

Any ideas? Was this part of the snapshot behavior written this way for a
particular reason I'm not seeing?

I'm willing to help with the coding (as I have done before xD) if you consider
this a trivial bug, but one that should be addressed.

Kind regards,

Javier.


Re: Read/Write Latency - Cassandra 2.1.15 vs 3.10

2017-10-03 Thread Chris Lohfink
The RecentReadLatency metrics have been deprecated for years (since 1.1 or 1.2)
and were removed in 2.2. They were very misleading. Instead, pull the table's
ReadLatency metrics from the org.apache.cassandra.metrics domain:
http://cassandra.apache.org/doc/latest/operating/metrics.html?highlight=metrics#table-metrics
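
For a Jolokia-based script like the one in the question, the replacement read URL might be built as sketched below. The MBean name pattern follows the table-metrics naming described on the linked page as I understand it; the keyspace and table names, the helper name `table_latency_url` and the chosen attribute are assumptions, so check the exact bean and attribute names against your own node before relying on them.

```python
def table_latency_url(keyspace: str, table: str,
                      metric: str = "ReadLatency",
                      attribute: str = "99thPercentile",
                      base: str = "http://localhost:8778/jolokia") -> str:
    """Build a Jolokia read URL for a per-table latency metric.

    The bean name pattern (type=Table, keyspace, scope, name) is an
    assumption based on the table-metrics documentation.
    """
    mbean = (
        "org.apache.cassandra.metrics:"
        f"type=Table,keyspace={keyspace},scope={table},name={metric}"
    )
    return f"{base}/read/{mbean}/{attribute}"


# keyspace1/table1 are placeholder names.
url = table_latency_url("keyspace1", "table1")
print(url)
```

Passing `metric="WriteLatency"` gives the corresponding write-side URL under the same assumptions.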
 


Chris

> On Oct 3, 2017, at 10:06 AM, Anumod Mullachery wrote:
>
> Hi,
>
> We were running Splunk queries to pull read/write latency.
>
> It's working fine in 2.1.15, but not returning results from the upgraded
> version 3.10.
>
> The bean used in the script is shown below.
>
> Let me know if there were any changes in functionality between 2.1.15 and
> 3.10, or if it was replaced by some other bean.
>
> perf_queries= {
>    "org.apache.cassandra.db:type=StorageProxy" => "RecentReadLatencyMicros,RecentWriteLatencyMicros",
> }
>
> stage_queries= {
>    "org.apache.cassandra.request:type=*" => "ActiveCount,PendingTasks,CurrentlyBlockedTasks",
> }
>
> curl http://localhost:8778/jolokia/read/org.apache.cassandra.db:type=StorageProxy/RecentReadLatencyMicros,RecentWriteLatencyMicros
>
> curl http://localhost:8778/jolokia/read/org.apache.cassandra.request:type=*/ActiveCount,PendingTasks,CurrentlyBlockedTasks
>
> ~ Thanks ~
>
> Anumod



Re: How do I install Cassandra on AWS

2017-10-03 Thread Lutaya Shafiq Holmes
Thank You Michael






On 10/3/17, Michael Shuler  wrote:
> How to EC2:
>
>   https://aws.amazon.com/ec2/getting-started/
>
> After "Step 4: Connect to your instance", then install Cassandra as in
> the steps on the download page.
>
> If you're looking for details on instance configuration and cluster
> strategy, a quick search of "ec2 cassandra" found me the AWS whitepaper
> with guidelines and best practices:
>
>   https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf
>
> You might get better help if you provide some specific step you
> performed, what you expected to happen, an error you got, what you tried
> to fix it, etc.
>
> --
> Michael
>
> On 10/03/2017 06:28 AM, Lutaya Shafiq Holmes wrote:
>>  How do I install Cassandra on AWS- Amazon web services
>>
>> The instructions are not listed there
>>
>> On 10/2/17, Michael Shuler  wrote:
>>> On 10/02/2017 10:53 AM, Lutaya Shafiq Holmes wrote:
>>>>
>>>> How do I install Cassandra on AWS- Amazon web services
>>>
>>> Follow the installation instructions on the following page, relevant to
>>> your OS of choice:
>>>
>>>   http://cassandra.apache.org/download/
>>>
>>> Let the list know if you have any problems!
>>>
>>> --
>>> Kind regards,
>>> Michael
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>>
>
>
>
>


-- 
Lutaaya Shafiq
Web: www.ronzag.com | i...@ronzag.com
Mobile: +256702772721 | +256783564130
Twitter: @lutayashafiq
Skype: lutaya5
Blog: lutayashafiq.com
http://www.fourcornersalliancegroup.com/?a=shafiqholmes

"The most beautiful people we have known are those who have known defeat,
known suffering, known struggle, known loss and have found their way out of
the depths. These persons have an appreciation, a sensitivity and an
understanding of life that fills them with compassion, gentleness and a
deep loving concern. Beautiful people do not just happen." - *Elisabeth
Kubler-Ross*




What is performance gain of clustering columns

2017-10-03 Thread eugene miretsky
Hi,

Clustering columns are used to order the data in a partition. However,
since data is split into SSTables, the rows are ordered by clustering key
only within each SSTable. Cassandra still needs to check all SSTables, and
merge the data if it is found in several SSTables. The only scenario where
I can imagine a big performance gain is super wide partitions, where each
partition is within a single SSTable (time series data, where partition
keys are time-buckets)

Has anybody done benchmarks on that and can share the data model they have
used?

Cheers,
Eugene


Read/Write Latency - Cassandra 2.1.15 vs 3.10

2017-10-03 Thread Anumod Mullachery
Hi,

We were running Splunk queries to pull read/write latency.

It's working fine in 2.1.15, but not returning results from the upgraded
version 3.10.

The bean used in the script is shown below.

Let me know if there were any changes in functionality between 2.1.15 and
3.10, or if it was replaced by some other bean.


perf_queries= {
   "org.apache.cassandra.db:type=StorageProxy" => "RecentReadLatencyMicros,RecentWriteLatencyMicros",
}

stage_queries= {
   "org.apache.cassandra.request:type=*" => "ActiveCount,PendingTasks,CurrentlyBlockedTasks",
}

curl http://localhost:8778/jolokia/read/org.apache.cassandra.db:type=StorageProxy/RecentReadLatencyMicros,RecentWriteLatencyMicros

curl http://localhost:8778/jolokia/read/org.apache.cassandra.request:type=*/ActiveCount,PendingTasks,CurrentlyBlockedTasks


~ Thanks ~

Anumod


Re: cassandra hardware requirements (SATA/SSD)

2017-10-03 Thread Jeronimo de A. Barros
Hello,

It's a bit old but, at least for me, still a great guide:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html

My 2 cents: We deal with electronic invoices and our load is about 10,000
transactions/s during peak hours.

We are not located in the USA, so AWS would be a bit expensive for our project.
After a lot of research, tests and simulations (technical and financial) I
decided to purchase four Supermicro MicroCloud 5038ML-H12TRF chassis with Xeon
E3 series, 32GB RAM and 4 x 2.5" 1TB spinning disks per node. The chassis are
divided between 2 DCs with 2 x 1Gbps redundant links for DC/DC interconnection,
and the blades of each chassis are divided as follows: 3 x Cassandra 2.0 nodes
for production, 3 x Cassandra 3.x (tests for migration), 3 x Spark / Hadoop for
analytics and 3 x application servers. So, we have a 12-node Cassandra 2.0
cluster that has been working fine for the last 3 years with very low
latency and overhead. Some bumps here and there, but with proper
management and monitoring we can deal with almost everything.

Although we use 2.5" disks, we always check Backblaze's hard drive
reliability reports before purchasing any disks:
https://www.backblaze.com/blog/hard-drive-failure-rates-q1-2017/

On Cas 2.0 we started using separate disks for the data_file_directories.
On Cas 3.x, following Al Tobey's guide, we're using MD RAID0 with an XFS
filesystem and the performance is far better than on Cas 2.0.

I hope it helps.

Jero


On Fri, Sep 29, 2017 at 3:19 AM, Peng Xiao <2535...@qq.com> wrote:

> Hi there,
> we are struggling with hardware selection. We all know that SSD is good,
> and DataStax suggests we use SSD. As Cassandra is a CPU-bound DB, we are
> considering using SATA disks; we noticed that the normal IO throughput is
> 7MB/s.
>
> Could anyone give some advice?
>
> Thanks,
> Peng Xiao
>
>


Re: How do I install Cassandra on AWS

2017-10-03 Thread Michael Shuler
How to EC2:

  https://aws.amazon.com/ec2/getting-started/

After "Step 4: Connect to your instance", then install Cassandra as in
the steps on the download page.

If you're looking for details on instance configuration and cluster
strategy, a quick search of "ec2 cassandra" found me the AWS whitepaper
with guidelines and best practices:

  https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf

You might get better help if you provide some specific step you
performed, what you expected to happen, an error you got, what you tried
to fix it, etc.

-- 
Michael

On 10/03/2017 06:28 AM, Lutaya Shafiq Holmes wrote:
>  How do I install Cassandra on AWS- Amazon web services
> 
> The instructions are not listed there
> 
> On 10/2/17, Michael Shuler  wrote:
>> On 10/02/2017 10:53 AM, Lutaya Shafiq Holmes wrote:
>>>
>>> How do I install Cassandra on AWS- Amazon web services
>>
>> Follow the installation instructions on the following page, relevant to
>> your OS of choice:
>>
>>   http://cassandra.apache.org/download/
>>
>> Let the list know if you have any problems!
>>
>> --
>> Kind regards,
>> Michael
>>
>>
>>
> 
> 





Re: How do I install Cassandra on AWS

2017-10-03 Thread Russell Bateman

http://lmgtfy.com/?q=how+to+install+cassandra+on+aws


On 10/03/2017 05:28 AM, Lutaya Shafiq Holmes wrote:
> How do I install Cassandra on AWS- Amazon web services
>
> The instructions are not listed there
>
> On 10/2/17, Michael Shuler wrote:
>> On 10/02/2017 10:53 AM, Lutaya Shafiq Holmes wrote:
>>>
>>> How do I install Cassandra on AWS- Amazon web services
>>
>> Follow the installation instructions on the following page, relevant to
>> your OS of choice:
>>
>>   http://cassandra.apache.org/download/
>>
>> Let the list know if you have any problems!
>>
>> --
>> Kind regards,
>> Michael









Re: How do I install Cassandra on AWS

2017-10-03 Thread Lutaya Shafiq Holmes
 How do I install Cassandra on AWS- Amazon web services

The instructions are not listed there

On 10/2/17, Michael Shuler  wrote:
> On 10/02/2017 10:53 AM, Lutaya Shafiq Holmes wrote:
>>
>> How do I install Cassandra on AWS- Amazon web services
>
> Follow the installation instructions on the following page, relevant to
> your OS of choice:
>
>   http://cassandra.apache.org/download/
>
> Let the list know if you have any problems!
>
> --
> Kind regards,
> Michael
>
>
>






Re: Migrating a Limit/Offset Pagination and Sorting to Cassandra

2017-10-03 Thread kurt greaves
I get the impression that you are paging through a single partition in
Cassandra? If so you should probably use bounds on clustering keys to get
your "next page". You could use LIMIT as well here but it's mostly
unnecessary. Probably just use the pagesize that you intend for the API.

Yes, you'll need a table for each sort order, which ties into how you would
use clustering keys for LIMIT/OFFSET. Essentially just do range slices on
the clustering keys for each table to get your "pages".
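
The "bounds on clustering keys" approach can be sketched with an in-memory stand-in for a single partition. This is a toy model, not driver code: the `next_page` helper and the sample rows are hypothetical, and the function mirrors a query of the shape `SELECT ... WHERE pk = ? AND ck > ? LIMIT ?`.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Rows of one partition, already sorted by clustering key.
Row = Tuple[int, str]  # (clustering_key, payload)


def next_page(rows: List[Row], after: Optional[int], page_size: int) -> List[Row]:
    """Return up to page_size rows whose clustering key is greater than `after`.

    `after` is the last clustering key of the previous page, or None for
    the first page.
    """
    keys = [ck for ck, _ in rows]
    start = 0 if after is None else bisect_right(keys, after)
    return rows[start:start + page_size]


partition = [(i, f"item-{i}") for i in range(1, 8)]  # clustering keys 1..7

page1 = next_page(partition, None, 3)
page2 = next_page(partition, page1[-1][0], 3)
page3 = next_page(partition, page2[-1][0], 3)
print(page1, page2, page3)
```

The client carries the last clustering key of a page as its cursor instead of an offset; translating an arbitrary OFFSET into such a cursor is the part this sketch does not solve.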

Also, I'm assuming there's a lot of data per partition if in-memory sorting
isn't an option. If this is true, you will want to be wary of creating large
partitions and reading them all at once, although this depends on your data
model and compaction strategy choices.

On 3 October 2017 at 08:36, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:

> Hi,
> I am currently working on migrating a service that so far was MySQL based
> to Cassandra.
> Everything seems to work fine so far, but a few things in the old service's
> API spec are posing some interesting data modeling challenges:
>
> The old service was doing Limit/Offset pagination, which is obviously
> something Cassandra can't really do. I understand how paginationState works,
> but so far I haven't figured out a good way to make Limit/Offset work on
> top of paginationState (as I need to be 100% backwards compatible).
> The only ways I could think of to make Limit/Offset work would
> create scalability issues down the road.
>
> The old service allowed sorting by any field. If I understood correctly,
> that would require a table for each sort order, right? (In-memory sorting is
> not an option, unfortunately.)
> In doing so, how can I make the Java Datastax mapper save to another table?
> (I really don't want to be writing a subclass of the entity for each table
> to add the @Table annotation.)
>
> greetings Daniel
>


Migrating a Limit/Offset Pagination and Sorting to Cassandra

2017-10-03 Thread Daniel Hölbling-Inzko
Hi,
I am currently working on migrating a service that so far was MySQL based
to Cassandra.
Everything seems to work fine so far, but a few things in the old service's
API spec are posing some interesting data modeling challenges:

The old service was doing Limit/Offset pagination, which is obviously
something Cassandra can't really do. I understand how paginationState works,
but so far I haven't figured out a good way to make Limit/Offset work on
top of paginationState (as I need to be 100% backwards compatible).
The only ways I could think of to make Limit/Offset work would create
scalability issues down the road.

The old service allowed sorting by any field. If I understood correctly,
that would require a table for each sort order, right? (In-memory sorting is
not an option, unfortunately.)
In doing so, how can I make the Java Datastax mapper save to another table?
(I really don't want to be writing a subclass of the entity for each table
to add the @Table annotation.)

greetings Daniel