Re: Decommissioned nodes show as DOWN in Cassandra version 3.10

2017-06-12 Thread Vladimir Yudovin
Hi, you can use http://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsRemoveNode.html or if this doesn't work ("It is a last resort tool if you cannot successfully use nodetool removenode.") http://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsAssassinate.html Best

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes, I am not planning to go with MVs; I am trying to implement this myself. Maybe I will get some ideas by using cassandra-stress for data generation and so on. Thanks, Jonathan. On Tue, Jun 13, 2017 at 10:44 AM, Jonathan Haddad wrote: > Unless you're willing to put in a lot of

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jonathan Haddad
Unless you're willing to put in a lot of time fixing bugs, I'd recommend avoiding 3.0's materialized views and managing them yourself. On Mon, Jun 12, 2017 at 6:11 PM @Nandan@ wrote: > Correct, Our first concern is to store huge READ and WRITE, for that > Cassandra

Re: Bottleneck for small inserts?

2017-06-12 Thread Eric Pederson
Hi all - I wanted to follow up on this. I'm happy with the throughput we're getting but I'm still curious about the bottleneck. The big thing that sticks out is one of the nodes is logging frequent GCInspector messages: 350-500ms every 3-6 seconds. All three nodes in the cluster have identical

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Correct. Our first concern is handling a huge volume of reads and writes, and for that Cassandra is our first and best choice. But our use case also requires advanced search, such as partial-text and phrase search, so we are trying to work out the best way to design the data model. On Tue, Jun 13,

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Michael, an MV is also a good option when we select based on an equality search, but here the requirement is to develop a model for advanced partial search. Also, in the case of MVs, suppose we have 2 DCs with 3 nodes in each DC; then the MV will replicate data 6*6 times, which will be

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
OK, then let's try to implement it and use cassandra-stress to check what the performance will be. I worked on another data model for book storage at my company, in a similar situation: 1 single table with 80 columns and bookid uuid as the primary key. Implemented Solr on top of

Re: Convert single node C* to cluster (rebalancing problem)

2017-06-12 Thread Akhil Mehra
Great point, John. The OP should also note that data distribution also depends on your schema and incoming data profile. If your schema is not modelled correctly, you can easily end up with unevenly distributed data. Cheers, Akhil On Tue, Jun 13, 2017 at 3:36 AM, John Hughes

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
Agreed. I meant, as Jonathan said, to use C* for primary-key access and as the primary storage, and ES as an indexed version of what you have in Cassandra. 2017-06-12 19:19 GMT+02:00 DuyHai Doan : > Sorry, I misread some reply I had the impression that people recommend ES > as primary

Decommissioned nodes show as DOWN in Cassandra version 3.10

2017-06-12 Thread pabbireddy avinash
Hi, in Cassandra version 3.10, after we decommission a node or datacenter, we observe the decommissioned nodes marked as DOWN in the cluster when we run "nodetool describecluster". The nodes, however, do not show up in the "nodetool status" output. The decommissioned node also does not show

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread DuyHai Doan
Sorry, I misread some replies; I had the impression that people were recommending ES as a primary datastore. On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad wrote: > Nobody is promoting ES as a primary datastore in this thread. Every > mention of it is to accompany C*. > > > > On Mon,

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jonathan Haddad
Nobody is promoting ES as a primary datastore in this thread. Every mention of it is to accompany C*. On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan wrote: > For all those promoting ES as a PRIMARY datastore, please read this before: > >

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread DuyHai Doan
For all those promoting ES as a PRIMARY datastore, please read this first: https://discuss.elastic.co/t/elasticsearch-as-a-primary-database/85733/13 There are a lot of warnings to heed before recommending ES as a datastore. The answer from Pilato, official ES evangelist: - You absolutely care

Re: Convert single node C* to cluster (rebalancing problem)

2017-06-12 Thread John Hughes
Is the OP expecting a perfect 50%/50% split? That, in my experience, is not going to happen; it is almost always off by anywhere from a fraction of a percent to a couple of percent. Datacenter: eu-west === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Michael Mior
For queries 1-5 this seems like a potentially good use case for materialized views. Create one table with the videos stored by ID and the materialized views for each of the queries. -- Michael Mior mm...@apache.org 2017-06-11 22:40 GMT-04:00 @Nandan@ : > Hi, > >
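A minimal sketch of the one-base-table-plus-views layout suggested here, assuming a base table with single-valued text columns as posted elsewhere in this thread. The view name `videos_by_actor` and the literal values are hypothetical, and other replies in the thread advise caution with materialized views in 3.0.

```cql
-- Base table keyed by video ID (column names taken from the thread).
CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor text,
    producer text,
    music text
);

-- One view per single-column query. A view's primary key must contain
-- every base primary key column and may add at most one other column.
CREATE MATERIALIZED VIEW videos_by_actor AS
    SELECT * FROM videos
    WHERE actor IS NOT NULL AND videoid IS NOT NULL
    PRIMARY KEY (actor, videoid);

-- Equality lookup served by the view.
SELECT videoid, title FROM videos_by_actor WHERE actor = 'Some Actor';
```

Because a view can add only one non-key column to its primary key, a combined lookup on actor and producer does not map directly onto a single view of this shape.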

Re: Using Cassandra for my usecase

2017-06-12 Thread Eric Evans
On Sat, Jun 10, 2017 at 11:07 AM, Govindarajan Srinivasaraghavan wrote: > Hi All, > > Just to give a background I'm working on a project where I need to store > fast incoming time series data and have rest api's to query and serve the > data to users when needed. The data

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Jason Brown
removing dev@ from this conversation, as the thread is more appropriately for user@ On Mon, Jun 12, 2017 at 4:51 AM, Eduardo Alonso wrote: > -Virtual tokens are not recommended when using SOLR or > cassandra-lucene-index. > > If you use your table schema you will not

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
-Virtual tokens are not recommended when using SOLR or cassandra-lucene-index. If you use your table schema you will not have any problem with partition size because your table is *not* a WIDE row table (it does not have clustering keys) The limit for 1 record with those 15 or 20 columns must not

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
And with a single videos table, which may end up with around 15 to 20 columns, we also need to think very carefully about partition sizes. On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ wrote: > Yes this is only Option I am also thinking like this as my second

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes, this is the only option; I was also considering it as my second option. Before this I was thinking of denormalizing tables based on the search columns, but because of partial search that will not be very effective. Now suppose we go with this single table, videos, and implement it with

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Using Cassandra collections: CREATE TABLE videos ( videoid uuid PRIMARY KEY, title text, actor list<text>, producer list<text>, release_date timestamp, description text, music text, etc... ); When using collections you need to take care of their length. Collections are designed to store
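As a rough illustration of the collection-based schema above, the following sketch shows how several actors and producers could be written to one row. The element type `text`, the sample values, and the UUID literal are assumptions made for the example, not taken from the thread.

```cql
-- Single videos table with list columns (element type assumed to be text).
CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor list<text>,
    producer list<text>,
    release_date timestamp,
    description text,
    music text
);

-- Insert a row with several actors and one producer.
INSERT INTO videos (videoid, title, actor, producer)
VALUES (123e4567-e89b-12d3-a456-426655440000, 'Example Title',
        ['Actor One', 'Actor Two'], ['Producer One']);

-- Append one more actor to the existing list.
UPDATE videos SET actor = actor + ['Actor Three']
WHERE videoid = 123e4567-e89b-12d3-a456-426655440000;
```

A collection is stored and read together with its row, which is one reason the length of each list needs attention.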

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
So, in short, we have to go with one single table, videos, with videoid uuid as the primary key. But then how can we handle multiple actor names and producer names? On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso wrote: > Yes, you are right. > > Table

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Yes, you are right. Table denormalization is useful only when you have unique primary keys, which is not your case. Denormalized tables differ only in their primary key; every denormalized table contains all the data (only how it is structured changes). So, if you need to index it, do it with just

Re: Convert single node C* to cluster (rebalancing problem)

2017-06-12 Thread Akhil Mehra
auto_bootstrap is true by default. Ensure it's set to true. On startup, check your logs for the auto_bootstrap value; look at the node configuration line in your log file. Akhil On Mon, Jun 12, 2017 at 6:18 PM, Junaid Nasir wrote: > No, I didn't set it (left it at default

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Eduardo, we are trying to build advanced search functionality in which we can do partial searches on the actor, producer, director, etc. columns. So if we denormalize the tables, we have to create tables such as the following: video_by_actor video_by_producer
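For illustration, the denormalized tables named above might look like the sketch below. The column sets, the logged batch used to keep both tables in step, and all literal values are assumptions, not the poster's actual schema.

```cql
-- One table per lookup column; each row repeats the video data.
CREATE TABLE video_by_actor (
    actor text,
    videoid uuid,
    title text,
    PRIMARY KEY (actor, videoid)
);

CREATE TABLE video_by_producer (
    producer text,
    videoid uuid,
    title text,
    PRIMARY KEY (producer, videoid)
);

-- The application writes every video to each table itself.
BEGIN BATCH
    INSERT INTO video_by_actor (actor, videoid, title)
    VALUES ('Actor One', 123e4567-e89b-12d3-a456-426655440000, 'Example Title');
    INSERT INTO video_by_producer (producer, videoid, title)
    VALUES ('Producer One', 123e4567-e89b-12d3-a456-426655440000, 'Example Title');
APPLY BATCH;

-- Served directly: equality on the partition key.
SELECT videoid, title FROM video_by_actor WHERE actor = 'Actor One';
```

These tables answer exact-match lookups on the full actor or producer name only; a partial match on the name is not served by the partition key, which is the limitation the thread goes on to discuss.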

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Eduardo, as you mentioned queries 1-6, in this case we have to proceed with a table like the one below: create table videos ( videoid uuid primary key, title text, actor text, producer text, release_date timestamp, description text, music text, etc... ); This table will help to store video

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
TLDR shouldBe *PD Eduardo Alonso Vía de las dos Castillas, 33, Ática 4, 3ª Planta 28224 Pozuelo de Alarcón, Madrid Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd * 2017-06-12 10:58 GMT+02:00 Eduardo Alonso : > Hi Nandan: > > So,

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Eduardo Alonso
Hi Nandan: So, your system must provide these queries: 1 - SELECT video FROM ... WHERE actor = '...'; 2 - SELECT video FROM ... WHERE producer = '...'; 3 - SELECT video FROM ... WHERE music = '...'; 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...'; 5 - SELECT video FROM ...

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
But the condition is that I am working with Apache Cassandra, in which I have to store my data and then implement partial search capability. If we need to search on the full primary key, then Cassandra is really the best and easiest to work with, but in case of flexible

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
I haven't run Solr with Cassandra myself. I just meant to run Elasticsearch as a completely separate service and write there as well. > On 12 Jun 2017, at 09:45, @Nandan@ wrote: > > Do you mean to use Elastic Search with Cassandra? > Even I am thinking to use

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Do you mean using Elasticsearch with Cassandra? I am also thinking of using Apache Solr with Cassandra. In that case I have to create distributed tables such as: 1) video_by_title, video_by_actor, video_by_year, etc. 2) After creating the tables, I will have to configure a Solr core on all tables. Is

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread Oskar Kjellin
Why not elasticsearch for this use case? It will make your life much simpler > On 12 Jun 2017, at 04:40, @Nandan@ wrote: > > Hi, > > Currently, I am working on data modeling for Video Company in which we have > different types of users as well as different

Re: Using Cassandra for my usecase

2017-06-12 Thread Oskar Kjellin
You could put the tenant as a column that is part of the clustering key. That avoids large partitions. On 12 Jun 2017, at 07:14, Erick Ramirez wrote: >> Given my use case is cassandra the best suited one or is there any other >> database which suits my requirement

Re: Convert single node C* to cluster (rebalancing problem)

2017-06-12 Thread Junaid Nasir
No, I didn't set it (left it at default value) On Fri, Jun 9, 2017 at 3:18 AM, ZAIDI, ASAD A wrote: > Did you make sure auto_bootstrap property is indeed set to [true] when > you added the node? > > > > *From:* Junaid Nasir [mailto:jna...@an10.io] > *Sent:* Monday, June 05, 2017