DTCS Question

2016-03-19 Thread Anubhav Kale
I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does STCS within the DTCS windows). It also introduced a field called MAX_WINDOW_SIZE which defaults to one day. So in my data folders, I may see SS Tables that span beyond a day (generated through old data through repairs or

Apache Cassandra's license terms

2016-03-19 Thread Rakesh Kumar
What type of Open source license does Cassandra follow? If we use open source Cassandra for a revenue generating product, are we expected to contribute back our code to the open source. thanks

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Tom van den Berge
> > Is text the most appropriate data type to store JSON that contain couple > of dozen lines ? > It sure is the simplest way to store JSON. The query requirement is "where executedby = ?". > Since executedby is a timeuuid, I guess you don't want to query a single record, since that would

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Jack Krupansky
executedby is the ID assigned to an employee. I'm presuming that JSON is to be used for objectbefore/after. This suggests no ability to query by individual object fields. I didn't sense any other columns that would be JSON. -- Jack Krupansky On Wed, Mar 16, 2016 at 3:48 PM, Tom van den Berge

cqlsh problem

2016-03-19 Thread joseph gao
hi, all cassandra version 2.1.7 When I use cqlsh to connect cassandra, something is wrong Connection error: ('Unable to connect to any servers', {'127.0.0.1': OperationTimedOut('errors=None, last_host=None',)}) This happens lots of times, but sometimes it works just fine. Anybody knows why? --

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Clint Martin
I would arrange your primary key by how You intend to query. Primary key ((executedby), auditid) This allows you to query for who did it, and optionally on a time range for when it occurred. Retrieving in chronological order. You could do it with your proposed schema and Lucene but for what
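
The key layout suggested above, PRIMARY KEY ((executedby), auditid), can be pictured with a toy in-memory model. This is only a sketch, not Cassandra code: the class and method names are hypothetical, and plain integers stand in for the time-ordered auditid. Each executedby value owns one partition, and rows inside it stay sorted by audit time, so a per-user query with an optional time range comes back in chronological order.

```python
import bisect
from collections import defaultdict


class AuditTrail:
    """Toy model of PRIMARY KEY ((executedby), auditid); all names are hypothetical."""

    def __init__(self):
        # partition key (executedby) -> rows kept sorted by clustering key (audit_time)
        self._partitions = defaultdict(list)

    def record(self, executedby, audit_time, details):
        # insort keeps each partition ordered, like a clustering column would
        bisect.insort(self._partitions[executedby], (audit_time, details))

    def by_user(self, executedby, start=None, end=None):
        # "who did it", optionally limited to the half-open range [start, end)
        rows = self._partitions.get(executedby, [])
        return [
            row for row in rows
            if (start is None or row[0] >= start) and (end is None or row[0] < end)
        ]
```

A query without executedby has no partition to go to in this model, which is the reason for arranging the key around the intended query.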

Re: Single node Solr FTs not working

2016-03-19 Thread Jack Krupansky
Have you verified that the documented reference example functions as expected on your system? If so, then incrementally morph it towards your own code to discover exactly at which stage the problem occurs. Or just having the reference example side by side with your own code/schema/table will help

RE: DTCS bucketing Question

2016-03-19 Thread Anubhav Kale
CIL From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Thursday, March 17, 2016 11:01 AM To: user@cassandra.apache.org Subject: Re: DTCS bucketing Question > am trying to concretely understand how DTCS makes buckets and I am looking > at the

Re: DTCS Question

2016-03-19 Thread Marcus Eriksson
On Wed, Mar 16, 2016 at 6:49 PM, Anubhav Kale wrote: > I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does > STCS within the DTCS windows). It also introduced a field called > MAX_WINDOW_SIZE which defaults to one day. > > > > So in my data

Re: Experiencing strange disconnect issue

2016-03-19 Thread Steve Robenalt
Hi Bo, I would suggest adding: .withReconnectionPolicy(new ExponentialReconnectionPolicy(1000,3)) or something similar to your cluster builder. Steve On Wed, Mar 16, 2016 at 11:18 AM, Bo Finnerup Madsen wrote: > Hi Sean, > > Thank you for taking the time to
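
For readers unfamiliar with exponential reconnection, here is a minimal sketch of the delay schedule such a policy produces. In the Java driver the two constructor arguments are the base delay and the maximum delay, both in milliseconds. The function name and the 60000 ms cap used in the note below are assumptions for illustration, not driver code.

```python
def exponential_delays(base_delay_ms, max_delay_ms, attempts):
    """Delays (ms) before each reconnection attempt: doubling, then capped.

    Hypothetical helper for illustration; not part of any driver.
    """
    if base_delay_ms <= 0:
        raise ValueError("base delay must be positive")
    if max_delay_ms < base_delay_ms:
        raise ValueError("max delay cannot be smaller than base delay")
    return [min(base_delay_ms * 2 ** attempt, max_delay_ms) for attempt in range(attempts)]
```

With a base of 1000 ms and a cap of 60000 ms, the first eight delays are 1000, 2000, 4000, 8000, 16000, 32000, 60000 and 60000 ms. The sketch rejects a maximum smaller than the base, since the cap would otherwise undercut the very first delay.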

Re: Deploy latest cassandra on top of datastax-ddc ?

2016-03-19 Thread Mohamed Lrhazi
because I have no clue... :) So, after doing an ant build from the latest source... how would one "install" or deploy cassandra? Could not find a document on the install from source part... any pointers? All I find makes use of yum or apt repo's, or deploy from binary tarball... Thanks a lot,

Re: Strategies for avoiding corrupted duplicate data?

2016-03-19 Thread Clint Martin
Light weight transactions are going to be somewhat key to this. As are batches. The interesting thing about these views is that changing an email address is not the same operation on all of them. For The users by email view you have to delete a given existing row and insert a new one. For the

Re: Questions about Datastax support

2016-03-19 Thread Jack Krupansky
Maybe the question is what the fourth component of the release number actually means. The key point is simply that they have included additional fixes beyond the base Apache version - fixes that show up in future Apache releases that hadn't been released as of when they tested their DSE release.

Python to type field

2016-03-19 Thread Rakesh Kumar
Hi I have a type defined as follows CREATE TYPE etag ( ttype int, tvalue text ); And this is used in a col of a table as follows evetag list > I have the following value in a file [{ttype: 3 , tvalue: '90A1'}] This gets inserted via COPY command with no issues. However when I try

Deploy latest cassandra on top of datastax-ddc ?

2016-03-19 Thread Mohamed Lrhazi
Would simply overriding this one jar file do it? else could you please share a procedure? [root@avesterra-prod-1 ~]# rpm -qa| grep stax datastax-ddc-tools-3.2.1-1.noarch datastax-ddc-3.2.1-1.noarch [root@avesterra-prod-1 ~]# cp /tmp/apache-cassandra-3.6-SNAPSHOT.jar

Re: Read consistency

2016-03-19 Thread Alain RODRIGUEZ
Hi Arko, Never used that consistency level so far, but here is some information: http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_tunable_consistency_c.html Cassandra 2.0 uses the Paxos consensus protocol, which resembles 2-phase > commit, to support linearizable consistency. All

Re: Multi DC setup for analytics

2016-03-19 Thread Reddy Raja
Yes. Here are the steps. You will have to change the DC Names first. DC1 and DC2 would be independent clusters. Create a new DC, DC3 and include these two DC's on DC3. This should work well. On Thu, Mar 17, 2016 at 11:03 PM, Clint Martin < clintlmar...@coolfiretechnologies.com> wrote: > When

discrepancy in up nodes from different nodes

2016-03-19 Thread Surbhi Gupta
Hi, I have changed endpoint_snitch from Simple to GossipingPropertyFileSnitch. And changed the cassandra-rackdc.properties file to reflect the correct DC and RACK. However when I did rolling restart then one node is showing 15 nodes up, other node is showing 10 nodes up etc. I have done rolling

Understanding SELECT * paging/ordering

2016-03-19 Thread Dan Checkoway
Say I have a table with 50M rows in a keyspace with RF=3 in a cluster of 15 nodes (single local data center). When I do "SELECT * FROM table" and page through those results (with a fetch size of say 1000), I'd like to understand better how that paging works. Specifically, what determines the
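
As background to the question above: a full-table scan returns partitions in token order, that is, ordered by the partitioner's hash of the partition key rather than by the key itself, and each page continues where the previous one stopped. The sketch below illustrates that idea only. It uses MD5 as a stand-in hash (the default Murmur3 partitioner is not in the Python standard library), and the function names are hypothetical.

```python
import hashlib


def token(partition_key):
    # Stand-in token: an MD5-based integer, NOT the Murmur3 token Cassandra uses by default
    return int.from_bytes(hashlib.md5(partition_key.encode("utf-8")).digest(), "big")


def scan_pages(partition_keys, fetch_size):
    """Yield pages of keys in token order, at most fetch_size keys per page."""
    ordered = sorted(partition_keys, key=token)
    for start in range(0, len(ordered), fetch_size):
        yield ordered[start:start + fetch_size]
```

Iterating over scan_pages(keys, 1000) visits every key exactly once, in an order that looks random with respect to the keys but is stable from one scan to the next as long as the data does not change.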

Re: What does FileCacheService's log message (invalidating cache) mean?

2016-03-19 Thread Satoshi Hikida
Sorry there is a mistake in my previous post. I would correct it. In Q3, I mentioned there are a lot of invalidating messages in the debug.log. It is true but cassandra configurations were wrong. In that case, the cassandra.yaml configurations are as follows: - cassandra.yaml -

Re: DTCS bucketing Question

2016-03-19 Thread Jeff Jirsa
> am trying to concretely understand how DTCS makes buckets and I am looking > at the DateTieredCompactionStrategyTest.testGetBuckets method and played with > some of the parameters to GetBuckets method call (Cassandra 2.1.12). I don’t > think I fully understand something there. Don’t feel

Re: cqlsh problem

2016-03-19 Thread Alain RODRIGUEZ
Hi, did you try with the address of the node rather than 127.0.0.1 Is the transport protocol used by cqlsh (not sure if it is thrift or binary - native in 2.1) active ? What is the "nodetool info" output ? C*heers, --- Alain Rodriguez - al...@thelastpickle.com France The

Re: cqlsh problem

2016-03-19 Thread Alain RODRIGUEZ
Is the node fully healthy or rejecting some requests ? What are the outputs for "grep -i "ERROR" /var/log/cassandra/system.log" and "nodetool tpstats"? Any error? Any pending / blocked or dropped messages? Also did you try using distinct ports (9160 for thrift, 9042 for native) - out of

Re: Experiencing strange disconnect issue

2016-03-19 Thread Bo Finnerup Madsen
Hi Sean, Thank you for taking the time to answer :) We are using a very vanilla connection, without any sorts of tuning policies. The cluster/session is constructed as follows: final Cluster cluster = Cluster.builder() .addContactPoints(key.getContactPoints())

Re: How can I make Cassandra stable in a 2GB RAM node environment ?

2016-03-19 Thread Alain RODRIGUEZ
Hi, I am not sure I understood your message correctly but I will try to answer it. but, I think, in Cassandra case, it seems a matter of how much data we use > with how much memory we have. If you are saying you can use poor commodity servers (vertically scale poorly) and just add nodes

Parallel bootstraps in two DCs with NetworkTopologyStrategy

2016-03-19 Thread Gabriel Wicke
Hi, we are in the process of expanding a multi-DC cluster, and are wondering if it is safe to bootstrap one node per DC in parallel. My intuition would be that this should not lead to any token range overlaps (similar to bootstrapping multiple nodes in a rack), so *should* be safe. We are using

Re: Data modelling, including cleanup

2016-03-19 Thread Hannu Kröger
Hi, That’s how I have done it on many occasions. Nowadays there is the possibility to use Cassandra 3.0 and materialised views so that you don’t need to keep two tables up to date manually: http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views

Re: Experiencing strange disconnect issue

2016-03-19 Thread Bo Finnerup Madsen
Hi, I ran another test with the following client setup: final Cluster cluster = Cluster.builder() .addContactPoints(key.getContactPoints()) .withSocketOptions(new SocketOptions().setKeepAlive(true)) .withReconnectionPolicy(Policies.defaultReconnectionPolicy())

Re: Modeling Audit Trail on Cassandra

2016-03-19 Thread Jack Krupansky
Stratio (or DSE Search) should be good for ad hoc or complex queries, but if there are some fixed/common query patterns you might be better off implementing query tables or using materialized views. The latter allows you to include a non-PK data column in the PK of the MV so that you can directly

RE: Experiencing strange disconnect issue

2016-03-19 Thread SEAN_R_DURITY
Are you using any of the Tuning Policies (https://docs.datastax.com/en/developer/java-driver/2.0/common/drivers/reference/tuningPolicies_c.html)? It could be that you are hitting some peak load and the driver is not retrying hosts once they are marked “down.” Sean Durity – Lead Cassandra

Re: Getting Issue while setting up Cassandra in Windows 8.1

2016-03-19 Thread Paulo Motta
I don't see any apparent problem there, your cassandra process is running and ready to receive requests, but it's running on the foreground apparently. If you want to run C* as a service, you might be better off installing the DataStax Distribution of Apache Cassandra for Windows, which installs

Re: Questions about Datastax support

2016-03-19 Thread Jack Krupansky
1. They have a published support policy: http://www.datastax.com/support-policy/supported-software -- Jack Krupansky On Thu, Mar 17, 2016 at 10:09 AM, Rakesh Kumar wrote: > Few questions: > > 1 - Has there been an announcement as to when Datastax will stop >

Questions about Datastax support

2016-03-19 Thread Rakesh Kumar
Few questions: 1 - Has there been an announcement as to when Datastax will stop supporting 2.x version. I am aware that the community will stop supporting 2.x in Nov 2016. What about support to paid customers of Datastax. Will it go beyond Nov. 2 - Are there any plans by

DTCS bucketing Question

2016-03-19 Thread Anubhav Kale
Hello, I am trying to concretely understand how DTCS makes buckets and I am looking at the DateTieredCompactionStrategyTest.testGetBuckets method and played with some of the parameters to GetBuckets method call (Cassandra 2.1.12). I don't think I fully understand something there. Let me try
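
The bucketing idea asked about above can be sketched roughly as follows. This is a simplified illustration of date-tiered windowing as commonly described, not the Cassandra source: the names Window, bucket_timestamps and max_window are hypothetical, timestamps are assumed to be non-negative integers, and each sstable is reduced to a single timestamp. Windows tile time backwards from now; a window merges into one base times larger when its position is a multiple of base, unless that would exceed the optional maximum window size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    size: int      # window length, in the same unit as the timestamps
    position: int  # window covers [position * size, (position + 1) * size)

    def contains(self, timestamp):
        return timestamp // self.size == self.position

    def older_neighbour(self, base, max_window=None):
        # Grow by a factor of `base` only on aligned positions and below the cap
        can_grow = max_window is None or self.size * base <= max_window
        if self.position % base > 0 or not can_grow:
            return Window(self.size, self.position - 1)
        return Window(self.size * base, self.position // base - 1)


def bucket_timestamps(timestamps, now, base_time, base, max_window=None):
    """Group timestamps into date-tiered buckets, newest bucket first."""
    window = Window(base_time, now // base_time)
    buckets, current = [], []
    for ts in sorted(timestamps, reverse=True):
        if ts // window.size > window.position:
            continue  # newer than the window holding `now`; ignored in this sketch
        while not window.contains(ts):
            if current:
                buckets.append(current)
                current = []
            window = window.older_neighbour(base, max_window)
        current.append(ts)
    if current:
        buckets.append(current)
    return buckets
```

With now=100, base_time=10 and base=4, the windows going back in time are [100, 110), [90, 100), [80, 90), [40, 80) and [0, 40), so older data lands in progressively wider buckets. Passing max_window=10 keeps every window at the base size.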

Re: Experiencing strange disconnect issue

2016-03-19 Thread Steve Robenalt
Hi Bo, You might try sending the same question to the java driver mailing list. I haven't seen your particular error in several years of running Cassandra on AWS. The closest I saw in the past was due to a protocol error in the driver during the 2.0 beta timeframe. Steve On Wed, Mar 16, 2016 at

Re: discrepancy in up nodes from different nodes

2016-03-19 Thread Alain RODRIGUEZ
Hi Surbhi. No ideas come to mind directly... Could you provide the cassandra version, a view of your keyspaces replication, some example of your cassandra-rackdc.properties. Also we could use some "nodetool status ks" outputs. Did you make sure GPFS was exactly identical to simple snitch

Re: What does FileCacheService's log message (invalidating cache) mean?

2016-03-19 Thread Satoshi Hikida
Thank you for your very useful advice! Definitely, I'm using Cassandra V2.2.5 not 3.x. And basically I've understood what these logs mean. But I have a few more questions. So I would very much appreciate it if I could get some explanations about these questions. * Q1. In my understanding, when open a

Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
Fyi, this is the jira, https://issues.apache.org/jira/browse/CASSANDRA-11348 . We can move the discussion to the jira if want. On Thu, Mar 17, 2016 at 11:46 AM, Dikang Gu wrote: > Hi Eric, > > Thanks for sharing the information! > > We also mainly want to use it for

Single node Solr FTs not working

2016-03-19 Thread Joseph Tech
Hi, I had setup a single-node DSE 4.8.x to start in Search mode to explore some aspects of Solr search with field transformers (FT). Even though the configuration seems fine and Solr admin shows the indexed data, and searches on the actual fields (stored=true) work fine, but the FTs are not being

Re: Compaction Filter in Cassandra

2016-03-19 Thread Clint Martin
I would definitely be interested in this. Clint On Mar 15, 2016 9:36 PM, "Eric Stevens" wrote: > We have been working on filtering compaction for a month or so (though we > call it deleting compaction, its implementation is as a filtering > compaction strategy). The feature

Re: Read consistency

2016-03-19 Thread Robert Coli
On Tue, Mar 15, 2016 at 6:43 PM, Arko Provo Mukherjee < arkoprovomukher...@gmail.com> wrote: > I am designing a system where for a situation, I need to have SERIAL > consistency during writes. > Be sure to understand the implications of : https://issues.apache.org/jira/browse/CASSANDRA-9328

Data modelling, including cleanup

2016-03-19 Thread Bo Finnerup Madsen
Hi, We are pretty new to data modelling in cassandra, and are having a bit of a challenge creating a model that caters both for queries and updates. Let me try to explain it using the users example from http://www.datastax.com/dev/blog/basic-rules-of-cassandra-data-modeling They define two

Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
Hi Eric, Thanks for sharing the information! We also mainly want to use it for trimming data, either by the time or the number of columns in a row. We haven't started the work yet, do you mind to share some patches? We'd love to try it and test it in our environment. Thanks. On Tue, Mar 15,

Re: Question about SELECT command

2016-03-19 Thread Jack Krupansky
Yes, gossip is how Cassandra knows which nodes are alive in the cluster. But... that has nothing to do with SELECT. It's still not clear what you are really getting at. I mean, if you have gone through the (free) online training and (free) doc on Cassandra architecture, what is it you are still

Question about SELECT command

2016-03-19 Thread Thouraya TH
Hi all; Please, I have a question about the architecture behind SELECT command. Given this table: c1 c2 c3 value1 value2 value3 ... etc... lines of this table are distributed over nodes that's it ? Thank you so much

Re: Python to type field

2016-03-19 Thread Tyler Hobbs
This should be useful: http://datastax.github.io/python-driver/user_defined_types.html On Wed, Mar 16, 2016 at 1:18 PM, Rakesh Kumar wrote: > Hi > > I have a type defined as follows > > CREATE TYPE etag ( > ttype int, > tvalue text > ); > > And this is used

Re: What does FileCacheService's log message (invalidating cache) mean?

2016-03-19 Thread Stefania Alborghetti
Q1. Readers are created as needed, there is no fixed number. For example, we may have 2 threads scanning sstables at the same time due to 2 different CQL SELECT statements. Q2. There is no correlation between sstable size and JVM HEAP size. We don't load entire sstables in memory. Q3. It's

Re: cqlsh problem

2016-03-19 Thread Vishwas Gupta
Have you started the Cassandra service? sh cassandra On 17-Mar-2016 7:59 pm, "Alain RODRIGUEZ" wrote: > Hi, did you try with the address of the node rather than 127.0.0.1 > > Is the transport protocol used by cqlsh (not sure if it is thrift or > binary - native in 2.1)