Re: batch_size_warn_threshold_in_kb

2014-12-11 Thread Ryan Svihla
-- [image: datastax_logo.png] http://www.datastax.com/ Ryan Svihla Solution Architect [image: twitter.png] https://twitter.com/foundev [image: linkedin.png] http://www.linkedin.com/pub/ryan-svihla/12/621/727/ DataStax is the fastest, most scalable distributed database technology, delivering Apache

Re: Get column family size

2014-12-11 Thread Ryan Svihla
Dilshan Wijayarathna, SMIEEE, SMIESL, Undergraduate, Department of Computer Science and Engineering, University of Moratuwa.

Re: Get column family size

2014-12-11 Thread Ryan Svihla
and Engineering, University of Moratuwa. -- Chamila Dilshan Wijayarathna, SMIEEE, SMIESL, Undergraduate, Department of Computer Science and Engineering, University of Moratuwa.

Re: Get column family size

2014-12-12 Thread Ryan Svihla
. I am running Cassandra on a single node and have 1 million+ rows. Thank you! On Fri, Dec 12, 2014 at 2:57 AM, Ryan Svihla rsvi...@datastax.com wrote: An estimated partition key count can be had from nodetool cfstats, however for large data sets analytics style queries (such as verification

Re: batch_size_warn_threshold_in_kb

2014-12-12 Thread Ryan Svihla
mutations, you will hit that threshold. In addition, Patrick is saying that he does not recommend more than 100 mutations per batch. So why not warn users just on the # of mutations in a batch? Mohammed *From:* Ryan Svihla [mailto:rsvi...@datastax.com] *Sent:* Thursday, December 11, 2014 12

Re: batch_size_warn_threshold_in_kb

2014-12-12 Thread Ryan Svihla
11, 2014 at 9:56 PM, Ryan Svihla rsvi...@datastax.com wrote: Nothing magic, just put in there based on experience. You can find the story behind the original recommendation here https://issues.apache.org/jira/browse/CASSANDRA-6487 Key reasoning for the desire comes from Patrick McFadden

Re: nodetool breaks on firewall ?

2014-12-12 Thread Ryan Svihla
effect? -- Founder/CEO Spinn3r.com Location: *San Francisco, CA* blog: http://burtonator.wordpress.com … or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com

Re: Using Per-Table Keyspaces for Tunable Replication

2014-12-12 Thread Ryan Svihla
Clarification: "keyspace for each" should read "keyspace for Cassandra tables and Solr tables". On Fri, Dec 12, 2014 at 11:25 AM, Ryan Svihla rsvi...@datastax.com wrote: It would make more sense to just have a keyspace for each, something like solr_tables and cassandra_tables. I've done similar

Re: Using Per-Table Keyspaces for Tunable Replication

2014-12-12 Thread Ryan Svihla
gotchas we should be concerned about? Our total table count is small, in the tens range; our searchable tables are maybe 4 or 5.

Re: nodetool breaks on firewall ?

2014-12-12 Thread Ryan Svihla
(which is firewalled) On Fri, Dec 12, 2014 at 5:19 AM, Ryan Svihla rsvi...@datastax.com wrote: Yes, the node needs to restart to have cassandra-env.sh take effect, and the links you're providing are about making Cassandra's JMX bind to the interface you want, so nodetool isn't really the issue

Re: nodetool breaks on firewall ?

2014-12-12 Thread Ryan Svihla
, Ryan Svihla rsvi...@datastax.com wrote: It appears to be localhost; I imagine the issue is more that you changed the rpc_address to not be localhost anymore https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java lines 87 and 88 private static

Re: nodetool breaks on firewall ?

2014-12-12 Thread Ryan Svihla
, {sa_family=AF_INET6, sin6_port=htons(7199), inet_pton(AF_INET6, :::173.x.x.x, sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28 unfinished ... On Fri, Dec 12, 2014 at 12:20 PM, Ryan Svihla rsvi...@datastax.com wrote: It appears to be localhost; I imagine the issue is more that you changed

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
? Mohammed *From:* Ryan Svihla [mailto:rsvi...@datastax.com] *Sent:* Thursday, December 11, 2014 12:56 PM *To:* user@cassandra.apache.org *Subject:* Re: batch_size_warn_threshold_in_kb Nothing magic, just put in there based on experience. You can find the story behind the original

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
, always good to push back on theory discussions with numbers. On Sat, Dec 13, 2014 at 8:12 AM, Ryan Svihla rsvi...@datastax.com wrote: Are batches to the same partition key (which results in a single mutation, and obviously eliminates the primary problem)? Is your client network and/or CPU bound

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Ryan Svihla
:* Jonathan Haddad j...@jonhaddad.com *Sent:* Friday, December 12, 2014 12:58 PM *To:* user@cassandra.apache.org ; Ryan Svihla rsvi...@datastax.com *Subject:* Re: batch_size_warn_threshold_in_kb The really important thing to really take away from Ryan's original post is that batches

Re: Cassandra Database using too much space

2014-12-14 Thread Ryan Svihla
and Engineering, University of Moratuwa.

Re: Cassandra Maintenance Best practices

2014-12-16 Thread Ryan Svihla
utility? 3. Is it necessary to run repair weekly? Thanks and regards, Neha

Re: Cassandra Maintenance Best practices

2014-12-16 Thread Ryan Svihla
, Dec 16, 2014 at 10:32 PM, Ryan Svihla rsvi...@datastax.com wrote: CL QUORUM with RF 2 is equivalent to ALL: writes will require acknowledgement from both nodes, and reads will be from both nodes. CL ONE will write to both replicas but return success as soon as the first one responds; read
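
A minimal sketch of the arithmetic behind the QUORUM/RF 2 point above, assuming the usual quorum rule of floor(RF / 2) + 1. The function names are illustrative only and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given level."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[consistency_level]


if __name__ == "__main__":
    # Compare the three levels for RF 2 and RF 3.
    for rf in (2, 3):
        for cl in ("ONE", "QUORUM", "ALL"):
            print(f"RF={rf} CL={cl}: {replicas_required(cl, rf)} replica(s)")
```

With RF 3, by contrast, QUORUM needs only 2 of the 3 replicas, so it can tolerate one replica being down, which RF 2 at QUORUM cannot.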

Re: Comprehensive documentation on Cassandra Data modelling

2014-12-16 Thread Ryan Svihla
) If I want to search on a column, it has to be part of the primary key 3) If a column is part of the primary key, it cannot be edited so I have a circular dependency Thanks, Jason

Re: Defining DataSet.json for cassandra-unit testing

2014-12-16 Thread Ryan Svihla
-- Chamila Dilshan Wijayarathna, SMIEEE, SMIESL, Undergraduate, Department of Computer Science and Engineering, University of Moratuwa.

Re: Changing replication factor of Cassandra cluster

2014-12-16 Thread Ryan Svihla
using the backup? Do I have to have the tokens range backed up as well? -Pranay

Re: Comprehensive documentation on Cassandra Data modelling

2014-12-16 Thread Ryan Svihla
-- *From:* Ryan Svihla rsvi...@datastax.com *To:* user@cassandra.apache.org *Sent:* Tuesday, December 16, 2014 12:36 PM *Subject:* Re: Comprehensive documentation on Cassandra Data modelling Data Modeling a distributed application could be a book unto itself

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
better on performance tuning would be appreciated. arne

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
that nice CPU further down. No TombstoneOverflowingExceptions. On Tue, Dec 16, 2014 at 11:50 AM, Ryan Svihla rsvi...@datastax.com wrote: What's CPU, RAM, Storage layer, and data density per node? Exact heap settings would be nice. In the logs look for TombstoneOverflowingException

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
also based on replayed batches..are you using batches to load data? On Tue, Dec 16, 2014 at 3:12 PM, Ryan Svihla rsvi...@datastax.com wrote: So heap of that size without some tuning will create a number of problems (high cpu usage one of them), I suggest either 8GB heap and 400mb parnew

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
pegged at load 4 for over 12 hours with hardly any read or write traffic. I will set one to 8GB/400MB and see if its load changes. On Tue, Dec 16, 2014 at 1:12 PM, Ryan Svihla rsvi...@datastax.com wrote: So heap of that size without some tuning will create a number of problems (high cpu usage

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
of tombstones in a row? thanks, arne On Tue, Dec 16, 2014 at 1:24 PM, Ryan Svihla rsvi...@datastax.com wrote: So 1024 is still a good 2.5 times what I'm suggesting, 6GB is hardly enough to run Cassandra well in, especially if you're going full bore on loads. However, you may be just flat out

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
. On Tue, Dec 16, 2014 at 1:47 PM, Ryan Svihla rsvi...@datastax.com wrote: Can you define what "virtually no traffic" is? Sorry to be repetitive about that, but I've worked on a lot of clusters in the past year and people have wildly different ideas of what that means. unlogged batches of the same partition

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
18 Compaction seems like the only thing consistently active and pending On Tue, Dec 16, 2014 at 2:18 PM, Ryan Svihla rsvi...@datastax.com wrote: Ok based on those numbers I have a theory.. can you show me nodetool tpstats for all 3 nodes? On Tue, Dec 16, 2014 at 4:04 PM

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
, DELETE by partition key, insert all rows for partition key, repeat. We have two tables that have similar frame data projections and some other aggregates with much smaller row count per partition key. hope that helps, arne On Dec 16, 2014, at 2:46 PM, Ryan Svihla rsvi...@datastax.com wrote: so

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
the cluster is idle? Is it compaction catching up and would manual forced compaction alleviate that? thanks, arne On Dec 16, 2014, at 3:28 PM, Ryan Svihla rsvi...@datastax.com wrote: so a delete is really another write for gc_grace_seconds (default 10 days), if you get enough tombstones it can
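
To make the gc_grace_seconds point concrete, here is a small sketch, assuming the default of 864000 seconds (the 10 days mentioned above). The helper name is hypothetical; it only computes the earliest moment a tombstone becomes eligible for removal, and actual removal still depends on when compaction processes the data.

```python
from datetime import datetime, timedelta, timezone

# Default gc_grace_seconds, i.e. the 10 days quoted in the thread.
DEFAULT_GC_GRACE_SECONDS = 864_000


def tombstone_purgeable_at(deleted_at: datetime,
                           gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> datetime:
    """Earliest time a tombstone written at `deleted_at` may be dropped."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


if __name__ == "__main__":
    deleted = datetime(2014, 12, 16, tzinfo=timezone.utc)
    print("deleted at:      ", deleted.isoformat())
    print("purgeable after: ", tombstone_purgeable_at(deleted).isoformat())
```

Until that point every delete is extra data that reads and compactions have to carry, which is why a delete-then-reinsert load pattern can keep a cluster busy.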

Re: 100% CPU utilization, ParNew and never completing compactions

2014-12-16 Thread Ryan Svihla
claim that there are any tombstones. On Dec 16, 2014, at 4:26 PM, Ryan Svihla rsvi...@datastax.com wrote: manual forced compactions create more problems than they solve, if you have no evidence of tombstones in your selects (which seems odd, can you share some of the tracing output?), then I'm

Re: [Consitency on cqlsh command prompt]

2014-12-17 Thread Ryan Svihla

Re: Query strategy with respect to tombstones

2014-12-17 Thread Ryan Svihla

Re: simple data movement ?

2014-12-18 Thread Ryan Svihla

Re: bootstrapping manually when auto_bootstrap=false ?

2014-12-18 Thread Ryan Svihla
… or check out my Google+ profile https://plus.google.com/102718274791889610666/posts http://spinn3r.com -- Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | +61 415 936 359

Re: Cassandra for Analytics?

2014-12-18 Thread Ryan Svihla

Re: Cassandra for Analytics?

2014-12-18 Thread Ryan Svihla
% reads). We are planning to use Spark as the in-memory computation engine. Thanks Ajay

Re: Cassandra for Analytics?

2014-12-18 Thread Ryan Svihla
to avoid putting the cart in front of the horse. Picking a tool before you have a clear understanding of the problem is a good recipe for disaster. On Thu, Dec 18, 2014 at 8:04 AM, Ryan Svihla rsvi...@datastax.com wrote: Since Ajay is already using Spark, the Spark Cassandra Connector really gets

Re: Cassandra for Analytics?

2014-12-18 Thread Ryan Svihla
processing is the same. On Thu, Dec 18, 2014 at 8:18 AM, Ryan Svihla rsvi...@datastax.com wrote: I'll decline to continue the commentary on spark, as again this probably belongs on another list, other than to say, microbatches is an intentional design tradeoff that has notable benefits

Re: Drivers performance

2014-12-19 Thread Ryan Svihla

Re: Reply: Cassandra 2.1.0 Crashes the JVM with OOM with heaps of memory free

2014-12-19 Thread Ryan Svihla

Re: Multi DC informations (sync)

2014-12-19 Thread Ryan Svihla
exist but I am not aware of it. Any other important information or advice you can give me about best practices or tricks while running a multi DC (cross regions US - EU) is welcome of course ! cheers, Alain

Re: Multi DC informations (sync)

2014-12-19 Thread Ryan Svihla
and potentially reads, the tools are there. C*heers Alain 2014-12-19 15:43 GMT+01:00 Ryan Svihla rsvi...@datastax.com: More accurately, the write path of Cassandra in a multi-DC sense is kinda like the following: 1. write goes to a node which acts as coordinator 2. writes go out to all

Re: Key Cache Questions

2014-12-19 Thread Ryan Svihla
: Hello all, I just read that the default size of the Key cache is 100 MB. Is it stored in memory or disk?

Re: installing cassandra

2014-12-21 Thread Ryan Svihla
wide deployment in the install process already? B.

Re: Replacing nodes disks

2014-12-21 Thread Ryan Svihla
would take to rebuild a 250G data node? Thanks in advance, Or. -- Or Sher

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
You can cheat it by using the non-counter column as part of your primary key (clustering column specifically), but the cases where this could work are limited and the places this is a good idea are even more rare. As for counters, using them in batches is already not a well-regarded concept, and counter

Re: installing cassandra

2014-12-22 Thread Ryan Svihla
are operating at a scale where you need to be able to automate adding new nodes. On Sun, Dec 21, 2014, 8:05 AM Ryan Svihla rsvi...@datastax.com wrote: Puppet, Chef, Ansible and I'm sure many others. I've personally worked with a number of people on all three, a quick google for Puppet Cassandra

Re: Multi DC informations (sync)

2014-12-22 Thread Ryan Svihla
information Ryan, I hope I am clear enough while expressing my doubts. C*heers Alain 2014-12-19 15:43 GMT+01:00 Ryan Svihla rsvi...@datastax.com: More accurately, the write path of Cassandra in a multi-DC sense is kinda like the following: 1. write goes to a node which acts as coordinator 2. writes

Re: CF performance suddenly degraded

2014-12-22 Thread Ryan Svihla
There can be many root causes. Would need a lot more information such as node hardware specs, cf histograms on the table, tpstats, GC settings (max heap, ParNew, JVM version), and logs, specifically any ERROR, WARN, or GCInspector messages. As a start a simple trace of the query in question is

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
realize it applied in my case. Thanks. On Mon, Dec 22, 2014 at 4:43 PM, Ryan Svihla rsvi...@datastax.com wrote: what is rpc_address set to in cassandra.yaml? my gut is localhost, set it to the interface that communicates between host and guest. On Mon, Dec 22, 2014 at 3:38 PM, Kai Wang dep

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
if this helps... what did you change rpc_address to? On Mon, Dec 22, 2014 at 8:15 PM, Ryan Svihla rsvi...@datastax.com wrote: right that's localhost, you have to change it to match the ip of whatever you changed rpc_address to On Mon, Dec 22, 2014 at 8:07 PM, Kai Wang dep...@gmail.com wrote

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
connect 127.0.0.1:9042. On Mon, Dec 22, 2014 at 9:01 PM, Ryan Svihla rsvi...@datastax.com wrote: totally depends on how the implementation is handled in virtualbox, I'm assuming you're connecting to an IP that makes sense on the guest (ie nodetool -h 192.168.1.100 and cqlsh 192.168.1.100, replace

Re: Connect to C* instance inside virtualbox

2014-12-22 Thread Ryan Svihla
but not both. So I didn't set rpc_address. Will double check tomorrow. Thanks. On Dec 22, 2014 9:17 PM, Ryan Svihla rsvi...@datastax.com wrote: if this helps... what did you change rpc_address to? On Mon, Dec 22, 2014 at 8:15 PM, Ryan Svihla rsvi...@datastax.com wrote: right that's localhost

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
) I don't need a 100% accurate count and strong consistency. Performance and application complexity is my main concern. Thanks On Mon, Dec 22, 2014 at 10:37 PM, Ryan Svihla rsvi...@datastax.com wrote: You can cheat it by using the non counter column as part of your primary key (clustering

Re: Store counter with non-counter column in the same column family?

2014-12-22 Thread Ryan Svihla
for different query paths and solr. If I switch to Spark, do I still need to use counters, or will counting be done by Spark on a regular table? On Tue, Dec 23, 2014 at 11:31 AM, Ryan Svihla rsvi...@datastax.com wrote: increment wouldn't be idempotent from the client unless you knew the count

Re: CQL3 vs Thrift

2014-12-22 Thread Ryan Svihla
Don't static columns get you what you want? http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refStaticCol.html On Dec 22, 2014 10:50 PM, David Broyles sj.clim...@gmail.com wrote: Although I used Cassandra 1.0.X extensively, I'm new to CQL3. Pages such as

Re: [Cassandra] [Generation of SStableLoader slow]

2014-12-24 Thread Ryan Svihla
I think that'd be slow copying large files with just the cp command. Cassandra isn't doing anything amazingly strange here, you don't have a lot of RAM, nor CPU and I'm assuming the underlying disk is slow here as well. Without more parameters and details it's hard to define if there is an issue.

Re: Reply:

2014-12-24 Thread Ryan Svihla
Master's student in Computer Science - UFG Software Architect CUIA Internet Brasil

Re: Tombstones without DELETE

2014-12-24 Thread Ryan Svihla
an email to java-driver-user+unsubscr...@lists.datastax.com.

Re: [Cassandra] [Generation of SStableLoader slow]

2014-12-24 Thread Ryan Svihla
if there is other way to make it faster except adding CPUs and ram. *Best Regards!* *Chao Yan--**My twitter:Andy Yan @yanchao727 https://twitter.com/yanchao727* *My Weibo:http://weibo.com/herewearenow http://weibo.com/herewearenow--* 2014-12-24 20:40 GMT+08:00 Ryan Svihla

Re: CQL3 vs Thrift

2014-12-24 Thread Ryan Svihla
again! On Mon, Dec 22, 2014 at 9:50 PM, Ryan Svihla rsvi...@datastax.com wrote: Don't static columns get you what you want? http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refStaticCol.html On Dec 22, 2014 10:50 PM, David Broyles sj.clim...@gmail.com wrote: Although I used

Re: CQL3 vs Thrift

2014-12-24 Thread Ryan Svihla
/total_events (although with potentially many other pieces of static information). More generally, do you find that tuned applications tend to use Thrift, a combination of Thrift and CQL3, or is CQL3 really expected to replace Thrift? Thanks again! On Mon, Dec 22, 2014 at 9:50 PM, Ryan Svihla rsvi

Re: Is there a way to add a new node to a cluster but not sync old data?

2015-01-22 Thread Ryan Svihla
and this node only afford new data? -- Thanks, Ryan Svihla

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
, Ryan Svihla

Re: Re: Cassandra update row after delete immediately, and read that, the data not right?

2015-01-06 Thread Ryan Svihla
with cassandra?? Thanks! -- Thanks, Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
, Ryan Svihla r...@foundev.pro wrote: as long as they know how to handle node recovery and don't inflict return data back from the dead that was deleted. On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla r...@foundev.pro wrote

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
) by deleting and inserting as a new row. This is not something we would do on a regular basis, but after or during the process a compact would greatly help to clear out tombstones/rewritten data. @Ryan Svihla it also sounds like your suggestion in this case would be: create a new column family

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
as long as they know how to handle node recovery and don't inflict return data back from the dead that was deleted. On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla r...@foundev.pro wrote: In general today, large amounts

Re: Question about `nodetool rebuild` finsh

2015-01-06 Thread Ryan Svihla
, but there is little network traffic on my new data center nodes. I want to know _how I can tell when the rebuild finishes_. Thanks all for your reply. -- All the best! http://luolee.me -- Thanks, Ryan Svihla

Re: deletedAt and localDeletion

2015-01-06 Thread Ryan Svihla
log. SliceQueryFilter.java (line 225) Read 6 live and 2688 tombstoned cells in ks.mytable (see tombstone_warn_threshold). 10 columns was requested, slices=[-], delInfo={deletedAt=-9223372036854775808, localDeletion= 2147483647} Thanks, -- Thanks, Ryan Svihla

Re: ttl in collections

2015-01-06 Thread Ryan Svihla
quickly, having thousands of record updates per second. That left us with a CF containing millions of records that we couldn't select the way we originally intended. Regards, Jens -- Thanks, Ryan Svihla

Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-06 Thread Ryan Svihla
are this way. Inserts are actually UPSERTS and you can go ahead and do two updates instead of insert, delete, update. Thanks. -- Thanks, Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
of hints. I personally run with at least a 6 hour max_h_w_i_m. In older versions of Cassandra, 24-48 hours of hints could hose your node via ineffective constant compaction. =Rob -- Thanks, Ryan Svihla

Re: C* throws OOM error despite use of automatic paging

2015-01-12 Thread Ryan Svihla
. Does anybody have insights as to what could be happening? Thanks. Mohammed -- Thanks, Ryan Svihla

Re: Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-07 Thread Ryan Svihla
and update use the client side timestamp. The update timestamp should be always bigger than the deletion timestamp. I wonder why the update failed in some cases? thank you. - Original Message - From: Ryan Svihla r...@foundev.pro To: user@cassandra.apache.org, yhq...@sina.com Subject: Re

Re: How to bulkload into a specific data center?

2015-01-08 Thread Ryan Svihla
Just noticed you'd sent this to the dev list, this is a question for only the user list, and please do not send questions of this type to the developer list. On Thu, Jan 8, 2015 at 8:33 AM, Ryan Svihla r...@foundev.pro wrote: The nature of replication factor is such that writes will go wherever

Re: How to bulkload into a specific data center?

2015-01-08 Thread Ryan Svihla
address. However, I found my jobs were connecting to the REST service data center. How can I specify the data center? -- Thanks, Ryan Svihla

Re:

2015-01-07 Thread Ryan Svihla
to explore. Materialized views are your friend, use them freely but as always being mindful of real world constraints and goals. Regards, Nageswara Rao On Tue, Jan 6, 2015 at 10:53 PM, Ryan Svihla r...@foundev.pro wrote: Normal data modeling approach in Cassandra is a separate column family

Re: Are Triggers in Cassandra 2.1.2 performace Hog??

2015-01-07 Thread Ryan Svihla
view does a Cassandra trigger impact the read/write performance of Cassandra? Also, if you achieve this any other way, please guide me. I am stuck on this. Regards Asit -- Thanks, Ryan Svihla

Re:

2015-01-06 Thread Ryan Svihla
a super column family which has the key PRIMARY KEY((prodgroup), staus, productid) should work. Would like to get expert advice on other alternatives. -- Thanks, Nageswara Rao.V *The LORD reigns* -- Thanks, Ryan Svihla

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
this. On Tuesday, January 6, 2015, Ryan Svihla r...@foundev.pro wrote: Btw side note here, you're using GIANT Batches, and the logs are indicating such, this will cause a significant amount of heap pressure. The root cause fix is not to use giant batches in the first place. On Tue, Jan 6, 2015 at 4:43 AM

Re: Queries required before data modeling?

2015-01-06 Thread Ryan Svihla
when I get a query which I had not thought of? Regards, Seenu. -- Thanks, Ryan Svihla

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
. =Rob -- Thanks, Ryan Svihla

Re: Reload/resync system.peers table

2015-01-06 Thread Ryan Svihla
here, though I'm not sure if that's just FUD talking... =Rob -- Thanks, Ryan Svihla

Re: Does nodetool repair stop the node to answer requests ?

2015-01-23 Thread Ryan Svihla
that if you are operating near failure, repair might trip a node into failure. But if you are operating correctly, repair should not. =Rob -- Morgan SEGALIS -- Thanks, Ryan Svihla

Re: Is replication possible with already existing data?

2015-10-25 Thread Ryan Svihla
t from 12 seconds to 30 seconds. 2) Increasing driver-connect-timeout from 5 seconds to 30 seconds. 3) I have also confirmed that each of the 4 nodes is telnet-able over ports 9042 and 9160. Definitely seems to be some driver issue, since data persistence/replication works perfectly (with any permutation) if data persistence is done via "cqlsh". Kindly provide some pointers. Ultimately, it is the Java driver that will be used in production, so it is imperative that data persistence/replication happens for any downing of any permutation of node(s). Thanks and Regards, Ajay -- Thanks, Ryan Svihla

Re: Cassandra Object Mapper - Dynamically pass keyspace value

2015-10-25 Thread Ryan Svihla
re 1000's of files it becomes a big maintenance issue

    @UDT(keyspace = "complex", name = "address")
    public class Address {
        private String street;
        private String city;
        private int zipCode;

-- Thanks, Ryan Svihla

Re: Advice for asymmetric reporting cluster architecture

2015-10-18 Thread Ryan Svihla
on the filtered dataset. - Ryan Svihla On Sat, Oct 17, 2015 at 7:12 PM -0700, "Jack Krupansky" <jack.krupan...@gmail.com> wrote: Yes, you can have all your normal data centers with DSE configured for real-time data access and then have a data center that shares the same d

Re: How to read data from local cassandra cluster

2015-10-18 Thread Ryan Svihla
Not a Cassandra question so this isn't the right list, but you can just upload the file to CFS and then access it by the path "cfs://filename". However, since you have DSE you may want to contact support for help with pathing in DSE using CFS and Spark. -Ryan Svihla On Fri, Oct 16,

Re: Realtime data and (C)AP

2015-10-11 Thread Ryan Svihla
at QUORUM is important. If read is ONE then the read operation *may* not see important update. The safest option is QUORUM for both write and read. Then depending on the business or feature the consistency may be tuned. -- Brice -- Steve Robenalt Software Architect sroben...@highwire.org <bza...@highwire.org> (office/cell): 916-505-1785 HighWire Press, Inc. 425 Broadway St, Redwood City, CA 94063 www.highwire.org Technology for Scholarly Communication -- Thanks, Ryan Svihla

Re: Is Cassandra really Strong consistency?

2015-09-07 Thread Ryan Svihla
echnology. On Mon, Sep 7, 2015 at 6:20 AM, ibrahim El-sanosi <ibrahimsaba...@gmail.com> wrote: "If you need strong consistency and don't mind lower transaction rate, you're better off with base" Could you explain how this statement relates to my post? Regards, -- Thanks, Ryan Svihla

Re: Convert joins in RDBMS to Cassandra

2015-09-07 Thread Ryan Svihla
4. Solution 2: 1) Create a map table for every possible join. Drawbacks with this approach: I think this is not the right approach, so the join-to-table (map table) mapping idea is not right. Pastebin link for the same: http://pastebin.com/FRAyihPT Please advise me on this. -- Thanks, Ryan Svihla

Re: Data Size on each node

2015-09-07 Thread Ryan Svihla
currently have a Cassandra cluster spread over 2 DCs. The data size on each node of the cluster is 1.2TB with spinning disks. Minor and major compactions are slowing down our read queries. It has been suggested that replacing spinning disks with SSDs might help. Has anybody done something similar? If so, what were the results? Also, if we go with SSDs, how big can each node get with commercially available SSDs? Regards Sachin -- Regards, Ryan Svihla

Re: cassandra scalability

2015-09-07 Thread Ryan Svihla
Rack
UN  40.0.0.208  128.73 KB  248  68.8%  6e7788f9-56bf-4314-a23a-3bf1642d0606  RAC1
UN  40.0.0.209  114.59 KB  249  67.8%  84f6f0be-6633-4c36-b341-b968ff91a58f  RAC1
UN  40.0.0.205  129.53 KB  245  63.5%  aa233dc2-a8ae-4c00-af74-0a119825237f  RAC1
the result of the query select * from service_dictionary.table1; gave me 70 rows from 40.0.0.205, 64 from 40.0.0.209, 54 from 40.0.0.208. 2015-09-07 11:13 GMT+02:00 Edouard COLE <edouard.c...@rgsystem.com>: Could you provide the result of: - nodetool status - nodetool status YOURKEYSPACE -- Regards, Ryan Svihla

Re: How to prevent queries being routed to new DC?

2015-09-07 Thread Ryan Svihla
at LOCAL_* quorum levels, I do not believe those queries should be routed to the new dc. Other than CASSANDRA-9753, this is true. https://issues.apache.org/jira/browse/CASSANDRA-9753 (Unresolved): "LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch" =Rob -- Regards, Ryan Svihla

Re: Querying on multiple columns

2015-09-07 Thread Ryan Svihla
f while writing the data. Please let me know if a better solution is available. I am using version 2.1.5. Regards, Sam -- Thanks, Ryan Svihla
