Re: Spark Memory Error - Not enough space to cache broadcast

2016-06-15 Thread Cassa L
Hi, I did set --driver-memory 4G. I still run into this issue after 1 hour of data load. I also tried version 1.6 in a test environment. I hit this issue much faster than in the 1.5.1 setup. LCassa On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar wrote: > try setting the

Re: Spark Memory Error - Not enough space to cache broadcast

2016-06-15 Thread Cassa L
Hi, Upgrading Spark is not an option right now. I did set --driver-memory 4G. I still run into this issue after 1 hour of data load. LCassa On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar wrote: > try setting the option --driver-memory 4G > > On Tue, Jun 14, 2016 at 3:52

Re: how to force cassandra-stress to actually generate enough data

2016-06-15 Thread Benedict Elliott Smith
cassandra-stress has some (many) limitations - that I had planned to address now it's seeing wider adoption, but since I no longer work on the project for my day job I am unlikely to now... so, sorry but you'll have to tolerate them :) In particular, the problem you encounter here is that a given

Are you using Cassandra to record user analytics?

2016-06-15 Thread Richard L. Burton III
I'm interested in hearing how you may be using Cassandra to capture user analytics and some design choices you've made. At a high level, I'm considering the following: web application -> kafka -> cassandra I need to be able to show the user and at a higher level the company: - When did a

Re: Cassandra monitoring

2016-06-15 Thread Julien Anguenot
Hey Otis, hehe :-) I do have the latest agents running but these metrics are still empty on my side. Will take that issue on Sematext side then. Thanks. J. > On Jun 15, 2016, at 8:26 AM, Otis Gospodnetić > wrote: > > Hi, > > On Wed, Jun 15, 2016 at 8:58

Re: Cassandra monitoring

2016-06-15 Thread Otis Gospodnetić
Hi, On Tue, Jun 14, 2016 at 4:20 PM, Arun Ramakrishnan < sinchronized.a...@gmail.com> wrote: > Thanks Jonathan. > > Out of curiosity, does opscenter support some later version of cassandra > that is not OSS ? > > Well, the most minimal requirement is that, I want to be able to monitor > for

Re: Streaming from 1 node only when adding a new DC

2016-06-15 Thread Paulo Motta
For rebuild, replace and -Dcassandra.consistent.rangemovement=false in general we currently pick the closest replica (as indicated by the Snitch) which has the range, which will often map to the same node due to the dynamic snitch, especially when N=RF. This is good for picking a node in the same DC

Re: Cqlsh Questions

2016-06-15 Thread Eric Stevens
There's an effort to improve the docs, but while that's catching up, 3.0 has the latest version of the document you're looking for: https://cassandra.apache.org/doc/cql3/CQL-3.0.html#createKeyspaceStmt On Wed, Jun 15, 2016 at 5:28 AM Steve Anderson wrote: > Couple of

Re: Data lost in Cassandra 3.5 single instance via Erlang driver

2016-06-15 Thread Eric Stevens
As a side note, if you're inserting records quickly enough that you're potentially doing multiple in the same millisecond, it seems likely to me that your partition size is going to be too large at a day level unless your writes are super bursty: ((appkey, pub_date), pub_timestamp). You might
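
The partition-size concern above can be put into rough numbers. This is only a back-of-the-envelope sketch: the sustained rate of 1,000 writes per second (one per millisecond) and the 100-byte average row size are hypothetical figures, not values from the thread.

```python
# Rough size of one daily partition keyed by (appkey, pub_date).
# Both inputs below are illustrative assumptions, not measurements.
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_daily_partition(writes_per_second: int) -> int:
    """Rows landing in a single day-level partition at a steady write rate."""
    return writes_per_second * SECONDS_PER_DAY


def partition_bytes(rows: int, avg_row_bytes: int) -> int:
    """Uncompressed size estimate, ignoring per-row storage overhead."""
    return rows * avg_row_bytes


rows = rows_per_daily_partition(1000)  # assumed: one write per millisecond
size = partition_bytes(rows, 100)      # assumed: 100 bytes per row
print(rows, size)
```

If the writes are bursty rather than steady, the daily total is lower, which is the caveat the message itself raises.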

Re: Cassandra monitoring

2016-06-15 Thread Julien Anguenot
> On Jun 14, 2016, at 2:10 PM, Arun Ramakrishnan > wrote: […] > > Sematext has a integrations for monitoring cassandra. Does anyone have good > experience with it ? We are using Sematext with Cassandra 3.0.x and it is mostly working fine for us except for a couple

Re: how to force cassandra-stress to actually generate enough data

2016-06-15 Thread Ben Slater
Are you running with n=[number ops] or duration=[xx]? I’ve found you need to use n= when inserting data. When you use duration cassandra-stress defaults to 1,000,000 somethings (to be honest, I’m not entirely sure if it’s rows, partitions or something else that the 1,000,000 relates to) and

Re: how to force cassandra-stress to actually generate enough data

2016-06-15 Thread Julien Anguenot
I usually do a write-only bench run first. Doing 1B write iterations will produce 200GB+ data on disk. You can then do mixed tests. For instance, a write bench that would produce such volume on a 3-node cluster: ./tools/bin/cassandra-stress write cl=LOCAL_QUORUM n=10 -rate
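
As a rough sizing aid, the figure quoted above (1B writes producing 200GB+) works out to about 200 bytes per write. The sketch below scales that ratio to a target volume. It assumes decimal gigabytes and treats 200 GB as exact, although the message gives it as a lower bound; the real ratio depends on the schema, compression, and compaction. The 500 GB target is a hypothetical example.

```python
# Scale the "1B writes -> 200GB+" observation to a target on-disk volume.
# Treating 200 GB as exact is an assumption; the thread states it as "200GB+".
GB = 10**9  # decimal gigabytes (assumption)

OBSERVED_WRITES = 10**9
OBSERVED_BYTES = 200 * GB

bytes_per_write = OBSERVED_BYTES // OBSERVED_WRITES


def writes_for_target(target_bytes: int) -> int:
    """Number of write operations needed to reach target_bytes, rounded up."""
    return -(-target_bytes // bytes_per_write)  # ceiling division


n_ops = writes_for_target(500 * GB)  # hypothetical 500 GB target
print(bytes_per_write, n_ops)
```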

Re: Reason for Trace Message Drop

2016-06-15 Thread Eric Stevens
This is better kept to the User groups. What are your JVM memory settings for Cassandra, and have you seen big GCs in your logs? The reason I ask is because that's a large number of column families, which produces memory pressure, and at first blush that strikes me as a likely cause. On Wed,

how to force cassandra-stress to actually generate enough data

2016-06-15 Thread Peter Kovgan
Hi, cassandra-stress is not really helping to populate the disk sufficiently. I tried several table structures, providing cluster: UNIFORM(1..100) on clustering parts of the PK. Partition part of PK makes about 660 000 partitions. The hope was to create enough cells in a row, make

Re: Cassandra monitoring

2016-06-15 Thread Kai Wang
I use graphite/jmxtrans/collectd to monitor not just Cassandra but also other jvm applications as well as OS. I found it's more useful and flexible than opscenter in terms of monitoring. On Jun 14, 2016 3:10 PM, "Arun Ramakrishnan" wrote: What are the options for a

Cqlsh Questions

2016-06-15 Thread Steve Anderson
Couple of Cqlsh questions: 1) Why when I run the DESCRIBE CLUSTER command no snitch information is required? Is this because my Cassandra cluster is a single node? 2) When I run the HELP CREATE_KEYSPACE command the following info is displayed: *** No browser to display CQL help. URL

Reason for Trace Message Drop

2016-06-15 Thread Varun Barala
Hi all, Can anyone tell me all the possible reasons for the log message below: *"INFO [ScheduledTasks:1] 2016-06-14 06:27:39,498 MessagingService.java:929 - _TRACE messages were dropped in last 5000 ms: 928 for internal timeout and 0 for cross node timeout".* I searched online for the same and

Re: Data lost in Cassandra 3.5 single instance via Erlang driver

2016-06-15 Thread linbo liao
Thanks Ben, Paul, Alain. I debugged on the client side and found the reason: pub_timestamp values are duplicated. I will use timeuuid instead. Thanks, Linbo 2016-06-15 13:09 GMT+08:00 Alain Rastoul : > On 15/06/2016 06:40, linbo liao wrote: > >> I am not sure, but looks it will cause the
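
The apparent data loss follows from upsert semantics: a Cassandra write whose full primary key matches an existing row overwrites that row. The sketch below imitates this with a plain Python dict rather than a real cluster. The key values (`app1`, the date, the millisecond timestamp) are hypothetical, and `uuid.uuid1()` stands in for a CQL timeuuid, which is a version 1 UUID.

```python
import uuid


def upsert(table: dict, primary_key: tuple, value: str) -> None:
    """Mimic an upsert: a write with an existing primary key replaces the row."""
    table[primary_key] = value


same_ms = 1465967259498  # hypothetical millisecond timestamp shared by 3 events

by_timestamp: dict = {}
by_timeuuid: dict = {}
for i in range(3):
    # Same (appkey, pub_date, pub_timestamp): each write replaces the last one.
    upsert(by_timestamp, ("app1", "2016-06-15", same_ms), f"event-{i}")
    # A version 1 UUID is distinct per call, so every event keeps its own row.
    upsert(by_timeuuid, ("app1", "2016-06-15", uuid.uuid1()), f"event-{i}")

print(len(by_timestamp), len(by_timeuuid))  # prints "1 3"
```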