Re: schema generation in cassandra

2015-03-18 Thread Ali Akhtar
Why are you creating new tables dynamically? I would try to use a static schema and use a collection (list / map / set) for storing arbitrary data. On Wed, Mar 18, 2015 at 2:52 PM, Ankit Agarwal agarwalankit.k...@gmail.com wrote: Hi, I am new to Cassandra, we are planning to use Cassandra for

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
there is not much load from other processes. Should I try changing default parameters of memory in Cassandra settings. On Wed, Mar 18, 2015 at 5:33 AM, Ali Akhtar ali.rac...@gmail.com wrote: What's your memory / CPU usage at? And how much ram + cpu do you have on this server? On Wed, Mar 18, 2015

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
but not more than that. How can I fetch all rows using this efficiently? On Wed, Mar 18, 2015 at 3:29 AM, Ali Akhtar ali.rac...@gmail.com wrote: Have you tried a smaller fetch size, such as 5k - 2k ? On Wed, Mar 18, 2015 at 12:22 PM, Mehak Mehta meme...@cs.stonybrook.edu wrote: Hi Jens, I have

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
4g also seems small for the kind of load you are trying to handle (billions of rows) etc. I would also try adding more nodes to the cluster. On Wed, Mar 18, 2015 at 2:53 PM, Ali Akhtar ali.rac...@gmail.com wrote: Yeah, it may be that the process is being limited by swap. This page: https

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
results will rarely fetched again. Also do you know how I can do 2d range queries using Cassandra. Some other users suggested me using Solr. But is there any way I can achieve that without using any other technology. On Wed, Mar 18, 2015 at 4:33 AM, Ali Akhtar ali.rac...@gmail.com wrote

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
, Mar 18, 2015 at 4:06 AM, Ali Akhtar ali.rac...@gmail.com wrote: Perhaps just fetch them in batches of 1000 or 2000? For 1m rows, it seems like the difference would only be a few minutes. Do you have to do this all the time, or only once in a while? On Wed, Mar 18, 2015 at 12:34 PM, Mehak Mehta

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
Sorry, meant to say that way when you have to render, you can just display the latest cache. On Wed, Mar 18, 2015 at 1:30 PM, Ali Akhtar ali.rac...@gmail.com wrote: I would probably do this in a background thread and cache the results, that way when you have to render, you can just cache

Re: nodetool help

2015-03-16 Thread Ali Akhtar
/cassandra/data/system/* On Mon, Mar 16, 2015 at 1:41 PM, Ali Akhtar ali.rac...@gmail.com wrote: https://gist.github.com/aliakhtar/3649e412787034156cbb Best run from a fresh ubuntu server. On Tue, Mar 17, 2015 at 12:50 AM, jean paul researche...@gmail.com wrote: i find this solution: http

Not seeing keyspace in nodetool compactionhistory

2015-03-18 Thread Ali Akhtar
When I run nodetool compactionhistory, I'm only seeing the system keyspace, and OpsCenter keyspace in the compactions. I only see one mention of my own keyspace, but it's only for the smallest table within that keyspace (containing only about 1k rows). My two other tables, containing 1.1m and 100k

Recommended TTL time for max. performance with DateCompactionStrategy?

2015-03-18 Thread Ali Akhtar
I have a table which is going to be storing temporary search results. The results will be available for a short time ( anywhere from 1 to 24 hours) from the time of the search, and then should be deleted to clear up disk space. This is going to apply to all the rows within this table. What would

Re: Timeout error in fetching million rows as results using clustering keys

2015-03-18 Thread Ali Akhtar
Have you tried a smaller fetch size, such as 5k - 2k ? On Wed, Mar 18, 2015 at 12:22 PM, Mehak Mehta meme...@cs.stonybrook.edu wrote: Hi Jens, I have tried with fetch size of 1 still its not giving any results. My expectations were that Cassandra can handle a million rows easily. Is

IO scheduler for SSDs on EC2?

2015-03-15 Thread Ali Akhtar
I was watching a talk recently on Elasticsearch performance in EC2, and they recommended setting the IO scheduler to noop for SSDs. Is that the case for Cassandra as well, or is it recommended to keep the default 'deadline' scheduler for Cassandra? Thanks.

Re: Run Mixed Workload using two instances on one node

2015-03-16 Thread Ali Akhtar
I don't think it's recommended to have two instances on the same node. Have you considered using something like elasticsearch for the reports? It's designed for that sort of thing. On Mar 17, 2015 8:07 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote: Hi, We are trying to Decouple our Reporting

Re: nodetool help

2015-03-16 Thread Ali Akhtar
https://gist.github.com/aliakhtar/3649e412787034156cbb Best run from a fresh ubuntu server. On Tue, Mar 17, 2015 at 12:50 AM, jean paul researche...@gmail.com wrote: i find this solution:

Re: nodetool help

2015-03-17 Thread Ali Akhtar
The script that you ran has a lot of comments with links that describe the installation process. I would suggest reading those links. On Tue, Mar 17, 2015 at 3:42 PM, jean paul researche...@gmail.com wrote: Hello All, I launched the script (./cassandra-install.sh) without making any changes

Re: Store data with cassandra

2015-03-20 Thread Ali Akhtar
The files you store have to personally be vetted by the cassandra community. Only if they're found to not contain anything inappropriate, does cassandra let you store them. (A 3/4 majority vote is necessary). Please send your files for approval to j...@reallycereal.com On Fri, Mar 20, 2015 at

Re: Store data with cassandra

2015-03-20 Thread Ali Akhtar
) and see on what node the file and its replicas are stored on my cluster of 10 nodes it is a simple file with simple content (text) is that possible ? 2015-03-20 16:44 GMT+01:00 Ali Akhtar ali.rac...@gmail.com: The files you store have to personally be vetted by the cassandra community. Only

Re: best way to measure repair times?

2015-03-19 Thread Ali Akhtar
Cassandra doesn't guarantee eventual consistency? On Fri, Mar 20, 2015 at 12:04 AM, Robert Coli rc...@eventbrite.com wrote: On Thu, Mar 19, 2015 at 10:32 AM, Ali Akhtar ali.rac...@gmail.com wrote: Just wondering - why do you have to trigger the repairs? Is that necessary in Cassandra

Re: Cassandra cluster Too high DISK IOs

2015-03-20 Thread Ali Akhtar
That probably depends on how many read / write queries your cluster is processing? Also, since you mentioned provisioned IOPS, are you using EBS for storing the data? If so, you probably want to switch to the ephemeral storage since it's locally attached to the instance and doesn't require a

Re: Unable to overwrite some rows

2015-03-11 Thread Ali Akhtar
What happens if you use update where. rather than insert? On Wed, Mar 11, 2015 at 7:58 PM, Guðmundur Örn Jóhannsson gudmundur@gmail.com wrote: I have a 3 node cluster of Cassandra version 2.0.9. My keyspace replication factor is 3 and I'm querying with consistency level ALL.

Re: getting Cassandra to listen to eth0 for port 9042

2015-03-08 Thread Ali Akhtar
On AWS, I've had to use the 'private IP' of the ec2 instance as the listen_address in order to get things to work. In your EC2 dashboard, there will be a private ip as well as public ip for the instance. Try using the private ip for the listen_address. Also, you might have more luck installing

Re: getting Cassandra to listen to eth0 for port 9042

2015-03-08 Thread Ali Akhtar
Other settings on AWS: rpc_address : 0.0.0.0 snitch: Ec2Snitch seeds, listen_address, broadcast_address: private ip On Mon, Mar 9, 2015 at 7:36 AM, Ali Akhtar ali.rac...@gmail.com wrote: On AWS, I've had to use the 'private IP' of the ec2 instance as the listen_address in order to get things

Re: cassandra node jvm stall intermittently

2015-03-07 Thread Ali Akhtar
What version are you running? On Sat, Mar 7, 2015 at 2:14 PM, Jason Wee peich...@gmail.com wrote: Hi Jan, thanks for your time to prepare the question and answer below, - How many nodes do you have on the ring ? 12 - What is the activity when this occurs - reads / writes/

Re: DataStax Enterprise Amazon AMI Launch Error

2015-03-12 Thread Ali Akhtar
Seems like it's having trouble launching the other EC2 instances that you're requesting. You would need to provide it your AWS credentials for an account that has the permissions to create EC2 instances. Have you done that? If you just want to install cassandra on AWS, you might find this bash

Re: error deleting messages

2015-03-24 Thread Ali Akhtar
On 24 March 2015 at 12:19, Ali Akhtar ali.rac...@gmail.com wrote: What happens when you run it? How far does it get before stopping? On Tue, Mar 24, 2015 at 5:13 PM, joss Earl j...@rareformnewmedia.com wrote: sure: https://gist.github.com/joss75321/7d85e4c75c06530e9d80 On 24 March 2015

Re: cassandra source code

2015-03-24 Thread Ali Akhtar
Make sure to have a priest nearby, or the demon can get out of hands! ;) On Tue, Mar 24, 2015 at 7:11 PM, Job Thomas j...@suntecgroup.com wrote: Hi, Cassandra Demon found in org/apache/cassandra/service/CassandraDaemon.java This contain Main() method also.

Re: Not seeing keyspace in nodetool compactionhistory

2015-03-25 Thread Ali Akhtar
I also just inserted, didn't do any updates. On Thu, Mar 26, 2015 at 12:54 AM, Ali Akhtar ali.rac...@gmail.com wrote: I'm on 2.0.12 I'm not sure if that's issue, since the size isn't growing. The size is about what i'd expect. On Thu, Mar 26, 2015 at 12:44 AM, Tyler Hobbs ty

Re: Not seeing keyspace in nodetool compactionhistory

2015-03-25 Thread Ali Akhtar
://issues.apache.org/jira/browse/CASSANDRA-8635. On Wed, Mar 18, 2015 at 9:37 AM, Ali Akhtar ali.rac...@gmail.com wrote: When I run nodetool compactionhistory , I'm only seeing the system keyspace, and OpsCenter keyspace in the compactions. I only see one mention of my own keyspace, but its only

Re: Not seeing keyspace in nodetool compactionhistory

2015-03-25 Thread Ali Akhtar
not shown. On Thu, Mar 26, 2015 at 1:04 AM, Tyler Hobbs ty...@datastax.com wrote: How many sstables (*-Data.db files) do each of your two tables have? On Wed, Mar 25, 2015 at 2:54 PM, Ali Akhtar ali.rac...@gmail.com wrote: I also just inserted, didn't do any updates. On Thu, Mar 26, 2015 at 12:54

Data model suggestions

2015-04-23 Thread Ali Akhtar
Hey all, We are working on moving a mysql based application to Cassandra. The workflow in mysql is this: We have two tables: active and archive . Every hour, we pull in data from an external API. The records which are active, are kept in 'active' table. Once a record is no longer active, its

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
if the record is no longer active ? Is it a periodic process that goes through every record and checks when the last update happened ? regards On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar ali.rac...@gmail.com wrote: Hey all, We are working on moving a mysql based application to Cassandra

Re: Adding New Node Issue

2015-04-23 Thread Ali Akhtar
What version are you running? On Fri, Apr 24, 2015 at 12:51 AM, Thomas Miller thomas.mil...@wda.com wrote: Jeff, Thanks for the response. I had come across that as a possible solution previously but there are discrepancies that would lead me to think that that is not the issue. It

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
on partition key will timeout in cassandra. They can however be made to work using the column cluster key. To comment more, We would need to see your proposed cassandra tables and queries that you might need to run. regards On Thu, Apr 23, 2015 at 9:45 AM, Ali Akhtar ali.rac...@gmail.com wrote

Re: Data model suggestions

2015-04-26 Thread Ali Akhtar
, 2015 1:32 PM, Ali Akhtar ali.rac...@gmail.com wrote: Good point about the range selects. I think they can be made to work with limits, though. Or, since the active records will never usually be 500k, the ids may just be cached in memory. Most of the time, during reads, the queries will just

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Ali Akhtar
You might find it better to use elasticsearch for your aggregate queries and analytics. Cassandra is more of just a data store. On Apr 22, 2015 4:42 PM, Matthew Johnson matt.john...@algomi.com wrote: Hi all, Currently we are setting up a “big” data cluster, but we are only going to have a

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Ali Akhtar
is strictly prohibited. *From: *Ali Akhtar ali.rac...@gmail.com *Reply-To: *user@cassandra.apache.org *Date: *Wednesday, April 22, 2015 at 7:52 AM *To: *user@cassandra.apache.org *Subject: *Re: Adhoc querying in Cassandra? You might find it better to use elasticsearch for your aggregate

Re: Inserting null values

2015-04-29 Thread Ali Akhtar
Have you considered adding a 'toSafe' method which checks if the item is null, and if so, returns a default value? E.g String too = safe(bar, ); . On Apr 29, 2015 3:14 PM, Matthew Johnson matt.john...@algomi.com wrote: Hi all, I have some fields that I am storing into Cassandra, but some of
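A minimal sketch of the helper suggested above, for illustration only. The class name `NullSafe` is hypothetical, and the empty-string default is an assumption, since the second argument is missing from the snippet as archived.

```java
public class NullSafe {
    /** Returns value unless it is null, in which case fallback is returned. */
    public static <T> T safe(T value, T fallback) {
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        String bar = null;
        // The empty-string default is an assumption; pick whatever default suits the column.
        String foo = safe(bar, "");
        System.out.println(foo.isEmpty());
    }
}
```

Whether a default value is acceptable in place of a null depends on the data model; the sketch only shows the substitution itself.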

Re: Data model suggestions

2015-04-27 Thread Ali Akhtar
automatically create snapshots, there no “snapshotting” advantage for using DROP . See http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__auto_snapshot *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Sunday, April 26

Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
I'm running some ETL jobs, where the pattern is the following: 1- Get some records from an external API, 2- For each record, see if its lastModified date the lastModified i have in db (or if I don't have that record in db) 3- If lastModified dbLastModified, the item wasn't changed, ignore it.

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
. Updates to values don’t create tombstones. Only deletes (either by executing delete, inserting a null value or by setting a TTL) create tombstones. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 1:27 PM *To:* user@cassandra.apache.org *Subject

Re: Insert Vs Updates - Both create tombstones

2015-05-13 Thread Ali Akhtar
Sorry, wrong thread. Disregard the above On Wed, May 13, 2015 at 4:08 PM, Ali Akhtar ali.rac...@gmail.com wrote: If specifying 'using' timestamp, the docs say to provide microseconds, but where are these microseconds obtained from? I have regular java.util.Date objects, I can get the time

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
Would TimeUnit.MILLISECONDS.toMicros( myDate.getTime() ) work for producing the microsecond timestamp ? On Wed, May 13, 2015 at 4:09 PM, Ali Akhtar ali.rac...@gmail.com wrote: If specifying 'using' timestamp, the docs say to provide microseconds, but where are these microseconds obtained from
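A minimal sketch of the conversion asked about above, assuming (as the thread states) that the timestamp supplied with 'using' is expected in microseconds. The class and method names (`WriteTimestamps`, `toMicros`) are hypothetical; `TimeUnit.MILLISECONDS.toMicros` and `Date.getTime` are standard JDK calls.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

public class WriteTimestamps {
    /** Converts a java.util.Date (millisecond precision) to microseconds since the epoch. */
    public static long toMicros(Date date) {
        return TimeUnit.MILLISECONDS.toMicros(date.getTime());
    }

    public static void main(String[] args) {
        Date lastModified = new Date(1431518400000L); // an arbitrary instant, for illustration
        long micros = toMicros(lastModified);
        System.out.println(micros);
        // Same result as multiplying the millisecond value by 1000.
        System.out.println(micros == lastModified.getTime() * 1000L);
    }
}
```

The value only has millisecond resolution, so the last three digits of the microsecond timestamp are always zero.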

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
Is there a way in the java driver, to get the number of rows that an update was applied to? On Wed, May 13, 2015 at 4:33 PM, Ali Akhtar ali.rac...@gmail.com wrote: Thanks. So supplying the timestamp with the update (via using) should fix that, right? (By skipping updates where lastModified

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
tombstones. Only deletes (either by executing delete, inserting a null value or by setting a TTL) create tombstones. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 1:27 PM *To:* user@cassandra.apache.org *Subject:* Updating only modified records (where

Re: Insert Vs Updates - Both create tombstones

2015-05-13 Thread Ali Akhtar
If specifying 'using' timestamp, the docs say to provide microseconds, but where are these microseconds obtained from? I have regular java.util.Date objects, I can get the time in milliseconds (i.e the unix timestamp), how would I convert that to microseconds? On Wed, May 13, 2015 at 3:45 PM,

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
stated it’s rare for rows to be updated then the overhead should be negligible. The easiest way to convert a milliseconds timestamp long value to microseconds is to multiply by 1000. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 2:15 PM *To:* user

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
. Updates to values don’t create tombstones. Only deletes (either by executing delete, inserting a null value or by setting a TTL) create tombstones. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 1:27 PM *To:* user@cassandra.apache.org *Subject:* Updating

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
? UPDATE in Cassandra updates specific rows. All of them are updated, nothing is ignored. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 2:43 PM *To:* user@cassandra.apache.org *Subject:* Re: Updating only modified records (where lastModified current

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
described in the previous email. *From:* Ali Akhtar [mailto:ali.rac...@gmail.com] *Sent:* Wednesday, May 13, 2015 3:13 PM *To:* user@cassandra.apache.org *Subject:* Re: Updating only modified records (where lastModified current date) I don’t understand the ETL use case and its relevance

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
AM, Ali Akhtar ali.rac...@gmail.com wrote: But your previous email talked about when T1 is different: Assume timestamp T1 T2 and you stored value V with timestamp T2. Then you store V’ with timestamp T1. What if you issue an update twice, but with the same timestamp? E.g if you ran

Re: Updating only modified records (where lastModified current date)

2015-05-13 Thread Ali Akhtar
Can lightweight txns be used in a batch update? On Wed, May 13, 2015 at 5:48 PM, Ali Akhtar ali.rac...@gmail.com wrote: The 6k is only the starting value, its expected to scale up to ~200 million records. On Wed, May 13, 2015 at 5:44 PM, Robert Wille rwi...@fold3.com wrote: You could use

Re: Cassandra feature enhancement

2015-04-09 Thread Ali Akhtar
If I were you, I would learn Cassandra's internals and how it works (there are several Webinars that you can watch). Once you understand its internals, then you'll be in a much better position to think of a feature enhancement you can do. You'll also be in a better position to do future Cassandra

Re: How much disk is needed to compact Leveled compaction?

2015-04-06 Thread Ali Akhtar
I may have misunderstood, but it seems that he was already using LeveledCompaction On Tue, Apr 7, 2015 at 3:17 AM, DuyHai Doan doanduy...@gmail.com wrote: If you have SSD, you may afford switching to leveled compaction strategy, which requires much less than 50% of the current dataset for free

Re: Cassandra vs OS x

2015-04-07 Thread Ali Akhtar
Cost may be a factor? OS X servers would cost a lot more than Linux servers. On Tue, Apr 7, 2015 at 4:13 PM, Jean Tremblay jean.tremb...@zen-innovations.com wrote: Hi, Why do everyone say that Cassandra should not be used in production on an Mac OS x? Why would this not work? Are there

Re: Disabling auto snapshots

2015-05-21 Thread Ali Akhtar
Thanks! On Thu, May 21, 2015 at 12:34 PM, Mark Reddy mark.l.re...@gmail.com wrote: To disable auto snapshots, set the property auto_snapshot: false in your cassandra.yaml file. Mark On 21 May 2015 at 08:30, Ali Akhtar ali.rac...@gmail.com wrote: Is there a config setting where automatic

Disabling auto snapshots

2015-05-21 Thread Ali Akhtar
Is there a config setting where automatic snapshots can be disabled? I have a use case where a table is truncated quite often, and would like to not have snapshots. I can't find anything on google. Thanks.

Re: EC2snitch in AWS

2015-05-27 Thread Ali Akhtar
What details specifically do you mean? I wrote this bash script which is what I've been using for installing cassandra 2.0.xx on AWS: https://gist.github.com/aliakhtar/3649e412787034156cbb On Wed, May 27, 2015 at 9:31 PM, Kaushal Shriyan kaushalshri...@gmail.com wrote: Hi, Can somebody please

Normal to have an 8g commit log?

2015-05-22 Thread Ali Akhtar
I have a single node c* server (used for dev). I've been playing around with it, inserting / removing several million rows. At the moment, all tables have been dropped / truncated, and the data directory itself is showing about 45mb used (most of it is probably in the OpsCenter tables rather than

Re: Normal to have an 8g commit log?

2015-05-22 Thread Ali Akhtar
innovative companies such as Netflix, Adobe, Intuit, and eBay. On Fri, May 22, 2015 at 6:51 AM, Ali Akhtar ali.rac...@gmail.com wrote: I have a single node c* server (used for dev). I've been playing around with it, inserting / removing several million rows. At the moment, all tables have been

Re: A new Java Zero Day exploit is affecting Java 1.8.0.45

2015-07-14 Thread Ali Akhtar
If anyone finds that this affects servers / c*, please update us so we can take mitigation measures. Thanks. On Wed, Jul 15, 2015 at 12:48 AM, Ariel Weisberg ar...@weisberg.ws wrote: Hi, Sounds like this isn’t an issue with the runtime. It’s another plugin/webstart/whatever desktop issue

Ordering by multiple columns?

2016-10-08 Thread Ali Akhtar
Is it possible to have multiple clustering keys in cassandra, or some other way to order by multiple columns? For example, say I have a table of songs, and each song has a rating and a date. I want to sort songs by rating first, and then with newer songs on top. So if two songs have 5 rating,

Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
Say I have the following primary key: PRIMARY KEY((organization_id, employee_id)) Will this create 1 row whose primary key is the organization id, but it has a 4 billion column / cell limit? Or will this create 1 row for each employee in the same organization, so if i have 5 employees, they will

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
imit, but probably not a good idea). > > On Oct 8, 2016, at 8:35 PM, Ali Akhtar <ali.rac...@gmail.com> wrote: > > the last '4 billion rows' should say '4 billion columns / cells' > > On Sun, Oct 9, 2016 at 6:34 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Ali Akhtar
the last '4 billion rows' should say '4 billion columns / cells' On Sun, Oct 9, 2016 at 6:34 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > Say I have the following primary key: > PRIMARY KEY((organization_id, employee_id)) > > Will this create 1 row whose primary key is t

Re: Partition Key - Wide rows?

2016-10-06 Thread Ali Akhtar
ing on > your dataset). > > Cheers, > > -Phil > -- > From: Ali Akhtar <ali.rac...@gmail.com> > Sent: ‎2016-‎10-‎06 9:04 AM > To: user@cassandra.apache.org > Subject: Partition Key - Wide rows? > > Heya, > > I'm designing some tables, wh

Re: cql-maven-plugin

2016-10-07 Thread Ali Akhtar
Is there a way to call this programmatically such as from unit tests, to create keyspace / table schema from a cql file? On Fri, Oct 7, 2016 at 2:40 PM, Brice Dutheil wrote: > Hi there, > > I’d like to share a very simple project around handling CQL files with > maven.

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
Oh, and how do you generate your tables / keyspaces for the tests (if you do)? On Fri, Oct 7, 2016 at 6:21 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > Peddi, > > Thanks, does this start @ localhost, default port? And, mind sharing which > version of cassandra you use this wi

Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
Is it possible to create an isolated cassandra instance which is run during integration tests and it disappears after tests have finished running? Then it's recreated the next time tests run (perhaps being populated with test data). I'm using Java.

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
cassandra-unit <https://github.com/jsevellec/cassandra-unit> might be > what you are looking for. It allows you to run an embedded cassandra > instance along side your tests and has some nice integration with JUnit. > > Thanks, > Andy > > On Thu, Oct 6, 2016 at 7:13 PM Ali Ak

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
we ended up > instantiate CassandraDaemon after setting system property of > cassandra.config={yaml location}. It works fine for our needs. > > Praveen > > From: Ali Akhtar <ali.rac...@gmail.com> > Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
"cassandra.config", "file://" + YAML_LOCATION); > cassandraDaemon = new CassandraDaemon(); > cassandraDaemon.init(null); > cassandraDaemon.start(); > > //stop cassandra after tests are done > cassandraDaemon.stop(); > > From: Ali Akhtar <ali.rac...@gmail.com

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
houldn't run into any problems. > > > > On Thu, Oct 6, 2016 4:08 PM, Ali Akhtar ali.rac...@gmail.com wrote: > >> Is it possible to create an isolated cassandra instance which is run >> during integration tests and it disappears after tests have finished >> running? Then i

Re: Running Cassandra in Integration Tests

2016-10-06 Thread Ali Akhtar
you dont need to look for cassandra java api to start/stop instance. you > just need to write a shell script or python or java or any language to > execute shell commands! > > > > On Thu, Oct 6, 2016 4:57 PM, Ali Akhtar ali.rac...@gmail.com wrote: > >> Okay.. but how would I

Partition Key - Wide rows?

2016-10-06 Thread Ali Akhtar
Heya, I'm designing some tables, where data needs to be stored in the following hierarchy: Organization -> Team -> Project -> Issues I need to be able to retrieve issues: - For the whole org - using org id - For a team (org id + team id) - For a project (org id + team id + project id) - If

Re: Ordering by multiple columns?

2016-10-10 Thread Ali Akhtar
slides > > On Sun, Oct 9, 2016 at 2:04 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> Is it possible to have multiple clustering keys in cassandra, or some >> other way to order by multiple columns? >> >> For example, say I have a table of songs, and e

Re: Where to change the datacenter name?

2016-10-10 Thread Ali Akhtar
alues > and snitch settings and there is a risk of node reporting invalid/ missing > data to client. > > > > On Mon, Oct 10, 2016 at 4:08 PM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> So I see this: >> >> cluster_name: 'Test Cluster' >> >>

Where to change the datacenter name?

2016-10-10 Thread Ali Akhtar
Where can I change the default name 'datacenter1'? I've looked through the configuration files in /etc/cassandra , and can't find where this value is being defined.

Re: Where to change the datacenter name?

2016-10-10 Thread Ali Akhtar
at 12:54 AM, Adam Hutson <a...@datascale.io> wrote: > There is a cluster name in the cassandra.yaml for naming the cluster, aka > data center. Then you assign keyspaces to the data center within the CREATE > KEYSPACE stmt with NetworkTopology. > > > On Monday, October 10, 2

Being asked to use frozen for UDT in 3.9

2016-10-10 Thread Ali Akhtar
According to http://docs.datastax.com/en/cql/3.3/cql/cql_using/useCreateUDT.html > In Cassandra 3.6 and later, the frozen keyword is not required for UDTs that contain only non-collection fields. However if I create a type with 4-5 all text fields, and try to use that type in another table, I

Cannot restrict clustering columns by IN relations when a collection is selected by the query

2016-10-27 Thread Ali Akhtar
I have the following table schema: CREATE TABLE ticket_by_member ( project_id text, member_id text, ticket_id text, ticket ticket, assigned_members list, votes list, labels list, PRIMARY KEY ( project_id, member_id, ticket_id ) ); I have

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
cql/3.1/cql/cql_using/use_expire_c.html > > Best regards, Vladimir Yudovin, > > *Winguzone <https://winguzone.com?from=list> - Hosted Cloud > CassandraLaunch your cluster in minutes.* > > > On Tue, 08 Nov 2016 05:04:12 -0500*Ali Akhtar <ali.rac...@gmail.com > <

Re: Are Cassandra writes are faster than reads?

2016-11-06 Thread Ali Akhtar
tl;dr? I just want to know if updates are bad for performance, and if so, for how long. On Mon, Nov 7, 2016 at 10:23 AM, Ben Bromhead <b...@instaclustr.com> wrote: > Check out https://wiki.apache.org/cassandra/WritePathForUsers for the > full gory details. > > On Sun, 6 Nov

Re: Are Cassandra writes are faster than reads?

2016-11-06 Thread Ali Akhtar
How long does it take for updates to get merged / compacted into the main data file? On Mon, Nov 7, 2016 at 5:31 AM, Ben Bromhead wrote: > To add some flavor as to how the commitlog implementation is so quick. > > It only flushes to disk every 10s by default. So writes are

Re: Having Counters in a Collection, like a map<int, counter>?

2016-11-09 Thread Ali Akhtar
re creating 1 table >> per type of map<int, counter> that i need? >> But you don't need to create separate table per each counter, just use >> one row per counter: >> >> CREATE TABLE cnt (id int PRIMARY KEY , value counter); >> >> Best regards, Vla

Having Counters in a Collection, like a map<int, counter>?

2016-11-09 Thread Ali Akhtar
I have a use-case where I need to have a dynamic number of counters. The easiest way to do this would be to have a map where the int is the key, and the counter is the value which is incremented / decremented. E.g if something related to 5 happened, then i'd get the counter for 5

Using a Set for UDTs, how is uniqueness established?

2016-11-07 Thread Ali Akhtar
I have a UDT which contains a text 'id' field, which should be used to establish the uniqueness of the UDT. I'd like to have a set field in a table, and I'd like to use the id of the udts to establish uniqueness. Any ideas how this can be done? Also using Java, and c* 3.7

Re: Using a Set for UDTs, how is uniqueness established?

2016-11-07 Thread Ali Akhtar
ismatches then the 2 UDT are different. However, if > the "id" values do match, it does not guarantee that the UDT values match > since it requires that all other fields match. > > > > On Mon, Nov 7, 2016 at 1:14 PM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >>

Re: Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
7 days for a week) and do a truncate of the table at the end of the > day. > > On Tue, Nov 8, 2016 at 11:04 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> I have a use case where a lot of updates and deletes to a table will be >> necessary. >> >> The deletes

Improving performance where a lot of updates and deletes are required?

2016-11-08 Thread Ali Akhtar
I have a use case where a lot of updates and deletes to a table will be necessary. The deletes will be done at a scheduled time, probably at the end of the day, each day. Updates will be done throughout the day, as new data comes in. Are there any guidelines on improving cassandra's performance

Re: Speeding up schema generation during tests

2016-10-19 Thread Ali Akhtar
rable write to speed up mutation > (CREATE KEYSPACE ... WITH durable_writes=false) > > On Wed, Oct 19, 2016 at 3:24 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> Is there a way to speed up the creation of keyspace + tables during >> integration tests? I am using an RF of 1, with SimpleStrategy, but it still >> takes up to 10-15 seconds. >> > >

Re: Speeding up schema generation during tests

2016-10-19 Thread Ali Akhtar
de to find the root cause, >> maybe it's something really stupid and simple to fix. If you want to >> investigate and try out my CassandraDaemon server, I'd be happy to get >> feedbacks >> >> On Wed, Oct 19, 2016 at 9:22 AM, Ali Akhtar <ali.rac...@gmail.com> w

Doing a calculation in a query?

2016-10-10 Thread Ali Akhtar
I have a table for tracking orders. Each order has an `ordered_at` field (can be a timestamp, or a long with the milliseconds of the timestamp) and `shipped_at` field (ditto, timestamp or long). ordered_at tracks when the order was made. shipped_at tracks when the order was shipped. When

Re: Hadoop vs Cassandra

2016-10-23 Thread Ali Akhtar
2016 at 4:00 PM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> By Hadoop do you mean HDFS? >> >> >> >> On Sun, Oct 23, 2016 at 1:56 PM, Welly Tambunan <if05...@gmail.com> >> wrote: >> >>> Hi All, >>> >>> I read the foll

Re: Hadoop vs Cassandra

2016-10-23 Thread Ali Akhtar
"from a particular query" should be " from a particular country" On Sun, Oct 23, 2016 at 2:36 PM, Ali Akhtar <ali.rac...@gmail.com> wrote: > They can be, but I would assume that if your Cassandra data model is > inefficient for the kind of queries you want to

Re: Hadoop vs Cassandra

2016-10-23 Thread Ali Akhtar
can be done in spark right? > > On 23 Oct 2016 4:08 p.m., "Ali Akhtar" <ali.rac...@gmail.com> wrote: > > > > > > I would say it depends on your use case. > > > > If you need a lot of queries that require joins, or complex analytics of

Re: What is the maximum value of Cassandra Counter Column?

2016-10-23 Thread Ali Akhtar
Probably: https://docs.oracle.com/javase/8/docs/api/java/lang/Long.html#MAX_VALUE On Sun, Oct 23, 2016 at 1:12 PM, Kant Kodali wrote: > What is the maximum value of Cassandra Counter Column? >
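A small sketch of what that bound looks like on the Java side, assuming the counter type is a 64-bit signed integer (which is how the CQL documentation describes it) and therefore maps to a Java long. The class and method names (`CounterRange`, `wouldOverflow`) are hypothetical, and the sketch says nothing about how Cassandra itself behaves when a counter reaches the bound.

```java
public class CounterRange {
    /** Returns true if current + delta does not fit in a signed 64-bit long. */
    public static boolean wouldOverflow(long current, long delta) {
        try {
            Math.addExact(current, delta); // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.MAX_VALUE);
        System.out.println(Long.MIN_VALUE);
        System.out.println(wouldOverflow(Long.MAX_VALUE, 1L));
    }
}
```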

Re: What is the maximum value of Cassandra Counter Column?

2016-10-23 Thread Ali Akhtar
It seems obvious. On Sun, Oct 23, 2016 at 1:15 PM, Kant Kodali <k...@peernova.com> wrote: > where does it say counter is implemented as long? > > On Sun, Oct 23, 2016 at 1:13 AM, Ali Akhtar <ali.rac...@gmail.com> wrote: > >> Probably: https://docs.oracle.com/j

Re: Hadoop vs Cassandra

2016-10-23 Thread Ali Akhtar
By Hadoop do you mean HDFS? On Sun, Oct 23, 2016 at 1:56 PM, Welly Tambunan wrote: > Hi All, > > I read the following comparison between hadoop and cassandra. Seems the > conclusion that we use hadoop for data lake ( cold data ) and Cassandra for > hot data (real time

Re: Speeding up schema generation during tests

2016-10-23 Thread Ali Akhtar
>> >>>> I did not have time to dig into the source code to find the root cause, >>>> maybe it's something really stupid and simple to fix. If you want to >>>> investigate and try out my CassandraDaemon server, I'd be happy to get >>>> feedbacks

CommitLogReadHandler$CommitLogReadException: Unexpected error deserializing mutation

2016-10-23 Thread Ali Akhtar
I have a single node cassandra installation on my dev laptop, which is used just for dev / testing. Recently, whenever I restart my laptop, Cassandra fails to start when I run it via 'sudo service cassandra start'. Doing a tail on /var/log/cassandra/system.log gives this log: *INFO [main]
