Why can I not do a "count(*) ... allow filtering" without facing an operation timeout?

2015-09-04 Thread shahab
Hi, This is probably a silly problem, but it is really serious for me. I have a cluster of 3 nodes, with replication factor 2. But still I can not do a simple "select count(*) from ..." using either DevCenter or "cqlsh". Any idea how this can be done? best, /Shahab

How to export query results (millions of rows) in CSV format?

2015-07-07 Thread shahab
Hi, Is there any way to export the results of a query (e.g. select * from tbl1 where id = aa and loc = bb) into a file in CSV format? I tried to use the COPY command with cqlsh, but the command does not work when you have a where condition. Does anyone have any idea how to do this? best, /Shahab
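
Since COPY reportedly does not work with a where condition, one possible workaround is to run the query from a client program and write the rows out yourself. A minimal Python sketch, assuming the rows arrive as an iterable of tuples: the sample rows, the column names and the `write_rows_csv` helper are all made up for illustration, and in practice the rows would come from whatever driver is in use.

```python
import csv
import io


def write_rows_csv(rows, header, out):
    """Stream rows (any iterable of sequences) to a file-like object as CSV.

    Rows are written one at a time, so millions of rows never have to be
    held in memory at once. Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Stand-in data (hypothetical): what a filtered query on tbl1 might return.
sample_rows = [("aa", "bb", 40.5), ("aa", "bb", 41.0)]

# An in-memory buffer keeps the sketch self-contained; a real export
# would pass an open file, e.g. open("export.csv", "w", newline="").
buf = io.StringIO()
written = write_rows_csv(iter(sample_rows), ["id", "loc", "value"], buf)
```

Because the helper only iterates, it should work equally well with a lazily paged result set, as long as the client fetches pages on demand.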

How to measure disk space used by a keyspace?

2015-06-29 Thread shahab
Hi, Probably this question has already been asked on the mailing list, but I couldn't find it. The question is how to measure the disk space used by a keyspace, per column family, excluding snapshots? best, /Shahab
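
One rough way to get such numbers is to walk the keyspace's data directory and sum file sizes per column family while skipping snapshot directories. The Python sketch below assumes a layout of `<keyspace dir>/<column family>/...` with snapshots kept in a subdirectory named `snapshots`; that layout, the `keyspace_disk_usage` helper and the throwaway demo tree are assumptions for illustration, not something confirmed in this thread.

```python
import os
import tempfile


def keyspace_disk_usage(keyspace_dir, skip_dirs=("snapshots",)):
    """Return {column_family: bytes} for one keyspace directory.

    Any directory whose name is in skip_dirs is pruned from the walk,
    so files below it are not counted.
    """
    usage = {}
    for entry in sorted(os.listdir(keyspace_dir)):
        table_dir = os.path.join(keyspace_dir, entry)
        if not os.path.isdir(table_dir):
            continue
        total = 0
        for root, dirs, files in os.walk(table_dir):
            # Editing dirs in place stops os.walk from descending into them.
            dirs[:] = [d for d in dirs if d not in skip_dirs]
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        usage[entry] = total
    return usage


# Demo on a throwaway tree (assumed layout, not a real Cassandra node).
with tempfile.TemporaryDirectory() as ks_dir:
    live = os.path.join(ks_dir, "tbl1")
    snap = os.path.join(live, "snapshots", "s1")
    os.makedirs(snap)
    with open(os.path.join(live, "data.db"), "wb") as f:
        f.write(b"x" * 10)    # counted
    with open(os.path.join(snap, "data.db"), "wb") as f:
        f.write(b"x" * 100)   # skipped: lives under snapshots/
    usage = keyspace_disk_usage(ks_dir)
```

This measures one node only; for a cluster-wide figure the per-node results would have to be added up.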

Re: How to store denormalized data

2015-06-03 Thread Shahab Yunus
? What harm in it? Also, you can slightly change it, (if applicable) and not populate as a separate batch process but in fact make part of your analysis job? Kind of a pre-process/prep step? Regards, Shahab On Wed, Jun 3, 2015 at 10:48 AM, Matthew Johnson matt.john...@algomi.com wrote: Hi all

Re: Data model suggestions

2015-04-26 Thread Shahab Yunus
Interesting approach Oded. Is this similar to what has been described here: http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html Regards, Shahab On Sun, Apr 26, 2015 at 4:29 AM, Peer, Oded oded.p...@rsa.com wrote: I would maintain two tables. An “archive” table

Getting ParNew GC in ... CMS Old Gen ... in logs

2015-04-20 Thread shahab
Eden Space: 167712624 - 0; Par Survivor Space: 0 - 20970080 Is the above line an indication of something that needs to be fixed in the system? How can I resolve this? best, /Shahab

Re: best supported spark connector for Cassandra

2015-02-11 Thread shahab
I am using the Calliope cassandra-spark connector (http://tuplejump.github.io/calliope/), which is quite handy and easy to use! The only problem is that it is a bit outdated and works with Spark 1.1.0; hopefully a new version comes soon. best, /Shahab On Wed, Feb 11, 2015 at 2:51 PM, Marcelo Valle

Why RDD is not cached?

2014-10-27 Thread shahab
I am missing in my settings, or... ? thanks, /Shahab

Re: Increasing size of Batch of prepared statements

2014-10-23 Thread shahab
Thanks Jens for the comments. As I am trying the cassandra stress tool, does it mean that the tool is executing batches of insert statements (probably hundreds or thousands) against Cassandra (for the sake of stressing Cassandra)? best, /Shahab On Wed, Oct 22, 2014 at 8:14 PM, Jens Rantil jens.ran

Re: Increasing size of Batch of prepared statements

2014-10-23 Thread shahab
OK, Thanks again Jens. best, /Shahab On Thu, Oct 23, 2014 at 1:22 PM, Jens Rantil jens.ran...@tink.se wrote: Hi again Shabab, Yes, it seems that way. I have no experience with the “cassandra stress tool”, but wouldn’t be surprised if the batch size could be tweaked. Cheers, Jens

Re: Increasing size of Batch of prepared statements

2014-10-23 Thread shahab
Thanks Tyler for sharing this. It is exactly what I wanted to know. best, /Shahab On Thu, Oct 23, 2014 at 5:37 PM, Tyler Hobbs ty...@datastax.com wrote: CASSANDRA-8091 (Stress tool creates too large batches) is relevant: https://issues.apache.org/jira/browse/CASSANDRA-8091 On Thu

Re: Increasing size of Batch of prepared statements

2014-10-06 Thread shahab
with large size? best, /Shahab On Sun, Oct 5, 2014 at 6:03 PM, Jens Rantil jens.ran...@tink.se wrote: Shabab, If you are hitting this limit because you are inserting a lot of (CQL) rows in a single batch I suggest you split the statement up in multiple smaller batches. Generally, large inserts
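
The suggestion above, splitting one large batch into several smaller ones, can be sketched in Python as follows. The statements are placeholder strings, `split_into_batches` is a hypothetical helper, and the batch size of 3 is arbitrary; how each small batch is then sent depends on the client library and is left out.

```python
from itertools import islice


def split_into_batches(statements, max_batch_size):
    """Yield successive lists of at most max_batch_size statements."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    iterator = iter(statements)
    while True:
        batch = list(islice(iterator, max_batch_size))
        if not batch:
            return
        yield batch


# Placeholder statements (hypothetical); real code would hold bound
# prepared statements instead of strings.
statements = ["insert #%d" % i for i in range(7)]
batches = list(split_into_batches(statements, 3))
```

Each yielded list can then be submitted as its own batch, which keeps every individual request below whatever size limit the server enforces.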

Re: Increasing size of Batch of prepared statements

2014-10-05 Thread shahab
Thanks Shane. best, /Shahab On Fri, Oct 3, 2014 at 6:51 PM, Shane Hansen shanemhan...@gmail.com wrote: It appears to be configurable in cassandra.yaml using batch_size_warn_threshold https://issues.apache.org/jira/browse/CASSANDRA-6487 On Fri, Oct 3, 2014 at 10:47 AM, shahab shahab.mok

Why are the results of Cassandra Stress Tool much worse than normal reading/writing from Cassandra?

2014-10-05 Thread shahab
appreciate if anyone could help me to understand the output of Stress-Tool? BTW, I have already seen this one (but still the documentation is quite poor): http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStressOutput_c.html best, /Shahab

Increasing size of Batch of prepared statements

2014-10-03 Thread shahab
to change the default value? thanks /Shahab

Regarding Cassandra-Stress tool

2014-10-01 Thread shahab
, there is an output parameter partition_rate which is not explained in documentation? best, /Shahab

cassandra stress tools

2014-10-01 Thread shahab
, there is an output parameter partition_rate which is not explained in documentation? best, /Shahab

Re: using dynamic cell names in CQL 3

2014-09-25 Thread shahab
Thanks. It seems that I was not clear in my question: I would like to store values in the column name. For example, column.name would be event_name (temperature) and column-content would be the respective value (e.g. 40.5). And I need to know what the schema should look like in CQL 3. best, /Shahab

using dynamic cell names in CQL 3

2014-09-24 Thread shahab
/example that I can look at ? best, /Shahab

Re: Machine Learning With Cassandra

2014-08-30 Thread Shahab Yunus
and thus you can run complex ML algorithms relatively faster. I think we just discussed this a short while ago when similar question (storm vs. spark, I think) was raised by you earlier. Here is the link for that discussion: http://markmail.org/message/lc4icuw4hobul6oh Regards, Shahab On Sat

Re: Why select count(*) from .. hangs ?

2014-03-26 Thread shahab
Thanks for the hints. I got a better picture of how to deal with count queries. On Tue, Mar 25, 2014 at 7:01 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Mar 25, 2014 at 8:36 AM, shahab shahab.mok...@gmail.com wrote: But after iteration 8, (i.e. inserting 150 sensor data

Why select count(*) from .. hangs ?

2014-03-25 Thread shahab
GUI, but I got same result. I am sure that I have missed something or misunderstood how Cassandra works, but don't know really what? I do appreciate any hints. best, /Shahab

Re: Why select count(*) from .. hangs ?

2014-03-25 Thread shahab
': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; On Tue, Mar 25, 2014 at 4:58 PM, Michael Shuler mich...@pbandjelly.org wrote: On 03/25/2014 10:36 AM, shahab wrote: In our application, we need to insert roughly 30 sensor data every 30 seconds (basically we need to store

Re: Modeling multi-tenanted Cassandra schema

2013-11-13 Thread Shahab Yunus
Nate, (slightly OT), what client API/library is recommended now that Hector is sunsetting? Thanks. Regards, Shahab On Wed, Nov 13, 2013 at 9:28 AM, Nate McCall n...@thelastpickle.com wrote: You basically want option (c). Option (d) might work, but you would be bending the paradigm a bit

Re: Deleting data using timestamp

2013-10-09 Thread Shahab Yunus
I might be missing something obvious here but can't you afford (time-wise) to run cleanup or repair after the deletion so that the deleted data is gone? Assuming that your columns are time-based data? Regards, Shahab On Wed, Oct 9, 2013 at 10:35 AM, Ravikumar Govindarajan ravikumar.govindara

Re: Deleting data using timestamp

2013-10-09 Thread Shahab Yunus
Ahh, yes, 'compaction'. I blanked out while mentioning repair and cleanup. That is in fact what needs to be done first and what I meant. Thanks Robert. Regards, Shahab On Wed, Oct 9, 2013 at 1:50 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Oct 9, 2013 at 7:35 AM, Ravikumar

Re: get float column in cassandra mapreduce

2013-10-05 Thread Shahab Yunus
' or 'temprature'? You are using the latter in your code and if it is not what is in the data then you might be trying to parse empty or malformed string. Regards, Shahab On Sat, Oct 5, 2013 at 5:16 AM, Anseh Danesh anseh.dan...@gmail.com wrote: Hi all... I have a question. in the cassandra wordcount

Re: Deleting Row Key

2013-10-05 Thread Shahab Yunus
Yes you can: http://hbase.apache.org/book/regions.arch.html#compaction http://hbase.apache.org/book/important_configurations.html (Managed Compaction section) Regards, Shahab On Sat, Oct 5, 2013 at 6:02 PM, Sebastian Schmidt isib...@gmail.com wrote: Am 06.10.2013 00:00, schrieb Cem Cayiroglu

Re: Deleting Row Key

2013-10-05 Thread Shahab Yunus
, Shahab On Sat, Oct 5, 2013 at 7:06 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Yes you can: http://hbase.apache.org/book/regions.arch.html#compaction http://hbase.apache.org/book/important_configurations.html (Managed Compaction section) Regards, Shahab On Sat, Oct 5, 2013 at 6:02 PM

Re: Cassandra nodetool could not resolve '127.0.0.1': unknown host

2013-09-17 Thread Shahab Yunus
Have you tried specifying your hostname (not localhost) in cassandra.yaml and starting it? Regards, Shahab On Tue, Sep 17, 2013 at 8:39 AM, pradeep kumar pradeepkuma...@gmail.com wrote: I am very new to cassandra. Just started exploring. I am running a single node cassandra server facing

Re: questions related to the SSTable file

2013-09-17 Thread Shahab Yunus
SSTable? I am also interested in knowing the answer. Regards, Shahab On Tue, Sep 17, 2013 at 9:50 AM, java8964 java8964 java8...@hotmail.com wrote: Thanks Dean for clarification. But if I put hundreds of megabyte data of one row through one put, what you mean is Cassandra will put all of them

Re: questions related to the SSTable file

2013-09-17 Thread Shahab Yunus
Thanks Robert for the answer. It makes sense. If that happens then it means that your design or use case needs some rework ;) Regards, Shahab On Tue, Sep 17, 2013 at 2:37 PM, java8964 java8964 java8...@hotmail.com wrote: Another question related to the SSTable files generated

Re: VMs versus Physical machines

2013-09-12 Thread Shahab Yunus
at a time. Regards, Shahab On Thu, Sep 12, 2013 at 1:51 AM, Aaron Turner synfina...@gmail.com wrote: On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Thanks Aaron for the reply. Yes, VMs or the nodes will be in cloud if we don't go the physical route. Look how

VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
. Data size? Writing speed (whether write heavy usecases or not)? Random read use-cases? column family design/how we store data? Any pointers, documents, guidance, advice would be appreciated. Thanks a lot. Regards, Shahab

Re: VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
whether we use physical or VMs (in cloud)? Regards, Shahab On Wed, Sep 11, 2013 at 7:34 PM, Aaron Turner synfina...@gmail.com wrote: Physical machines unless you're running your cluster in the cloud (AWS/etc). Reason is simple: Look how Cassandra scales and provides redundancy. Aaron Turner

Re: Cassandra Reads

2013-09-06 Thread Shahab Yunus
://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docsversion=1.2file=#cassandra/dml/dml_about_reads_c.html http://www.roman10.net/how-apache-cassandra-read-works/ http://wiki.apache.org/cassandra/ArchitectureInternals Regards, Shahab On Fri, Sep 6, 2013 at 6:28 AM, Sridhar

Re: Help on Cassandra Limitations

2013-09-06 Thread Shahab Yunus
Also, Sylvain, you have a couple of great posts about relationships between CQL3/Thrift entities and naming issues: http://www.datastax.com/dev/blog/cql3-for-cassandra-experts http://www.datastax.com/dev/blog/thrift-to-cql3 I always refer to them when I get confused :) Regards, Shahab On Fri

Re: Secondary Indexes On Partitioned Time Series Data Question

2013-08-01 Thread Shahab Yunus
Hi Robert, Can you shed some more light (or point towards some other resource) on why you think built-in Secondary Indexes should not be used easily or without much consideration? Thanks. Regards, Shahab On Thu, Aug 1, 2013 at 3:53 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Aug 1

Re: Secondary Indexes On Partitioned Time Series Data Question

2013-08-01 Thread Shahab Yunus
Thanks a lot. Regards, Shahab On Thu, Aug 1, 2013 at 8:32 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Aug 1, 2013 at 2:34 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Can you shed some more light (or point towards some other resource) on why you think built-in Secondary Indexes

Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Shahab Yunus
Hi Jan, One question...you say - I must make sure the disks are directly attached, to prevent problems when multiple nodes flush the commit log at the same time What do you mean by that? Thanks, Shahab On Wed, Jul 31, 2013 at 3:10 AM, Jan Algermissen jan.algermis...@nordsc.com wrote

Re: MapReduce response time and speed

2013-07-24 Thread Shahab Yunus
. Regards, Shahab On Wed, Jul 24, 2013 at 10:33 AM, Jan Algermissen jan.algermis...@nordsc.com wrote: Hi, I am Jan Algermissen (REST-head, freelance programmer/consultant) and Cassandra-newbie. I am looking at Cassandra for an application I am working on. There will be a max. of 10 Million

Re: Unable to describe table in CQL 3

2013-07-23 Thread Shahab Yunus
Rahul, See this as it was discussed earlier: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html Regards, Shahab On Tue, Jul 23, 2013 at 2:51 PM, Rahul Gupta rgu

Re: Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-23 Thread Shahab Yunus
See this as this was discussed earlier: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Representation-of-dynamically-added-columns-in-table-column-family-schema-using-cqlsh-td7588997.html Regards, Shahab On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus shahab.yu

Re: Auto Discovery of Hosts by Clients

2013-07-22 Thread Shahab Yunus
Thanks for your replies. Regards, Shahab On Sun, Jul 21, 2013 at 4:49 PM, aaron morton aa...@thelastpickle.com wrote: Give the app the same nodes you have in the seed lists. Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http

Re: Socket buffer size

2013-07-20 Thread Shahab Yunus
I think the former is for client communication to the nodes and the latter for communication between nodes themselves as evident by the name of the property. Please feel free to correct me if I am wrong. Regards, Shahab On Saturday, July 20, 2013, Mohammad Hajjat wrote: Hi, What's

Auto Discovery of Hosts by Clients

2013-07-19 Thread Shahab Yunus
with the client API that I am using? Thanks a lot. Regards, Shahab

Re: IllegalArgumentException on query with AbstractCompositeType

2013-07-13 Thread Shahab Yunus
Aaron Morton can confirm, but I think one problem could be that creating an index on a field with a small number of possible values is not good. Regards, Shahab On Sat, Jul 13, 2013 at 9:14 AM, Tristan Seligmann mithra...@mithrandi.net wrote: On Fri, Jul 12, 2013 at 10:38 AM, aaron morton aa

Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-12 Thread Shahab Yunus
, displays multiple columns as expected. Basically the demarcation of multiple columns is clearer. Thanks a lot. Regards, Shahab

Re: Representation of dynamically added columns in table (column family) schema using cqlsh

2013-07-12 Thread Shahab Yunus
Thanks Eric for the explanation. Regards, Shahab On Fri, Jul 12, 2013 at 11:13 AM, Shahab Yunus shahab.yu...@gmail.com wrote: A basic question and it seems that I have a gap in my understanding. I have a simple table in Cassandra with multiple column families. I add new columns to each

Re: what happen if coordinator node fails during write

2013-06-29 Thread Shahab Yunus
. Regards, Shahab On Friday, June 28, 2013, aaron morton wrote: As far as I know in 1.2 coordinator logs request before it updates replicas. You may be thinking about atomic batches, which are enabled by default for 1.2 via CQL but must be supported by Thrift clients. I would guess Hector

Re: block size

2013-06-20 Thread Shahab Yunus
Have you seen this? http://www.datastax.com/dev/blog/cassandra-file-system-design Regards, Shahab On Thu, Jun 20, 2013 at 3:17 PM, Kanwar Sangha kan...@mavenir.com wrote: Hi – What is the block size for Cassandra ? is it taken from the OS defaults ?

Re: block size

2013-06-20 Thread Shahab Yunus
with) Cassandra unlike Hadoop. Regards, Shahab On Thu, Jun 20, 2013 at 3:38 PM, Kanwar Sangha kan...@mavenir.com wrote: Yes. Is that not specific to hadoop with CFS ? I want to know that If I have a data in column of size 500KB, how many IOPS are needed to read that ? (assuming we have key cache

Re: Dropped mutation messages

2013-06-19 Thread Shahab Yunus
Hello Arthur, What do you mean by "The queries need to be lightened"? Thanks, Shahab On Tue, Jun 18, 2013 at 8:47 PM, Arthur Zubarev arthur.zuba...@aol.com wrote: Cem hi, as per http://wiki.apache.org/cassandra/FAQ#dropped_messages Internode messages which are received by a node, but do

Re: Unit Testing Cassandra

2013-06-19 Thread Shahab Yunus
for now. I do see some stuff out there but wanted to know recommendations from the community given their experience. Regards, Shahab On Wed, Jun 19, 2013 at 3:15 AM, Stephen Connolly stephen.alan.conno...@gmail.com wrote: Unit testing means testing in isolation the smallest part. Unit tests

Re: Unit Testing Cassandra

2013-06-19 Thread Shahab Yunus
Thanks Edward, Ben and Dean for the pointers. Yes, I am using Java and these sound promising for unit testing, at least. Regards, Shahab On Wed, Jun 19, 2013 at 9:58 AM, Edward Capriolo edlinuxg...@gmail.com wrote: You really do not need much in java you can use the embedded server. Hector

Unit Testing Cassandra

2013-06-18 Thread Shahab Yunus
Hello, Can anyone suggest good/popular unit test tools/frameworks/utilities out there for unit testing Cassandra stores? I am looking for testing from a performance/load and monitoring perspective. I am using 1.2. Thanks a lot. Regards, Shahab

Re: Dynamic Columns Question Cassandra 1.2.5, Datastax Java Driver 1.0

2013-06-06 Thread Shahab Yunus
Dynamic columns are not supported in CQL3. We just had a discussion a day or two ago about this where Eric Stevens explained it. Please see this: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/CQL-3-returning-duplicate-keys-td7588181.html Regards, Shahab On Thu, Jun 6, 2013

Re: CQL 3 returning duplicate keys

2013-06-05 Thread Shahab Yunus
Thanks Eric. Yeah, I was asking about the second limitation (about dynamic columns) and you have explained it well along with pointers to read further. Regards, Shahab On Wed, Jun 5, 2013 at 8:18 AM, Eric Stevens migh...@gmail.com wrote: I mentioned a few limitations, so I'm not sure which

Re: Multiple JBOD data directory

2013-06-05 Thread Shahab Yunus
Though I am a newbie, I just had a thought regarding your question 'How will it handle requests for data which unavailable?': wouldn't the data be served in that case from other nodes where it has been replicated? Regards, Shahab On Wed, Jun 5, 2013 at 5:32 AM, Christopher Wirt chris.w

Re: CQL 3 returning duplicate keys

2013-06-04 Thread Shahab Yunus
Thanks Eric for the detailed explanation, but can you point to a source or document for this restriction in CQL3 tables? Doesn't it take away the main feature of the NoSQL store? Or am I missing something obvious here? Regards, Shahab On Tue, Jun 4, 2013 at 2:12 PM, Eric Stevens migh