Re: Opscenter help?

2014-03-13 Thread Jack Krupansky
You don't need any reputation points to ask a new question with an existing tag - just type opscenter in the Tags box under the question. Otherwise, how would any new user ever be able to ask a question and have it tagged?! -- Jack Krupansky -Original Message- From: Drew from

Re: CQL Select Map using an IN relationship

2014-03-13 Thread Jack Krupansky
. -- Jack Krupansky From: Laing, Michael Sent: Thursday, March 13, 2014 1:39 PM To: user@cassandra.apache.org Subject: Re: CQL Select Map using an IN relationship Think of them as: PRIMARY KEY (partition_key[, range_key]) where the partition_key can be compounded as: (partition_key0

Re: Question about how compaction and partition keys interact

2014-03-27 Thread Jack Krupansky
it does come down to how you will be accessing the data – query, view, update. -- Jack Krupansky From: Donald Smith Sent: Wednesday, March 26, 2014 1:22 PM To: user@cassandra.apache.org Subject: Question about how compaction and partition keys interact In CQL we need to decide between using

Re: Securing Cassandra database

2014-04-06 Thread Jack Krupansky
Take a look at the DataStax Enterprise Security Management. http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/sec/secDSE.html -- Jack Krupansky From: Check Peck Sent: Friday, April 4, 2014 11:54 PM To: user Subject: Securing Cassandra database Hi All, We

Re: OPSCENTER has table row maintenence?

2014-04-15 Thread Jack Krupansky
DevCenter makes it easy to query and display data and execute commands against Cassandra, but it doesn’t have a spreadsheet-like tool for hand editing columns. That sounds like a great suggestion to make. See: http://www.datastax.com/what-we-offer/products-services/devcenter -- Jack Krupansky

Re: Cassandra vs Elasticsearch.

2014-05-03 Thread Jack Krupansky
on Lucene for the core underlying indexing and query layers. -- Jack Krupansky From: Jon Haddad Sent: Saturday, May 3, 2014 4:03 AM To: user@cassandra.apache.org Subject: Re: Cassandra vs Elasticsearch. Agreed w/ ES not being the durable data store. I would recommend treating it as ephemeral

Re: Cassandra MapReduce/Storm/ etc

2014-05-16 Thread Jack Krupansky
Here’s a meetup talk on analytics using Cassandra, Storm, and Kafka: http://www.slideshare.net/aih1013/building-largescale-analytics-platform-with-storm-kafka-and-cassandra-nyc-storm-user-group-meetup-21st-nov-2013 -- Jack Krupansky From: Manoj Khangaonkar Sent: Thursday, May 8, 2014 5:43 PM

Re: What % of cassandra developers are employed by Datastax?

2014-05-16 Thread Jack Krupansky
You can always check the project committer wiki: http://wiki.apache.org/cassandra/Committers -- Jack Krupansky From: Kevin Burton Sent: Wednesday, May 14, 2014 4:39 PM To: user@cassandra.apache.org Subject: What % of cassandra developers are employed by Datastax? I'm curious what

Re: What % of cassandra developers are employed by Datastax?

2014-05-17 Thread Jack Krupansky
) and support for Cassandra: http://wiki.apache.org/cassandra/ThirdPartySupport (For disclosure, I am a part-time contractor for DataStax, but now on the sales side, although my background is as a developer.) -- Jack Krupansky From: Dave Brosius Sent: Saturday, May 17, 2014 10:48 AM To: user

Re: CQL 3 and wide rows

2014-05-19 Thread Jack Krupansky
You might want to review this blog post on supporting dynamic columns in CQL3, which points out that “the way to model dynamic cells in CQL is with a compound primary key.” See: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows -- Jack Krupansky From: Maciej Miklas

Re: CQL 3 and wide rows

2014-05-20 Thread Jack Krupansky
To keep the terminology clear, your “row_key” is actually the “partition key”, and “wide_row_column” is actually a “clustering column”, and the combination of your row_key and wide_row_column is a “compound primary key”. -- Jack Krupansky From: Aaron Morton Sent: Tuesday, May 20, 2014 3:06 AM

Re: Possible to Add multiple columns in one query ?

2014-05-25 Thread Jack Krupansky
of performance for a real cluster, especially compared to a server with more CPU cores than your laptop. And for a real cluster, rows with different partition keys can be sent to a coordinator node that owns that partition key, which could be multiple nodes for RF>1. -- Jack Krupansky From: Mark Farnan

Re: Cassandra snapshot

2014-06-02 Thread Jack Krupansky
You might check the doc: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html -- Jack Krupansky From: ng Sent: Monday, June 2, 2014 3:18 PM To: user@cassandra.apache.org Subject: Cassandra snapshot I need to make sure that all the data in sstable

Re: memtable mem usage off by 10?

2014-06-04 Thread Jack Krupansky
Yeah, it is in the doc: http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html And I don’t find a Jira issue mentioning it being removed, so... what’s the full story there?! -- Jack Krupansky From: Idrén, Johan Sent: Wednesday, June 4, 2014 8

Re: memtable mem usage off by 10?

2014-06-04 Thread Jack Krupansky
And sorry that the doc confused you as well! -- Jack Krupansky From: Idrén, Johan Sent: Wednesday, June 4, 2014 10:51 AM To: user@cassandra.apache.org Subject: Re: memtable mem usage off by 10? I wasn’t supplying it, I was assuming it was using the default. It does not exist in my config

Re: CQLSSTableWriter memory leak

2014-06-05 Thread Jack Krupansky
How many rows (primary key values) are you writing for each partition of the primary key? I mean, are there relatively few, or are these very wide partitions? Oh, I see! You’re writing 50,000,000 rows to a single partition! My, that IS ambitious. -- Jack Krupansky From: Xu Zhongxing Sent

Re: Bad Request: Type error: cannot assign result of function token (type bigint) to id (type int)

2014-06-06 Thread Jack Krupansky
The message does seem a little odd in that it refers to “assign”, but it would make more sense to say “compare”. -- Jack Krupansky From: Kevin Burton Sent: Friday, June 6, 2014 1:22 AM To: user@cassandra.apache.org Subject: Bad Request: Type error: cannot assign result of function token (type

Re: Data model for streaming a large table in real time.

2014-06-08 Thread Jack Krupansky
balancing for multiple tables” -- Jack Krupansky From: Kevin Burton Sent: Saturday, June 7, 2014 1:27 PM To: user@cassandra.apache.org Subject: Re: Data model for streaming a large table in real time. I just checked the source and in 2.1.0 it's not deprecated. So it *might* be *being

Re: Large number of row keys in query kills cluster

2014-06-11 Thread Jack Krupansky
batches” as an anti-pattern: http://www.slideshare.net/mattdennis -- Jack Krupansky From: Peter Sanford Sent: Wednesday, June 11, 2014 7:34 PM To: user@cassandra.apache.org Subject: Re: Large number of row keys in query kills cluster On Wed, Jun 11, 2014 at 10:12 AM, Jeremy Jongsma jer

Re: Backup Cassandra to

2014-06-12 Thread Jack Krupansky
The doc for backing up – and restoring – Cassandra is here: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html That doesn’t tell you how to move the “snapshot” to or from tape, but a snapshot is the starting point for backing up Cassandra. -- Jack

Re: Pattern to store maps of maps...

2014-06-13 Thread Jack Krupansky
can select ‘foo_bar’. Ditto for additional levels. And if you want each of the intermediate levels, pick a serialization format such as JSON or BSON in addition to the flattened leaf values. Anything in your use case(s) that doesn’t cover? -- Jack Krupansky From: Kevin Burton Sent: Friday, June
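The flattening idea in this reply can be sketched roughly as follows. This is an illustration only, not code from the thread: the function name `flatten`, the `keep_levels` flag, and the underscore separator (taken from the 'foo_bar' example) are assumptions, and keys that already contain the separator could collide.

```python
import json


def flatten(nested, prefix="", sep="_", keep_levels=False):
    """Flatten a nested dict into single-level keys such as 'foo_bar'.

    Hypothetical helper: with keep_levels=True, each intermediate level is
    also stored under its own key as a JSON string, alongside the leaves.
    """
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            if keep_levels:
                # Serialized copy of the whole intermediate level
                flat[name] = json.dumps(value, sort_keys=True)
            flat.update(flatten(value, name, sep, keep_levels))
        else:
            flat[name] = value
    return flat
```

Each flattened key could then map to an ordinary column or map entry, so a leaf is selected by its full name and an intermediate level by its serialized form.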

Re: Best practices for repair

2014-06-19 Thread Jack Krupansky
The DataStax doc should be current best practices: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html If you or anybody else finds it inadequate, speak up. -- Jack Krupansky -Original Message- From: Paolo Crosato Sent: Thursday, June 19

Re: Are writes to indexes performed asynchronously?

2014-06-22 Thread Jack Krupansky
it may generally complete quicker than you can do a query. See: http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_build_index_c.html AFAICT, there is no mechanism for guaranteeing that a 2i has been updated. -- Jack Krupansky From: Tom van den Berge Sent: Thursday, June 19, 2014 5:26 AM

Re: Are writes to indexes performed asynchronously?

2014-06-22 Thread Jack Krupansky
It would be nice to add that note to the doc. And that leaves open the possibility that this feature may have a bug. -- Jack Krupansky From: DuyHai Doan Sent: Sunday, June 22, 2014 12:01 PM To: user@cassandra.apache.org Subject: Re: Are writes to indexes performed asynchronously? As far as I

Re: Datastax DSE binaries

2014-06-27 Thread Jack Krupansky
support for DSE at this time. -- Jack Krupansky From: Som Nair Sent: Friday, June 27, 2014 10:29 AM To: user@cassandra.apache.org Subject: Datastax DSE binaries Hi How and where do we include these binaries in a apache cassandra installation . I would like to include these in a puppet

Re: Datastax DSE binaries

2014-06-27 Thread Jack Krupansky
/upgrade/datastax_enterprise/upgrdAnyVersion.html See: http://www.datastax.com/wp-content/uploads/2014/04/WP-DataStax-Enterprise-Best-Practices.pdf -- Jack Krupansky From: Som Nair Sent: Friday, June 27, 2014 11:28 AM To: user@cassandra.apache.org Subject: Re: Datastax DSE binaries Hi Jack

Re: SSTable compression ratio… percentage or 0.0 - 1.0???

2014-06-29 Thread Jack Krupansky
It’s sloppy language in the doc. It is indeed a “ratio” – scroll down to the example. See: http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCFstats.html -- Jack Krupansky From: Kevin Burton Sent: Sunday, June 29, 2014 12:33 AM To: user@cassandra.apache.org Subject

Re: Primary key question

2014-07-01 Thread Jack Krupansky
. Since you only have 30,000 rows, it probably doesn’t matter which way you go – organize your data based on how it is logically structured and how you wish to access it. -- Jack Krupansky From: Wim Deblauwe Sent: Tuesday, July 1, 2014 8:24 AM To: user@cassandra.apache.org Subject: Re: Primary

Re: keyspace with hundreds of columnfamilies

2014-07-02 Thread Jack Krupansky
the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.” See: https://issues.apache.org/jira/browse/CASSANDRA-5935 “Allow disabling slab allocation” IOW, this is considered an anti-pattern, but... -- Jack Krupansky From: tommaso barbugli Sent: Wednesday, July 2, 2014 2:16 PM To: user

Re: Write Inconsistency to update a row

2014-07-03 Thread Jack Krupansky
You said that the updates do show up eventually – how long does it take? -- Jack Krupansky From: Sávio S. Teles de Oliveira Sent: Thursday, July 3, 2014 1:30 PM To: user@cassandra.apache.org Subject: Re: Write Inconsistency to update a row Are you sure all the nodes are working at that time

Re: Cassandra use cases/Strengths/Weakness

2014-07-04 Thread Jack Krupansky
/datastax-opscenter Here’s a feature comparison of some NoSQL databases: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis -- Jack Krupansky From: Prem Yadav Sent: Friday, July 4, 2014 10:37 AM To: user@cassandra.apache.org Subject: Cassandra use cases/Strengths/Weakness Hi, I have seen

Re: Cassandra use cases/Strengths/Weakness

2014-07-08 Thread Jack Krupansky
manual partitioning when you have more than 25 million rows or so. And then you have to pay attention to query latency as well. First big question: It may be 100 million rows today, but what growth rate do you anticipate? -- Jack Krupansky From: Matthias Hübner Sent: Saturday, July 5, 2014 5:49 AM

Re: Practical limit to number of keyspaces?

2014-07-11 Thread Jack Krupansky
to allow disabling the SlabAllocator.” Emphasis on “almost certainly a Bad Idea.” See: https://issues.apache.org/jira/browse/CASSANDRA-5935 “Allow disabling slab allocation” -- Jack Krupansky From: Sourabh Agrawal Sent: Thursday, July 10, 2014 11:22 PM To: user@cassandra.apache.org Subject

Re: keyspace with hundreds of columnfamilies

2014-07-12 Thread Jack Krupansky
tables with 10 columns vs. 100 tables with 100 columns – it should primarily be your queries (and updates) that drive the decision. Do fewer tables and more columns make your queries (and updates) a lot simpler and cleaner? -- Jack Krupansky From: tommaso barbugli Sent: Saturday, July 12, 2014

Re: keyspace with hundreds of columnfamilies

2014-07-13 Thread Jack Krupansky
cases look like? By all means, start simple, but also be careful not to paint yourself into a corner. In the alternative, be prepared to throw away entire implementations as your conceptualization of the data evolves. -- Jack Krupansky From: tommaso barbugli Sent: Saturday, July 12, 2014 3:12

Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Jack Krupansky
tune Cassandra for single-node performance, but that seems like a lot of extra work, to me, compared to adding more cheap nodes. -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 9:31 AM To: user Subject: Re: trouble showing cluster scalability for read performance Duncan

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
in a single partition would certainly not be a test of “horizontal scaling” (adding nodes to handle more data – more token values or partitions.) -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 1:33 PM To: user Subject: horizontal query scaling issues follow

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
and whether you are using a small number of partition keys and a large number of clustering columns, or does each row have a unique partition key and no clustering columns. -- Jack Krupansky From: Diane Griffith Sent: Thursday, July 17, 2014 6:21 PM To: user Subject: Re: horizontal query scaling

Re: I want either all the DML statements within the batch succeed or rollback all. is it possible?

2014-07-22 Thread Jack Krupansky
SQL-like joins can be horrendously expensive. -- Jack Krupansky From: M.Tarkeshwar Rao Sent: Tuesday, July 22, 2014 9:45 AM To: user@cassandra.apache.org Subject: I want either all the DML statements within the batch succeed or rollback all. is it possible? Hi all, In the user guide

Re: JSON to Cassandra ?

2014-07-22 Thread Jack Krupansky
DSE, with Solr integration, does provide “field input transformers” so that you can parse a column in JSON or any other format and then split it into any number of Solr fields, including dynamic fields, which would then let you query elements of that JSON. -- Jack Krupansky From: Alain

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Jack Krupansky
sharing knowledge on this list is always a big step forward. -- Jack Krupansky From: spa...@gmail.com Sent: Wednesday, July 23, 2014 4:25 AM To: user@cassandra.apache.org Subject: Re: Why is the cassandra documentation such poor quality? I would like to help out with the documentation of C*. How do

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Jack Krupansky
I do recall seeing your announcement of your driver, but I think it got lost in the discussion of whether it supported CQL. If you say it supports CQL and native protocol, I’m sure it will get very prompt attention. -- Jack Krupansky From: Peter Lin Sent: Wednesday, July 23, 2014 8:30 AM

Re: CSV Import is taking huge time

2014-07-23 Thread Jack Krupansky
Is it compute bound or I/O bound? What does your cluster look like? -- Jack Krupansky From: Akshay Ballarpure Sent: Wednesday, July 23, 2014 5:00 AM To: user@cassandra.apache.org Subject: CSV Import is taking huge time Hello, I am trying copy command in Cassandra to import CSV file in to DB

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread Jack Krupansky
to for this two node cluster? -- Jack Krupansky From: Andrew Sent: Wednesday, July 23, 2014 1:02 AM To: graham sanderson ; user@cassandra.apache.org Cc: Kevin Burton Subject: Re: All writes fail with ONE consistency level when adding second node to cluster? I looked into this; ONE means

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Jack Krupansky
Out of curiosity, did you look at or utilize DataStax’s free online training? See: http://www.datastax.com/what-we-offer/products-services/training/virtual-training Any feedback? Any suggestions as to what needs it does or doesn’t fulfill? -- Jack Krupansky From: Nicholas Okunew Sent

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread Jack Krupansky
immediate queries can see the data. And as the description notes, hinted handoff will eventually propagate the data (unless it times out and drops the hint.) -- Jack Krupansky From: Robert Coli Sent: Wednesday, July 23, 2014 1:15 PM To: user@cassandra.apache.org Cc: Kevin Burton Subject: Re: All

Re: What is C*?

2014-07-24 Thread Jack Krupansky
some people would prefer C8 or C7a – at least that would have a chance of returning narrower Google search results than searching for “C*”. -- Jack Krupansky From: Mark Reddy Sent: Thursday, July 24, 2014 4:04 AM To: user@cassandra.apache.org Subject: Re: What is C*? Yes you are correct

Re: Hot, large row

2014-07-24 Thread Jack Krupansky
? -- Jack Krupansky From: DuyHai Doan Sent: Thursday, July 24, 2014 3:53 PM To: user@cassandra.apache.org Subject: Re: Hot, large row Your extract of cfhistograms show that there are no particular wide rows. The widest has 61214 cells which is big but not that huge to be really a concern

Re: Why is the cassandra documentation such poor quality?

2014-07-24 Thread Jack Krupansky
Blog posts are great for highlighting and focusing the community on new features, changes, and techniques, but any knowledge content in them definitely needs to be in the docs as well. -- Jack Krupansky From: Tyler Hobbs Sent: Thursday, July 24, 2014 12:07 PM To: user@cassandra.apache.org

Re: here's a good example of poor cassandra documentation.

2014-07-24 Thread Jack Krupansky
Thanks. I’ll pass it along to the doc team. -- Jack Krupansky From: Kevin Burton Sent: Thursday, July 24, 2014 6:34 PM To: user@cassandra.apache.org Subject: here's a good example of poor cassandra documentation. so searching google for cassandra leveled compaction there are no hits

Re: Hot, large row

2014-07-25 Thread Jack Krupansky
Is it the accumulated tombstones on a row that make it act as if “wide”? Does cfhistograms count the tombstones or subtract them when reporting on cell-count for rows? (I don’t know.) -- Jack Krupansky From: Keith Wright Sent: Friday, July 25, 2014 10:24 AM To: user@cassandra.apache.org Cc

Re: Cassandra trigger following the CQL for Cassandra 2.0 tutorial does not work

2014-07-28 Thread Jack Krupansky
of the specifics. -- Jack Krupansky -Original Message- From: Michael Dykman Sent: Monday, July 28, 2014 10:35 AM To: Cassandra Users Subject: Re: Cassandra trigger following the CQL for Cassandra 2.0 tutorial does not work How to compile is a much bigger question and probably way off topic

Re: select many rows one time or select many times?

2014-07-31 Thread Jack Krupansky
This doesn’t seem like a reasonable use case for Cassandra. I mean, it’s not a typical “database” use case. -- Jack Krupansky From: Philo Yang Sent: Thursday, July 31, 2014 1:44 PM To: user@cassandra.apache.org Subject: select many rows one time or select many times? Hi all, I have

Re: Occasional read timeouts seen during row scans

2014-08-02 Thread Jack Krupansky
retry get you to 100% success? I would note that even the best distributed systems do not guarantee zero failures for environmental issues, so apps need to tolerate occasional failures. -- Jack Krupansky -Original Message- From: Duncan Sands Sent: Saturday, August 2, 2014 7:04 AM
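One way an application might tolerate occasional timeouts is a small retry wrapper with exponential backoff. The sketch below is a generic illustration and not taken from the thread or from any Cassandra driver: `with_retries`, its parameters, and the use of Python's built-in `TimeoutError` are all assumptions.

```python
import time


def with_retries(operation, attempts=3, base_delay=0.1,
                 retry_on=(TimeoutError,), sleep=time.sleep):
    """Call operation(); on a retryable error, back off and try again.

    Hypothetical helper: the delay doubles after each failed attempt, and
    the last error is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the wrapper easy to exercise in tests without real waiting.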

Re: Reasonable range for the max number of tables?

2014-08-04 Thread Jack Krupansky
apart, with no warning. -- Jack Krupansky From: Robert Coli Sent: Monday, August 4, 2014 4:54 PM To: user@cassandra.apache.org Subject: Re: Reasonable range for the max number of tables? On Mon, Aug 4, 2014 at 1:35 PM, Kevin Burton bur...@spinn3r.com wrote: What are the bottlenecks here

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Jack Krupansky
in this area. -- Jack Krupansky -Original Message- From: Phil Luckhurst Sent: Tuesday, August 5, 2014 4:09 AM To: cassandra-u...@incubator.apache.org Subject: Re: Reasonable range for the max number of tables? Is there any mention of this limitation anywhere in the Cassandra documentation

Re: too many open files

2014-08-09 Thread Jack Krupansky
or is unlimited, is probably not so important. Or, maybe, simply have a single limit, without the modes and default it to 10 or 25 or some other relatively low number for “normal” apps. This would be more developer-friendly, for both new and “normal” developers... I think. -- Jack Krupansky

Re: Number of columns per row for composite columns?

2014-08-12 Thread Jack Krupansky
” – in a partition would be the number of rows you have inserted in that partition times the number of columns you have declared in the table. If you need to review the terminology: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows -- Jack Krupansky From: hlqv Sent: Tuesday

Re: range query times out (on 1 node, just 1 row in table)

2014-08-13 Thread Jack Krupansky
Agreed, but... in this case the table has ONE row, so what exactly could be causing this timeout? I mean, it can’t be the row count, right? -- Jack Krupansky From: DuyHai Doan Sent: Wednesday, August 13, 2014 9:01 AM To: user@cassandra.apache.org Subject: Re: range query times out (on 1 node

Re: Strange select result when using date grater than query

2014-08-17 Thread Jack Krupansky
Are you more than 7 time zones behind GMT? If so, that would make 03:33 in your query less than 03:33-0700. Your query is using the default time zone, which will be the time zone configured for the coordinator node executing the query. IOW, where are you? -- Jack Krupansky -Original

Re: Strange select result when using date grater than query

2014-08-17 Thread Jack Krupansky
. And separate from the actual data, which is stored in GMT. -- Jack Krupansky -Original Message- From: Subodh Nijsure Sent: Sunday, August 17, 2014 10:04 AM To: user@cassandra.apache.org Subject: Re: Strange select result when using date grater than query I am in PST ( Oakland ). I am

Re: Compaction before Decommission and Bootstrapping

2014-08-17 Thread Jack Krupansky
, and the new nodes are not available until the new data center is completely ready. And if something goes wrong, no harm to the existing nodes. -- Jack Krupansky -Original Message- From: Robert Stupp Sent: Sunday, August 17, 2014 11:17 AM To: user@cassandra.apache.org Subject: Re

Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
With CQL3, you, the developer, get to decide whether to place a primary key column in the partition key or as a clustering column. So, make sensorID the partition key and datetime as a clustering column. -- Jack Krupansky From: Drew Kutcharian Sent: Friday, August 29, 2014 6:48 PM To: user
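As a rough mental model of that layout, the sketch below treats sensorID as the partition key and datetime as the clustering column using plain Python dictionaries. It is only an in-memory illustration of the grouping and ordering, not how Cassandra stores data; the class and method names are hypothetical.

```python
from datetime import datetime


class SensorReadings:
    """Toy model: one partition per sensor_id, rows ordered by datetime."""

    def __init__(self):
        # partition key (sensor_id) -> {clustering value (when): reading}
        self._partitions = {}

    def upsert(self, sensor_id, when, reading):
        # Same partition key + clustering value overwrites the earlier row
        self._partitions.setdefault(sensor_id, {})[when] = reading

    def range(self, sensor_id, start, end):
        # A time-range query touches a single partition, in clustering order
        rows = self._partitions.get(sensor_id, {})
        return [(t, rows[t]) for t in sorted(rows) if start <= t <= end]
```

The point of the model is that all readings for one sensor live together and come back sorted by time, which is what makes a per-sensor time-range query cheap.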

Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
separate nodes? I mean, the whole point of Cassandra is scalability and distributed processing, right? -- Jack Krupansky From: Drew Kutcharian Sent: Friday, August 29, 2014 7:31 PM To: user@cassandra.apache.org Subject: Re: Data partitioning and composite partition key Hi Jack, I think you

Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
talking about? One of the secrets of Cassandra is to use more, smaller requests in parallel, rather than massive requests to a single coordinator node. -- Jack Krupansky From: Drew Kutcharian Sent: Friday, August 29, 2014 8:28 PM To: user@cassandra.apache.org Subject: Re: Data partitioning
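The "more, smaller requests in parallel" pattern might look something like the sketch below. It uses only Python's standard thread pool; `fetch_many` and the `fetch_one` callback are hypothetical stand-ins for whatever single-partition query the application issues, not a real driver API.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_many(keys, fetch_one, max_workers=8):
    """Issue one small request per key concurrently and collect the results.

    fetch_one is an assumed callable that performs a single-key lookup;
    a real application would wrap its own driver call here.
    """
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys
        return dict(zip(keys, pool.map(fetch_one, keys)))
```

For example, `fetch_many(sensor_ids, lambda k: lookup(k))` would spread the lookups across worker threads rather than sending one large multi-key request through a single coordinator, where `lookup` is again a placeholder.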

Re: Help with migration from Thrift to CQL3 on Cassandra 2.0.10

2014-08-31 Thread Jack Krupansky
You might want to take a look at Titan, a graph database that can use Cassandra as its storage engine, and see how it does these things. -- Jack Krupansky From: Todd Nine Sent: Sunday, August 31, 2014 11:06 AM To: user@cassandra.apache.org Subject: Re: Help with migration from Thrift to CQL3

Re: Help with select IN query in cassandra

2014-09-01 Thread Jack Krupansky
are “current” (or “recent”)? Shouldn’t we be looking for and promoting “write once” approaches as a much stronger preference/pattern? Or maybe I should say “write once and bulk delete on aging” rather than the exercise in futility of doing a massive number of deletes and updates in place? -- Jack

Re: Help with select IN query in cassandra

2014-09-01 Thread Jack Krupansky
I did see a reference to deletions: “overall approaches considering volumes, deletion, compaction etc.” Did I merely misunderstand the reference? That’s all I was responding to... sorry if my misunderstanding added any confusion! -- Jack Krupansky From: Laing, Michael Sent: Monday, September

Re: C 2.1

2014-09-15 Thread Jack Krupansky
If you’re indexing and querying on that many columns (dozens, or more than a handful), consider DSE/Solr, especially if you need to query on multiple columns in the same query. -- Jack Krupansky From: Robert Coli Sent: Monday, September 15, 2014 11:07 AM To: user@cassandra.apache.org Subject

Re: C 2.1

2014-09-16 Thread Jack Krupansky
for distributing queries across the cluster. And... Lucene (underneath Solr) is optimal for queries that span multiple fields. DSE/Solr supports CQL3 wide rows (clustering columns.) -- Jack Krupansky From: Ram N Sent: Monday, September 15, 2014 4:34 PM To: user Subject: Re: C 2.1 Jack, Using Solr

Re: Document of WRITETIME function needs update

2014-09-17 Thread Jack Krupansky
Fixed. Thanks for reporting this! -- Jack Krupansky From: ziju feng Sent: Tuesday, September 16, 2014 8:30 AM To: user@cassandra.apache.org Subject: Document of WRITETIME function needs update Hi, I found that the WRITETIME function on counter column returns date/time in milliseconds

Re: C 2.1

2014-09-17 Thread Jack Krupansky
as well, but supports Solr query syntax rather than needing to pass a structured JSON format. SELECT * FROM persons WHERE solr_query='name:jo* age:[20 TO 40]'; And your app can use SolrJ or raw HTTP requests to talk to Solr within DSE as well. -- Jack Krupansky From: Ram N Sent: Wednesday

Re: Help with approach to remove RDBMS schema from code to move to C*?

2014-09-19 Thread Jack Krupansky
, and the string columns in a string map collection, but... it’s best to first step back and look at the big picture of what the data actually looks like as well as how you want to query it. -- Jack Krupansky From: Les Hartzman Sent: Friday, September 19, 2014 5:46 PM To: user

Re: Indexes Fragmentation

2014-09-28 Thread Jack Krupansky
Take a look at DataStax Enterprise as well, with its integrated Solr indexing of Cassandra data. -- Jack Krupansky From: Arthur Zubarev Sent: Sunday, September 28, 2014 10:55 AM To: user@cassandra.apache.org Subject: Indexes Fragmentation Hi all: A client on a RDBMS faces quick index

Re: Indexes Fragmentation

2014-09-28 Thread Jack Krupansky
, but the background question remains how you intend to access that updated data? I mean, any perceived fragmentation may just be statistical noise compared to access efficiency overall. -- Jack Krupansky From: Arthur Zubarev Sent: Sunday, September 28, 2014 11:19 AM To: user@cassandra.apache.org

Re: is lack of full text search hurting cassandra and datastax?

2014-10-03 Thread Jack Krupansky
available as well. -- Jack Krupansky From: DuyHai Doan Sent: Friday, October 3, 2014 3:54 AM To: user@cassandra.apache.org Subject: Re: is lack of full text search hurting cassandra and datastax? There are some options around for full text search integration with C*. Google for Stratio deep

Re: Cassandra + Solr

2014-10-04 Thread Jack Krupansky
or the file system cache. But as a general proposition, plan on having enough system memory to cache your Lucene/Solr index. SSDs... are a bit different, but I'd still want most of the index to fit in RAM. -- Jack Krupansky -Original Message- From: Robert Wille Sent: Saturday

Re: Consistency Levels

2014-10-08 Thread Jack Krupansky
consistency - to all nodes beyond the immediate quorum. -- Jack Krupansky -Original Message- From: William Katsak Sent: Wednesday, October 8, 2014 12:27 PM To: user@cassandra.apache.org Subject: Consistency Levels Hello, I was wondering if anyone (Datastax?) has any usage data about

Re: Consistency Levels

2014-10-08 Thread Jack Krupansky
, that isn't necessarily a failure is it? All of that said, it depends on where you're trying to get to. -- Jack Krupansky -Original Message- From: William Katsak Sent: Wednesday, October 8, 2014 7:19 PM To: user@cassandra.apache.org Subject: Re: Consistency Levels Thanks. I am thinking

Experiences with repairs using vnodes

2014-10-24 Thread Jack Krupansky
the cluster, more confidence in maintaining the cluster, or... whatever else may have been impacted. IOW, what actual benefit/change did you experience firsthand. Thanks! -- Jack Krupansky

Re: Redundancy inside a cassandra node

2014-11-08 Thread Jack Krupansky
and OOM issues. 2. Replication redundancy is also for supporting higher load, not just availability on node outage. -- Jack Krupansky From: Jabbar Azam Sent: Friday, November 7, 2014 3:24 PM To: user@cassandra.apache.org Subject: Redundancy inside a cassandra node Hello all, My work

Re: Rule of thumb for concurrent asynchronous queries?

2014-11-25 Thread Jack Krupansky
. It would be nice if we had a way to calculate this number (both numbers) for you so that a client (driver) could ping for it from the cluster, as well as for the cluster to return a suggested wait interval before sending another request based on actual load. -- Jack Krupansky -Original

Re: Storing time-series and geospatial data in C*

2014-11-27 Thread Jack Krupansky
most frequently. For example, will aged data be deleted and with what frequency. It also depends on what types of aggregations, or rollups, you might want to perform. -- Jack Krupansky From: Spico Florin Sent: Thursday, November 27, 2014 7:38 AM To: user@cassandra.apache.org Subject: Storing

Re: mysql based columnar DB to Cassandra DB - Migration

2014-11-28 Thread Jack Krupansky
Planet Cassandra has some resource pages related to migrations to Cassandra. For HBase: http://planetcassandra.org/hbase-to-cassandra-migration/ There are pages for migration from Oracle, MySQL, MongoDB, and Redis, as well. -- Jack Krupansky From: Akshay Ballarpure Sent: Friday, November 28

Re: Keyspace and table/cf limits

2014-12-06 Thread Jack Krupansky
into memory and performance issues. There is an undocumented method to reduce the table overhead to support more tables, but... if you are not expert enough to find it on your own, then you are definitely not expert enough to be using it. -- Jack Krupansky From: Raj N Sent: Tuesday, November

Re: Keyspace and table/cf limits

2014-12-06 Thread Jack Krupansky
implementation to determine what table limit works best for your use case. -- Jack Krupansky From: Raj N Sent: Wednesday, December 3, 2014 4:54 PM To: user@cassandra.apache.org Subject: Re: Keyspace and table/cf limits The question is more from a multi-tenancy point of view. We wanted to see

Re: How to model data to achieve specific data locality

2014-12-07 Thread Jack Krupansky
more Thrift/slice oriented. -- Jack Krupansky From: Eric Stevens Sent: Sunday, December 7, 2014 10:12 AM To: user@cassandra.apache.org Subject: Re: How to model data to achieve specific data locality Also new seq_types can be added and old seq_types can be deleted. This means I often need

Re: How to model data to achieve specific data locality

2014-12-07 Thread Jack Krupansky
or 10 million or...? Sure, buckets are a very real option, but if the number of seq_types was only 10,000 to 50,000, then bucketing might be unnecessary complexity and access overhead. -- Jack Krupansky From: Kai Wang Sent: Sunday, December 7, 2014 3:06 PM To: user@cassandra.apache.org Subject

Re: batch_size_warn_threshold_in_kb

2014-12-13 Thread Jack Krupansky
flies in the face of the admonition to refrain from using server-side distribution of requests. At a minimum, the CQL spec should make a clearer statement of intent and non-intent for BATCH. -- Jack Krupansky From: Jonathan Haddad Sent: Friday, December 12, 2014 12:58 PM To: user

Re: Cassandra Database using too much space

2014-12-14 Thread Jack Krupansky
instances, both for the full corpus and for the subset containing your 1.5 million words? -- Jack Krupansky From: Chamila Wijayarathna Sent: Sunday, December 14, 2014 7:01 AM To: user@cassandra.apache.org Subject: Cassandra Database using too much space Hello all, We are trying to develop

Re: Cassandra Database using too much space

2014-12-15 Thread Jack Krupansky
limit – beyond that you need to start using “buckets” to break up ultra-large partitions. Also, you need to look carefully at how you want to query each table. -- Jack Krupansky From: Chamila Wijayarathna Sent: Sunday, December 14, 2014 11:36 PM To: user@cassandra.apache.org Subject: Re
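
A minimal sketch of the "buckets" idea mentioned above, assuming time-based bucketing: a bucket number is added to the partition key so that no single partition grows without bound. The function name `bucketed_partition_key`, the `word` column, and the 30-day bucket width are hypothetical choices for illustration; they do not come from the thread.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def bucketed_partition_key(word: str, event_time: datetime, bucket_days: int = 30):
    """Return a (word, bucket) pair to use as a compound partition key.

    All rows for the same word whose timestamps fall in the same
    bucket_days-wide window share a partition; later rows start a new one.
    Naive datetimes are assumed to be UTC (an assumption of this sketch).
    """
    if event_time.tzinfo is None:
        event_time = event_time.replace(tzinfo=timezone.utc)
    days_since_epoch = int(event_time.timestamp()) // SECONDS_PER_DAY
    return (word, days_since_epoch // bucket_days)
```

A reader then has to know (or enumerate) the bucket numbers covering the time range it wants, which is the access overhead that makes bucketing worth it only for genuinely large partitions.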

Re: Understanding what is key and partition key

2014-12-16 Thread Jack Krupansky
clustering columns. “The key” should just be a synonym for “primary key”, although sometimes people are loosely speaking about “the partition” (which should be “the partition key”) rather than the CQL “row”. -- Jack Krupansky From: Chamila Wijayarathna Sent: Tuesday, December 16, 2014 8:03 AM To: user
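
To make the terminology concrete, here is a small sketch that models a CQL primary key as a partition key plus optional clustering columns and renders the matching `PRIMARY KEY` clause. The `PrimaryKey` class and the column names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryKey:
    """A CQL primary key: partition key columns, then clustering columns."""

    partition_key: tuple[str, ...]
    clustering_columns: tuple[str, ...] = ()

    def to_cql(self) -> str:
        # A compound partition key is wrapped in its own parentheses.
        if len(self.partition_key) == 1:
            pk = self.partition_key[0]
        else:
            pk = "(" + ", ".join(self.partition_key) + ")"
        return "PRIMARY KEY (" + ", ".join([pk, *self.clustering_columns]) + ")"
```

For example, `PrimaryKey(("sensor_id",), ("ts",)).to_cql()` gives a single-column partition key with one clustering column, while passing two partition key columns produces the doubled-parenthesis form.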

Re: does consistency=ALL for deletes obviate the need for tombstones?

2014-12-16 Thread Jack Krupansky
When you say “no need for tombstones”, did you actually read that somewhere or were you just speculating? If the former, where exactly? -- Jack Krupansky From: Ian Rose Sent: Tuesday, December 16, 2014 10:22 AM To: user Subject: does consistency=ALL for deletes obviate the need for tombstones

Re: Cassandra update row after delete immediately, and read that, the data not right?

2014-12-25 Thread Jack Krupansky
What RF? Is the update and read immediately after the delete and insert, or is the read after doing all the updates? Is the delete and insert done with a single batch? -- Jack Krupansky On Thu, Dec 25, 2014 at 4:14 AM, yhq...@sina.com wrote: Hi, all I write a program to test

Re: Why read row is so slower than read column.

2014-12-26 Thread Jack Krupansky
What do your CQL queries look like? -- Jack Krupansky On Fri, Dec 26, 2014 at 8:00 AM, yhq...@sina.com wrote: Hi, all: In my cf, each row has two columns: one is the timestamp (64-bit), the other is data of about 500k. When I read a row, the QPS is about 30. I read

Re: any code to load large data from web into Cassandra

2014-12-27 Thread Jack Krupansky
is needed. Alternatively, you could hire a consultant to help guide you through the application analysis process to determine your application requirements, and then you could simply post your application requirements, or at least a concise summary or relevant excerpt. -- Jack Krupansky -- Jack

Re: Is compound index a planned feature in 3.0?

2014-12-31 Thread Jack Krupansky
Lucene directly: https://github.com/Stratio/stratio-cassandra 3. Stargate which also uses Lucene directly: http://tuplejump.github.io/stargate/ All three support Lucene queries in the WHERE clause of CQL SELECT. DSE also supports direct Solr HTTP API access for both queries and updates. -- Jack

Re: is primary key( foo, bar) the same as primary key ( foo ) with a ‘set' of bars?

2015-01-03 Thread Jack Krupansky
). -- Jack Krupansky On Sat, Jan 3, 2015 at 2:31 PM, Sylvain Wallez sylv...@apache.org wrote: From what I understand from the docs, the 64k limit applies to both the number of items in a collection and the size of its elements? Why is there a constraint on value size in collections, when

Re: Help on modeling a table

2015-02-02 Thread Jack Krupansky
developers. -- Jack Krupansky On Mon, Feb 2, 2015 at 10:33 AM, Asit KAUSHIK asitkaushikno...@gmail.com wrote: Hi All, we are working on an application logging project, and this is one of the search tables: CREATE TABLE logentries ( logentrytimestamputcguid timeuuid PRIMARY KEY

Re: Mutable primary key in a table

2015-02-08 Thread Jack Krupansky
, or... whatever. -- Jack Krupansky On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal ajku@gmail.com wrote: On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens migh...@gmail.com wrote: I'm struggling to think of a model where it makes sense to update a primary key as a typical operation. It suggests
