[Ann] Cassandra Interpreter for Zeppelin
Hello,

I'm pleased to announce a Cassandra interpreter for Apache Zeppelin.

For those who don't know, Apache Zeppelin[1] is a web-based notebook that enables interactive data analytics. It is similar to IPython/Jupyter, but it is JVM-based and its architecture is modular enough to allow various back-ends (Spark, HBase, Lens, ...) to be plugged in.

This Cassandra-Zeppelin integration is not meant to replace a complete suite like Tableau Software, QlikView or similar, but it does offer a user-friendly web-based interface for interactive data visualization.

The Zeppelin project and community are young but very promising. They plan to add more graph capabilities and features in the future.

The JIRA has been created here[2]. If you're interested in playing with it, please vote on the JIRA so that the pull request can be merged quickly. Feedback is also welcome.

A brief description of what can be done with this interpreter:

- support for single-line and multi-line comments
- one CQL statement can span many lines
- a @prefix system to pass runtime parameters to queries
- support for preparing statements beforehand and injecting bound values into prepared statements
- parallel execution of paragraphs
- the last statement is displayed as tabular data if it is a SELECT statement; for non-SELECT statements, execution statistics are returned
- simple syntax validation by the interpreter; CQL syntax validation is delegated to Cassandra
- support for Zeppelin dynamic forms with the mustache syntax {{input_name=default value}} or {{select_name=val1 | val2 | ... | valN}}

Detailed documentation and build instructions for the interpreter can be found here[3].

[1]: http://zeppelin.incubator.apache.org/
[2]: https://issues.apache.org/jira/browse/ZEPPELIN-179
[3]: https://docs.google.com/document/d/1krRrpZ3jKx_EOnALp30R1aAL8_tqCiu3W9oz5og0hDg/pub

Regards,
Duy Hai DOAN
RE: Can't connect to Cassandra server
Setting system_memory_in_mb to 16 GB means the Cassandra heap size you are using is 4 GB. If you meant to use a 16 GB heap, you should uncomment the line

#MAX_HEAP_SIZE=4G

and set

MAX_HEAP_SIZE=16G

You should uncomment the HEAP_NEWSIZE setting as well. I would leave it at the default setting of 800M until you are certain it needs to be changed.

From: Chamila Wijayarathna [mailto:cdwijayarat...@gmail.com]
Sent: Tuesday, July 21, 2015 9:21 PM
To: Erick Ramirez
Cc: user@cassandra.apache.org
Subject: Re: Can't connect to Cassandra server

Hi Erick,

In cassandra-env.sh, system_memory_in_mb was set to 2 GB. I changed it to 16 GB, but I still get the same issue. Below are my complete system.log after changing cassandra-env.sh, and the new cassandra-env.sh.

https://gist.githubusercontent.com/cdwijayarathna/5e7e69c62ac09b45490b/raw/f73f043a6cd68eb5e7f93cf597ec514df7ac61ae/log
https://gist.github.com/cdwijayarathna/2665814a9bd3c47ba650

I can't find any output.log in my Cassandra installation.

Thanks

On Tue, Jul 21, 2015 at 4:31 AM, Erick Ramirez er...@ramirez.com.au wrote:

Chamila,

As you can see from the netstat/lsof output, there is nothing listening on port 9042 because Cassandra has not started yet. This is the reason you are unable to connect via cqlsh. You first need to work out why Cassandra has not started.

With regard to the JVM, Oded is referring to the max heap size and new heap size you have configured. The suspicion is that you have the max heap size set too low, which is apparent from the heap pressure and GC pattern in the log you provided.

Please provide gists for the following so we can assist:
- updated system.log
- copy of output.log
- cassandra-env.sh

Cheers,
Erick

Erick Ramirez
About Me: http://about.me/erickramirezonline

--
Chamila Dilshan Wijayarathna,
Software Engineer
Mobile: (+94)788193620
WSO2 Inc., http://wso2.com/
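For readers wondering where the 4 GB figure at the top of this reply comes from, here is a minimal Python sketch of the default heap calculation as I understand it from the 2.x cassandra-env.sh (max of "half the RAM capped at 1024 MB" and "a quarter of the RAM capped at 8192 MB"). The function name is made up for illustration, and the formula is an assumption to be checked against the calculate_heap_sizes section of your own copy of the script.

```python
def default_max_heap_mb(system_memory_in_mb: int) -> int:
    """Approximate the default MAX_HEAP_SIZE (in MB) chosen when it is not set explicitly.

    Assumed rule: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB)).
    """
    half_capped = min(system_memory_in_mb // 2, 1024)
    quarter_capped = min(system_memory_in_mb // 4, 8192)
    return max(half_capped, quarter_capped)


if __name__ == "__main__":
    for ram_mb in (2048, 16384):
        print(ram_mb, "MB RAM ->", default_max_heap_mb(ram_mb), "MB heap")
```

Under this rule, changing system_memory_in_mb only changes the input to the calculation; an explicit MAX_HEAP_SIZE (together with HEAP_NEWSIZE) bypasses it entirely.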
Re: howto do sql query like in a relational database
Hello Anton,

You need to look into the DataStax Enterprise (DSE) offering. It integrates Solr search, which allows you to do searches like the one you mention. There are also some open-source projects doing this kind of integration, so it's up to you. And as Oded mentioned, Cassandra really shines on key queries.

Regards,

Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | LinkedIn: http://linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Wed, Jul 22, 2015 at 7:39 AM, Peer, Oded oded.p...@rsa.com wrote:

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store http://wiki.apache.org/cassandra/

It is intended for searching by key. It has more querying options, but it really shines when querying by key.

Not all databases offer the same functionality. Both a knife and a fork are eating utensils, but you wouldn't want to cut a tomato with a fork.

There are text-indexing databases out there that might suit your needs better. Try elasticsearch.

-----Original Message-----
From: anton [mailto:anto...@gmx.de]
Sent: Tuesday, July 21, 2015 7:54 PM
To: user@cassandra.apache.org
Subject: howto do sql query like in a relational database

Hi,

I have a simple (perhaps stupid) question.

If I want to *search* data in Cassandra, how could I find in a text field all records which start with 'Cas' (in SQL I do select * from table where field like 'Cas%')?

I know that this is not directly possible.

- But how is it possible?
- Does nobody need to search text fragments, and if not, is there a small example to explain *why* this is not needed?

As far as I understand, databases are great for *searching* data. Concerning numerical data in Cassandra, I can use = and all those operators. Is Cassandra intended to be used mostly for numerical data?

I did not catch the point up to now, sorry.

Anton
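One technique not mentioned in the thread, offered here only as a hedged illustration: when the text being searched is a clustering column, rows inside a partition are stored sorted, so a "starts with 'Cas'" lookup can be expressed as a range (field >= 'Cas' AND field < 'Cat') rather than a LIKE. The Python sketch below mimics that idea on an in-memory sorted list; the helper names and sample values are made up, and it assumes a non-empty prefix whose last character is not the highest possible code point.

```python
import bisect


def prefix_upper_bound(prefix: str) -> str:
    """Return the smallest string that sorts after every string starting with prefix."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def prefix_scan(sorted_values, prefix):
    """Return all values starting with prefix, using two binary searches (a range scan)."""
    lo = bisect.bisect_left(sorted_values, prefix)
    hi = bisect.bisect_left(sorted_values, prefix_upper_bound(prefix))
    return sorted_values[lo:hi]


if __name__ == "__main__":
    names = sorted(["Apache", "Cas", "Casablanca", "Cassandra", "Castle", "Cat", "Zeppelin"])
    print(prefix_scan(names, "Cas"))
```

This only covers prefix matches within a known partition; substring or full-text search still calls for the Solr or elasticsearch style of integration suggested in the replies.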
Re: Schema questions for data structures with recently-modified access patterns
Ah, so your access pattern is to get all documents modified on a particular date, right?

Then I think your approach is good, and to avoid duplication, why not add the docId as the first clustering column and remove the last_modified field from the key? That way, your primary key would be PRIMARY KEY(date, docId), so all docs modified on the same day are together in the same partition, and two updates to the same document on the same date won't generate two rows, as the primary key would be exactly the same.

Does it make sense?

Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso

On 21 July 2015 at 18:37, Robert Wille rwi...@fold3.com wrote:

The time series doesn’t provide the access pattern I’m looking for. No way to query recently-modified documents.

On Jul 21, 2015, at 9:13 AM, Carlos Alonso i...@mrcalonso.com wrote:

Hi Robert,

What about modelling it as a time series?

CREATE TABLE document (
    docId UUID,
    doc TEXT,
    last_modified TIMESTAMP,
    PRIMARY KEY (docId, last_modified)
) WITH CLUSTERING ORDER BY (last_modified DESC);

This way, the latest modification will always be the first record in the row, so accessing it should be as easy as:

SELECT * FROM document WHERE docId = <the docId> LIMIT 1;

And if you experience disk space issues due to very long rows, you can always expire old ones using TTL or a batch job. Tombstones will never be a problem in this case because, due to the specified clustering order, the latest modification will always be the first record in the row.

Hope it helps.

Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso

On 21 July 2015 at 05:59, Robert Wille rwi...@fold3.com wrote:

Data structures that have a recently-modified access pattern seem to be a poor fit for Cassandra. I’m wondering if any of you smart guys can provide suggestions.

For the sake of discussion, let's assume I have the following tables:

CREATE TABLE document (
    docId UUID,
    doc TEXT,
    last_modified TIMEUUID,
    PRIMARY KEY ((docid))
);

CREATE TABLE doc_by_last_modified (
    date TEXT,
    last_modified TIMEUUID,
    docId UUID,
    PRIMARY KEY ((date), last_modified)
);

When I update a document, I retrieve its last_modified time, delete the current record from doc_by_last_modified, and add a new one. Unfortunately, if you’d like each document to appear at most once in the doc_by_last_modified table, then this doesn’t work so well.

Documents can get into the doc_by_last_modified table multiple times if there is concurrent access, or if there is a consistency issue.

Any thoughts out there on how to efficiently provide recently-modified access to a table? This problem exists for many types of data structures, not just recently-modified. Any ordered data structure that can be dynamically reordered suffers from the same problems. As I’ve been doing schema design, this pattern keeps recurring. A nice way to address this problem has lots of applications.

Thanks in advance for your thoughts

Robert
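The difference between the two keys discussed in this thread, PRIMARY KEY ((date), last_modified) and the suggested PRIMARY KEY (date, docId), can be sketched with a toy in-memory model. This is only an illustration of Cassandra's upsert semantics (a write to an existing primary key overwrites the row); the dictionaries, the upsert helper and the sample updates are all hypothetical and no real driver is involved.

```python
def upsert(table, partition_key, clustering_key, row):
    """Write a row; an existing (partition, clustering) pair is overwritten, as in Cassandra."""
    table.setdefault(partition_key, {})[clustering_key] = row


# Toy stand-ins for the two table layouts.
by_time = {}  # PRIMARY KEY ((date), last_modified)
by_doc = {}   # PRIMARY KEY ((date), docId)

# doc-A is modified twice on the same day, doc-B once (made-up sample data).
updates = [
    ("2015-07-21", "t1", "doc-A"),
    ("2015-07-21", "t2", "doc-A"),
    ("2015-07-21", "t3", "doc-B"),
]

for date, modified_at, doc_id in updates:
    upsert(by_time, date, modified_at, {"docId": doc_id})
    upsert(by_doc, date, doc_id, {"last_modified": modified_at})

rows_by_time = len(by_time["2015-07-21"])  # one row per modification
rows_by_doc = len(by_doc["2015-07-21"])    # one row per document per day

if __name__ == "__main__":
    print("keyed by last_modified:", rows_by_time)
    print("keyed by docId:", rows_by_doc)
```

Two caveats follow from the same model: a document modified on two different days still appears once in each day's partition, and rows keyed by docId are ordered by docId rather than by modification time, so "most recent first" within a day has to be sorted by the reader.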
Cassandra compaction appears to stall, node becomes partially unresponsive
Hi there,

Within our Cassandra cluster, we're observing, on occasion, one or two nodes at a time becoming partially unresponsive.

We're running 2.1.7 across the entire cluster.

nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response.

Each time we've seen this, the only thing that appears to resolve the issue is a restart of the Cassandra process; the restart does not appear to be clean, and requires one or more attempts (or a -9 on occasion).

There does not seem to be any pattern to which machines are affected; the nodes thus far have been different instances on different physical machines and on different racks.

Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)?

Thanks in advance,

Bryan
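The "stuck" condition described above (active count doesn't change, pending count doesn't decrease) can be turned into a simple check over periodic readings of the CompactionExecutor row from nodetool tpstats. The Python below is a hypothetical heuristic, not anything Cassandra ships: the PoolSample type, the looks_stalled function and the sample numbers are made up, and how the readings are collected is left out.

```python
from typing import List, NamedTuple


class PoolSample(NamedTuple):
    """One reading of a thread pool's Active, Pending and Completed counters."""
    active: int
    pending: int
    completed: int


def looks_stalled(samples: List[PoolSample]) -> bool:
    """Heuristic: threads stay busy, nothing completes, and the backlog does not shrink."""
    if len(samples) < 2:
        return False
    first, last = samples[0], samples[-1]
    always_busy = all(s.active > 0 for s in samples)
    no_progress = last.completed == first.completed
    backlog_not_shrinking = last.pending >= first.pending
    return always_busy and no_progress and backlog_not_shrinking


if __name__ == "__main__":
    stuck = [PoolSample(2, 40, 1000), PoolSample(2, 45, 1000), PoolSample(2, 45, 1000)]
    healthy = [PoolSample(2, 40, 1000), PoolSample(1, 30, 1012)]
    print(looks_stalled(stuck), looks_stalled(healthy))
```

A check like this only flags when to start collecting data; thread dumps of the Cassandra process taken while the condition holds are the kind of evidence that usually shows what the busy threads are actually doing.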
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
Hi Bryan,

How's GC behaving on these boxes?

On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote:

Hi there,

Within our Cassandra cluster, we're observing, on occasion, one or two nodes at a time becoming partially unresponsive.

We're running 2.1.7 across the entire cluster.

nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response.

Each time we've seen this, the only thing that appears to resolve the issue is a restart of the Cassandra process; the restart does not appear to be clean, and requires one or more attempts (or a -9 on occasion).

There does not seem to be any pattern to which machines are affected; the nodes thus far have been different instances on different physical machines and on different racks.

Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)?

Thanks in advance,

Bryan

--
Aiman Parvaiz
Lead Systems Architect
ai...@flipagram.com
cell: 213-300-6377
http://flipagram.com/apz
Re: Upgraded to Cassandra 2.2.0 nodes not seeing each other
I agree, Michael. I was generating things for it again. It looks like the SSL stack has changed.

I came from 2.1.6 to 2.2.0.

Thanks.

On Wed, Jul 22, 2015 at 5:45 PM, Michael Shuler mich...@pbandjelly.org wrote:

What version of Cassandra did you upgrade to 2.2.0 *from*? This would help with looking at config differences, changelogs, etc.

It seems you have some pretty clear SSL connection errors, according to the logs, which at least helps with seeing why the nodes can't talk to each other. I'm not terribly familiar with using SSL with Cassandra, but it seems clear that you have an incorrect server_encryption_options: cipher_suites: configuration.

--
Kind regards,
Michael

On 07/22/2015 06:33 PM, Carlos Scheidecker wrote:

Thanks for the reply, Michael!

Yes, I did follow the upgrade notes. I am running Ubuntu 14.04.2 LTS and kernel 3.13.0-57-generic on all machines.

I have 4 machines: .31, .32, .33 and .34.

If I run nodetool status from .34, I now see all the others as DN; the same happens if I log in from the others:

DN  192.168.1.31  ?       256  ?  1f8000f5-026c-42c7-8189-cf19fbede566  RAC1
DN  192.168.1.32  ?       256  ?  12478d45-3d5e-418b-a0dc-dba6d4307af3  RAC1
DN  192.168.1.33  ?       256  ?  994172b3-cd36-4558-a4b8-054cfac027f3  RAC1
UN  192.168.1.34  1.7 MB  256  ?  b66be1f3-bb4a-49bd-9835-5c8ee2a71e5c  RAC1

If I do a netstat -atn from .34 I get:

tcp  0  0  127.0.1.1:53        0.0.0.0:*          LISTEN
tcp  0  0  0.0.0.0:22          0.0.0.0:*          LISTEN
tcp  0  0  127.0.0.1:631       0.0.0.0:*          LISTEN
tcp  0  0  192.168.1.34:7001   0.0.0.0:*          LISTEN
tcp  0  0  127.0.0.1:7199      0.0.0.0:*          LISTEN
tcp  0  0  192.168.1.34:9160   0.0.0.0:*          LISTEN
tcp  0  0  127.0.0.1:59441     0.0.0.0:*          LISTEN
tcp  0  0  192.168.1.34:52951  192.168.1.31:7001  ESTABLISHED

In the logs I now have the following errors (/var/log/syslog.log):

WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.31
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.<init>(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0]
ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:316 - error writing to /192.168.1.31
java.lang.NullPointerException: null
    at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:323) [apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:285) [apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:219) [apache-cassandra-2.2.0.jar:2.2.0]
WARN [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
ERROR [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.33
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.<init>(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
Robert, thanks for these references! We're not using DTCS, so 9056 and 8243 seem out, but I'll take a look at 9577 (I also looked at the referenced thread on this list, which seems to have some interesting data).

On Wed, Jul 22, 2015 at 5:33 PM, Robert Coli rc...@eventbrite.com wrote:

On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote:

nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response.

I've heard other reports of compaction appearing to stall in 2.1.7... wondering if you're affected by any of these...

https://issues.apache.org/jira/browse/CASSANDRA-9577
or
https://issues.apache.org/jira/browse/CASSANDRA-9056
or
https://issues.apache.org/jira/browse/CASSANDRA-8243
(these should not be in 2.1.7)

=Rob
Issues with SSL encrption after updating to 2.2.0 from 2.1.6
Hello all,

After updating to Cassandra 2.2.0 from 2.1.6 I am having SSL issues.

My JVM is:

java version 1.8.0_45
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

Ubuntu 14.04.2 LTS is on all nodes; they are all the same.

Below are the encryption settings from cassandra.yaml on all nodes.

I am using the same keystore and truststore as I used before on 2.1.6.

# Enable or disable inter-node encryption
# Default settings are TLS v1, RSA 1024-bit keys (it is imperative that
# users generate their own keys) TLS_RSA_WITH_AES_128_CBC_SHA as the cipher
# suite for authentication, key exchange and encryption of the actual data transfers.
# Use the DHE/ECDHE ciphers if running in FIPS 140 compliant mode.
# NOTE: No custom encryption options are enabled at the moment
# The available internode options are : all, none, dc, rack
#
# If set to dc cassandra will encrypt the traffic between the DCs
# If set to rack cassandra will encrypt the traffic between the racks
#
# The passwords used in these options must match the passwords used when generating
# the keystore and truststore. For instructions on generating these files, see:
# http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore
#
server_encryption_options:
    internode_encryption: all
    keystore: /etc/cassandra/certs/node.keystore
    keystore_password: mypasswd
    truststore: /etc/cassandra/certs/global.truststore
    truststore_password: mypasswd
    # More advanced defaults below:
    # protocol: TLS
    # algorithm: SunX509
    # store_type: JKS
    cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
    require_client_auth: false

# enable or disable client/server encryption.

The nodes cannot talk to each other, as shown by the SSL errors below.

WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.31
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.<init>(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0]
ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:316 - error writing to /192.168.1.31
java.lang.NullPointerException: null
    at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:323) [apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:285) [apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:219) [apache-cassandra-2.2.0.jar:2.2.0]
WARN [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket
ERROR [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.33
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.<init>(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0]
ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.31
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na]
    at
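The WARN lines in this message list exactly the three AES_256 suites from the configured cipher_suites as filtered out. One common reason for that on a JVM of this vintage, stated here as an assumption and not something confirmed in the thread, is a JRE without the unlimited-strength JCE policy files, which limits AES keys to 128 bits. The Python sketch below just reproduces that filtering on the configured list; the function name and the max_aes_key_bits parameter are made up for illustration.

```python
# Cipher suites as configured in server_encryption_options in the message above.
CONFIGURED_SUITES = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def split_by_key_size(suites, max_aes_key_bits):
    """Split suites into (usable, filtered) given the largest AES key size the JVM allows."""
    usable, filtered = [], []
    for suite in suites:
        key_bits = 256 if "_AES_256_" in suite else 128
        (usable if key_bits <= max_aes_key_bits else filtered).append(suite)
    return usable, filtered


if __name__ == "__main__":
    usable, filtered = split_by_key_size(CONFIGURED_SUITES, max_aes_key_bits=128)
    print("usable:  ", usable)
    print("filtered:", filtered)
```

Note that the filtering by itself leaves three 128-bit suites available, so it does not explain the NullPointerException that follows in the log; that part is not accounted for by this sketch.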
Re: Cassandra - Spark - Flume: best architecture for log analytics.
Cassandra is not very good at massive/bulk reads if you need to retrieve and compute over a large amount of data on multiple machines using something like Spark or Hadoop (or you'll need to process the SSTables directly, which is not natively supported, so you'll have to hack your way through).

However, it's very good at storing and retrieving the data once it has been processed and sorted. That's why I would opt for solution 2), or for another solution that processes the data before inserting it into Cassandra and doesn't use Cassandra as a temporary store.

2015-07-23 2:04 GMT+02:00 Renato Perini renato.per...@gmail.com:

Problem: Log analytics.

Solutions:

1) Aggregating logs using Flume and storing the aggregations in Cassandra. Spark reads data from Cassandra, makes some computations and writes the results to distinct tables, still in Cassandra.
2) Aggregating logs using Flume to a sink, streaming data directly into Spark. Spark makes some computations and stores the results in Cassandra.
3) *** your solution ***

Which is the best workflow for this task? I would like to set up something flexible enough to allow me to use batch processing and realtime streaming without major fuss.

Thank you in advance.
Re: Schema questions for data structures with recently-modified access patterns
"No way to query recently-modified documents."

I don't follow why you say that. I mean, that was the point of the data model suggestion I proposed. Maybe you could clarify.

I also wanted to mention that the new materialized view feature of Cassandra 3.0 might handle this use case, including taking care of the delete, automatically.

-- Jack Krupansky

On Tue, Jul 21, 2015 at 12:37 PM, Robert Wille rwi...@fold3.com wrote:

The time series doesn’t provide the access pattern I’m looking for. No way to query recently-modified documents.

On Jul 21, 2015, at 9:13 AM, Carlos Alonso i...@mrcalonso.com wrote:

Hi Robert,

What about modelling it as a time series?

CREATE TABLE document (
    docId UUID,
    doc TEXT,
    last_modified TIMESTAMP,
    PRIMARY KEY (docId, last_modified)
) WITH CLUSTERING ORDER BY (last_modified DESC);

This way, the latest modification will always be the first record in the row, so accessing it should be as easy as:

SELECT * FROM document WHERE docId = <the docId> LIMIT 1;

And if you experience disk space issues due to very long rows, you can always expire old ones using TTL or a batch job. Tombstones will never be a problem in this case because, due to the specified clustering order, the latest modification will always be the first record in the row.

Hope it helps.

Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso

On 21 July 2015 at 05:59, Robert Wille rwi...@fold3.com wrote:

Data structures that have a recently-modified access pattern seem to be a poor fit for Cassandra. I’m wondering if any of you smart guys can provide suggestions.

For the sake of discussion, let's assume I have the following tables:

CREATE TABLE document (
    docId UUID,
    doc TEXT,
    last_modified TIMEUUID,
    PRIMARY KEY ((docid))
);

CREATE TABLE doc_by_last_modified (
    date TEXT,
    last_modified TIMEUUID,
    docId UUID,
    PRIMARY KEY ((date), last_modified)
);

When I update a document, I retrieve its last_modified time, delete the current record from doc_by_last_modified, and add a new one. Unfortunately, if you’d like each document to appear at most once in the doc_by_last_modified table, then this doesn’t work so well.

Documents can get into the doc_by_last_modified table multiple times if there is concurrent access, or if there is a consistency issue.

Any thoughts out there on how to efficiently provide recently-modified access to a table? This problem exists for many types of data structures, not just recently-modified. Any ordered data structure that can be dynamically reordered suffers from the same problems. As I’ve been doing schema design, this pattern keeps recurring. A nice way to address this problem has lots of applications.

Thanks in advance for your thoughts

Robert
Upgraded to Cassandra 2.2.0 nodes not seeing each other
All,

I have a 4-node Cassandra system running on 4 Ubuntu boxes.

After updating to Cassandra 2.2.0 and keeping the same cassandra.yaml file, the nodes cannot see each other.

When I run nodetool status, it only reports as up the machine where I issued the command.

In other words, the machines can no longer communicate with each other. nodetool status behaves the same on each machine.

I am trying to debug this; hopefully it is only something in the configuration that has changed.

Any ideas?

Thanks.

C.
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
Hi Aiman,

We previously had issues with GC, but since upgrading to 2.1.7 things seem a lot healthier.

We collect GC statistics through collectd via the garbage collector MBean. ParNew GCs report sub-500ms collection time on average (I believe accumulated per minute?) and CMS peaks at about 300ms collection time when it runs.

On Wed, Jul 22, 2015 at 3:22 PM, Aiman Parvaiz ai...@flipagram.com wrote:

Hi Bryan,

How's GC behaving on these boxes?

On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote:

Hi there,

Within our Cassandra cluster, we're observing, on occasion, one or two nodes at a time becoming partially unresponsive.

We're running 2.1.7 across the entire cluster.

nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response.

Each time we've seen this, the only thing that appears to resolve the issue is a restart of the Cassandra process; the restart does not appear to be clean, and requires one or more attempts (or a -9 on occasion).

There does not seem to be any pattern to which machines are affected; the nodes thus far have been different instances on different physical machines and on different racks.

Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)?

Thanks in advance,

Bryan

--
Aiman Parvaiz
Lead Systems Architect
ai...@flipagram.com
cell: 213-300-6377
http://flipagram.com/apz
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
Aiman,

Your post made me look back at our data a bit. The most recent occurrence of this incident was not preceded by any abnormal GC activity; however, the previous occurrence (which took place a few days ago) did correspond to a massive, order-of-magnitude increase in both ParNew and CMS collection times which lasted ~17 hours.

Was there something in particular that links GC to these stalls? At this point in time, we cannot identify any particular reason for either that GC spike or the subsequent apparent compaction stall, although it did not seem to have any effect on our usage of the cluster.

On Wed, Jul 22, 2015 at 3:35 PM, Bryan Cheng br...@blockcypher.com wrote:

Hi Aiman,

We previously had issues with GC, but since upgrading to 2.1.7 things seem a lot healthier.

We collect GC statistics through collectd via the garbage collector MBean. ParNew GCs report sub-500ms collection time on average (I believe accumulated per minute?) and CMS peaks at about 300ms collection time when it runs.

On Wed, Jul 22, 2015 at 3:22 PM, Aiman Parvaiz ai...@flipagram.com wrote:

Hi Bryan,

How's GC behaving on these boxes?

On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote:

Hi there,

Within our Cassandra cluster, we're observing, on occasion, one or two nodes at a time becoming partially unresponsive.

We're running 2.1.7 across the entire cluster.

nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response.

Each time we've seen this, the only thing that appears to resolve the issue is a restart of the Cassandra process; the restart does not appear to be clean, and requires one or more attempts (or a -9 on occasion).

There does not seem to be any pattern to which machines are affected; the nodes thus far have been different instances on different physical machines and on different racks.

Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)?

Thanks in advance,

Bryan

--
Aiman Parvaiz
Lead Systems Architect
ai...@flipagram.com
cell: 213-300-6377
http://flipagram.com/apz
Re: Upgraded to Cassandra 2.2.0 nodes not seeing each other
What version of Cassandra did you upgrade to 2.2.0 *from*? This would help with looking at config differences, changelogs, etc. It seems you have some pretty clear SSL connection errors, according to the logs, which at least helps with seeing why the nodes can't talk to each other. I'm not terribly familiar with using SSL with Cassandra, but it seems clear that you have an incorrect server_encryption_options: cipher_suites: configuration. -- Kind regards, Michael On 07/22/2015 06:33 PM, Carlos Scheidecker wrote: Thanks for the reply, Michael! Yes, I did followed the upgrade nodes. I am running Ubuntu Ubuntu 14.04.2 LTS on all and kernel 3.13.0-57-generic on all. I have 4 machines: .31, .32, .33 and .34. If I run nodetool status from .34 I now see all the others as DN the same happens if I log in from the others: DN 192.168.1.31 ? 256 ? 1f8000f5-026c-42c7-8189-cf19fbede566 RAC1 DN 192.168.1.32 ? 256 ? 12478d45-3d5e-418b-a0dc-dba6d4307af3 RAC1 DN 192.168.1.33 ? 256 ? 994172b3-cd36-4558-a4b8-054cfac027f3 RAC1 UN 192.168.1.34 1.7 MB 256 ? 
b66be1f3-bb4a-49bd-9835-5c8ee2a71e5c RAC1 If I do a netstat -atn from .34 I get: tcp 0 0 127.0.1.1:53 0.0.0.0:* LISTEN tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:7001 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:7199 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:9160 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:59441 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:52951 192.168.1.31:7001 ESTABLISHED In the logs I now have the following errors (/var/log/syslog.log): WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.31 java.lang.NullPointerException: null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.init(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0] ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:316 - error writing to /192.168.1.31 java.lang.NullPointerException: null at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:323)
[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:285) [apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:219) [apache-cassandra-2.2.0.jar:2.2.0] WARN [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket ERROR [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for /192.168.1.33 java.lang.NullPointerException: null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.init(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0] ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error
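The WARN lines above say that configured cipher suites are being filtered out because the socket does not support them. A minimal sketch of that kind of filtering follows, to show how a node can end up with no usable suite at all. The function name and the "supported" list are hypothetical and are not taken from Cassandra's SSLFactory. One commonly reported cause, stated here as an assumption rather than a diagnosis of this cluster, is a JVM without the unlimited-strength JCE policy files, which leaves the AES_256 suites unavailable.

```python
def filter_cipher_suites(configured, supported):
    """Split configured suites into (usable, filtered_out), keeping order."""
    supported_set = set(supported)
    usable = [s for s in configured if s in supported_set]
    filtered_out = [s for s in configured if s not in supported_set]
    return usable, filtered_out


# The three suites named in the WARN lines of the log.
CONFIGURED = [
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]

# Hypothetical: a JVM that only offers 128-bit AES suites.
SUPPORTED = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
]

usable, filtered_out = filter_cipher_suites(CONFIGURED, SUPPORTED)
if not usable:
    print("No configured cipher suite is usable; encrypted connections cannot be set up.")
```

If every entry under cipher_suites ends up filtered, comparing that list with what the JVM on each node actually offers is a reasonable first check.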
Re: Upgraded to Cassandra 2.2.0 nodes not seeing each other
Thanks for the reply, Michael! Yes, I did follow the upgrade notes. I am running Ubuntu 14.04.2 LTS and kernel 3.13.0-57-generic on all machines. I have 4 machines: .31, .32, .33 and .34. If I run nodetool status from .34 I now see all the others as DN; the same happens if I log in from the others: DN 192.168.1.31 ? 256 ? 1f8000f5-026c-42c7-8189-cf19fbede566 RAC1 DN 192.168.1.32 ? 256 ? 12478d45-3d5e-418b-a0dc-dba6d4307af3 RAC1 DN 192.168.1.33 ? 256 ? 994172b3-cd36-4558-a4b8-054cfac027f3 RAC1 UN 192.168.1.34 1.7 MB 256 ? b66be1f3-bb4a-49bd-9835-5c8ee2a71e5c RAC1 If I do a netstat -atn from .34 I get: tcp 0 0 127.0.1.1:53 0.0.0.0:* LISTEN tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:7001 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:7199 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:9160 0.0.0.0:* LISTEN tcp 0 0 127.0.0.1:59441 0.0.0.0:* LISTEN tcp 0 0 192.168.1.34:52951 192.168.1.31:7001 ESTABLISHED In the logs I now have the following errors (/var/log/syslog.log): WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:48,764 OutboundTcpConnection.java:229 - error processing a message intended for / 192.168.1.31 java.lang.NullPointerException: null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.init(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0] ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22
17:29:48,764 OutboundTcpConnection.java:316 - error writing to /192.168.1.31 java.lang.NullPointerException: null at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:323) [apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:285) [apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:219) [apache-cassandra-2.2.0.jar:2.2.0] WARN [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket WARN [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 SSLFactory.java:163 - Filtering out TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket ERROR [MessagingService-Outgoing-/192.168.1.33] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for / 192.168.1.33 java.lang.NullPointerException: null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.init(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0] ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:49,764 OutboundTcpConnection.java:229 - error processing a message intended for / 192.168.1.31 java.lang.NullPointerException: null at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:213) ~[guava-16.0.jar:na] at 
org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.init(BufferedDataOutputStreamPlus.java:74) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.connect(OutboundTcpConnection.java:404) ~[apache-cassandra-2.2.0.jar:2.2.0] at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:218) ~[apache-cassandra-2.2.0.jar:2.2.0] ERROR [MessagingService-Outgoing-/192.168.1.31] 2015-07-22 17:29:50,763 OutboundTcpConnection.java:316 - error writing to /192.168.1.31 java.lang.NullPointerException: null at org.apache.cassandra.net.OutboundTcpConnection.writeInternal(OutboundTcpConnection.java:323)
RE: Schema questions for data structures with recently-modified access patterns
I believe what he really wants is to be able to search for the x most recently modified documents, i.e. without specifying the docID. I don’t believe there is a ‘nice’ way of doing this in Cassandra by itself, given it really favours key-value storage. Even having the date as the partition key is usually not recommended because it means all writes on a given date will be hitting one node. Perhaps Solr integration is the way to go for this access pattern? Alec Collier From: Jack Krupansky [mailto:jack.krupan...@gmail.com] Sent: Thursday, 23 July 2015 8:20 AM To: user@cassandra.apache.org Subject: Re: Schema questions for data structures with recently-modified access patterns "No way to query recently-modified documents." I don't follow why you say that. I mean, that was the point of the data model suggestion I proposed. Maybe you could clarify. I also wanted to mention that the new materialized view feature of Cassandra 3.0 might handle this use case, including taking care of the delete, automatically. -- Jack Krupansky On Tue, Jul 21, 2015 at 12:37 PM, Robert Wille rwi...@fold3.com wrote: The time series doesn’t provide the access pattern I’m looking for. No way to query recently-modified documents. On Jul 21, 2015, at 9:13 AM, Carlos Alonso i...@mrcalonso.com wrote: Hi Robert, What about modelling it as a time series? CREATE TABLE document ( docId UUID, doc TEXT, last_modified TIMESTAMP, PRIMARY KEY (docId, last_modified) ) WITH CLUSTERING ORDER BY (last_modified DESC); This way, the latest modification will always be the first record in the row, therefore accessing it should be as easy as: SELECT * FROM document WHERE docId = <the docId> LIMIT 1; And, if you experience disk space issues due to very long rows, then you can always expire old ones using TTL or a batch job.
Tombstones will never be a problem in this case as, due to the specified clustering order, the latest modification will always be the first record in the row. Hope it helps. Carlos Alonso | Software Engineer | @calonso https://twitter.com/calonso On 21 July 2015 at 05:59, Robert Wille rwi...@fold3.com wrote: Data structures that have a recently-modified access pattern seem to be a poor fit for Cassandra. I’m wondering if any of you smart guys can provide suggestions. For the sake of discussion, let's assume I have the following tables: CREATE TABLE document ( docId UUID, doc TEXT, last_modified TIMEUUID, PRIMARY KEY (docId) ); CREATE TABLE doc_by_last_modified ( date TEXT, last_modified TIMEUUID, docId UUID, PRIMARY KEY ((date), last_modified) ); When I update a document, I retrieve its last_modified time, delete the current record from doc_by_last_modified, and add a new one. Unfortunately, if you’d like each document to appear at most once in the doc_by_last_modified table, then this doesn’t work so well. Documents can get into the doc_by_last_modified table multiple times if there is concurrent access, or if there is a consistency issue. Any thoughts out there on how to efficiently provide recently-modified access to a table? This problem exists for many types of data structures, not just recently-modified. Any ordered data structure that can be dynamically reordered suffers from the same problems. As I’ve been doing schema design, this pattern keeps recurring. A nice way to address this problem would have lots of applications. Thanks in advance for your thoughts. Robert
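One way to live with the duplicate entries Robert describes is to tolerate stale rows in doc_by_last_modified and discard them at read time, by checking each index row against the document's current last_modified. The sketch below models the two tables with in-memory Python structures so the logic can be followed without a cluster. It is an illustration under stated assumptions, not something proposed in the thread: the class and method names are hypothetical, and integers stand in for TIMEUUID values.

```python
class RecentDocs:
    """In-memory stand-in for the `document` and `doc_by_last_modified` tables."""

    def __init__(self):
        self.current = {}     # docId -> current last_modified (the `document` table)
        self.index_rows = []  # (date, last_modified, docId) rows (the index table)

    def update(self, doc_id, date, last_modified):
        # Write the new version and append an index row.
        # Older index rows for the same document are deliberately left in place.
        self.current[doc_id] = last_modified
        self.index_rows.append((date, last_modified, doc_id))

    def recently_modified(self, date, limit):
        # Scan one date partition, newest first, and keep only rows that
        # still match the document's current last_modified.
        rows = sorted(
            (row for row in self.index_rows if row[0] == date),
            key=lambda row: row[1],
            reverse=True,
        )
        result = []
        for _, modified, doc_id in rows:
            if self.current.get(doc_id) == modified:
                result.append(doc_id)
                if len(result) == limit:
                    break
        return result


store = RecentDocs()
store.update("a", "2015-07-21", 1)
store.update("b", "2015-07-21", 2)
store.update("a", "2015-07-21", 3)  # leaves a stale index row for "a" behind
print(store.recently_modified("2015-07-21", 10))
```

The trade-off is an extra lookup per candidate row on reads, and stale rows that accumulate until they are expired (for example with a TTL) or cleaned up by a batch job.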
Upgraded to Cassandra 2.2.0 nodes not seeing each other
All, I have a 4 node Cassandra system running on 4 Ubuntu boxes. After updating to Cassandra 2.2.0 and keeping the same cassandra.yaml file, the nodes cannot see each other. When I run nodetool status, it reports as up only the machine where I issued the command. In other words, the machines can no longer communicate with each other. nodetool status behaves the same on each machine. I am trying to debug that; hopefully it is only something in the configuration that has changed. Any ideas? Thanks. C.
Re: Upgraded to Cassandra 2.2.0 nodes not seeing each other
On 07/22/2015 04:45 PM, Carlos Scheidecker wrote: I have a 4 node Cassandra system running on 4 Ubuntu boxes. After updating to Cassandra 2.2.0 and keeping the same cassandra.yaml file, the nodes cannot see each other. What version did you upgrade from? Usually, when upgrading, it is a good idea to start with the default cassandra.yaml from the new version (2.2.0 in your case) and carry over the necessary items from your old version, e.g. num_tokens, initial_token, listen_address, broadcast_address, etc. You are perhaps missing some sort of default setting that 2.2.0 is looking for? When I run nodetool status, it reports as up only the machine where I issued the command. In other words, the machines can no longer communicate with each other. nodetool status behaves the same on each machine. I am trying to debug that; hopefully it is only something in the configuration that has changed. Any ideas? Anything helpful in the system.log on each of your nodes? Did you follow all the upgrade notes from your previous release to 2.2.0? https://github.com/apache/cassandra/blob/cassandra-2.2.0/NEWS.txt -- Kind regards, Michael
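To act on the suggestion of starting from the new default cassandra.yaml, it can help to list which top-level settings exist in one file but not the other. The following is a rough sketch that only looks at uncommented, unindented `key:` lines; it is not a real YAML parser. The two sample texts are made up for illustration, and `some_new_option` and `some_removed_option` are hypothetical names, not actual Cassandra settings.

```python
import re

KEY_PATTERN = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):")


def top_level_keys(yaml_text):
    """Collect uncommented, unindented keys from cassandra.yaml-style text."""
    keys = set()
    for line in yaml_text.splitlines():
        match = KEY_PATTERN.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def compare_configs(old_text, new_default_text):
    """Return (keys only in the new default, keys only in the old file)."""
    old_keys = top_level_keys(old_text)
    new_keys = top_level_keys(new_default_text)
    return sorted(new_keys - old_keys), sorted(old_keys - new_keys)


# Made-up excerpts; in practice, read the two files from disk instead.
OLD_YAML = """\
num_tokens: 256
listen_address: 192.168.1.34
some_removed_option: true
# commented_out: 1
"""

NEW_DEFAULT_YAML = """\
num_tokens: 256
listen_address: localhost
some_new_option: 10
"""

missing_in_old, unknown_in_new = compare_configs(OLD_YAML, NEW_DEFAULT_YAML)
print("Only in the new default:", missing_in_old)
print("Only in the old file:", unknown_in_new)
```

Nested sections such as server_encryption_options show up only as a single top-level key here, so their contents still need to be compared by hand.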
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
I faced something similar in the past, and the reason the nodes became intermittently unresponsive was long GC pauses. That's why I wanted to bring this to your attention, in case GC pauses are a potential cause. Sent from my iPhone On Jul 22, 2015, at 4:32 PM, Bryan Cheng br...@blockcypher.com wrote: Aiman, Your post made me look back at our data a bit. The most recent occurrence of this incident was not preceded by any abnormal GC activity; however, the previous occurrence (which took place a few days ago) did correspond to a massive, order-of-magnitude increase in both ParNew and CMS collection times which lasted ~17 hours. Was there something in particular that links GC to these stalls? At this point in time, we cannot identify any particular reason for either that GC spike or the subsequent apparent compaction stall, although it did not seem to have any effect on our usage of the cluster. On Wed, Jul 22, 2015 at 3:35 PM, Bryan Cheng br...@blockcypher.com wrote: Hi Aiman, We previously had issues with GC, but since upgrading to 2.1.7 things seem a lot healthier. We collect GC statistics through collectd via the garbage collector mbean; ParNew GCs report sub-500ms collection time on average (I believe accumulated per minute?) and CMS peaks at about 300ms collection time when it runs. On Wed, Jul 22, 2015 at 3:22 PM, Aiman Parvaiz ai...@flipagram.com wrote: Hi Bryan How's GC behaving on these boxes? On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote: Hi there, Within our Cassandra cluster, we're observing, on occasion, one or two nodes at a time becoming partially unresponsive. We're running 2.1.7 across the entire cluster. nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%.
One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response. Each time we've seen this, the only thing that appears to resolve the issue is a restart of the Cassandra process; the restart does not appear to be clean, and requires one or more attempts (or a -9 on occasion). There does not seem to be any pattern to what machines are affected; the nodes thus far have been different instances on different physical machines and on different racks. Has anyone seen this before? Alternatively, when this happens again, what data can we collect that would help with the debugging process (in addition to tpstats)? Thanks in advance, Bryan -- Aiman Parvaiz Lead Systems Architect ai...@flipagram.com cell: 213-300-6377 http://flipagram.com/apz
Cassandra - Spark - Flume: best architecture for log analytics.
Problem: Log analytics. Solutions: 1) Aggregating logs using Flume and storing the aggregations in Cassandra. Spark reads data from Cassandra, makes some computations and writes the results to distinct tables, still in Cassandra. 2) Aggregating logs using Flume to a sink, streaming data directly into Spark. Spark makes some computations and stores the results in Cassandra. 3) *** your solution *** Which is the best workflow for this task? I would like to set up something flexible enough to allow me to use batch processing and realtime streaming without major fuss. Thank you in advance.
Re: Cassandra compaction appears to stall, node becomes partially unresponsive
On Wed, Jul 22, 2015 at 2:55 PM, Bryan Cheng br...@blockcypher.com wrote: nodetool still reports the node as being healthy, and it does respond to some local queries; however, the CPU is pegged at 100%. One common thread (heh) each time this happens is that there always seems to be one or more compaction threads running (via nodetool tpstats), and some appear to be stuck (active count doesn't change, pending count doesn't decrease). A request for compactionstats hangs with no response. I've heard other reports of compaction appearing to stall in 2.1.7... wondering if you're affected by any of these... https://issues.apache.org/jira/browse/CASSANDRA-9577 or https://issues.apache.org/jira/browse/CASSANDRA-9056 or https://issues.apache.org/jira/browse/CASSANDRA-8243 (these should not be in 2.1.7) =Rob
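The "stuck" symptom described above (active count unchanged, pending count not decreasing) can be checked mechanically by comparing two tpstats samples taken some time apart. The sketch below assumes the usual column layout of nodetool tpstats output (pool name, active, pending, completed, ...); the two samples are invented for illustration and are not from Bryan's cluster, and the function names are hypothetical.

```python
def parse_tpstats(text):
    """Map pool name -> (active, pending, completed) from tpstats-style text."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have a pool name followed by numeric columns;
        # the header and the dropped-message section do not.
        if len(parts) >= 4 and all(p.isdigit() for p in parts[1:4]):
            pools[parts[0]] = (int(parts[1]), int(parts[2]), int(parts[3]))
    return pools


def stuck_pools(before_text, after_text):
    """Pools that are still active but completed no work between samples."""
    before = parse_tpstats(before_text)
    after = parse_tpstats(after_text)
    stuck = []
    for name, (active, pending, completed) in after.items():
        if name not in before:
            continue
        _, old_pending, old_completed = before[name]
        if active > 0 and completed == old_completed and pending >= old_pending:
            stuck.append(name)
    return sorted(stuck)


# Invented samples, taken (say) a few minutes apart.
BEFORE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CompactionExecutor                2         7           1520         0                 0
MutationStage                     1         0         904311         0                 0
"""

AFTER = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CompactionExecutor                2         9           1520         0                 0
MutationStage                     0         0         905100         0                 0
"""

print(stuck_pools(BEFORE, AFTER))
```

Capturing such samples at a fixed interval while the node is misbehaving gives a timeline that can be lined up against the GC statistics mentioned earlier in the thread.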