Re: How to interpret some GC logs
this should help you: https://blogs.oracle.com/poonam/entry/understanding_cms_gc_logs

Best Regards,
Sebastian Martinka

From: Michał Łowicki [mailto:mlowi...@gmail.com]
Sent: Monday, June 1, 2015 11:47
To: user@cassandra.apache.org
Subject: How to interpret some GC logs

Hi,

Normally I get logs like:

    2015-06-01T09:19:50.610+: 4736.314: [GC 6505591K->4895804K(8178944K), 0.0494560 secs]

which is fine and understandable, but occasionally I see something like:

    2015-06-01T09:19:50.661+: 4736.365: [GC 4901600K(8178944K), 0.0049600 secs]

How should I interpret it? Does it simply omit the part before the ->, i.e. the memory occupied before the GC cycle?

--
BR,
Michał Łowicki
Re: How to interpret some GC logs
On Tue, Jun 2, 2015 at 9:06 AM, Sebastian Martinka sebastian.marti...@mercateo.com wrote:

> this should help you: https://blogs.oracle.com/poonam/entry/understanding_cms_gc_logs

I don't see that format described there. The GC-related options passed are:
-XX:+PrintGCDateStamps -Xloggc:/var/log/cassandra/gc.log

> Best Regards,
> Sebastian Martinka
>
> From: Michał Łowicki [mailto:mlowi...@gmail.com]
> Sent: Monday, June 1, 2015 11:47
> To: user@cassandra.apache.org
> Subject: How to interpret some GC logs
>
> Hi,
>
> Normally I get logs like:
>
>     2015-06-01T09:19:50.610+: 4736.314: [GC 6505591K->4895804K(8178944K), 0.0494560 secs]
>
> which is fine and understandable, but occasionally I see something like:
>
>     2015-06-01T09:19:50.661+: 4736.365: [GC 4901600K(8178944K), 0.0049600 secs]
>
> How should I interpret it? Does it simply omit the part before the ->, i.e. the memory occupied before the GC cycle?

--
BR,
Michał Łowicki
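A hedged note for the archive: with only -XX:+PrintGCDateStamps set, the CMS collector's brief stop-the-world pauses (initial mark and remark) appear to be logged with just a single figure, the current heap occupancy, which matches the second line above. Adding -XX:+PrintGCDetails should make each pause self-describing, along these lines:

    -XX:+PrintGCDateStamps
    -XX:+PrintGCDetails    # pauses become labelled, e.g. "[GC [1 CMS-initial-mark: ...]"
    -Xloggc:/var/log/cassandra/gc.log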
Re: ERROR Compaction Interrupted
Looks like it is gracefully handled in the code, should be okay.

    if (ci.isStopRequested())
        throw new CompactionInterruptedException(ci.getCompactionInfo());

https://github.com/apache/cassandra/blob/cassandra-2.0.9/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L156-L157

jason

On Tue, Jun 2, 2015 at 2:31 AM, Aiman Parvaiz ai...@flipagram.com wrote:

Hi everyone,

I am running C* 2.0.9 without vnodes and RF=2. Recently, while repairing and rebalancing the cluster, I encountered one instance of this (just one, on one node):

May 30 19:31:09 cass-prod4.localdomain cassandra: 2015-05-30 19:31:09,991 ERROR CompactionExecutor:55472 CassandraDaemon.uncaughtException - Exception in thread Thread[CompactionExecutor:55472,1,main]
May 30 19:31:09 cass-prod4.localdomain org.apache.cassandra.db.compaction.CompactionInterruptedException: Compaction interrupted: Compaction@1b0b43e5-bef5-34f9-af08-405a7b58c71f(flipagram, home_feed_entry_index, 218409618/450008574)bytes
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:157)
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
May 30 19:31:09 cass-prod4.localdomain at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.FutureTask.run(FutureTask.java:262)
May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
May 30 19:31:09 cass-prod4.localdomain at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
May 30 19:31:09 cass-prod4.localdomain at java.lang.Thread.run(Thread.java:745)

After looking a bit through the mailing list archives etc., I understand that this might mean data corruption, and I plan to take the node offline and replace it with a new one, but I still wanted to see if anyone can throw some light here in case I am missing something. Also, if this is a case of a corrupted SSTable, should I be concerned about it getting replicated, and should I take care of it on the replicas too?

Thanks
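A hedged note for the archive: this exception is how Cassandra unwinds a compaction that was asked to stop, typically because another operation (cleanup, scrub, or a similar SSTable-level task) requested it, so by itself it does not indicate corruption. Two standard commands for inspecting and stopping compactions:

    # show in-progress compactions
    nodetool compactionstats
    # ask running compactions to stop; they exit through the
    # CompactionInterruptedException path quoted above
    nodetool stop COMPACTION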
Re: JSON Cassandra 2.2 - insert syntax
Well, your column is not called address, it's called addresses. It's your type that is called address.

On Tue, Jun 2, 2015 at 4:39 AM, Michel Blase mblas...@gmail.com wrote:

Zach, this is embarrassing. You were right, I was running 2.1. Shame on me! But now I'm getting the error:

    InvalidRequest: code=2200 [Invalid query] message="JSON values map contains unrecognized column: address"

Any idea? This is the sequence of commands that I'm running:

    CREATE KEYSPACE json WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
    USE json;
    CREATE TYPE address (street text, city text, zip_code int, phones set<text>);
    CREATE TABLE users (id int PRIMARY KEY, name text, addresses map<text, frozen<address>>);
    INSERT INTO users JSON '{"id": 123, "name": "jbellis", "address": {"home": {"street": "123 Cassandra Dr", "city": "Austin", "zip_code": 78747, "phones": ["2101234567"]}}}';

Consider that I'm running a just-downloaded C* 2.2 instance (I'm on a Mac). Thanks, and sorry for the waste of time before!

On Mon, Jun 1, 2015 at 7:10 PM, Zach Kurey zach.ku...@datastax.com wrote:

Hi Michel,

My only other guess is that you actually are running Cassandra 2.1, since that's the exact error I get if I try to execute a JSON statement against a version earlier than 2.2.

On Mon, Jun 1, 2015 at 6:13 PM, Michel Blase mblas...@gmail.com wrote:

Thanks Zach, tried that but I get the same error:

    SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message="line 1:24 mismatched input '{"id": 123,"name": "jbellis","address": {"home": {"street": "123 Cassandra Dr","city": "Austin","zip_code": 78747,"phones": ["2101234567"]}}}' expecting ')' (INSERT INTO users JSON ['{"id": 123,"name": "jbellis","address": {"home": {"street": "123 Cassandra Dr","city": "Austin","zip_code": 78747,"phones": ["2101234567"]}}]}';)">

On Mon, Jun 1, 2015 at 6:12 PM, Zach Kurey zach.ku...@datastax.com wrote:

Looks like you have your use of single vs. double quotes inverted. What you want is:

    INSERT INTO users JSON '{"id": 123, "name": "jbellis", "address": {"home": {"street": "123 Cassandra Dr", "city": "Austin", "zip_code": 78747, "phones": ["2101234567"]}}}';

HTH

On Mon, Jun 1, 2015 at 6:03 PM, Michel Blase mblas...@gmail.com wrote:

Hi all,

I'm trying to test the new JSON functionality in C* 2.2. I'm using this example: https://issues.apache.org/jira/browse/CASSANDRA-7970

I believe there is a typo in the CREATE TABLE statement there, which requires frozen:

    CREATE TABLE users (id int PRIMARY KEY, name text, addresses map<text, frozen<address>>);

but my real problem is the insert syntax. I've found the CQL 2.2 documentation and my best guess is this:

    INSERT INTO users JSON {'id': 123,'name': 'jbellis','address': {'home': {'street': '123 Cassandra Dr','city': 'Austin','zip_code': 78747,'phones': [2101234567]}}};

but I get the error:

    SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message="line 1:23 mismatched input '{'id': 123,'name': 'jbellis','address': {'home': {'street': '123 Cassandra Dr','city': 'Austin','zip_code': 78747,'phones': [2101234567]}}}' expecting ')' (INSERT INTO users JSON [{'id': 123,'name': 'jbellis','address': {'home': {'street': '123 Cassandra Dr','city': 'Austin','zip_code': 78747,'phones': [2101234567]}}]};)">

Any idea?

Thanks,
Michael
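A hedged note for the archive: combining Zach's quoting fix with the rename that resolves the final error (the top-level JSON key must be the column name, addresses, not the type name), the full sequence below should run on a 2.2 build:

    CREATE TYPE address (street text, city text, zip_code int, phones set<text>);
    CREATE TABLE users (id int PRIMARY KEY, name text, addresses map<text, frozen<address>>);
    INSERT INTO users JSON '{"id": 123, "name": "jbellis",
        "addresses": {"home": {"street": "123 Cassandra Dr", "city": "Austin",
                               "zip_code": 78747, "phones": ["2101234567"]}}}';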
Re: How to interpret some GC logs
On Mon, Jun 1, 2015 at 7:25 PM, Jason Wee peich...@gmail.com wrote:

> can you tell what jvm is that?

root@db2:~# java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

> jason
>
> On Mon, Jun 1, 2015 at 5:46 PM, Michał Łowicki mlowi...@gmail.com wrote:
>
> Hi,
>
> Normally I get logs like:
>
>     2015-06-01T09:19:50.610+: 4736.314: [GC 6505591K->4895804K(8178944K), 0.0494560 secs]
>
> which is fine and understandable, but occasionally I see something like:
>
>     2015-06-01T09:19:50.661+: 4736.365: [GC 4901600K(8178944K), 0.0049600 secs]
>
> How should I interpret it? Does it simply omit the part before the ->, i.e. the memory occupied before the GC cycle?
>
> --
> BR,
> Michał Łowicki

--
BR,
Michał Łowicki
Re: Minor Compactions Not Triggered
On Mon, Jun 1, 2015 at 11:25 AM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

> As per the algorithm shared in CASSANDRA-6654, I understand that the tombstone_threshold property only comes into the picture if you have expiring columns, and that it won't have any effect if you have manually deleted rows in the CF. Is my understanding correct? In your view, what would be the expected behavior of the following steps?
>
> 1. I inserted x rows.
> 2. I deleted x rows.
> 3. Ran major compaction to make sure that one big SSTable contains all tombstones.
> 4. Waited past the gc_grace period to see whether that big SSTable formed by the major compaction gets compacted on its own, without needing any other SSTable.

That's a good question, and I don't actually know the answer. If you aren't generating new SSTables in the CF via writes and flushes, I would doubt any background process notices it's expired and re-compacts it.

Have you considered asking this question in the #cassandra IRC channel on freenode?

=Rob
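A hedged pointer for anyone reproducing this: single-SSTable tombstone compaction is driven by the compaction subproperties below (names per the 2.0/2.1 documentation; the keyspace/table are illustrative). Where available, unchecked_tombstone_compaction (CASSANDRA-6563, around 2.0.9, if memory serves) relaxes the pre-checks that often stop a lone big SSTable from being picked up.

    ALTER TABLE myks.mycf WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'tombstone_threshold': '0.2',              -- droppable-tombstone ratio that makes one SSTable eligible
        'tombstone_compaction_interval': '86400',  -- minimum SSTable age, in seconds
        'unchecked_tombstone_compaction': 'true'   -- skip the overlap pre-check, where supported
    };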
Re: Spark SQL JDBC Server + DSE
If you want a web-based notebook-style approach (similar to IPython), check out https://github.com/apache/incubator-zeppelin and https://github.com/apache/incubator-zeppelin/pull/86

Bonus: free pretty graphs!

On 1 June 2015 at 11:41, Sebastian Estevez sebastian.este...@datastax.com wrote:

Have you looked at job server?

https://github.com/spark-jobserver/spark-jobserver
https://www.youtube.com/watch?v=8k9ToZ4m6os
http://planetcassandra.org/blog/post/fast-spark-queries-on-in-memory-datasets/

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Mon, Jun 1, 2015 at 8:13 AM, Mohammed Guller moham...@glassbeam.com wrote:

Brian,

We haven't open-sourced the REST server, but we are not opposed to doing it. We just need to carve out some time to clean up the code and separate it from all the other stuff that we do in that REST server. Will try to do it in the next few weeks. If you need it sooner, let me know.

I did consider the option of writing our own Spark SQL JDBC driver for C*, but it is lower on the priority list right now.

Mohammed

From: Brian O'Neill [mailto:boneil...@gmail.com] On Behalf Of Brian O'Neill
Sent: Saturday, May 30, 2015 3:12 AM
To: user@cassandra.apache.org
Subject: Re: Spark SQL JDBC Server + DSE

Any chance you open-sourced, or could open-source, the REST server? ;)

Thinking about it, it doesn't feel like it would be that hard to write a Spark SQL JDBC driver against Cassandra, akin to what they have for Hive:
https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server

I wouldn't mind collaborating on that, if you are headed in that direction. (and then I could write the REST server on top of that)

LMK,
-brian

---
Brian O'Neill
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42
From: Mohammed Guller moham...@glassbeam.com
Reply-To: user@cassandra.apache.org
Date: Friday, May 29, 2015 at 2:15 PM
To: user@cassandra.apache.org
Subject: RE: Spark SQL JDBC Server + DSE

Brian,

I implemented a similar REST server last year and it works great. Now we have a requirement to support JDBC connectivity in addition to the REST API. We want to allow users to use tools like Tableau to connect to C* through the Spark SQL JDBC/Thrift server.

Mohammed

From: Brian O'Neill [mailto:boneil...@gmail.com] On Behalf Of Brian O'Neill
Sent: Thursday, May 28, 2015 6:16 PM
To: user@cassandra.apache.org
Subject: Re: Spark SQL JDBC Server + DSE

Mohammed,

This doesn't really answer your question, but I'm working on a new REST server that allows people to submit SQL queries over REST, which get executed via Spark SQL. Based on what I started here:
http://brianoneill.blogspot.com/2015/05/spark-sql-against-cassandra-example.html

I assume you need JDBC connectivity specifically?

-brian

---
Brian O'Neill
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42
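A hedged sketch of the Thrift-server direction discussed above: Spark's bundled JDBC/Thrift server can be pointed at Cassandra by putting the spark-cassandra-connector on its classpath and setting the connection host. The artifact coordinates and addresses below are illustrative assumptions, not something given in this thread:

    # start Spark's bundled JDBC/Thrift server with the Cassandra connector
    ./sbin/start-thriftserver.sh \
      --packages com.datastax.spark:spark-cassandra-connector_2.10:1.3.0 \
      --conf spark.cassandra.connection.host=10.0.0.1

    # then any JDBC client (Tableau, beeline, ...) can connect:
    ./bin/beeline -u jdbc:hive2://localhost:10000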
How to set datastax-agent connect with jmx an
I am using OpsCenter 5.1.2 and just enabled JMX username/password authentication on my Cassandra cluster. I think I've updated all my OpsCenter configs correctly to force the agents to use JMX auth, but it is not working. I've updated the config under /etc/opscenter/Clusters/[cluster-name].conf with the following jmx properties:

    [jmx]
    username=username
    password=password
    port=7199

I then restarted opscenter and the opscenter agents, but see the following error in the opscenter agent logs:

INFO [main] 2015-06-03 10:55:53,910 Loading conf files: ./conf/address.yaml
INFO [main] 2015-06-03 10:55:53,953 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51
INFO [main] 2015-06-03 10:55:53,953 DataStax Agent version: 5.1.2
INFO [main] 2015-06-03 10:55:54,010 Default config values: {:cassandra_port 9042, :rollups300_ttl 2419200, :settings_cf settings, :agent_rpc_interface localhost, :restore_req_update_period 60, :my_channel_prefix /agent, :poll_period 60, :jmx_username heweiping, :thrift_conn_timeout 1, :rollups60_ttl 604800, :stomp_port 61620, :shorttime_interval 10, :longtime_interval 300, :max-seconds-to-sleep 25, :private-conf-props [initial_token listen_address broadcast_address rpc_address broadcast_rpc_address], :thrift_port 9160, :async_retry_timeout 5, :agent-conf-group global-cluster-agent-group, :jmx_host 127.0.0.1, :ec2_metadata_api_host 169.254.169.254, :metrics_enabled 1, :async_queue_size 5000, :backup_staging_dir nil, :read-buffer-size 1000, :remote_verify_max 30, :disk_usage_update_period 60, :throttle-bytes-per-second 50, :rollups7200_ttl 31536000, :agent_rpc_broadcast_address localhost, :remote_backup_retries 3, :ssl_keystore nil, :rollup_snapshot_period 300, :is_package false, :monitor_command /usr/share/datastax-agent/bin/datastax_agent_monitor, :thrift_socket_timeout 5000, :remote_verify_initial_delay 1000, :cassandra_log_location /var/log/cassandra/system.log, :max-pending-repairs 5, :remote_backup_region us-west-1, :restore_on_transfer_failure false, :tmp_dir /var/lib/datastax-agent/tmp/, :config_md5 nil, :jmx_port 7299, :write-buffer-size 10, :jmx_metrics_threadpool_size 4, :use_ssl 0, :rollups86400_ttl 0, :nodedetails_threadpool_size 3, :api_port 61621, :kerberos_service nil, :backup_file_queue_max 1, :jmx_thread_pool_size 5, :production 1, :runs_sudo 1, :max_file_transfer_attempts 30, :jmx_password eefung, :stomp_interface 172.19.104.123, :storage_keyspace OpsCenter, :hosts [127.0.0.1], :rollup_snapshot_threshold 300, :jmx_retry_timeout 30, :unthrottled-default 100, :remote_backup_retry_delay 5000, :remote_backup_timeout 1000, :seconds-to-read-kill-channel 0.005, :realtime_interval 5, :pdps_ttl 259200}
INFO [main] 2015-06-03 10:55:54,174 Waiting for the config from OpsCenter
INFO [main] 2015-06-03 10:55:54,175 Attempting to determine Cassandra's broadcast address through JMX
INFO [main] 2015-06-03 10:55:54,176 Starting Stomp
INFO [main] 2015-06-03 10:55:54,176 Starting up agent communcation with OpsCenter.
INFO [Initialization] 2015-06-03 10:55:54,180 New JMX connection (127.0.0.1:7299)
WARN [Initialization] 2015-06-03 10:55:54,409 Error when trying to match our local token: java.lang.SecurityException: Authentication failed! Credentials required
INFO [main] 2015-06-03 10:55:59,412 Reconnecting to a backup OpsCenter instance
INFO [main] 2015-06-03 10:55:59,413 SSL communication is disabled
INFO [main] 2015-06-03 10:55:59,413 Creating stomp connection to 172.19.104.123:61620
INFO [Initialization] 2015-06-03 10:55:59,418 Sleeping for 2s before trying to determine IP over JMX again
WARN [clojure-agent-send-off-pool-0] 2015-06-03 10:55:59,422 Tried to send message while not connected: /conf-request [[172.19.104.123,0:0:0:0:0:0:0:1%1,fe80:0:0:0:225:90ff:fe6a:d35c%2,127.0.0.1],[5.1.2,\/437054467\/conf]]
INFO [StompConnection receiver] 2015-06-03 10:55:59,423 Reconnecting in 0s.
INFO [StompConnection receiver] 2015-06-03 10:55:59,424 Connected to 172.19.104.123:61620
INFO [main] 2015-06-03 10:55:59,432 Starting Jetty server: {:join? false, :ssl? false, :host localhost, :port 61621}

Checks with other JMX-based tools (nodetool, jmxtrans) confirm that the JMX setup is correct. Any ideas? Thank you very much!

Sent from Windows Mail
Re: How to set datastax-agent connect with jmx an
The error in the log output looks similar to this: http://serverfault.com/questions/614810/opscenter-4-1-4-authentication-failing. In OpsCenter 5.1.2, did you configure the same username/password for the agent and the Cassandra nodes too?

jason

On Wed, Jun 3, 2015 at 11:13 AM, 贺伟平 wolai...@hotmail.com wrote:

I am using OpsCenter 5.1.2 and just enabled JMX username/password authentication on my Cassandra cluster. I think I've updated all my OpsCenter configs correctly to force the agents to use JMX auth, but it is not working. I've updated the config under /etc/opscenter/Clusters/[cluster-name].conf with the following jmx properties:

    [jmx]
    username=username
    password=password
    port=7199

I then restarted opscenter and the opscenter agents, but see the following error in the opscenter agent logs:

INFO [main] 2015-06-03 10:55:53,910 Loading conf files: ./conf/address.yaml
INFO [main] 2015-06-03 10:55:53,953 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_51
INFO [main] 2015-06-03 10:55:53,953 DataStax Agent version: 5.1.2
INFO [main] 2015-06-03 10:55:54,010 Default config values: {:cassandra_port 9042, :rollups300_ttl 2419200, :settings_cf settings, :agent_rpc_interface localhost, :restore_req_update_period 60, :my_channel_prefix /agent, :poll_period 60, :jmx_username heweiping, :thrift_conn_timeout 1, :rollups60_ttl 604800, :stomp_port 61620, :shorttime_interval 10, :longtime_interval 300, :max-seconds-to-sleep 25, :private-conf-props [initial_token listen_address broadcast_address rpc_address broadcast_rpc_address], :thrift_port 9160, :async_retry_timeout 5, :agent-conf-group global-cluster-agent-group, :jmx_host 127.0.0.1, :ec2_metadata_api_host 169.254.169.254, :metrics_enabled 1, :async_queue_size 5000, :backup_staging_dir nil, :read-buffer-size 1000, :remote_verify_max 30, :disk_usage_update_period 60, :throttle-bytes-per-second 50, :rollups7200_ttl 31536000, :agent_rpc_broadcast_address localhost, :remote_backup_retries 3, :ssl_keystore nil, :rollup_snapshot_period 300, :is_package false, :monitor_command /usr/share/datastax-agent/bin/datastax_agent_monitor, :thrift_socket_timeout 5000, :remote_verify_initial_delay 1000, :cassandra_log_location /var/log/cassandra/system.log, :max-pending-repairs 5, :remote_backup_region us-west-1, :restore_on_transfer_failure false, :tmp_dir /var/lib/datastax-agent/tmp/, :config_md5 nil, :jmx_port 7299, :write-buffer-size 10, :jmx_metrics_threadpool_size 4, :use_ssl 0, :rollups86400_ttl 0, :nodedetails_threadpool_size 3, :api_port 61621, :kerberos_service nil, :backup_file_queue_max 1, :jmx_thread_pool_size 5, :production 1, :runs_sudo 1, :max_file_transfer_attempts 30, :jmx_password eefung, :stomp_interface 172.19.104.123, :storage_keyspace OpsCenter, :hosts [127.0.0.1], :rollup_snapshot_threshold 300, :jmx_retry_timeout 30, :unthrottled-default 100, :remote_backup_retry_delay 5000, :remote_backup_timeout 1000, :seconds-to-read-kill-channel 0.005, :realtime_interval 5, :pdps_ttl 259200}
INFO [main] 2015-06-03 10:55:54,174 Waiting for the config from OpsCenter
INFO [main] 2015-06-03 10:55:54,175 Attempting to determine Cassandra's broadcast address through JMX
INFO [main] 2015-06-03 10:55:54,176 Starting Stomp
INFO [main] 2015-06-03 10:55:54,176 Starting up agent communcation with OpsCenter.
INFO [Initialization] 2015-06-03 10:55:54,180 New JMX connection (127.0.0.1:7299)
WARN [Initialization] 2015-06-03 10:55:54,409 Error when trying to match our local token: java.lang.SecurityException: Authentication failed! Credentials required
INFO [main] 2015-06-03 10:55:59,412 Reconnecting to a backup OpsCenter instance
INFO [main] 2015-06-03 10:55:59,413 SSL communication is disabled
INFO [main] 2015-06-03 10:55:59,413 Creating stomp connection to 172.19.104.123:61620
INFO [Initialization] 2015-06-03 10:55:59,418 Sleeping for 2s before trying to determine IP over JMX again
WARN [clojure-agent-send-off-pool-0] 2015-06-03 10:55:59,422 Tried to send message while not connected: /conf-request [[172.19.104.123,0:0:0:0:0:0:0:1%1,fe80:0:0:0:225:90ff:fe6a:d35c%2,127.0.0.1],[5.1.2,\/437054467\/conf]]
INFO [StompConnection receiver] 2015-06-03 10:55:59,423 Reconnecting in 0s.
INFO [StompConnection receiver] 2015-06-03 10:55:59,424 Connected to 172.19.104.123:61620
INFO [main] 2015-06-03 10:55:59,432 Starting Jetty server: {:join? false, :ssl? false, :host localhost, :port 61621}

Checks with other JMX-based tools (nodetool, jmxtrans) confirm that the JMX setup is correct. Any ideas? Thank you very much!

Sent from Windows Mail
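A hedged note for the archive: per the serverfault link, the agent reads its own JMX credentials (the :jmx_username / :jmx_password entries visible in the logged config map) rather than only the [jmx] section of the cluster .conf, so those must match the node's jmxremote.password entries. The address.yaml keys below are an assumption that they mirror the logged map; also note that the log shows the agent connecting to 127.0.0.1:7299 while the .conf sets port=7199, which is worth checking as well.

    # conf/address.yaml on the monitored node (key names assumed from the logged config map)
    stomp_interface: 172.19.104.123
    jmx_username: <cassandra-jmx-user>
    jmx_password: <cassandra-jmx-password>
    jmx_port: 7199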
RE: Cassandra datacenters replication advanced usage
Hello Marcus, and thank you for your fast reply.

Yes, we thought about that and indeed it would work. However, we really do have write and read constraints for the producer and consumer datacenters respectively, so we would like to keep all or most access local. We don't need synchronization between the datacenters to be fast, we just need to know when it's done :-/

Fabrice

From: Marcus Olsson [mailto:marcus.ols...@ericsson.com]
Sent: Tuesday, June 2, 2015 13:29
To: user@cassandra.apache.org
Subject: Re: Cassandra datacenters replication advanced usage

Hi Fabrice,

Have you considered using each_quorum instead of all? Each_quorum requires replies from a quorum of nodes in every datacenter. This could be used either way:

- producer using each_quorum and consumer local_quorum (better read latencies at the cost of write latencies), or
- producer using local_quorum and consumer each_quorum (better write latencies at the cost of read latencies).

BR
Marcus Olsson

On 06/02/2015 01:00 PM, Fabrice Douchant wrote:

Hi everyone.

For a project, we use a Cassandra cluster in order to have fast reads/writes on a large amount of (column-oriented) generated data. Until now we only had one datacenter for prototyping. We now plan to split our cluster into two datacenters to meet performance requirements (the data transfer between the datacenters is quite slow):

- datacenter #1: located near our data producer services, which intensively write all data to Cassandra periodically (each write has a "run_id" column in its primary key)
- datacenter #2: located near our data consumer services, which intensively read all data produced by datacenter #1 for a given "run_id"

However, we would like our consumer services to access data only in the datacenter near them (datacenter #2), and only once all data for a given "run_id" have been completely replicated from datacenter #1 (the data generated by the producer services).

My question is: how can we ensure that all data have been replicated to datacenter #2 before telling the consumer services (near datacenter #2) to start using them?

Our best solutions so far (but still not good enough :-P):

- Producer services (datacenter #1) write at consistency "all". But this leads to poor partition-failure tolerance AND really bad write performance.
- Producer services (datacenter #1) write at consistency "local_quorum", and a last "run finished" value could be written at consistency "all". But it seems Cassandra does not ensure replication ordering.

Do you have any suggestions?

Thanks a lot,
Fabrice
deadlock cleanup request due to compaction
After adding a 5th node I started running NodeProbe::forceKeyspaceCleanup. That function is not returning. Below I include the stack trace information showing that, in this case and probably others, a deadlock is possible.

In my case, the compaction manager is calling the synchronized method getNextBackgroundTask, which calls getMaximalTask; that call is stuck in an endless loop, so the monitor is never released. Meanwhile the cleanup request is trying to call the synchronized pause() method, so pause(), and thus the cleanup, can never proceed.

Configuration: 5 nodes, vnodes, replication 3, LCS, version 2.0.7

Daemon Thread [CompactionExecutor:72] (Suspended)
    owns: LeveledManifest (id=8722)
    owns: LeveledCompactionStrategy (id=8552)
    waited by: Daemon System Thread [RMI TCP Connection(14840)-10.164.8.73] (Suspended)
    LeveledManifest.getCompactionCandidates() line: 247
    LeveledCompactionStrategy.getMaximalTask(int) line: 121
    LeveledCompactionStrategy.getNextBackgroundTask(int) line: 113
    CompactionManager$BackgroundCompactionTask.run() line: 191
    Executors$RunnableAdapter<T>.call() line: 471
    FutureTask<V>.run() line: 262
    CompactionManager$CompactionExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1145
    ThreadPoolExecutor$Worker.run() line: 615
    Thread.run() line: 745

Daemon System Thread [RMI TCP Connection(14840)-10.164.8.73] (Suspended)
    owns: ColumnFamilyStore (id=8551)
    waiting for: LeveledCompactionStrategy (id=8552) owned by: Daemon Thread [CompactionExecutor:72] (Suspended)
    LeveledCompactionStrategy(AbstractCompactionStrategy).pause() line: 112
    ColumnFamilyStore.runWithCompactionsDisabled(Callable<V>, boolean) line: 2056
    ColumnFamilyStore.markAllCompacting() line: 2125
    CompactionManager.performAllSSTableOperation(ColumnFamilyStore, CompactionManager$AllSSTablesOperation) line: 214
    CompactionManager.performCleanup(ColumnFamilyStore, CounterId$OneShotRenewer) line: 265
    ColumnFamilyStore.forceCleanup(CounterId$OneShotRenewer) line: 1105
    StorageService.forceKeyspaceCleanup(String, String...) line: 2215
    GeneratedMethodAccessor42.invoke(Object, Object[]) line: not available
    DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
    Method.invoke(Object, Object...) line: 606
    Trampoline.invoke(Method, Object, Object[]) line: 75
    GeneratedMethodAccessor19.invoke(Object, Object[]) line: not available
    DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
    Method.invoke(Object, Object...) line: 606
    MethodUtil.invoke(Method, Object, Object[]) line: 279
    StandardMBeanIntrospector.invokeM2(Method, Object, Object[], Object) line: 112
    StandardMBeanIntrospector.invokeM2(Object, Object, Object[], Object) line: 46
    StandardMBeanIntrospector(MBeanIntrospector<M>).invokeM(M, Object, Object[], Object) line: 237
    PerInterface<M>.invoke(Object, String, Object[], String[], Object) line: 138
    StandardMBeanSupport(MBeanSupport<M>).invoke(String, Object[], String[]) line: 252
    DefaultMBeanServerInterceptor.invoke(ObjectName, String, Object[], String[]) line: 819
    JmxMBeanServer.invoke(ObjectName, String, Object[], String[]) line: 801
    RMIConnectionImpl.doOperation(int, Object[]) line: 1487
    RMIConnectionImpl.access$300(RMIConnectionImpl, int, Object[]) line: 97
    RMIConnectionImpl$PrivilegedOperation.run() line: 1328
    RMIConnectionImpl.doPrivilegedOperation(int, Object[], Subject) line: 1420
    RMIConnectionImpl.invoke(ObjectName, String, MarshalledObject, String[], Subject) line: 848
    GeneratedMethodAccessor41.invoke(Object, Object[]) line: not available
    DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
    Method.invoke(Object, Object...) line: 606
    UnicastServerRef.dispatch(Remote, RemoteCall) line: 322
    Transport$1.run() line: 177
    Transport$1.run() line: 174
    AccessController.doPrivileged(PrivilegedExceptionAction<T>, AccessControlContext) line: not available [native method]
    TCPTransport(Transport).serviceCall(RemoteCall) line: 173
    TCPTransport.handleMessages(Connection, boolean) line: 556
    TCPTransport$ConnectionHandler.run0() line: 811
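To make the reported inversion concrete, here is a minimal self-contained Java sketch (my own illustration, not Cassandra source): one thread spins forever inside a synchronized method, standing in for getNextBackgroundTask() looping in getCompactionCandidates(), while a second thread blocks trying to enter pause() on the same monitor, just like the cleanup request above.

    // Minimal illustration of the lock inversion described above (not Cassandra code).
    public class CompactionDeadlockSketch {
        static class Strategy {
            // Held forever: stands in for getNextBackgroundTask() stuck in its candidate loop.
            synchronized void nextBackgroundTask() {
                while (true) { /* endless candidate search; monitor never released */ }
            }
            // Never entered while nextBackgroundTask() spins: stands in for pause().
            synchronized void pause() {
                System.out.println("paused");
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Strategy strategy = new Strategy();
            Thread compactor = new Thread(strategy::nextBackgroundTask, "CompactionExecutor");
            compactor.setDaemon(true);
            compactor.start();
            Thread.sleep(100);   // let the compactor grab the monitor first
            strategy.pause();    // the "cleanup" thread blocks here indefinitely
        }
    }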
Re: RE: Cassandra datacenters replication advanced usage
I think you should use local_quorum for writes, and the read consistency of the consumer can be as per application requirements. I don't think cross-DC synchronous reads/writes are a good choice.

Knowing when a batch ends is an application-level problem, nothing to do with Cassandra. Maybe you can add a run number and a run count to each record. When the number of rows read for a run matches the count, the polling consumer knows that the run is fully replicated. Not sure it's the best solution.

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From: Fabrice Douchant fdouch...@gfproducts.ch
Date: Tue, 2 Jun, 2015 at 5:12 pm
Subject: RE: Cassandra datacenters replication advanced usage

Hello Marcus, and thank you for your fast reply.

Yes, we thought about that and indeed it would work. However, we really do have write and read constraints for the producer and consumer datacenters respectively, so we would like to keep all or most access "local". We don't need synchronization between the datacenters to be fast, we just need to know when it's done :-/

Fabrice

From: Marcus Olsson [mailto:marcus.ols...@ericsson.com]
Sent: Tuesday, June 2, 2015 13:29
To: user@cassandra.apache.org
Subject: Re: Cassandra datacenters replication advanced usage

Hi Fabrice,

Have you considered using each_quorum instead of all? Each_quorum requires replies from a quorum of nodes in every datacenter. This could be used either way:

- producer using each_quorum and consumer local_quorum (better read latencies at the cost of write latencies), or
- producer using local_quorum and consumer each_quorum (better write latencies at the cost of read latencies).

BR
Marcus Olsson

On 06/02/2015 01:00 PM, Fabrice Douchant wrote:

Hi everyone.

For a project, we use a Cassandra cluster in order to have fast reads/writes on a large amount of (column-oriented) generated data. Until now we only had one datacenter for prototyping. We now plan to split our cluster into two datacenters to meet performance requirements (the data transfer between the datacenters is quite slow):

- datacenter #1: located near our data producer services, which intensively write all data to Cassandra periodically (each write has a "run_id" column in its primary key)
- datacenter #2: located near our data consumer services, which intensively read all data produced by datacenter #1 for a given "run_id"

However, we would like our consumer services to access data only in the datacenter near them (datacenter #2), and only once all data for a given "run_id" have been completely replicated from datacenter #1 (the data generated by the producer services).

My question is: how can we ensure that all data have been replicated to datacenter #2 before telling the consumer services (near datacenter #2) to start using them?

Our best solutions so far (but still not good enough :-P):

- Producer services (datacenter #1) write at consistency "all". But this leads to poor partition-failure tolerance AND really bad write performance.
- Producer services (datacenter #1) write at consistency "local_quorum", and a last "run finished" value could be written at consistency "all". But it seems Cassandra does not ensure replication ordering.

Do you have any suggestions?

Thanks a lot,
Fabrice
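A rough CQL sketch of Anuj's run-number/count idea (table and column names are hypothetical): the producer records the expected row count once a run is complete, and the consumer in datacenter #2 polls locally until its own count matches.

    -- hypothetical bookkeeping table
    CREATE TABLE run_status (run_id int PRIMARY KEY, expected_rows bigint);

    -- producer (DC #1), after the last row of run 42 is written, at local_quorum:
    INSERT INTO run_status (run_id, expected_rows) VALUES (42, 1000000);

    -- consumer (DC #2), polling at local_quorum:
    SELECT expected_rows FROM run_status WHERE run_id = 42;
    -- then compare against a local count of run 42's rows; only a matching
    -- count proves the run is fully replicated, since the marker row may
    -- arrive before the data (no cross-row replication ordering).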
Cassandra datacenters replication advanced usage
Hi everyone.

For a project, we use a Cassandra cluster in order to have fast reads/writes on a large amount of (column-oriented) generated data. Until now we only had one datacenter for prototyping. We now plan to split our cluster into two datacenters to meet performance requirements (the data transfer between the datacenters is quite slow):

- datacenter #1: located near our data producer services, which intensively write all data to Cassandra periodically (each write has a "run_id" column in its primary key)
- datacenter #2: located near our data consumer services, which intensively read all data produced by datacenter #1 for a given "run_id"

However, we would like our consumer services to access data only in the datacenter near them (datacenter #2), and only once all data for a given "run_id" have been completely replicated from datacenter #1 (the data generated by the producer services).

My question is: how can we ensure that all data have been replicated to datacenter #2 before telling the consumer services (near datacenter #2) to start using them?

Our best solutions so far (but still not good enough :-P):

- Producer services (datacenter #1) write at consistency "all". But this leads to poor partition-failure tolerance AND really bad write performance.
- Producer services (datacenter #1) write at consistency "local_quorum", and a last "run finished" value could be written at consistency "all". But it seems Cassandra does not ensure replication ordering.

Do you have any suggestions?

Thanks a lot,
Fabrice
Re: Cassandra datacenters replication advanced usage
Hi Fabrice,

Have you considered using each_quorum instead of all? Each_quorum requires replies from a quorum of nodes in every datacenter. This could be used either way:

- producer using each_quorum and consumer local_quorum (better read latencies at the cost of write latencies), or
- producer using local_quorum and consumer each_quorum (better write latencies at the cost of read latencies).

BR
Marcus Olsson

On 06/02/2015 01:00 PM, Fabrice Douchant wrote:

Hi everyone.

For a project, we use a Cassandra cluster in order to have fast reads/writes on a large amount of (column-oriented) generated data. Until now we only had one datacenter for prototyping. We now plan to split our cluster into two datacenters to meet performance requirements (the data transfer between the datacenters is quite slow):

- datacenter #1: located near our data producer services, which intensively write all data to Cassandra periodically (each write has a "run_id" column in its primary key)
- datacenter #2: located near our data consumer services, which intensively read all data produced by datacenter #1 for a given "run_id"

However, we would like our consumer services to access data only in the datacenter near them (datacenter #2), and only once all data for a given "run_id" have been completely replicated from datacenter #1 (the data generated by the producer services).

My question is: how can we ensure that all data have been replicated to datacenter #2 before telling the consumer services (near datacenter #2) to start using them?

Our best solutions so far (but still not good enough :-P):

- Producer services (datacenter #1) write at consistency "all". But this leads to poor partition-failure tolerance AND really bad write performance.
- Producer services (datacenter #1) write at consistency "local_quorum", and a last "run finished" value could be written at consistency "all". But it seems Cassandra does not ensure replication ordering.

Do you have any suggestions?

Thanks a lot,
Fabrice
Different number of records from COPY command
I am seeing a different number of records each time I export a particular table. There were no writes/reads on this table while exporting the data. I am not able to understand why it is happening. Am I missing something here?

Cassandra version: 2.1.4
Java driver version: 2.1.5
Cluster size: 4 nodes in the same DC
Keyspace replication factor: 2

The following commands were issued:

cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 68000 rows; Write: 3025.93 rows/s
68682 rows exported in 27.737 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 2821.06 rows/s
65535 rows exported in 26.667 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 66000 rows; Write: 3285.07 rows/s
66055 rows exported in 26.269 seconds.

cfstats for adlog.adclicklog20150528:
---
$ nodetool cfstats adlog.adclicklog20150528
Keyspace: adlog
    Read Count: 217
    Read Latency: 2.773073732718894 ms.
    Write Count: 103191
    Write Latency: 0.10233075558915021 ms.
    Pending Flushes: 0
    Table: adclicklog20150528
    SSTable count: 11
    Space used (live): 37981202
    Space used (total): 37981202
    Space used by snapshots (total): 13407843
    Off heap memory used (total): 25580
    SSTable Compression Ratio: 0.26684147550494164
    Number of keys (estimate): 5627
    Memtable cell count: 94620
    Memtable data size: 13459445
    Memtable off heap memory used: 0
    Memtable switch count: 19
    Local read count: 217
    Local read latency: 2.774 ms
    Local write count: 103191
    Local write latency: 0.103 ms
    Pending flushes: 0
    Bloom filter false positives: 0
    Bloom filter false ratio: 0.0
    Bloom filter space used: 7192
    Bloom filter off heap memory used: 7104
    Index summary off heap memory used: 980
    Compression metadata off heap memory used: 17496
    Compacted partition minimum bytes: 1110
    Compacted partition maximum bytes: 182785
    Compacted partition mean bytes: 27808
    Average live cells per slice (last five minutes): 44.663594470046085
    Maximum live cells per slice (last five minutes): 86.0
    Average tombstones per slice (last five minutes): 0.0
    Maximum tombstones per slice (last five minutes): 0.0

- Saurabh
Re: Different number of records from COPY command
I have never exported data myself, but can you just try setting 'CONSISTENCY ALL' in cqlsh before executing the command?

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From: Saurabh Chandolia s.chando...@gmail.com
Date: Tue, 2 Jun, 2015 at 8:47 pm
Subject: Different number of records from COPY command

I am seeing a different number of records each time I export a particular table. There were no writes/reads on this table while exporting the data. I am not able to understand why it is happening. Am I missing something here?

Cassandra version: 2.1.4
Java driver version: 2.1.5
Cluster size: 4 nodes in the same DC
Keyspace replication factor: 2

The following commands were issued:

cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 68000 rows; Write: 3025.93 rows/s
68682 rows exported in 27.737 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 2821.06 rows/s
65535 rows exported in 26.667 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 66000 rows; Write: 3285.07 rows/s
66055 rows exported in 26.269 seconds.

cfstats for adlog.adclicklog20150528:
---
$ nodetool cfstats adlog.adclicklog20150528
Keyspace: adlog
    Read Count: 217
    Read Latency: 2.773073732718894 ms.
    Write Count: 103191
    Write Latency: 0.10233075558915021 ms.
    Pending Flushes: 0
    Table: adclicklog20150528
    SSTable count: 11
    Space used (live): 37981202
    Space used (total): 37981202
    Space used by snapshots (total): 13407843
    Off heap memory used (total): 25580
    SSTable Compression Ratio: 0.26684147550494164
    Number of keys (estimate): 5627
    Memtable cell count: 94620
    Memtable data size: 13459445
    Memtable off heap memory used: 0
    Memtable switch count: 19
    Local read count: 217
    Local read latency: 2.774 ms
    Local write count: 103191
    Local write latency: 0.103 ms
    Pending flushes: 0
    Bloom filter false positives: 0
    Bloom filter false ratio: 0.0
    Bloom filter space used: 7192
    Bloom filter off heap memory used: 7104
    Index summary off heap memory used: 980
    Compression metadata off heap memory used: 17496
    Compacted partition minimum bytes: 1110
    Compacted partition maximum bytes: 182785
    Compacted partition mean bytes: 27808
    Average live cells per slice (last five minutes): 44.663594470046085
    Maximum live cells per slice (last five minutes): 86.0
    Average tombstones per slice (last five minutes): 0.0
    Maximum tombstones per slice (last five minutes): 0.0

- Saurabh
Re: Different number of records from COPY command
Still getting an inconsistent number of records at consistency ALL and QUORUM. Following is the output with consistency ALL and QUORUM:

cqlsh:adlog> CONSISTENCY ALL;
Consistency level set to ALL.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 58000 rows; Write: 3065.60 rows/s
58463 rows exported in 21.353 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 63000 rows; Write: 3517.03 rows/s
63972 rows exported in 22.885 seconds.
cqlsh:adlog> CONSISTENCY QUORUM;
Consistency level set to QUORUM.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 63000 rows; Write: 3443.37 rows/s
63440 rows exported in 21.987 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 3405.90 rows/s
65524 rows exported in 24.053 seconds.

- Saurabh

On Tue, Jun 2, 2015 at 9:09 PM, Anuj Wadehra anujw_2...@yahoo.co.in wrote:

I have never exported data myself, but can you just try setting 'CONSISTENCY ALL' in cqlsh before executing the command?

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From: Saurabh Chandolia s.chando...@gmail.com
Date: Tue, 2 Jun, 2015 at 8:47 pm
Subject: Different number of records from COPY command

I am seeing a different number of records each time I export a particular table. There were no writes/reads on this table while exporting the data. I am not able to understand why it is happening. Am I missing something here?

Cassandra version: 2.1.4
Java driver version: 2.1.5
Cluster size: 4 nodes in the same DC
Keyspace replication factor: 2

The following commands were issued:

cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 68000 rows; Write: 3025.93 rows/s
68682 rows exported in 27.737 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 65000 rows; Write: 2821.06 rows/s
65535 rows exported in 26.667 seconds.
cqlsh:adlog> copy adclicklog20150528 (imprid) TO 'adclicklog20150528.csv';
Processed 66000 rows; Write: 3285.07 rows/s
66055 rows exported in 26.269 seconds.

cfstats for adlog.adclicklog20150528:
---
$ nodetool cfstats adlog.adclicklog20150528
Keyspace: adlog
    Read Count: 217
    Read Latency: 2.773073732718894 ms.
    Write Count: 103191
    Write Latency: 0.10233075558915021 ms.
    Pending Flushes: 0
    Table: adclicklog20150528
    SSTable count: 11
    Space used (live): 37981202
    Space used (total): 37981202
    Space used by snapshots (total): 13407843
    Off heap memory used (total): 25580
    SSTable Compression Ratio: 0.26684147550494164
    Number of keys (estimate): 5627
    Memtable cell count: 94620
    Memtable data size: 13459445
    Memtable off heap memory used: 0
    Memtable switch count: 19
    Local read count: 217
    Local read latency: 2.774 ms
    Local write count: 103191
    Local write latency: 0.103 ms
    Pending flushes: 0
    Bloom filter false positives: 0
    Bloom filter false ratio: 0.0
    Bloom filter space used: 7192
    Bloom filter off heap memory used: 7104
    Index summary off heap memory used: 980
    Compression metadata off heap memory used: 17496
    Compacted partition minimum bytes: 1110
    Compacted partition maximum bytes: 182785
    Compacted partition mean bytes: 27808
    Average live cells per slice (last five minutes): 44.663594470046085
    Maximum live cells per slice (last five minutes): 86.0
    Average tombstones per slice (last five minutes): 0.0
    Maximum tombstones per slice (last five minutes): 0.0

- Saurabh
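A hedged follow-up for readers: the outputs above show that the export size varies but not where rows go missing. Two checks that can help separate a COPY/paging problem from replica divergence (COUNT can time out on big tables, so it is only practical at a scale like this one):

    cqlsh:adlog> CONSISTENCY ALL;
    cqlsh:adlog> SELECT COUNT(*) FROM adclicklog20150528;
    -- a stable count here, against varying COPY totals, points at the export path

    $ nodetool repair adlog adclicklog20150528
    # with RF=2, any missed writes leave the replicas divergent until repaired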