Re: Inserting list data
Hi Vladimir,

In fact I am having difficulty reproducing this issue via cqlsh. The issue was reported to me by one of our developers, whose client application uses the Cassandra Java driver 3.0.3 (we're using DSE 5.0.1).

app A:
2016-10-11 13:28:23,014 [TRACE] [core.QueryLogger.NORMAL] [cluster1] [HOST1/IP1:9042] Query completed normally, took 5 ms: [8 bound values] INSERT INTO global.table_name ("id","alert_to","alert_emails","created_by","created_date","alert_level","updated_by","updated_date") VALUES (?,?,?,?,?,?,?,?); [id:25712, alert_to:[2], alert_emails:NULL, created_by:'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2', created_date:1476160103007, alert_level:2, updated_by:NULL, updated_date:NULL]

app B:
2016-10-11 13:28:23,014 [TRACE] [core.QueryLogger.NORMAL] [cluster1] [HOST2/IP2:9042] Query completed normally, took 6 ms: [8 bound values] INSERT INTO global.table_name ("alert_to","alert_emails","created_date","id","created_by","updated_by","updated_date","alert_level") VALUES (?,?,?,?,?,?,?,?); [alert_to:[1], alert_emails:NULL, created_date:1476160103007, id:25712, created_by:'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2', updated_by:NULL, updated_date:NULL, alert_level:1]

The schema:

id bigint,
alert_emails list,
alert_level int,
alert_to list,
created_by text,
created_date timestamp,
updated_by text,
updated_date timestamp,
PRIMARY KEY (id)

SELECT id, alert_level, alert_to FROM global.table_name WHERE id=25712;

 id    | alert_level | alert_to
-------+-------------+----------
 25712 |           2 | [2, 1]

But when I threw queries like the ones below from cqlsh on different nodes at the same time in my testing environment, the data (alert_to) was just [1], which is the expected behavior.
on host 1:
cqlsh> INSERT INTO global.table_name ("id","alert_to","alert_emails","created_by","created_date","alert_level","updated_by","updated_date") VALUES (25712,[2],NULL,'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',1476160103007,2,NULL,NULL);

on host 2:
cqlsh> INSERT INTO global.table_name ("alert_to","alert_emails","created_date","id","created_by","updated_by","updated_date","alert_level") VALUES ([1],NULL,1476160103007,25712,'service-worker:ec45afd2-c40a-44d9-a2a1-7416409be6e2',NULL,NULL,1);

So I wonder if something is wrong with the Java driver, but I cannot figure out how to break this down further.

@Andrew: we're not using UDTs, but I'd appreciate it if you could share your case, too.

Thanks,
Aoi

2016-10-13 11:26 GMT-07:00 Andrew Baker:
> I saw evidence of this behavior, but when we created a test to try to make
> it happen it never did. We assumed it was UDT related and lost interest,
> since it didn't have a big impact. I will try to carve out some time to look
> into this some more and let you know if I find anything.
>
> On Wed, Oct 12, 2016 at 9:24 PM Vladimir Yudovin wrote:
>>
>> The data is actually appended, not overwritten.
>> Strange, can you send the exact statements?
>>
>> Here is an example I ran:
>> CREATE KEYSPACE events WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
>> CREATE TABLE events.data (id int primary key, events list);
>> INSERT INTO events.data (id, events) VALUES (0, ['a']);
>> SELECT * FROM events.data;
>>
>>  id | events
>> ----+--------
>>   0 | ['a']
>>
>> (1 rows)
>>
>> INSERT INTO events.data (id, events) VALUES (0, ['b']);
>> SELECT * FROM events.data;
>>
>>  id | events
>> ----+--------
>>   0 | ['b']
>>
>> (1 rows)
>>
>> As you see, 'a' was overwritten by 'b'.
>>
>> Best regards, Vladimir Yudovin,
>> Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
>> Launch your cluster in minutes.
>>
>> On Wed, 12 Oct 2016 23:58:23 -0400, Aoi Kadoya wrote:
>>
>> yes, that's what I thought. but when I use these forms,
>> INSERT ...
>> ['A']
>> INSERT ... ['B']
>>
>> the data is actually appended, not overwritten.
>> so I guess this is something unexpected?
>>
>> Thanks,
>> Aoi
>>
>> 2016-10-12 20:55 GMT-07:00 Vladimir Yudovin:
>> > If you use the form
>> > INSERT ... ['A']
>> > INSERT ... ['B']
>> >
>> > the latest INSERT will overwrite the first, because it inserts the whole
>> > list. It's better to use UPDATE, like:
>> > UPDATE ... SET events = events + ['A']
>> > UPDATE ... SET events = events + ['B']
>> > These operations add new elements to the end of the existing list.
>> >
>> > From here
>> > https://docs.datastax.com/en/cql/3.0/cql/cql_using/use_list_t.html :
>> >
>> > These update operations are implemented internally without any
>> > read-before-write. Appending and prepending a new element to the list
>> > writes only the new element.
>> >
>> > Best regards, Vladimir Yudovin,
>> > Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
>> > Launch your cluster in minutes.
>> >
>> > On Wed, 12 Oct 2016 17:39:46 -0400, Aoi Kadoya wrote:
>> >
>> > Hi,
>> >
Re: Inserting list data
yes, that's what I thought. but when I use these forms,

INSERT ... ['A']
INSERT ... ['B']

the data is actually appended, not overwritten. so I guess this is something unexpected?

Thanks,
Aoi

2016-10-12 20:55 GMT-07:00 Vladimir Yudovin:
> If you use the form
> INSERT ... ['A']
> INSERT ... ['B']
>
> the latest INSERT will overwrite the first, because it inserts the whole
> list. It's better to use UPDATE, like:
> UPDATE ... SET events = events + ['A']
> UPDATE ... SET events = events + ['B']
> These operations add new elements to the end of the existing list.
>
> From here
> https://docs.datastax.com/en/cql/3.0/cql/cql_using/use_list_t.html :
>
> These update operations are implemented internally without any
> read-before-write. Appending and prepending a new element to the list
> writes only the new element.
>
> Best regards, Vladimir Yudovin,
> Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
> Launch your cluster in minutes.
>
> On Wed, 12 Oct 2016 17:39:46 -0400, Aoi Kadoya wrote:
>
> Hi,
>
> When inserting different data into a list type column from different
> clients at the same time, is the data supposed to be combined into one
> list?
>
> For example, if these 2 queries were requested by clients at the same
> time, what should the events list look like afterward?
>
> INSERT INTO cycling.upcoming_calendar (year, month, events) VALUES
> (2015, 06, ['A']);
> INSERT INTO cycling.upcoming_calendar (year, month, events) VALUES
> (2015, 06, ['B']);
>
> In my understanding, each operation should be treated as atomic, which
> makes me think that even if clients throw the queries at the same time,
> Cassandra would take them separately and the last insert would update
> the events list (= the data should be either ['A'] or ['B']).
>
> In my environment, I found that some data was saved as ['A', 'B'] in a
> case like the above.
> Is this the expected behavior of the list data type?
>
> I am still new to Cassandra and am trying to understand how this
> happened.
> Appreciate it if you could help me with figuring this out!
>
> Thanks,
> Aoi
>
Inserting list data
Hi,

When inserting different data into a list type column from different clients at the same time, is the data supposed to be combined into one list?

For example, if these 2 queries were requested by clients at the same time, what should the events list look like afterward?

INSERT INTO cycling.upcoming_calendar (year, month, events) VALUES (2015, 06, ['A']);
INSERT INTO cycling.upcoming_calendar (year, month, events) VALUES (2015, 06, ['B']);

In my understanding, each operation should be treated as atomic, which makes me think that even if clients throw the queries at the same time, Cassandra would take them separately and the last insert would update the events list (= the data should be either ['A'] or ['B']).

In my environment, I found that some data was saved as ['A', 'B'] in a case like the above. Is this the expected behavior of the list data type?

I am still new to Cassandra and am trying to understand how this happened. Appreciate it if you could help me with figuring this out!

Thanks,
Aoi
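For what it's worth, the merged [A, B] result is consistent with how Cassandra lists are commonly described as being stored: each element is a cell named by a timeuuid, and an INSERT of a whole list also writes a covering tombstone at (write timestamp - 1) so the insert's own cells survive it. If two coordinators assign the same client timestamp, neither tombstone shadows the other insert's cells, and both element sets survive the replica merge. The following Python sketch models that merge rule; it is an illustration of the described storage model under those assumptions, not driver or server code.

```python
import uuid

def insert_list(ts, elements):
    """Model an INSERT of a whole list: a covering tombstone at ts-1
    plus one cell per element, named by a timeuuid, written at ts."""
    return {
        "tombstone_ts": ts - 1,
        "cells": [(uuid.uuid1(), ts, e) for e in elements],
    }

def merge(writes):
    """Replica-side merge: drop cells shadowed by the newest tombstone,
    then order survivors by their timeuuid cell name."""
    newest_tombstone = max(w["tombstone_ts"] for w in writes)
    survivors = [c for w in writes for c in w["cells"]
                 if c[1] > newest_tombstone]
    survivors.sort(key=lambda c: c[0].time)  # timeuuid order ~ wall clock
    return [c[2] for c in survivors]

ts = 1476160103007000        # microseconds; same client timestamp on both apps
a = insert_list(ts, [2])     # app A's INSERT
b = insert_list(ts, [1])     # app B's INSERT
print(merge([a, b]))         # -> [2, 1]: neither t-1 tombstone shadows the
                             #    other's cells, so both lists' elements survive

c = insert_list(ts + 1, [1])  # a strictly later INSERT
print(merge([a, c]))          # -> [1]: normal last-write-wins overwrite
```

This also shows why the cqlsh test did not reproduce the problem: typed by hand, the two INSERTs get different timestamps and the later tombstone wins, as in the second call.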
opscenter cluster metric API call - 400 error
Hi,

I have upgraded my cluster to DSE 5.0.1 and OpsCenter 6.0.1. I am testing the OpsCenter APIs to retrieve node/cluster metrics, but I get 400 errors when I issue queries like the one below.

curl -vvv -H 'opscenter-session: xx' -G 'http:metrics//data-load'
> GET /IDC/metrics//data-load HTTP/1.1
> User-Agent: curl/7.35.0
> Host:
> Accept: */*
> opscenter-session: xx
>
< HTTP/1.1 400 Bad Request
< Transfer-Encoding: chunked
< Date: Fri, 19 Aug 2016 20:45:36 GMT
* Server TwistedWeb/15.3.0 is not blacklisted
< Server: TwistedWeb/15.3.0
< Content-Type: application/json
< Cache-Control: max-age=0, no-cache, no-store, must-revalidate
<
* Connection #0 to host left intact
{"message": "unsupported operand type(s) for -: 'NoneType' and 'NoneType'", "type": "TypeError"}

I can use the same session id, OpsCenter IP, and node IP for other API calls and get responses with no problem, but no luck with any of the metrics calls. The error message seems to be complaining about "-", but I am not sure what is wrong with the command I issued. I am just trying some of the calls described here, without using any query parameters:
http://docs.datastax.com/en/opscenter/6.0/api/docs/metrics.html

Can you please advise me on what I should change to retrieve the metrics?

Thanks,
Aoi
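The TypeError ('NoneType' - 'NoneType') reads like the server trying to compute a time range from query parameters that were never supplied, which would match "without using any query parameters" above. The metrics doc linked above describes time-range parameters for these endpoints; the sketch below builds such a request URL under that assumption. The host, cluster id, node IP, and the exact parameter names (start/end/step) are placeholders to be checked against that doc.

```python
import time
from urllib.parse import urlencode

OPSCENTER = "http://OPSCENTER_HOST:8888"  # placeholder host:port
CLUSTER_ID = "IDC"                        # cluster id from the transcript
NODE_IP = "NODE_IP"                       # placeholder node address

end = int(time.time())
params = urlencode({
    "start": end - 3600,  # last hour, epoch seconds
    "end": end,
    "step": 60,           # 1-minute resolution
})
url = "{}/{}/metrics/{}/data-load?{}".format(OPSCENTER, CLUSTER_ID, NODE_IP, params)
print(url)
# then fetch it with:
#   curl -H 'opscenter-session: <session id>' "<url>"
```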
Re: CPU high load
Thank you, Alain. There was no frequent GC nor compaction, so it has been a mystery; however, once I stopped chef-client (we're managing the cluster through a chef cookbook), the load eased on almost all of the servers. So we're now refactoring our cookbook; in the meanwhile, we also decided to rebuild the cluster with DSE 5.0.1.

Thank you very much for your advice on the debugging process,
Aoi

2016-07-20 4:03 GMT-07:00 Alain RODRIGUEZ <arodr...@gmail.com>:
> Hi Aoi,
>
>> since a few weeks
>> ago, all of the cluster nodes have been hitting an avg. 15-20 CPU load.
>> These nodes are running on VMs (VMware vSphere) that have 8 vCPU
>> (1 core/socket) and 16 GB vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)
>
> I'll take my chance, a few ideas / questions below:
>
> What Cassandra version are you running?
> How is your GC doing?
>
> Run something like: grep "GC" /var/log/cassandra/system.log
> If you have a lot of long CMS pauses you might not be keeping things in the
> new gen long enough: Xmn800M looks too small to me; it has been a default,
> but I never saw a case where this setting worked better than a higher value
> (let's say 2G). Also, the tenuring threshold gives better results if set a
> bit higher than the default (let's say 16). Those options are in
> cassandra-env.sh.
>
> Do you have other warnings or errors? Anything about tombstones or
> compacting wide rows incrementally?
> What compaction strategy are you using?
> How many concurrent compactors do you use? (If you have 8 cores, this value
> should probably be between 2 and 6; 4 is a good starting point.)
> If your compaction is not fast enough and the disks are doing fine, consider
> increasing the compaction throughput from the default 16 to 32 or 64 Mbps to
> mitigate the impact of the point above.
> Do you use compression? What kind?
> Did the request count increase recently? Do you consider adding capacity, or
> do you think you're hitting a new bug / issue that is worth investigating /
> solving?
> Are you using the default configuration? What did you change?
>
> No matter what you try, do it as much as possible on one canary node first,
> and incrementally (one change at a time - using NEWHEAP = 2GB +
> tenuringThreshold = 16 would be one change; it makes sense to move those 2
> values together).
>
>> I have enabled the auto repair service on opscenter and it's running behind
>
> Also, when did you start the repairs? Repair is an expensive operation,
> consuming a lot of resources, that is often needed but hard to tune
> correctly. Are you sure you have enough CPU power to handle the
> load + repairs?
>
> Some other comments, probably not directly related:
>
>> I also realized that my cluster isn't well balanced
>
> Well, your cluster looks balanced to me: 7 GB isn't that far from 11 GB. To
> get more accurate information, use 'nodetool status mykeyspace'. This way
> ownership will be displayed, replacing (?) with ownership (xx %). Total
> ownership = 300 % in your case (RF=3).
>
>> I am running a 6-node vnode cluster with DSE 4.8.1, and since a few weeks
>> ago, all of the cluster nodes have been hitting an avg. 15-20 CPU load.
>
> By the way, from
> https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/RNdse.html:
>
> "Warning: DataStax does not recommend 4.8.1 or 4.8.2 versions for
> production, see warning. Use 4.8.3 instead."
>
> I am not sure what happened there, but I would move to 4.8.3+ asap. DataStax
> people know their products, and I don't like this kind of orange and bold
> warning :-).
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-07-14 4:36 GMT+02:00 Aoi Kadoya <cadyan@gmail.com>:
>>
>> Hi Romain,
>>
>> No, I don't think we upgraded the cassandra version or changed any of
>> those schema elements. After I realized this high load issue, I found
>> that some of the tables had a shorter gc_grace_seconds (1 day) than the
>> rest, and because it seemed to be causing constant compaction cycles, I
>> changed them to 10 days. But again, that's after the load hit this high
>> number.
>> Some nodes got eased a little after changing the gc_grace_seconds
>> values and repairing the nodes, but since a few days ago, all of the
>> nodes have been constantly reporting a load of 15-20.
>>
>> Thank you for the suggestion about logging; let me try changing the
>> log level to see what I can get from it.
>>
>> Thanks,
>> Aoi
>>
>> 2016-07-13 13:28 GMT-07:00 Romain Hardouin <romainh...@yahoo.fr>:
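Alain's suggested canary change (a bigger new gen plus a higher tenuring threshold, moved together as one change) would look roughly like this in cassandra-env.sh. This is a sketch of the values from the thread, not vetted defaults; test on one node first, as he says.

```shell
# cassandra-env.sh fragment -- the single canary change from the thread:
# grow the young generation and keep objects there longer before tenuring.
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="2G"        # was 800M (-Xmn800M in the JVM options above)
# Default CMS settings ship with MaxTenuringThreshold=1; raise it so
# short-lived request objects die in the new gen instead of being promoted.
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=16"
```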
Re: CPU high load
Hi Romain,

No, I don't think we upgraded the cassandra version or changed any of those schema elements. After I realized this high load issue, I found that some of the tables had a shorter gc_grace_seconds (1 day) than the rest, and because it seemed to be causing constant compaction cycles, I changed them to 10 days. But again, that's after the load hit this high number. Some nodes got eased a little after changing the gc_grace_seconds values and repairing the nodes, but since a few days ago, all of the nodes have been constantly reporting a load of 15-20.

Thank you for the suggestion about logging; let me try changing the log level to see what I can get from it.

Thanks,
Aoi

2016-07-13 13:28 GMT-07:00 Romain Hardouin <romainh...@yahoo.fr>:
> Did you upgrade from a previous version? Did you make some schema changes,
> like compaction strategy, compression, bloom filter, etc.?
> What about the R/W requests?
> SharedPool workers are... shared ;-) Put the logs in debug to see some
> examples of what services are using this pool (many, actually).
>
> Best,
>
> Romain
>
> Le Mercredi 13 juillet 2016 18h15, Patrick McFadin <pmcfa...@gmail.com> a
> écrit :
>
> Might be clearer looking at nodetool tpstats.
>
> From there you can see all the thread pools and whether there are any
> blocks. Could be something subtle like the network.
>
> On Tue, Jul 12, 2016 at 3:23 PM, Aoi Kadoya <cadyan@gmail.com> wrote:
>
> Hi,
>
> I am running a 6-node vnode cluster with DSE 4.8.1, and since a few weeks
> ago, all of the cluster nodes have been hitting an avg. 15-20 CPU load.
> These nodes are running on VMs (VMware vSphere) that have 8 vCPU
> (1 core/socket) and 16 GB vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)
>
> At first I thought this was because of CPU iowait; however, iowait is
> constantly low (in fact it's 0 almost all the time), and CPU steal time is
> also 0%.
>
> When I took a thread dump, I found some of the "SharedPool-Worker" threads
> are consuming CPU, and those threads seem to be waiting for something,
> so I assume this is the cause of the CPU load.
>
> "SharedPool-Worker-1" #240 daemon prio=5 os_prio=0
> tid=0x7fabf459e000 nid=0x39b3 waiting on condition
> [0x7faad7f02000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:85)
>     at java.lang.Thread.run(Thread.java:745)
>
> The thread dump looks like this, but I am not sure what this
> sharedpool-worker is waiting for.
> Would you please help me with further troubleshooting?
> I am also reading the thread posted by Yuan, as the situation is very
> similar to mine, but I didn't get any blocked, dropped or pending counts
> in my tpstats result.
>
> Thanks,
> Aoi
>
Re: CPU high load
Hi Patrick,

In fact I couldn't see any thread pool named "shared". Here is the result of tpstats from one of my nodes.

Pool Name                  Active   Pending   Completed   Blocked  All time blocked
MutationStage                   0         0   173237609         0                 0
ReadStage                       0         0    71266557         0                 0
RequestResponseStage            0         0    87617557         0                 0
ReadRepairStage                 0         0       51822         0                 0
CounterMutationStage            0         0           0         0                 0
MiscStage                       0         0           0         0                 0
AntiEntropySessions             0         0        3828         0                 0
HintedHandoff                   0         0          23         0                 0
GossipStage                     0         0     2169599         0                 0
CacheCleanupExecutor            0         0           0         0                 0
InternalResponseStage           0         0           0         0                 0
CommitLogArchiver               0         0           0         0                 0
CompactionExecutor              0         0     1353194         0                 0
ValidationExecutor              0         0     3337647         0                 0
MigrationStage                  0         0           5         0                 0
AntiEntropyStage                0         0     7527026         0                 0
PendingRangeCalculator          0         0          24         0                 0
Sampler                         0         0           0         0                 0
MemtableFlushWriter             0         0      118019         0                 0
MemtablePostFlush               0         0     3398738         0                 0
MemtableReclaimMemory           0         0      122249         0                 0

Message type      Dropped
READ                    0
RANGE_SLICE             0
_TRACE                  0
MUTATION                0
COUNTER_MUTATION        0
BINARY                  0
REQUEST_RESPONSE        0
PAGED_RANGE             0
READ_REPAIR             0

I have enabled the auto repair service on opscenter and it's running behind, but I also realized that my cluster isn't well balanced. Other than the system/opscenter keyspaces, I only have one keyspace, and its replication factor is 3 (network topology strategy).

Datacenter: xxx
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load      Tokens  Owns  Host ID                               Rack
UN  xx       10.19 GB  256     ?     6bf8db87-d4cc-4a75-86a5-bc1b27ced32c  RAC1
UN  xx       10.59 GB  256     ?     2d407831-e10d-4a6b-86c0-26c7a60e613d  RAC1
UN  xx       7.99 GB   256     ?     1e05d70e-502e-4ac4-a6ed-bf912c332062  RAC1
UN  xx       7.67 GB   256     ?     41a8e12a-c8e8-42ff-b681-b74f493a2407  RAC1
UN  xx       11.13 GB  256     ?     67572986-99b8-4a78-9039-aaa0aca8c236  RAC1
UN  xx       9.54 GB   256     ?     3f22001b-f03d-4bd0-8608-dd467cbc17f0  RAC1

Thanks,
Aoi

2016-07-13 9:15 GMT-07:00 Patrick McFadin <pmcfa...@gmail.com>:
> Might be clearer looking at nodetool tpstats.
>
> From there you can see all the thread pools and whether there are any
> blocks. Could be something subtle like the network.
>
> On Tue, Jul 12, 2016 at 3:23 PM, Aoi Kadoya <cadyan@gmail.com> wrote:
>>
>> Hi,
>>
>> I am running a 6-node vnode cluster with DSE 4.8.1, and since a few weeks
>> ago, all of the cluster nodes have been hitting an avg. 15-20 CPU load.
>> These nodes are running on VMs (VMware vSphere) that have 8 vCPU
>> (1 core/socket) and 16 GB vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)
>>
>> At first I thought this was because of CPU iowait; however, iowait is
>> constantly low (in fact it's 0 almost all the time), and CPU steal time is
>> also 0%.
>>
>> When I took a thread dump, I found some of the "SharedPool-Worker" threads
>> are consuming CPU, and those threads seem to be waiting for something,
>> so I assume this is the cause of the CPU load.
>>
>> "SharedPool-Worker-1" #240 daemon prio=5 os_prio=0
>> tid=0x7fabf459e000 nid=0x39b3 waiting on condition
>> [0x7faad7f02000]
>>    java.lang.Thread.State: WAITING (parking)
>>     at sun.misc.Unsafe.park(Native Method)
>>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:85)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> The thread dump looks like this, but I am not sure what this
>> sharedpool-worker is waiting for.
>> Would you please help me with further troubleshooting?
CPU high load
Hi,

I am running a 6-node vnode cluster with DSE 4.8.1, and since a few weeks ago, all of the cluster nodes have been hitting an avg. 15-20 CPU load. These nodes are running on VMs (VMware vSphere) that have 8 vCPU (1 core/socket) and 16 GB vRAM. (JVM options: -Xms8G -Xmx8G -Xmn800M)

At first I thought this was because of CPU iowait; however, iowait is constantly low (in fact it's 0 almost all the time), and CPU steal time is also 0%.

When I took a thread dump, I found some of the "SharedPool-Worker" threads are consuming CPU, and those threads seem to be waiting for something, so I assume this is the cause of the CPU load.

"SharedPool-Worker-1" #240 daemon prio=5 os_prio=0
tid=0x7fabf459e000 nid=0x39b3 waiting on condition [0x7faad7f02000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:85)
    at java.lang.Thread.run(Thread.java:745)

The thread dump looks like this, but I am not sure what this sharedpool-worker is waiting for. Would you please help me with further troubleshooting? I am also reading the thread posted by Yuan, as the situation is very similar to mine, but I didn't get any blocked, dropped or pending counts in my tpstats result.

Thanks,
Aoi
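One quick way to check whether parked SharedPool workers are really the ones burning CPU: `top -H -p <cassandra_pid>` lists per-thread CPU usage with decimal thread ids, and the `nid=0x...` field in the thread dump is that same id printed in hex. A small helper for converting between the two (the decimal tid 14771 here is just the hex nid from the dump above, shown for illustration):

```python
def tid_to_nid(tid: int) -> str:
    """Decimal Linux thread id (from `top -H`) -> the hex nid jstack prints."""
    return hex(tid)

def nid_to_tid(nid: str) -> int:
    """Hex nid from a thread dump -> decimal thread id for `top -H`."""
    return int(nid, 16)

print(tid_to_nid(14771))     # -> 0x39b3, the nid of SharedPool-Worker-1 above
print(nid_to_tid("0x39b3"))  # -> 14771
```

If the high-CPU tids from top map to nids whose stacks show WAITING (parking), the cost is likely the SEP workers' spin/park loop rather than real work; if they map to RUNNABLE threads, those stacks point at the actual hot path.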