I'd be curious to see GC logs. jstat -gccause <pid>
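A sketch of what to look for in those samples: `jstat -gccause` prints heap-region utilization plus young/full GC counters, and a fast-rising FGC (full GC count) between two samples is the usual smoking gun for GC-driven latency growth. The snippet below assumes the HotSpot 8 column layout, and the two sample rows are invented for illustration (they are not from the poster's node):

```python
# Interpret two `jstat -gccause` samples taken some interval apart.
# Column names follow HotSpot 8's -gccause output; only the numeric
# columns are parsed here (the trailing GC-cause text is ignored).
COLS = "S0 S1 E O M CCS YGC YGCT FGC FGCT GCT".split()

def parse_sample(line):
    """Map a jstat -gccause data row onto its numeric column names."""
    return dict(zip(COLS, line.split()))

def full_gc_delta(before, after):
    """Number of full GCs that ran between the two samples."""
    return int(after["FGC"]) - int(before["FGC"])

# Hypothetical rows, e.g. sampled a minute apart:
sample_t0 = "0.00 97.5 62.1 84.9 95.1 90.3 211 4.21  9  7.80 12.01 Allocation Failure No GC"
sample_t1 = "0.00 96.8 14.7 99.0 95.1 90.3 245 4.90 31 29.40 34.30 Ergonomics No GC"

d = full_gc_delta(parse_sample(sample_t0), parse_sample(sample_t1))
print(d)  # 22 full GCs in the interval -> GC pressure would explain the latency
```

If the FGC delta stays near zero while latency climbs, GC is unlikely to be the cause and the read path itself is the better suspect.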
On Fri, Oct 9, 2015 at 2:16 PM, Tyler Hobbs <ty...@datastax.com> wrote:

Hmm, it seems off to me that the merge step is taking 1 to 2 seconds, especially when there are only ~500 cells from one sstable and the memtables. Can you open a ticket (https://issues.apache.org/jira/browse/CASSANDRA) with your schema, details on your data layout, and these traces?

On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala <dodongj...@gmail.com> wrote:

So the traces vary a lot and do not seem to correlate with the response times the client sees. Maybe it is DataStax Java driver related? (Not likely.) Judge for yourself from the results below.

This first trace was taken while the client (a Java application) was seeing responses of about 1100 ms:

Tracing session: 566477c0-6ebc-11e5-9493-9131aba66d63

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2015-10-09 15:31:28.700000 | 172.31.17.129 | 0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 | 101
 Preparing statement [SharedPool-Worker-1] | 2015-10-09 15:31:28.701000 | 172.31.17.129 | 334
 Executing single-partition query on processinfometric_profile [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 | 692
 Acquiring sstable references [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 | 713
 Merging memtable tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.701000 | 172.31.17.129 | 726
 Key cache hit for sstable 209 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3143
 Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3169
 Key cache hit for sstable 208 [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3691
 Seeking to partition beginning in data file [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3713
 Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3807
 Merging data from memtables and 2 sstables [SharedPool-Worker-3] | 2015-10-09 15:31:28.704000 | 172.31.17.129 | 3818
 Read 462 live and 0 tombstone cells [SharedPool-Worker-3] | 2015-10-09 15:31:29.611000 | 172.31.17.129 | 910723
 Request complete | 2015-10-09 15:31:29.649251 | 172.31.17.129 | 949251

Below is one from when the client was seeing around 1400 ms, yet the trace data looks faster:

Tracing session: 7c591550-6ebf-11e5-9493-9131aba66d63

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2015-10-09 15:54:00.869000 | 172.31.17.129 | 0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 | 122
 Preparing statement [SharedPool-Worker-133] | 2015-10-09 15:54:00.869000 | 172.31.17.129 | 265
 Executing single-partition query on processinfometric_profile [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 | 673
 Acquiring sstable references [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 | 695
 Merging memtable tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.870000 | 172.31.17.129 | 709
 Key cache hit for sstable 242 [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 | 3134
 Seeking to partition beginning in data file [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 | 3155
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 | 3259
 Merging data from memtables and 1 sstables [SharedPool-Worker-9] | 2015-10-09 15:54:00.872000 | 172.31.17.129 | 3270
 Read 462 live and 0 tombstone cells [SharedPool-Worker-9] | 2015-10-09 15:54:02.070000 | 172.31.17.129 | 201640
 Request complete | 2015-10-09 15:54:02.111294 | 172.31.17.129 | 242294

This is when it was at 1600 ms:

Tracing session: 9c0ea900-6ec4-11e5-9493-9131aba66d63

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 99
 Preparing statement [SharedPool-Worker-149] | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 215
 Executing single-partition query on processinfometric_profile [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 507
 Acquiring sstable references [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 533
 Merging memtable tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.552000 | 172.31.17.129 | 551
 Key cache hit for sstable 302 [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 | 3875
 Seeking to partition beginning in data file [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 | 3902
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 | 4050
 Merging data from memtables and 1 sstables [SharedPool-Worker-164] | 2015-10-09 16:30:41.556000 | 172.31.17.129 | 4068
 Read 468 live and 0 tombstone cells [SharedPool-Worker-164] | 2015-10-09 16:30:43.269000 | 172.31.17.129 | 717277
 Request complete | 2015-10-09 16:30:43.307559 | 172.31.17.129 | 755559

The last one is now at 2300 ms:

Tracing session: 7d89a1e0-6ec6-11e5-9493-9131aba66d63

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2015-10-09 16:44:09.342000 | 172.31.17.129 | 0
 Parsing select * from processinfometric_profile where profilecontext='GENERIC' and id='1' and month='Oct' and day='' and hour='' and minute=''; [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 | 87
 Preparing statement [SharedPool-Worker-1] | 2015-10-09 16:44:09.342000 | 172.31.17.129 | 277
 Executing single-partition query on processinfometric_profile [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 | 456
 Acquiring sstable references [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 | 473
 Merging memtable tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.343000 | 172.31.17.129 | 485
 Key cache hit for sstable 328 [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 | 2851
 Seeking to partition beginning in data file [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 | 2869
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-2] | 2015-10-09 16:44:09.345000 | 172.31.17.129 | 3005
 Merging data from memtables and 1 sstables [SharedPool-Worker-2] | 2015-10-09 16:44:09.345001 | 172.31.17.129 | 3017
 Read 468 live and 0 tombstone cells [SharedPool-Worker-2] | 2015-10-09 16:44:11.329000 | 172.31.17.129 | 987011
 Request complete | 2015-10-09 16:44:11.388637 | 172.31.17.129 | 46637

On Oct 9, 2015, at 2:48 PM, Nazario Parsacala <dodongj...@gmail.com> wrote:

Compaction did not help either.

On Oct 9, 2015, at 1:01 PM, Nazario Parsacala <dodongj...@gmail.com> wrote:

So I upgraded to 2.2.2 and changed the compaction strategy from DateTieredCompactionStrategy to LeveledCompactionStrategy, but the problem still exists. At the start we were getting responses in around 80 ms to a couple of hundred ms, but after 1.5 hours of running it is now hitting 1447 ms. I think this will degrade further as time progresses. I will let it run a couple of hours more and will also try to force a compaction.

BTW, with 2.2.2 I am getting the following exceptions. Not sure if there is already a bug report on this.
Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]
    ... 17 common frames omitted
WARN  [SharedPool-Worker-42] 2015-10-09 12:54:57,221 AbstractTracingAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-42,5,main]: {}
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2187) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_60]
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.2.jar:2.2.2]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:250) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1558) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:75) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:246) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:270) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2004) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1808) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:360) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1537) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2183) ~[apache-cassandra-2.2.2.jar:2.2.2]
    ... 4 common frames omitted
Caused by: java.io.IOException: Seek position 182054 is not within mmap segment (seg offs: 0, length: 182054)
    at org.apache.cassandra.io.util.ByteBufferDataInput.seek(ByteBufferDataInput.java:47) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.AbstractDataInput.skipBytes(AbstractDataInput.java:33) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:405) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skipPromotedIndex(RowIndexEntry.java:164) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.db.RowIndexEntry$Serializer.skip(RowIndexEntry.java:155) ~[apache-cassandra-2.2.2.jar:2.2.2]
    at org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:244) ~[apache-cassandra-2.2.2.jar:2.2.2]

On Oct 9, 2015, at 9:26 AM, Carlos Alonso <i...@mrcalonso.com> wrote:

Yeah, I was about to suggest the compaction strategy too. Leveled compaction sounds like a better fit when records are being updated.

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 8 October 2015 at 22:35, Tyler Hobbs <ty...@datastax.com> wrote:

Upgrade to 2.2.2. Your sstables are probably not compacting due to CASSANDRA-10270 <https://issues.apache.org/jira/browse/CASSANDRA-10270>, which was fixed in 2.2.2.

Additionally, you may want to look into using leveled compaction (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).

On Thu, Oct 8, 2015 at 4:27 PM, Nazario Parsacala <dodongj...@gmail.com> wrote:

Hi,

So we are developing a system that computes profiles of the things it observes. The observations arrive in the form of events.
Each thing it observes has an id, and each thing contains a set of subthings, each carrying a measurement of some kind. There are roughly 500 subthings within each thing, and we receive events containing measurements of those 500 subthings every 10 seconds or so.

As we receive events, we read the old profile value, calculate the new profile based on the new value, and save it back. We use the following schema to hold the profile:

CREATE TABLE myprofile (
    id text,
    month text,
    day text,
    hour text,
    subthings text,
    lastvalue double,
    count int,
    stddev double,
    PRIMARY KEY ((id, month, day, hour), subthings)
) WITH CLUSTERING ORDER BY (subthings ASC);

This profile is then used for analytics, either in the context of the 'thing' as a whole or in the context of a specific thing and subthing.

A profile can be monthly, daily, or hourly. For a monthly profile, month is set to the current month (i.e. 'Oct') and day and hour are set to the empty string ''.

The problem we have observed is that over time (actually in just a matter of hours) we see a huge degradation in query response for the monthly profile. At the start it responds in 10-100 ms, and after a couple of hours it goes to 2000-3000 ms. If you leave it for a couple of days you start experiencing read timeouts. The query is basically just:

select * from myprofile where id='1' and month='Oct' and day='' and hour=''

This returns only about 500 rows or so.

I believe this is caused by the many updates made to this specific partition. What do we think can be done to resolve this?

BTW, I am using Cassandra 2.2.1, and since this is a test, it is just running on a single node.
--
Tyler Hobbs
DataStax <http://datastax.com/>
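[Editor's note on reading the traces in this thread: the per-step cost is the difference between consecutive source_elapsed values. Applying that mechanically to the first trace (numbers copied from the 566477c0 session above) shows that nearly all of the 949 ms sits between "Merging data from memtables and 2 sstables" and "Read 462 live and 0 tombstone cells" — i.e. in the merge/read of the heavily updated partition, not in parsing, seeking, or key-cache lookups:]

```python
# (step label, source_elapsed in microseconds), from the first trace above.
steps = [
    ("Execute CQL3 query", 0),
    ("Parsing statement", 101),
    ("Preparing statement", 334),
    ("Executing single-partition query", 692),
    ("Acquiring sstable references", 713),
    ("Merging memtable tombstones", 726),
    ("Key cache hit for sstable 209", 3143),
    ("Seeking to partition beginning", 3169),
    ("Key cache hit for sstable 208", 3691),
    ("Seeking to partition beginning", 3713),
    ("Skipped non-slice-intersecting sstables", 3807),
    ("Merging data from memtables and 2 sstables", 3818),
    ("Read 462 live and 0 tombstone cells", 910723),
    ("Request complete", 949251),
]

# Per-step cost = elapsed minus the previous step's elapsed.
deltas = [(name, t - steps[i - 1][1]) for i, (name, t) in enumerate(steps) if i]
worst = max(deltas, key=lambda p: p[1])
print(worst)  # ('Read 462 live and 0 tombstone cells', 906905)
```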