Hi Alain,

Thank you for your response.
> - Other than the 'lock', Counters perform an implicit read before the write
> operation.

From what I know, there is one counter cache[1] that is used to read the old
values of the counters. According to [2], it is used only for UPDATE requests.

> I would say what you are seeing is expected with this use case. Also, I have
> never seen a use case where using RF = 1 is good idea (excepted for some
> testing maybe). Be aware this data is weak and can easily be lost (if it's a
> deliberate choice, ignore my comment). On the bright side, you have no
> entropy / consistency issues or need for repairs with RF = 1 :D.

Yes, RF=1 is indeed our deliberate choice (basically because we did not manage
to scale the counter writes very well, and we accepted that we can lose some
data).

[1]https://apache.googlesource.com/cassandra/+/refs/heads/trunk/src/java/org/apache/cassandra/db/CounterMutation.java#193
[2]https://issues.apache.org/jira/browse/CASSANDRA-12500?focusedCommentId=15464023&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15464023

2018-01-18 12:51 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
> Hello Octavian,
>
>>
>> I have a counter table(RF=1)
>>
>> SELECT vs UPDATE requests ratio is 0.001. ( Read Count: 3771000, Write
>> Count: 3401236000, in one month)
>
>
>> The problem is that our read rate limit on our hard-disk is always near
>> 30MBps and our write rate limit is near 500KBps.
>
>
> I did not read all your numbers, but here are the internal details you could
> be missing:
>
> - Other than the 'lock', Counters perform an implicit read before the write
> operation. To increment, you need to know about past value. It was true last
> time I used them, I believe there is no real workaround and it's still the
> case today.
> - Writes do not hit the disk synchronously.
> Instead of this, they are stored
> in the Memtable and only flushed once, sequentially and efficiently. Then
> compactions manages to merge partitions after, asynchronously.
>
> I would say what you are seeing is expected with this use case. Also, I have
> never seen a use case where using RF = 1 is good idea (excepted for some
> testing maybe). Be aware this data is weak and can easily be lost (if it's a
> deliberate choice, ignore my comment). On the bright side, you have no
> entropy / consistency issues or need for repairs with RF = 1 :D.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2018-01-17 17:40 GMT+00:00 Octavian Rinciog <octavian.rinc...@gmail.com>:
>>
>> Hello!
>>
>> I am using Cassandra 3.10, on Ubuntu 14.04 and I have a counter
>> table(RF=1), with the following schema:
>>
>> CREATE TABLE edges (
>>     src_id text,
>>     src_type text,
>>     source text,
>>     weight counter,
>>     PRIMARY KEY ((src_id, src_type), source)
>> ) WITH
>> compaction = {'class':
>> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>> 'max_threshold': '32', 'min_threshold': '4'}
>>
>> SELECT vs UPDATE requests ratio is 0.001. ( Read Count: 3771000, Write
>> Count: 3401236000, in one month)
>>
>> We have Counter Cache enabled:
>>
>> Counter Cache : entries 1018782, size 256 MiB, capacity 256
>> MiB, 2799913189 hits, 3469459479 requests, 0.807 recent hit rate, 7200
>> save period in seconds
>>
>> The problem is that our read rate limit on our hard-disk is always
>> near 30MBps and our write rate limit is near 500KBps.
>>
>> One example of output of "iostat -x" is
>>
>> Device:  rrqm/s  wrqm/s     r/s   w/s     rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>> sdb        0.06    1.04  263.65  2.04  28832.42  572.53    146.07      0.36   1.35     0.74    81.16   1.27  33.81
>>
>> Also with iotop, we saw that are about 8 threads that each goes around
>> 3MB/s read rate.
>>
>> Total DISK READ : 22.73 M/s | Total DISK WRITE : 494.35 K/s
>> Actual DISK READ: 22.62 M/s | Actual DISK WRITE: 528.57 K/s
>>   TID  PRIO  USER       DISK READ>  DISK WRITE  SWAPIN      IO  COMMAND
>> 14793  be/4  cassandra  3.061 M/s   0.0010 B/s  0.00 %  93.27 %  java -Dcassandra.fd_max_interval_ms=400
>>
>> The output of strace on these threads is :
>>
>> strace -cp 14793
>> Process 14793 attached
>> ^CProcess 14793 detached
>> % time     seconds  usecs/call     calls    errors syscall
>> ------ ----------- ----------- --------- --------- ----------------
>>  99.85   32.118518          57    567288    256251 futex
>>   0.15    0.048822           3     15339           write
>>   0.00    0.000000           0         1           rt_sigreturn
>> ------ ----------- ----------- --------- --------- ----------------
>> 100.00   32.167340                582628    256251 total
>>
>>
>> Despite that iotop shows that this thread is reading with 3MB/s, there
>> is no read syscall in strace.
>>
>> I want to ask if actually the futex is responsible for the read rate
>> and how can we debug this problem further ?
>>
>> Btw, there are no compaction tasks in progress and there are no SELECT
>> queries in progress.
>>
>> Also, I know that for each update, a lock is obtained[1]
>>
>> Thank you,
>>
>>
>> [1]https://apache.googlesource.com/cassandra/+/refs/heads/trunk/src/java/org/apache/cassandra/db/CounterMutation.java#121
>> --
>> Octavian Rinciog
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>

--
Octavian Rinciog

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
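
As a rough sanity check on the figures quoted in this thread, the Python sketch below estimates how many disk reads per second the counter cache misses alone could generate, and compares that with the iostat sample. It is only a back-of-envelope calculation. Two things are assumptions that nobody in the thread stated: that the month has 30 days, and that each counter cache miss costs roughly one disk read request. The function and variable names are made up for this illustration.

```python
# Back-of-envelope check built only from figures quoted in this thread.
# Assumptions (NOT from the thread): a 30-day month, and roughly one disk
# read request per counter cache miss.

SECONDS_PER_MONTH = 30 * 24 * 3600   # assumed 30-day month

WRITES_PER_MONTH = 3_401_236_000     # "Write Count" from the original post
CACHE_HITS = 2_799_913_189           # counter cache hits
CACHE_REQUESTS = 3_469_459_479       # counter cache requests
IOSTAT_READS_PER_SEC = 263.65        # r/s for sdb in the iostat sample
IOSTAT_READ_KB_PER_SEC = 28832.42    # rkB/s for sdb in the iostat sample


def estimate_miss_reads_per_sec(writes, hits, requests, seconds):
    """Return (hit_rate, writes_per_sec, estimated_reads_per_sec)."""
    hit_rate = hits / requests
    writes_per_sec = writes / seconds
    # Each cache miss is assumed to trigger about one disk read request.
    return hit_rate, writes_per_sec, writes_per_sec * (1.0 - hit_rate)


hit_rate, writes_per_sec, est_reads_per_sec = estimate_miss_reads_per_sec(
    WRITES_PER_MONTH, CACHE_HITS, CACHE_REQUESTS, SECONDS_PER_MONTH
)
avg_kb_per_read = IOSTAT_READ_KB_PER_SEC / IOSTAT_READS_PER_SEC

print(f"counter cache hit rate:        {hit_rate:.3f}")
print(f"counter writes per second:     {writes_per_sec:.1f}")
print(f"estimated miss reads per sec:  {est_reads_per_sec:.1f}")
print(f"iostat reads per second:       {IOSTAT_READS_PER_SEC:.2f}")
print(f"average KB per read request:   {avg_kb_per_read:.1f}")
```

With these inputs the estimate comes out at roughly 250 reads per second, in the same range as the 263.65 r/s that iostat reports, so the numbers are at least consistent with the read-before-write explanation. This is not proof: the cache counters and the monthly write count may not cover the same period, and the one-read-per-miss assumption is a simplification.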