[jira] [Updated] (TS-2089) one log collation thread is not enough for busy system

Yunkai Zhang (JIRA) Thu, 15 Aug 2013 00:44:34 -0700

     [ 
https://issues.apache.org/jira/browse/TS-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Yunkai Zhang updated TS-2089:
-----------------------------

    Description: 
we should make log collation more thread & cpu nice.

 
{code}
top - 22:34:22 up  3:37,  1 user,  load average: 1.20, 1.71, 1.66
Tasks: 979 total,   2 running, 977 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.5%us,  0.6%sy,  0.0%ni, 95.6%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:  65943744k total, 62536348k used,  3407396k free,    90544k buffers
Swap:  2097144k total,        0k used,  2097144k free, 54189788k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4651 ats       20   0 12.9g 5.3g 4928 R 100.1  8.5 197:51.78 [LOGGING]
 4460 ats       20   0 12.9g 5.3g 4928 S  7.9  8.5  18:12.57 [ET_NET 44]
 4457 ats       20   0 12.9g 5.3g 4928 S  5.3  8.5   7:32.34 [ET_NET 41]
 4461 ats       20   0 12.9g 5.3g 4928 S  4.3  8.5   6:41.79 [ET_NET 45]
 4453 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   6:44.52 [ET_NET 37]
 4463 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   7:54.01 [ET_NET 47]
 3652 root      39  19     0    0    0 S  1.0  0.0   2:14.15 kipmi0
 4425 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.21 [ET_NET 9]
 4426 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.42 [ET_NET 10]
 4430 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.55 [ET_NET 14]
  132 root      20   0     0    0    0 S  0.3  0.0   0:00.82 events/1
 4417 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.82 [ET_NET 1]
 4418 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:20.02 [ET_NET 2]
 4420 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.64 [ET_NET 4]
 4422 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.84 [ET_NET 6]
 4427 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.94 [ET_NET 11]
 4435 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.88 [ET_NET 19]
{code}

== Update ===

I have developed a patch to fix this issue. After I rebase it to master branch, 
I'll attach it to here. Here is the description of my patch:

---------------
We found that CPU of logging thread could be easy to reach up 100% in
collation host, but disk IO was low at the same time.

The bottleneck of logging thread is that some preprocessing job, such as
convert LogBuffer to ascii text, consume so much CPU time. And more
worse, the write() operation will block logging thread.

So this patch try to split logging thread into two parts:
1) Configurable preproc threads, which are responsiable for processing all
   of prepare work, and then forward the preprocessed buffer to flush thread,
   or send them to CollationClient/HostSM.

2) One Flush thread, it will consume preprocessed buffers and write them to
   disk. In our testing, one flush thread is enough for us.

TODO: This patch supports only one flush thread, we can improve it to
      "one flush thread per file/disk" in the future.

== How to configure ==
The number of preproc threads is 1 by default.

Please modify "proxy.config.log.collation_preproc_threads" option to
change it.
---------------

======
After apply this patch, let's have a look the 'top' output:
{code}
[[email protected] core]# top -H -p 3984 -b -n 1
top - 15:39:06 up 17:23,  6 users,  load average: 3.37, 3.09, 3.15
Tasks: 246 total,   6 running, 240 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.4%us,  0.5%sy,  0.0%ni, 97.6%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65943744k total, 65763084k used,   180660k free,   136688k buffers
Swap:  2097144k total,        0k used,  2097144k free, 57275428k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4237 ats       20   0 8941m 5.2g 5148 R 39.5  8.3  80:46.23 [LOG_PREPROC 2]
 4236 ats       20   0 8941m 5.2g 5148 R 37.5  8.3  80:46.53 [LOG_PREPROC 1]
 4235 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.71 [LOG_PREPROC 0]
 4238 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:48.37 [LOG_PREPROC 3]
 4239 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.58 [LOG_PREPROC 4]
 4240 ats       20   0 8941m 5.2g 5148 R 11.9  8.3  20:45.28 [LOG_FLUSH]
 4037 ats       20   0 8941m 5.2g 5148 S  7.9  8.3  15:06.60 [ET_NET 38]
 4045 ats       20   0 8941m 5.2g 5148 S  5.9  8.3   9:55.74 [ET_NET 46]
 4038 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   6:08.97 [ET_NET 39]
 4042 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   8:36.73 [ET_NET 43]
...
{code}

[LOG_PREPROC 0-4] are collation preproc threads, [LOG_FLUSH] is new flush 
thread. They have good throughput now.

  was:
we should make log collation more thread & cpu nice.

 
{code}
top - 22:34:22 up  3:37,  1 user,  load average: 1.20, 1.71, 1.66
Tasks: 979 total,   2 running, 977 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.5%us,  0.6%sy,  0.0%ni, 95.6%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:  65943744k total, 62536348k used,  3407396k free,    90544k buffers
Swap:  2097144k total,        0k used,  2097144k free, 54189788k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4651 ats       20   0 12.9g 5.3g 4928 R 100.1  8.5 197:51.78 [LOGGING]
 4460 ats       20   0 12.9g 5.3g 4928 S  7.9  8.5  18:12.57 [ET_NET 44]
 4457 ats       20   0 12.9g 5.3g 4928 S  5.3  8.5   7:32.34 [ET_NET 41]
 4461 ats       20   0 12.9g 5.3g 4928 S  4.3  8.5   6:41.79 [ET_NET 45]
 4453 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   6:44.52 [ET_NET 37]
 4463 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   7:54.01 [ET_NET 47]
 3652 root      39  19     0    0    0 S  1.0  0.0   2:14.15 kipmi0
 4425 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.21 [ET_NET 9]
 4426 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.42 [ET_NET 10]
 4430 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.55 [ET_NET 14]
  132 root      20   0     0    0    0 S  0.3  0.0   0:00.82 events/1
 4417 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.82 [ET_NET 1]
 4418 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:20.02 [ET_NET 2]
 4420 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.64 [ET_NET 4]
 4422 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.84 [ET_NET 6]
 4427 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.94 [ET_NET 11]
 4435 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.88 [ET_NET 19]
{code}

== Update ===

I have developed a patch to fix this issue. After I rebase it to master branch, 
I'll attach it to here. Here is the description of my patch:

---------------
We found that CPU of logging thread could be easy to reach up 100% in
collation host, but disk IO was low at the same time.

The bottleneck of logging thread is that some preprocessing job, such as
convert LogBuffer to ascii text, consume so much CPU time. And more
worse, the write() operation will block logging thread.

So this patch try to split logging thread into two parts:
1) Configurable preproc threads, which are responsiable for processing all
   of prepare work, and then forward the preprocessed buffer to flush thread,
   or send them to CollationClient/HostSM.

2) One Flush thread, it will consume preprocessed buffers and write them to
   disk. In our testing, one flush thread is enough for us.

TODO: This patch supports only one flush thread, we can improve it to
      "one flush thread per file/disk" in the future.

== How to configure ==
The number of preproc threads is 1 by default.

Please modify "proxy.config.log.collation_preproc_threads" option to
change it.
---------------

======
After apply this patch, let's have a look the 'top' output:
{code}
[[email protected] core]# top -H -p 3984 -b -n 1
top - 15:39:06 up 17:23,  6 users,  load average: 3.37, 3.09, 3.15
Tasks: 246 total,   6 running, 240 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.4%us,  0.5%sy,  0.0%ni, 97.6%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  65943744k total, 65763084k used,   180660k free,   136688k buffers
Swap:  2097144k total,        0k used,  2097144k free, 57275428k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4237 ats       20   0 8941m 5.2g 5148 R 39.5  8.3  80:46.23 [LOG_PREPROC 2]
 4236 ats       20   0 8941m 5.2g 5148 R 37.5  8.3  80:46.53 [LOG_PREPROC 1]
 4235 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.71 [LOG_PREPROC 0]
 4238 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:48.37 [LOG_PREPROC 3]
 4239 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.58 [LOG_PREPROC 4]
 4240 ats       20   0 8941m 5.2g 5148 R 11.9  8.3  20:45.28 [LOG_FLUSH]
 4037 ats       20   0 8941m 5.2g 5148 S  7.9  8.3  15:06.60 [ET_NET 38]
 4045 ats       20   0 8941m 5.2g 5148 S  5.9  8.3   9:55.74 [ET_NET 46]
 4038 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   6:08.97 [ET_NET 39]
 4042 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   8:36.73 [ET_NET 43]
...
{code}

[LOG_PREPROC 0-4] are collation preproc threads, [LOG_FLUSH] are new flush 
threads. They have good throughput now.

    
> one log collation thread is not enough for busy system
> ------------------------------------------------------
>
>                 Key: TS-2089
>                 URL: https://issues.apache.org/jira/browse/TS-2089
>             Project: Traffic Server
>          Issue Type: Improvement
>          Components: Logging
>            Reporter: Zhao Yongming
>             Fix For: 3.5.2
>
>
> we should make log collation more thread & cpu nice.
>  
> {code}
> top - 22:34:22 up  3:37,  1 user,  load average: 1.20, 1.71, 1.66
> Tasks: 979 total,   2 running, 977 sleeping,   0 stopped,   0 zombie
> Cpu(s):  3.5%us,  0.6%sy,  0.0%ni, 95.6%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Mem:  65943744k total, 62536348k used,  3407396k free,    90544k buffers
> Swap:  2097144k total,        0k used,  2097144k free, 54189788k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4651 ats       20   0 12.9g 5.3g 4928 R 100.1  8.5 197:51.78 [LOGGING]
>  4460 ats       20   0 12.9g 5.3g 4928 S  7.9  8.5  18:12.57 [ET_NET 44]
>  4457 ats       20   0 12.9g 5.3g 4928 S  5.3  8.5   7:32.34 [ET_NET 41]
>  4461 ats       20   0 12.9g 5.3g 4928 S  4.3  8.5   6:41.79 [ET_NET 45]
>  4453 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   6:44.52 [ET_NET 37]
>  4463 ats       20   0 12.9g 5.3g 4928 S  3.0  8.5   7:54.01 [ET_NET 47]
>  3652 root      39  19     0    0    0 S  1.0  0.0   2:14.15 kipmi0
>  4425 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.21 [ET_NET 9]
>  4426 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.42 [ET_NET 10]
>  4430 ats       20   0 12.9g 5.3g 4928 S  0.7  8.5   0:19.55 [ET_NET 14]
>   132 root      20   0     0    0    0 S  0.3  0.0   0:00.82 events/1
>  4417 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.82 [ET_NET 1]
>  4418 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:20.02 [ET_NET 2]
>  4420 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.64 [ET_NET 4]
>  4422 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.84 [ET_NET 6]
>  4427 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.94 [ET_NET 11]
>  4435 ats       20   0 12.9g 5.3g 4928 S  0.3  8.5   0:19.88 [ET_NET 19]
> {code}
> == Update ===
> I have developed a patch to fix this issue. After I rebase it to master 
> branch, I'll attach it to here. Here is the description of my patch:
> ---------------
> We found that CPU of logging thread could be easy to reach up 100% in
> collation host, but disk IO was low at the same time.
> The bottleneck of logging thread is that some preprocessing job, such as
> convert LogBuffer to ascii text, consume so much CPU time. And more
> worse, the write() operation will block logging thread.
> So this patch try to split logging thread into two parts:
> 1) Configurable preproc threads, which are responsiable for processing all
>    of prepare work, and then forward the preprocessed buffer to flush thread,
>    or send them to CollationClient/HostSM.
> 2) One Flush thread, it will consume preprocessed buffers and write them to
>    disk. In our testing, one flush thread is enough for us.
> TODO: This patch supports only one flush thread, we can improve it to
>       "one flush thread per file/disk" in the future.
> == How to configure ==
> The number of preproc threads is 1 by default.
> Please modify "proxy.config.log.collation_preproc_threads" option to
> change it.
> ---------------
> ======
> After apply this patch, let's have a look the 'top' output:
> {code}
> [[email protected] core]# top -H -p 3984 -b -n 1
> top - 15:39:06 up 17:23,  6 users,  load average: 3.37, 3.09, 3.15
> Tasks: 246 total,   6 running, 240 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.4%us,  0.5%sy,  0.0%ni, 97.6%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  65943744k total, 65763084k used,   180660k free,   136688k buffers
> Swap:  2097144k total,        0k used,  2097144k free, 57275428k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4237 ats       20   0 8941m 5.2g 5148 R 39.5  8.3  80:46.23 [LOG_PREPROC 2]
>  4236 ats       20   0 8941m 5.2g 5148 R 37.5  8.3  80:46.53 [LOG_PREPROC 1]
>  4235 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.71 [LOG_PREPROC 0]
>  4238 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:48.37 [LOG_PREPROC 3]
>  4239 ats       20   0 8941m 5.2g 5148 R 35.6  8.3  80:46.58 [LOG_PREPROC 4]
>  4240 ats       20   0 8941m 5.2g 5148 R 11.9  8.3  20:45.28 [LOG_FLUSH]
>  4037 ats       20   0 8941m 5.2g 5148 S  7.9  8.3  15:06.60 [ET_NET 38]
>  4045 ats       20   0 8941m 5.2g 5148 S  5.9  8.3   9:55.74 [ET_NET 46]
>  4038 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   6:08.97 [ET_NET 39]
>  4042 ats       20   0 8941m 5.2g 5148 S  4.0  8.3   8:36.73 [ET_NET 43]
> ...
> {code}
> [LOG_PREPROC 0-4] are collation preproc threads, [LOG_FLUSH] is new flush 
> thread. They have good throughput now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (TS-2089) one log collation thread is not enough for busy system

Reply via email to