qwedsazzcc commented on pull request #1680:
URL: https://github.com/apache/zookeeper/pull/1680#issuecomment-876852544
> Hi @qwedsazzcc,
>
> Thank you for your investigation and contribution. I understand what you
are reporting, and imagine that your patch effectively makes the 100% CPU usage
situation disappear—but unless I am missing something, it does not address the
root cause.
>
> Unfortunately, I have been unable to reproduce the specific sequence you
are observing.
>
> According to POSIX, `POLLNVAL` means "file descriptor not open," and while
I could imagine some code closing the FD which is being polled, it is difficult
to imagine how it would happen more than once and/or lead to a busy loop.
>
> In your experience, is the problem deterministic? Is it "easy" to
reproduce? Also: would you have log entries or some other kind of trace which
might help figure out the exact conditions?
>
> As for the error condition(s), which should indeed be handled, how about
the following:
>
> ```diff
> --- a/zookeeper-client/zookeeper-client-c/src/mt_adaptor.c
> +++ b/zookeeper-client/zookeeper-client-c/src/mt_adaptor.c
> @@ -388,7 +388,7 @@ void *do_io(void *v)
> timeout=tv.tv_sec * 1000 + (tv.tv_usec/1000);
>
> poll(fds,maxfd,timeout);
> - if (fd != -1) {
> + if (fd != -1 && (fds[1].revents&POLLNVAL) == 0) {
> interest=(fds[1].revents&POLLIN)?ZOOKEEPER_READ:0;
> interest|=((fds[1].revents&POLLOUT)||(fds[1].revents&POLLHUP))?ZOOKEEPER_WRITE:0;
> }
> ```
>
> Not resetting `interest` should lead to `check_events` accessing the
socket and noticing the closed state. Or does such a patch still cause 100% CPU
usage?
>
> Cheers, -D
The error log on the console is:
```
2021-07-09 01:31:17,352:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,374:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,385:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,417:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,432:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,442:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,458:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,476:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,492:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,509:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,521:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,538:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,552:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,570:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,587:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,604:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,621:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,632:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,647:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,658:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,674:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,683:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,702:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,717:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,735:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,750:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,760:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,778:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,793:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,813:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,831:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,841:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,858:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,873:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,892:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,905:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,929:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
2021-07-09 01:31:17,938:5615(0x7fa0f9ffb700):ZOO_ERROR@resolve_hosts@947:
getaddrinfo: Name or service not known
```
`getaddrinfo` is being called too frequently, which causes the high CPU usage.
To reproduce: I run a ZooKeeper server locally and add an entry to the hosts
file (`127.0.0.1 zookeeper`). My process connects to ZooKeeper at
`zookeeper:2181`. I then remove the hosts entry and stop the ZooKeeper server
immediately afterwards. This may reproduce the problem.
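The log above shows a failed `getaddrinfo` call roughly every 10 to 30 ms. As an illustration only, here is a minimal sketch of how such retries could be throttled with a capped exponential backoff. The helper names `backoff_ms` and `resolve_with_backoff` and their parameters are hypothetical; this is not the C client's actual code and not the change in this PR.

```c
#define _POSIX_C_SOURCE 200809L

#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: delay before retry number `attempt` (0-based),
 * doubling from base_ms and capped at max_ms. */
long backoff_ms(int attempt, long base_ms, long max_ms)
{
    long delay = base_ms;
    for (int i = 0; i < attempt && delay < max_ms; i++)
        delay *= 2;
    return delay < max_ms ? delay : max_ms;
}

/* Hypothetical helper: resolve host:port, sleeping between failed
 * attempts instead of retrying immediately. Returns 0 on success
 * (caller frees *res with freeaddrinfo), otherwise the last
 * getaddrinfo error code. */
int resolve_with_backoff(const char *host, const char *port,
                         int max_attempts, long base_ms, long max_ms,
                         struct addrinfo **res)
{
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    int rc = EAI_FAIL;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = getaddrinfo(host, port, &hints, res);
        if (rc == 0)
            return 0;
        if (attempt + 1 < max_attempts) {
            /* Wait before the next attempt so a missing DNS entry
             * does not turn into a tight retry loop. */
            long ms = backoff_ms(attempt, base_ms, max_ms);
            struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return rc;
}
```

With `base_ms = 100` and `max_ms = 5000`, the waits between attempts would be 100, 200, 400 ms and so on, up to 5 seconds, rather than a few milliseconds.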
I tried the suggested check:
```
if (fd != -1 && (fds[1].revents&POLLNVAL) == 0) {
interest=(fds[1].revents&POLLIN)?ZOOKEEPER_READ:0;
interest|=((fds[1].revents&POLLOUT)||(fds[1].revents&POLLHUP))?ZOOKEEPER_WRITE:0;
}
```
It didn't solve my problem.
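For reference, the condition that the `POLLNVAL` check guards against can be observed in isolation. The sketch below assumes a plain POSIX system; the function name `revents_for_closed_fd` is hypothetical and is not part of the ZooKeeper client. It polls a descriptor that has already been closed: `poll` returns at once with `POLLNVAL` set instead of waiting for the timeout, so a loop that keeps polling a stale descriptor would spin.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: returns the revents that poll() reports for a
 * descriptor that has already been closed, or -1 if setup failed. */
int revents_for_closed_fd(int timeout_ms)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    /* Close both ends so that p[0] no longer refers to an open file. */
    close(p[0]);
    close(p[1]);

    struct pollfd pfd = { .fd = p[0], .events = POLLIN, .revents = 0 };
    if (poll(&pfd, 1, timeout_ms) < 0)
        return -1;

    /* POLLNVAL is reported even though it was not requested in events. */
    return pfd.revents;
}
```

In my case the log points at the `getaddrinfo` retries rather than at this condition, which would be consistent with the check not changing the CPU usage.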
Thanks for your reply