[ https://issues.apache.org/jira/browse/GIRAPH-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437152#comment-13437152 ]
Eli Reisman commented on GIRAPH-304:
------------------------------------
We routinely run jobs with a four-figure number of workers here without
problems. We see connection errors only after the Netty buffers, or worker
memory in general, have been overwhelmed, a worker has crashed, and it is
trying to restart. It is odd that you get this error at such a low number of
workers. How far into the job does this happen?
> Closed channels between workers
> -------------------------------
>
> Key: GIRAPH-304
> URL: https://issues.apache.org/jira/browse/GIRAPH-304
> Project: Giraph
> Issue Type: Bug
> Reporter: Alessandro Presta
> Assignee: Alessandro Presta
>
> With GIRAPH-300 we are able to complete jobs with higher numbers of workers
> thanks to retrying failed connections. However, we still observe
> ClosedChannelException with more than 100 workers.
> The patch also introduces a default TCP backlog of 100, so we should probably
> set this dynamically to equal the number of workers instead.
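
To make the backlog suggestion in the description concrete, here is a minimal sketch using only java.net.ServerSocket rather than Giraph's actual Netty server code. The class name BacklogSketch, the helpers backlogFor and openServer, and the choice to keep 100 as a floor are assumptions made for illustration; none of them come from the GIRAPH-300 patch.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical sketch: not Giraph code, names are illustrative only.
public class BacklogSketch {
    // The default backlog mentioned in the issue description.
    static final int DEFAULT_TCP_BACKLOG = 100;

    // Size the backlog from the worker count. Keeping the default as a
    // floor is an assumption; the issue only proposes "equal to the
    // number of workers".
    static int backlogFor(int numWorkers) {
        return Math.max(DEFAULT_TCP_BACKLOG, numWorkers);
    }

    // Bind a listening socket on an ephemeral port with the requested
    // backlog. The backlog is only a request; the OS may cap it.
    static ServerSocket openServer(int numWorkers) throws IOException {
        ServerSocket server = new ServerSocket();
        server.bind(new InetSocketAddress(0), backlogFor(numWorkers));
        return server;
    }

    public static void main(String[] args) throws IOException {
        int numWorkers = 1000;
        try (ServerSocket server = openServer(numWorkers)) {
            System.out.println("requested backlog=" + backlogFor(numWorkers)
                + " port=" + server.getLocalPort());
        }
    }
}
```

Note that the operating system may silently cap the requested value (on Linux, for example, by net.core.somaxconn), so a larger backlog in the application may need a matching kernel setting to take effect.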