[ 
https://issues.apache.org/jira/browse/GIRAPH-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014247#comment-14014247
 ] 

Jaeho Shin commented on GIRAPH-904:
-----------------------------------

The exact code where Giraph hangs is 
{{BspServiceMaster#barrierOnWorkerList()}}.  It keeps printing log lines 
starting with {{"barrierOnWorkerList: Waiting on "}}, listing hostnames that 
have definitely finished.  For example, when launched with a single worker, 
it waits for itself after finishing the loading.

On a second look, I think a more precise fix would be to do the lowercase 
normalization in that function when computing the {{hostnameIdList}}, rather 
than in {{GiraphConfiguration#getLocalHostname()}}.  The case 
difference between that and the hostname returned by 
{{TaskInfo#getHostnameId()}} is causing the set difference computation in the 
while loop to fail.
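
To illustrate the failure mode, here is a minimal standalone sketch rather 
than the actual Giraph code.  The class {{HostnameBarrierSketch}}, the method 
{{pending()}}, and the {{_0}} suffix on the hostname ids are all hypothetical 
names made up for this example.  It only assumes that the barrier removes the 
ids of finished workers from the set of expected ids and keeps waiting while 
anything is left over.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not Giraph code: models the barrier's set difference.
public class HostnameBarrierSketch {

    /**
     * Returns the ids the barrier would still wait on: expected minus finished.
     * When normalize is true, both sides are lowercased before comparing.
     */
    public static Set<String> pending(List<String> expected,
                                      List<String> finished,
                                      boolean normalize) {
        Set<String> remaining = new TreeSet<>();
        for (String id : expected) {
            remaining.add(normalize ? id.toLowerCase(Locale.ROOT) : id);
        }
        for (String id : finished) {
            remaining.remove(normalize ? id.toLowerCase(Locale.ROOT) : id);
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Same worker, reported with different case by two code paths.
        // The "_0" suffix is an illustrative id format, not Giraph's.
        List<String> expected = Arrays.asList("foo.Stanford.EDU_0");
        List<String> finished = Arrays.asList("foo.stanford.edu_0");

        // Without normalization the difference never becomes empty.
        System.out.println("raw:        " + pending(expected, finished, false));
        // With normalization the worker is recognized as finished.
        System.out.println("normalized: " + pending(expected, finished, true));
    }
}
```

Using {{Locale.ROOT}} keeps the lowercasing independent of the JVM's default 
locale, which seems safer for hostnames than the no-argument 
{{toLowerCase()}}.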

> Giraph can hang when hostnames include uppercase letters
> --------------------------------------------------------
>
>                 Key: GIRAPH-904
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-904
>             Project: Giraph
>          Issue Type: Bug
>          Components: bsp, conf and scripts, zookeeper
>    Affects Versions: 1.1.0
>            Reporter: Jaeho Shin
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-904.patch
>
>
> We found that Giraph jobs were consistently hanging if uppercase letters were 
> included in the DNS (or /etc/hosts) resolved hostnames ({{foo.stanford.edu}} 
> vs. {{foo.Stanford.EDU}} from our DNS).  Normalizing the hostnames to lower 
> case from {{GiraphConfiguration#getLocalHostname()}} fixed our problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
