[ 
https://issues.apache.org/jira/browse/ARROW-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893872#comment-16893872
 ] 

Wes McKinney commented on ARROW-6044:
-------------------------------------

We're passing through calls to libhdfs. It's possible that there is some 
resource leak, but I'm not sure where it would be. Maybe you can ask the Apache 
Hadoop community?
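
In the meantime, it might help to turn the silent hang into a visible error so
that you can capture when it happens and what the process was doing. Below is a
minimal sketch, not anything provided by pyarrow: the helper name
connect_with_timeout and its timeout_s parameter are made up for illustration.
It runs the connect call in a daemon thread and raises if no result arrives in
time. It does not cancel the stuck call; that thread would presumably stay
blocked inside libhdfs.

```python
import queue
import threading


def connect_with_timeout(connect, timeout_s=30.0):
    """Call connect() in a daemon thread; raise TimeoutError if it hangs.

    Hypothetical helper for illustration only. `connect` is any zero-argument
    callable that returns a connection object.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        # Forward either the return value or the exception to the caller.
        try:
            results.put((True, connect()))
        except BaseException as exc:
            results.put((False, exc))

    # Daemon thread: a call that never returns will not block interpreter exit.
    threading.Thread(target=worker, daemon=True).start()

    try:
        ok, value = results.get(timeout=timeout_s)
    except queue.Empty:
        # The worker is still stuck; we only stop waiting for it.
        raise TimeoutError(
            "connect did not return within %.1f s" % timeout_s
        ) from None

    if not ok:
        raise value
    return value
```

With the snippet from the report, this could be used as
connect_with_timeout(lambda: pyarrow.hdfs.connect(host, port, user=user)),
where host, port and user stand in for your own settings. Logging the
TimeoutError with a timestamp should make it easier to line the hang up with
the NameNode logs when you ask the Hadoop community.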

> Pyarrow HDFS client gets hung after a while
> -------------------------------------------
>
>                 Key: ARROW-6044
>                 URL: https://issues.apache.org/jira/browse/ARROW-6044
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.13.0
>         Environment: hadoop-3.0.3
> driver='libhdfs'
> python 3.6
> Centos7
>            Reporter: Fred Tzeng
>            Priority: Major
>
> I'm using the pyarrow HDFS client in a long-running (forever) app that opens 
> a connection to HDFS as each external request comes in and destroys the 
> connection as soon as the request is handled. This happens many times on 
> separate threads and everything works great.
> The problem is that after the app idles for a while (perhaps hours) with no 
> HDFS connections made during that time, the next call to hdfs.connect(...) 
> just hangs. No exceptions are thrown.
> Code snippet showing how I instantiate each connection:
> import pyarrow
> ...
> hdfs = pyarrow.hdfs.connect(self.hdfs_authority, self.hdfs_port,
>                             user=self.hdfs_user)
> try:
>     pass  # Do something
> finally:
>     hdfs.close()  # call the method; a bare `hdfs.close` does nothing
>
> Any help on what might be causing these hangs is appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
