[ https://issues.apache.org/jira/browse/HADOOP-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12659108#action_12659108 ]
Craig Macdonald commented on HADOOP-4932:
-----------------------------------------

Dima,

It's not the case that a connection is made for each FS call. Instead, in Java, Hadoop caches file systems for each user that uses the file system, so the handles returned by libhdfs should all share the same underlying FileSystem object. Did your fuse_dfs process grow to be very large in size?

It would, however, be interesting to identify what's happening to fuse_dfs in this case. Are there corresponding error messages in your namenode/datanode(s), or can you increase the logging level of Hadoop to DEBUG?
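For illustration, a minimal sketch of the caching behaviour described above. The FsCacheCheck class and its main() wrapper here are mine, but FileSystem.get() is the standard Hadoop API, which consults an internal cache keyed by the filesystem URI and the calling user:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class FsCacheCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // FileSystem.get() looks up an internal cache keyed by the URI's
            // scheme/authority and the calling user, so a second call with the
            // same configuration returns the same instance rather than opening
            // a new connection to the namenode.
            FileSystem fs1 = FileSystem.get(conf);
            FileSystem fs2 = FileSystem.get(conf);
            System.out.println("same instance: " + (fs1 == fs2)); // expected: true
        }
    }

As for raising the log level: one way, assuming the stock 0.19 layout, is to change the hadoop.root.logger default in conf/log4j.properties and restart the daemons (note that the bin/ scripts may override this via the HADOOP_ROOT_LOGGER environment variable):

    # conf/log4j.properties -- raise the default level from INFO to DEBUG
    hadoop.root.logger=DEBUG,console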
> fuse_dfs is unable to connect to the dfs after copying a large number of files into the dfs over fuse
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4932
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4932
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fuse-dfs
>    Affects Versions: 0.19.0
>         Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode)
>            Reporter: Dima Brodsky
>
> I run the following test:
> 1. Run hadoop DFS in single node mode
> 2. Start up fuse_dfs
> 3. Copy my source tree, about 250 megs, into the DFS:
>    cp -av * /mnt/hdfs/
>
> In /var/log/messages I keep seeing:
> Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739
> and then eventually
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> and the file system hangs. Hadoop is still running and I don't see any errors in its logs. I have to unmount the dfs and restart fuse_dfs, and then everything is fine again. At some point I see the following messages in /var/log/messages:
> ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464
> Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464
> Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8138-93825070138960-1229251587.log fuse_dfs.c:1464
>
> Is this a known issue? Am I just flooding the system too much? All of this is being performed on a single, dual-core machine.
>
> Thanks!
> ttyl
> Dima

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.