[ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037734#comment-13037734 ]
Brock Noland commented on HDFS-420:
-----------------------------------
It looks like Hudson could not apply the patch because of the directory
structure. I cannot speak to the patch's correctness, but I applied it manually
and saw a great improvement in memory use. Without the patch, setting -Xmx128m
caused a repeatable OOM after about 5 minutes of reading about 1GB of data from
800 files. With the patch, my test has been running for an hour without an OOM.
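
For anyone who wants to repeat a read test of this kind, a minimal sketch
follows. It is not the script behind the numbers above. The function names,
the mount point and the chunk size are all assumptions for illustration; it
simply reads every file under a directory (for example a fuse_dfs mount) a
given number of times so that memory growth in the fuse_dfs process can be
watched from outside.

```python
import os


def read_tree(root, chunk_size=1 << 20):
    """Read every file under root once; return (files_read, bytes_read)."""
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                # Read in fixed-size chunks so the test itself stays small
                # in memory, whatever the file size.
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    bytes_read += len(chunk)
            files_read += 1
    return files_read, bytes_read


def run_passes(root, passes, chunk_size=1 << 20):
    """Repeat read_tree `passes` times; return cumulative totals."""
    total_files = 0
    total_bytes = 0
    for _ in range(passes):
        files_read, bytes_read = read_tree(root, chunk_size)
        total_files += files_read
        total_bytes += bytes_read
    return total_files, total_bytes
```

Calling run_passes("/mnt/hdfs", 10) would read the whole mount ten times; the
path here is only the mount point used in the issue description, so adjust it
to the local setup.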
> fuse_dfs is unable to connect to the dfs after a copying a large number of
> files into the dfs over fuse
> -------------------------------------------------------------------------------------------------------
>
> Key: HDFS-420
> URL: https://issues.apache.org/jira/browse/HDFS-420
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: contrib/fuse-dfs
> Affects Versions: 0.20.2
> Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP
> (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64)
> Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build
> 10.0-b19, mixed mode)
> Reporter: Dima Brodsky
> Assignee: Brian Bockelman
> Fix For: 0.20.3
>
> Attachments: fuse_dfs_020_memleaks.patch,
> fuse_dfs_020_memleaks_v3.patch, fuse_dfs_020_memleaks_v8.patch
>
>
> I run the following test:
> 1. Run hadoop DFS in single node mode
> 2. start up fuse_dfs
> 3. copy my source tree, about 250 megs, into the DFS
> cp -av * /mnt/hdfs/
> in /var/log/messages I keep seeing:
> Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739
> and then eventually
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209
> Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037
> and the file system hangs. Hadoop is still running and I don't see any
> errors in its logs. I have to unmount the dfs and restart fuse_dfs, and then
> everything is fine again. At some point I see the following messages in
> /var/log/messages:
> ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464
> Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464
> Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8138-93825070138960-1229251587.log fuse_dfs.c:1464
> Is this a known issue? Am I just flooding the system too much? All of this
> is being performed on a single dual-core machine.
> Thanks!
> ttyl
> Dima
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira