Re: Some queries about stability and reliability

2006-08-14 Thread Konstantin Shvachko
Bryan, could you describe in more detail how your files rot? Do you lose entire files or blocks? How do you detect missing files? How regular is "a regular basis"? Should we have a Jira issue for that? I think Jagadeesh's q#3 was about running a spare name node on the same cluster, rather than

Re: Using Hadoop with NFS mounted file server

2006-08-14 Thread Doug Cutting
You don't want to use DFS on top of NFS. If you use DFS, keep its data on the local drives, not in NFS. If you want to use NFS for shared data, then simply don't use DFS: specify local as the filesystem and don't start datanodes or a namenode. I think you'll find DFS will perform better

Re: Some queries about stability and reliability

2006-08-14 Thread Doug Cutting
Konstantin Shvachko wrote: On the logging issue. I think we should change the default logging level, which is INFO at the moment. I think INFO is the appropriate default logging level. If there are things logged at the INFO level that are too verbose, then we should change these to DEBUG

Re: Compile error with trunk and java 1.4

2006-08-14 Thread Doug Cutting
Renaud Richardet wrote: Does Hadoop require Java 5? Yes. We're not yet extensively using or encouraging Java 5 features, but it is now required. I get a compile error when building the trunk with Java 1.4. This change below will make it build again. I think there are more changes

Re: Some queries about stability and reliability

2006-08-14 Thread Paul Sutter
We've had instances where certain block replicas are corrupted or truncated while other replicas of the same block are fine (hundreds of instances). We've used two workarounds: - get the file over and over, until no checksum error occurs, then put the file back in place, or - replace/regenerate the
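
The first workaround above (re-reading until no checksum error occurs) can be sketched roughly as follows. This is only an illustration in Python, not Hadoop code: fetch_until_clean, make_rotating_fetch, the CRC32 check and the in-memory "replicas" are all hypothetical stand-ins, and it assumes a known-good checksum is available to compare against.

```python
import itertools
import zlib


def fetch_until_clean(fetch, expected_crc, max_attempts=10):
    """Call fetch() repeatedly until the returned bytes match expected_crc.

    fetch is a hypothetical zero-argument callable that returns the file
    contents as bytes; each call may be served by a different replica.
    """
    for _ in range(max_attempts):
        data = fetch()
        if zlib.crc32(data) == expected_crc:
            return data  # clean copy: safe to put back in place
    raise OSError("no clean copy after %d attempts" % max_attempts)


def make_rotating_fetch(replicas):
    """Simulate reads that land on a different replica on every call."""
    cycle = itertools.cycle(replicas)
    return lambda: next(cycle)


# Illustrative data only: one good replica and a truncated one.
GOOD = b"block contents"
TRUNCATED = GOOD[:5]

if __name__ == "__main__":
    fetch = make_rotating_fetch([TRUNCATED, TRUNCATED, GOOD])
    clean = fetch_until_clean(fetch, zlib.crc32(GOOD))
    print(len(clean), "bytes recovered")
```

Bounding the number of attempts keeps the loop from spinning forever when every replica is bad, which is the case where the second workaround (replacing or regenerating the data) would be needed.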

RE: Some queries about stability and reliability

2006-08-14 Thread Yoram Arnon
User code data gets written to the tasktracker's log at the INFO level. We switched to WARNING level when a rogue user program produced a lot of output to stdout, and it filled the task trackers' logs with junk. Yoram -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED]