Re: Scaling out/up or a mix

2009-06-26 Thread Brian Bockelman
Hey Marcus, Are you recording the data rates coming out of HDFS? Since you have such low CPU utilization, I'd look at boxes utterly packed with big hard drives (also, why are you using RAID1 for Hadoop??). You can get 1U boxes with 4 drive bays or 2U boxes with 12 drive bays. Based o

Re: About fuse-dfs and NFS

2009-06-26 Thread Brian Bockelman
Hey Chris, FUSE in general does not support NFS mounts well because it has a tendency to renumber inodes upon NFS restart, which causes clients to choke. FUSE-DFS supports a limited range of write operations; it's possible that your application is trying to use write functionality that is

Re: "Too many open files" error, which gets resolved after some time

2009-06-23 Thread Brian Bockelman
Hey Stas, It sounds like it's technically possible, but it also sounds like a horrible hack: I'd avoid this at all costs. This is how cruft is born. The pipes/epolls are something that eventually get cleaned up - but they don't get cleaned up often enough for your cluster. I would r

Re: hadoop interaction with AFS

2009-06-17 Thread Brian Bockelman
Hey Brock, Looking through my notes, OpenAFS 1.4.1 through 1.4.8 have an issue with the GIDs. Maybe you can upgrade to the latest maintenance release for the 1.4.x series and get lucky? Otherwise, I think I might have a patch somewhere for this. Brian On Jun 17, 2009, at 8:41 AM, Brock P

Re: hadoop interaction with AFS

2009-06-17 Thread Brian Bockelman
Hey Brock, I've seen a similar problem at another site. They were able to solve this by upgrading their version of OpenAFS. Is that an option for you? Brian On Jun 17, 2009, at 8:35 AM, Brock Palen wrote: Ran into an issue with running hadoop on a cluster that also has AFS installed. W

Re: HDFS data transfer!

2009-06-12 Thread Brian Bockelman
nd on 3 of the nodes simultaneously to copy files that were local on those machines into the hdfs. Brian Bockelman wrote: What'd you do for the tests? Was it a single stream or a multiple stream test? Brian On Jun 12, 2009, at 6:48 AM, Scott wrote: So is ~ 1GB/minute transfer rate a

Re: HDFS data transfer!

2009-06-12 Thread Brian Bockelman
. My initial tests show a transfer rate of around 1GB/minute, and that was slower than I expected it to be. Thanks, Scott Brian Bockelman wrote: Hey Sugandha, Transfer rates depend on the quality/quantity of your hardware and the quality of your client disk that is generating the dat

Re: HDFS data transfer!

2009-06-10 Thread Brian Bockelman
Hey Sugandha, Transfer rates depend on the quality/quantity of your hardware and the quality of your client disk that is generating the data. I usually say that you should expect near-hardware-bottleneck speeds for an otherwise idle cluster. There should be no "make it fast" required (th

Re: maybe a bug in hadoop?

2009-06-10 Thread Brian Bockelman
Hey Stephen, I've hit this "bug" before (rather, our admins did...). I would be happy to see you file it - after checking for duplicates - so I no longer have to warn people about it. Brian On Jun 10, 2009, at 6:29 AM, stephen mulcahy wrote: Tim Wintle wrote: I thought I'd double check.

Re: Max. Possible No. of Files

2009-06-05 Thread Brian Bockelman
On Jun 5, 2009, at 11:51 AM, Wasim Bari wrote: Hi, Does someone have some data regarding the maximum possible number of files over HDFS? Hey Wasim, I don't think that there is a maximum limit. Remember: 1) Less is better. HDFS is optimized for big files. 2) The amount of memory the HDF

Re: Monitoring hadoop?

2009-06-05 Thread Brian Bockelman
han just Hadoop metrics. You'll get CPU, load, memory, disk and network monitoring as well for free. You can see live demos of ganglia at http://ganglia.info/?page_id=69. Good luck. -Matt On Jun 5, 2009, at 7:10 AM, Brian Bockelman wrote: Hey Anthony, Look into hooking your Hadoop syst

Re: Monitoring hadoop?

2009-06-05 Thread Brian Bockelman
Hey Anthony, Look into hooking your Hadoop system into Ganglia; this produces about 20 real-time statistics per node. Hadoop also does JMX, which hooks into more "enterprise"-y monitoring systems. Brian On Jun 5, 2009, at 8:55 AM, Anthony McCulley wrote: Hey all, I'm currently tasked t

Re: Question about Hadoop filesystem

2009-06-04 Thread Brian Bockelman
It's in the FAQ: http://wiki.apache.org/hadoop/FAQ#17 Brian On Jun 4, 2009, at 6:26 PM, Harold Lim wrote: How do I remove a datanode? Do I simply "destroy" my datanode and the namenode will automatically detect it? Is there a more elegant way to do it? Also, when I remove a datanode, d

Re: Subdirectory question revisited

2009-06-02 Thread Brian Bockelman
Hey Aaron, I had a similar problem. I have log files arranged in the following fashion: /logs/<host>/<date>.log I want to analyze a range of dates for all hosts. What I did was write into my driver class a subroutine that descends through the HDFS file system starting at /logs and builds a list of
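
A minimal sketch of that driver-side recursion, assuming the /logs/<host>/<date>.log layout above and the 0.19-era mapred API; the class and method names are illustrative, not the original code:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.JobConf;

    public class LogInputBuilder {
      // Recursively descend from dir, collecting every *.log file.
      static void collectLogs(FileSystem fs, Path dir, List<Path> out) throws IOException {
        for (FileStatus stat : fs.listStatus(dir)) {
          if (stat.isDir()) {
            collectLogs(fs, stat.getPath(), out);
          } else if (stat.getPath().getName().endsWith(".log")) {
            out.add(stat.getPath()); // date-range filtering would go here
          }
        }
      }

      // Register every collected log as an input path for the job.
      static void addInputs(JobConf job, FileSystem fs) throws IOException {
        List<Path> logs = new ArrayList<Path>();
        collectLogs(fs, new Path("/logs"), logs);
        for (Path p : logs) {
          FileInputFormat.addInputPath(job, p);
        }
      }
    }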

Re: hadoop hardware configuration

2009-05-28 Thread Brian Bockelman
On May 28, 2009, at 2:00 PM, Patrick Angeles wrote: On Thu, May 28, 2009 at 10:24 AM, Brian Bockelman wrote: We do both -- push the disk image out to NFS and have mirrored SAS hard drives on the namenode. The SAS drives appear to be overkill. This sounds like a nice appro

Re: hadoop hardware configuration

2009-05-28 Thread Brian Bockelman
On May 28, 2009, at 10:32 AM, Ian Soboroff wrote: Brian Bockelman writes: Despite my trying, I've never been able to come even close to pegging the CPUs on our NN. I'd recommend going for the fastest dual-cores which are affordable -- latency is king. Clue? Surely the la

Re: hadoop hardware configuration

2009-05-28 Thread Brian Bockelman
On May 28, 2009, at 5:02 AM, Steve Loughran wrote: Patrick Angeles wrote: Sorry for cross-posting, I realized I sent the following to the hbase list when it's really more a Hadoop question. This is an interesting question. Obviously as an HP employee you must assume that I'm biased when

Circumventing Hadoop's data placement policy

2009-05-23 Thread Brian Bockelman
Hey all, Had a problem I wanted to ask advice on. The Caltech site I work with currently has a few GridFTP servers which are on the same physical machines as the Hadoop datanodes, and a few that aren't. The GridFTP server has a libhdfs backend which writes incoming network data into HD

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Brian Bockelman
On May 21, 2009, at 3:10 PM, Stas Oskin wrote: Hi. If this analysis is right, I would add it can happen even on large clusters! I've seen this error on our cluster when we're very full (>97%) and very few nodes have any empty space. This usually happens because we have two very large no

Re: ssh issues

2009-05-21 Thread Brian Bockelman
Hey Pankil, Use ~/.ssh/config to set the default key location to the proper place for each host, if you're going down that route. I'd remind you that SSH is only used as a convenient method to launch daemons. If you have a preferred way to start things up on your cluster, you can use tha

Re: Could only be replicated to 0 nodes, instead of 1

2009-05-21 Thread Brian Bockelman
On May 21, 2009, at 2:01 PM, Raghu Angadi wrote: I think you should file a jira on this. Most likely this is what is happening : * two out of 3 dns can not take anymore blocks. * While picking nodes for a new block, NN mostly skips the third dn as well since '# active writes' on it is la

Re: hadoop performance with very small cluster

2009-05-21 Thread Brian Bockelman
On May 21, 2009, at 2:30 AM, Miles Osborne wrote: if you mean "hadoop does not give a speed-up compared with a sequential version" then this is because of overhead associated with running the framework: your job will need to be scheduled, JVMs instantiated, data copied, data sorted etc etc.

Re: Setting up another machine as secondary node

2009-05-14 Thread Brian Bockelman
Hey Koji, It's an expensive operation - for the secondary namenode, not the namenode itself, right? I don't particularly care if I stress out a dedicated node that doesn't have to respond to queries ;) Locally we checkpoint+backup fairly frequently (not 5 minutes ... maybe less than the

Re: core-user Digest 23 Apr 2009 02:09:48 -0000 Issue 887

2009-04-22 Thread Brian Bockelman
Hey Nigel, HDFS is real nice in 0.19.1; we plan to stay on that version for a few more months (until 0.20.1 comes out). Can't say anything about the mapred side, we don't use it. Brian On Apr 23, 2009, at 1:30 PM, Nigel Daley wrote: No, I didn't mark 0.19.1 stable. I left 0.18.3 as our mo

Re: getting DiskErrorException during map

2009-04-21 Thread Brian Bockelman
Hey Jason, We've never had the hadoop.tmp.dir identical on all our nodes. Brian On Apr 22, 2009, at 10:54 AM, jason hadoop wrote: For reasons that I have never bothered to investigate I have never had a cluster work when the hadoop.tmp.dir was not identical on all of the nodes. My soluti

Re: max value for a dataset

2009-04-20 Thread Brian Bockelman
Hey Jason, Wouldn't this be avoided if you used a combiner to also perform the max() operation? A minimal amount of data would be written over the network. I can't remember if the map output gets written to disk first, then combine applied or if the combine is applied and then the data i
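
A sketch of that combiner idea, assuming Text keys and LongWritable values in the 0.19-era API; because max() is associative, the same class can be registered with both setCombinerClass() and setReducerClass():

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    public class MaxReducer extends MapReduceBase
        implements Reducer<Text, LongWritable, Text, LongWritable> {
      public void reduce(Text key, Iterator<LongWritable> values,
          OutputCollector<Text, LongWritable> output, Reporter reporter)
          throws IOException {
        long max = Long.MIN_VALUE;
        // Keep only the largest value seen for this key; run as a combiner,
        // this shrinks each map's output to one record per key.
        while (values.hasNext()) {
          max = Math.max(max, values.next().get());
        }
        output.collect(key, new LongWritable(max));
      }
    }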

Re: Help with Map Reduce

2009-04-19 Thread Brian Bockelman
Hm, I don't know how equals() is implemented for Text, but I'd try: key.toString().equals("sex") Brian On Apr 19, 2009, at 11:29 AM, Reza wrote: Brian: Thanks for your response. I have 8 total keys and values. The code I show below is part of the whole thing, just to illustrate my probl
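
For what it's worth, Text does compare byte content in equals(), so either form below should match; a tiny illustration (class name made up):

    import org.apache.hadoop.io.Text;

    public class TextEqualsDemo {
      public static void main(String[] args) {
        Text key = new Text("sex");
        // The suggestion above: compare via String.
        System.out.println(key.toString().equals("sex")); // true
        // Text.equals() compares byte content, so this matches as well.
        System.out.println(key.equals(new Text("sex")));  // true
      }
    }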

Re: Help with Map Reduce

2009-04-19 Thread Brian Bockelman
Hey Reza, From reading your code, you are calling this for the key "sex": output.collect("The total population is: ", (actual population)) and, for every other key: output.collect("The total population is: ", 0) You probably only want to call the output collector in the first case, not ever

Re: Interesting Hadoop/FUSE-DFS access patterns

2009-04-16 Thread Brian Bockelman
method. Tom On Mon, Apr 13, 2009 at 4:22 PM, Brian Bockelman wrote: Hey Todd, Been playing more this morning after thinking about it for the night -- I think the culprit is not the network, but actually the cache. Here's the output of your script adjusted to do the same calls as I w

Re: fyi: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks

2009-04-14 Thread Brian Bockelman
the tools get better, everyone wins. Brian On Tue, Apr 14, 2009 at 2:26 PM, Brian Bockelman wrote: Hey Guilherme, It's good to see comparisons, especially as it helps folks understand better what tool is the best for their problem. As you show in your paper, a MapReduce syste

Re: fyi: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks

2009-04-14 Thread Brian Bockelman
Hey Guilherme, It's good to see comparisons, especially as it helps folks understand better what tool is the best for their problem. As you show in your paper, a MapReduce system is hideously bad at performing tasks that column-store databases were designed for (selecting a single value

Re: Generating many small PNGs to Amazon S3 with MapReduce

2009-04-14 Thread Brian Bockelman
Hey Tim, Why don't you put the PNGs in a SequenceFile in the output of your reduce task? You could then have a post-processing step that unpacks the PNGs and places them onto S3. (If my numbers are correct, you're looking at around 3TB of data; is this right? With that much, you might wan
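
A rough sketch of that post-processing step, assuming the reduce wrote (filename, image bytes) as (Text, BytesWritable) records; all paths are illustrative and the actual S3 upload is left out:

    import java.io.FileOutputStream;
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class UnpackPngs {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Reader reader =
            new SequenceFile.Reader(fs, new Path(args[0]), conf);
        Text name = new Text();
        BytesWritable png = new BytesWritable();
        // Dump each record to a local file, ready for the upload step.
        while (reader.next(name, png)) {
          FileOutputStream out = new FileOutputStream(name.toString());
          out.write(png.getBytes(), 0, png.getLength());
          out.close();
        }
        reader.close();
      }
    }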

Re: Interesting Hadoop/FUSE-DFS access patterns

2009-04-14 Thread Brian Bockelman
", argv[optind], strerror( posix_fadvise( fd, 0, 0, POSIX_FADV_DONTNEED ) ) ); failCount++; } close(fd); } exit( failCount ); } On Mon, Apr 13, 2009 at 4:01 PM, Scott Carey wrote: On 4/12/09 9:41 PM, "Brian Bockelman" wrote: Ok, here's something perhaps even more

Re: Interesting Hadoop/FUSE-DFS access patterns

2009-04-13 Thread Brian Bockelman
here... looks like TCP_WINDOW_SIZE isn't actually used for any socket configuration, so I don't think that will make a difference... still think networking might be the culprit, though. -Todd On Sun, Apr 12, 2009 at 9:41 PM, Brian Bockelman wrote: Ok, here's somethi

Re: Interesting Hadoop/FUSE-DFS access patterns

2009-04-12 Thread Brian Bockelman
ccurs at 128KB, exactly. I'm a bit befuddled. I know we say that HDFS is optimized for large, sequential reads, not random reads - but it seems that it's one bug-fix away from being a good general-purpose system. Heck if I can find what's causing the issues though... Brian

Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman
On Apr 10, 2009, at 2:06 PM, Todd Lipcon wrote: On Fri, Apr 10, 2009 at 12:03 PM, Brian Bockelman wrote: 0.19.1 with a few convenience patches (mostly, they improve logging so the local file system researchers can play around with our data patterns). Hey Brian, I'm curio

Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman
On Apr 10, 2009, at 1:53 PM, Stas Oskin wrote: 2009/4/10 Brian Bockelman Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going to be even better. We run about 300TB @ 2 replicas, and haven't had file loss that was Hadoop's fault since about January. Brian

Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman
January: 11 files to accidentally reformatting 2 nodes at once, 35 to a night with 2 dead nodes. Make no mistake - HDFS with 2 replicas is *not* an archive-quality file system. HDFS does not replace tape storage for long term storage. Brian 2009/4/10 Stas Oskin 2009/4/10 Brian

Re: HDFS read/write speeds, and read optimization

2009-04-10 Thread Brian Bockelman
On Apr 10, 2009, at 9:40 AM, Stas Oskin wrote: Hi. Depends. What hardware? How much hardware? Is the cluster under load? What does your I/O load look like? As a rule of thumb, you'll probably expect very close to hardware speed. Standard Xeon dual cpu, quad core servers, 4 GB RAM.

Re: Two degrees of replications reliability

2009-04-10 Thread Brian Bockelman
Most of the issues were resolved in 0.19.1 -- I think 0.20.0 is going to be even better. We run about 300TB @ 2 replicas, and haven't had file loss that was Hadoop's fault since about January. Brian On Apr 10, 2009, at 11:11 AM, Stas Oskin wrote: Hi. I know that there were some hard to

Re: HDFS read/write speeds, and read optimization

2009-04-09 Thread Brian Bockelman
On Apr 9, 2009, at 5:45 PM, Stas Oskin wrote: Hi. I have 2 questions about HDFS performance: 1) How fast are the read and write operations over network, in Mbps? Depends. What hardware? How much hardware? Is the cluster under load? What does your I/O load look like? As

Re: HDFS as a logfile ??

2009-04-09 Thread Brian Bockelman
Also, Chukwa (a project already in Hadoop contrib) is designed to do something similar with Hadoop directly: http://wiki.apache.org/hadoop/Chukwa I think some of the examples even mention Apache logs. Haven't used it personally, but it looks nice. Brian On Apr 9, 2009, at 11:14 PM, Alex

Re: Getting free and used space

2009-04-08 Thread Brian Bockelman
part - how do I specify the user when connecting? :) Is it a config file level, or run-time level setting? Regards. 2009/4/8 Brian Bockelman Hey Stas, Did you try this as a privileged user? There might be some permission errors... in most of the released versions, getUsed() is only av

Re: Getting free and used space

2009-04-08 Thread Brian Bockelman
Hey Stas, Did you try this as a privileged user? There might be some permission errors... in most of the released versions, getUsed() is only available to the Hadoop superuser. It may be that the exception isn't propagating correctly. Brian On Apr 8, 2009, at 3:13 AM, Stas Oskin wrote:
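
A minimal probe for that theory: run it once as the Hadoop superuser and once as the normal user and compare; getUsed() is the call in question:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class SpaceProbe {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // On the released versions discussed here, this may only work for
        // the HDFS superuser; a permission failure points to the
        // swallowed-exception theory above.
        System.out.println("used bytes: " + fs.getUsed());
        fs.close();
      }
    }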

Re: problem running on a cluster of mixed hardware due to "Incompatible" buildVersion of JobTracker and TaskTracker

2009-04-06 Thread Brian Bockelman
Ah yes, there you go ... so much for extrapolating on a Monday :). Sorry Bill! Brian On Apr 6, 2009, at 6:03 PM, Todd Lipcon wrote: On Mon, Apr 6, 2009 at 4:01 PM, Brian Bockelman wrote: Hey Bill, I might be giving you bad advice (I've only verified this for HDFS components o

Re: problem running on a cluster of mixed hardware due to "Incompatible" buildVersion of JobTracker and TaskTracker

2009-04-06 Thread Brian Bockelman
Hey Bill, I might be giving you bad advice (I've only verified this for HDFS components on the 0.19.x branch, not the JT/TT or the 0.18.x branch), but... In my understanding, Hadoop only compares the base SVN revision number, not the build strings. Make sure that both have the SVN rev.

Re: Using HDFS to serve www requests

2009-04-06 Thread Brian Bockelman
Indeed, it would be a very nice interface to have (if anyone has some free time)! I know a few Caltech people who'd like to see how their WAN transfer product (http://monalisa.cern.ch/FDT/) would work with HDFS; if there were an HDFS NIO interface, playing around with HDFS and FDT w

Re: Amazon Elastic MapReduce

2009-04-02 Thread Brian Bockelman
On Apr 2, 2009, at 3:13 AM, zhang jianfeng wrote: seems like I should pay additional money, so why not configure a Hadoop cluster on EC2 by myself? This has already been automated using scripts. Not everyone has a support team or an operations team or enough time to learn how to d

Re: ANN: Hadoop UI beta

2009-03-31 Thread Brian Bockelman
Hey Stefan, I like it. I would like to hear a bit about how the security policies work. If I open this up to "the world", how does "the world" authenticate/authorize with my cluster? I'd love nothing more than to be able to give my users a dead-simple way to move files on and off the cluster. Thi

Re: hadoop-a small doubt

2009-03-30 Thread Brian Bockelman
On Mar 30, 2009, at 3:59 AM, W wrote: I already try the mountable HDFS, both webDav and FUSE approach, it seem both of it is not production ready .. Depends on what you define to be "production ready"; for a business serving HDFS to external customers directly, no. But then again, it's

Re: hadoop-a small doubt

2009-03-30 Thread Brian Bockelman
On Mar 30, 2009, at 3:53 AM, deepya wrote: Do you mean to say the node from which we want to access hdfs should also have hadoop installed on it?? If that is the case, then doesn't that node also become a part of the cluster?? Yes. You need the Hadoop client installed to access HDFS. You

Re: Using HDFS to serve www requests

2009-03-26 Thread Brian Bockelman
On Mar 26, 2009, at 8:55 PM, phil cryer wrote: When you say that you have huge images, how big is "huge?" Yes, we're looking at some images that are 100Megs in size, but nothing like what you're speaking of. This helps me understand Hadoop's usage better and unfortunately it won't be the fit

Re: Using HDFS to serve www requests

2009-03-26 Thread Brian Bockelman
On Mar 26, 2009, at 5:44 PM, Aaron Kimball wrote: In general, Hadoop is unsuitable for the application you're suggesting. Systems like Fuse HDFS do exist, though they're not widely used. We use FUSE on a 270TB cluster to serve up physics data because the client (2.5M lines of C++) doesn't

Re: Need Help hdfs -How to minimize access Time

2009-03-25 Thread Brian Bockelman
Hey Snehal (removing the core-dev list; please only post to one at a time), The access time should be fine, but it depends on what you define as an acceptable access time. If this is not acceptable, I'd suggest putting it behind a web cache like Squid. The best way to find out is to use

Re: Monitoring with Ganglia

2009-03-19 Thread Brian Bockelman
lled GangliaContext31 in /usr/local/hadoop-0.18.4/src/core/org/apache/hadoop/metrics/ganglia/ GangliaContext31.java. thanks, Tamir On Thu, Mar 19, 2009 at 3:25 PM, Brian Bockelman wrote: Hey Tamir, This is a very strange stack trace: java.la

Re: Monitoring with Ganglia

2009-03-19 Thread Brian Bockelman
here: http://www.sendspace.com/file/86v5jc Thanks, Tamir On Thu, Mar 19, 2009 at 2:51 PM, Brian Bockelman wrote: Hey Tamir, It appears the webserver stripped off your attachment. Do you have more of a stack trace available? Brian On Mar 19, 2009, at 7:25 AM, Tamir Kamara wrote: Hi, The full lsof | gr

Re: Monitoring with Ganglia

2009-03-19 Thread Brian Bockelman
which is the new one the "ant clean jar" command created. On Thu, Mar 19, 2009 at 2:00 PM, Brian Bockelman wrote: On Mar 19, 2009, at 6:56 AM, Tamir Kamara wrote: Hi Brian, I see GangliaContext31.class in the jar and GangliaContext31.java in the src folder. By the way, I

Re: Monitoring with Ganglia

2009-03-19 Thread Brian Bockelman
. Can you perform "lsof" on the running process and see if it's perhaps using the wrong JAR? Brian Thanks, Tamir On Thu, Mar 19, 2009 at 1:38 PM, Brian Bockelman wrote: Hey Tamir, Can you see the file GangliaContext31.java in your jar? In the source directory? Br

Re: Monitoring with Ganglia

2009-03-19 Thread Brian Bockelman
Mar 17, 2009 at 5:16 PM, Brian Bockelman wrote: On Mar 17, 2009, at 10:08 AM, Carlos Valiente wrote: On Tue, Mar 17, 2009 at 14:56, Tamir Kamara wrote: I don't know too much about multicast... and I'm using the default gmond conf file. The default multicast address seem

Re: Monitoring with Ganglia

2009-03-17 Thread Brian Bockelman
On Mar 17, 2009, at 10:08 AM, Carlos Valiente wrote: On Tue, Mar 17, 2009 at 14:56, Tamir Kamara wrote: I don't know too much about multicast... and I'm using the default gmond conf file. The default multicast address seems to be 239.2.11.71, so that's the one for your hadoop-metrics.pro

Re: Monitoring with Ganglia

2009-03-17 Thread Brian Bockelman
Yup, that's the next question: what's your recv channel in gmond.conf on that node? You can just send along the whole gmond.conf if you're not sure. If you set the metrics to be logged to a file, do they appear there? I.e., have you verified the metrics are working at all for the node?

Re: Monitoring with Ganglia

2009-03-17 Thread Brian Bockelman
Hey Tamir, I assume you want something like this: http://rcf.unl.edu/ganglia/?c=red-workers&h=node155&m=load_one&r=hour&s=descending&hc=4 (That link's old - where'd you find it? I'll update it...) Can you send out the relevant lines from the hadoop-metrics file? Also, can you do the followin

Re: Problem : data distribution is non uniform between two different disks on datanode.

2009-03-16 Thread Brian Bockelman
Hey Vaibhavj, Two notes beforehand: 1) When asking questions, you'll want to post the Hadoop version used. 2) You'll also want to only send to one mailing list at a time; it is a common courtesy. Can you provide the list with the outputs of "df -h"? Also, can you share what your namenode

Re: DataNode gets 'stuck', ends up with two DataNode processes

2009-03-09 Thread Brian Bockelman
ces I had saved away at one point, from the java forums, but perhaps this will get you started. http://forums.sun.com/thread.jspa?threadID=5297465&tstart=0 On Mon, Mar 9, 2009 at 11:23 AM, Brian Bockelman wrote: It's very strange. It appears that the second process is the result

Re: DataNode gets 'stuck', ends up with two DataNode processes

2009-03-09 Thread Brian Bockelman
It's very strange. It appears that the second process is the result of a fork call, yet has only one thread running whose gdb backtrace looks like this: (gdb) bt #0 0x003e10c0af8b in __lll_mutex_lock_wait () from /lib64/tls/libpthread.so.0 #1 0x in ?? () Not very he

Re: Issues installing FUSE_DFS

2009-03-03 Thread Brian Bockelman
; We did have some trouble compiling FUSE-DFS but got through the compilation errors. Any advice on what to try next? Josh Patterson TVA -----Original Message----- From: Brian Bockelman [mailto:bbock...@cse.unl.edu] Sent: Monday, March 02, 2009 5:30 PM To: core-user@hadoop.apache.org Subject: Re:

Re: Issues installing FUSE_DFS

2009-03-03 Thread Brian Bockelman
-----Original Message----- From: Brian Bockelman [mailto:bbock...@cse.unl.edu] Sent: Monday, March 02, 2009 5:30 PM To: core-user@hadoop.apache.org Subject: Re: Issues installing FUSE_DFS Hey Matthew, We use the following command on 0.19.0: fuse_dfs -oserver=hadoop-name -oport=9000 /mnt/hadoo

Re: Issues installing FUSE_DFS

2009-03-02 Thread Brian Bockelman
Hey Matthew, We use the following command on 0.19.0: fuse_dfs -oserver=hadoop-name -oport=9000 /mnt/hadoop -oallow_other -ordbuffer=131072 Brian On Mar 2, 2009, at 4:12 PM, Hyatt, Matthew G wrote: When we try to mount the dfs from fuse we are getting the following errors. Has anyone seen

Re: Atomicity of file operations?

2009-02-26 Thread Brian Bockelman
On Feb 26, 2009, at 4:14 PM, Brian Long wrote: What kind of atomicity/visibility claims are made regarding the various operations on a FileSystem? I have multiple processes that write into local sequence files, then upload them into a remote directory in HDFS. A map/reduce job runs which

Re: Unable to Decommission Node

2009-02-25 Thread Brian Bockelman
Hey Roger, This sounds vaguely familiar to me. Do you have multiple hostnames or multiple IPs on that node? In one of our dual-homed hosts, I think the sysadmin had to do something different to decommission it - something like list the IP in the exclude hosts? I can't remember. Brian

Re: hdfs disappears

2009-02-23 Thread Brian Bockelman
Hello, Where are you saving your data? If it's being written into /tmp, it will be deleted every time you restart your computer. I believe writing into /tmp is the default for Hadoop unless you changed it in hadoop-site.xml. Brian On Feb 23, 2009, at 10:00 PM, Anh Vũ Nguyễn wrote: Hi

Re: the question about the common pc?

2009-02-18 Thread Brian Bockelman
On Feb 18, 2009, at 11:43 PM, 柳松 wrote: Actually, there's a widespread misunderstanding of this "Common PC". Common PC doesn't mean PCs which are used daily; it means the performance of each node can be measured by a common PC's computing power. As a matter of fact, we don't use Gb Ethern

Re: JvmMetrics

2009-02-15 Thread Brian Bockelman
Hey David -- In case no one has pointed you to this, you can submit this through JIRA. Brian On Feb 14, 2009, at 12:07 AM, David Alves wrote: Hi I ran into a use case where I need to keep two contexts for metrics. One being ganglia and the other being a file context (to do offline

Re: HDFS on non-identical nodes

2009-02-15 Thread Brian Bockelman
5 times consecutive attempts 4. Another balancer is working 5. I/O exception The default setting is 10% for each datanode: for 1TB it is 100GB, for 3TB it is 300GB, and for 60GB it is 6GB. Hope this helps. On Thu, Feb 12, 2009 at 10:06 AM, Brian Bockelman wrote: On Feb 12, 2009, at 2:5

Re: HDFS on non-identical nodes

2009-02-12 Thread Brian Bockelman
On Feb 12, 2009, at 2:54 AM, Deepak wrote: Hi, We're running a Hadoop cluster on 4 nodes; our primary purpose is to provide a distributed storage solution for internal applications here at TellyTopia Inc. Our cluster consists of non-identical nodes (one with 1TB, another two with 3 TB a

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
ervers which each have 1Gbps connection. Does that help? Brian On Feb 10, 2009, at 4:46 PM, Brian Bockelman wrote: On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote: Hi, Could someone help me to find some real Figures (transfer rate) about Hadoop File transfer from local filesystem to

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
e. Brian Thank you, Mark On Tue, Feb 10, 2009 at 4:46 PM, Brian Bockelman wrote: On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote: Hi, Could someone help me to find some real Figures (transfer rate) about Hadoop File transfer from local filesystem to HDFS, S3 etc and among Stor

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
ething is configured wrong. Have you just tried hadoop fs -put for some large file hanging around locally? If that doesn't go more than 5MB/s or so (when your hardware can obviously do such a rate), then there's probably a configuration issue. Brian Thank you, Mark On Tue,

Re: File Transfer Rates

2009-02-10 Thread Brian Bockelman
On Feb 10, 2009, at 4:10 PM, Wasim Bari wrote: Hi, Could someone help me to find some real Figures (transfer rate) about Hadoop File transfer from local filesystem to HDFS, S3 etc and among Storage Systems (HDFS to S3 etc) Thanks, Wasim What are you looking for? Maximum possible t

Re: java.io.IOException: Could not get block locations. Aborting...

2009-02-09 Thread Brian Bockelman
On Feb 9, 2009, at 7:50 PM, jason hadoop wrote: The other issue you may run into, with many files in your HDFS is that you may end up with more than a few 100k worth of blocks on each of your datanodes. At present this can lead to instability due to the way the periodic block reports to the n

Re: Backing up HDFS?

2009-02-09 Thread Brian Bockelman
On Feb 9, 2009, at 6:41 PM, Amandeep Khurana wrote: Why would you want to have another backup beyond HDFS? HDFS itself replicates your data, so the reliability of the system shouldn't be a concern (if at all it is)... It should be. HDFS is not an archival system. Multiple replicas does

Re: using HDFS for a distributed storage system

2009-02-09 Thread Brian Bockelman
ns. I really appreciate it. Once I test this setup, I will put the results back to the list. Thanks, Amit On Mon, Feb 9, 2009 at 12:39 PM, Brian Bockelman wrote: Hey Amit, Your current thoughts on keeping block size larger and removing the very small files are along the right line. Wh

Re: using HDFS for a distributed storage system

2009-02-09 Thread Brian Bockelman
Hey Amit, Your current thoughts on keeping block size larger and removing the very small files are along the right line. Why not choose the default size of 64MB or larger? You don't seem too concerned about the number of replicas. However, you're still fighting against the tide. You've
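
Both knobs discussed here are plain client-side settings; a sketch with made-up values, using the 0.x-era dfs.block.size key:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BigBlockWriter {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default block size for files created by this client: 128MB.
        conf.setLong("dfs.block.size", 128L * 1024 * 1024);
        FileSystem fs = FileSystem.get(conf);
        // Or set it per file: overwrite, buffer size, replicas, block size.
        FSDataOutputStream out = fs.create(new Path("/data/big.dat"),
            true, 4096, (short) 2, 128L * 1024 * 1024);
        out.close();
      }
    }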

Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-06 Thread Brian Bockelman
t. -TCK --- On Wed, 2/4/09, Brian Bockelman wrote: From: Brian Bockelman Subject: Re: Batch processing with Hadoop -- does HDFS scale for parallel reads? To: core-user@hadoop.apache.org Date: Wednesday, February 4, 2009, 1:50 PM Sounds overly complicated. Complicated usually leads to mistake

Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread Brian Bockelman
, TCK --- On Wed, 2/4/09, Brian Bockelman wrote: From: Brian Bockelman Subject: Re: Batch processing with Hadoop -- does HDFS scale for parallel reads? To: core-user@hadoop.apache.org Date: Wednesday, February 4, 2009, 1:06 PM Hey TCK, We use HDFS+FUSE solely as a storage solution for a

Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread Brian Bockelman
Hey TCK, We use HDFS+FUSE solely as a storage solution for a application which doesn't understand MapReduce. We've scaled this solution to around 80Gbps. For 300 processes reading from the same file, we get about 20Gbps. Do consider your data retention policies -- I would say that Hadoo

Re: HDD benchmark/checking tool

2009-02-03 Thread Brian Bockelman
Also, you want to look at combining SMART hard drive monitoring (most drives support SMART at this point) with Nagios. It often lets us know when a hard drive is about to fail *and* when the drive is under-performing. Brian On Feb 3, 2009, at 6:18 PM, Aaron Kimball wrote:

Re: Is hadoop right for my problem

2009-02-03 Thread Brian Bockelman
Hey Chris, I think it would be appropriate. Look at it this way, it takes 1 mapper 1 minute to process 24k records, so it should take about 17 mappers to process all your tasks for the largest problem in one minute. Even if you still think your problem is too small, consider: 1) The possib

Re: HDFS Namenode Heap Size woes

2009-02-01 Thread Brian Bockelman
ean Knapp wrote: Brian, Thanks for jumping in as well. Is there a recommended way of manually triggering GC? Thanks, Sean On Sun, Feb 1, 2009 at 6:06 PM, Brian Bockelman wrote: Hey Sean, Dumb question: how much memory is used after a garbage collection cycle? Look at the graph "jvm.

Re: HDFS Namenode Heap Size woes

2009-02-01 Thread Brian Bockelman
Hey Sean, Dumb question: how much memory is used after a garbage collection cycle? Look at the graph "jvm.metrics.memHeapUsedM": http://rcf.unl.edu/ganglia/?m=network_report&r=hour&s=descending&c=red&h=hadoop-name&sh=1&hc=4&z=small If you tell the JVM it has 16GB of memory to play with, it wil

Re: Question about HDFS capacity and remaining

2009-01-30 Thread Brian Bockelman
For what it's worth, our organization did extensive tests on many filesystems benchmarking their performance when they are 90 - 95% full. Only XFS retained most of its performance when it was "mostly full" (ext4 was not tested)... so, if you are thinking of pushing things to the limits, tha

Re: Hadoop+s3 & fuse-dfs

2009-01-29 Thread Brian Bockelman
Hey all, This is a long-shot, but I've noticed before that libhdfs doesn't load hadoop-site.xml *unless* hadoop-site.xml is in your local directory. As a last try, maybe cd $HADOOP_HOME/conf and try running it from there? Brian On Jan 28, 2009, at 7:20 PM, Craig Macdonald wrote: Hi Roopa,

Re: files are inaccessible after HDFS upgrade from 0.18.1 to 0.19.0

2009-01-27 Thread Brian Bockelman
Hey YY, At a more basic level -- have you run fsck on that file? What were the results? Brian On Jan 27, 2009, at 10:54 AM, Bill Au wrote: Did you start your namenode with the -upgrade flag after upgrading from 0.18.1 to 0.19.0? Bill On Mon, Jan 26, 2009 at 8:18 PM, Yuanyuan Tian wrote:

Re: HDFS - millions of files in one directory?

2009-01-25 Thread Brian Bockelman
Hey Mark, You'll want to watch your name node requirements -- tossing a wild guess out there, a billion files could mean that you need on the order of terabytes of RAM in your namenode. Have you considered: a) Using SequenceFile (appropriate for binary data, I believe -- but limits
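
A sketch of option (a), packing many small local files into one SequenceFile keyed by filename; the paths and class name are illustrative, and the single read() is kept deliberately simple:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackSmallFiles {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, new Path("/data/packed.seq"),
            Text.class, BytesWritable.class);
        // One (name, contents) record per small file named on the command line.
        for (String name : args) {
          File f = new File(name);
          byte[] buf = new byte[(int) f.length()];
          FileInputStream in = new FileInputStream(f);
          in.read(buf); // assumes the whole file arrives in one read
          in.close();
          writer.append(new Text(name), new BytesWritable(buf));
        }
        writer.close();
      }
    }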

Re: Upgrading and patching

2009-01-19 Thread Brian Bockelman
: Thanks Brian, I have just one more question: When building my own release, where do I enter the version and compiled-by information? Thanks, Phil On Fri, Jan 16, 2009 at 6:23 PM, Brian Bockelman wrote: Hey Philip, I've found it easier to download the release, apply the patches,

Re: Upgrading and patching

2009-01-16 Thread Brian Bockelman
Hey Philip, I've found it easier to download the release, apply the patches, and then re-build the release. It's really pleasant to build the release. I suppose it's equivalent to checking it out from SVN. Brian On Jan 16, 2009, at 1:46 PM, Philip wrote: Hello All, I'm currently trying to

Re: hadoop 0.19.0 and data node failure

2009-01-16 Thread Brian Bockelman
Hey Kumar, Hadoop won't let you write new blocks if it can't write them at the right replica level. You've requested to write a block with two replicas on a system where there's only one datanode alive. I'd hope that it wouldn't let you create a new file! Brian On Jan 16, 2009, at 12:
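
A hedged sketch of the usual workaround on a one-datanode test setup: request a single replica, either in hadoop-site.xml or per client as below (the path is made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SingleReplicaWrite {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Ask for one replica so a lone datanode can satisfy the write.
        conf.setInt("dfs.replication", 1);
        FileSystem fs = FileSystem.get(conf);
        FSDataOutputStream out = fs.create(new Path("/tmp/hello.txt"));
        out.writeBytes("hello\n");
        out.close();
      }
    }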

Namenode response to out-of-disk space?

2009-01-08 Thread Brian Bockelman
Hey, What happens to the namenode when it runs out of disk space? From reading the lists, it appears that: a) Journal writes use pre-allocated space, so when a sync is actually done, HDFS should be guaranteed to write the full sync and not a partial one. b) When the namenode detects it c
