Re: HDFS out of space
On 6/22/09 10:12 AM, Kris Jirapinyo kjirapi...@biz360.com wrote: Hi all, How does one handle a mount running out of space for HDFS? We have two disks mounted on /mnt and /mnt2 respectively on one of the machines that are used for HDFS, and /mnt is at 99% while /mnt2 is at 30%. Is there a way to tell the machine to balance itself out? I know for the cluster, you can balance it using start-balancer.sh, but I don't think that it will tell the individual machine to balance itself out. Our hack right now would be just to delete the data on /mnt; since we have 3x replication, we should be OK. But I'd prefer not to do that. Any thoughts? Decommission the entire node, wait for data to be replicated, re-commission, then do HDFS rebalance. It blows, no doubt about it, but the admin tools in the space are... lacking.
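In command form, that cycle looks roughly like the sketch below (hostname and file paths are examples; it assumes dfs.hosts.exclude already points at the exclude file):

    echo "full-node.example.com" >> /path/to/conf/dfs.exclude
    hadoop dfsadmin -refreshNodes          # begin decommissioning the node
    hadoop dfsadmin -report                # wait until the node shows up as decommissioned
    # remove the host from the exclude file, refresh again, restart its datanode,
    # then rebalance the whole cluster (threshold is in percent)
    hadoop dfsadmin -refreshNodes
    start-balancer.sh -threshold 10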
Re: Making sure the tmp directory is cleaned?
On 6/22/09 12:15 PM, Qin Gao q...@cs.cmu.edu wrote: Do you know if the tmp directory on every map/reduce task will be deleted automatically after the map task finishes, or do I have to delete them myself? I mean the tmp directory that is automatically created in the current directory. Past experience says that users will find writable space on nodes and fill it, regardless of what Hadoop may do to try and keep it clean. It is a good idea to just wipe those spaces clean during hadoop upgrades and other planned downtimes.
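As an illustration only, that kind of cleanup during a planned outage might look like this (the mapred.local.dir and tmp locations are assumptions; substitute whatever your configuration uses, and only run it while the daemons on the host are stopped):

    # wipe task-attempt leftovers on a worker node during downtime
    rm -rf /mnt/hadoop/mapred/local/* /mnt2/hadoop/mapred/local/*
    rm -rf /tmp/hadoop-*/mapred/local/*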
Re: Multicluster Communication
On 6/19/09 3:49 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: Why do you want to do this in the first place? It seems like you want cluster1 to be a plain HDFS cluster and cluster2 to be a mapred cluster. Doing something like that will be disastrous - Hadoop is all about sending computation closer to your data. If you don't want that, you need not even use hadoop. Given some of the limitations with HDFS (quota operability, security), I can easily see why it would be desirable to have static data coming from one grid while doing computation/intermediate outputs/real output to another. Using performance as your sole metric of viability is a bigger disaster waiting to happen. Sure, we crashed the file system, but look how fast it went down in flames!
Re: Restricting quota for users in HDFS
On 6/15/09 11:16 PM, Palleti, Pallavi pallavi.pall...@corp.aol.com wrote: We have a chown command in hadoop dfs to make a particular directory owned by a person. Do we have something similar to create a user with some space limit / restrict the disk usage by a particular user? Quotas are implemented on a per-directory basis, not per-user. There is no support for "this user can have X space, regardless of where he/she writes"; the only option is "this directory has a limit of X space, regardless of who writes there."
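For the directory-based quotas that do exist, a sketch (paths and limits are examples; the quota commands must be run as the superuser, and the space quota and -q options only exist on newer releases that support them):

    hadoop dfsadmin -setQuota 100000 /user/pallavi       # cap the number of names (files + dirs)
    hadoop dfsadmin -setSpaceQuota 1t /user/pallavi      # cap raw bytes, where supported
    hadoop fs -count -q /user/pallavi                    # show quotas and current usage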
Re: data in yahoo / facebook hdfs
On 6/13/09 9:00 AM, PORTO aLET portoa...@gmail.com wrote: I am just wondering what Facebook/Yahoo do with the data in HDFS after they finish processing the log files or whatever is in HDFS? Is it simply deleted, or backed up to tape? What's the typical process? The grid ops team here at Yahoo! has a strict retention policy that dictates the data is deleted after X time period. We perform no backups of the data on the grid. It is also worth mentioning that the data is loaded from the primary source, so in the case of data corruption (hai hadoop-0.18) or accidental deletion (where are my snapshots, dev people?), we reload the data from that primary source. (Dependent, of course, on whether they still have it or not.) Also what is the process of adding a new node to the hadoop cluster? Simply connect a new computer to the network (and set up the hadoop conf)? http://wiki.apache.org/hadoop/FAQ#17
Re: ssh issues
On 5/26/09 3:40 AM, Steve Loughran ste...@apache.org wrote: HDFS is as secure as NFS: you are trusted to be who you say you are. Which means that you have to run it on a secured subnet (access restricted to trusted hosts and/or one or two front-end servers) or accept that your dataset is readable and writeable by anyone on the network. There is user identification going in; it is currently at the level where it will stop someone accidentally deleting the entire filesystem if they lack the rights. Which has been known to happen. Actually, I'd argue that HDFS is worse than even rudimentary NFS implementations. Off the top of my head: a) There is no equivalent of root squash/force anonymous. Any host can assume privilege. b) There is no 'read only from these hosts'. If you can read blocks over Hadoop RPC, you can write as well (minus safe mode).
Re: Optimal Filesystem (and Settings) for HDFS
On 5/18/09 11:33 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Do not forget 'tune2fs -m 2'. By default this value gets set at 5%. With 1 TB disks we got 33 GB more usable space. Talk about instant savings! Yup. Although, I think we're using -m 1. On Mon, May 18, 2009 at 1:31 PM, Alex Loddengaard a...@cloudera.com wrote: I believe Yahoo! uses ext3, Yup. They won't buy me enough battery backed RAM to use a memory file system. ;)
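For anyone following along, the command in question looks roughly like this (the device name is an example; run it against each ext3 data partition, ideally while it is idle):

    tune2fs -m 1 /dev/sdb1                               # reserve 1% for root instead of the default 5%
    tune2fs -l /dev/sdb1 | grep -i 'reserved block'      # verify the new reserved-block count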
Re: Beware sun's jvm version 1.6.0_05-b13 on linux
On 5/15/09 11:38 AM, Owen O'Malley o...@yahoo-inc.com wrote: We have observed that the default jvm on RedHat 5 I'm sure some people are scratching their heads at this. The default JVM on at least RHEL5u0/1 is a GCJ-based 1.4, clearly incapable of running Hadoop. We [and, really, this is my doing... ^.^ ] replace it with the JVM from the JPackage folks. So while this isn't the default JVM that comes from RHEL, the warning should still be heeded.
Re: Advice on restarting HDFS in a cron
On 4/24/09 9:31 AM, Marc Limotte mlimo...@feeva.com wrote: I've heard that HDFS starts to slow down after it's been running for a long time. And I believe I've experienced this. We did an upgrade (== complete restart) of a 2000-node instance in ~20 minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV. I suspect people aren't running the secondary name node and therefore have a massively large edits file. The name node appears slow on restart because it has to apply the edits to the fsimage rather than having the secondary keep it up to date.
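A quick way to check whether this is what's biting you (paths are assumptions; look under whatever dfs.name.dir and fs.checkpoint.dir are actually set to). If the edits file dwarfs the fsimage, the secondary isn't checkpointing:

    ls -lh /hadoop/name/current/fsimage /hadoop/name/current/edits
    # on the secondary, confirm recent checkpoints are actually landing
    ls -lht /hadoop/namesecondary/current | head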
Re: tuning performance
On 3/13/09 11:25 AM, Vadim Zaliva kroko...@gmail.com wrote: When you stripe you automatically make every disk in the system have the same speed as the slowest disk. In our experience, systems are more likely to have a 'slow' disk than a dead one, and detecting that is really, really hard. In a distributed system, that multiplier effect can have significant consequences on the whole grid's performance. "All disks are the same, so there is no speed difference." There will be when they start to fail. :)
Re: Backing up HDFS?
On 2/9/09 4:41 PM, Amandeep Khurana ama...@gmail.com wrote: Why would you want to have another backup beyond HDFS? HDFS itself replicates your data, so the reliability of the system shouldn't be a concern (if at all it is)... I'm reminded of a previous job where a site administrator refused to make tape backups (despite our continual harassment and pointing out that he was in violation of the contract) because he said RAID was good enough. Then the RAID controller failed. When we couldn't recover data from the other mirror, he was fired. Not sure how they ever recovered, especially considering what the data they lost was. Hopefully they had a paper trail. To answer Nathan's question: On Mon, Feb 9, 2009 at 4:17 PM, Nathan Marz nat...@rapleaf.com wrote: How do people back up their data that they keep on HDFS? We have many TB of data which we need to get backed up but are unclear on how to do this efficiently/reliably. The content of our HDFSes is loaded from elsewhere and is not considered 'the source of authority'. It is the responsibility of the original sources to maintain backups, and we then follow their policies for data retention. For user-generated content, we provide *limited* (read: quota'ed) NFS space that is backed up regularly. Another strategy we take is multiple grids in multiple locations that get the data loaded simultaneously. The key here is to prioritize your data: impossible-to-replicate data gets backed up using whatever means necessary; hard-to-regenerate data is the next priority; easy-to-regenerate and OK-to-nuke data doesn't get backed up.
Re: Cannot run program chmod: error=12, Not enough space
On 1/28/09 7:42 PM, Andy Liu andyliu1...@gmail.com wrote: I'm running Hadoop 0.19.0 on Solaris (SunOS 5.10 on x86) and many jobs are failing with this exception: Error initializing attempt_200901281655_0004_m_25_0: java.io.IOException: Cannot run program chmod: error=12, Not enough space at java.lang.ProcessBuilder.start(ProcessBuilder.java:459) ... at java.lang.UNIXProcess.forkAndExec(Native Method) at java.lang.UNIXProcess.(UNIXProcess.java:53) at java.lang.ProcessImpl.start(ProcessImpl.java:65) at java.lang.ProcessBuilder.start(ProcessBuilder.java:452) ... 20 more However, all the disks have plenty of disk space left (over 800 gigs). Can somebody point me in the right direction? "Not enough space" is usually SysV kernel-speak for "not enough virtual memory to swap." Check how much memory and swap you have free.
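A couple of Solaris commands that show this quickly (a sketch; run on the affected tasktracker node while jobs are failing):

    swap -s        # summary of allocated/reserved/available virtual memory
    vmstat 5 3     # watch the free and swap columns for a minute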
Re: issues with hadoop in AIX
On 12/27/08 12:18 AM, Arun Venugopal arunvenugopa...@gmail.com wrote: Yes, I was able to run this on AIX as well with a minor change to the DF.java code. But this was more of a proof of concept than on a production system. There are lots of places where Hadoop (esp. in contrib) interprets the output of Unix command line utilities. Changes like this are likely going to be required for AIX and other Unix systems that aren't being used by a committer. :(
Re: 64 bit namenode and secondary namenode 32 bit datanode
On 11/25/08 3:58 PM, Sagar Naik [EMAIL PROTECTED] wrote: I am trying to migrate from a 32-bit JVM to a 64-bit JVM for the namenode only. *setup* NN: 64-bit; secondary namenode (instance 1): 64-bit; secondary namenode (instance 2): 32-bit; datanode: 32-bit. From the mailing list I deduced that the NN 64-bit and datanode 32-bit combo works. Yup. That's how we run it. But I am not sure if S-NN (instance 1, 64-bit) and S-NN (instance 2, 32-bit) will work with this setup. Considering that the primary and secondary process essentially the same data, they should have the same memory requirements. In other words, if you need 64-bit for the name node, your secondary is going to require it too. I'm also not sure if you can have two secondaries. I'll let someone else chime in on that. :)
Re: ls command output format
On 11/21/08 6:03 AM, Alexander Aristov [EMAIL PROTECTED] wrote: Trying hadoop-0.18.2 I got next output [root]# hadoop fs -ls / Found 2 items drwxr-xr-x - root supergroup 0 2008-11-21 08:08 /mnt drwxr-xr-x - root supergroup 0 2008-11-21 08:19 /repos ... which reminds me. I really wish ls didn't default to -l.
Re: Best way to handle namespace host failures
On 11/10/08 10:42 PM, Dhruba Borthakur [EMAIL PROTECTED] wrote: 2. Create a virtual IP, say name.xx.com that points to the real machine name of the machine on which the namenode runs. Everyone doing this should be aware of the discussion happening in https://issues.apache.org/jira/browse/HADOOP-3988 though.
Re: NameNode memory usage and 32 vs. 64 bit JVMs
On 11/10/08 1:30 AM, Aaron Kimball [EMAIL PROTECTED] wrote: It sounds like you think the 64- and 32-bit environments are effectively interchangeable. May I ask why you are using both? The 64-bit environment gives you access to more memory; do you see faster performance for the TTs in 32-bit mode? Do you get bit by library compatibility bugs that others should watch out for in running a dual-mode Hadoop environment? Some random thoughts on our mixed environment:
A) The vast majority of user-provided (legacy) code is 32-bit. Since you can't mix 64- and 32-bit objects at link or runtime, it just makes sense for us to run TTs, etc., as 32-bit by default to give us the most bang for our buck.
B) In the case of the data node, the memory usage is small enough that the 64-bit JVM isn't needed.
C) Since we currently run HOD, it should be possible for users to switch their bit-ness, and I think we have a handful of users that do. We'll probably lose this capability when we go back to a static job tracker. :(
D) For streaming jobs, the bit-ness of the JVM is irrelevant. 32-bit is better due to the smaller footprint, since streaming jobs eat memory like it was candy. :)
E) We load the 64-bit and 32-bit versions of libraries on our nodes, thus allowing us to move our bit-ness whenever we like. This makes for a fat image (so no RAM disk for the OS for us!), but given the streaming VM issues, it works out mostly in our favor anyway.
In general, 64-bit code runs slower than 32-bit code. So unless one needs to access more memory or has external dependencies (JNIs, whatever), 32-bit for your Java environment is the way to go. The name node and maybe a static job tracker are the potential problem children here and the places where I suspect most people will be using the 64-bit JVM.
Re: hadoop start problem
On 11/10/08 6:18 AM, Brian MacKay [EMAIL PROTECTED] wrote: I had a similar problem when I upgraded... not sure of details why, but I had permissions problems trying to develop and run on windows out of cygwin. At Apachecon, we think we identified a case where someone forgot to copy the newer hadoop-defaults.xml into their old configuration directories that they were using post-upgrade. Hadoop acts really strangely under those conditions.
Re: Can you specify the user on the map-reduce cluster in Hadoop streaming
On 11/10/08 12:21 PM, Rick Hangartner [EMAIL PROTECTED] wrote: But is there a proper way to allow developers to specify a remote_username they legitimately have access to on the cluster if it is not the same as the local_username of the account on their own machine they are using to submit a streaming job without setting HDFS permissions to 777? There are ways that the Hadoop security as currently implemented can be bypassed. If you really want to know how, that's probably better not asked on a public list. ;) But I'm curious as to your actual use case. From what I can gather from your description, there are two possible solutions, depending upon what you're looking to accomplish: A) Turn off permissions B) Create a group and make the output directory group writable We use B a lot. We don't use A at all.
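Option B boils down to something like this (group and path names are examples; in this era of Hadoop, group membership is whatever `groups` reports on the client side):

    hadoop fs -mkdir /user/shared/output
    hadoop fs -chgrp -R devteam /user/shared/output      # the shared group owns the tree
    hadoop fs -chmod -R 775 /user/shared/output          # group members can write, others can read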
Re: NameNode memory usage and 32 vs. 64 bit JVMs
On 11/6/08 10:17 PM, C G [EMAIL PROTECTED] wrote: I've got a grid which has been up and running for some time. It's been using a 32-bit JVM. I am hitting the wall on memory within NameNode and need to specify max heap size 4G. Is it possible to switch seamlessly from a 32-bit JVM to 64-bit? I've tried this on a small test grid and had no issues, but I want to make sure it's OK to proceed. It should be. We run the name node with a 64-bit JVM and everything else with 32-bit. Speaking of NameNode, what does it keep in memory? Our memory usage ramped up rather suddenly recently. Someone or something created a ton of files, likely small ones. Check your file system contents, use the fs count command, etc., to look for anomalies. Also, does SecondaryNameNode require the same amount of memory as NameNode? We're actually starting to see that it requires a tad bit more.
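A sketch of that kind of anomaly hunting (paths are examples; fs -count output puts the file count in the second column):

    hadoop fs -count '/user/*' | sort -n -k2 | tail      # directories holding the most files
    hadoop fsck / | tail -20                             # overall totals: files, blocks, average block size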
Re: Hadoop hardware specs
On 11/4/08 2:16 AM, Arijit Mukherjee [EMAIL PROTECTED] wrote: * 1-5 TB external storage I'm curious to find out what sort of specs people use normally. Is the external storage essential or will the individual disks on each node be sufficient? Why would you need external storage in a hadoop cluster? The reason for the external storage is twofold:
A) Provide a shared home directory (especially for the HDFS user, so that it is easy to use the start scripts that call ssh).
B) Keep an off-machine copy of the fsimage and edits file as used by the name node. This way, if the name node goes belly up, you'll have an always up-to-date backup to recover from.
How can I find out what other projects on hadoop are using? Slide 12 of the Apachecon presentation I did earlier this year talks about what Yahoo!'s typical node looks like. For a small 5-node cluster, your hardware specs seem fine to me. An 8GB namenode for 4 data nodes (or maybe even running the NN on the same machine as a data node, if the memory size of jobs is kept in check) should be a-ok, even if you double the storage. You're likely going to run out of disk space before the name node starts swapping.
Re: Keep free space with du.reserved and du.pct
On 10/21/08 3:33 AM, Jean-Adrien [EMAIL PROTECTED] wrote: I expected to keep 3.75 GB free. But free space goes under 1 GB, as if I had kept the default settings. I noticed that you're running on /. In general, this is a bad idea, as space can disappear in various ways and you'll never know. For example, /var/log can grow tremendously without warning, or there might be a deleted-but-still-open file handle on /tmp. What does a du on the dfs directories tell you? How much space is *actually* being used by Hadoop? You might also look around for dead task leftovers. I read a bit in jira (HADOOP-2991) and I saw that the implementation of these directives was subject to discussion. But it is not marked as affecting 0.18.1. What is the situation now? I'm fairly certain it is unchanged. None of the developers seem particularly interested in a static allocation method, deeming it too hard to maintain when you have large or heterogeneous clusters. HADOOP-2816 going into 0.19 is somewhat relevant, though, because the name node UI is completely wrong when it comes to the actual capacity.
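Concretely, something along these lines (directory names are assumptions based on a typical dfs.data.dir / mapred.local.dir layout):

    du -sh /hadoop/dfs/data /hadoop/mapred/local    # what Hadoop itself is holding
    du -sh /var/log /tmp                            # the usual suspects on /
    lsof +L1                                        # deleted-but-still-open files that df counts but du can't see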
Re: Pushing jar files on slave machines
On 10/13/08 11:06 AM, Tarandeep Singh [EMAIL PROTECTED] wrote: I want to push third party jar files that are required to execute my job, on slave machines. What is the best way to do this? Use a DistributedCache as part of your job submission.
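One hedged example of what that looks like from the command line on reasonably recent releases, assuming the job's driver goes through ToolRunner/GenericOptionsParser (class and jar names here are made up for illustration); the -libjars files get shipped to the slaves via the DistributedCache and added to the task classpath:

    hadoop jar myjob.jar com.example.MyJob \
        -libjars /local/path/thirdparty-a.jar,/local/path/thirdparty-b.jar \
        input output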
Re: How to make LZO work?
On 10/9/08 6:46 PM, Songting Chen [EMAIL PROTECTED] wrote: Does that mean I have to rebuild the native library? Also, the LZO installation puts liblzo2.a and liblzo2.la under /usr/local/lib. There is no liblzo2.so there. Do I need to rename them to liblzo2.so somehow? You need to compile and install lzo2 as a shared library. IIRC, this is not the default. Also, the shared version (.so) will need to be part of your link path (LD_LIBRARY_PATH env var, /etc/ld.so.conf on Linux, runtime option (usually -R) to ld, ...) when you fire up the JVM so that Java can locate it when it needs it.
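Roughly, the build looks like this (version number and install prefix are examples; the key part is --enable-shared):

    tar xzf lzo-2.03.tar.gz && cd lzo-2.03
    ./configure --prefix=/usr/local --enable-shared
    make && make install
    # make sure the JVM can find the shared library at runtime
    export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH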
Re: Hadoop and security.
On 10/6/08 6:39 AM, Steve Loughran [EMAIL PROTECTED] wrote: Edward Capriolo wrote: You bring up some valid points. This would be a great topic for a white paper. -a wiki page would be a start too I was thinking about doing Deploying Hadoop Securely for a ApacheCon EU talk, as by that time, some of the basic Kerberos stuff should be in place... This whole conversation is starting to reinforce the idea
Re: is 12 minutes ok for dfs chown -R on 45000 files ?
On 10/2/08 11:33 PM, Frank Singleton [EMAIL PROTECTED] wrote: Just to clarify, this is for when the chown will modify all files' owner attributes, e.g. toggle all from frank:frank to hadoop:hadoop (see below). When we converted from 0.15 to 0.16, we chown'ed all of our files. The local dev team wrote the code in https://issues.apache.org/jira/browse/HADOOP-3052 , but it wasn't committed as a standard feature as they viewed this as a one-off. :( Needless to say, running a large chown as an MR job should be significantly faster.
Re: Hadoop Cluster Size Scalability Numbers?
On 9/21/08 9:40 AM, Guilherme Menezes [EMAIL PROTECTED] wrote: We currently have 4 nodes (16GB of RAM, 6 * 750 GB disks, Quad-Core AMD Opteron processor). Our initial plans are to perform a Web crawl for academic purposes (something between 500 million and 1 billion pages), and we need to expand the number of nodes for that. Is it better to have a larger number of nodes simpler than the ones we currently have (less memory, less processing?) in terms of Hadoop performance? Your current boxes seem overpowered for crawling. If it were me, I'd probably:
a) turn the current four machines into a dedicated namenode, job tracker, secondary name node, and oh-no-a-machine-just-died! backup node (set up an NFS server on that last one and use it to hold a second, direct copy of the fsimage and edits file if you don't have one). With 16GB name nodes, you should be able to store a lot of data.
b) when you buy new nodes, cut down on memory and CPU and just turn them into your work horses.
That said, I know little-to-nothing about crawling, so treat the above as IMHO.
Re: How to manage a large cluster?
On 9/11/08 2:39 AM, Alex Loddengaard [EMAIL PROTECTED] wrote: I've never dealt with a large cluster, though I'd imagine it is managed the same way as small clusters: Maybe. :) -Use hostnames or ips, whichever is more convenient for you Use hostnames. Seriously. Who are you people using raw IPs for things? :) Besides, you're going to need it for the eventual support of Kerberos. -All the slaves need to go into the slave file We only put this file on the namenode and 2ndary namenode to prevent accidents. -You can update software by using bin/hadoop-daemons.sh. Something like: #bin/hadoop-daemons.sh rsync (mastersrcpath) (localdestpath) We don't use that because it doesn't take down nodes into consideration (and you *will* have down nodes!) or deal with nodes that are outside the grid (such as our gateways/bastion hosts, data loading machines, etc.). Instead, use a real system configuration management package such as bcfg2, smartfrog, puppet, cfengine, etc. [Steve, you owe me for the plug. :) ] I created a wiki page that currently contains one tip for managing large clusters. Could others add to this wiki page? http://wiki.apache.org/hadoop/LargeClusterTips Quite a bit of what we do is covered in the latter half of http://tinyurl.com/5foamm . This is a presentation I did at ApacheCon EU this past April that included some of the behind-the-scenes of the large clusters at Y!. At some point I'll probably do an updated version that includes more adminy things (such as why we push four different types of Hadoop configurations per grid) while others talk about core Hadoop stuff.
Re: critical name node problem
On 9/5/08 5:53 AM, Andreas Kostyrka [EMAIL PROTECTED] wrote: Another idea would be a tool or namenode startup mode that would make it ignore EOFExceptions to recover as much of the edits as possible. We clearly need to change the how-to-configure docs to make sure people put at least two directories on two different storage systems in dfs.name.dir. This problem seems to happen quite often, and having two+ dirs helps protect against it. We recently had one of the disks on one of our copies go bad. The system kept going just fine until we had a chance to reconfig the name node. That said, I've just filed HADOOP-4080 to help alert admins in these situations.
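For anyone setting this up, a sketch of what that looks like (paths are examples; the second directory should live on separate storage, e.g. an NFS mount):

    # inside the <configuration> block of conf/hadoop-site.xml:
    #   <property>
    #     <name>dfs.name.dir</name>
    #     <value>/hadoop/name,/mnt/nfs-backup/hadoop/name</value>
    #   </property>
    # after restarting the namenode, confirm both copies are being written:
    ls -l /hadoop/name/current /mnt/nfs-backup/hadoop/name/current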
Re: Slaves Hot-Swaping
On 9/2/08 8:33 AM, Camilo Gonzalez [EMAIL PROTECTED] wrote: I was wondering if there is a way to hot-swap Slave machines, for example, in case a Slave machine fails while the Cluster is running and I want to mount a new Slave machine to replace the old one, is there a way to tell the Master that a new Slave machine is online without having to stop and start the Cluster again? I would appreciate the name of this, I don't think it is named Hot-Swapping, I don't even know if this exists. Lol :) Using hadoop dfsadmin -refreshNodes, you can have the name node reload the include and exclude files.
Re: Load balancing in HDFS
On 8/27/08 7:51 AM, Mork0075 [EMAIL PROTECTED] wrote: This sounds really interesting. And while increasing the replicas for certain files, the available throughput for these files increases too? Yes, as there are more places to pull the file from. This needs to get weighed against the amount of work the name node will use to re-replicate the file in case of failure and the total amount of disk space used... So the extra bandwidth isn't free. Allen Wittenauer schrieb: On 8/27/08 12:54 AM, Mork0075 [EMAIL PROTECTED] wrote: I'm planning to use HDFS as a DFS in a web application environment. There are two requirements: fault tolerance, which is ensured by the replicas, and load balancing. There is a SPOF in the form of the name node. So depending upon your needs, that may or may not be an acceptable risk. On 8/27/08 1:23 AM, Mork0075 [EMAIL PROTECTED] wrote: Some documents stored in the HDFS could be very popular and therefore accessed more often than others. Then HDFS needs to balance the load - distribute the requests to different nodes. Is it possible? Not automatically. However, it is possible to manually/programmatically increase the replication on files. This is one of the possible uses for the new audit logging in 0.18... By watching the log, it should be possible to determine which files need a higher replication factor.
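The manual knob looks like this (replication factors and path are examples; -w waits until the target replication is actually reached):

    hadoop fs -setrep -w 10 /data/popular/report.dat    # fan a hot file out to more datanodes
    hadoop fs -setrep 3 /data/popular/report.dat        # drop it back down later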
Re: Integrate HADOOP and Map/Reduce paradigm into HPC environment
On 8/17/08 10:56 AM, Filippo Spiga [EMAIL PROTECTED] wrote: I read the tutorial about HOD (Hadoop on Demand) but HOD uses torque only for initial node allocation. I would use TORQUE also for computation, allowing users to load data into HDFS, submit a TORQUE job that executes a Map/Reduce task, and afterwards retrieve the results. It's important for me that Map/Reduce tasks run only on the subset of nodes selected by TORQUE. Can someone help me? This is essentially how we use HOD. We have an HDFS that runs on all nodes and then use Torque to allocate mini-mapreduce clusters on top of that. To limit the number of nodes that get used for MapReduce, IIRC you need to create a queue in Torque to be used for MR, then limit the number of nodes that queue is allowed to allocate at one time in Maui using the CLASSCFG stuff.
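A rough sketch of the Torque/Maui side (queue name and node cap are examples, not a tested recipe; the CLASSCFG line goes in maui.cfg):

    qmgr -c "create queue mapred queue_type=execution"
    qmgr -c "set queue mapred enabled=true"
    qmgr -c "set queue mapred started=true"
    # in maui.cfg, cap how many nodes that queue may hold at once:
    #   CLASSCFG[mapred]  MAXNODE=16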
Re: NameNode hardware specs
On 8/12/08 12:07 PM, lohit [EMAIL PROTECTED] wrote: - why RAID5? - If running RAID 5, why is this necessary? Not absolutely necessary. I'd be afraid of the write penalty of RAID5 vs., say, RAID10 or even just plain RAID1. For the record, I don't think we have any production systems, except maybe one, that use any sort of RAID method on the name node. I'm sure Steve will pop up at some point and explain his reasoning on this one. ;)
Re: performance not great, or did I miss something?
On 8/8/08 1:25 PM, James Graham (Greywolf) [EMAIL PROTECTED] wrote: 226GB of available disk space on each one; 4 processors (2 x dualcore) 8GB of RAM each. Some simple stuff: (Assuming SATA): Are you using AHCI? Do you have the write cache enabled? Is the topologyProgram providing proper results? Is DNS performing as expected? Is it fast? How many tasks per node? How much heap does your name node have? Is it going into garbage collection or swapping?
Re: Configuration: I need help.
On 8/6/08 11:52 AM, Otis Gospodnetic [EMAIL PROTECTED] wrote: You can put the same hadoop-site.xml on all machines. Yes, you do want a secondary NN - a single NN is a SPOF. Browse the archives a few days back to find an email from Paul about DRBD (disk replication) to avoid this SPOF. Keep in mind that even with a secondary name node, you still have a SPOF. If the NameNode process dies, so does your HDFS.
Re: having different HADOOP_HOME for master and slaves?
On 8/4/08 11:10 AM, Meng Mao [EMAIL PROTECTED] wrote: I suppose I could, for each datanode, symlink things to point to the actual Hadoop installation. But really, I would like the setup that is hinted as possible by statement 1). Is there a way I could do it, or should that bit of documentation read, All machines in the cluster _must_ have the same HADOOP_HOME? If you run the -all scripts, they assume the location is the same. AFAIK, there is nothing preventing you from building your own -all scripts that point to the different location to start/stop the data nodes.
Re: Hadoop 4 disks per server
On 7/29/08 6:37 PM, Rafael Turk [EMAIL PROTECTED] wrote: I'm setting up a cluster with 4 disks per server. Is there any way to make Hadoop aware of this setup and benefit from it? This is how we run our nodes. You just need to list the four file systems in the configuration files and the datanode and map/red processes will know what to do.
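Illustratively (mount points are assumptions), the relevant hadoop-site.xml values just list all four disks, comma separated:

    #   <property>
    #     <name>dfs.data.dir</name>
    #     <value>/d1/hdfs/data,/d2/hdfs/data,/d3/hdfs/data,/d4/hdfs/data</value>
    #   </property>
    #   <property>
    #     <name>mapred.local.dir</name>
    #     <value>/d1/mapred/local,/d2/mapred/local,/d3/mapred/local,/d4/mapred/local</value>
    #   </property>
    df -h /d1 /d2 /d3 /d4    # sanity-check that all four mounts are present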
Re: Restricting Job Submission Access
On 7/17/08 3:33 PM, Theocharis Ian Athanasakis [EMAIL PROTECTED] wrote: What's the recommended way to restrict access to job submissions and HDFS access, besides a firewall? We basically put bastion hosts (we call them gateways) next to hadoop that users use to submit jobs, access the HDFS, etc. By limiting who can get onto the gateways, we limit access. We also use HOD, so we have all of Torque's access and resource control capabilities as well. Not a replacement for real security, obviously. Oh, I think there might be some diagrams, pictures and other info about this in my preso on the hadoop wiki.
Re: client connect as different username?
On 6/11/08 5:17 PM, Chris Collins [EMAIL PROTECTED] wrote: The finer point to this is that in development you may be logged in as user x and have a shared hdfs instance that a number of people are using. In that mode it's not practical to sudo, as you have all your development tools set up for user x. If hdfs is set up with a single user, what is the procedure to add users to that hdfs instance? It has to support it, surely? It's really not obvious; looking in the hdfs docs that come with the distro, nothing springs out, and the hadoop command-line tool doesn't have anything that vaguely looks like a way to create a user. User information is sent from the client. The code literally does a 'whoami' and 'groups' and sends that information to the server. Shared data should be handled just like you would in UNIX:
- create a directory
- set permissions to be insecure
- go crazy
Re: compressed/encrypted file
On 6/5/08 11:38 AM, Ted Dunning [EMAIL PROTECTED] wrote: We use encryption on log files using standard AES. I wrote an input format to deal with it. Key distribution should be done better than we do it. My preference would be to insert an auth key into the job conf which is then used by the input to open a well known keyring via an API that prevents auths from surviving for long term. This sounds like it opens the door for key stealing in a multi-user/static job tracker system, since the job conf is readable by all jobs running on the same machine.
Re: compressed/encrypted file
On 6/5/08 11:57 AM, Ted Dunning [EMAIL PROTECTED] wrote: Can you suggest an alternative way to communicate a secret to hadoop tasks short of embedding it into source code? This is one of the reasons why we use HOD: the job isolation it provides helps prevent data leaks from one job to the next.
Re: hadoop on EC2
On 5/28/08 1:22 PM, Andreas Kostyrka [EMAIL PROTECTED] wrote: I just wondered what other people use to access the hadoop webservers, when running on EC2? While we don't run on EC2 :), we do protect the hadoop web processes by putting a proxy in front of them. A user connects to the proxy, authenticates, and then gets the output from the hadoop process. All of the redirection magic happens via a localhost connection, so no data is leaked unprotected.
Re: How do people keep their client configurations in sync with the remote cluster(s)
On 5/15/08 8:56 AM, Steve Loughran [EMAIL PROTECTED] wrote: Allen Wittenauer wrote: On 5/15/08 5:05 AM, Steve Loughran [EMAIL PROTECTED] wrote: I have a question for users: how do they ensure their client apps have configuration XML files that are kept up to date? We control the client as well as the servers, so it all gets pushed at once. :) yes, but you use NFS, so you have your own problems, like the log message "NFS server not responding, still trying" appearing across everyone's machines simultaneously, which is to be feared almost as much as when ClearCase announces that its filesystem is offline. We don't use NFS for this.
Re: using sge, or drmaa for HOD
On 5/2/08 7:22 AM, Andre Gauthier [EMAIL PROTECTED] wrote: Also I was thinking of modifying HOD to run on grid engine. I haven't really begun to pore over all the code for HOD, but my question is this: can I just write a python module similar to that of torque.py under hod/schedulers/ for sge, or would this require significant modification in HOD and possibly hadoop? Given that both torque and SGE are based off of (IEEE standard) PBS, it might even run unmodified.
Re: Hadoop Cluster Administration Tools?
On 5/1/08 5:00 PM, Bradford Stephens [EMAIL PROTECTED] wrote: *Very* cool information. As someone who's leading the transition to open-source and cluster-orientation at a company of about 50 people, finding good tools for the IT staff to use is essential. Thanks so much for the continued feedback. Hmm. I should upload my slides.
Re: Help with configuration
On 4/22/08 7:12 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I am getting this annoying error message every time I start bin/start-all.sh with one single node: "command-line: line 0: Bad configuration option: ConnectTimeout". Do you know what could be the issue? I cannot find it in the FAQs. Thank you for your help. You likely have an ancient version of ssh installed that doesn't support ConnectTimeout. Best bet is to hack start-all.sh or to manually start each data node. What OS? I'm guessing Solaris 9, from the bits and pieces I know about JPMC's infrastructure. ;)
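If hacking the scripts is the route taken, one place to look first (this is only an assumption about where the ConnectTimeout option is coming from in this setup, and the option string shown is an example):

    ssh -V    # check whether the installed ssh understands ConnectTimeout at all
    # if it doesn't, remove that option from HADOOP_SSH_OPTS in conf/hadoop-env.sh, e.g.:
    #   export HADOOP_SSH_OPTS="-o SendEnv=HADOOP_CONF_DIR"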
Re: Newbie asking: ordinary filesystem above Hadoop
On 4/22/08 12:23 PM, Mika Joukainen [EMAIL PROTECTED] wrote: All right, I have to rephrase: I'd like to have a storage system for files which are inserted by the users. Users are going to use normal human-operable sw entities ;) The system is going to have: fault tolerance, parallelism, etc. == HDFS, isn't it? No, it isn't. You're looking for Lustre and similar file systems.
Re: New bee quick questions :-)
On 4/21/08 3:36 AM, vikas [EMAIL PROTECTED] wrote: Most of your questions have been answered by Luca, from what I can see, so let me tackle the rest a bit...
4) Let us suppose I want to shut down one datanode for maintenance purposes. Is there any way to inform Hadoop that this particular datanode is going down, so please make sure the data in it is replicated elsewhere? You want to do datanode decommissioning. See http://wiki.apache.org/hadoop/FAQ#17 for details.
5) I was going through some videos on Map-Reduce and a few Yahoo tech talks. In those, they were specifying that a Hadoop cluster has multiple cores; what does this mean? I haven't watched the tech talks in ages, but we generally refer to cores in a variety of ways. There is the single physical box version: an individual processor has more than one execution unit, thereby giving it a degree of parallelism. Then there is the complete grid count: an individual grid can have lots and lots of processors with lots and lots of individual cores on those processors, which works out to be a pretty good rough estimation of how many individual Hadoop tasks can be run simultaneously.
5.1) Can I have multiple instances of namenodes running in a cluster apart from secondary nodes? No. The name node is a single point of failure in the system.
6) If I go on to create huge files, will they be balanced among all the datanodes, or do I need to change the file creation location in the application? In addition to what Luca said, be aware that if you load a file on a machine with a data node process, the data for that file will *always* get loaded to that machine. This can cause your data nodes to get extremely unbalanced. You are much better off doing data loads *off grid*/from another machine. Since you only need the hadoop configuration and binaries available (in other words, no hadoop processes need be running), this usually isn't too painful to do. In 0.16.x, there is a rebalancer to help fix this situation, but I have no practical experience with it yet to say whether or not it works.
Re: jar files on NFS instead of DistributedCache
On 4/21/08 2:18 PM, Ted Dunning [EMAIL PROTECTED] wrote: I agree with the fair and balanced part. I always try to keep my clusters fair and balanced! Joydeep should mention his background. In any case, I agree that high-end filers may provide good enough NFS service, but I would also contend that HDFS has been better for me than NFS from generic servers. We take a mixed approach to the NFS problem. For grids that have some sort of service level agreement associated with them, we do not allow NFS connections. The jobs must be reasonably self-contained. For other grids (research, development, etc.), we do allow NFS connections and hope that people don't do stupid things. It is probably worth pointing out that it is much easier for a user to do stupid things with, say, 500 nodes than 5. So we take a much more conservative view for grids we care about. As Joydeep said, the implementation of the stack does make a huge difference. NetApp and Sun are leaps and bounds better than most. In the case of Linux, it has made great strides forward, but I'd be leery of using it for the sorts of workloads we have.
Re: Add your project or company to the powered by page?
On 2/21/08 11:34 AM, Jeff Hammerbacher [EMAIL PROTECTED] wrote: yeah, i've heard those facebook groups can be a great way to get the word out... anyways, just got approval yesterday for a 320 node cluster. each node has 8 cores and 4 TB of raw storage so this guy is gonna be pretty powerful. can we claim largest cluster outside of yahoo? I guess it depends upon how you define outside. *Technically*, M45 is outside of a Yahoo! building, given that it is in one of those shipping-container-data-center-thingies ...
Re: Starting up a larger cluster
On 2/7/08 11:01 PM, Tim Wintle [EMAIL PROTECTED] wrote: it's useful to be able to connect from nodes that aren't in the slaves file so that you can put in input data direct from another machine that's not part of the cluster, I'd actually recommend this as a best practice. We've been bit over... and over... and over... with users loading data into HDFS from a data node only to discover that the block distribution is pretty horrid, which in turn means that MR performance is equally horrid. [Remember: all writes will go to the local node if it is a data node!] We're now down to the point that we've got one (relatively smaller) grid that is used for data loading/creation/extraction, which then distcp's its contents to another grid. Less than ideal, but it definitely helps the performance of the entire 'real' grid.
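The copy step is plain distcp; something like the following, where the cluster names, port, and paths are all examples:

    hadoop distcp hdfs://loader-nn.example.com:8020/incoming/2008-02-07 \
                  hdfs://prod-nn.example.com:8020/data/2008-02-07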
Re: Starting up a larger cluster
On 2/8/08 9:32 AM, Jeff Eastman [EMAIL PROTECTED] wrote: I noticed that phenomenon right off the bat. Is that a designed feature or just an unhappy consequence of how blocks are allocated? My understanding is that this is by design: when you are running a MR job, you want the output, temp files, etc., to be local. Ted compensates for this by aggressively rebalancing his cluster often by adjusting the replication up and down, but I wonder if an improvement in the allocation strategy would improve this. IIRC, we're getting a block re-balancer in 0.16, so this particular annoyance should mostly go away soon. I've also used Ted's trick, with less than marvelous results. I'd hate to pull my biggest machine (where I store all the backup files) out of the cluster just to get more even block distribution, but I may have to. Been there, done that. (At one time, we were decomm'ing entire racks to force redistribution. I seem to recall that we hit a bug, so we then slowed down to doing 10 at a time.)