Re: showing hadoop status UI

2009-11-18 Thread Steve Loughran
Mark N wrote: I want to show the status of M/R jobs on a user interface; should I read the default Hadoop counters to display some kind of map/reduce task status? I could read the status of map/reduce tasks using JobClient (Hadoop default counters). I can then have a Java web service exposing
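
A minimal sketch of the polling such a web service could wrap around the 0.20-era mapred API - the class names (JobClient, RunningJob, Counters) are the standard ones, but the job ID passed on the command line is just a placeholder:

    import org.apache.hadoop.mapred.Counters;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;

    public class JobStatusProbe {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();                            // picks up mapred-site.xml from the classpath
        JobClient client = new JobClient(conf);
        RunningJob job = client.getJob(JobID.forName(args[0]));  // e.g. "job_200911180001_0001"
        if (job != null) {
          System.out.println("map progress:    " + job.mapProgress());
          System.out.println("reduce progress: " + job.reduceProgress());
          System.out.println("complete:        " + job.isComplete());
          Counters counters = job.getCounters();                 // the default counters a UI could render
          System.out.println(counters);
        }
      }
    }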

Re: client.Client: failed to interact with node......ERROR

2009-11-17 Thread Steve Loughran
Johannes Zillmann wrote: Hi there, I directed Yair to this list because the exception makes me think it could be a problem of using Hadoop IPC directly versus using Hadoop IPC in a servlet container like Tomcat. Thought it was maybe a problem with static variables. To explain, Katta uses plain Hadoop IPC for

Re: client.Client: failed to interact with node......ERROR

2009-11-17 Thread Steve Loughran
Johannes Zillmann wrote: Hi Steve, in the meantime Yair posted logs with Hadoop debug log level. -- 09/11/17 01:31:59 DEBUG ipc.Client: IPC Client (47) connection to qa-hadoop005.ascitest.net/10.12.2.205:2 from root: starting, having connections 2 09/11/17 01:31:59

mapred.local.dir options

2009-11-16 Thread Steve Loughran
I see that the mapred.local.dir is served up round robin, as with the dfs.data.dir values. But there's no awareness of the possibility that the same disk partition is used for mapred local data and for datanode blocks. What do people do here? * keep their fingers crossed that if the MR job
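One common mitigation (sketched below with purely illustrative mount points; in practice these properties live in hdfs-site.xml/mapred-site.xml rather than client code) is simply to keep the two comma-separated lists on disjoint partitions:

    import org.apache.hadoop.conf.Configuration;

    public class DirLayout {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Hypothetical layout: HDFS blocks and MR local/spill space on different mounts,
        // so the round-robin allocation of one never competes with the other for space.
        conf.set("dfs.data.dir", "/grid/0/hdfs/data,/grid/1/hdfs/data");
        conf.set("mapred.local.dir", "/grid/2/mapred/local,/grid/3/mapred/local");
        System.out.println("mapred.local.dir = " + conf.get("mapred.local.dir"));
      }
    }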

Re: About Hadoop pseudo distribution

2009-11-12 Thread Steve Loughran
kvorion wrote: Hi All, I have been trying to set up a Hadoop cluster on a number of machines, a few of which are multicore machines. I have been wondering whether Hadoop's pseudo-distributed mode is something that can help me take advantage of the multiple cores on my machines. All the tutorials

Re: Linux Flavor

2009-11-03 Thread Steve Loughran
Tom Wheeler wrote: Based on what I've seen on the list, larger installations tend to use Red Hat Enterprise Linux or one of its clones like CentOS. One other thing to add is that a large cluster is not the place to learn Linux or Solaris or whatever - it helps to have a working knowledge of

Re: Regd. Hadoop Implementation

2009-10-29 Thread Steve Loughran
shwitzu wrote: Thanks for responding. I read about HDFS and understood how it works, and I also installed Hadoop on my Windows machine using Cygwin, tried a sample driver, and made sure it works. But my concern is: given the problem statement, how should I proceed? Could you please give me some

Re: editing etc hosts files of a cluster

2009-10-21 Thread Steve Loughran
Allen Wittenauer wrote: A bit more specific: at Yahoo!, we had every server as either a DNS slave or a DNS caching server. In the case of LinkedIn, we're running Solaris, so nscd is significantly better than its Linux counterpart. However, we still seem to be blowing out the cache too much.

Re: detecting stalled daemons?

2009-10-15 Thread Steve Loughran
Edward Capriolo wrote: I know there is a JIRA open to add lifecycle methods to each Hadoop component that can be polled for progress. I don't know the number offhand. HDFS-326 (https://issues.apache.org/jira/browse/HDFS-326); the code has its own branch. This is still something I'm working on,

Re: Hardware Setup

2009-10-15 Thread Steve Loughran
Brian Bockelman wrote: Hey Alex, in order to lower cost, you'll probably want to order the worker nodes without hard drives and then buy them separately. HDFS provides software-level RAID, so most of the reasoning behind buying hard drives from Dell/HP is irrelevant - you are just paying an

Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?

2009-10-05 Thread Steve Loughran
Smith Stan wrote: Hey Cloudera genius guys. Sorry, not Cloudera. I speak for myself. I read this: "Via Cloudera, Hadoop is currently used by most of the giants in the space including Google, Yahoo, Facebook (we wrote about Facebook’s use of Cloudera here), Amazon, AOL, Baidu and more." I

Re: NameNode high availability

2009-10-05 Thread Steve Loughran
Isabel Drost wrote: On Mon, 05 Oct 2009 10:28:58 +0100 Steve Loughran ste...@apache.org wrote: 2. Even LGPL and GPL say there is no need to contribute back if you don't distribute the code. Sorry in advance about the nitpicking: IANAL - but AFAIK even LGPL and GPL do not force you to contribute back

Re: NameNode high availability

2009-10-02 Thread Steve Loughran
Stas Oskin wrote: Hi. Could you share the way in which it didn't quite work? Would be valuable information for the community. The idea is to have a Xen machine dedicated to NN, and maybe to SNN, which would be running over DRBD, as described here: http://www.drbd.org/users-guide/ch-xen.html

Re: NameNode high availability

2009-10-02 Thread Steve Loughran
Stas Oskin wrote: Hi. The HA service (heartbeat) is running on Dom0, and when the primary node is down, it basically just starts the VM on the other node. So there are not supposed to be any time issues. Can you explain a bit more about your approach, how to automate it for example? * You need

Re: Advice on new Datacenter Hadoop Cluster?

2009-10-01 Thread Steve Loughran
Kevin Sweeney wrote: I really appreciate everyone's input. We've been going back and forth on the server size issue here. There are a few reasons we shot for the $1k price: one is that we wanted to be able to compare our datacenter costs vs. the cloud costs. Another is that we have spec'd out a

Re: Advice on new Datacenter Hadoop Cluster?

2009-10-01 Thread Steve Loughran
Ryan Smith wrote: I have a question that I feel I should ask on this thread. Let's say you want to build a cluster where you will be doing very little map/reduce - storage and replication of data only on HDFS. What would the hardware requirements be? No quad core? Less RAM? Servers with more

Re: Running Hadoop on cluster with NFS booted systems

2009-09-30 Thread Steve Loughran
Todd Lipcon wrote: Yep, this is a common problem. The fix that Brian outlined helps a lot, but if you are *really* strapped for random bits, you'll still block. This is because even if you've set the random source, it still uses the real /dev/random to grab a seed for the PRNG, at least on my

Re: Running Hadoop on cluster with NFS booted systems

2009-09-30 Thread Steve Loughran
Brian Bockelman wrote: On Sep 30, 2009, at 4:24 AM, Steve Loughran wrote: Todd Lipcon wrote: Yep, this is a common problem. The fix that Brian outlined helps a lot, but if you are *really* strapped for random bits, you'll still block. This is because even if you've set the random source

Re: local node Quotas (for an R&D cluster)

2009-09-25 Thread Steve Loughran
Paul Smith wrote: On 25/09/2009, at 3:57 PM, Allen Wittenauer wrote: On 9/24/09 7:38 PM, Paul Smith psm...@aconex.com wrote: I think this could be one of these "If we build it, they will come" issues. Most of the Hadoop committers are working in large-scale homogeneous environments (lucky

Re: local node Quotas (for an R&D cluster)

2009-09-25 Thread Steve Loughran
Paul Smith wrote: On 25/09/2009, at 8:55 PM, Steve Loughran wrote: I'd love to see more direct Log4J/Hadoop integration, such as a standardised log4j-in-Hadoop format that was easily readable, included stack traces on exceptions, etc., and came with some sample MapReduce or Pig scripts

Re: 3D Cluster Performance Visualization

2009-09-25 Thread Steve Loughran
Brian Bockelman wrote: ;) Unfortunately, I'm going to go out on a limb and guess that we don't want to add OpenGL to the dependency list for the namenode... The viz application actually doesn't depend on the namenode, it uses the datanodes. Here's the source:

Re: Limiting the total number of tasks per task tracker

2009-09-25 Thread Steve Loughran
Oliver Senn wrote: Hi, thanks for your answer. I used these parameters, but they seem to limit only the number of parallel maps and parallel reduces separately. They do not prevent the scheduler from scheduling one map and one reduce on the same task tracker in parallel. But that's the
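
For reference, a small sketch of why the two parameters behave that way: each is an independent per-tasktracker cap (set in that node's mapred-site.xml in practice; the values here are only examples), so the worst case is their sum running concurrently:

    import org.apache.hadoop.conf.Configuration;

    public class SlotLimits {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setInt("mapred.tasktracker.map.tasks.maximum", 2);     // caps maps only
        conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 2);  // caps reduces only
        // There is no combined limit, so this node may still run maps and reduces together.
        int worstCase = conf.getInt("mapred.tasktracker.map.tasks.maximum", 2)
                      + conf.getInt("mapred.tasktracker.reduce.tasks.maximum", 2);
        System.out.println("worst-case concurrent tasks: " + worstCase);
      }
    }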

Re: Can not stop hadoop cluster ?

2009-09-21 Thread Steve Loughran
Jeff Zhang wrote: My cluster has been running for several months. Nice. Is this a bug in Hadoop? I think Hadoop is supposed to run for a long time. I'm doing work in HDFS-326 on making it easier to start/stop the various Hadoop services; once the lifecycle stuff is in I'll worry more about the

Re: Hadoop on Windows

2009-09-17 Thread Steve Loughran
brien colwell wrote: Our Cygwin/Windows nodes are picky about the machines they work on. On some they are unreliable; on some they work perfectly. We've had two main issues with Cygwin nodes. Hadoop resolves paths in strange ways, so for example /dir is interpreted as c:/dir not

Re: Stretched HDFS cluster

2009-09-17 Thread Steve Loughran
Edward Capriolo wrote: On a somewhat related topic, I was showing a co-worker a Hadoop setup and he asked: what if we got a bunch of laptops on the internet, like the PlayStation 'Folding @ Home'? Of course these are widely different distributed models. I have been thinking about this.

Re: Best practices with large-memory jobs

2009-09-16 Thread Steve Loughran
Chris Dyer wrote: In my task logs I see the message "attempt to override final parameter: mapred.child.ulimit; Ignoring.", which doesn't exactly inspire confidence that I'm on the right path. Chances are the param has been marked final in the task tracker's running config, which will prevent you

Re: Stretched HDFS cluster

2009-09-16 Thread Steve Loughran
Touretsky, Gregory wrote: Hi, does anyone have experience running an HDFS cluster stretched over high-latency WAN connections? Any specific concerns/options/recommendations? I'm trying to set up an HDFS cluster with the nodes located in the US, Israel and India - considering it as a

Re: measuring memory usage

2009-09-10 Thread Steve Loughran
Arvind Sharma wrote: Hmmm... I had seen some exceptions (don't remember which ones) on MacOS. There was a missing JSR-223 engine on my machine. Not sure why you would see this error on a Linux distribution. From: Ted Yu yuzhih...@gmail.com To:

Re: hadoop 0.20.0 jobtracker.info could only be replicated to 0 nodes

2009-09-10 Thread Steve Loughran
gcr44 wrote: Thanks for the response. I have already tried moving the JobTracker to several different ports, always with the same result. Chandraprakash Bhagtani wrote: You can try running the JobTracker on some other port; this port might be in use. -- Thanks Regards, Chandra Prakash Bhagtani, On

Re: Pregel

2009-09-08 Thread Steve Loughran
Ted Dunning wrote: You would be entirely welcome in Mahout. Graph-based algorithms are key for lots of kinds of interesting learning and would be a fabulous thing to have in a comprehensive substrate. I personally would also be very interested in learning more about what sorts of things

Re: Cloudera Video - Hadoop build on eclipse

2009-09-01 Thread Steve Loughran
ashish pareek wrote: Hello Bharath, earlier even I faced the same problem. I think you are accessing the internet through a proxy, so try using a direct broadband connection. Hope this will solve your problem. Or set up Ant's proxy: http://ant.apache.org/manual/proxy.html Ashish

Re: NN memory consumption on 0.20/0.21 with compressed pointers/

2009-08-24 Thread Steve Loughran
Raghu Angadi wrote: Suresh had made a spreadsheet for memory consumption... will check. A large portion of NN memory is taken by references. I would expect memory savings to be very substantial (same as going from 64-bit to 32-bit), could be on the order of 40%. The last I heard from Sun was

Re: Ubuntu/Hadoop incompatibilities?

2009-08-18 Thread Steve Loughran
brien colwell wrote: Actually Ubuntu comes out of the box with an entry in the hosts file (/etc/hosts) that maps the computer name to the loopback address. (btw I'm not sure if this is specific to Ubuntu) The effect is that all name lookups from the machine for itself resolve to 127.0.0.1.
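
A quick way to see the effect from the JVM's point of view is a small diagnostic sketch like the one below (the 127.0.1.1 value is what a stock Ubuntu hosts file usually yields, but that detail varies):

    import java.net.InetAddress;

    public class HostnameCheck {
      public static void main(String[] args) throws Exception {
        InetAddress self = InetAddress.getLocalHost();
        // With the default Ubuntu /etc/hosts entry this prints a loopback address
        // (127.0.1.1 or 127.0.0.1) rather than the machine's LAN address, and that
        // is the address Hadoop daemons then advertise to the rest of the cluster.
        System.out.println(self.getHostName() + " -> " + self.getHostAddress());
      }
    }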

Re: HADOOP-4539 question

2009-08-17 Thread Steve Loughran
Konstantin Shvachko wrote: Steve, there are other groups that claimed they are working on an HA solution. We had discussions about it not so long ago on this list. Is it possible that your colleagues present their design? As you point out, the issue gets fairly complex fast, particularly because of the

Re: What OS?

2009-08-17 Thread Steve Loughran
Edward Capriolo wrote: While I completely agree with you about FreeBSD, that is not the point I was driving at. Linux is the main target platform. If you choose another platform, you have more work for yourself. If you have a problem like the one I had, probably no one else has the same environment as

Re: HADOOP-4539 question

2009-08-13 Thread Steve Loughran
Konstantin Shvachko wrote: And the only remaining step is to implement a fail-over mechanism. :) Colleagues of mine work on HA stuff; I try and steer clear of it as it gets complex fast. Test case: what happens when a network failure splits the datacentre in two and you now have two clusters

Re: changing logging

2009-08-11 Thread Steve Loughran
John Clarke wrote: Thanks for the reply. I considered that, but I have a lot of threads in my application and it's very handy to have log4j output the thread name with the log message. It's as if the log4j.properties file in the conf/ directory is not being used, as any changes I make seem to have no
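
If the conf/log4j.properties really is being ignored, one fallback (a sketch only, using the plain log4j 1.2 API) is to configure the layout programmatically; the %t conversion character is what emits the thread name:

    import org.apache.log4j.ConsoleAppender;
    import org.apache.log4j.Logger;
    import org.apache.log4j.PatternLayout;

    public class ThreadNameLogging {
      public static void main(String[] args) {
        // %t prints the thread name with every message.
        Logger.getRootLogger().addAppender(
            new ConsoleAppender(new PatternLayout("%d{ISO8601} [%t] %-5p %c - %m%n")));
        Logger.getLogger(ThreadNameLogging.class).info(
            "logging from " + Thread.currentThread().getName());
      }
    }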

Re: HADOOP-4539 question

2009-08-11 Thread Steve Loughran
networkaddress.cache.ttl to something low (like 60s), and then you should be able to bring up a node with the same name but a different IP address. This is useful if you can't control the IP address of a node, but you can at least change the DNS entry. 2009/8/7 Steve Loughran ste...@apache.org Stas Oskin
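
The TTL normally lives in jre/lib/security/java.security; it can also be set programmatically before the first lookup, as in this small sketch (the 60-second value is only an example):

    import java.net.InetAddress;
    import java.security.Security;

    public class DnsTtl {
      public static void main(String[] args) throws Exception {
        // Must be set before the JVM performs its first DNS lookup; values are in seconds.
        Security.setProperty("networkaddress.cache.ttl", "60");
        Security.setProperty("networkaddress.cache.negative.ttl", "10");
        System.out.println(InetAddress.getByName("example.org"));
      }
    }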

Re: Some issues!

2009-08-03 Thread Steve Loughran
Sugandha Naolekar wrote: I want to encrypt the data that would be placed in HDFS. So I will have to use some kind of encryption algorithm, right? Also, this encryption is to be done on the data before placing it in HDFS. How can this be done? Are any special APIs available in Hadoop for the above
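
Hadoop of that era had no built-in encryption API, so encryption has to happen client-side before the write - for example by wrapping the HDFS output stream in a javax.crypto CipherOutputStream. A sketch under that assumption (the path and the throwaway AES key are placeholders; real use needs proper key management):

    import javax.crypto.Cipher;
    import javax.crypto.CipherOutputStream;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class EncryptedHdfsWrite {
      public static void main(String[] args) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();  // toy key, not persisted anywhere
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);

        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream raw = fs.create(new Path("/user/demo/secret.bin"));
        CipherOutputStream out = new CipherOutputStream(raw, cipher);   // encrypts as it writes
        out.write("sensitive bytes".getBytes("UTF-8"));
        out.close();  // flushes the final cipher block and closes the HDFS stream
      }
    }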

Re: Map performance with custom binary format

2009-07-29 Thread Steve Loughran
Scott Carey wrote: Well, the first thing to do in any performance bottleneck investigation is to look at the machine hardware resource usage. During your test, what is the CPU use and disk usage? What about network utilization? Top, vmstat, iostat, and some network usage monitoring would be

Re: Hadoop in a Heterogeneous Environment - taking advantage of different processor specs

2009-07-28 Thread Steve Loughran
Saptarshi Guha wrote: Hello, not sure if this has been asked or answered. Suppose I have tasktrackers A1, A2, A3, each with 4 cores and 16GB RAM. mapred.tasktracker.map.tasks.maximum = 6 mapred.tasktracker.reduce.tasks.maximum = 4 Now suppose I have one more machine (X) with 8 cores and 32GB RAM.

Re: A few questions about Hadoop and hard-drive failure handling.

2009-07-24 Thread Steve Loughran
Ryan Smith wrote: Todd, excellent info, thank you. I use Ganglia; I will set up Nagios though, good idea. Just one clarification on question 1: what if I actually lose all my master data dirs and have no backup on the secondary namenode - are the data blocks on all the slaves lost in that

Re: Questions on How the Namenode Assign Blocks to Datanodes

2009-07-24 Thread Steve Loughran
Boyu Zhang wrote: Dear all, I have a question about HDFS and I cannot find the answer in the documents on the Apache website. I have a cluster of 4 machines: one is the namenode and the other 3 are datanodes. When I put 6 files, each 430 MB, into HDFS, the 6 files are split into 42
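
For what it's worth, the 42 follows directly from the default HDFS block size of the time (64 MB): ceil(430/64) = 7 blocks per file, times 6 files. A quick check, assuming the cluster is on that default:

    public class BlockCount {
      public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;     // dfs.block.size default (64 MB)
        long fileSize  = 430L * 1024 * 1024;    // each of the 6 files
        long blocksPerFile = (fileSize + blockSize - 1) / blockSize;  // ceil -> 7
        System.out.println(6 * blocksPerFile + " blocks");            // prints "42 blocks"
      }
    }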

Re: A few questions about Hadoop and hard-drive failure handling.

2009-07-24 Thread Steve Loughran
Ryan Smith wrote: But you don't want to be the one trying to write something just after your production cluster lost its namenode data. Steve, I wasn't planning on trying to solve something like this in production. I would assume everyone here is a professional and wouldn't even think of

Re: Remote access to cluster using user as hadoop

2009-07-24 Thread Steve Loughran
Pallavi Palleti wrote: Hi all, I tried to track down the place where I can add some conditions for not allowing any remote user with the username hadoop (root user) (other than some specific hostnames or IP addresses). I could see the call path as FsShell - DistributedFileSystem - DFSClient -

Re: Benchmarks

2009-07-22 Thread Steve Loughran
JQ Hadoop wrote: I'm wondering where one can get the PageRank implementation for a try. Thanks. http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/citerank/ works over the CiteSeer citation dataset

Re: HDFS and long-running processes

2009-07-21 Thread Steve Loughran
Todd Lipcon wrote: On Sat, Jul 4, 2009 at 9:08 AM, David B. Ritch david.ri...@gmail.com wrote: Thanks, Todd. Perhaps I was misinformed, or misunderstood. I'll make sure I close files occasionally, but it's good to know that the only real issue is with data recovery after losing a node.

Re: Questions on Hadoop On Demand (HOD)

2009-07-21 Thread Steve Loughran
Boyu Zhang wrote: Dear all, are there any other virtual machines that I can use to provide a Hadoop cluster over a physical cluster? 1. You can bring up Hadoop under VMware, VirtualBox, Xen. There are problems with CentOS 5.x/RHEL5 under VirtualBox (some clock issue generates 100% load
