Name node failover

2007-01-08 Thread Sarvesh Singh
Hi, from some emails on this alias it looks like name node failover is not yet implemented. I tested a name node failover scenario and it does not work. When is it expected to be implemented? Thanks, Sarvesh

Task tracker failover

2007-01-08 Thread Sarvesh Singh
Hi, task tracker failover does happen, but the surviving slave instance first copies the blocks locally and only then recovers. Because of that copy operation it takes 7-10 minutes in my test case, which seems too long. Can we get faster failover and resume the operation quickly? Is there

Loadbalancing

2007-01-08 Thread Sarvesh Singh
Hi, I want load balancing to happen when map/reduce tasks are distributed over the Hadoop cluster. I have machines with different configurations and want to utilize their CPUs well. How can I do that? Thanks, Sarvesh
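Not something from the thread itself, but the usual first knob for mismatched machines is capping how many tasks each tasktracker runs concurrently, so weaker nodes take on fewer simultaneous tasks. A sketch of a hadoop-site.xml override on a smaller slave, assuming the mapred.tasktracker.tasks.maximum property found in hadoop-default.xml of releases from this era (verify the exact name against your version):

    <!-- hadoop-site.xml on a less powerful slave: allow fewer concurrent
         tasks on this node. The property name is assumed from era-typical
         hadoop-default.xml; check your release before relying on it. -->
    <property>
      <name>mapred.tasktracker.tasks.maximum</name>
      <value>2</value>
    </property>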

Design Patterns for Hadoop

2007-01-08 Thread Sarvesh Singh
Hi, can somebody send me an article/paper/doc suggesting how to write and design map/reduce programs over Hadoop so that we minimize I/O, reduce block transfer, and use Hadoop effectively? Thanks, Sarvesh
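For what it's worth, the single most common pattern for cutting block transfer is registering a combiner, so map output is pre-aggregated on each node before the shuffle. Below is a rough sketch using the classic word-count shape and the old org.apache.hadoop.mapred API; exact method names varied a little across early releases, so treat it as illustrative rather than copy-paste for any particular version.

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    public class WordCount {

      // Emit (word, 1) for every token in the input line.
      public static class Map extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
          StringTokenizer tokenizer = new StringTokenizer(value.toString());
          while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
          }
        }
      }

      // Sum the counts for a word; also usable map-side as the combiner.
      public static class Reduce extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
          int sum = 0;
          while (values.hasNext()) {
            sum += values.next().get();
          }
          output.collect(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        // The combiner is where the I/O saving comes from: map output is
        // pre-summed on each node before the shuffle, so far fewer
        // (word, 1) pairs cross the network between machines.
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
      }
    }

With the combiner in place each map task sends one (word, count) pair per distinct word instead of one pair per occurrence, which is the kind of reduction in intermediate data the question is after.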

Re: s3

2007-01-08 Thread Doug Cutting
Tom White wrote: And what do people think of the following? We already have a bunch of stuff up in S3 that we'd like to use as input to a Hadoop MapReduce job, only it wasn't put there by Hadoop, so it doesn't have the Hadoop format where a file is actually a list of blocks. [ ... ] The best

Re: s3

2007-01-08 Thread Tom White
Here's a thought: implement a simple read-only HttpFileSystem that works for MapReduce input. It couldn't perform directory enumeration, so job inputs would have to be listed explicitly, not as directories or glob patterns. For raw S3, one could make a subclass that adds directory enumeration,
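The mechanical core of such a read-only HttpFileSystem is small: the object's length via a HEAD request and positioned reads via HTTP Range headers, which is all a seekable MapReduce input stream really needs. A rough sketch using only java.net follows; the class name is made up, and the wiring into Hadoop's FileSystem/FSDataInputStream interfaces is deliberately left out since those details aren't settled in the thread.

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class HttpRangeReader {
      private final URL url;

      public HttpRangeReader(URL url) {
        this.url = url;
      }

      // Report the object's length via a HEAD request, so input splits can
      // be sized without downloading the file.
      public long length() throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("HEAD");
        long len = conn.getContentLengthLong();
        conn.disconnect();
        return len;
      }

      // Read up to 'len' bytes starting at 'pos' using an HTTP Range header,
      // which is what a seek()-able read-only stream boils down to.
      public int readFully(long pos, byte[] buf, int off, int len) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Range", "bytes=" + pos + "-" + (pos + len - 1));
        InputStream in = conn.getInputStream();
        try {
          int total = 0;
          while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) break;
            total += n;
          }
          return total;
        } finally {
          in.close();
          conn.disconnect();
        }
      }
    }

Because plain HTTP offers no directory listing, job inputs over such a filesystem would indeed have to be enumerated explicitly, as the message notes.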

Re: s3

2007-01-08 Thread Doug Cutting
Tom White wrote: This sounds like a good plan. I wonder whether the existing block-based s3 scheme should be renamed (to s3block or similar) so that s3 is the scheme that stores raw files as you describe? Perhaps s3fs would be best for the full FileSystem implementation, and simply s3 for direct

Re: s3

2007-01-08 Thread Michael Stack
Bryan A. P. Pendleton wrote: S3 has a lot of somewhat weird limits right now, which make some of this tricky for the common case. Files can only be stored as a single S3 object if they are less than 5 GB, and not 2-4 GB in size, for instance. Perhaps an implementation could throw an exception
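Along the lines of that last (truncated) suggestion, a raw-S3 write path could simply refuse up front any file the store cannot hold as a single object, mirroring the limits quoted above rather than failing partway through an upload. The class and constant names here are illustrative, not from any Hadoop release.

    import java.io.IOException;

    public class S3ObjectSizeCheck {
      // Limits as described in the thread: single objects must be under 5 GB,
      // and the 2-4 GB range is also reported as problematic.
      private static final long GB = 1024L * 1024 * 1024;
      private static final long MAX_SINGLE_OBJECT_BYTES = 5 * GB;

      // Throw before starting an upload that cannot succeed as one object.
      public static void checkSingleObjectWritable(String key, long length)
          throws IOException {
        boolean tooLarge = length >= MAX_SINGLE_OBJECT_BYTES;
        boolean inAwkwardRange = length >= 2 * GB && length <= 4 * GB;
        if (tooLarge || inAwkwardRange) {
          throw new IOException("Cannot store " + key + " (" + length
              + " bytes) as a single S3 object; consider the block-based scheme.");
        }
      }
    }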