Re: Upgrade from 16.2 to 17.0 Errors

2008-05-28 Thread Ion Badita
Hi, most probably you need to recompile your classes against the Hadoop 0.17 API. Pay attention to Hadoop classes with generic parameters. Ion. Kayla Jay wrote: Hi, I just upgraded from scratch from 16.2 to 17.0. A program that used to work on 16.2 is causing errors with version 17.0. Here's
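For anyone hitting the same upgrade, a minimal sketch of a mapper written against the generified 0.17-era org.apache.hadoop.mapred API (the class and the output it emits are purely illustrative, not from the original thread):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Mapper, OutputCollector, etc. carry generic parameters; code compiled
// against older raw-typed signatures should be rebuilt and, where needed,
// updated to parameterized declarations like these.
public class LineLengthMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> output,
                  Reporter reporter) throws IOException {
    // Emit each input line together with its length in bytes.
    output.collect(value, new LongWritable(value.getLength()));
  }
}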

Re: How to make a lucene Document hadoop Writable?

2008-05-28 Thread David Chung
unsubscribe

About Metrics update

2008-05-28 Thread Ion Badita
Hi, I looked over the class org.apache.hadoop.metrics.spi.AbstractMetricsContext and I have a question: why, in update(MetricsRecordImpl record), is the metricUpdates map not cleared after the updates are merged into metricMap? Because of this, on every update() old increments are merged in
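The concern can be illustrated with a toy buffer (hypothetical names and types; this is not the actual AbstractMetricsContext code):

import java.util.HashMap;
import java.util.Map;

// Sketch of an accumulator with a pending-updates buffer. If the buffer is
// not cleared after being merged into the accumulated map, the same
// increments get re-applied on every subsequent update().
class MetricsBufferSketch {
  private final Map<String, Integer> metricMap = new HashMap<String, Integer>();
  private final Map<String, Integer> metricUpdates = new HashMap<String, Integer>();

  void increment(String name, int delta) {
    Integer pending = metricUpdates.get(name);
    metricUpdates.put(name, pending == null ? delta : pending + delta);
  }

  void update() {
    for (Map.Entry<String, Integer> e : metricUpdates.entrySet()) {
      Integer old = metricMap.get(e.getKey());
      metricMap.put(e.getKey(), old == null ? e.getValue() : old + e.getValue());
    }
    // Without this call the old increments would be merged again next time.
    metricUpdates.clear();
  }
}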

behavior of MapWritable as Key in Map Reduce

2008-05-28 Thread Tarandeep Singh
Hi, I want to understand the behavior of MapWritable if used as an intermediate key in Mappers and Reducers. Suppose I create a MapWritable object with the following key/value pairs in it: (K1, V1), (K2, V2), (K3, V3). How will the MapReduce framework group and sort the keys (MapWritable objects)
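For concreteness, the key described in the question could be built like this (Text entries chosen purely for illustration); how such keys are then grouped and sorted depends entirely on the comparator registered for the job:

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;

public class MapWritableKeyExample {
  // Build the MapWritable from the question; a mapper would emit it via
  // output.collect(key, value).
  public static MapWritable buildKey() {
    MapWritable key = new MapWritable();
    key.put(new Text("K1"), new Text("V1"));
    key.put(new Text("K2"), new Text("V2"));
    key.put(new Text("K3"), new Text("V3"));
    return key;
  }
}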

Re: splitting of big files?

2008-05-28 Thread Erik Paulson
On Tue, May 27, 2008 at 10:49:38AM -0700, Ted Dunning wrote: There is a good tutorial on the wiki about this. Your problem here is that you have conflated two concepts. The first is the splitting of files into blocks for storage purposes. This has nothing to do with what data a program
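To make the distinction concrete: the HDFS block size controls how a file is stored, while the InputFormat decides how the same file is carved into input splits for map tasks. A small sketch (assuming the 0.17-era org.apache.hadoop.mapred API; the class name is made up) that disables logical splitting without touching block size:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.TextInputFormat;

// The file is still stored as multiple HDFS blocks; this only makes the job
// hand the whole file to a single map task instead of one task per split.
public class WholeFileTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;
  }
}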

hadoop on EC2

2008-05-28 Thread Andreas Kostyrka
Hi! I just wondered what other people use to access the Hadoop web servers when running on EC2? Ideas that I had: 1.) opening ports 50030 and so on = not good, data goes unprotected over the internet. Even if I could enable some form of authentication it would still be plain HTTP. 2.) Some kind of

Re: hadoop on EC2

2008-05-28 Thread Jake Thompson
What is wrong with opening up the ports only to the hosts that you want to have access to them? This is what I am currently doing; -s 0.0.0.0/0 is everyone everywhere, so change it to -s my.ip.add.ress/32. On Wed, May 28, 2008 at 4:22 PM, Andreas Kostyrka [EMAIL PROTECTED] wrote: Hi! I just

Re: hadoop on EC2

2008-05-28 Thread Allen Wittenauer
On 5/28/08 1:22 PM, Andreas Kostyrka [EMAIL PROTECTED] wrote: I just wondered what other people use to access the Hadoop web servers when running on EC2? While we don't run on EC2 :), we do protect the Hadoop web processes by putting a proxy in front of them. A user connects to the proxy,

Re: hadoop on EC2

2008-05-28 Thread Andreas Kostyrka
That presumes that you have a static source address. Plus, for nontechnical reasons, changing the firewall rules is nontrivial. (I'm responsible for the inside of the VMs, but somebody else holds the EC2 keys, don't ask.) Andreas. On Wednesday, 28.05.2008, at 16:27 -0400, Jake Thompson wrote: What

Re: hadoop on EC2

2008-05-28 Thread Chris Anderson
Andreas, If you can ssh into the nodes, you can always set up port-forwarding with ssh -L to bring those ports to your local machine. On Wed, May 28, 2008 at 1:51 PM, Andreas Kostyrka [EMAIL PROTECTED] wrote: What I wonder is what ports do I need to access? 50060 on all nodes. 50030 on the

Re: hadoop on EC2

2008-05-28 Thread Ted Dunning
That doesn't work because the various web pages have links or redirects to other pages on other machines. Also, you would need to ssh to ALL of your cluster nodes to get the file browser to work. Better to do the proxy thing. On 5/28/08 2:16 PM, Chris Anderson [EMAIL PROTECTED] wrote: Andreas,

Re: hadoop on EC2

2008-05-28 Thread Chris Anderson
On Wed, May 28, 2008 at 2:23 PM, Ted Dunning [EMAIL PROTECTED] wrote: That doesn't work because the various web pages have links or redirects to other pages on other machines. Also, you would need to ssh to ALL of your cluster to get the file browser to work. True. That makes it a little

Re: hadoop on EC2

2008-05-28 Thread Jim R. Wilson
Recently I spent some time hacking the contrib/ec2 scripts to install and configure OpenVPN on top of the other installed packages. Our use case required that all the slaves running mappers be able to connect back through to our primary MySQL database (firewalled, as you can imagine).

Re: 0.16.4 DataNode problem...

2008-05-28 Thread C G
I've repeated the experiment under more controlled circumstances, creating a new file system formatted by 0.16.4 and then populating it. In this scenario we see the same problem: during the reduce phase the DataNode instances consume more and more memory until the system fails. Further,

Need example of MapWritable as Intermediate Key

2008-05-28 Thread Tarandeep Singh
Hi, Can someone point me to example code where MapWritable/SortedMapWritable is used as an intermediate key? I am looking for how to set the comparator for MapWritable/SortedMapWritable so that the framework groups/sorts the intermediate keys in accordance with my requirement - sort the
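One possible approach (a sketch only, not tested): register a raw comparator that deserializes the two MapWritable keys and orders them by one chosen entry. The entry name "sortField" is made up for illustration; the ordering rule would be whatever the job actually needs.

import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.io.Text;

// Orders MapWritable keys by the Text value stored under "sortField".
public class MapWritableComparator implements RawComparator<MapWritable> {

  private static final Text SORT_FIELD = new Text("sortField");

  private final DataInputBuffer buffer1 = new DataInputBuffer();
  private final DataInputBuffer buffer2 = new DataInputBuffer();

  public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
    MapWritable key1 = new MapWritable();
    MapWritable key2 = new MapWritable();
    try {
      buffer1.reset(b1, s1, l1);
      key1.readFields(buffer1);
      buffer2.reset(b2, s2, l2);
      key2.readFields(buffer2);
    } catch (IOException e) {
      throw new RuntimeException("could not deserialize MapWritable key", e);
    }
    return compare(key1, key2);
  }

  public int compare(MapWritable a, MapWritable b) {
    Text left = (Text) a.get(SORT_FIELD);
    Text right = (Text) b.get(SORT_FIELD);
    return left.compareTo(right);
  }
}

With the old-style JobConf this would be registered via something like conf.setOutputKeyComparatorClass(MapWritableComparator.class), and a separate grouping comparator via setOutputValueGroupingComparator(...) if grouping should differ from sorting. Deserializing on every comparison is expensive, which is one reason simpler composite keys are often preferred over MapWritable as an intermediate key.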

Re: hadoop on EC2

2008-05-28 Thread Nate Carlson
On Wed, 28 May 2008, Andreas Kostyrka wrote: 1.) opening ports 50030 and so on = not good, data goes unprotected over the internet. Even if I could enable some form of authentication it would still be plain HTTP. Personally, I set up an Apache server (with HTTPS and auth), and then set up