RE: Profiling map reduce jobs?

2013-06-29 Thread David Poisson
when using MultipleOutputs? We are taking our input and splitting it into 3 files, so MultipleOutputs seems like a natural choice. Performance is a bit slow, though. Cheers! David From: David Poisson [david.pois...@ca.fujitsu.com] Sent: Thursday

Profiling map reduce jobs?

2013-06-27 Thread David Poisson
Howdy, I want to take a look at an MR job that seems to be slower than I had hoped. Mind you, this MR job is only running on a pseudo-distributed VM (Cloudera CDH4). I have modified my mapred-site.xml with the following (the last one is commented out because it crashes my MR job):

RE: Best practices for loading data into hbase

2013-06-05 Thread David Poisson
it be interfering? Other than that, my VM's networking is set to bridged, if that makes any difference. Mind you, I'm trying to connect from my VM to my VM. I'm at a loss here. Could really use some guidance. Thanks! David From: David Poisson [david.pois

Best practices for loading data into hbase

2013-05-31 Thread David Poisson
Hi, we are still very new to all of this HBase/Hadoop/MapReduce stuff. We are looking for the best practices that will fit our requirements. We are currently using the latest Cloudera VMware image (single node) for our development tests. The problem is as follows: we have multiple sources in