Re: mapr common library?

2011-10-20 Thread Alex Gauthier
Thanks guys, and sorry for not being more specific, but yes, Cloud9 and Mahout are definitely what I'm looking for; much appreciated. On Wed, Oct 19, 2011 at 9:23 PM, Harsh J ha...@cloudera.com wrote: Alex, I know of Cloud9 http://lintool.github.com/Cloud9/index.html as a library that caters to

Re: execute hadoop job from remote web application

2011-10-20 Thread Steve Loughran
On 18/10/11 17:56, Harsh J wrote: Oleg, It will pack up the jar that contains the class specified by setJarByClass into its submission jar and send it up. That's the function of that particular API method. So, your deduction is almost right there :) On Tue, Oct 18, 2011 at 10:20 PM, Oleg

Re: Is there a good way to see how full hdfs is

2011-10-20 Thread Mapred Learn
Hi, I have the same question regarding the documentation, and also: is there something like this for memory and CPU utilization? Sent from my iPhone Thanks, JJ On Oct 19, 2011, at 5:00 PM, Rajiv Chittajallu raj...@yahoo-inc.com wrote: ivan.nov...@emc.com wrote on 10/18/11 at 09:23:50 -0700:

Fixing Mis-replicated blocks

2011-10-20 Thread John Meagher
After a hardware move with an unfortunately misconfigured rack awareness script, our Hadoop cluster has a large number of mis-replicated blocks. After about a week things haven't gotten better on their own. Is there a good way to trigger the name node to fix the mis-replicated blocks? Here's what I'm

Capacity Scheduler : how to use more than the queue capacity ?

2011-10-20 Thread Sami Dalouche
Hi, By choosing the capacity scheduler, I was under the impression that each queue could borrow other queues' resources if they are available. Let's say we have the configuration below, and a total capacity of 180 slots. What I expect is that whenever default and cpu-bound queues have no job,

Re: Capacity Scheduler : how to use more than the queue capacity ?

2011-10-20 Thread Sami Dalouche
Hi, I ended up finding another post about the exact same issue on this exact same mailing list, that was just a few days old... It looks like the setting to play with is mapred.capacity-scheduler.default-user-limit-factor Sami On Thu, Oct 20, 2011 at 1:25 PM, Sami Dalouche sa...@hopper.com
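As a rough illustration of why that setting matters: as I understand the capacity scheduler, the user limit factor is a multiple of the queue's configured capacity that a single user is allowed to acquire, and it defaults to 1. The sketch below is plain arithmetic, not scheduler code. The 180-slot total comes from the original question; the 20% queue capacity and the function name are made up for the example, and any other limits the scheduler may apply are ignored.

```python
def max_slots_for_user(total_slots, queue_capacity_percent, user_limit_factor=1.0):
    """Estimate the slots one user can hold in a queue.

    Hypothetical helper for illustration only: it assumes the per-user
    ceiling is queue capacity multiplied by the user limit factor,
    capped at the size of the cluster.
    """
    queue_slots = total_slots * queue_capacity_percent / 100.0
    return min(total_slots, int(queue_slots * user_limit_factor))


# 180 slots in total (from the thread); a 20% queue is an assumed value.
print(max_slots_for_user(180, 20))      # factor 1: limited to the queue's own share
print(max_slots_for_user(180, 20, 5))   # factor 5: may grow into idle capacity
```

With a factor of 1 a single user stays within the queue's configured share no matter how idle the other queues are, which matches the behaviour described in the original question.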

Re: Hadoop archive

2011-10-20 Thread John George
Could you try 0.20.205.0? The HAR issue in branch-20-security was updated by JIRA HADOOP-7539. -Original Message- From: Jonas Hartwig jonas.hart...@cision.com Reply-To: common-user@hadoop.apache.org common-user@hadoop.apache.org Date: Mon, 17 Oct 2011 02:11:24 -0700 To:

Connecting to vm through java

2011-10-20 Thread JAX
Hi guys: I'm getting the dreaded org.apache.hadoop.ipc.Client$Connection handleConnectionFailure when connecting to Cloudera's Hadoop (running in a VM) to request running a simple M/R job (from a machine outside the Hadoop VM). I've seen a lot of posts about this online, and it's

running sqoop on hadoop cluster

2011-10-20 Thread firantika
Hi All, I'm a newbie on Hadoop. If I install Hadoop on 2 nodes, where does HDFS run: on the master or on the slave node? And if I run Sqoop to export from a DBMS to Hive, does running Hadoop on multiple nodes speed things up compared with running it on a single node? please give me

Re: Fixing Mis-replicated blocks

2011-10-20 Thread Jeff Bean
Do setrep -w on the increase to force the new replica before decreasing again. Of course, the little script only works if the replication factor is 3 on all the files. If it's a variable amount you should use the java API to get the existing factor and then increase by one and then decrease.
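For the variable-replication case, here is a minimal sketch of the increase-then-decrease idea. It only builds the `hadoop fs -setrep` command lines and does not run them. The function name and the example paths are hypothetical, and the current replication factor of each file is assumed to be supplied by the caller (for instance read through the Java API mentioned above).

```python
import shlex


def setrep_fix_commands(paths_with_replication):
    """Build setrep command pairs that bump each file's replication by one
    and then restore it.

    paths_with_replication: iterable of (hdfs_path, current_factor) pairs.
    The increase uses -w so it waits for the new replica before the
    decrease is issued, as suggested in the message above.
    """
    commands = []
    for path, current in paths_with_replication:
        if current < 1:
            raise ValueError(f"invalid replication factor for {path}: {current}")
        quoted = shlex.quote(path)
        # Raise by one and wait for the extra replica to be created.
        commands.append(f"hadoop fs -setrep -w {current + 1} {quoted}")
        # Drop back to the original factor.
        commands.append(f"hadoop fs -setrep {current} {quoted}")
    return commands


# Example paths and factors are made up for illustration.
for line in setrep_fix_commands([("/data/a", 3), ("/data/b", 2)]):
    print(line)
```

Generating the commands as text first makes it easy to review them, or to try them on a handful of files, before touching the whole cluster.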