date:20120911

Re: Non utf-8 chars in input

2012-09-11 Thread Joshi, Rekha

Hi Ajay, Try SequenceFileAsBinaryInputFormat ? Thanks Rekha On 11/09/12 11:24 AM, Ajay Srivastava ajay.srivast...@guavus.com wrote: Hi, I am using default inputFormat class for reading input from text files but the input file has some non utf-8 characters. I guess that TextInputFormat class

what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube

Hi, What happens when an existing (not new) datanode rejoins a cluster for following scenarios: 1. Some of the blocks it was managing are deleted/modified? 2. The size of the blocks are now modified say from 64MB to 128MB? 3. What if the block replication factor was one

Re: what happens when a datanode rejoins?

2012-09-11 Thread George Datskos

Hi Mehul Some of the blocks it was managing are deleted/modified? The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes. The size of the blocks are now modified say

Re: what happens when a datanode rejoins?

2012-09-11 Thread George Datskos

Mehul, Let me make an addition. Some of the blocks it was managing are deleted/modified? Blocks that are deleted in the interim will be deleted on the rejoining node as well, after it rejoins. Regarding the modified, I'd advise against modifying blocks after they have been fully written.

Re: what happens when a datanode rejoins?

2012-09-11 Thread Harsh J

George has answered most of these. I'll just add on: On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube mehul_cho...@symantec.com wrote: 1. Some of the blocks it was managing are deleted/modified? A DN runs a block report upon start, and sends the list of blocks to the NN. NN validates them
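A rough illustration of the reconciliation described in this thread: the rejoining datanode reports the blocks it holds, and any reported block the namenode no longer tracks is a candidate for deletion on that node. This is a toy model only, not NameNode code; the class name, method name, and the use of plain long block IDs are assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class BlockReportSketch {
    // Returns the blocks the datanode reported that the namenode no longer tracks.
    // In this toy model, these are the blocks the datanode would be told to delete.
    static Set<Long> staleBlocks(Set<Long> reported, Set<Long> known) {
        Set<Long> stale = new TreeSet<>(reported);
        stale.removeAll(known);
        return stale;
    }

    public static void main(String[] args) {
        // Hypothetical block IDs: block 1 was deleted while the datanode was away.
        Set<Long> reported = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> known = new TreeSet<>(Arrays.asList(2L, 3L, 4L));
        System.out.println("blocks to delete on the datanode: " + staleBlocks(reported, known));
    }
}
```

The set difference is the whole idea here; the real namenode also handles replication counts and corrupt replicas, which this sketch leaves out.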

Re: how to make different mappers execute different processing on same data ?

2012-09-11 Thread Narasingu Ramesh

Hi Jason, Mehmet said is exactly correct ,without reducers we cannot increase performance please you can add mappers and reducers in any processing data you can get output and performance is good. Thanks Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 9:31 AM, Mehmet

Re: Non utf-8 chars in input

2012-09-11 Thread Ajay Srivastava

Rekha, I guess that problem is that Text class uses utf-8 encoding and one can not set other encoding for this class. I have not seen any other Text like class which supports other encoding otherwise I have written my custom input format class. Thanks for your inputs. Regards, Ajay
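One commonly used workaround, sketched here with the Java standard library only, is to decode the raw record bytes with the charset the file was actually written in, rather than relying on UTF-8. The charset ISO-8859-1 and the class and method names below are assumptions for illustration; the thread does not say which encoding the input uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDecodeSketch {
    // Decodes the first `length` bytes of `raw` using the given charset.
    static String decode(byte[] raw, int length, Charset charset) {
        return new String(raw, 0, length, charset);
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1; the lone byte 0xE9 is not valid UTF-8.
        byte[] raw = {'c', 'a', 'f', (byte) 0xE9};
        System.out.println(decode(raw, raw.length, StandardCharsets.ISO_8859_1));
    }
}
```

Inside a mapper the same call would presumably take value.getBytes() and value.getLength() from the Text value, since the backing array returned by getBytes() is only valid up to getLength().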

Re: configure hadoop-0.22 fairscheduler

2012-09-11 Thread Jameson Li

Hi Harsh, Thanks for your reply. And I am sorry for my unclear description. As I mentioned previously, I think I configured the fairscheduler correctly in hadoop-0.22.0. But when I commit lots of the jobs: many big jobs (map number and reduce number is bigger than the map/reduce slot) commit

what happens when a datanode rejoins?

2012-09-11 Thread mehul choube

Hi, What happens when an existing (not new) datanode rejoins a cluster for following scenarios: a) Some of the blocks it was managing are deleted/modified? b) The size of the blocks are now modified say from 64MB to 128MB? c) What if the block replication factor was one (yea not in most

RE: build failure - trying to build hadoop trunk checkout

2012-09-11 Thread Tony Burton

Changing the hostname to lowercase fixed this particular problem - thanks for your replies. The build is failing elsewhere now, I'll post a new thread for that. Tony From: Tony Burton [mailto:tbur...@sportingindex.com] Sent: 10 September 2012 10:44 To: user@hadoop.apache.org Subject: RE:

Re: Can't run PI example on hadoop 0.23.1

2012-09-11 Thread Narasingu Ramesh

Hi Vinod, Please check whether input file location and output file location doesnt match. please find your input file first put into HDFS and then run MR job it is working fine. Thanks Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 4:23 AM, Vinod Kumar Vavilapalli

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Narasingu Ramesh

Hi, Please find i think one command is there then only build the all applications. Thanks Regards, Ramesh.Narasingu On Tue, Sep 11, 2012 at 2:28 PM, Tony Burton tbur...@sportingindex.com wrote: Hi, I’ve checked out the hadoop trunk, and I’m running “mvn test” on the

RE: what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube

The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes. What happens when the datanode rejoins after namenode has already re-replicated the blocks it was managing? Will

Re: what happens when a datanode rejoins?

2012-09-11 Thread Harsh J

Hi, Inline. On Tue, Sep 11, 2012 at 2:36 PM, Mehul Choube mehul_cho...@symantec.com wrote: The namenode will asynchronously replicate the blocks to other datanodes in order to maintain the replication factor after a datanode has not been in contact for 10 minutes. What happens when the

RE: what happens when a datanode rejoins?

2012-09-11 Thread Mehul Choube

DataNode rejoins take care of only NameNode. Sorry didn't get this From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com] Sent: Tuesday, September 11, 2012 2:38 PM To: user@hadoop.apache.org Subject: Re: what happens when a datanode rejoins? Hi Mehul, DataNode rejoins take care

Re: Undeliverable messages

2012-09-11 Thread Harsh J

Ha, good sleuthing. I just moved it to INFRA, as no one from our side has gotten to this yet. I guess we can only moderate, not administrate. So the ticket now awaits action from INFRA on ejecting it out. On Tue, Sep 11, 2012 at 2:34 PM, Tony Burton tbur...@sportingindex.com wrote: Thanks

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton

Hi Ramesh Thanks for the quick reply, but I'm having trouble following your English. Are you saying that there is one command to build everything? If so, can you tell me what it is? Tony From: Narasingu Ramesh [mailto:ramesh.narasi...@gmail.com] Sent: 11 September 2012 10:06 To:

Re: Undeliverable messages

2012-09-11 Thread Harsh J

And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony! On Tue, Sep 11, 2012 at 2:44 PM, Harsh J ha...@cloudera.com wrote: Ha, good sleuthing. I just moved it to INFRA, as no one from our side has gotten to this yet. I guess we can only moderate, not administrate. So

Re: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Steve Loughran

It's probably some maven thing -in particular Maven's habit of grabbing the online nightly snapshots off apache rather than local, try mvn clean install -DskipTests -offline to force in all the artifacts, then run the MR tests Tony -why not get on the mapreduce-dev mailing list, as this is the

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton

Thanks Steve, I’ll try the mvn command you suggest. All the snapshots I can see came from repository.apache.org though. How do I run the MR tests only? Thanks for the mapreduce-dev mailing list suggestion, I thought all lists had merged into one though – did I get the wrong end of the stick?

RE: Undeliverable messages

2012-09-11 Thread Tony Burton

No problem! I'll remove that Outlook filter now... :) -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: 11 September 2012 10:34 To: user@hadoop.apache.org Subject: Re: Undeliverable messages And done. We shouldn't get this anymore. Thanks for bumping on this issue Tony!

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton

Good suggestions Harsh and Hemanth. When I was asked to submit a patch for hadoop 1.0.3, I thought it a good exercise to work through the build process to become familiar even though the patch is documentation-only. Maybe the requests for patches could come with a list of suggested reading as

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks

Hi Lin The default value for number of reducers is 1 <name>mapred.reduce.tasks</name> <value>1</value> It is not determined by data volume. You need to specify the number of reducers for your mapreduce jobs as per your data volume. Regards Bejoy KS On Tue, Sep 11, 2012 at 4:53 PM, Jason Yang

Re: FW: Doubts Reg

2012-09-11 Thread sudha sadhasivam

Dear Madam, I am attaching relevant screenshots of running Hive. Profile.png shows the contents of /etc/profile. hive_common_lib.png shows that h-ve_common*.jar is already in $HIVE_HOME/lib; here $HIVE_HOME is /home/yahoo/hive/build/dist, as evident from classpath_err.png. Yours Truly G

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks

Hi Lin The default values for all the properties are in core-default.xml hdfs-default.xml and mapred-default.xml Regards Bejoy KS On Tue, Sep 11, 2012 at 5:06 PM, Jason Yang lin.yang.ja...@gmail.com wrote: Hi, Bejoy Thanks for you reply. where could I find the default value of

Re: Some general questions about DBInputFormat

2012-09-11 Thread Bejoy KS

Hi Yaron Sqoop uses a similar implementation. You can get some details there. Replies inline • (more general question) Are there many use-cases for using DBInputFormat? Do most Hadoop jobs take their input from files or DBs? From my small experience Most MR jobs have data in hdfs. It is

Question about the task assignment strategy

2012-09-11 Thread Hiroyuki Yamada

Hi, I want to make sure my understanding about task assignment in hadoop is correct or not. When scanning a file with multiple tasktrackers, I am wondering how a task is assigned to each tasktracker . Is it based on the block sequence or data locality ? Let me explain my question by example.

Re: Question about the task assignment strategy

2012-09-11 Thread Hemanth Yamijala

Hi, Task assignment takes data locality into account first and not block sequence. In hadoop, tasktrackers ask the jobtracker to be assigned tasks. When such a request comes to the jobtracker, it will try to look for an unassigned task which needs data that is close to the tasktracker and will

RE: hadoop trunk build failure - yarn, surefire related?

2012-09-11 Thread Tony Burton

Another mvn test caused the build to fail slightly further down the road. As my Jira issue is documentation-only, I've submitted the patch anyway. Is this multiple-failure scenario typical for trying to build hadoop from the trunk? It's sure putting me off submitting code in future. Is there

Re: Error in : hadoop fsck /

2012-09-11 Thread Hemanth Yamijala

Could you please review your configuration to see if you are pointing to the right namenode address ? (This will be in core-site.xml) Please paste it here so we can look for clues. Thanks hemanth On Tue, Sep 11, 2012 at 9:25 PM, yogesh dhari yogeshdh...@live.com wrote: Hi all, I am running

Re: Error in : hadoop fsck /

2012-09-11 Thread Arpit Gupta

Yogesh try this hadoop fsck -Ddfs.http.address=localhost:50070 / 50070 is the default http port that the namenode runs on. The property dfs.http.address should be set in your hdfs-site.xml -- Arpit Gupta Hortonworks Inc. http://hortonworks.com/ On Sep 11, 2012, at 9:03 AM, yogesh dhari

how to specify the root directory of hadoop on slave node?

2012-09-11 Thread Richard Tang

Hi, All I need to setup a hadoop/hdfs cluster with one namenode on a machine and two datanodes on two other machines. But after setting datanode machiines in conf/slaves file, running bin/start-dfs.sh can not start hdfs normally.. I am aware that I have not specify the root directory hadoop is

security-in-HADOOP

2012-09-11 Thread nisha

How security is maintained in hadoop, is it maintained by giving folder/file permissions in hadoop how can i make sure that somebody else dunt write in to my hdfs file system ...

removing datanodes from clustes.

2012-09-11 Thread yogesh dhari

Hello all, I am not getting the clear way out to remove datanode from the cluster. please explain me decommissioning steps with example. like how to creating exclude files and other steps involved in it. Thanks regards Yogesh Kumar

Re: security-in-HADOOP

2012-09-11 Thread Bertrand Dechoux

By reading the documentation, like the following http://hadoop.apache.org/docs/r1.0.3/hdfs_permissions_guide.html On Tue, Sep 11, 2012 at 8:14 PM, nisha nishakulkarn...@gmail.com wrote: How security is maintained in hadoop, is it maintained by giving folder/file permissions in hadoop how can

Re: How to remove datanode from cluster..

2012-09-11 Thread Bejoy Ks

Hi Yogesh The detailed steps are available in hadoop wiki on FAQ page http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F Regrads Bejoy KS On Wed, Sep 12, 2012 at 12:14 AM, yogesh dhari

Re: Question about the task assignment strategy

2012-09-11 Thread Hiroyuki Yamada

I figured out the cause. HDFS block size is 128MB, but I specify mapred.min.split.size as 512MB, and data local I/O processing goes wrong for some reason. When I remove the mapred.min.split.size configuration, tasktrackers pick data-local tasks. Why does it happen ? It seems like a bug. Split is
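A likely explanation, sketched below, is that the split size is computed as max(minSize, min(maxSize, blockSize)), so a 512MB minimum with 128MB blocks gives splits that each span four blocks. The class and method names are made up for this sketch; the formula follows the rule FileInputFormat applies in the newer API (the older API uses a goal size in place of the maximum), and treating the maximum as unbounded is an assumption.

```java
class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Split size rule: never below minSize, otherwise the smaller of maxSize and blockSize.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of HDFS blocks a single split spans, rounded up.
    static long blocksPerSplit(long splitSize, long blockSize) {
        return (splitSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long block = 128 * MB;
        // Values from the thread: 128MB blocks, mapred.min.split.size of 512MB.
        long split = computeSplitSize(block, 512 * MB, Long.MAX_VALUE);
        System.out.println("split MB = " + split / MB
                + ", blocks per split = " + blocksPerSplit(split, block));
    }
}
```

When the blocks of one split sit on different datanodes, which is probable with a replication factor of 1, a task can be local to only part of its input, which would be consistent with the behaviour reported here rather than a bug in the scheduler.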

Re: Accessing image files from hadoop to jsp

2012-09-11 Thread Michael Segel

Here's one... Write a Java program which can be accessed on the server side to pull the picture from HDFS and display it on your JSP. On Sep 11, 2012, at 3:48 PM, Visioner Sadak visioner.sa...@gmail.com wrote: any hints experts atleast if i m on the right track or we cant use hftp at all

How to split a sequence file

2012-09-11 Thread Jason Yang

Hi, I have a sequence file written by SequenceFileOutputFormat with key/value type of Text, BytesWritable, like below: Text BytesWritable - id_A_01 7F2B3C687F2B3C687F2B3C68 id_A_02

Re: How to not output the key

2012-09-11 Thread Manoj Babu

Hi, You have to specify the reducer key out type as NullWritable. Cheers! Manoj. On Wed, Sep 12, 2012 at 7:43 AM, Nataraj Rashmi - rnatar rashmi.nata...@acxiom.com wrote: Hello, I have simple map/reduce program to merge input files into one big output files. My question is,

RE: Issue in access static object in MapReduce

2012-09-11 Thread Stuti Awasthi

Thanks Bejoy, I try to implement and if face any issues will let you know. Thanks Stuti From: Bejoy Ks [mailto:bejoy.had...@gmail.com] Sent: Tuesday, September 11, 2012 8:39 PM To: user@hadoop.apache.org Subject: Re: Issue in access static object in MapReduce Hi Stuti You can pass the json

RE: removing datanodes from clustes.

2012-09-11 Thread Brahma Reddy Battula

Hi Yogesh.. FYI. Please go through following.. http://tech.zhenhua.info/2011/04/how-to-decommission-nodesblacklist.html http://hadoop-karma.blogspot.in/2011/01/hadoop-cookbook-how-to-decommission.html From: yogesh dhari [yogeshdh...@live.com] Sent: Wednesday,

Re: Issue in access static object in MapReduce

2012-09-11 Thread Kunaal

Have you looked at Terracotta or any other distributed caching system? Kunal -- Sent while mobile -- On Sep 11, 2012, at 9:30 PM, Stuti Awasthi stutiawas...@hcl.com wrote: Thanks Bejoy, I try to implement and if face any issues will let you know. Thanks Stuti From: Bejoy Ks

Re: How to split a sequence file

2012-09-11 Thread Robert Dyer

If the file is pre-sorted, why not just make multiple sequence files - 1 for each split? Then you don't have to compute InputSplits because the physical files are already split. On Tue, Sep 11, 2012 at 11:00 PM, Harsh J ha...@cloudera.com wrote: Hey Jason, Is the file pre-sorted? You could

Re: Question about the task assignment strategy

2012-09-11 Thread Hemanth Yamijala

Hi, I tried a similar experiment as yours but couldn't replicate the issue. I generated 64 MB files and added them to my DFS - one file from every machine, with a replication factor of 1, like you did. My block size was 64MB. I verified the blocks were located on the same machine as where I

45 matches
