Re: log

2013-04-19 Thread Bejoy Ks
This basically happens while running a MapReduce job. When a MapReduce job is triggered, the job files are put in HDFS with high replication (controlled by 'mapred.submit.replication'; the default value is 10). The job files are cleaned up after the job is completed and hence that
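For reference, overriding that default at the site or job level takes the usual property form (the value 5 here is an arbitrary example):

    <property>
      <name>mapred.submit.replication</name>
      <value>5</value>
    </property>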

Re: How to change secondary namenode location in Hadoop 1.0.4?

2013-04-17 Thread Bejoy Ks
Hi Henry You can change the secondary name node storage location by overriding the property 'fs.checkpoint.dir' in your core-site.xml. On Wed, Apr 17, 2013 at 2:35 PM, Henry Hung ythu...@winbond.com wrote: Hi All, What is the property name of Hadoop 1.0.4 to change secondary
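A minimal core-site.xml entry for this, assuming a hypothetical local path:

    <property>
      <name>fs.checkpoint.dir</name>
      <value>/data/hadoop/secondary</value>
    </property>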

Re: Adjusting tasktracker heap size?

2013-04-17 Thread Bejoy Ks
Hi Marcos, You need to decide the slots based on the available memory: Available Memory = Total RAM - (Memory for OS + Memory for Hadoop daemons like DN, TT + Memory for other services, if any, running on that node). Now you need to consider the generic MR jobs planned on your cluster. Say if
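A worked example with hypothetical numbers: on a node with 48 GB of total RAM, reserving 4 GB for the OS and 4 GB for the DN and TT daemons gives Available Memory = 48 - (4 + 4) = 40 GB; at 2 GB per child JVM that supports roughly 20 map/reduce slots on the node.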

Re: Submitting mapreduce and nothing happens

2013-04-16 Thread Bejoy Ks
Hi Amit Are you seeing any errors or warnings on JT logs? Regards Bejoy KS

Re: VM reuse!

2013-04-16 Thread Bejoy Ks
Hi Rahul If you look at larger clusters and jobs that involve larger input data sets, the data would be spread across the whole cluster, and a single node might have various blocks of that entire data set. Imagine you have a cluster with 100 map slots and your job has 500 map tasks, now in that
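For context (the property is not named in the excerpt): in Hadoop 1.x, JVM reuse is governed by 'mapred.job.reuse.jvm.num.tasks'; -1 lets one JVM run any number of tasks of the same job:

    <property>
      <name>mapred.job.reuse.jvm.num.tasks</name>
      <value>-1</value>
    </property>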

Re: HW infrastructure for Hadoop

2013-04-16 Thread Bejoy Ks
+1 for Hadoop Operations On Tue, Apr 16, 2013 at 3:57 PM, MARCOS MEDRADO RUBINELLI marc...@buscapecompany.com wrote: Tadas, Hadoop Operations has pretty useful, up-to-date information. The chapter on hardware selection is available here:

Re: VM reuse!

2013-04-16 Thread Bejoy Ks
more mappers and fewer map slots. Regards, Rahul On Tue, Apr 16, 2013 at 2:40 PM, Bejoy Ks bejoy.had...@gmail.com wrote: Hi Rahul If you look at larger clusters and jobs that involve larger input data sets, the data would be spread across the whole cluster

Re: guessing number of reducers.

2012-11-21 Thread Bejoy KS
then you need a smaller volume of data per reducer for better performance. In general it is better to have the number of reduce tasks slightly less than the number of available reduce slots in the cluster. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message

Re: guessing number of reducers.

2012-11-21 Thread Bejoy KS
. You can round this value and use it to set the number of reducers in conf programmatically. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Manoj Babu manoj...@gmail.com Date: Wed, 21 Nov 2012 23:28:00 To: user@hadoop.apache.org Cc: bejoy.had
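A minimal sketch of doing that with the old mapred API (the driver class name and the value 8 are hypothetical):

    JobConf conf = new JobConf(MyJob.class);
    conf.setNumReduceTasks(8);  // rounded value from the data-volume estimate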

Re: fundamental doubt

2012-11-21 Thread Bejoy KS
it. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: jamal sasha jamalsha...@gmail.com Date: Wed, 21 Nov 2012 14:50:51 To: user@hadoop.apache.orguser@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: fundamental doubt Hi.. I guess I am asking a lot

Re: Supplying a jar for a map-reduce job

2012-11-20 Thread Bejoy KS
Hi Pankaj AFAIK you can do the same. Just provide the properties like mapper class, reducer class, input format, output format etc. using the -D option at run time. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Pankaj Gupta pan...@brightroll.com Date
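A sketch of such an invocation, assuming old-API property names and a hypothetical jar and driver; the driver must run through ToolRunner/GenericOptionsParser for -D options to be parsed:

    hadoop jar myjob.jar com.example.MyDriver \
      -D mapred.mapper.class=com.example.MyMapper \
      -D mapred.reducer.class=com.example.MyReducer \
      -D mapred.input.format.class=org.apache.hadoop.mapred.TextInputFormat \
      -D mapred.output.format.class=org.apache.hadoop.mapred.TextOutputFormat \
      /input /output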

Re: Strange error in Hive

2012-11-15 Thread Bejoy KS
Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Mark Kerzner mark.kerz...@shmsoft.com Date: Wed, 14 Nov 2012 17:05:20 To: Hadoop Useruser@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Strange error in Hive Hi, I am trying to insert a table in hive

Re: Setting up an edge node to submit jobs

2012-11-14 Thread Bejoy KS
. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Manoj Babu manoj...@gmail.com Date: Thu, 15 Nov 2012 10:03:24 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Setting up an edge node to submit jobs Hi, How to set up an edge node

Re: Map-Reduce V/S Hadoop Ecosystem

2012-11-07 Thread Bejoy KS
code can be efficient as yours would very specific to your app but the MR in hive and pig may be more generic. To just write your custom mapreduce functions, just basic knowledge on java is good. As you are better with java you can understand the internals better. Regards Bejoy KS Sent from

Re: Data locality of map-side join

2012-10-23 Thread Bejoy KS
. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Sigurd Spieckermann sigurd.spieckerm...@gmail.com Date: Mon, 22 Oct 2012 22:29:15 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Data locality of map-side join Hi guys, I've been trying

Re: Old vs New API

2012-10-22 Thread Bejoy KS
was not available in new mapreduce API at that point. Now mapreduce API is pretty good and you can go ahead with that for development. AFAIK mapreduce API is the future. Let's wait for a commiter to officially comment on this. Regards Bejoy KS Sent from handheld, please excuse typos

Re: extracting lzo compressed files

2012-10-21 Thread Bejoy KS
Hi Manoj You can get the file in a readable format using hadoop fs -text fileName, provided you have the LZO codec within the property 'io.compression.codecs' in core-site.xml. A 'hadoop fs -ls' command would itself display the file size. Regards Bejoy KS Sent from handheld, please excuse typos
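A core-site.xml sketch registering the LZO codecs; the com.hadoop.compression.lzo classes assume the hadoop-lzo library is installed:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
    </property>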

Re: Hadoop counter

2012-10-19 Thread Bejoy KS
Hi Jay Counters are reported at the end of a task to JT. So if a task fails the counters from that task are not send to JT and hence won't be included in the final value of counters from that Job. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Jay

Re: Hadoop installation on mac

2012-10-16 Thread Bejoy KS
) Regards Bejoy KS

Re: document on hdfs

2012-10-10 Thread Bejoy KS
Hi Murthy Hadoop: The Definitive Guide by Tom White has the details on file write anatomy. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: murthy nvvs murthy_n1...@yahoo.com Date: Wed, 10 Oct 2012 04:27:58 To: user@hadoop.apache.orguser

Re: stable release of hadoop

2012-10-09 Thread Bejoy KS
Hi Nisha The current stable version is the 1.0.x releases. This is well suited for production environments. The 0.23.x/2.x.x releases are of alpha quality and hence not recommended for production. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From

Re: What is the difference between Rack-local map tasks and Data-local map tasks?

2012-10-07 Thread Bejoy KS
tasks when the number of input splits/map tasks is large, which is quite common. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: centerqi hu cente...@gmail.com Date: Sun, 7 Oct 2012 23:28:55 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org

Re: Multiple Aggregate functions in map reduce program

2012-10-05 Thread Bejoy KS
aggregated sum and count for each key. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: iwannaplay games funnlearnfork...@gmail.com Date: Fri, 5 Oct 2012 12:32:28 To: useru...@hbase.apache.org; u...@hadoop.apache.org; hdfs-userhdfs-user@hadoop.apache.org

Re: hadoop memory settings

2012-10-05 Thread Bejoy KS
Hi Sadak AFAIK HADOOP_HEAPSIZE determines the JVM size of the daemons like NN, JT, TT, DN etc. mapred.child.java.opts and mapred.child.ulimit are used to set the JVM heap for the child JVMs launched for each map/reduce task. Regards Bejoy KS Sent from handheld, please excuse typos
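For illustration, the daemon heap is set in hadoop-env.sh and the per-task heap in mapred-site.xml (both values are arbitrary examples):

    # hadoop-env.sh -- heap (MB) for the NN/JT/TT/DN daemon JVMs
    export HADOOP_HEAPSIZE=2000

    <!-- mapred-site.xml -- heap for each child task JVM -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx512m</value>
    </property>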

Re: copyFromLocal

2012-10-04 Thread Bejoy KS
be accessible from this client. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Kartashov, Andy andy.kartas...@mpac.ca Date: Thu, 4 Oct 2012 16:51:35 To: user@hadoop.apache.orguser@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: RE: copyFromLocal I

Re: How to lower the total number of map tasks

2012-10-02 Thread Bejoy Ks
Hi You need to alter the value of mapred.max.split size to a value larger than your block size to have less number of map tasks than the default. On Tue, Oct 2, 2012 at 10:04 PM, Shing Hing Man mat...@yahoo.com wrote: I am running Hadoop 1.0.3 in Pseudo distributed mode. When I submit a

Re: How to lower the total number of map tasks

2012-10-02 Thread Bejoy Ks
Sorry for the typo, the property name is mapred.max.split.size Also just for changing the number of map tasks you don't need to modify the hdfs block size. On Tue, Oct 2, 2012 at 10:31 PM, Bejoy Ks bejoy.had...@gmail.com wrote: Hi You need to alter the value of mapred.max.split size
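For reference, FileInputFormat in this era computed the split size as max(mapred.min.split.size, min(mapred.max.split.size, block size)); so to actually get splits larger than a block, and hence fewer map tasks, it is mapred.min.split.size that must be raised above the block size.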

Re: How to lower the total number of map tasks

2012-10-02 Thread Bejoy KS
Hi Shing Is your input a single file or a set of small files? If the latter, you need to use CombineFileInputFormat. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Shing Hing Man mat...@yahoo.com Date: Tue, 2 Oct 2012 10:38:59 To: user

Re: Add file to distributed cache

2012-10-01 Thread Bejoy KS
to distributed cache Sent: Oct 2, 2012 05:44 Hi all How do you add a small file to distributed cache in MR program Regards Abhi Sent from my iPhone Regards Bejoy KS Sent from handheld, please excuse typos.
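A minimal sketch of one common way to do this in Hadoop 1.x (the HDFS path is hypothetical):

    // org.apache.hadoop.filecache.DistributedCache; the file must already be in HDFS
    DistributedCache.addCacheFile(new java.net.URI("/user/abhi/lookup.txt"), conf);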

Re: File block size use

2012-10-01 Thread Bejoy KS
Hi Anna If you want to increase the block size of existing files, you can use an Identity Mapper with no reducer. Set the min and max split sizes to your requirement (512 MB). Use SequenceFileInputFormat and SequenceFileOutputFormat for your job. Your job should be done. Regards Bejoy KS
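A sketch of that job setup with the old mapred API (512 MB expressed in bytes; the driver class name is hypothetical):

    JobConf conf = new JobConf(ResizeBlocks.class);
    conf.setLong("mapred.min.split.size", 536870912L);  // 512 MB
    conf.setLong("mapred.max.split.size", 536870912L);
    conf.setInputFormat(SequenceFileInputFormat.class);
    conf.setOutputFormat(SequenceFileOutputFormat.class);
    conf.setMapperClass(IdentityMapper.class);  // org.apache.hadoop.mapred.lib.IdentityMapper
    conf.setNumReduceTasks(0);                  // map-only job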

Re: Programming Question / Joining Dataset

2012-09-26 Thread Bejoy Ks
Hi Oliver I have scribbled a small post on reduce side joins; the implementation matches your requirement http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html Regards Bejoy KS

Re: Unit tests for Map and Reduce functions.

2012-09-26 Thread Bejoy Ks
Hi Ravi You can take a look at Mockito http://books.google.co.in/books?id=Nff49D7vnJcC&pg=PA138&lpg=PA138&dq=mockito+%2B+hadoop&source=bl&ots=IifyVu7yXp&sig=Q1LoxqAKO0nqRquus8jOW5CBiWY&hl=en&sa=X&ei=b2pjULHSOIPJrAeGsIHwAg&ved=0CC0Q6AEwAg#v=onepage&q=mockito%20%2B%20hadoop&f=false On Thu, Sep 27, 2012 at

Re: Help on a Simple program

2012-09-25 Thread Bejoy Ks
Hi If you don't want either key or value in the output, just make the corresponding data types NullWritable. Since you just need to filter out a few records/items from your logs, the reduce phase is not mandatory; just a mapper would suffice. From your mapper just output the records
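A minimal map-only filter sketch with the new API (the match condition is a placeholder):

    public class FilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
      @Override
      protected void map(LongWritable key, Text value, Context ctx)
          throws IOException, InterruptedException {
        if (value.toString().contains("ERROR"))   // hypothetical filter condition
          ctx.write(value, NullWritable.get());   // no value emitted in the output
      }
    }
    // in the driver: job.setNumReduceTasks(0);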

Re: Detect when file is not being written by another process

2012-09-25 Thread Bejoy Ks
Hi Peter AFAIK oozie has a mechanism to achieve this. You can trigger your jobs as soon as the files are written to a certain hdfs directory. On Tue, Sep 25, 2012 at 10:23 PM, Peter Sheridan psheri...@millennialmedia.com wrote: These are log files being deposited by other processes, which

Re: Job failed with large volume of small data: java.io.EOFException

2012-09-20 Thread Bejoy Ks
that run DNs. You can verify the current value using 'ulimit -n' and then try increasing the same to a much higher value. Regards Bejoy KS

Re: How to make the hive external table read from subdirectories

2012-09-13 Thread Bejoy KS
') LOCATION '/user/myuser/MapReduceOutput/2012/09/11'; Like this you need to register each of the partitions. After this your query should work as desired. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Nataraj Rashmi - rnatar rashmi.nata...@acxiom.com Date: Thu
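Reconstructing the shape of that truncated statement (the table and partition column names are hypothetical):

    ALTER TABLE my_logs ADD PARTITION (year='2012', month='09', day='11')
    LOCATION '/user/myuser/MapReduceOutput/2012/09/11';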

Re: How to make the hive external table read from subdirectories

2012-09-12 Thread Bejoy KS
Hi Natraj Create a partitioned table and add the sub dirs as partitions. You need to have some logic in place for determining the partitions. Say, if the sub dirs denote data based on a date, then make date the partition. Regards Bejoy KS Sent from handheld, please excuse typos

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks
Hi Lin The default value for the number of reducers is 1: <name>mapred.reduce.tasks</name> <value>1</value> It is not determined by data volume. You need to specify the number of reducers for your mapreduce jobs as per your data volume. Regards Bejoy KS On Tue, Sep 11, 2012 at 4:53 PM, Jason Yang

Re: what's the default reducer number?

2012-09-11 Thread Bejoy Ks
Hi Lin The default values for all the properties are in core-default.xml, hdfs-default.xml and mapred-default.xml Regards Bejoy KS On Tue, Sep 11, 2012 at 5:06 PM, Jason Yang lin.yang.ja...@gmail.comwrote: Hi, Bejoy Thanks for your reply. Where could I find the default value

Re: Some general questions about DBInputFormat

2012-09-11 Thread Bejoy KS
for that db. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Yaron Gonen yaron.go...@gmail.com Date: Tue, 11 Sep 2012 15:41:26 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Some general questions about DBInputFormat Hi, After

Re: How to remove datanode from cluster..

2012-09-11 Thread Bejoy Ks
Hi Yogesh The detailed steps are available in the hadoop wiki on the FAQ page http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F Regards Bejoy KS On Wed, Sep 12, 2012 at 12:14 AM, yogesh dhari yogeshdh

Re: Reg: parsing all files file append

2012-09-10 Thread Bejoy Ks
Hi Manoj From my limited knowledge of file appends in hdfs, I have seen more recommendations to use sync() in the latest releases than append(). Let us wait for some committer to authoritatively comment on 'the production readiness of append()'. :) Regards Bejoy KS On Mon, Sep 10, 2012

Re: Reg: parsing all files file append

2012-09-09 Thread Bejoy KS
Hi Manoj You can load daily logs into individual directories in hdfs and process them daily. Keep those results in hdfs or hbase or dbs etc. Every day do the processing, get the results and aggregate the same with the previously aggregated results till date. Regards Bejoy KS Sent from

Re: Using hadoop for analytics

2012-09-05 Thread Bejoy Ks
Hi Prashant Welcome to the Hadoop Community. :) Hadoop is meant for processing large data volumes. That said, for your custom requirements you should write your own mapper and reducer that contain your business logic for processing the input data. Also you can have a look at hive and pig, which

Re: Replication Factor Modification

2012-09-05 Thread Bejoy Ks
Hi You can change the replication factor of an existing directory using '-setrep' http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#setrep The below command will recursively set the replication factor to 1 for all files within the given directory '/user' hadoop fs -setrep -w 1 -R
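In full, the command described in the excerpt reads:

    hadoop fs -setrep -w 1 -R /user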

Re: Replication Factor Modification

2012-09-05 Thread Bejoy Ks
Hi Uddipan As Harsh mentioned, replication factor is a client side property. So you need to update the value for 'dfs.replication' in hdfs-site.xml as per your requirement on your edge nodes or on the machines you are copying files to hdfs from. If you are using some of the existing DNs for this

Re: Integrating hadoop with java UI application deployed on tomcat

2012-09-04 Thread Bejoy KS
) You need to have the exact configuration files and hadoop jars from the cluster machines on this tomcat environment as well. I mean on the classpath of your application. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Visioner Sadak visioner.sa

Re: Exception while running a Hadoop example on a standalone install on Windows 7

2012-09-04 Thread Bejoy Ks
Hi Udayani By default hadoop works well on Linux and Linux-based OSes. Since you are on Windows you need to install and configure ssh using Cygwin before you start the hadoop daemons. On Tue, Sep 4, 2012 at 6:16 PM, Udayini Pendyala udayini_pendy...@yahoo.com wrote: Hi, Following is a

Re: reading a binary file

2012-09-03 Thread Bejoy Ks
Hi Francesco TextInputFormat reads line by line based on '\n' by default; there the key and value are the position offset and the line contents, respectively. But in your case it is just a sequence of integers, and it is binary. Also you require the offset for each integer value and not offset by

Re: knowing the nodes on which reduce tasks will run

2012-09-03 Thread Bejoy Ks
HI Abhay The TaskTrackers on which the reduce tasks are triggered are chosen at random based on reduce slot availability. So if you don't want the reduce tasks to be scheduled on some particular nodes, you need to set 'mapred.tasktracker.reduce.tasks.maximum' on those nodes to 0. The bottleneck
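The corresponding mapred-site.xml entry on those nodes would be:

    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>0</value>
    </property>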

Re: knowing the nodes on which reduce tasks will run

2012-09-03 Thread Bejoy Ks
, Bejoy Ks bejoy.had...@gmail.com wrote: HI Abhay The TaskTrackers on which the reduce tasks are triggered are chosen at random based on reduce slot availability. So if you don't want the reduce tasks to be scheduled on some particular nodes you need to set

Re: MRBench Maps strange behaviour

2012-08-29 Thread Bejoy KS
Hi Gaurav You can get the information on the number of map tasks in the job from the JT web UI itself. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Gaurav Dasgupta gdsay...@gmail.com Date: Wed, 29 Aug 2012 13:14:11 To: user@hadoop.apache.org Reply

Re: one reducer is hanged in reduce- copy phase

2012-08-28 Thread Bejoy KS
. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Abhay Ratnaparkhi abhay.ratnapar...@gmail.com Date: Tue, 28 Aug 2012 19:40:58 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: one reducer is hanged in reduce- copy phase Hello, I have a MR

Re: namenode not starting

2012-08-24 Thread Bejoy KS
Hi Abhay What is the value of hadoop.tmp.dir or dfs.name.dir? If it is set to /tmp, the contents would be deleted on an OS restart. You need to change this location before you start your NN. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Abhay
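An hdfs-site.xml sketch pointing the NN metadata at a persistent, hypothetical path:

    <property>
      <name>dfs.name.dir</name>
      <value>/var/lib/hadoop/name</value>
    </property>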

Re: Streaming issue ( URGENT )

2012-08-20 Thread Bejoy Ks
/joins-with-plain-map-reduce.html Regards Bejoy KS

Re: Number of Maps running more than expected

2012-08-17 Thread Bejoy Ks
for the number of splits won't hold. If small files are there then definitely the number of map tasks will be higher. Also, did you change the split sizes as well along with the block size? Regards Bejoy KS

Re: help in distribution of a task with hadoop

2012-08-13 Thread Bejoy Ks
/ -files to distribute jars or files Regards Bejoy KS

Re: how to enhance job start up speed?

2012-08-13 Thread Bejoy KS
to the map task node. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Matthias Kricke matthias.mk.kri...@gmail.com Sender: matthias.zeng...@gmail.com Date: Mon, 13 Aug 2012 16:33:06 To: user@hadoop.apache.org Reply-To: user@hadoop.apache.org Subject: Re

Re: Hbase JDBC API

2012-08-10 Thread Bejoy Ks
/HBaseIntegration Regards Bejoy KS

Re: fs.local.block.size vs file.blocksize

2012-08-09 Thread Bejoy Ks
To understand more details on the working, I scribbled something a while back; maybe it can help you start off http://kickstarthadoop.blogspot.in/2011/04/word-count-hadoop-map-reduce-example.html Regards Bejoy KS

Re: Problem with hadoop filesystem after restart cluster

2012-08-08 Thread Bejoy Ks
Hi Andy Is your hadoop.tmp.dir or dfs.name.dir configured to /tmp? If so, it can happen, as the /tmp dir gets wiped out on OS restarts Regards Bejoy KS

Re: Reading fields from a Text line

2012-08-03 Thread Bejoy KS
That is a good pointer Harsh. Thanks a lot. But if IdentityMapper is being used, shouldn't the job.xml reflect that? Job.xml always shows the mapper as our CustomMapper. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Harsh J ha...@cloudera.com Date

Re: Reading fields from a Text line

2012-08-03 Thread Bejoy KS
Ok Got it now. That is a good piece of information. Thank You :) Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Harsh J ha...@cloudera.com Date: Fri, 3 Aug 2012 16:28:27 To: mapreduce-user@hadoop.apache.org; bejoy.had...@gmail.com Cc: Mohammad

Re: Reading fields from a Text line

2012-08-02 Thread Bejoy KS
on that as well. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Mohammad Tariq donta...@gmail.com Date: Thu, 2 Aug 2012 15:48:42 To: mapreduce-user@hadoop.apache.org Reply-To: mapreduce-user@hadoop.apache.org Subject: Re: Reading fields from a Text line

Re: Reading fields from a Text line

2012-08-02 Thread Bejoy Ks
!.. Regards Bejoy KS

Re: All reducers are not being utilized

2012-08-02 Thread Bejoy Ks
reduce tasks there is no guarantee that one task will be scheduled on each node. It can be like 2 on one node and 1 on another. Regards Bejoy KS

Re: DBOutputWriter timing out writing to database

2012-08-02 Thread Bejoy Ks
Hi Nathan Alternatively you can have a look at Sqoop, which offers efficient data transfers between rdbms and hdfs. Regards Bejoy KS

Re: Reading fields from a Text line

2012-08-02 Thread Bejoy Ks
the framework triggers Identity Mapper instead of the custom mapper provided with the configuration. This seems like a bug to me. Filed a JIRA to track this issue https://issues.apache.org/jira/browse/MAPREDUCE-4507 Regards Bejoy KS

Re: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text

2012-08-02 Thread Bejoy Ks
); job.setMapOutputValueClass(Text.class); //setting the final reduce output data type classes job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); Regards Bejoy KS
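For the error in the subject line, the usual fix is declaring the intermediate map output types explicitly whenever they differ from the job's final output types; a new-API sketch matching the excerpt:

    job.setMapOutputKeyClass(Text.class);    // map emits Text keys, not the default LongWritable
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(Text.class);       // final (reduce) output types
    job.setOutputValueClass(IntWritable.class);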

Re: Disable retries

2012-08-02 Thread Bejoy KS
. With these two steps you can ensure that a task is attempted only once. These properties can be set in mapred-site.xml or at job level. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Marco Gallotta ma...@gallotta.co.za Date: Thu, 2 Aug 2012 16:52:00 To: common

Re: Merge Reducers Output

2012-07-30 Thread Bejoy KS
Hi Why not use 'hadoop fs -getmerge outputFolderInHdfs targetFileNameInLfs' while copying files out of hdfs for the end users to consume. This will merge all the files in 'outputFolderInHdfs' into one file and put it in lfs. Regards Bejoy KS Sent from handheld, please excuse typos
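For example (paths are hypothetical):

    hadoop fs -getmerge /user/out /tmp/result.txt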

Re: Error reading task output

2012-07-27 Thread Bejoy Ks
that runs mapreduce jobs; for a non security enabled cluster it is mapred. You need to increase this to a large value using mapred soft nproc 1 mapred hard nproc 1 If you are running on a security enabled cluster, this value should be raised for the user who submits the job. Regards Bejoy KS
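The /etc/security/limits.conf entries would take this shape; the limit value below is a hypothetical placeholder, as the numbers in the excerpt look truncated by the archive:

    mapred soft nproc 32768
    mapred hard nproc 32768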

Re: Hadoop 1.0.3 start-daemon.sh doesn't start all the expected daemons

2012-07-27 Thread Bejoy Ks
Hi Dinesh Try using $HADOOP_HOME/bin/start-all.sh. It starts all the hadoop daemons including the TT and DN. Regards Bejoy KS

Re: Retrying connect to server: localhost/127.0.0.1:9000.

2012-07-27 Thread Bejoy KS
Hi Keith Your NameNode is still not up. What do the NN logs say? Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: anil gupta anilgupt...@gmail.com Date: Fri, 27 Jul 2012 11:30:57 To: common-user@hadoop.apache.org Reply-To: common-user

Re: KeyValueTextInputFormat absent in hadoop-0.20.205

2012-07-25 Thread Bejoy Ks
Hi Tariq KeyValueTextInputFormat is available from Hadoop 1.0.1 onwards for the new mapreduce API http://hadoop.apache.org/common/docs/r1.0.1/api/org/apache/hadoop/mapreduce/lib/input/KeyValueTextInputFormat.html Regards Bejoy KS On Wed, Jul 25, 2012 at 8:07 PM, Mohammad Tariq donta

Re: Unexpected end of input stream (GZ)

2012-07-24 Thread Bejoy Ks
Hi Oleg From the job tracker page, you can get to the failed tasks and see which file split was processed by each task. The split information is available under the status column for each task. The file split information is not available in the job history. Regards Bejoy KS On Tue, Jul 24

Re: fail and kill all tasks without killing job.

2012-07-20 Thread Bejoy KS
Hi Jay Did you try hadoop job -kill-task task-id? And is that not working as desired? Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: jay vyas jayunit...@gmail.com Date: Fri, 20 Jul 2012 17:17:58 To: common-user@hadoop.apache.orgcommon-user

Re: NameNode fails

2012-07-20 Thread Bejoy KS
Hi Yogesh Is your dfs.name.dir pointing to the /tmp dir? If so, try changing that to any other dir. The contents of /tmp may get wiped out on OS restarts. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: yogesh.kuma...@wipro.com Date: Fri, 20 Jul 2012 06

Re: Hadoop filesystem directories not visible

2012-07-19 Thread Bejoy KS
Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Yuvrajsinh Chauhan yuvraj.chau...@elitecore.com Date: Thu, 19 Jul 2012 15:16:24 To: hdfs-user@hadoop.apache.org Reply-To: hdfs-user@hadoop.apache.org Subject: RE: Hadoop filesystem directories not visible Dear

Re: Hadoop filesystem directories not visible

2012-07-19 Thread Bejoy Ks
not visible Thanks Bejoy!! On Thu, Jul 19, 2012 at 3:22 PM, Bejoy KS bejoy.had...@gmail.com wrote: Hi Saniya In hdfs the directory exists only as meta data in the name node. There is no real hierarchical existence like normal file system. It is the data in the files that is stored as hdfs

Re: Loading data in hdfs

2012-07-19 Thread Bejoy Ks
Hi Prabhjot Yes, Just use the filesystem commands hadoop fs -copyFromLocal <src fs path> <destn hdfs path> Regards Bejoy KS On Thu, Jul 19, 2012 at 3:49 PM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi, I am unable to use sqoop and want to load data in hdfs for testing, Is there any

Re: Jobs randomly not starting

2012-07-12 Thread Bejoy KS
. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Robert Dyer psyb...@gmail.com Date: Thu, 12 Jul 2012 23:03:02 To: mapreduce-user@hadoop.apache.org Reply-To: mapreduce-user@hadoop.apache.org Subject: Jobs randomly not starting I'm using Hadoop 1.0.3

Re: Error using MultipleInputs

2012-07-05 Thread Bejoy Ks
Hi Sanchita Try your code after commenting out the following line of code: //conf.setInputFormat(TextInputFormat.class); AFAIK this explicitly sets the input format as TextInputFormat instead of MultipleInputs, and hence the compiler throws an error stating 'no input path specified'. Regards Bejoy

Re: Hive/Hdfs Connector

2012-07-05 Thread Bejoy KS
Regards Bejoy KS Sent from handheld, please excuse typos.

Re: change hdfs block size for file existing on HDFS

2012-06-26 Thread Bejoy KS
Regards Bejoy KS Sent from handheld, please excuse typos.

Re: change hdfs block size for file existing on HDFS

2012-06-26 Thread Bejoy Ks
Hi Anurag, To add on, you can also change the replication of existing files by hadoop fs -setrep http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#setrep On Tue, Jun 26, 2012 at 7:42 PM, Bejoy KS bejoy.had...@gmail.com wrote: Hi Anurag, The easiest option would be, in your map

Re: change hdfs block size for file existing on HDFS

2012-06-26 Thread Bejoy KS
Regards Bejoy KS Sent from handheld, please excuse typos.

Re: Streaming in mapreduce

2012-06-16 Thread Bejoy KS
Hi Pedro In simple terms, the Streaming API is used in hadoop if your mapper or reducer is in any language other than Java, say Ruby or Python. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Pedro Costa psdc1...@gmail.com Date: Sat, 16 Jun
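A typical Hadoop 1.x streaming invocation with Python scripts (the jar version and paths are hypothetical):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.0.3.jar \
      -input /user/data/in \
      -output /user/data/out \
      -mapper mapper.py \
      -reducer reducer.py \
      -file mapper.py -file reducer.py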

Re: Setting number of mappers according to number of TextInput lines

2012-06-16 Thread Bejoy KS
very small input size (kB), but processing to produce some output takes several minutes. Is there a way to say: the file has 100 lines, I need 10 mappers, where each mapper node has to process 10 lines of the input file? Thanks for advice. Ondrej Klimpera Regards Bejoy KS Sent from handheld

Re: [Newbie] How to make Multi Node Cluster from Single Node Cluster

2012-06-14 Thread Bejoy KS
You can follow the documents for 0.20.x. It is almost the same for 1.0.x as well. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Alpha Bagus Sunggono bagusa...@gmail.com Date: Thu, 14 Jun 2012 17:15:16 To: common-user@hadoop.apache.org Reply

Re: Map/Reduce | Multiple node configuration

2012-06-12 Thread Bejoy KS
? Is it required for the Map Reduce to execute on the machines which have the data stored (DFS)? Bejoy: The MR framework takes care of this. Map tasks consider data locality. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Girish Ravi giri...@srmtech.com Date: Tue

Re: Need logical help

2012-06-12 Thread Bejoy KS
Hi Girish You can achieve this using reduce side joins. Use MultipleInputs for parsing the two different sets of log files. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Girish Ravi giri...@srmtech.com Date: Tue, 12 Jun 2012 12:59:32
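A sketch of wiring two log sets to two mappers for a reduce-side join (old mapred API; the mapper, reducer and path names are hypothetical):

    // org.apache.hadoop.mapred.lib.MultipleInputs
    MultipleInputs.addInputPath(conf, new Path("/logs/setA"),
        TextInputFormat.class, LogAMapper.class);
    MultipleInputs.addInputPath(conf, new Path("/logs/setB"),
        TextInputFormat.class, LogBMapper.class);
    conf.setReducerClass(JoinReducer.class);  // joins records sharing the same key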

Re: Need logical help

2012-06-12 Thread Bejoy KS
To add on, have a look at hive and pig. Those are a perfect fit for similar use cases. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: Bejoy KS bejoy.had...@gmail.com Date: Tue, 12 Jun 2012 13:04:33 To: mapreduce-user@hadoop.apache.org Reply

Re: set the mapred.map.tasks.speculative.execution=false, but it is not useful.

2012-06-12 Thread Bejoy KS
Hi If your intention is to control the number of attempts every task makes, then the property to be tweaked is mapred.map.max.attempts The default value is 4; for no map task re-attempts make it 1. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From
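As a mapred-site.xml (or job-level) entry:

    <property>
      <name>mapred.map.max.attempts</name>
      <value>1</value>  <!-- default is 4; 1 means no re-attempts -->
    </property>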

Re: Getting filename in case of MultipleInputs

2012-05-03 Thread Bejoy Ks
Hi Subbu, The file/split processed by a mapper could be obtained from WebUI as soon as the job is executed. However this detail can't be obtained once the job is moved to JT history. Regards Bejoy On Thu, May 3, 2012 at 6:25 PM, Kasi Subrahmanyam kasisubbu...@gmail.com wrote: Hi,

Re: updating datanode config files on namenode recovery

2012-05-01 Thread Bejoy KS
Hi Sumadhur, The easier approach is to make the hostname of the new NN the same as the old one; else you'll have to update the new one in config files across the cluster. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: sumadhur sumadhur_i...@yahoo.com Date

Re: reducers and data locality

2012-04-27 Thread Bejoy KS
Hi Mete A custom Partitioner class can control the flow of keys to the desired reducer. It gives you more control over which key goes to which reducer. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: mete efk...@gmail.com Date: Fri, 27 Apr 2012 09:19:21
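A minimal custom partitioner sketch with the new API (the routing rule is a made-up example):

    public class KeyPrefixPartitioner extends Partitioner<Text, IntWritable> {
      @Override
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (key.toString().startsWith("A")) return 0;  // pin 'A' keys to reducer 0
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }
    // driver: job.setPartitionerClass(KeyPrefixPartitioner.class);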

Re: Reducer not firing

2012-04-17 Thread Bejoy KS
IdentityReducer is being triggered. Regards Bejoy KS Sent from handheld, please excuse typos. -Original Message- From: kasi subrahmanyam kasisubbu...@gmail.com Date: Tue, 17 Apr 2012 19:10:33 To: mapreduce-user@hadoop.apache.org Reply-To: mapreduce-user@hadoop.apache.org Subject: Re

Re: map and reduce with different value classes

2012-04-16 Thread Bejoy Ks
(theClass); // set final/reduce output key value types job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); If both map output and reduce output key value types are the same, you just need to specify the final output types. Regards Bejoy KS On Tue, Apr 17
