Hello,
I am trying to write a program where I need to write multiple rounds of map
and reduce.
The output of the last round of map-reduce must be fed into the input of the
next round.
Can anyone please point me to any link / material that explains how I can
achieve this?
Thanks a lot
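(For reference, a minimal sketch of one way to chain two rounds, assuming the newer org.apache.hadoop.mapreduce API. The driver, mapper and reducer class names and the Text key/value types are placeholders, not from the original post; the point is only that round 2's input path is round 1's output path, and that each job is waited on before the next starts.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoRoundDriver {
  public static void main(String[] args) throws Exception {
    // Round 1: reads args[0], writes an intermediate directory args[1].
    Job round1 = new Job(new Configuration(), "round-1");
    round1.setJarByClass(TwoRoundDriver.class);
    round1.setMapperClass(Round1Mapper.class);      // placeholder mapper
    round1.setReducerClass(Round1Reducer.class);    // placeholder reducer
    round1.setOutputKeyClass(Text.class);
    round1.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(round1, new Path(args[0]));
    FileOutputFormat.setOutputPath(round1, new Path(args[1]));
    round1.waitForCompletion(true);                 // block until round 1 is done

    // Round 2: its input is round 1's output directory.
    Job round2 = new Job(new Configuration(), "round-2");
    round2.setJarByClass(TwoRoundDriver.class);
    round2.setMapperClass(Round2Mapper.class);      // placeholder mapper
    round2.setReducerClass(Round2Reducer.class);    // placeholder reducer
    round2.setOutputKeyClass(Text.class);
    round2.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(round2, new Path(args[1]));
    FileOutputFormat.setOutputPath(round2, new Path(args[2]));
    round2.waitForCompletion(true);
  }
}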
Hello,
I am trying to learn Hadoop and am doing a project on it.
I need to update some files in my project and hence wanted to use version
0.21.0.
However, I am confused as to how I can compile my programs against version
0.21.0, as it doesn't have any hadoop-core-0.21.0.jar file. What options do I
have?
> […] 0.21.0.jar for accessing the mapreduce APIs. I
> cannot really comment further on compilation errors without seeing the
> code/error messages.
>
> --Bobby Evans
> […] class in an IDE such as Eclipse, you'll see
> that when you restrict the org.apache.hadoop.* import only to packages you
> need, that indeed you are using hdfs classes.
>
> Thanks,
> Joep
Hello Everyone,
I have a small issue with my Reducer that I am trying to figure out
and wanted some advice.
In the reducer, when writing to the output file as declared in
FileOutputFormat.setOutputPath() I want to write only the key and not
the value when I am calling output.collect().
Is there a way to do this?
> Hi Arko
> You can achieve the same within the existing mapreduce framework
> itself. Give a NullWritable in place of the reducer output value in the reduce
> function. In your driver class as well, mention the output value type as
> NullWritable.
>
> […] Writables in hadoop. When you need to use a NullWritable instance you can give
> NullWritable.get(), which would do the job.
> i.e.
> output.collect(NullWritable.get(), new Text(output_string));
>
> Regards
> Bejoy K S
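(Note that the snippet above ends up with NullWritable on the key side, which drops the key instead. For the original question -- writing only the key -- the NullWritable goes on the value side, as the first reply describes. A minimal sketch, assuming the old mapred API and illustrative Text/IntWritable types:)

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class KeyOnlyReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, NullWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, NullWritable> output, Reporter reporter)
      throws IOException {
    // With TextOutputFormat a NullWritable value means each output line
    // contains just the key, with no trailing separator.
    output.collect(key, NullWritable.get());
  }
}

// In the driver:
//   jobconf.setOutputKeyClass(Text.class);
//   jobconf.setOutputValueClass(NullWritable.class);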
Hi,
Is there a way to pass some data from the driver class to the Mapper
class without going through the HDFS?
Does the API provide us with some functionality to pass some variables?
Thanks a lot in advance!
Warm regards
Arko
> […] conf.set(…);
> somevar = conf.get(…);
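(For the archives, a sketch of the Configuration-based approach hinted at above: set a property in the driver, read it back in the mapper's configure(). The property name "my.variable", the driver class name and the old-mapred-API types are illustrative assumptions.)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MyMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  private String myVariable;

  @Override
  public void configure(JobConf job) {
    // Runs once per task; pick up the value the driver stored in the configuration.
    myVariable = job.get("my.variable");
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    output.collect(new Text(myVariable), value);   // illustrative use of the passed value
  }
}

// Driver side, before submitting the job:
//   JobConf conf = new JobConf(MyDriver.class);    // MyDriver is a placeholder
//   conf.set("my.variable", "some value");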
Hi,
I am writing some Map Reduce programs in pseudo-distributed mode.
I am getting some error in my program and would like to debug it.
For that I want to embed some print statements in my Map / Reduce.
But when I run the mappers, the prints don't seem to show up in the
terminal.
Does anyone know where I can see them?
> […] output (stdout) and error (stderr) streams of the task are read
> by the TaskTracker and logged to ${HADOOP_LOG_DIR}/userlogs
>
> Regards,
> Subroto Sanyal
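(In other words, the prints do happen; they just end up in each task attempt's log files on the node that ran it, not in the terminal the job was launched from. A small sketch, with illustrative old-API mapper types:)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class DebugPrintMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // This never reaches the terminal that launched the job; it is written to the
    // task attempt's stdout/stderr files under ${HADOOP_LOG_DIR}/userlogs on the
    // node that ran the attempt, and can also be browsed per attempt in the web UI.
    System.err.println("map saw: " + value);
    output.collect(new Text("seen"), value);   // ordinary map output, for illustration
  }
}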
Hi,
I am not sure how you can avoid the filesystem; however, I did it as follows:
// For Job 1
FileInputFormat.addInputPath(job1, new Path(args[0]));
FileOutputFormat.setOutputPath(job1, new Path(args[1]));
// For job 2
FileInputFormat.addInputPath(job2, new Path(args[1]));
FileOutputFormat.setOutputPath(job2, new Path(args[2]));
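(Continuing the sketch: if job1 and job2 above are JobConf objects from the old mapred API, the two jobs can then simply be run back to back; with the newer Job class the equivalent would be waitForCompletion(true) on each.)

JobClient.runJob(job1);   // blocks until job 1 has finished writing args[1]
JobClient.runJob(job2);   // job 2 then consumes args[1] as its input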
Hello Everyone,
I have a particular situation, where I am trying to run Iterative
Map-Reduce, where the output files for one iteration are the input files for
the next.
It stops when there are no new files created in the output.
Code Snippet:
int round = 0;
JobConf jobconf = new JobConf( […]
Hi,
I solved it by creating a new JobConf instance for each iteration in the loop.
Thanks & regards
Arko
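(For the archives, a sketch of the working shape of that loop -- a brand-new JobConf built inside every iteration -- with an illustrative file-based stopping test. The driver class, the path naming scheme and the stopping test are assumptions, not the original code; the fragment also assumes the usual org.apache.hadoop.fs and org.apache.hadoop.mapred imports and a surrounding method that can throw IOException.)

int round = 0;
boolean done = false;
while (!done) {
  // Build a fresh JobConf each time round; reusing one instance across
  // iterations was the source of the original problem.
  JobConf jobconf = new JobConf(MyIterativeJob.class);      // placeholder driver class
  jobconf.setJobName("iteration-" + round);
  FileInputFormat.addInputPath(jobconf, new Path("iter-" + round));
  FileOutputFormat.setOutputPath(jobconf, new Path("iter-" + (round + 1)));
  JobClient.runJob(jobconf);                                // blocks until this round finishes

  // Illustrative stopping test: stop once a round writes no non-empty part files.
  FileSystem fs = FileSystem.get(jobconf);
  boolean producedOutput = false;
  for (FileStatus status : fs.listStatus(new Path("iter-" + (round + 1)))) {
    if (status.getPath().getName().startsWith("part-") && status.getLen() > 0) {
      producedOutput = true;
    }
  }
  done = !producedOutput;
  round++;
}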
Hi,
I have a situation where I have to read a large file into every mapper.
Since it is a large HDFS file that is needed for working on each input to the
mapper, it is taking a lot of time to read the data into memory from
HDFS.
Thus the system is killing all my Mappers with the following message: […]
Thanks!
I will try it and let you know.
Warm regards
Arko
> […] You could add
> some context.progress() or context.setStatus("status") in your map method
> from time to time (at least once every 600 seconds, so you do not hit the timeout).
>
> Regards,
> Lucian
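(A sketch of what that looks like inside a mapper, assuming the newer org.apache.hadoop.mapreduce API; the side-file path, the 100,000-line reporting interval and the key/value types are illustrative assumptions:)

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class BigSideFileMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // The slow part from the original post: scanning a big HDFS side file.
    FileSystem fs = FileSystem.get(context.getConfiguration());
    BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(new Path("/big/side/file"))));   // placeholder path
    String line;
    long linesRead = 0;
    while ((line = reader.readLine()) != null) {
      // ... compare 'line' against 'value' here ...
      if (++linesRead % 100000 == 0) {
        context.progress();                                  // tell the framework the task is alive
        context.setStatus("side lines read: " + linesRead);  // optional, visible in the web UI
      }
    }
    reader.close();
  }
}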
Hello,
I have a situation where I am reading a big file from HDFS and then
comparing all the data in that file with each input to the mapper.
Now, since my mapper is trying to read the entire HDFS file for each of its
inputs, the amount of data it has to read and keep in memory is
becoming large. […]
Hello,
I am having the following problem with Distributed Caching.
In the driver class, I am doing the following (/home/arko/MyProgram/data
is a directory created as the output of another map-reduce):

FileSystem fs = FileSystem.get(jobconf_seed);
String init_path = "/home/arko/MyProgram/data"; […]
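(The snippet is cut off above, but for reference, the usual DistributedCache shape for this kind of side data looks roughly like the sketch below, assuming the old mapred API. The class names, the illustrative map logic and the part-file name are assumptions; addCacheFile takes individual files, so a directory's part files would each be added.)

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SideDataMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  private final List<String> sideData = new ArrayList<String>();

  @Override
  public void configure(JobConf job) {
    try {
      // Each task reads its node-local copy of the cached file once, up front.
      Path[] localCopies = DistributedCache.getLocalCacheFiles(job);
      BufferedReader in = new BufferedReader(new FileReader(localCopies[0].toString()));
      String line;
      while ((line = in.readLine()) != null) {
        sideData.add(line);
      }
      in.close();
    } catch (IOException e) {
      throw new RuntimeException("Failed to read the cached side file", e);
    }
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // ... compare 'value' against sideData here ...
    output.collect(value, new Text(Integer.toString(sideData.size())));   // illustrative
  }
}

// Driver side (in a method that can throw Exception); the part-file name is hypothetical:
//   DistributedCache.addCacheFile(
//       new java.net.URI("/home/arko/MyProgram/data/part-00000"), jobconf);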
Hi,
Check the links below.
Read from HDFS:
https://sites.google.com/site/hadoopandhive/home/hadoop-how-to-read-a-file-from-hdfs
Write to HDFS:
https://sites.google.com/site/hadoopandhive/home/how-to-write-a-file-in-hdfs-using-hadoop
Hope they help!
Thanks & regards
Arko
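(For the archives, the gist of both pages is the FileSystem API; a minimal sketch, with the HDFS paths made up for illustration:)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Read a file from HDFS line by line.
    Path in = new Path("/user/arko/input.txt");           // illustrative path
    BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(in)));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);
    }
    reader.close();

    // Write a file to HDFS.
    Path out = new Path("/user/arko/output.txt");         // illustrative path
    FSDataOutputStream os = fs.create(out);
    os.writeBytes("hello from HDFS\n");
    os.close();
  }
}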
> […] output
> into the Job output path.
>
> Thanks
> Devaraj
> From: Arko Provo Mukherjee [arkoprovomukher...@gmail.com]
> Sent: Tuesday, April 17, 2012 10:32 AM
> To: mapreduce-user@hadoop.apache.org
> Subject: Reducer not firing
> […] the reduce phase. By default, task attempt logs are present in
> $HADOOP_LOG_DIR/userlogs//. There could be some bug in your
> reducer which is leading to this output.
>
> Thanks
> Devaraj
Hello George,
It worked. Thanks so much!! Bad typo while porting :(
Thanks again to everyone who helped!!
Warm regards
Arko
On Tue, Apr 17, 2012 at 6:59 PM, George Datskos wrote:
> Arko,
>
> Change Iterator to Iterable
>
> George
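(For anyone who hits the same thing: with the newer org.apache.hadoop.mapreduce API the reduce method takes an Iterable, and a reduce declared with Iterator does not override Reducer.reduce, so the default identity reduce runs and the output looks like the map output. A minimal sketch, with illustrative Text/IntWritable types; adding @Override makes the compiler catch the mismatch.)

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override   // would fail to compile if the signature used Iterator instead of Iterable
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}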
Hello All,
I am running my job on a Hadoop Cluster and it fails due to insufficient
Java Heap Memory.
I searched on Google and found that I need to add the following to the
conf files:
mapred.child.java.opts
-Xmx2000m
However, I don't want to request the administrator to change the conf files.
> […] of your job (jobConf.set(…) or
> job.getConfiguration().set(…)). Alternatively, if you implement Tool,
> and use its grabbed Configuration, you can also pass it via a
> -Dname=value argument when running the job (the option has to precede
> any custom options).
> […] concurrently, and check you're not offering each more memory than the
> machine has spare.
>
> Hope this helps,
> Tim
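(A short sketch of the per-job approach described above; the class name is a placeholder, the -Xmx2000m figure is just the one from the original question, and the command line is illustrative.)

// In the driver, per job -- no cluster-wide configuration change needed:
JobConf jobconf = new JobConf(MyJob.class);                 // placeholder class
jobconf.set("mapred.child.java.opts", "-Xmx2000m");

// With the newer API:
//   job.getConfiguration().set("mapred.child.java.opts", "-Xmx2000m");

// Or, if the driver implements Tool and runs via ToolRunner, on the command line
// (the generic -D option has to precede any custom arguments):
//   hadoop jar myjob.jar MyJob -Dmapred.child.java.opts=-Xmx2000m <input> <output>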