Look at what I sent you.it generates a number of splits three is a static variable to list how many On May 5, 2012 8:38 AM, "sharat attupurath" <shara...@hotmail.com> wrote:
> Since the input files are very small, the default input formats in Hadoop > all generate just a single InputSplit, so only a single map task is > executed, and we wont have much parallelism. > > I was thinking of writing an InputFormat that would read the whole file as > an InputSplit and replicate this input split n times (where n would be the > number of ants in a single stage) so that we'll have n mappers. > Also I want the input format to return the value as the adjacency matrix > of the graph (calculating it from the coordinates in the input file). > > But I can't find a way to do this. Is it possible? Or is it better to just > have the input as Text and create the adjacency matrix in the mappers? > > ------------------------------ > Date: Sat, 5 May 2012 08:20:34 -0700 > Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in > Hadoop > From: lordjoe2...@gmail.com > To: mapreduce-user@hadoop.apache.org > > yes - if you know how you can put it in distributed cache or if it is > small put in as a String in the config or have all InputFormats read it > from somewhere > > On Sat, May 5, 2012 at 8:08 AM, sharat attupurath <shara...@hotmail.com>wrote: > > I looked at both the files. in AbstractNShotInputFormat it is mentioned > that this input format does not read from files. My input is in a text > file. I want the whole file as a single record. So is it enough if i just > copy the contents of the file and return it as a string from > getValueFromIndex() ? > > ------------------------------ > Date: Fri, 4 May 2012 13:15:46 -0700 > > Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in > Hadoop > From: lordjoe2...@gmail.com > To: mapreduce-user@hadoop.apache.org > > Look at NShotInputFormat > call setNumberKeys to set the number of ants then > change the method getValueFromIndex to return the text representing the > original problem to each mapper - > what happens in the second and third jobs is an exercise but the saved > test needs to have pheremones and the original problem > > > On Fri, May 4, 2012 at 9:54 AM, sharat attupurath <shara...@hotmail.com>wrote: > > Hi, > > Thanks a lot for the quick reply sir! We are new to Apache Hadoop and > haven't yet understood it properly yet. Can you please elaborate on how we > can have multiple stages of mapreduce jobs for combining the trails as you > have mentioned? > > We have been trying to find out how to write a custom splitter and almost > all online resources tell to subclass the FileInputFormat and write only a > custom RecordReader? Will it be possible to generate our splits in that way? > > Regards > > Sharat > > ------------------------------ > Date: Fri, 4 May 2012 09:22:51 -0700 > Subject: Re: Ant Colony Optimization for Travelling Salesman Problem in > Hadoop > From: lordjoe2...@gmail.com > To: mapreduce-user@hadoop.apache.org > > > > > On Fri, May 4, 2012 at 9:06 AM, sharat attupurath <shara...@hotmail.com>wrote: > > Hi, > > We are trying to parallelize the ant colony optimization algorithm for TSP > over hadoop and are facing some issues. We are using TSPLIB as input files. > The input is a text file containing eucledian coordinates of the cities - > first column is city number and the next two columns contain the x and y > coordinates respectively. > > What we intend to do is to take input from this single file, send copies > of it to multiple mappers (each mapper acts like the ant in the algorithm), > each mapper works on its input to find its own TSP solution that it outputs > and finally the reducer outputs the smallest tour found by the mappers. > Hope we are in the right track. Here are the issues: > > 1) Since the input file is small, we need to force hadoop to fire up > multiple map tasks by replicating the input. How can we make an InputSplit > of the whole file and replicate it so that the input can be sent to > multiple mappers? > > Write a custom splitter sending the same data to all mappers - the only > critical criteria is the number of "Ants" > > > 2) the algorithm uses a shared pheromone array and each mapper needs to > read and write data from this. How can we share the pheromone data across > the mappers. > > You cannot share data across mappers and should not attempt to do so - > Better to use the reducer(s) to combine trails combine first pass trails > and > then pass the combined trails to another mapreduce job - this with the > original problem plus the current pheremone trails > > > Hope the questions are clear enough. Any help would be greatly > appreciated. > > Thank you > > Regards > > Sharat > > > > > -- > Steven M. Lewis PhD > 4221 105th Ave NE > Kirkland, WA 98033 > 206-384-1340 (cell) > Skype lordjoe_com > > > > > > -- > Steven M. Lewis PhD > 4221 105th Ave NE > Kirkland, WA 98033 > 206-384-1340 (cell) > Skype lordjoe_com > > > > > > -- > Steven M. Lewis PhD > 4221 105th Ave NE > Kirkland, WA 98033 > 206-384-1340 (cell) > Skype lordjoe_com > > >