Re: Code does not enter mapper, reducer class

2016-09-18 Thread Denis Mone

Hello.

  I have finally found the solution to my problem.
  My mistake was that I had not placed any files in the input path I 
designated in my code.
  The framework therefore had no input to read, so it had no need to 
call the mapper and reducer.
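
For anyone hitting the same symptom, here is a minimal sketch of a check that could be run before submitting the job. It assumes the job runs in local mode against a local path (as when launched from an IDE); the class and method names (InputCheck, countInputFiles) are made up for illustration and are not part of Hadoop. It skips names starting with "_" or ".", which FileInputFormat also ignores by default.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: counts the files a local-mode job
// would find under its input path.
final class InputCheck {

    private InputCheck() {}

    // Returns 1 for a single regular file, or the number of regular files
    // directly under a directory whose names do not start with '_' or '.'.
    // Returns 0 if the path does not exist.
    static long countInputFiles(Path input) throws IOException {
        if (Files.isRegularFile(input)) {
            return 1;
        }
        if (!Files.isDirectory(input)) {
            return 0;
        }
        try (Stream<Path> entries = Files.list(input)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString();
                        return !name.startsWith("_") && !name.startsWith(".");
                    })
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        // "input" is only a placeholder default; pass the real path as an argument.
        Path input = Paths.get(args.length > 0 ? args[0] : "input");
        long n = countInputFiles(input);
        if (n == 0) {
            System.err.println("No input files under " + input
                    + "; map() and reduce() would never be called.");
        } else {
            System.out.println(n + " input file(s) found under " + input);
        }
    }
}
```

Running it against the directory passed to FileInputFormat.addInputPath before starting the job would have shown the empty input right away.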


Thanks for spending time in answering my question.




Re: Code does not enter mapper, reducer class

2016-09-18 Thread Denis Mone
The imports are from the mapreduce package. Also, I use Maven for 
dependencies, and the Hadoop version is 2.7.1.



On 09/18/2016 09:11 PM, Dieter De Witte wrote:
Maybe also add the list of imports you are using. They differ between 
versions of Hadoop, and mixing them might cause counterintuitive 
behaviour.


Kind Regards,

Dieter De Witte
Big Data Scientist
iMinds - Data Science Lab 
Ghent University
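
On the point about imports: Hadoop ships an older API under org.apache.hadoop.mapred and a newer one under org.apache.hadoop.mapreduce, and both contain classes with the same simple names (Mapper, Reducer, TextInputFormat and so on). The sketch below is a hypothetical stand-alone checker (the name ImportCheck is invented here, it is not a Hadoop tool) that reports whether a source file imports from both packages.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: flags source files that import
// from both the old (mapred) and the new (mapreduce) API packages.
final class ImportCheck {

    private static final String OLD_API = "import org.apache.hadoop.mapred.";
    private static final String NEW_API = "import org.apache.hadoop.mapreduce.";

    private ImportCheck() {}

    // Returns true if the given source lines contain imports from both packages.
    // Static imports are not considered in this sketch.
    static boolean mixesApis(List<String> sourceLines) {
        boolean usesOld = false;
        boolean usesNew = false;
        for (String raw : sourceLines) {
            String line = raw.trim();
            if (line.startsWith(OLD_API)) {
                usesOld = true;
            } else if (line.startsWith(NEW_API)) {
                usesNew = true;
            }
        }
        return usesOld && usesNew;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "import org.apache.hadoop.mapreduce.Mapper;",
                "import org.apache.hadoop.mapred.TextInputFormat;");
        System.out.println("Mixed APIs: " + mixesApis(source));
    }
}
```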


Re: Code does not enter mapper, reducer class

2016-09-18 Thread Denis Mone

Hello and thanks for your time.
What I mean is that I have set breakpoints in the map and reduce 
functions of my code, and they are never hit when the program runs 
(hence the title "code does not enter class", which is not really that 
informative).
As for the jobs, there is only one, the KMeans job, and it executes 
without throwing any exception.
Here is the output of the job.
The driver class is this




Re: Code does not enter mapper, reducer class

2016-09-17 Thread daemeon reiydelle
What do you mean by "does not enter ... class(es)"?

Does the log show that the scheduler ever accepts the job (you may have to
turn logging up)? Are "other" jobs that are submitted to the same class
under your user scheduled & executed? Which scheduler are you using? What is
the definition for the scheduler class? Is it getting to a container? Can we
get a complete history of the steps you are taking, please?


Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Sat, Sep 17, 2016 at 3:56 AM, Denis Mone  wrote:

> Hello Hadoop users.
>
> I am trying to implement a MapReduce KMeans algorithm using Hadoop.
> The problem I have is that the code does not enter the map and reduce
> classes. I'm running the application from IntelliJ IDEA, not using the
> hadoop binary.
>
> The rest of the email is a sample of my code. If someone can see something
> that could help, that would be greatly appreciated.
>
> Thanks in advance.
>
> Here is my driver code:
>
> Job job = Job.getInstance(conf);
> job.setJobName("kmeans");
> job.setJarByClass(KMeans.class);
> FileInputFormat.addInputPath(job, input);
> FileOutputFormat.setOutputPath(job, output);
> job.setMapperClass(KMeansMapper.class);
> job.setReducerClass(KMeansReducer.class);
> job.setMapOutputKeyClass(PointVector.class);
> job.setMapOutputValueClass(PointVector.class);
> job.setOutputKeyClass(Text.class);
> job.setOutputValueClass(Text.class);
> job.setInputFormatClass(TextInputFormat.class);
> job.setOutputFormatClass(TextOutputFormat.class);
> job.waitForCompletion(true);
>
> And below are my map and reduce classes:
>
> public class KMeansMapper extends Mapper<LongWritable, Text, PointVector, PointVector> {
>
>     private int clusters;
>     // The first two type arguments of ImmutableTriple were lost in the
>     // archive; wildcards stand in for them here.
>     private List<ImmutableTriple<?, ?, PointVector>> centers;
>
>     @Override
>     protected void setup(Context context) throws IOException, InterruptedException {
>         System.out.println("Inside setup");
>         this.clusters = Integer.valueOf(context.getConfiguration().get("clusters"));
>         this.centers = new ArrayList<>();
>         BufferedReader br = new BufferedReader(new FileReader("/home/denis/centers"));
>         for (int i = 0; i < clusters; i++) {
>             centers.add(DocumentRecordParser.parse(br.readLine()));
>         }
>         br.close();
>     }
>
>     @Override
>     public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
>         PointVector line = DocumentRecordParser.returnPointVector(value.toString());
>         System.out.println("Inside map!");
>         double minDist = Double.MAX_VALUE;
>         double dist;
>         PointVector index = null;
>         EuclideanDistance ed = new EuclideanDistance();
>         // Pick the center closest to this point.
>         for (ImmutableTriple<?, ?, PointVector> c : centers) {
>             dist = ed.compute(line.points(), c.right.points());
>             if (dist < minDist) {
>                 minDist = dist;
>                 index = c.right;
>             }
>         }
>
>         context.write(index, line);
>     }
> }
>
> public class KMeansReducer extends Reducer<PointVector, PointVector, Text, Text> {
>     private double min_dist = Double.MAX_VALUE;
>
>     @Override
>     public void reduce(PointVector center, Iterable<PointVector> points, Context context) throws IOException, InterruptedException {
>         EuclideanDistance measure = new EuclideanDistance();
>         double distance = 0.0;
>         int numOfPoints = 0;
>         double diff = 0.0;
>         PointVector newCenter = null;
>         double[] sums = new double[center.size()];
>         for (PointVector p : points) {
>             distance += measure.compute(center.points(), p.points());
>             if (distance < min_dist) {
>                 min_dist = distance;
>                 newCenter = p;
>             }
>             numOfPoints++;
>             sums = MathArrays.ebeAdd(p.points(), sums);
>         }
>         for (int i = 0; i < sums.length; i++) {
>             sums[i] = sums[i] / numOfPoints;
>         }
>         System.out.println("Old center " + center + " new center: " + newCenter);
>         context.write(new Text(newCenter.toString()), new Text(new PointVector(sums).toString()));
>     }
> }
>
> Last but not least, my custom data structure class PointVector:
>
> public class PointVector implements WritableComparable<PointVector> {
>     /** Keep the tfIdf values of the terms of a document */
>     private Vector<Double> data = new Vector<>();
>
>     public PointVector(double[] values) {
>         this.data = new Vector<>(values.length);
>         this.data.addAll(Doubles.asList(values));
>     }
>
>     public PointVector(List<Double> values) {
>         this.data = new Vector<>(values.size());
>         this.data.addAll(values);
>     }
>
>     public PointVector(String[] values) {
>         this.data = new Vector<>(values.length);
>         for (String s : values) {
>             this.data.add(Double.valueOf(s));
>         }
>     }
>
> public PointVector() {
>