The "BSPModel" page has been changed by thomasjungblut:
http://wiki.apache.org/hama/BSPModel?action=diff&rev1=14&rev2=15

  
  There are also {{{setup()}}} and {{{cleanup()}}}, which are called at the 
beginning and at the end of your computation, respectively.
  
- {{{cleanup()}}} is '''guranteed''' to run after the computation or in case of 
failure.
+ {{{cleanup()}}} is '''guaranteed''' to run after the computation or in case of 
failure. (In 0.4.0 this guarantee does not actually hold; we expect this to be fixed in 0.5.0.)
  
  You can simply override the functions you need from the BSP class.
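
To illustrate what the intended guarantee for {{{cleanup()}}} means, here is a minimal plain-Java sketch of a runner that always calls cleanup, even when the computation throws. The names LifecycleRunner and Task are hypothetical and this is not Hama's actual runner code; it only assumes that the guarantee behaves like a try/finally around setup and the computation.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleRunner {
  // Hypothetical stand-in for a BSP task; not Hama's BSP class.
  interface Task {
    void setup() throws Exception;
    void bsp() throws Exception;
    void cleanup() throws Exception;
  }

  // Runs setup and bsp, then always runs cleanup, even if either one throws.
  static void run(Task task) throws Exception {
    try {
      task.setup();
      task.bsp();
    } finally {
      task.cleanup();
    }
  }

  // Records the order of calls for a task whose bsp() optionally fails.
  static List<String> trace(final boolean failInBsp) {
    final List<String> calls = new ArrayList<>();
    Task task = new Task() {
      public void setup() { calls.add("setup"); }
      public void bsp() {
        calls.add("bsp");
        if (failInBsp) {
          throw new IllegalStateException("simulated failure");
        }
      }
      public void cleanup() { calls.add("cleanup"); }
    };
    try {
      run(task);
    } catch (Exception e) {
      calls.add("failed");
    }
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(trace(false));
    System.out.println(trace(true));
  }
}
```

In both the normal and the failing run, "cleanup" is recorded after "bsp", which is the behavior the wiki text promises once the fix lands.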
  
@@ -35, +35 @@

  
  Since Hama 0.4.0 we provide an input and output system for BSP Jobs.
  
+ We chose the key/value model from Hadoop, since we want to provide an API 
that is coherent with widely used products like Hadoop MapReduce (SequenceFiles) 
and HBase (column storage). 
- TODO: Some blahblah about key value and stuff
- What's in case when no input is configured? and stuff like that should be 
documented here..
- 
  
  == Input ==
  
@@ -111, +109 @@

  
  === Custom Inputformat ===
  
- You can implement your own inputformat blabla
+ You can implement your own input format. It is similar to Hadoop MapReduce's 
input formats, so you can use existing literature to get started.
  
  == Output ==
  
  === Configuring Output ===
  
+ Like the input, you can configure the output while setting up your BSPJob.
+ 
+ 
+ {{{
+     job.setOutputKeyClass(Text.class);
+     job.setOutputValueClass(DoubleWritable.class);
+ 
+     job.setOutputFormat(TextOutputFormat.class);
+ 
+     FileOutputFormat.setOutputPath(job, TMP_OUTPUT);
+ }}}
+ 
+ As you can see, there are three major sections.
+ 
+ The first section is about setting the classes for output key and output 
value.
+ 
+ The second section is about setting the format of your output. In this case 
this is TextOutputFormat, which writes each key separated from its value by a 
tab ('\t'). Each record (key+value) is separated by a newline ('\n').
+ 
+ The third and last section is about setting the path where your output should 
go. 
+ You can use the static method in your chosen OutputFormat as well as the 
convenience method in BSPJob:
+ 
+ {{{
+  job.setOutputPath(new Path("/tmp/out"));
+ }}}
+ 
+ If you don't provide output, no output folder or collector will be allocated.
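
The record layout described for TextOutputFormat above (key, tab, value, newline) can be sketched in plain Java as follows. This is only an illustration of the text layout, not Hama's or Hadoop's implementation; the class TabSeparatedRecords and its methods are hypothetical names.

```java
final class TabSeparatedRecords {
  // Formats one record as: key, tab, value, newline.
  static String formatRecord(String key, double value) {
    return key + "\t" + value + "\n";
  }

  // Splits one formatted record back into key and value text.
  static String[] parseRecord(String line) {
    String trimmed = line.endsWith("\n") ? line.substring(0, line.length() - 1) : line;
    // Limit 2 keeps any further tabs inside the value part.
    return trimmed.split("\t", 2);
  }

  public static void main(String[] args) {
    String record = formatRecord("Estimated value of PI is", 3.14);
    System.out.print(record);
    String[] parts = parseRecord(record);
    System.out.println("key=" + parts[0] + ", value=" + parts[1]);
  }
}
```

A consumer reading such a file line by line can split each line at the first tab to recover the key and the value.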
+ 
- === Using Input ===
+ === Using Output ===
+ 
+ From your BSP, you can output like this:
+ 
+ {{{
+  @Override
+  public void bsp(
+         BSPPeer<NullWritable, NullWritable, Text, DoubleWritable, DoubleWritable> peer)
+         throws IOException, SyncException, InterruptedException {
+ 
+      peer.write(new Text("Estimated value of PI is"), new DoubleWritable(3.14));
+ 
+  }
+ }}}
+ 
+ Note that you can always output, even from the {{{setup()}}} or {{{cleanup()}}} methods!
  
  === Custom Outputformat ===
+ 
+ You can implement your own output format. It is similar to Hadoop MapReduce's 
output formats, so you can use existing literature to get started.
  
  == Implementation notes ==
  
@@ -198, +240 @@

  
  = Counters =
  
+ Just like in Hadoop MapReduce you can use Counters.
+ 
+ Counters are basically enums that you can only increment. You can use them to 
track meaningful metrics in your code, e.g. how often a loop has been executed.
+ 
+ From your BSP code you can use counters like this:
+ 
+ {{{
+     // enum definition
+     enum LoopCounter{
+       LOOPS
+     }
+ 
+     @Override
+     public void bsp(
+         BSPPeer<NullWritable, NullWritable, Text, DoubleWritable, DoubleWritable> peer)
+         throws IOException, SyncException, InterruptedException {
+       for (int i = 0; i < iterations; i++) {
+         // details omitted
+         peer.getCounter(LoopCounter.LOOPS).increment(1L);
+       }
+       // rest omitted
+     }
+ }}}
+ 
+ In 0.4.0, counters cannot be used for flow control, since they are not 
synchronized during the sync phase. Watch 
[[https://issues.apache.org/jira/browse/HAMA-515|HAMA-515]] for details.
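
To make the "enums that you can only increment" idea concrete without a Hama dependency, here is a small plain-Java sketch of increment-only counters keyed by an enum. The Counters helper and the countLoops method are hypothetical and are not part of the Hama API; the real counters are obtained via {{{peer.getCounter(...)}}} as shown above. Like the 0.4.0 counters, this sketch is purely local and does no synchronization between peers.

```java
import java.util.EnumMap;
import java.util.Map;

final class CounterSketch {
  enum LoopCounter { LOOPS }

  // Hypothetical increment-only counters keyed by an enum constant.
  static final class Counters<E extends Enum<E>> {
    private final Map<E, Long> values;

    Counters(Class<E> type) {
      values = new EnumMap<>(type);
    }

    // Adds a non-negative amount; there is deliberately no decrement or reset.
    void increment(E key, long amount) {
      if (amount < 0) {
        throw new IllegalArgumentException("counters can only be incremented");
      }
      values.merge(key, amount, Long::sum);
    }

    long get(E key) {
      return values.getOrDefault(key, 0L);
    }
  }

  // Counts how often a loop body has been executed.
  static long countLoops(int iterations) {
    Counters<LoopCounter> counters = new Counters<>(LoopCounter.class);
    for (int i = 0; i < iterations; i++) {
      counters.increment(LoopCounter.LOOPS, 1L);
    }
    return counters.get(LoopCounter.LOOPS);
  }

  public static void main(String[] args) {
    System.out.println(countLoops(5));
  }
}
```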
+ 
  == Setup and Cleanup ==
  
  == Combiners ==
@@ -244, +312 @@

  
  }}}
  
- == Job Configuration and Submission ==
- 
- TODO:
- 
