[Lucene-hadoop Wiki] Trivial Update of "DataProcessingBenchmarks" by udanax

Apache Wiki Tue, 08 Jan 2008 17:03:34 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.


The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/DataProcessingBenchmarks

------------------------------------------------------------------------------
     * EM algorithm performance analysis
     * Lanczos algorithm performance analysis
  
- === Group/Sort Processing Benchmarks ===
+ === Group/Sort Processing ===
+ 
+  * Finds the most connected networks.
+    * Once [https://issues.apache.org/jira/browse/HADOOP-2480 HADOOP-2480] 
is done, HBase will be added to the benchmarks.
  
  SQL > select ipaddress, count(*) from access_log group by ipaddress order by 
count(*) desc limit 0,100;
  [[BR]]''π ,,count. ipaddress,, (τ ,,count,, (γ ,,count(ipaddress). 
ipaddress,, (access_log)))''
- 
-  * After [https://issues.apache.org/jira/browse/HADOOP-2480 HADOOP-2480] 
done, Hbase will be join to benchmarks.
  
  ||<bgcolor="#E5E5E5">||<bgcolor="#E5E5E5">!MySql 5.0.27 
||<bgcolor="#E5E5E5">Hadoop-0.15.0 ||
  ||<bgcolor="#E5E5E5">Data ||B-tree disk table (MyISAM)||Text files 
(access_log)||
@@ -23, +24 @@

  ||<bgcolor="#E5E5E5">Results ||100 ||100||
  ||<bgcolor="#E5E5E5">Time  ||3.715 sec ||112.03 sec||
  
- ==== Processing Flow ====
+ ==== MapReduce Flow ====
  
   * Map was used to extract the IP address of the client requesting the web 
page.
   * Reduce was used for summation.
   * One more Map/Reduce job was used to sort by count.
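
The flow above can be sketched in plain Python, without Hadoop, to show how the two jobs chain together. This is only an illustrative sketch, not the code used for the benchmark: the helper names (`run_job`, `map_extract_ip`, `reduce_sum`, `map_swap`, `reduce_emit`, `top_connectors`) are hypothetical, and it assumes the client IP address is the first whitespace-separated field of each access_log line.

```python
from collections import defaultdict


def run_job(records, mapper, reducer, descending=False):
    """Toy in-memory stand-in for one Map/Reduce job (hypothetical helper)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    # Emulate the shuffle phase: keys reach the reducer in sorted order.
    for key in sorted(groups, reverse=descending):
        output.extend(reducer(key, groups[key]))
    return output


def map_extract_ip(line):
    """Job 1 map: emit (ip, 1); assumes the IP is the first field."""
    fields = line.split()
    if fields:
        yield fields[0], 1


def reduce_sum(ip, ones):
    """Job 1 reduce: sum the occurrences of each IP address."""
    yield ip, sum(ones)


def map_swap(pair):
    """Job 2 map: use the count as the key so the shuffle sorts by it."""
    ip, count = pair
    yield count, ip


def reduce_emit(count, ips):
    """Job 2 reduce: emit (ip, count) pairs in key order."""
    for ip in ips:
        yield ip, count


def top_connectors(lines, limit=100):
    """Chain both jobs and keep the first `limit` results."""
    counts = run_job(lines, map_extract_ip, reduce_sum)
    ranked = run_job(counts, map_swap, reduce_emit, descending=True)
    return ranked[:limit]
```

Calling `top_connectors(lines, limit=100)` on a list of log lines mirrors the `group by ... order by count(*) desc limit 0,100` query: the first job plays the role of the grouping, the second the ordering.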
  
- ==== Processing Results ====
+ ==== MapReduce Results ====
  {{{
  ------------------------------------
  * Top 100 connector list :
@@ -48, +49 @@

  Processing time : 112.03 sec
  }}}
  
+ === EM Algorithm ===
+  * Finds maximum likelihood estimates of parameters in probabilistic models.
+  * Alternates between expectation (E) step and maximization (M) step.
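
The alternation between the two steps can be illustrated with a small, single-machine sketch. The page does not say which probabilistic model the benchmark will use, so the two-component one-dimensional Gaussian mixture below is purely an assumption for illustration, and the names `gaussian_pdf` and `em_two_gaussians` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, mean_a, mean_b, iterations=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative assumption).

    Assumes every point stays reasonably close to at least one component,
    so the responsibilities below are well defined.
    """
    var_a = var_b = 1.0
    weight_a = 0.5
    n = len(data)
    for _ in range(iterations):
        # E step: responsibility of component A for each point.
        resp = []
        for x in data:
            pa = weight_a * gaussian_pdf(x, mean_a, var_a)
            pb = (1.0 - weight_a) * gaussian_pdf(x, mean_b, var_b)
            resp.append(pa / (pa + pb))

        # M step: re-estimate parameters from the weighted points.
        na = max(sum(resp), 1e-12)
        nb = max(n - sum(resp), 1e-12)
        mean_a = sum(r * x for r, x in zip(resp, data)) / na
        mean_b = sum((1.0 - r) * x for r, x in zip(resp, data)) / nb
        # Floor the variances so a component cannot collapse to zero width.
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp, data)) / na, 1e-6)
        var_b = max(sum((1.0 - r) * (x - mean_b) ** 2 for r, x in zip(resp, data)) / nb, 1e-6)
        weight_a = na / n
    return (weight_a, mean_a, var_a), (1.0 - weight_a, mean_b, var_b)
```

In a Map/Reduce setting, the per-point responsibilities of the E step are independent of each other, while the M step is a set of weighted sums; how the benchmark will actually split the work is not stated on the page.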
+ 
+ ==== MapReduce Flow ====
+
