intro-cooccurrence-spark.mdtext

pat Fri, 05 Sep 2014 08:10:55 -0700

Author: pat
Date: Fri Sep  5 15:10:17 2014
New Revision: 1622718

URL: http://svn.apache.org/r1622718
Log:
replace left angle brace in non-code blocks


Modified:
    mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext

Modified: mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext?rev=1622718&r1=1622717&r2=1622718&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext Fri Sep  5 15:10:17 2014
@@ -214,9 +214,14 @@ Can be parsed with the following CLI and
 
 ##2. spark-rowsimilarity
 
-*spark-rowsimilarity* is the companion to *spark-itemsimilarity* the primary difference is that it takes a text file version of a DRM with optional application specific IDs. The input is in text-delimited form where there are three delimiters used. By default it reads (rowID<tab>columnID1:strength1<space>columnID2:strength2...) Since this job only supports LLR similarity, which does not use the input strengths, they may be omitted in the input. It writes (columnID<tab>columnID1:strength1<space>columnID2:strength2...) The output is sorted by strength descending. The output can be interpreted as a column id from the primary input followed by a list of the most similar columns. For a discussion of the output layout and formatting see *spark-itemsimilarity*.
-
-One significant output option is --omitStrength. This allows output of the form (columnID<tab>columnID2<space>columnID2<space>...) This is a tab-delimited file containing a columnID token followed by a space delimited string of tokens. It can be directly indexed by search engines to create an item-based recommender.
+*spark-rowsimilarity* is the companion to *spark-itemsimilarity* the primary difference is that it takes a text file version of 
+a matrix of sparse vectors with optional application specific IDs and it finds similar rows rather than items (columns). Its use is
+not limited to collaborative filtering. The input is in text-delimited form where there are three delimiters used. By 
+default it reads (rowID&lt;tab>columnID1:strength1&lt;space>columnID2:strength2...) Since this job only supports LLR similarity,
+ which does not use the input strengths, they may be omitted in the input. It writes 
+(rowID&lt;tab>rowID1:strength1&lt;space>rowID2:strength2...) 
+The output is sorted by strength descending. The output can be interpreted as a row ID from the primary input followed 
+by a list of the most similar rows.
 
 The command line interface is:
 
@@ -311,4 +316,3 @@ Another use case for these jobs is in fi
 
 
 
-

svn commit: r1622718 - /mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext
