[Hama Wiki] Update of "OnlineCF" by IkhtiyorAhmedov

Apache Wiki Thu, 05 Sep 2013 21:44:29 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change 
notification.


The "OnlineCF" page has been changed by IkhtiyorAhmedov:
https://wiki.apache.org/hama/OnlineCF

New page:
= Online Collaborative Filtering =
=== Contents ===
 * Overview

 * Usage

 * Complementary functions

=== Overview. ===
 . The problem of collaborative filtering is typically defined as the task of 
inferring consumer preferences: Given an observed set of product preferences 
for a set of users, can we accurately predict the unobserved preferences?

 . Notation. We define a collaborative filtering problem as a distribution D 
over triples (a, b, r) ⊂ A × B × R where A and B are finite sets of size n and 
m respectively. We are given a set M of triples {(a, b, r)} and want to find a 
function f (a, b) which minimizes the expected squared error

 . Typically, we think of A as our set of users, B as our set of products, and 
r as user a’s “rating” of product b. In most movie recommendation datasets, r 
is a number in {1, 2, 3, 4, 5} as in the number of “stars”, although in other 
settings we may only be given r ∈ {0, 1}, as in liked/disliked.

=== Usage. ===
 . Basic overview of usage steps are:

 * Convert. convert input data into OnlineCF compatible format.
 * Configuration and Train. set   parameters for training
 * Load
 * Predict

'''Convert.'''

 . Since, currently, we support one input path, we need to convert input data 
and combine set of triples, item and user features into one file. In order to 
implement custom parsing of input data, use InputConverter class Below is 
example for Movie Lens dataset converter.

{{{
MovieLensConverter converter = new MovieLensConverter();
converter.convert(pathToPreferences, pathToMovieGenres, convertedOutputPath);
}}}

'''Configuration and Train.'''

 . In order to achieve good performance in prediction we need to configure 
iteration count, matrix rank and matrix factorization update functions.

{{{
OnlineCF recommender = new OnlineCF(); 
recommender.setInputPreferences(convertedOutputPath); 
recommender.setIteration(150); 
recommender.setMatrixRank(3); 
recommender.setSkipCount(1); // after how many steps we should synchronize 
values in each task recommender.setUpdateFunction(MeanAbsError.class); 
recommender.setOutputPath(outputFileName); 
recommender.train();
}}}

'''Load.'''

 . After training, model will be saved into output file by default In order to 
use prediction functions we need to load it.

{{{
recommender.load(pathToTrainedModel, false);
}}}

'''Predict.'''

{{{
// estimate score double estimatedScore = 
recommender.estimatePreference(userId, itemId);

// estimate user similarities 
double userSimilarity = recommender.calculateUserSimilarity(user1, user2); 
// Pair<K, V> - where K predicted similar user, V predicted similarity score 
List<Pair<Long, Double>> similarUsers = recommender.getMostSimilarUsers(userId, 
count);

// estimate item similarities 
double itemSimilarity = recommender.calculateItemSimilarity(item1, item2); 
// Pair<K, V> - where K predicted similar item, V predicted similarity score 
List<Pair<Long, Double>> similarItems = recommender.getMostSimilarItems(itemId, 
count);
}}}

=== Complementary functions ===
 . There some classes which can be useful to know while using Online 
Collaborative Filtering.

 * InputConverter. For parsing input data and converting into OnlineCF 
compatible format. (see MovieLensConverter)
 * OnlineUpdate.Function. For matrix factorization functions It will be used 
while training and estimating user preference (see MeanAbsError)

=== References ===
 * Online Collaborative Filtering. Jacob Abernethy, Kevin Canini, John 
Langford, Alex Simma 
http://canini.me/research_files/OnlineCollaborativeFiltering.pdf‎

[Hama Wiki] Update of "OnlineCF" by IkhtiyorAhmedov

Reply via email to