[jira] [Commented] (FLINK-2452) Add a playcount threshold to the MusicProfiles example

ASF GitHub Bot (JIRA) Thu, 27 Apr 2017 03:10:49 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986304#comment-15986304
 ]


ASF GitHub Bot commented on FLINK-2452:
---------------------------------------

Github user coveralls commented on the issue:

    https://github.com/apache/flink/pull/968
  
    
    [![Coverage Status](https://coveralls.io/builds/11266065/badge)](https://coveralls.io/builds/11266065)
    
    Changes Unknown when pulling **c0c8463521912d021c392c2c5edc254fee267eb8 on vasia:music-profiles** into ** on apache:master**.



> Add a playcount threshold to the MusicProfiles example
> ------------------------------------------------------
>
>                 Key: FLINK-2452
>                 URL: https://issues.apache.org/jira/browse/FLINK-2452
>             Project: Flink
>          Issue Type: Improvement
>          Components: Gelly
>    Affects Versions: 0.10.0
>            Reporter: Vasia Kalavri
>            Assignee: Vasia Kalavri
>            Priority: Minor
>             Fix For: 0.10.0
>
>
> In the MusicProfiles example, when creating the user-user similarity graph, 
> an edge is created between any two users that have listened to the same song 
> (even if only once). Depending on the input data, this might produce a 
> projection graph with many more edges than the original user-song graph.
> To make this computation more efficient, this issue proposes adding a 
> user-defined parameter that filters out songs that a user has listened to 
> only a few times. Essentially, it is a threshold for playcount, above which a 
> user is considered to like a song.
> For reference, with a threshold value of 30, the whole Last.fm dataset is 
> analyzed on my laptop in a few minutes, while no threshold results in a 
> runtime of several hours.
> There are many solutions to this problem, but since this is just an example 
> (not a library method), I think that keeping it simple is important.
> Thanks to [~andralungu] for spotting the inefficiency!
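
The filter described above can be illustrated with a minimal sketch in plain Java. It deliberately does not use the Flink or Gelly API, and it is not the code from the pull request. The class name `PlaycountThresholdSketch`, the `Listen` record type, and the method `similarUserPairs` are hypothetical names chosen for illustration. The sketch also assumes that "above the threshold" means a strictly greater playcount.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PlaycountThresholdSketch {

    /** Hypothetical stand-in for one (user, song, playcount) input record. */
    static final class Listen {
        final String user;
        final String song;
        final int playcount;

        Listen(String user, String song, int playcount) {
            this.user = user;
            this.song = song;
            this.playcount = playcount;
        }
    }

    /**
     * Builds the user-user projection: two users are connected if both
     * played the same song more than `threshold` times (assumption: strict).
     * Each undirected edge is returned once, encoded as "userA-userB".
     */
    static Set<String> similarUserPairs(List<Listen> listens, int threshold) {
        // song -> users whose playcount for that song exceeds the threshold
        Map<String, List<String>> usersBySong = new HashMap<>();
        for (Listen l : listens) {
            if (l.playcount > threshold) {
                usersBySong.computeIfAbsent(l.song, s -> new ArrayList<>()).add(l.user);
            }
        }

        // Every pair of users sharing a surviving song becomes one edge.
        Set<String> pairs = new TreeSet<>();
        for (List<String> users : usersBySong.values()) {
            for (int i = 0; i < users.size(); i++) {
                for (int j = i + 1; j < users.size(); j++) {
                    String a = users.get(i);
                    String b = users.get(j);
                    if (a.equals(b)) {
                        continue; // ignore duplicate records for the same user
                    }
                    pairs.add(a.compareTo(b) < 0 ? a + "-" + b : b + "-" + a);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Listen> listens = List.of(
                new Listen("u1", "s1", 40),
                new Listen("u2", "s1", 35),
                new Listen("u3", "s1", 1),
                new Listen("u1", "s2", 2),
                new Listen("u3", "s2", 50));

        // Without a threshold, every shared song yields an edge.
        System.out.println(similarUserPairs(listens, 0));  // [u1-u2, u1-u3, u2-u3]
        // With a threshold of 30, only strongly shared songs remain.
        System.out.println(similarUserPairs(listens, 30)); // [u1-u2]
    }
}
```

The pairwise loop is quadratic in the number of users per song, which is why dropping low-playcount records before the projection shrinks the edge set so sharply.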



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
