[ 
https://issues.apache.org/jira/browse/MAHOUT-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128130#comment-13128130
 ] 

Sean Owen commented on MAHOUT-710:
----------------------------------

Is the second patch OK to commit Sebastian? I would assume so.
I am unable to apply it as it's gone out of sync with HEAD. Tillmann can you 
try to reintegrate it?
                
> Implementing K-Trusses
> ----------------------
>
>                 Key: MAHOUT-710
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-710
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Graph
>    Affects Versions: 0.5
>            Reporter: Tillmann Fiehn
>            Assignee: Sebastian Schelter
>             Fix For: 0.6
>
>         Attachments: 
> 0001-SimplifyGraph_AugmentEdgesWithDegrees_EnumerateTriangles_workfine.patch, 
> 0002-FindComponents_FindKTrusses_workfine.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We are Tillmann Fiehn and Sebastian Arnold, IT students from TU Berlin. As 
> Sebastian Schelter already announced, we are atteding Isabel's and 
> Sebastian's class "Large scale data analysis and data mining" and picked an 
> interesting project that we want to implement in Mahout. We are open for any 
> hints and suggestions and would appreciate if you could share your thoughts 
> on our proposal.
> Our goal is to implement a map/reduce algorithm for finding k-trusses in a 
> given graph. A k-truss is a nontrivial, single-component maximal subgraph, 
> such that every edge is contained in at least k-2 triangles in the subgraph. 
> The algorithm was proposed in the IEEE paper J. Cohen 2009: "Graph Twiddling 
> in a MapReduce World" 
> (http://www.csee.usf.edu/~anda/CIS6930-S11/papers/graph-processing-w-mapreduce.pdf)
>  and involves a number of graph algorithms that are to our knowledge 
> currently not present in Mahout:
> *Goal: finding K-Trusses*
> * relaxation of k-member clique
> * non-trivial, single-component maximal subgraph, s.t.
> * every edge is contained in at least k-2 triangles in the subgraph
> *Algorithms to be implemented on top of Mahout / Hadoop:*
> *simplifyGraph*: Edges -> RepresentativeEdges
> * removes Loops (not cycles)
> * aggregate duplicate edges
> *augmentGraphWithDegrees*: RepresentativeEdges -> AugmentedEdges = (Edge (v, 
> u) , d(v), d(u))
> * augements the edges with degree information for both nodes d(v) = |{E | E = 
> (x,y) a. (x = v o. y = v) }|
> *enumerateTriangles*: AugmentedEdges  = (Edge, d(v), d(u)) -> Triangles (v, 
> u, s) 
> * finds all triangles in a Graph
> *findComponents*: RepresentativeEdges -> ZoneAssignments (v, z)
> * finds all components of a graph, each identified as the order number of the 
> lowest-order vertex contained
> * consists of:
> ** step 1: find adjacent zones: Edges x Zones -> InterzoneEdges (z, z)
> ** step 2: merge adjacent zones into one (the lowest-order neighbouring 
> zone): InterzoneEdges, ZoneAssignments (v, z) -> Pairs (v, z)
> {noformat}
> while true do:
>   step 1
>   if empty set interzone edges break;
>   step 2
> done
> {noformat}
> *findKTrusses*: Edges, k -> ZoneAssignments (v, z)
> * finds all k-trusses of the graph
> * each returned vertex v is part of a truss z
> {noformat}
> simplifyGraph
> while true do:
>   augmentGraphWithDegrees
>   enumerateTriangles
>   keep only edges contained in k-2 triangles
>   if all edges kept break;
> done
> findComponents
> {noformat}
> We suppose to create the package {{org.apache.mahout.graph.trusses}} and  
> {{org.apache.mahout.graph.components}} in the {{core}} module.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to