Hi Dzeno,

I'm not familiar with the algorithm myself, but if you have an important use case for it, you could open a JIRA to discuss it. However, if it is a less common algorithm, I'd recommend first submitting it as a Spark package and publicizing the package on the user list. If it gains traction, then it could become a higher-priority item for MLlib.

Thanks,
Joseph

On Mon, Dec 14, 2015 at 7:56 AM, Dženan Softić <dzen...@gmail.com> wrote:
> Hi,
>
> As part of a project, we are trying to create a parallel implementation
> of the BIRCH clustering algorithm [1]. We are mostly getting the idea of
> how to do it from this paper, which used CUDA to make BIRCH parallel [2].
> ([2] is a short paper; only section 4 is relevant.)
>
> We would like to implement BIRCH on Spark. Would this be an interesting
> contribution for MLlib? Has anyone already tried to implement BIRCH on
> Spark?
>
> Any suggestions for the implementation itself would be very much
> appreciated!
>
> [1] http://www.cs.sfu.ca/CourseCentral/459/han/papers/zhang96.pdf
> [2] http://boyuan.global-optimization.com/Mypaper/IDEAL2013-88.pdf
>
> Best,
> Dzeno