Hi Malith,
On Wed, Jun 7, 2017 at 3:04 PM, Malith Jayasinghe <[email protected]> wrote:

> Hello All,
>
> We are developing k-means clustering extension. k-means is an
> unsupervised learning algorithm which provides a simple way to classify a
> given data set through a certain number of clusters. The standard k-means
> clustering algorithm is a nondeterministic algorithm. This means that we
> can get different results for the same input data when we run the algorithm
> multiple times. The reason is that the algorithm randomly chooses k
> observations from the data set and uses these as the initial means. Here
> we implement a variant of k means in which the initial cluster centers
> are determined by the first k distinct values. This will ensure the same
> output for a given input.
>
> Function Parameters:
>   Data point to be clustered
>   Number of cluster centers - k
>   Number of iterations - m
>   Number of events for which the model is trained - x
>
> The cluster centers are initialized based on the first distinct number of
> k (number of cluster centers) events in the stream.
>
> The model is trained for every x events received.

Does this mean at any point in time, the maximum number of input points
used by the training process is x? Also how is the training process carried
out? I assume the training doesn't happen in real time.

> After receiving the first x events, an output is given for each event
> generated. The output consists of the cluster centre value to which the
> data point belongs, the id of the particular cluster center and the
> distance from the cluster center.
>
> The clustering can be performed for a given window implementation i.e.
> time, time batch, length
>
> --
> Malith Jayasinghe
>
> WSO2, Inc. (http://wso2.com)
> Email :[email protected]
> Mobile :0770704040
> Blog :https://medium.com/@malith.jayasinghe
> Lean . Enterprise . Middleware
>
> _______________________________________________
> Architecture mailing list
> [email protected]
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture

--
Thanks & Regards,

Fazlan Nazeem
*Senior Software Engineer*
*WSO2 Inc*
Mobile : +94772338839
[email protected]
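
P.S. To make the question about training concrete, here is a minimal sketch
of the variant as I read it from the description in this thread. It is not
the extension's actual code. It assumes one-dimensional numeric data points
and absolute difference as the distance, and the names first_k_distinct,
train and assign are hypothetical. Retraining on every x events and the
window handling are left out.

```python
def first_k_distinct(values, k):
    """Return the first k distinct values, in arrival order."""
    centers = []
    for v in values:
        if v not in centers:
            centers.append(v)
            if len(centers) == k:
                break
    return centers


def nearest(value, centers):
    """Index of the closest center; ties go to the lowest index."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def train(values, k, max_iterations):
    """Run at most max_iterations (m) k-means passes over one batch.

    Initial centers are the first k distinct values rather than a random
    sample, so the same batch always yields the same centers.
    """
    centers = first_k_distinct(values, k)
    if len(centers) < k:
        raise ValueError("batch has fewer than k distinct values")
    for _ in range(max_iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # An empty cluster keeps its previous center (an assumption).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break  # converged before using all m iterations
        centers = updated
    return centers


def assign(value, centers):
    """Return (cluster center value, cluster id, distance) for one event."""
    idx = nearest(value, centers)
    return centers[idx], idx, abs(value - centers[idx])


batch = [1.0, 2.0, 10.0, 11.0, 1.5, 10.5]   # a batch of x = 6 events
centers = train(batch, k=2, max_iterations=10)
print(centers)               # [1.5, 10.5]
print(assign(9.0, centers))  # (10.5, 1, 1.5)
```

In this reading, the training step only ever sees the x events of the
current batch, which is what my first question above is trying to confirm.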
