On Wed, Mar 29, 2017 at 9:10 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Sorry, I think more commonly, if aggregating transpose is to be used, then
> centroid assignments had better be the key of the matrix D (so D := A) and
> aggregating transpose is performed on a matrix (1 | D)'  (i.e., (1 cbind
> D).t) so that the first row of the result contains counts of cluster points
> and we can finish up cluster assignment via
>
> M = (1 | D)'
> C = M(:,2:) with each row Hadamard-divided by first row of counts M(:,1)
> (implying Golub-Van Loan notation for subblocking)
>

Argh, the other way around. This should of course read

C = M(2:,:) with each row Hadamard-divided by M(1,:)
in Golub/Van Loan notation.

1 | D means 1 cbind D in Samsara-speak. Slicing is explained in the
manual; note that in Samsara row indices start at 0, not at 1 as in common
notation or in R. The implication is that M(1,:) should be collected as a
simple vector and then broadcast to M(2:,:), with the actual row-wise
division being done in M(2:,:).mapBlock(){...}.
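
To make the corrected formula concrete, here is a minimal sketch in plain
Python rather than Samsara. The helper names (aggregating_transpose,
recompute_centroids) are made up for illustration and are not Samsara API;
the sketch only assumes that an aggregating transpose sums the rows that
share a key while transposing.

```python
# Plain-Python illustration only; none of these names are Samsara API.

def aggregating_transpose(keys, rows, k):
    """Transpose `rows`, summing the columns that share the same key.

    keys[i] is the 0-based cluster index of rows[i]; the result has one
    column per cluster and one row per input column.
    """
    width = len(rows[0])
    out = [[0.0] * k for _ in range(width)]
    for key, row in zip(keys, rows):
        for j, value in enumerate(row):
            out[j][key] += value
    return out


def recompute_centroids(assignments, data, k):
    # (1 | D): prepend a column of ones so that row 0 of M holds the counts.
    with_ones = [[1.0] + list(row) for row in data]
    m = aggregating_transpose(assignments, with_ones, k)
    counts = m[0]   # M(1,:) in 1-based Golub/Van Loan notation
    sums = m[1:]    # M(2:,:)
    # Divide each row of sums element-wise by the counts; empty clusters stay 0.
    return [[s / c if c else 0.0 for s, c in zip(row, counts)] for row in sums]


data = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
assignments = [0, 0, 1]
centroids = recompute_centroids(assignments, data, k=2)
# Column c of `centroids` is the mean of the points assigned to cluster c.
```

In a distributed setting the counts row is small (one entry per cluster),
which is why it can be collected and broadcast before the division.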


> On Wed, Mar 29, 2017 at 9:02 AM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
>> The simplest scheme is to initialize a distributed matrix of the shape D :=
>> (0 | A), where A is your dataset and 0 is a single column holding the
>> current centroid assignment, and to distribute the current centroid matrix
>> C via matrix broadcast (assuming there are few enough centers).
>>
>> Then alternate between running cluster assignment within the mapBlock()
>> operator on D and recomputing the new centroids C afterwards.
>> Recomputation of centroids can be done via aggregating transpose.
>>
>> Of course, a better scheme includes pre-sketching (k-means||) and use of
>> the triangle inequality during recomputations.
>>
>> On Wed, Mar 29, 2017 at 8:30 AM, KHATWANI PARTH BHARAT <
>> h2016...@pilani.bits-pilani.ac.in> wrote:
>>
>>> Sir,
>>> I am trying to write the k-means clustering algorithm using Mahout
>>> Samsara, but I am a bit confused about how to leverage the Distributed
>>> Row Matrix for it. Can anybody help me with this?
>>>
>>>
>>>
>>>
>>>
>>> Thanks
>>> Parth Khatwani
>>>
>>
>>
>
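
The assignment step from the quoted scheme (rewrite the leading column of
(0 | A) inside each block, using a broadcast copy of the centroids) can be
sketched as follows. This is plain Python, not Samsara; assign_block is a
hypothetical stand-in for the body of a mapBlock() closure, and the
centroids are stored one per row here purely for convenience.

```python
# Plain-Python illustration only; assign_block is a hypothetical helper.

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_block(block, centroids):
    """Return a copy of `block` with column 0 set to the nearest centroid index.

    block: rows of (0 | A), i.e. [assignment, x1, x2, ...]
    centroids: one list of coordinates per centroid (the broadcast copy of C)
    """
    out = []
    for row in block:
        point = row[1:]
        nearest = min(range(len(centroids)),
                      key=lambda c: squared_distance(point, centroids[c]))
        out.append([float(nearest)] + point)
    return out


block = [[0.0, 1.0, 2.0], [0.0, 9.0, 9.0], [0.0, 2.0, 1.0]]
centroids = [[1.0, 1.0], [10.0, 10.0]]
assigned = assign_block(block, centroids)
labels = [int(row[0]) for row in assigned]
```

Each block only needs its own rows and the broadcast centroids, so the step
involves no communication between blocks.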
