Bumping this-

Parth, is there anything we can do to assist you?



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Mon, Apr 24, 2017 at 9:34 PM, KHATWANI PARTH BHARAT <
h2016...@pilani.bits-pilani.ac.in> wrote:

> @Trevor and @Dmitriy
>
> A tough bug in the aggregating transpose is fixed. One issue is still left,
> which is hindering completion of the KMeans code.
> That issue is assigning the "Closest Cluster Index" found to the row keys
> of the DRM.
> Consider the matrix of data points given as follows:
>
> {
>    0 => {0:1.0,    1: 1.0,    2: 1.0,   3: 3.0}
>    1 => {0:1.0,    1: 2.0,    2: 3.0,   3: 4.0}
>    2 => {0:1.0,    1: 3.0,    2: 4.0,   3: 5.0}
>    3 => {0:1.0,    1: 4.0,    2: 5.0,   3: 6.0}
>   }
> Now these are
> 0 =>
> 1 =>
> 2 =>
> 3 =>
> the row keys. Here the zeroth column (0) contains the values that will be
> used to store the count of points assigned to each cluster, and columns 1
> to 3 contain the coordinates of the data points.
>
> So now, after the cluster assignment step of the KMeans algorithm, which
> @Dmitriy outlined at the beginning of this mail chain,
>
> the above matrix should look like this (assuming that the 0th and 1st data
> points are assigned to the cluster with index 0, and the 2nd and 3rd data
> points are assigned to the cluster with index 1):
>
>  {
>    0 => {0:1.0,    1: 1.0,    2: 1.0,   3: 3.0}
>    0 => {0:1.0,    1: 2.0,    2: 3.0,   3: 4.0}
>    1 => {0:1.0,    1: 3.0,    2: 4.0,   3: 5.0}
>    1 => {0:1.0,    1: 4.0,    2: 5.0,   3: 6.0}
>  }
>
> To achieve the above result, I am using the following lines of code:
>
> //11. Iterating over the Data Matrix(in DrmLike[Int] format)
> dataDrmX.mapBlock() {
>   case (keys, block) =>
>     for (row <- 0 until block.nrow) {
>          var dataPoint = block(row, ::)
>
>          //12. findTheClosestCentriod finds the closest centroid to the data
>          //    point specified by "dataPoint"
>          val closesetIndex = findTheClosestCentriod(dataPoint, centriods)
>
>          //13. assigning closest index to key
>          keys(row) = closesetIndex
>      }
>      keys -> block
> }
>
> But it turns out to be
>
>  {
>    0 => {0:1.0,    1: 2.0,    2: 3.0,   3: 4.0}
>    1 => {0:1.0,    1: 4.0,    2: 5.0,   3: 6.0}
>  }
>
>
> So is there anything wrong with the syntax of the above code? I am unable
> to find any reference for the way in which I should assign a value to the
> row keys.
>
> @Trevor as per what you have mentioned in the above mail chain
> "Got it- in short no.
>
> Think of the keys like a dictionary or HashMap.
>
> That's why everything is ending up on row 1."
>
> But according to the algorithm outlined by @Dmitriy at the start of the
> mail chain, assigning the same key to multiple rows is possible.
> The same is also mentioned in the book written by Dmitriy and Andrew:
> rows having the same row key are summed up when we take the
> aggregating transpose.
>
> I am now confused about whether it is possible to achieve what I have
> described above, whether it is not possible, or whether it is a bug in
> the API.
>
>
>
> Thanks & Regards
> Parth
>
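
For reference, the behaviour described in the quoted message (rows that share a row key being summed by the aggregating transpose) can be sketched in plain Scala. This is only an illustration under assumptions: it does not use Mahout at all, `sumRowsByKey` is a hypothetical helper and not part of any Mahout API, and the data is the example matrix from the message, keyed by the assigned cluster index.

```scala
// Plain-Scala sketch with no Mahout dependency. `sumRowsByKey` is a
// hypothetical helper used only for illustration; it is not a Mahout API.
object AggregateByKeySketch {

  // Sum, element by element, all rows that share the same key.
  def sumRowsByKey(rows: Seq[(Int, Vector[Double])]): Map[Int, Vector[Double]] =
    rows.groupBy(_._1).map { case (key, group) =>
      key -> group.map(_._2).reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
    }

  def main(args: Array[String]): Unit = {
    // Example matrix from the message: column 0 is the count column,
    // columns 1 to 3 are the coordinates of each data point.
    val assigned = Seq(
      0 -> Vector(1.0, 1.0, 1.0, 3.0),
      0 -> Vector(1.0, 2.0, 3.0, 4.0),
      1 -> Vector(1.0, 3.0, 4.0, 5.0),
      1 -> Vector(1.0, 4.0, 5.0, 6.0)
    )
    val sums = sumRowsByKey(assigned)
    sums.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k => $v") }
  }
}
```

With this input, key 0 sums to (2.0, 3.0, 4.0, 7.0) and key 1 to (2.0, 7.0, 9.0, 11.0). Column 0 then holds the number of points per cluster, so dividing the remaining columns by it would give the new centroid coordinates.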
