Yeah, that's libsvm format, which is 1-indexed.
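
For anyone who hits the same mismatch, here is a minimal sketch of the conversion in plain Java, with no Spark dependency. The class name LibsvmIndices and the helper parseFeatures are hypothetical names made up for this illustration, not part of Spark. It assumes a libsvm-style feature string such as "1:1.1 2:1.1 3:1.1" and produces the 0-based index array that Vectors.sparse(size, indices, values) expects.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Spark.
class LibsvmIndices {

    /** Holds the 0-based indices and their values. */
    static final class Parsed {
        final int[] indices;
        final double[] values;

        Parsed(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    /**
     * Parses a libsvm-style feature string ("index:value" pairs with
     * 1-based indices) and shifts every index down by one.
     * Rejects any index that does not fit in a vector of the given size.
     */
    static Parsed parseFeatures(String features, int size) {
        String[] tokens = features.trim().split("\\s+");
        int[] indices = new int[tokens.length];
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":");
            int zeroBased = Integer.parseInt(pair[0]) - 1; // libsvm is 1-based
            if (zeroBased < 0 || zeroBased >= size) {
                throw new IllegalArgumentException(
                    "Index " + pair[0] + " is outside a vector of size " + size);
            }
            indices[i] = zeroBased;
            values[i] = Double.parseDouble(pair[1]);
        }
        return new Parsed(indices, values);
    }

    public static void main(String[] args) {
        Parsed p = parseFeatures("1:1.1 2:1.1 3:1.1", 3);
        System.out.println(Arrays.toString(p.indices)); // prints [0, 1, 2]
        // With Spark on the classpath, these arrays would be passed as
        // Vectors.sparse(3, p.indices, p.values).
    }
}
```

The bounds check is there so that an index like 4 in a 3-dimensional vector fails at construction time with a clear message, rather than later inside predict.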

On Wed, Aug 3, 2016 at 12:45 PM, Tony Lane <tonylane....@gmail.com> wrote:
> I guess the difference between the model setup and the vector usage confused me.
> The setup uses positions 1, 2, 3, as in the build example: "1:0.0
> 2:0.0 3:0.0".
> I thought I needed to follow the same numbering when creating the vector too.
>
> thanks a bunch
>
>
> On Thu, Aug 4, 2016 at 12:39 AM, Sean Owen <so...@cloudera.com> wrote:
>>
>> You mean "new int[] {0,1,2}" because vectors are 0-indexed.
>>
>> On Wed, Aug 3, 2016 at 11:52 AM, Tony Lane <tonylane....@gmail.com> wrote:
>> > Hi Sean,
>> >
>> > I did not understand.
>> > I created a KMeansModel with 3 dimensions, and I am calling the predict
>> > method on this model with a 3-dimensional vector.
>> > I am not sure what is wrong with this approach. Am I missing something?
>> >
>> > Tony
>> >
>> > On Wed, Aug 3, 2016 at 11:22 PM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> You declare that the vector has 3 dimensions, but then refer to its
>> >> 4th dimension (at index 3). That is the error.
>> >>
>> >> On Wed, Aug 3, 2016 at 10:43 AM, Tony Lane <tonylane....@gmail.com> wrote:
>> >> > I am using the following vector definition in java
>> >> >
>> >> > Vectors.sparse(3, new int[] { 1, 2, 3 }, new double[] { 1.1, 1.1, 1.1 }))
>> >> >
>> >> > However when I run the predict method on this vector it leads to
>> >> >
>> >> > Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3
>> >> > at org.apache.spark.mllib.linalg.BLAS$.dot(BLAS.scala:143)
>> >> > at org.apache.spark.mllib.linalg.BLAS$.dot(BLAS.scala:115)
>> >> > at org.apache.spark.mllib.util.MLUtils$.fastSquaredDistance(MLUtils.scala:298)
>> >> > at org.apache.spark.mllib.clustering.KMeans$.fastSquaredDistance(KMeans.scala:606)
>> >> > at org.apache.spark.mllib.clustering.KMeans$$anonfun$findClosest$1.apply(KMeans.scala:580)
>> >> > at org.apache.spark.mllib.clustering.KMeans$$anonfun$findClosest$1.apply(KMeans.scala:574)
>> >> > at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
>> >> > at org.apache.spark.mllib.clustering.KMeans$.findClosest(KMeans.scala:574)
>> >> > at org.apache.spark.mllib.clustering.KMeansModel.predict(KMeansModel.scala:59)
>> >> > at org.apache.spark.ml.clustering.KMeansModel.predict(KMeans.scala:130)
>> >
>> >
>
>
