Thank you. I think I've found this bug (or a related one) here: https://issues.apache.org/jira/browse/IGNITE-9239. The fix will be delivered in 2.7 (currently it's in the master branch).
To be 100% sure that the bug is fixed, could @DocDVZ describe how the cache is populated? I mean this cache:

    IgniteCache<String, double[]> dataCache = ignite.cache(storageName);

Thank you.

Tue, Aug 21, 2018 at 10:11, Denis Magda <[email protected]>:

> Hey, ML experts,
>
> Here is an ML issue reported. Please have a look.
>
> --
> Denis
>
> ---------- Forwarded message ---------
> From: DocDVZ <[email protected]>
> Date: Mon, Aug 20, 2018 at 10:53 AM
> Subject: NPE exception in KMeansTrainer
> To: <[email protected]>
>
>
> Hello,
>
> Since I'm new to data science, I'm not really sure if it's a bug or wrong
> incoming data, so I decided to ask here for advice before submitting a
> ticket. I tried to apply the KMeans algorithm to my bag-of-words data with ~8k
> features. So I copy-pasted some lines from the example:
>
>     IgniteCache<String, double[]> dataCache = ignite.cache(storageName);
>     KMeansTrainer trainer = new KMeansTrainer().withSeed(1234L);
>     KMeansModel mdl = trainer.fit(
>         ignite,
>         dataCache,
>         (k, v) -> Arrays.copyOfRange(v, 1, v.length),
>         (k, v) -> v[0]
>     );
>
> But this leads to a NullPointerException in KMeansTrainer.class:
>
> Caused by: java.lang.NullPointerException
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.lambda$initClusterCentersRandomly$4dba08e1$1(KMeansTrainer.java:190)
>     at org.apache.ignite.ml.dataset.impl.cache.CacheBasedDataset.computeForAllPartitions(CacheBasedDataset.java:158)
>     at org.apache.ignite.ml.dataset.impl.cache.CacheBasedDataset.compute(CacheBasedDataset.java:122)
>     at org.apache.ignite.ml.dataset.Dataset.compute(Dataset.java:102)
>     at org.apache.ignite.ml.dataset.Dataset.compute(Dataset.java:156)
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.initClusterCentersRandomly(KMeansTrainer.java:186)
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:86)
>
> at line:
>
>     List<LabeledVector> rndPnts = dataset.compute(data -> {
>         List<LabeledVector> rndPnt = new ArrayList<>();
>         rndPnt.add(data.getRow(new Random(seed).nextInt(data.rowSize())));
>         return rndPnt;
>     }, (a, b) -> a == null ? b : Stream.concat(a.stream(), b.stream()).collect(Collectors.toList()));
>
> The reducer receives a null value for b, and since there's no null check,
> b.stream() leads to an NPE. The Ignite version is 2.6. This seems like a bug to
> me. Is there any way to work around this issue?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
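
For reference, a null-safe version of the reducer quoted above might look like the sketch below. It is only an illustration in plain Java with no Ignite dependency, not the actual change made in IGNITE-9239. The class and method names (NullSafeReducerSketch, mergeNullSafe) are hypothetical, and the sketch assumes that a partition with no result contributes null rather than an empty list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of the Ignite code base.
class NullSafeReducerSketch {
    /**
     * Merges two partial results. Either side may be null
     * (assumption: a partition without a result yields null).
     */
    static <T> List<T> mergeNullSafe(List<T> a, List<T> b) {
        if (a == null)
            return b; // may itself be null if both sides are missing
        if (b == null)
            return a; // the check the quoted reducer lacks
        return Stream.concat(a.stream(), b.stream()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("p1", "p2");
        List<String> right = Arrays.asList("p3");

        // A null right-hand side no longer triggers an NPE.
        System.out.println(mergeNullSafe(left, null));
        // Both sides present: results are concatenated in order.
        System.out.println(mergeNullSafe(left, right));
    }
}
```

The only difference from the quoted lambda is the extra check on b, so a null on either side is passed through instead of being dereferenced.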
