Yeah, same thing:

mahout> val rowBindings= drmTFIDF.getRowLabelBindings
rowBindings: java.util.Map[String,Integer] = {/talk.religion.misc/84570=7597}

mahout> rowBindings.size
res0: Int = 1

I still have to double-check that row 7597 is actually /talk.religion.misc/84570.
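
One way to make that check easier is to invert the bindings so a label can be looked up by row index. This is only a sketch using the plain Scala/Java collections: the `RowLabelLookup` object and its `invert` helper are hypothetical names (not part of Mahout), and the `sample` map is a stand-in for whatever `getRowLabelBindings` returns.

```scala
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Mahout.
object RowLabelLookup {
  // Invert a label -> row index map into row index -> label.
  def invert(bindings: java.util.Map[String, Integer]): Map[Int, String] =
    bindings.asScala.map { case (label, idx) => (idx.intValue, label) }.toMap
}

// Stand-in for drmTFIDF.getRowLabelBindings (assumption: same key/value types).
val sample = new java.util.HashMap[String, Integer]()
sample.put("/talk.religion.misc/84570", Integer.valueOf(7597))

val labelByRow = RowLabelLookup.invert(sample)
println(labelByRow.get(7597)) // prints Some(/talk.religion.misc/84570)
```

With only one binding in the map, this can confirm the label for row 7597 but says nothing about the other rows.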

thx.

> From: [email protected]
> To: [email protected]
> Subject: drmFromHDFS rowLabelBindings question
> Date: Fri, 12 Sep 2014 13:36:43 -0400
> 
> I'm having some trouble getting the rowLabelBindings from a String-keyed 
> (Checkpointed...Spark)Drm read in from HDFS.  I'm reading in a sequence 
> file of the form <Text,VectorWritable>, which is output from seq2sparse.  The Drm 
> has 7598 rows and the vectors seem to be read in properly.  When I try to get 
> a Map using getRowLabelBindings(), I get back a Map of size 1.
> 
> The single key/value pair in that map is consistent with what I would expect:
> 
>     k = /talk.religion.misc/84570, v = 7597
> 
> (the last row), but I don't know why I'm not getting entries for the rest of 
> the rows.  I've been looking through drmWrap() and drmFromHDFS() and don't 
> see where/if rowLabelBindings is being set except in collect() (I'm very 
> likely missing something because of my limited Scala understanding).
> 
> Is there a different way to get a map of key/rowIndex?
> 
> Ultimately I'm working from the math-scala package, so I can't do anything 
> specific with RDDs.
> 
> Below is where I'm hitting trouble shown in the Spark-Shell.
> 
> Any input is appreciated.  Thanks,
> 
> Andy
> 
> 
> mahout> val drmTFIDF= drmFromHDFS( path = 
> "/tmp/mahout-work-andy/20news-test-vectors/part-r-00000")
> drmTFIDF: org.apache.mahout.math.drm.CheckpointedDrm[_] = 
> org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark@6e200e2d
>  
> mahout> drmTFIDF.nrow
> res0: Long = 7598
> mahout> val drmRowLabelBindings:java.util.HashMap[String,Integer] = new 
> java.util.HashMap(drmTFIDF.getRowLabelBindings) 
> drmRowLabelBindings: java.util.HashMap[String,Integer] = 
> {/talk.religion.misc/84570=7597}
> 
> mahout> val incoreTFIDF=drmTFIDF.collect
> incoreTFIDF: org.apache.mahout.math.Matrix = 
> {
>   2770  =>    
> {40894:3.706777572631836,25040:5.326602935791016,63527:7.625180244445801,30072:9.138689994812012,75991:2.7672300338745117,91042:1.964722752571106,85483:5.487469673156738,45764:4.326903343200684,83215:2.904540777206421,25284:4.734808444976807,90958:3.0112483501434326,29565:7.410068988800049,60779:6.3667192459106445,91156:1.6616008281707764,92814:2.255286693572998,23763:4.0394415855407715,7067:12.395035743713379,61058:6.993908405303955,55483:9.745443344116211,43286:3.622220039367676,65462:4.295836925506592,43535:1.5242335796356201,34898:6.624548435211182,66572:8.541470527648926,64323:2.1623659133911133,58008:3.128486394882202,33351:3.3363659381866455,36587:4.08017110824585,74747:2.935668706893921,38...
>  
> 
> mahout> val incoreRowLabelBindings:java.util.HashMap[String,Integer] = new 
> java.util.HashMap(incoreTFIDF.getRowLabelBindings)
> incoreRowLabelBindings: java.util.HashMap[String,Integer] = 
> {/talk.religion.misc/84570=7597}
> 
> mahout> incoreTFIDF.nrow
> res1: Int = 7598
> 
> mahout> incoreRowLabelBindings.size
> res3: Int = 1
> 
> mahout> drmTFIDF.nrow
> res5: Long = 7598
> 
> mahout> drmRowLabelBindings.size
> res4: Int = 1