I've been trying desperately to create an RDD from a matrix I have, and every combination I tried has failed.
I have a sparse matrix returned from these calls:

    dv = DictVectorizer()
    sv_tf = dv.fit_transform(tf)

It is supposed to be a matrix of document terms and their frequencies. I need to convert it to an RDD so I can feed it to PySpark functions such as IDF().fit().

What I have tried so far:

I tried applying Vectors.sparse(??, sv_tf), but I didn't know what the dimension should be.

I tried sc.parallelize(sv_tf), which didn't work either.

I tried both of the above with sv_tf.toarray(). Again, no luck.

Thanks,
Jeff
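For reference, here is a minimal sketch of the setup described above, with the sparse matrix unpacked row by row. The toy `tf` data and the helper name `rows_as_sparse_triples` are hypothetical and used only for illustration. It assumes the "dimension" in question is the number of columns of `sv_tf` (the vocabulary size), and it leaves Spark out entirely, so it only shows what each row holds.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical toy input: one dict of term -> count per document.
tf = [
    {"spark": 2, "rdd": 1},
    {"rdd": 3, "matrix": 1},
]

dv = DictVectorizer()
sv_tf = dv.fit_transform(tf)  # scipy sparse matrix, one row per document


def rows_as_sparse_triples(matrix):
    """Return one (size, indices, values) tuple per row of a scipy sparse matrix."""
    csr = matrix.tocsr()
    csr.sort_indices()      # make column indices ascending within each row
    size = csr.shape[1]     # number of columns = number of distinct terms
    triples = []
    for i in range(csr.shape[0]):
        start, end = csr.indptr[i], csr.indptr[i + 1]
        indices = [int(j) for j in csr.indices[start:end]]
        values = [float(v) for v in csr.data[start:end]]
        triples.append((size, indices, values))
    return triples


triples = rows_as_sparse_triples(sv_tf)
```

Each tuple has the (size, indices, values) shape that pyspark.mllib.linalg.Vectors.sparse accepts, so one possible next step is to build one sparse vector per tuple and pass that list to sc.parallelize. That step is not exercised in this sketch.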