Sounds like a good idea in general.
Here is a tiny bit of code to get you rolling. Adding this to the existing
VectorDumper would be better than the standalone class I have here. Your
point about long strings is also very pertinent.
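
import java.io.PrintWriter;
import java.util.Iterator;

import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.MatrixSlice;
import org.apache.mahout.math.Vector;

public class DumpTriples {
  // Writes one "row,column,value" line per non-zero element of m.
  public static void dump(PrintWriter out, Matrix m) {
    for (MatrixSlice row : m) {
      Iterator<Vector.Element> i = row.vector().iterateNonZero();
      while (i.hasNext()) {
        Vector.Element element = i.next();
        out.printf("%d,%d,%f\n", row.index(), element.index(), element.get());
      }
    }
  }
}
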
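
Two caveats if the triples are headed for R. Mahout indices are 0-based
while sparseMatrix expects 1-based i and j by default, and %f rounds to six
decimals and follows the default locale, which in some locales prints a comma
as the decimal separator and breaks the CSV. Below is a rough, Mahout-free
sketch of the formatting only. The class name, the writeTriples helper, the
indexOffset parameter, and the nested-map stand-in for sparse rows are all
made up for illustration and are not part of any Mahout API.

```java
import java.io.PrintWriter;
import java.util.Map;
import java.util.TreeMap;

class TripleWriterSketch {
  // Hypothetical helper: streams one "row,column,value" line per stored entry
  // straight to the writer, so no large intermediate string is built.
  // indexOffset is added to both indices (pass 1 to get 1-based output for R).
  static void writeTriples(PrintWriter out,
                           Map<Integer, Map<Integer, Double>> rows,
                           int indexOffset) {
    for (Map.Entry<Integer, Map<Integer, Double>> row : rows.entrySet()) {
      for (Map.Entry<Integer, Double> cell : row.getValue().entrySet()) {
        out.print(row.getKey() + indexOffset);
        out.print(',');
        out.print(cell.getKey() + indexOffset);
        out.print(',');
        // Printing the Double directly uses Double.toString, which does not
        // depend on the locale and does not round to six decimals.
        out.print(cell.getValue());
        out.print('\n');
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for a sparse matrix: row index -> (column index -> value).
    Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();
    rows.computeIfAbsent(0, k -> new TreeMap<>()).put(0, 1.0);
    rows.computeIfAbsent(0, k -> new TreeMap<>()).put(2, 0.25);
    rows.computeIfAbsent(2, k -> new TreeMap<>()).put(1, 3.5);

    PrintWriter out = new PrintWriter(System.out);
    writeTriples(out, rows, 1);
    out.flush();
  }
}
```

On the R side, something like t <- read.csv("triples.csv", header=FALSE)
followed by sparseMatrix(i=t[,1], j=t[,2], x=t[,3]) should then load the
file; the file name here is just a placeholder.
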
On Wed, Aug 17, 2011 at 3:27 PM, Jeff Hansen <[email protected]> wrote:
> Does anybody happen to know if there's already a utility out there for
> dumping a sequence file of vectors to a csv file with vector,element,value?
>
> I was hoping to shift some of my results over to R and found a comment by Ted a
> while back suggesting that the easiest method is to spit out sparse csv
> triples and load them with
>
> sparseMatrix(x=c(1,1,1,1), i=c(1,2,3,3), j=c(1,1,2,1))
>
> from the Matrix library.
>
> This wouldn't be that complicated to write, but I imagine I'm not the first
> person to look for it. If a utility like this doesn't already exist, does
> anybody think it would be a worthwhile enhancement to add an option onto the
> VectorDump utility to output to this format? If so I'd be happy to offer up
> a patch (although I might want to refactor the VectorHelper class to emit
> straight out to the writer -- I'm not too fond of generating huge strings)
>