Sounds like a good idea in general.

Here is a tiny bit of code to get you rolling.  Adding this to the existing
VectorDumper would be better than the standalone class I have here.  Your
thought about long strings is also very pertinent.

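import java.io.PrintWriter;
import java.util.Iterator;

import org.apache.mahout.math.Matrix;
import org.apache.mahout.math.MatrixSlice;
import org.apache.mahout.math.Vector;

public class DumpTriples {
  // Writes one "row,column,value" line per non-zero element of m.
  public static void dump(PrintWriter out, Matrix m) {
    for (MatrixSlice row : m) {
      Iterator<Vector.Element> i = row.vector().iterateNonZero();
      while (i.hasNext()) {
        Vector.Element element = i.next();
        out.printf("%d,%d,%f\n", row.index(), element.index(), element.get());
      }
    }
  }
}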
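
To make the long-strings point concrete, here is a rough sketch of the same
triple output written straight to a Writer, one line at a time, so nothing
large is ever built in memory.  It deliberately avoids Mahout so it runs on
its own: the TripleStreamer class and the nested Map standing in for a sparse
matrix are my own placeholders, not anything in VectorDumper or VectorHelper.

```java
import java.io.PrintWriter;
import java.io.Writer;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout.
final class TripleStreamer {
  // rows is a stand-in for a sparse matrix: row index -> (column index -> value).
  static void dump(Writer w, Map<Integer, Map<Integer, Double>> rows) {
    PrintWriter out = new PrintWriter(w);
    for (Map.Entry<Integer, Map<Integer, Double>> row : rows.entrySet()) {
      for (Map.Entry<Integer, Double> cell : row.getValue().entrySet()) {
        // Each triple goes to the writer immediately; no big string is assembled.
        // Locale.ROOT keeps '.' as the decimal separator so the CSV stays parseable.
        out.printf(Locale.ROOT, "%d,%d,%f\n", row.getKey(), cell.getKey(), cell.getValue());
      }
    }
    // Flush but do not close: the caller owns the underlying writer.
    out.flush();
  }

  public static void main(String[] args) {
    Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();
    rows.computeIfAbsent(0, k -> new TreeMap<>()).put(2, 1.5);
    rows.computeIfAbsent(3, k -> new TreeMap<>()).put(1, -2.0);
    dump(new PrintWriter(System.out), rows);
  }
}
```

One thing to watch on the R side: as far as I recall, sparseMatrix treats i
and j as 1-based unless index1 = FALSE is passed, so 0-based indices like
these would need either that flag or a +1 before loading.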


On Wed, Aug 17, 2011 at 3:27 PM, Jeff Hansen <[email protected]> wrote:

> Does anybody happen to know if there's already a utility out there for
> dumping a sequence file of vectors to a csv file with vector,element,value?
>
> I was hoping to shift some of my results over to R and found a comment by Ted a
> while back suggesting that the easiest method is to spit out sparse csv
> triples and load them with
>
> sparseMatrix(x=c(1,1,1,1), i=c(1,2,3,3), j=c(1,1,2,1))
>
> from the Matrix library.
>
> This wouldn't be that complicated to write, but I imagine I'm not the first
> person to look for it.  If a utility like this doesn't already exist, does
> anybody think it would be a worthwhile enhancement to add an option onto the
> VectorDump utility to output to this format?  If so I'd be happy to offer up
> a patch (although I might want to refactor the VectorHelper class to emit
> straight out to the writer -- I'm not too fond of generating huge strings)
>
