Make sure that the files can be ordered, of course.  Losing the ordering
can be really bad.
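
A rough sketch of what I mean, in Python. It assumes the usual Hadoop
output naming (part-00000, part-r-00000, part-m-00000) and a directory on
the local filesystem rather than HDFS. The helper name ordered_part_files
is made up for illustration; it is not a Mahout or Hadoop API.

```python
import re
from pathlib import Path

# Assumed naming: part-00000, part-r-00000 or part-m-00000.
_PART_RE = re.compile(r"^part-(?:[mr]-)?(\d+)$")


def ordered_part_files(directory):
    """Hypothetical helper: return part-* files sorted by numeric index."""
    indexed = []
    for path in Path(directory).iterdir():
        match = _PART_RE.match(path.name)
        if match:  # skips _SUCCESS, _logs, .crc side files, etc.
            indexed.append((int(match.group(1)), path))
    # Sort on the parsed integer so the order does not depend on zero padding.
    indexed.sort(key=lambda pair: pair[0])
    return [path for _, path in indexed]
```

Reading the parts in that order keeps the rows in the sequence the job
wrote them, instead of whatever order the directory listing returns.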

On Sun, Nov 13, 2011 at 10:34 PM, Jake Mannix <[email protected]> wrote:

> Yeah, in particular, DistributedRowMatrix "is" simply a
> SequenceFile<IntWritable,VectorWritable>, when in its serialized form.  As
> such, this "file" can be (and typically is) a series of part-* files in a
> directory (typically on HDFS).
>
>  -jake
>
> On Sun, Nov 13, 2011 at 10:23 PM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > It's my understanding drm can be multifile. In fact, stuff like seq2sparse
> > will produce multifile output, being a MR job itself.
> > On Nov 12, 2011 3:23 PM, "Lance Norskog" <[email protected]> wrote:
> >
> > > Is there a convention for multi-file matrices? For example, the
> > > DistributedRowMatrix?
> > >
> > > --
> > > Lance Norskog
> > > [email protected]
> > >
> >
>
