I'm not sure that Dmitriy's use-case has an easy solution.  As you
say, Writable loads the whole thing into memory, regardless of
whether or not you try to buffer on iteration.
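
For reference, the single-use iterable that Ted describes in the quoted
message below could look roughly like this.  It is only a sketch:
OneShotIterable is a made-up name, not an existing Mahout or Hadoop class,
and a real version would presumably pull elements off the DataInput stream
instead of wrapping an Iterator handed to it.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: an Iterable that can be walked exactly once,
// as a stand-in for a stream-backed vector reader.
final class OneShotIterable<T> implements Iterable<T> {
    private final Iterator<T> source;
    private boolean consumed = false;

    OneShotIterable(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public Iterator<T> iterator() {
        // The underlying stream cannot be rewound, so a second pass is refused.
        if (consumed) {
            throw new UnsupportedOperationException(
                "elements can only be iterated once");
        }
        consumed = true;
        return source;
    }

    public static void main(String[] args) {
        OneShotIterable<Double> elements =
            new OneShotIterable<>(Arrays.asList(1.0, 2.0, 3.0).iterator());
        double sum = 0.0;
        for (double v : elements) {
            sum += v;
        }
        System.out.println(sum);
        try {
            elements.iterator();
        } catch (UnsupportedOperationException e) {
            System.out.println("second pass rejected");
        }
    }
}
```

The failure on the second iterator() call is what makes the one-pass
restriction visible to the caller, rather than silently returning an empty
or half-consumed iterator.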

My situation (monstrous vectors) is easier, in some respects: if
the matrices are essentially
SequenceFile<IntWritable,Pair<IntWritable,DoubleWritable>>, then
much bigger vectors can be handled in MR jobs, but they no longer
really look like "vectors" in the interface sense.
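
To make that concrete, here is a rough sketch in plain Java, with no Hadoop
types.  I'm assuming the key is the row index and the pair is (column index,
value), one record per entry; Entry and StreamingNorm are hypothetical names
for illustration.  The point is that a row can be consumed one entry at a
time, so nothing the size of the whole vector ever sits in memory.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for one record of the element-wise layout:
// key = row index, value = (column index, entry value).
final class Entry {
    final int row;
    final int col;
    final double value;

    Entry(int row, int col, double value) {
        this.row = row;
        this.col = col;
        this.value = value;
    }
}

final class StreamingNorm {
    // Squared Euclidean norm of one row, computed in a single pass
    // with constant memory; entries of other rows are skipped.
    static double squaredNorm(Iterator<Entry> entries, int row) {
        double sum = 0.0;
        while (entries.hasNext()) {
            Entry e = entries.next();
            if (e.row == row) {
                sum += e.value * e.value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Iterator<Entry> entries = Arrays.asList(
            new Entry(0, 3, 2.0),
            new Entry(0, 7, -1.0),
            new Entry(1, 0, 5.0)).iterator();
        System.out.println(squaredNorm(entries, 0));
    }
}
```

Anything that needs random access to the elements would not fit this shape,
which is why it stops looking like a "vector" in the interface sense.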

  -jake

On Mon, Dec 13, 2010 at 12:52 PM, Ted Dunning <[email protected]> wrote:

> OK.
>
> Let's assume that this is needed.
>
> I think that an iterable interface on VectorWritable that throws
> UnsupportedOperationException or similar if you try to get the iterator
> twice is much more transparent than a watcher structure and much easier
> for a user to discover/re-invent.
>
> Another (evil) thought is a parallel class to VectorWritable which is
> essentially SequentialAccessVectorWritable that supports reading and
> writing.  It seems to me that the Writable isn't really compatible with
> this interface in any case.  How will that be resolved?
>
>
> On Mon, Dec 13, 2010 at 11:36 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > Absent this solution, I realistically don't see how I can go without a
> > push technique for accessing the vectors.
> >
>
