Initial work on vector preprocessing is done in my git, on the "ssvd-vw-hack"
branch of my ssvd work (doc is here: https://github.com/dlyubimov/ssvd-doc).
The heavy lifting is done through the VectorPreprocessor interface and seems to
work like a charm. I have not tested it extensively, but once complete, it
should be able to cope with occasional spikes in data density without
crippling the SVD mapper's memory.
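
To illustrate the idea of streaming access, here is a minimal sketch only. It is not the code from the branch: the ElementHandler callback, the StreamingVectorSketch class and the toy wire layout (element count followed by index/value pairs) are all assumptions made up for this example, and the real VectorPreprocessor interface and VectorWritable format may look quite different. The point is just that each element is handed to the consumer as it is read, so no full vector is ever buffered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class StreamingVectorSketch {

    /** Hypothetical per-element callback; NOT the actual VectorPreprocessor interface. */
    public interface ElementHandler {
        void onElement(int index, double value);
    }

    /** Writes a toy sparse layout (assumed): element count, then (index, value) pairs. */
    public static byte[] encode(int[] indices, double[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(indices.length);
        for (int i = 0; i < indices.length; i++) {
            out.writeInt(indices[i]);
            out.writeDouble(values[i]);
        }
        out.flush();
        return bytes.toByteArray();
    }

    /**
     * Reads the toy layout and passes each element to the handler as soon as
     * it is decoded. No vector object is materialized, so memory use does not
     * grow with the number of elements in the row.
     */
    public static int stream(DataInput in, ElementHandler handler) throws IOException {
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            int index = in.readInt();
            double value = in.readDouble();
            handler.onElement(index, value);
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode(new int[] {1, 5, 9}, new double[] {2.0, 3.0, 4.0});
        double[] sum = new double[1];
        // The handler accumulates a running sum instead of storing elements.
        int n = stream(new DataInputStream(new ByteArrayInputStream(data)),
                (index, value) -> sum[0] += value);
        System.out.println(n + " elements, sum = " + sum[0]);
    }
}
```

A mapper built this way would do its per-element accumulation inside the handler, so a dense row costs more time but not more heap.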

thanks. -d

On Mon, Dec 13, 2010 at 9:49 AM, Dmitriy Lyubimov <[email protected]> wrote:

> Hi all,
>
> I would like to submit a patch to VectorWritable that allows streaming
> access to vector elements without having to prebuffer all of them first
> (the current code allows only the latter).
>
> That patch would strike down one of the memory usage issues in the
> current Stochastic SVD implementation and effectively open up the memory
> bound for n of the SVD work. (The value I see is not in opening up the
> bound, though, but in being more efficient in memory use, thus essentially
> speeding up the computation.)
>
> If it's ok, I would like to create a JIRA issue and provide a patch for it.
>
>
> Another issue is to provide an SSVD patch that depends on that patch for
> VectorWritable.
>
> Thank you.
> -Dmitriy
>
