Re: VectorWritable bug et al

Jake Mannix Thu, 11 Feb 2010 10:37:54 -0800

This code was copied right out of AbstractVector; I never really understood
why we had to do that caching.

On Thu, Feb 11, 2010 at 9:10 AM, Sean Owen <sro...@gmail.com> wrote:

> I'm writing up an appendix on Vector and Matrix. In the course of
> this, I noticed a big problem with VectorWritable. It is pretty
> glaringly un-thread-safe. It caches, in a static member, the class of
> the vector to be read. The read method is not synchronized. Oops.
>
> Synchronization fixes this, but also introduces an unnecessary
> bottleneck in the read method. I'm forced to wonder -- why does this
> serialized representation of a vector vary at all? I can understand
> why the best approach for representing a dense or sparse vector in
> memory varies, but those concerns do not apply to a serialized form.
> The sparse representation is the only reasonable one. Why would it
> vary?
>

Why would the sparse representation be the only way to represent it
on disk?  It's nearly twice as big as the dense form for dense vectors
(ok, 50% bigger).
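
The 50% figure can be checked with a rough sketch. It assumes 8-byte
doubles, 4-byte int indices, and a made-up layout: this is not Mahout's
actual wire format, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not Mahout's serialization code.
class SerializedSizeSketch {

    // Dense layout (assumed): [int length][double value] * length
    static int denseSize(double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Sparse layout (assumed): [int length][int nonZero]([int index][double value]) * nonZero
    static int sparseSize(double[] values) {
        int nonZero = 0;
        for (double v : values) {
            if (v != 0.0) {
                nonZero++;
            }
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeInt(values.length);
            out.writeInt(nonZero);
            for (int i = 0; i < values.length; i++) {
                if (values[i] != 0.0) {
                    out.writeInt(i);
                    out.writeDouble(values[i]);
                }
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        double[] full = new double[1000];
        for (int i = 0; i < full.length; i++) {
            full[i] = i + 1; // every entry non-zero
        }
        System.out.println("dense form:  " + denseSize(full) + " bytes");
        System.out.println("sparse form: " + sparseSize(full) + " bytes");
    }
}
```

Under these assumptions each sparse entry costs 12 bytes against 8 for a
dense one, so a fully populated 1000-element vector comes to 12008 bytes
sparse versus 8004 bytes dense, roughly 1.5x. The sparse form only wins
once fewer than about two thirds of the entries are non-zero.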

> I'd like to really fix this by unifying representation of vectors on
> disk accordingly. Am I missing something?
>

We should be fixing this with Avro "Real Soon Now".

Where do we actually use the VectorWritable.readVector() static
method?

If you stick to using VectorWritable the way people use other Writables
(just instantiate, then read()), this doesn't come up, because the static
class instance isn't used...
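
A minimal sketch of that instance-per-reader pattern, to show why it
sidesteps the threading problem. Everything here is a hypothetical
stand-in (the class name, the boolean type tag, and the layout are
assumptions, not VectorWritable's real code); only the readFields(DataInput)
shape follows the usual Hadoop Writable convention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Writable-style vector holder.
class VectorHolderSketch {
    // Per-instance state only: there is no static field for another
    // thread's read to overwrite.
    private boolean dense;
    private double[] values;

    void readFields(DataInput in) throws IOException {
        boolean isDense = in.readBoolean(); // type tag lives in a local
        double[] result = new double[in.readInt()];
        if (isDense) {
            for (int i = 0; i < result.length; i++) {
                result[i] = in.readDouble();
            }
        } else {
            int nonZero = in.readInt();
            for (int k = 0; k < nonZero; k++) {
                int index = in.readInt();
                result[index] = in.readDouble();
            }
        }
        this.dense = isDense;
        this.values = result;
    }

    boolean isDense() { return dense; }

    double[] get() { return values; }

    // Writes the assumed layout: [tag][length] then values or (index, value) pairs.
    static byte[] encode(double[] v, boolean asDense) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeBoolean(asDense);
            out.writeInt(v.length);
            if (asDense) {
                for (double x : v) {
                    out.writeDouble(x);
                }
            } else {
                int nonZero = 0;
                for (double x : v) {
                    if (x != 0.0) {
                        nonZero++;
                    }
                }
                out.writeInt(nonZero);
                for (int i = 0; i < v.length; i++) {
                    if (v[i] != 0.0) {
                        out.writeInt(i);
                        out.writeDouble(v[i]);
                    }
                }
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // "Just instantiate, then read": a fresh holder per call.
    static VectorHolderSketch decode(byte[] data) {
        VectorHolderSketch holder = new VectorHolderSketch();
        try {
            holder.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return holder;
    }

    public static void main(String[] args) throws InterruptedException {
        double[] v = {0.0, 1.5, 0.0, 2.5};
        // Each thread decodes with its own holder, so no locking is needed.
        Thread a = new Thread(() -> System.out.println("dense tag: " + decode(encode(v, true)).isDense()));
        Thread b = new Thread(() -> System.out.println("dense tag: " + decode(encode(v, false)).isDense()));
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Because the type tag is read into a local and the result is stored on the
instance, two threads never share mutable state, and no synchronized block
is needed on the read path.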

  -jake
