I don't know of any situations where Vectors are used as keys. It hardly makes sense to use them that way, since they are so unwieldy. I suggest we change to plain Writable and come out ahead. As for the potential density improvement, it will be interesting to see what can typically be achieved.

r786323 just removed all calls to asWritableComparable, replacing them with asFormatString, which was correct anyway.

Shall I change the method to asWritable()?

Jeff

David Hall wrote:
How often does Mahout need the "Comparable" part for Vectors? Are
vectors commonly used as map output keys?

In terms of space efficiency, I'd bet it's probably a bit better than
a factor of two in the average case, especially for DenseVectors. The
Gson format stores both the int index and the double as raw
strings, plus whatever boundary characters. The Writable
implementation stores just the bytes of the double, plus a length.
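
A rough way to check that estimate is to encode the same values both ways and count bytes. The sketch below is only an approximation under stated assumptions: the class EncodingSizeSketch and its methods are hypothetical, the text layout is a generic JSON-like map rather than Mahout's actual Gson output, and the binary layout (a 4-byte length followed by 8 bytes per double) is assumed from the description above, not taken from the Vector code.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical helper for comparing encoded sizes; not part of Mahout.
public class EncodingSizeSketch {

    // Text-style encoding: every index and value is written as a decimal
    // string with quote, colon, and comma separators (assumed JSON-like
    // layout, not the real Gson output).
    static int textSize(double[] values) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(i).append("\":").append(values[i]);
        }
        sb.append('}');
        return sb.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    // Binary encoding: a 4-byte length followed by 8 raw bytes per double.
    static int binarySize(double[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        // Random doubles print with many digits, which is the case where
        // the text form is expected to be largest.
        double[] values = new double[1000];
        Random rng = new Random(42);
        for (int i = 0; i < values.length; i++) {
            values[i] = rng.nextDouble();
        }
        System.out.println("text bytes:   " + textSize(values));
        System.out.println("binary bytes: " + binarySize(values));
    }
}
```

The ratio depends heavily on the data: values that print with many digits favor the binary form, while short values such as 1.0 can come out smaller as text, so the numbers are worth measuring on real vectors.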

-- David

On Thu, Jun 18, 2009 at 2:13 PM, Jeff Eastman<j...@windwardsolutions.com> wrote:
+1. asWritableComparable is a simple implementation that uses asFormatString.
It would be good to rewrite it for internal communication. A factor of two
is still a factor of two.

Jeff


Grant Ingersoll wrote:
On Jun 18, 2009, at 4:45 PM, Ted Dunning wrote:

Writable should be plenty!

+1.  Still nice to have JSON for user facing though.

On Thu, Jun 18, 2009 at 1:15 PM, David Hall <d...@cs.stanford.edu> wrote:

See my followup on another thread (sorry for the schizophrenic
posting); Vector already implements Writable, so that's all I really
can ask of it. Is there something more you'd like? I'd be happy to do
it.






