Sounds much better to me.
On 12/26/07 7:53 AM, Eric Baldeschwieler [EMAIL PROTECTED] wrote:
With a secondary sort on the values during the shuffle, nothing would
need to be kept in memory, since it could all be counted in a single
scan. Right? Wouldn't that be a much more efficient
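A minimal sketch of the single-scan counting described above, assuming the values reach the reducer already sorted so that equal values are adjacent. The class and method names (SortedRunCounter, countRuns) are hypothetical and not part of Hadoop; only the current value and its running count are held in memory.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Objects;
import java.util.function.BiConsumer;

public class SortedRunCounter {

    /**
     * Counts each run of equal adjacent values in one pass.
     * Assumes the iterator yields values in sorted order; only the
     * current value and its count are kept, never the whole set.
     */
    public static <T> void countRuns(Iterator<T> sortedValues, BiConsumer<T, Long> emit) {
        if (!sortedValues.hasNext()) {
            return; // nothing to count
        }
        T current = sortedValues.next();
        long count = 1;
        while (sortedValues.hasNext()) {
            T next = sortedValues.next();
            if (Objects.equals(next, current)) {
                count++;                     // still inside the same run
            } else {
                emit.accept(current, count); // run ended, emit its total
                current = next;
                count = 1;
            }
        }
        emit.accept(current, count);         // emit the final run
    }

    public static void main(String[] args) {
        // Stand-in for the sorted value stream a reducer would see.
        Iterator<String> values = Arrays.asList("a", "a", "b", "c", "c", "c").iterator();
        countRuns(values, (value, n) -> System.out.println(value + "\t" + n));
    }
}
```

If the values were not sorted, the same totals would need a map keyed by value, which is the memory cost the secondary sort is meant to avoid.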
Have you looked at org.apache.hadoop.io.MapWritable?
---
Jim Kellerman, Senior Engineer; Powerset
-Original Message-
From: C G [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 19, 2007 11:59 AM
To: hadoop-user@lucene.apache.org
Subject: HashMap which can spill to disk for Hadoop?
Hi All:
You should also be able to get quite a bit of mileage out of special-purpose
HashMaps. In general, Java generic collections incur large to huge
penalties for certain special cases. If you have one of these special cases
or can put up with one, then you may be able to get 1+ order of magnitude