Hi Martin
As per your hint, I've taken some time to think about shrinking
hashmaps. As you said, it is hard to find a general solution. I do, in
fact, believe that we should not even try. I'm asking myself: What are
the scenarios where people remove entries on a big scale from large
hashmaps? I for one always end up throwing the maps away, not removing
things from them. Since I suspect these scenarios are rare, I also
suspect we won't find a large enough subset of them with compatible
requirements to warrant aiming for a general solution.
Instead, I propose to allow people to implement shrinking hashmaps
themselves on top of HashMap. This could be done by directly extending
HashMap, or by adding a new descendant, java.util.ShrinkableHashMap. I
have attempted the latter (in order to better show the changes) to see
what would be required.
The attached classes are just a sketch to invite further discussion.
There is ShrinkableHashMap, which would go into java.util so it can
access HashMap's package-private internals, and there are two demo user
classes that
implement specific shrinking strategies. (Note: The whole thing is
untested and the demos are contrived. I'd like some feedback before I
take this further.)
Key points of ShrinkableHashMap:
* Expose methods to adjust the capacity (several variants).
* Expose methods to query the current capacity and load factor.
* Expose an override point to be notified when the capacity changes.
* Expose an override point to be notified when the size is reduced
(requires a change to HashMap).
* Fall back to tightResize() when super.resize() fails for lack of memory.
I chose to add the tightResize() fallback because shrinking hashmaps is
something people might especially want to do when memory is low, so if
we can do it without temporarily needing extra memory for two tables, we
should.
Since we're talking strategies here, another approach might be to use
explicit strategy interfaces so people could supply implementations
thereof to (Shrinkable)HashMap. I haven't explored this yet.
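To make that a bit more concrete, such an interface might look roughly
like this (purely hypothetical; the name ShrinkStrategy and its signature
are invented for illustration and are not part of the attached classes):

package ch.arrenbrecht.java.util;

/**
 * Hypothetical shrink-strategy interface. A (Shrinkable)HashMap would
 * consult it after removals and resize to the returned capacity whenever
 * that is lower than the current one.
 */
public interface ShrinkStrategy {

    /**
     * Returns the capacity the map should shrink to, given its current
     * size, capacity and load factor; returning the current capacity
     * means "don't shrink".
     */
    int targetCapacity(int size, int capacity, float loadFactor);
}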
As you can see from the attached sketch, this change would open up a fair bit of HashMap's
heretofore non-public interface. You would know better whether this is
warranted, i.e. whether there is sufficient demand for being able to
shrink hashmaps.
Is this a direction to follow, do you think? If not, what would you suggest?
-Peter
ps. Here's the necessary change to HashMap (required so that removals
made through the key and entry sets are also reported to descendants):
@@ -591,7 +591,7 @@ public class HashMap<K,V>
table[i] = next;
else
prev.next = next;
- e.recordRemoval(this);
+ removed(e);
return e;
}
prev = e;
@@ -624,7 +624,7 @@ public class HashMap<K,V>
table[i] = next;
else
prev.next = next;
- e.recordRemoval(this);
+ removed(e);
return e;
}
prev = e;
@@ -632,6 +632,13 @@ public class HashMap<K,V>
}
return e;
+ }
+
+ /**
+ * Gives both map and entry descendants a chance to react to entry removals.
+ */
+ void removed(Entry<K,V> e) {
+ e.recordRemoval(this);
}
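
In a shrinking map, this hook would then presumably be wired up to the
sizeReduced() override point roughly like so (not part of the attached
sketch, which so far only calls sizeReduced() from clear(), and it would
compile only with the class actually placed in java.util):

    @Override
    void removed(Entry<K,V> e) {
        super.removed(e);
        sizeReduced();
    }
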
On Nov 21, 2007 10:59 PM, Peter Arrenbrecht <[EMAIL PROTECTED]> wrote:
> On Nov 21, 2007 9:58 PM, Peter Arrenbrecht <[EMAIL PROTECTED]> wrote:
> > Hi Martin
> >
> > > I would prefer to see engineering work go into something
> > > like auto-reduction of the table array when many elements
> > > have been removed, but that's a hard problem.
> >
> > Would you care to elaborate? At first glance, this seems not
> > especially hard to me, in particular since tightResize() would be able
> > to switch to a smaller array without causing a short spike of needing
> > more memory first. But I'm sure I have overlooked something here.
>
> Ah, you don't want it to oscillate. Is that it?
> -peter
>
package ch.arrenbrecht.java.util;

import java.util.HashMap;
import java.util.Map;

/**
 * Proposal for extensions to HashMap that would support deriving auto-shrinking
 * maps from it. This class does not by itself implement any such auto-shrinking
 * scheme, but provides the necessary methods and override points.
 *
 * @see #adjustCapacity(int)
 * @see #capacityChanged(int, int)
 * @see #sizeReduced()
 */
public class ShrinkableHashMap<K,V> extends HashMap<K,V> {
    private static final long serialVersionUID = 8959839469846943965L;

    public ShrinkableHashMap(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor);
        capacityChanged(0, capacity());
    }

    public ShrinkableHashMap(int initialCapacity) {
        super(initialCapacity);
        capacityChanged(0, capacity());
    }

    public ShrinkableHashMap() {
        super();
        capacityChanged(0, capacity());
    }

    public ShrinkableHashMap(Map<? extends K, ? extends V> m) {
        super(m);
        capacityChanged(0, capacity());
    }
    /**
     * Called whenever this map has been resized.
     */
    protected void capacityChanged(int oldCapacity, int newCapacity) {
    }

    /**
     * Called whenever this map's size has been reduced due to entry removals.
     */
    protected void sizeReduced() {
    }

    /**
     * Adjusts the capacity to the lowest supported value that obeys the
     * constraints imposed by the current size and the load factor.
     */
    public void adjustCapacityToSize() {
        adjustCapacityToSize(size());
    }
    /**
     * Adjusts the capacity to the lowest supported value that obeys the
     * constraints imposed by the given size and the load factor.
     */
    public void adjustCapacityToSize(int desiredSize) {
        adjustCapacity(minCapacityForSize(desiredSize));
    }

    /**
     * Adjusts the capacity to the lowest supported value that is greater
     * than or equal to <tt>desiredCapacity</tt>.
     */
    public void adjustCapacity(int desiredCapacity) {
        resize(allowableCapacity(desiredCapacity));
    }
    /**
     * Returns the current capacity.
     */
    @Override
    public int capacity() {
        return super.capacity();
    }

    /**
     * Returns the load factor used to determine when to grow and how much to
     * shrink this map.
     *
     * @see #put(Object, Object)
     * @see #adjustCapacityToSize()
     */
    @Override
    public float loadFactor() {
        return super.loadFactor();
    }
    // ---- supporting overrides

    @Override
    public void clear() {
        super.clear();
        sizeReduced();
    }

    // ---- private methods

    /**
     * Not sure if this should be accessible as well.
     */
    private int minCapacityForSize(int desiredSize) {
        return (int) Math.ceil(desiredSize / loadFactor);
    }
    /**
     * Returns the lowest capacity reachable by repeatedly halving the current
     * capacity that is still greater than or equal to <tt>desiredCapacity</tt>;
     * never returns more than the current capacity, and never less than 1.
     */
    private int allowableCapacity(int desiredCapacity) {
        int newCapacity = capacity();
        if (desiredCapacity == 0)
            desiredCapacity = 1;
        while (newCapacity > desiredCapacity) {
            int nextLower = newCapacity / 2;
            if (nextLower < desiredCapacity)
                break;
            newCapacity = nextLower;
        }
        return newCapacity;
    }
    /**
     * First attempts the faster version, which needs the old and the new
     * table simultaneously even when shrinking. If that fails for lack of
     * memory, falls back to a version that releases the old table before
     * allocating the new one, so both never need to exist at the same time.
     */
    @Override
    void resize(int newCapacity) {
        int oldCapacity = capacity();
        try {
            fastResize(newCapacity);
        } catch (OutOfMemoryError e) {
            tightResize(newCapacity);
        }
        capacityChanged(oldCapacity, newCapacity);
    }

    private void fastResize(int newCapacity) {
        super.resize(newCapacity);
    }

    private void tightResize(int newCapacity) {
        Entry head = joinLists();
        table = null; // release the old table before allocating the new one
        table = new Entry[newCapacity];
        threshold = (int) (newCapacity * loadFactor);
        transfer(head, table);
    }
    /**
     * Joins all bucket chains into a single chain and returns its head.
     */
    private Entry joinLists() {
        final Entry[] src = table;
        final int n = src.length;
        Entry head = null;
        int i = 0;
        // find the first non-empty bucket; its first entry becomes the head
        while (i < n && null == head) {
            head = src[i++];
        }
        // advance tail to the end of the head chain; otherwise appending the
        // next chain below would drop the rest of the head chain
        Entry tail = head;
        while (null != tail && null != tail.next) {
            tail = tail.next;
        }
        assert i >= n || null != tail;
        // append the remaining bucket chains one after the other
        while (i < n) {
            Entry e = src[i++];
            if (null != e) {
                tail.next = e;
                do {
                    tail = e;
                    e = e.next;
                } while (null != e);
            }
        }
        return head;
    }
    /**
     * Distributes the joined entry chain over the buckets of the target table.
     */
    private void transfer(Entry head, Entry[] tgt) {
        final int n = tgt.length;
        while (head != null) {
            Entry e = head;
            head = head.next;
            int i = indexFor(e.hash, n);
            e.next = tgt[i];
            tgt[i] = e;
        }
    }
}
package ch.arrenbrecht.util;

import java.util.Map;

import ch.arrenbrecht.java.util.ShrinkableHashMap;

public class AggressivelyShrinkingHashMapDescendant<K,V> extends ShrinkableHashMap<K,V> {
    private static final long serialVersionUID = 1486556544322229223L;

    public AggressivelyShrinkingHashMapDescendant() {
        super();
    }

    public AggressivelyShrinkingHashMapDescendant(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor);
    }

    public AggressivelyShrinkingHashMapDescendant(int initialCapacity) {
        super(initialCapacity);
    }

    public AggressivelyShrinkingHashMapDescendant(Map<? extends K, ? extends V> m) {
        super(m);
    }

    @Override
    protected void sizeReduced() {
        super.sizeReduced();
        // shrink as far as the current size and load factor allow, after every removal
        adjustCapacityToSize();
    }
}
package ch.arrenbrecht.util;

import java.util.Map;

import ch.arrenbrecht.java.util.ShrinkableHashMap;

public class SlowlyShrinkingHashMapDescendant<K,V> extends ShrinkableHashMap<K,V> {
    private static final long serialVersionUID = -5669759110544379547L;

    // no "= 0" initializer: capacityChanged() is already called from the
    // superclass constructors, and an explicit initializer would overwrite
    // the threshold it computed there
    int shrinkThreshold;

    public SlowlyShrinkingHashMapDescendant() {
        super();
    }

    public SlowlyShrinkingHashMapDescendant(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor);
    }

    public SlowlyShrinkingHashMapDescendant(int initialCapacity) {
        super(initialCapacity);
    }

    public SlowlyShrinkingHashMapDescendant(Map<? extends K, ? extends V> m) {
        super(m);
    }

    @Override
    protected void capacityChanged(int oldCapacity, int newCapacity) {
        super.capacityChanged(oldCapacity, newCapacity);
        // only shrink once the size falls below a quarter of the grow
        // threshold, so put/remove around a single boundary cannot make
        // the map oscillate between two capacities
        if (newCapacity > 8)
            shrinkThreshold = (int) ((newCapacity / 4) * loadFactor());
        else
            shrinkThreshold = 0;
    }

    @Override
    protected void sizeReduced() {
        super.sizeReduced();
        if (size() < shrinkThreshold) {
            adjustCapacity(capacity() / 2);
        }
    }
}
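
A tiny, equally untested driver for the slowly-shrinking demo could look
like the following (the ShrinkDemo class is invented for illustration, and
it assumes removals reach sizeReduced() via the removed() hook from the
HashMap patch above):

package ch.arrenbrecht.util;

public class ShrinkDemo {

    public static void main(String[] args) {
        SlowlyShrinkingHashMapDescendant<Integer, String> map =
                new SlowlyShrinkingHashMapDescendant<Integer, String>();

        // grow the map well past its default capacity
        for (int i = 0; i < 10000; i++)
            map.put(i, "value " + i);
        System.out.println("after fill:  size=" + map.size() + ", capacity=" + map.capacity());

        // drain it again; whenever the size drops below the shrink threshold,
        // the capacity should be halved
        for (int i = 0; i < 10000; i++)
            map.remove(i);
        System.out.println("after drain: size=" + map.size() + ", capacity=" + map.capacity());
    }
}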