Shi,
Be careful about calling it a buggy API, especially given the quality of
the code in your initial test case. Your bugs-per-LOC was pretty high.
However, it seems that you did in fact stumble onto a bug in the Spy client,
but only because you did no error checking at all.
Dustin,
while trying to re-create this problem and point out the various errors in
his code, I found that, in his test case, if I did not call Future.get() to
verify the result of the set, the spymemcached client leaked memory. Given
that the spymemcached wiki says that fire-and-forget is a valid mode of
usage, this appears to be a bug.
Here's my testcase against spymemcached-2.5.jar:
'java -cp .:./memcached-2.5.jar FutureResultLeak true' leaks memory and will
eventually die with an OutOfMemoryError.
'java -cp .:./memcached-2.5.jar FutureResultLeak false' does not leak and
runs to completion.
Here's the code. It's based on Shi's testcase so he and I now share the
blame for code quality :)
----------------------
import net.spy.memcached.MemcachedClient;
import java.net.InetSocketAddress;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FutureResultLeak {
    public static void main(String[] args) throws Exception {
        // Pass "true" to skip Future.get() (leaks); otherwise wait on each add.
        boolean leakMemory = false;
        if (args.length >= 1) {
            leakMemory = Boolean.parseBoolean(args[0]);
        }
        System.out.println("Testcase will "
                + (leakMemory ? "leak memory" : "not leak memory"));
        MemcachedClient mc = new MemcachedClient(
                new InetSocketAddress("localhost", 11211));
        mc.flush();
        System.out.println("Memcached flushed ...");
        int count = 0;
        int logInterval = 100000;
        int itemExpiryTime = 600; // seconds
        long intervalStartTime = System.currentTimeMillis();
        for (int i = 0; i < 6000000; i++) {
            String a = "String" + i;
            String b = "Value" + i;
            Future<Boolean> f = mc.add(a, itemExpiryTime, b);
            if (!leakMemory) {
                // Waiting on the result is the only difference between the two modes.
                f.get();
            }
            count++;
            if (count % logInterval == 0) {
                long elapsed = System.currentTimeMillis() - intervalStartTime;
                // elapsed is in milliseconds, so scale by 1000 to get items/sec.
                double itemsPerSec = logInterval * 1000.0 / elapsed;
                System.out.println(count + " elements added in " + elapsed
                        + " ms (" + itemsPerSec + " per sec).");
                intervalStartTime = System.currentTimeMillis();
            }
        }
        System.out.println("done " + count + " records inserted");
        mc.shutdown(60, TimeUnit.SECONDS);
    }
}
----------------------
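Until the leak is tracked down, one possible workaround is to keep only a
bounded window of unresolved futures and wait on the oldest one before
queuing more. The sketch below shows just that pattern using the JDK alone.
The class BoundedInFlight, the load() helper, and the single-thread executor
standing in for the memcached client are all made up for illustration. I have
not run this against spymemcached, so treat it as an assumption that waiting
on every future this way avoids the leak.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.IntFunction;

public class BoundedInFlight {
    /**
     * Issues `total` asynchronous stores but never holds more than
     * `maxInFlight` unresolved futures at once. Returns the largest
     * number of futures that were outstanding at any point.
     */
    static int load(int total, int maxInFlight,
                    IntFunction<Future<Boolean>> store) throws Exception {
        Deque<Future<Boolean>> pending = new ArrayDeque<>();
        int peak = 0;
        for (int i = 0; i < total; i++) {
            if (pending.size() == maxInFlight) {
                // Window is full: block on the oldest write before queuing more.
                pending.removeFirst().get(5, TimeUnit.SECONDS);
            }
            pending.addLast(store.apply(i));
            peak = Math.max(peak, pending.size());
        }
        // Drain whatever is still outstanding so every result gets checked.
        while (!pending.isEmpty()) {
            pending.removeFirst().get(5, TimeUnit.SECONDS);
        }
        return peak;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for an async client. With a real client the
        // lambda would issue the store call and return its Future instead.
        ExecutorService fakeClient = Executors.newSingleThreadExecutor();
        try {
            int peak = load(100000, 1000,
                    i -> fakeClient.submit(() -> Boolean.TRUE));
            System.out.println("peak in-flight futures: " + peak);
        } finally {
            fakeClient.shutdown();
        }
    }
}
```

The window size of 1000 is arbitrary. A smaller window bounds memory more
tightly, at the cost of stalling the producer more often.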
Regards,
Kelvin
On 17/10/10 12:28 AM, "Shi Yu" <[email protected]> wrote:
> I ran it with the following java command on a 64-bit Unix machine
> with 8G of memory. I separated the Map into three parts, and it still
> failed. TBH I think there is a bug in the spymemcached input
> method. With Whalin's API there is no problem with only a 2G heap;
> it is a little slower, but that's definitely better than being
> stuck for 6 hours on a bugged API.
>
> java -Xms4G -Xmx4G -classpath ./lib/spymemcached-2.5.jar Memcaceload
>
> Here is the error output:
>
> 2010-10-16 22:40:50.959 INFO net.spy.memcached.MemcachedConnection:
> Added {QA sa=ocuic32.research/192.168.136.36:11211, #Rops=0, #Wops=0,
> #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect
> queue
> Memchaced flushed ...
> Cache loader created ...
> 2010-10-16 22:40:50.989 INFO net.spy.memcached.MemcachedConnection:
> Connection state changed for sun.nio.ch.selectionkeyi...@25fa1bb6
> map1 loaded
> map2 loaded
> java.lang.OutOfMemoryError: Java heap space
> at sun.nio.cs.UTF_8.newEncoder(UTF_8.java:51)
> at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:215)
> at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:207)
> at java.lang.StringCoding.encode(StringCoding.java:266)
> at java.lang.String.getBytes(String.java:947)
> at net.spy.memcached.KeyUtil.getKeyBytes(KeyUtil.java:20)
> at net.spy.memcached.protocol.ascii.OperationImpl.setArguments(OperationImpl.java:86)
> at net.spy.memcached.protocol.ascii.BaseStoreOperationImpl.initialize(BaseStoreOperationImpl.java:48)
> at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:601)
> at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:582)
> at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:277)
> at net.spy.memcached.MemcachedClient.asyncStore(MemcachedClient.java:314)
> at net.spy.memcached.MemcachedClient.set(MemcachedClient.java:691)
> at net.spy.memcached.util.CacheLoader.push(CacheLoader.java:92)
> at net.spy.memcached.util.CacheLoader.loadData(CacheLoader.java:61)
> at net.spy.memcached.util.CacheLoader.loadData(CacheLoader.java:75)
> at MemchacedLoad.mapload(MemchacedLoad.java:90)
> at MemchacedLoad.main(MemchacedLoad.java:159)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>
> Shi
>
> On Sat, Oct 16, 2010 at 10:23 PM, Dustin <[email protected]> wrote:
>>
>> On Oct 16, 6:45 pm, Shi Yu <[email protected]> wrote:
>>> I have also tried the CacheLoader API, it pops a java GC error. The
>>> thing I haven't tried is to separate 6 million records into several
>>> objects and try CacheLoader. But I don't think it should be that
>>> fragile and complicated. I have spent a whole day on this issue; for now I
>>> am just relying on the hybrid approach to finish the work. But I would be
>>> very interested to hear any solution to this issue.
>>
>> I cannot make any suggestions as to why you got an error without
>> knowing what you did and what error you got.
>>
>> I would not expect the sample that you posted to work without a lot of
>> memory, tweaking, and a very fast network since you're just filling an
>> output queue as fast as java will allow you.
>>
>> You didn't share any code using CacheLoader, so I can only guess as
>> to how you may have used it to get an error. There are three
>> different methods you can use. Did you try to create a map with six
>> million values and then pass it to the CacheLoader API? That would
>> very likely give you an out of memory error.
>>
>> You could also be taxing the GC considerably by converting integers
>> to strings to compute modulus if your jvm doesn't do proper escape
>> analysis.
>>
>> I can assure you there's no magic that will make it fail to load six
>> million records through the API as long as you account for the
>> realities of your network (which CacheLoader does for you) and your
>> available memory.