Okay. I have to flush the useful data from my memcached server to run
the experiment again. The code of the method is as follows.
public static void mapload() throws Exception {
    MemcachedClient mc = new MemcachedClient(
        new InetSocketAddress("ocuic32.research", 11211));
    mc.flush();
    System.out.println("Memchaced flushed ...");
    CacheLoader cl = new CacheLoader(mc);
    System.out.println("Cache loader created ...");
    Map<String,String> map1 = new HashMap<String,String>();
    Map<String,String> map2 = new HashMap<String,String>();
    Map<String,String> map3 = new HashMap<String,String>();
    for (int i = 0; i < 1999999; i++) {
        map1.put("key" + i, "value" + i);
    }
    try {
        cl.loadData(map1);
        System.out.println("map1 loaded");
    } catch (Exception e1) {
        e1.printStackTrace();
    }
    map1 = null;
    for (int i = 2000000; i < 3999999; i++) {
        map2.put("key" + i, "value" + i);
    }
    try {
        cl.loadData(map2);
        System.out.println("map2 loaded");
    } catch (Exception e2) {
        e2.printStackTrace();
    }
    map2 = null;
    for (int i = 4000000; i < 5999999; i++) {
        map3.put("key" + i, "value" + i);
    }
    try {
        cl.loadData(map3);
        System.out.println("map3 loaded");
    } catch (Exception e3) {
        e3.printStackTrace();
    }
    map3 = null;
    System.out.println("All done");
}
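For comparison, here is a minimal sketch of generating the same key/value pairs in small batches instead of three HashMaps of roughly two million entries each, so that only one small batch is held by the loading code at a time. The BatchedLoad class, the BatchSink interface, and the batch size are hypothetical names made up for illustration and are not part of spymemcached. In the real program the sink would wrap whatever call stores a batch (for example cl.loadData(batch) from the method above), and whether this alone avoids the OutOfMemoryError depends on how quickly the client's queue drains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchedLoad {
    /** Hypothetical callback that stores one batch; not a spymemcached type. */
    public interface BatchSink {
        void store(Map<String, String> batch) throws Exception;
    }

    /**
     * Generates "key"+i -> "value"+i for i in [from, to) and hands the
     * entries to the sink in maps of at most batchSize entries.
     * Returns the number of batches delivered.
     */
    public static int loadInBatches(int from, int to, int batchSize, BatchSink sink)
            throws Exception {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        Map<String, String> batch = new HashMap<String, String>();
        for (int i = from; i < to; i++) {
            batch.put("key" + i, "value" + i);
            if (batch.size() == batchSize) {
                sink.store(batch);
                batches++;
                // Start a fresh map so this code no longer references the old one.
                batch = new HashMap<String, String>();
            }
        }
        if (!batch.isEmpty()) {
            // Deliver the final, partially filled batch.
            sink.store(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws Exception {
        final List<Integer> sizes = new ArrayList<Integer>();
        // The sink here only records batch sizes; a real one would store the batch.
        int n = loadInBatches(0, 25, 10, new BatchSink() {
            public void store(Map<String, String> batch) {
                sizes.add(batch.size());
            }
        });
        System.out.println(n + " batches, sizes " + sizes); // 3 batches, sizes [10, 10, 5]
    }
}
```

The half-open range [from, to) also makes it easy to cover every key without gaps between consecutive calls.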
I run it with the java command below on a 64-bit Unix machine
with 8 GB of memory. I split the Map into three parts and it still
failed. TBH I think there is a bug in the spymemcached input
method. With Whalin's API there is no problem at all with only a 2 GB
heap; it is just a little slower, but that's definitely better than
being stuck for 6 hours on a buggy API.
java -Xms4G -Xmx4G -classpath ./lib/spymemcached-2.5.jar Memcaceload
Here is the error output:
2010-10-16 22:40:50.959 INFO net.spy.memcached.MemcachedConnection:
Added {QA sa=ocuic32.research/192.168.136.36:11211, #Rops=0, #Wops=0,
#iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect
queue
Memchaced flushed ...
Cache loader created ...
2010-10-16 22:40:50.989 INFO net.spy.memcached.MemcachedConnection:
Connection state changed for sun.nio.ch.selectionkeyi...@25fa1bb6
map1 loaded
map2 loaded
java.lang.OutOfMemoryError: Java heap space
    at sun.nio.cs.UTF_8.newEncoder(UTF_8.java:51)
    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:215)
    at java.lang.StringCoding$StringEncoder.<init>(StringCoding.java:207)
    at java.lang.StringCoding.encode(StringCoding.java:266)
    at java.lang.String.getBytes(String.java:947)
    at net.spy.memcached.KeyUtil.getKeyBytes(KeyUtil.java:20)
    at net.spy.memcached.protocol.ascii.OperationImpl.setArguments(OperationImpl.java:86)
    at net.spy.memcached.protocol.ascii.BaseStoreOperationImpl.initialize(BaseStoreOperationImpl.java:48)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:601)
    at net.spy.memcached.MemcachedConnection.addOperation(MemcachedConnection.java:582)
    at net.spy.memcached.MemcachedClient.addOp(MemcachedClient.java:277)
    at net.spy.memcached.MemcachedClient.asyncStore(MemcachedClient.java:314)
    at net.spy.memcached.MemcachedClient.set(MemcachedClient.java:691)
    at net.spy.memcached.util.CacheLoader.push(CacheLoader.java:92)
    at net.spy.memcached.util.CacheLoader.loadData(CacheLoader.java:61)
    at net.spy.memcached.util.CacheLoader.loadData(CacheLoader.java:75)
    at MemchacedLoad.mapload(MemchacedLoad.java:90)
    at MemchacedLoad.main(MemchacedLoad.java:159)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Shi
On Sat, Oct 16, 2010 at 10:23 PM, Dustin <[email protected]> wrote:
>
> On Oct 16, 6:45 pm, Shi Yu <[email protected]> wrote:
>> I have also tried the CacheLoader API, it pops a java GC error. The
>> thing I haven't tried is to separate 6 million records into several
>> objects and try CacheLoader. But I don't think it should be that
>> fragile and complicated. I have spent a whole day on this issue, now I
>> just rely the hybrid approach to finish the work. But I would be very
>> interested to hear any solution to solve this issue.
>
> I cannot make any suggestions as to why you got an error without
> knowing what you did and what error you got.
>
> I would not expect the sample that you posted to work without a lot of
> memory, tweaking, and a very fast network, since you're just filling an
> output queue as fast as Java will allow you.
>
> You didn't share any code using CacheLoader, so I can only guess as
> to how you may have used it to get an error. There are three
> different methods you can use. Did you try to create a map with six
> million values and then pass it to the CacheLoader API? That would
> very likely give you an out of memory error.
>
> You could also be taxing the GC considerably by converting integers
> to strings to compute modulus if your jvm doesn't do proper escape
> analysis.
>
> I can assure you there's no magic that will make it fail to load six
> million records through the API as long as you account for the
> realities of your network (which CacheLoader does for you) and your
> available memory.
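
The point above about not filling the output queue faster than the network can drain it can be illustrated with a simple bound on in-flight writes. This is a standard-library sketch only: ThrottledWriter, maxInFlight, and the Runnable stand-in for a store operation are hypothetical names, and nothing here is meant to describe how CacheLoader or spymemcached implements throttling internally.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThrottledWriter {
    private final Semaphore permits;          // one permit per pending write
    private final ExecutorService io;         // stands in for the I/O side
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxSeen = new AtomicInteger();
    private final AtomicInteger completed = new AtomicInteger();

    ThrottledWriter(int maxInFlight, int ioThreads) {
        permits = new Semaphore(maxInFlight);
        io = Executors.newFixedThreadPool(ioThreads);
    }

    /** Blocks the producer while maxInFlight writes are already pending. */
    void submit(final Runnable write) throws InterruptedException {
        permits.acquire();
        int now = inFlight.incrementAndGet();
        maxSeen.accumulateAndGet(now, Math::max); // record the high-water mark
        io.execute(new Runnable() {
            public void run() {
                try {
                    write.run();
                } finally {
                    completed.incrementAndGet();
                    inFlight.decrementAndGet();
                    permits.release(); // lets the producer enqueue one more
                }
            }
        });
    }

    /** Waits up to a minute for pending writes, then returns how many completed. */
    int shutdownAndCount() throws InterruptedException {
        io.shutdown();
        io.awaitTermination(1, TimeUnit.MINUTES);
        return completed.get();
    }

    int maxInFlightSeen() {
        return maxSeen.get();
    }

    public static void main(String[] args) throws Exception {
        ThrottledWriter w = new ThrottledWriter(100, 4);
        final AtomicInteger stored = new AtomicInteger();
        for (int i = 0; i < 10000; i++) {
            // Stand-in for a real store call; a real write would go to the server.
            w.submit(() -> stored.incrementAndGet());
        }
        System.out.println(w.shutdownAndCount() + " writes completed, peak in flight "
            + w.maxInFlightSeen());
    }
}
```

Because the producer blocks in submit() once the limit is reached, the number of queued operations, and so the memory they hold, stays bounded no matter how many records are loaded in total.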