Hello,
I have a question about reading data that is cached in memory via HDFS Centralized Cache Management.

I cached the data that I want to use through the CLI (hdfs cacheadmin -addDirective ...).
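For reference, the commands I ran looked roughly like the following; the pool name "cache" matches the pool my code refers to below, and the path is just a placeholder:

  hdfs cacheadmin -addPool cache
  hdfs cacheadmin -addDirective -path /data/input -pool cache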

Then, when I write my MapReduce application, how can I read the cached data in memory?

 

Here is the source code from my MapReduce application.

 

 

  System.out.println("Ready
for loading data from Centralized Cache in DataNode");

  
System.out.println("Connecting
HDFS... at " + hdfsURI.toString());

  
DFSClient
dfs = new DFSClient(hdfsURI, new Configuration());

  
CacheDirectiveInfo
info = 

    
new
CacheDirectiveInfo.Builder().setPath(new Path("path in HDFS for cached
data")).setPool("cache").build();

  
CacheDirectiveEntry
cachedFile = dfs.listCacheDirectives(info).next();

  
System.out.println("We
got cachedFile! ID: " + 

  
cachedFile.getInfo().getId()
+ ", Path: " + cachedFile.getInfo().getPath() + ", CachedPool:
" + cachedFile.getInfo().getPool());

  


  
System.out.println("Open
DFSInputStream to read cachedFile to ByteBuffer");

  
DFSInputStream
in = dfs.open(cachedFile.getInfo().getPath().toString());

  
ElasticByteBufferPool
bufPool = new ElasticByteBufferPool();

  
ByteBuffer
buf = ByteBuffer.allocate(10000);

  
System.out.println("Generating
Off-Heap ByteBuffer! size: " + buf.capacity()); 

  
in.read(buf);

  
buf.flip();
// Flip: ready for reading data after writing data into buffer

  
System.out.println("Zero-Copying
cached file into buffer!");  

 

Is this the right source code for using the Centralized Cache Management feature?
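
In particular, I am not sure whether a plain in.read(buf) goes through the cache at all, or whether I should instead be using the enhanced ByteBuffer read API (FSDataInputStream.read(ByteBufferPool, int, EnumSet&lt;ReadOption&gt;), available since Hadoop 2.3.0, I believe). My current understanding is a sketch like the following, where the path, buffer size, and class name are placeholders of mine:

  import java.io.IOException;
  import java.nio.ByteBuffer;
  import java.util.EnumSet;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataInputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.ReadOption;
  import org.apache.hadoop.io.ElasticByteBufferPool;

  public class ZeroCopyReadSketch {
    public static void main(String[] args) throws IOException {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);

      // "/data/input" is a placeholder for the cached file's path
      FSDataInputStream in = fs.open(new Path("/data/input"));
      ElasticByteBufferPool bufPool = new ElasticByteBufferPool();

      // Enhanced ByteBuffer read: if the block is in the DataNode's cache,
      // the returned buffer should be an mmap of the cached block (zero-copy);
      // otherwise it falls back to a buffer taken from bufPool.
      ByteBuffer buf = in.read(bufPool, 4 * 1024 * 1024,
          EnumSet.of(ReadOption.SKIP_CHECKSUMS));
      if (buf != null) { // read() returns null at EOF
        // buf is already positioned for reading; no flip() needed
        System.out.println("Read " + buf.remaining() + " bytes");
        in.releaseBuffer(buf); // always hand the buffer back when done
      }
      in.close();
    }
  }

(I also notice that my code above creates an ElasticByteBufferPool but never passes it to read(), which is part of why I am unsure.)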

 

Thanks 


// Yoonmin Nam

