I am building my code using Lucene 4.7.1 and Hadoop 2.4.0. Here is what I am trying to do:

Create index
1. Build the index in a RAMDirectory based on data stored on HDFS.
2. Once built, copy the index onto HDFS.

Search index
1. Bring the index stored on HDFS into a RAMDirectory.
2. Perform the search on the in-memory index.

The error I am facing is:

    Exception in thread "main" java.io.EOFException: read past EOF: RAMInputStream(name=segments_2)
        at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:94)
        at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:67)
        at org.apache.lucene.store.ChecksumIndexInput.readByte(ChecksumIndexInput.java:41)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
        at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:326)
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
        at hdfs.SearchFiles.main(SearchFiles.java:85)

From the research I did, this may be due to index corruption. Below is my code.

Saving the index to HDFS:

    // Getting the files present in memory into an array.
    String fileList[] = rdir.listAll();

    // Reading index files from memory and storing them to HDFS.
    for (int i = 0; i < fileList.length; i++) {
        IndexInput indxfile = rdir.openInput(fileList[i].trim(), null);
        long len = indxfile.length();
        int len1 = (int) len;

        // Reading data from the file into a byte array.
        byte[] bytarr = new byte[len1];
        indxfile.readBytes(bytarr, 0, len1);

        // Creating a file in the HDFS directory with the same name as the index file.
        Path src = new Path(indexPath + fileList[i].trim());
        dfs.createNewFile(src);

        // Writing data from the byte array to the file in HDFS.
        FSDataOutputStream fs = dfs.create(
                new Path(dfs.getWorkingDirectory() + indexPath + fileList[i].trim()), true);
        fs.write(bytarr);
        fs.flush();
        fs.close();
    }
    FileSystem.closeAll();
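In case it helps, this is a stripped-down version of the copy loop I could switch to. It is only a sketch and assumes the same `rdir`, `dfs`, and `indexPath` variables as above, with `indexPath` naming an HDFS directory; the real changes are that each index file is written to exactly one HDFS path (built with `new Path(indexPath, name)`) and the buffer is sized from `IndexInput.length()`:

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.Path;
    import org.apache.lucene.store.IOContext;
    import org.apache.lucene.store.IndexInput;

    // Copy every file of the in-memory index to HDFS, one target path per file.
    for (String name : rdir.listAll()) {
        IndexInput in = rdir.openInput(name, IOContext.READONCE);
        try {
            // Lucene knows the exact file length; assume each file fits in memory.
            byte[] buf = new byte[(int) in.length()];
            in.readBytes(buf, 0, buf.length);

            // Overwrite (true) so stale copies from earlier runs cannot linger.
            FSDataOutputStream out = dfs.create(new Path(indexPath, name), true);
            try {
                out.write(buf);
            } finally {
                out.close();
            }
        } finally {
            in.close();
        }
    }

I am not sure whether this is the right way to ship a Lucene index to HDFS, so please correct me if the copy itself is the problem.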
Bringing the index from HDFS into a RAMDirectory and using it:

    // Creating a RAMDirectory (memory) object, to be able to build the index in memory.
    RAMDirectory rdir = new RAMDirectory();

    // Getting the list of index files present in the directory into an array.
    FSDataInputStream filereader = null;
    for (int i = 0; i < status.length; i++) {
        // Reading data from the index files on the HDFS directory into the filereader object.
        filereader = dfs.open(status[i].getPath());
        int size = filereader.available();

        // Reading data from the file into a byte array.
        byte[] bytarr = new byte[size];
        filereader.read(bytarr, 0, size);

        // Creating a file in the RAM directory with the same name as the index file
        // present in the HDFS directory.
        filenm = new String(status[i].getPath().toString());
        String sSplitValue = filenm.substring(57, filenm.length());
        System.out.println(sSplitValue);
        IndexOutput indxout = rdir.createOutput(sSplitValue, IOContext.DEFAULT);

        // Writing data from the byte array to the file in the RAM directory.
        indxout.writeBytes(bytarr, bytarr.length);
        indxout.flush();
        indxout.close();
    }
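And this is the read-back loop I was planning to try instead (a sketch using the same `dfs` and `status` array as above). It sizes the buffer from `FileStatus.getLen()` rather than `InputStream.available()`, which only reports the bytes readable without blocking and can truncate larger files, and it takes the Lucene file name from `Path.getName()` instead of a hard-coded substring offset:

    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.lucene.store.IOContext;
    import org.apache.lucene.store.IndexOutput;
    import org.apache.lucene.store.RAMDirectory;

    RAMDirectory rdir = new RAMDirectory();

    for (FileStatus stat : status) {
        // Exact file length from the HDFS metadata, not available().
        byte[] buf = new byte[(int) stat.getLen()];

        FSDataInputStream in = dfs.open(stat.getPath());
        try {
            in.readFully(0, buf);          // positional read of the whole file
        } finally {
            in.close();
        }

        // Recreate the file in RAM under its original Lucene name (e.g. segments_2).
        IndexOutput out = rdir.createOutput(stat.getPath().getName(), IOContext.DEFAULT);
        try {
            out.writeBytes(buf, buf.length);
        } finally {
            out.close();
        }
    }

After the loop I would open the index with DirectoryReader.open(rdir). Would sizing the buffers this way explain the "read past EOF" on segments_2, or is the corruption more likely to come from the copy to HDFS?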