Hi Raghu,

Many thanx for your reply.
The write takes approximately 11367 millisecs. The read takes approximately 1610565 millisecs (roughly 27 minutes). The file size is 68573254 bytes and the HDFS block size is 64 MB.

Here is the writer code:

    FileInputStream fis = null;
    OutputStream os = null;
    try {
        fis = new FileInputStream(new File(inputFile));
        os = dsmStore.insert(outputFile);
        byte[] data = new byte[4096];
        int bytesRead;
        // Write only the bytes actually read: the earlier version wrote the
        // full 4096-byte buffer on every pass (padding the tail of the file)
        // and flushed after every chunk, which defeats client-side buffering.
        while ((bytesRead = fis.read(data)) != -1) {
            os.write(data, 0, bytesRead);
        }
        os.flush();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (os != null) {
            try {
                os.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (fis != null) {
            try {
                fis.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

dsmStore.insert does the following:

    DistributedFileSystem fileSystem = new DistributedFileSystem();
    fileSystem.initialize(uri, conf);
    Path path = new Path(sKey);
    // writing:
    FSDataOutputStream dataOutputStream = fileSystem.create(path);
    return dataOutputStream;

Here is the reader code:

    byte[] data = new byte[4096];
    int totalBytesRead = 0;
    boolean fileNotReadFully = true;
    InputStream is = dsmStore.select(fileName);
    while (fileNotReadFully) {
        int bytesReadThisRead = 0;
        try {
            bytesReadThisRead = is.read(data);
            fileNotReadFully = (bytesReadThisRead != -1);
            // Count only real bytes: read() returns -1 at end of file, and
            // the earlier version added that -1 to the running total.
            if (bytesReadThisRead > 0) {
                totalBytesRead += bytesReadThisRead;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    if (is != null) {
        try {
            is.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

dsmStore.select does the following:

    DistributedFileSystem fileSystem = new DistributedFileSystem();
    fileSystem.initialize(uri, conf);
    Path path = new Path(sKey);
    FSDataInputStream dataInputStream = fileSystem.open(path);
    return dataInputStream;

I could probably try different buffer sizes etc.; two rough sketches of that are appended below the quoted text.

Thanx,
Taj


Raghu Angadi wrote:
>
> How slow is it? Maybe the code that reads is relevant too.
>
> Raghu.
>
> j2eeiscool wrote:
>> Hi,
>>
>> I am new to Hadoop. We are evaluating HDFS for use as a reliable,
>> distributed file system.
>>
>> From the tests I have run so far (1 name node + 1 data node on
>> different RHEL 4 machines, with the client running on the name node
>> machine):
>>
>> 1. The writes are very fast.
>>
>> 2. The read is very slow (reading a 68 MB file). Here is the sample
>> code; any ideas what could be going wrong?
>>
>>     public InputStream select(String sKey)
>>             throws RecordNotFoundException, IOException {
>>         DistributedFileSystem fileSystem = new DistributedFileSystem();
>>         fileSystem.initialize(uri, conf);
>>         Path path = new Path(sKey);
>>         FSDataInputStream dataInputStream = fileSystem.open(path);
>>         return dataInputStream;
>>     }
>>
>> Thanx,
>> Taj
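
P.S. Here is a minimal sketch (not our production code) of the write path using FileSystem.get and Hadoop's org.apache.hadoop.io.IOUtils.copyBytes, which does the read/write loop and the stream closing for you. The class name, HDFS URI and file paths are made-up placeholders:

    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsWriteSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // FileSystem.get returns the right FileSystem implementation for
            // the URI scheme, so there is no need to instantiate
            // DistributedFileSystem directly.
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000/"), conf);
            InputStream in = new FileInputStream("/tmp/input.dat");        // local source
            OutputStream out = fs.create(new Path("/user/taj/testfile"));  // HDFS target
            // copyBytes runs the read/write loop with the given buffer size
            // and closes both streams when the last argument is true.
            IOUtils.copyBytes(in, out, 4096, true);
        }
    }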
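
And a rough sketch of the buffer-size experiment on the read path, timing a full sequential read of the same file with a few buffer sizes; again the URI and path are placeholders:

    import java.io.InputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadBufferSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder URI and path; substitute the real cluster values.
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000/"),
                    new Configuration());
            Path path = new Path("/user/taj/testfile");
            // Read the whole file once per buffer size and report elapsed time.
            for (int bufSize : new int[] { 4096, 64 * 1024, 1024 * 1024 }) {
                byte[] buf = new byte[bufSize];
                InputStream in = fs.open(path);
                long start = System.currentTimeMillis();
                long total = 0;
                int n;
                while ((n = in.read(buf)) != -1) {
                    total += n;
                }
                in.close();
                long elapsed = System.currentTimeMillis() - start;
                System.out.println(bufSize + "-byte buffer: " + total
                        + " bytes in " + elapsed + " ms");
            }
        }
    }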