Hi Raghu,
Many thanx for your reply.
The write takes approximately 11367 millisecs.
The read takes approximately 1610565 millisecs.
The file size is 68573254 bytes, and the HDFS block size is 64 MB.
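For reference, that works out to roughly (my arithmetic, rounded a little):

    write: 68573254 bytes / 11.367 s   ~ 6.0 MB/s
    read:  68573254 bytes / 1610.565 s ~ 0.043 MB/s

so the read path is about 140x slower than the write path.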
Here is the Writer code. dsmStore.insert does the following:

    DistributedFileSystem fileSystem = new DistributedFileSystem();
    fileSystem.initialize(uri, conf);
    Path path = new Path(sKey);
    // writing:
    FSDataOutputStream dataOutputStream = fileSystem.create(path);
    return dataOutputStream;

and the copy loop is:

    FileInputStream fis = null;
    OutputStream os = null;
    try {
        fis = new FileInputStream(new File(inputFile));
        os = dsmStore.insert(outputFile);
        byte[] data = new byte[4096];
        int bytesRead;
        // write only the bytes actually read; writing the whole buffer on a
        // short read would corrupt the tail of the file
        while ((bytesRead = fis.read(data)) != -1) {
            os.write(data, 0, bytesRead);
        }
        // flush once at the end instead of after every 4 KB chunk
        os.flush();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (os != null) {
            try {
                os.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (fis != null) {
            try {
                fis.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
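As an aside, if your Hadoop version ships org.apache.hadoop.io.IOUtils, the
whole try/finally copy above collapses to one call. A sketch, assuming the
same dsmStore helper (check that copyBytes exists in your release):

    import org.apache.hadoop.io.IOUtils;

    FileInputStream fis = new FileInputStream(new File(inputFile));
    OutputStream os = dsmStore.insert(outputFile);
    // buffered copy with a 4096-byte buffer; the final 'true' closes
    // both streams when the copy finishes (or fails)
    IOUtils.copyBytes(fis, os, 4096, true);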
Here is the Reader code. dsmStore.select does the following:

    DistributedFileSystem fileSystem = new DistributedFileSystem();
    fileSystem.initialize(uri, conf);
    Path path = new Path(sKey);
    FSDataInputStream dataInputStream = fileSystem.open(path);
    return dataInputStream;

and the read loop is:

    byte[] data = new byte[4096];
    int totalBytesRead = 0;
    boolean fileNotReadFully = true;
    InputStream is = dsmStore.select(fileName);
    while (fileNotReadFully) {
        try {
            int bytesReadThisRead = is.read(data);
            fileNotReadFully = (bytesReadThisRead != -1);
            if (fileNotReadFully) {
                // only count real bytes; read() returns -1 at end of file,
                // which must not be added to the total
                totalBytesRead += bytesReadThisRead;
            }
        } catch (Exception e) {
            e.printStackTrace();
            // bail out rather than spin forever if the stream keeps failing
            fileNotReadFully = false;
        }
    }
    if (is != null) {
        try {
            is.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
I could probably try different buffer sizes, etc.
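For example, something like this (just a sketch; the 64 KB figures are
guesses to experiment with, not tuned values):

    // larger read buffer, plus client-side buffering on top of the HDFS stream
    InputStream is = new BufferedInputStream(dsmStore.select(fileName), 64 * 1024);
    byte[] data = new byte[64 * 1024];
    long total = 0;
    long start = System.currentTimeMillis();
    int n;
    while ((n = is.read(data)) != -1) {
        total += n;
    }
    is.close();
    System.out.println(total + " bytes in "
            + (System.currentTimeMillis() - start) + " millisecs");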
Thanx,
Taj
Raghu Angadi wrote:
>
>
> How slow is it? Maybe the code that reads it is relevant too.
>
> Raghu.
>
> j2eeiscool wrote:
>> Hi,
>>
>> I am new to Hadoop. We are evaluating HDFS for use as a reliable,
>> distributed file system.
>>
>> From the tests I have run so far (1 namenode + 1 datanode, on different
>> RHEL 4 machines, with the client running on the namenode machine):
>>
>> 1. The writes are very fast.
>>
>> 2. The read is very slow (reading a 68 MB file). Here is the sample code.
>> Any ideas what could be going wrong?
>>
>>
>> public InputStream select(String sKey) throws RecordNotFoundException,
>>         IOException {
>>     DistributedFileSystem fileSystem = new DistributedFileSystem();
>>     fileSystem.initialize(uri, conf);
>>     Path path = new Path(sKey);
>>     FSDataInputStream dataInputStream = fileSystem.open(path);
>>     return dataInputStream;
>> }
>>
>> Thanx,
>> Taj
>>
>>
>
>
>