And I also ran the random read benchmark provided in https://issues.apache.org/jira/browse/HDFS-236. Here is the result:

09/07/21 14:55:45 INFO mapred.FileInputFormat: Date & time: Tue Jul 21 14:55:45 CST 2009
09/07/21 14:55:45 INFO mapred.FileInputFormat: Number of files: 10
09/07/21 14:55:45 INFO mapred.FileInputFormat: Total MBytes processed: 39
09/07/21 14:55:45 INFO mapred.FileInputFormat: Throughput mb/sec: 3.113045903729678
09/07/21 14:55:45 INFO mapred.FileInputFormat: Average IO rate mb/sec: 3.133864402770996
09/07/21 14:55:45 INFO mapred.FileInputFormat: IO rate std deviation: 0.25014845326868934
09/07/21 14:55:45 INFO mapred.FileInputFormat: Test exec time sec: 21.318
09/07/21 14:55:45 INFO mapred.FileInputFormat: Read Ops Time: 12548
09/07/21 14:55:45 INFO mapred.FileInputFormat: Num of files per map: 1
09/07/21 14:55:45 INFO mapred.FileInputFormat: Total File Size: 10737418240
09/07/21 14:55:45 INFO mapred.FileInputFormat: Offset Distribution: random
09/07/21 14:55:45 INFO mapred.FileInputFormat: Random Read Method: pread
09/07/21 14:55:45 INFO mapred.FileInputFormat: Total Read Operations: 10000
09/07/21 14:55:45 INFO mapred.FileInputFormat: Time for each read : 1.25
09/07/21 14:55:45 INFO mapred.FileInputFormat: Num Reads Per Sec: 796.94
09/07/21 14:55:45 INFO mapred.FileInputFormat: Avg Read Size: 4096
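As a quick sanity check (a standalone snippet, not part of the benchmark), the derived figures in the log -- "Time for each read" and "Num Reads Per Sec" -- follow directly from "Read Ops Time" (12548 ms) and "Total Read Operations" (10000):

```java
public class RandomReadMetrics {
    public static void main(String[] args) {
        long readOpsTimeMs = 12548; // "Read Ops Time" from the log above
        long totalReads = 10000;    // "Total Read Operations"

        // 12548 ms / 10000 reads = 1.2548 ms, reported as 1.25
        double msPerRead = (double) readOpsTimeMs / totalReads;
        // 10000 reads / 12.548 s = 796.94 reads/sec
        double readsPerSec = totalReads / (readOpsTimeMs / 1000.0);

        System.out.printf("Time for each read : %.2f ms%n", msPerRead); // 1.25
        System.out.printf("Num Reads Per Sec  : %.2f%n", readsPerSec);  // 796.94
    }
}
```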
From this data, my result seems reasonable now... But should the two read APIs have so great a difference? Thanks.

Martin Mituzas wrote:
>
> I dived into the code and did a time breakdown for the two read APIs.
> The poor-performing read invokes the method
>
>     fetchBlockByteRange(LocatedBlock block, long start, long end, byte[] buf, int offset)
>
> For the total 100 seconds of read time, I measured the time spent in this
> method: about 99686 ms.
> Within this method, I measured the real read time on the following line:
>
>     int nread = reader.readAll(buf, offset, len);
>
> The total real read time is about 35053 ms, and about 64633 ms is spent on
> choosing a datanode, connecting to the datanode, and creating a new
> BlockReader.
>
> And for the first read API, the total time spent on the following line is
> about 95143 ms:
>
>     int result = readBuffer(buf, off, realLen);
>
> So based on these data, sequential read throughput should be ~3x that of
> random read. But my test result is about 17x. Any other reason?
>
> Martin Mituzas wrote:
>>
>> Waiting for response...
>> Thanks in advance.
>>
>> Martin Mituzas wrote:
>>>
>>> hi, all
>>> I see there are two read methods in DFSInputStream:
>>>
>>>     int read(byte buf[], int off, int len)
>>>     int read(long position, byte[] buffer, int offset, int length)
>>>
>>> And I used the following code to test the read performance.
>>> Before the test I generate some files in the directory DATA_DIR, then I run
>>> this function for some time and calculate the read throughput.
>>> The initFiles() function is borrowed from the patch in
>>> https://issues.apache.org/jira/browse/HDFS-236.
>>> My question is: I tried the above two read methods (see the commented lines)
>>> and found the throughputs have a huge difference. The results are attached
>>> below. Is there something wrong with my code? I can't believe there
>>> can be such a big difference...
>>> And in https://issues.apache.org/jira/browse/HDFS-236, I saw the
>>> following performance data posted by Raghu Angadi:
>>>
>>>     Description of read                        Time for each read in ms
>>>     1000 native reads over block files         09.5
>>>     Random Read 10x500                         10.8
>>>     Random Read without CRC                    10.5
>>>     Random Read with 'seek() and read()'       12.5
>>>     Read with sequential offsets               01.7
>>>     1000 native reads without closing files    07.5
>>>
>>> So based on this data, sequential read is about 6x faster than random
>>> read, which is reasonable, and my data seems unreasonable. Can anybody
>>> provide some comments?
>>>
>>> Here is my test result.
>>>
>>> With the first read:
>>>
>>>     test type,read size,read ops,start time,end time,test time,real read time,throughput
>>>     sequence read,10274603008,2508448,[2009-07-21 13:47:11 229],[2009-07-21 13:48:51 229],100,100,97.99
>>>
>>> With the second read:
>>>
>>>     test type,read size,read ops,start time,end time,test time,real read time,throughput
>>>     sequence read,592449536,144641,[2009-07-21 13:23:52 464],[2009-07-21 13:25:32 465],100,100,5.65
>>>
>>> My cluster: 1 name node + 3 data nodes, replication = 3.
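The throughput column in the two quoted result lines is consistent with the bytes read over the 100-second window, using the same formula as the `String.format` expression in the poster's test (bytes / seconds / 1024*1024). A quick standalone check:

```java
public class ThroughputCheck {
    // MB/s as computed in the quoted test code: bytes / seconds / (1024*1024)
    static double mbPerSec(long bytes, long seconds) {
        return (double) bytes / seconds / (1024 * 1024);
    }

    public static void main(String[] args) {
        // First read API: 10274603008 bytes in 100 s
        System.out.printf("%.2f%n", mbPerSec(10274603008L, 100)); // 97.99
        // Second read API (pread): 592449536 bytes in 100 s
        System.out.printf("%.2f%n", mbPerSec(592449536L, 100));   // 5.65
    }
}
```

The ratio of the two, 97.99 / 5.65 ≈ 17.3, is the "about 17x" gap mentioned in the follow-up, so the reported throughputs are at least internally consistent with the byte counts.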
>>> And my code:
>>>
>>>     private void sequenceRead(long time) throws IOException {
>>>         byte[] data = new byte[bufferSize];
>>>         Random rand = new Random();
>>>         initFiles(DATA_DIR);
>>>         long period = time * 1000;
>>>         FSDataInputStream in = null;
>>>         long totalSize = 0;
>>>         long readCount = 0;
>>>         long offset = 0;
>>>         int index = (rand.nextInt() & Integer.MAX_VALUE) % fileList.size();
>>>         if (barrier()) {
>>>             start = System.currentTimeMillis();
>>>             while (System.currentTimeMillis() - start < period) {
>>>                 if (in == null) {
>>>                     FileInfo file = (FileInfo) fileList.get(index);
>>>                     in = file.fileStream;
>>>                     if (in == null) {
>>>                         in = fs.open(file.filePath);
>>>                         file.fileStream = in;
>>>                     }
>>>                     index = (index ++) % fileList.size();
>>>                 }
>>>                 long actualSize = in.read(offset, data, 0, bufferSize);
>>>                 //long actualSize = in.read(data, 0, bufferSize);
>>>                 readCount ++;
>>>
>>>                 if (actualSize > 0) {
>>>                     totalSize += actualSize;
>>>                     offset += actualSize;
>>>                 }
>>>                 if (actualSize < bufferSize) {
>>>                     //in.seek(0);
>>>                     in = null;
>>>                     offset = 0;
>>>                 }
>>>             }
>>>             out.close();
>>>             end = System.currentTimeMillis();
>>>
>>>             for (FileInfo finfo : fileList) {
>>>                 if (finfo.fileStream != null)
>>>                     IOUtils.closeStream(finfo.fileStream);
>>>             }
>>>             System.out.println("test type,read size,read ops,start time,end time,test time,real read time,throughput");
>>>             String s = String.format("sequence read,%d,%d,[%s],[%s],%d,%d,%.2f",
>>>                 totalSize,
>>>                 readCount,
>>>                 sdf.format(new Date(start)),
>>>                 sdf.format(new Date(end)),
>>>                 time,
>>>                 (end - start) / 1000,
>>>                 (double) (totalSize * 1000) / (double) ((end - start) * 1024 * 1024));
>>>             System.out.println(s);
>>>         }
>>>     }

--
View this message in context: http://www.nabble.com/HDFS-random-read-performance-vs-sequential-read-performance---tp24565264p24582739.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
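One thing worth noting about the quoted test code, independent of the HDFS question: the line `index = (index ++) % fileList.size();` never advances the index, because a Java post-increment expression evaluates to the old value, which the assignment then writes back over the incremented variable. A minimal standalone demonstration:

```java
public class PostIncrementPitfall {
    public static void main(String[] args) {
        int size = 10;
        int index = 3;

        // index++ yields the old value (3); index briefly becomes 4, but the
        // assignment then stores 3 % 10 = 3 back into index.
        index = (index++) % size;
        System.out.println(index); // 3 (unchanged)

        // The intended rotation:
        index = (index + 1) % size;
        System.out.println(index); // 4
    }
}
```

So the loop keeps reopening the same randomly chosen file rather than rotating through `fileList`. That alone does not explain a 17x sequential-vs-pread gap, but it is worth fixing before comparing numbers across runs.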
