[jira] [Commented] (HDFS-4190) Read complete block into memory once in BlockScanning and reduce concurrent disk access

Uma Maheswara Rao G (JIRA) Thu, 15 Nov 2012 11:00:14 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498246#comment-13498246
 ]


Uma Maheswara Rao G commented on HDFS-4190:
-------------------------------------------

Could you point me to the JIRA for #b? Thanks, Todd. That way I can try to make 
use of it in this case.
                
> Read complete block into memory once in BlockScanning and reduce concurrent 
> disk access
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-4190
>                 URL: https://issues.apache.org/jira/browse/HDFS-4190
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 3.0.0
>            Reporter: Uma Maheswara Rao G
>
> When performing bulk write operations to DFS, we observed that block scanning 
> is a bottleneck for concurrent disk access.
> To see the real load on the disks, run a single data node with a local client 
> flushing data to DFS.
> With block scanning switched off, we saw a >10% improvement. I will 
> post the real figures in a comment.
> Even though I am doing only write operations, there is implicitly a read 
> operation for each block because of block scanning. The next scan happens only 
> after 21 days, but one scan happens after the block is added. That scan is 
> the concurrent access to the disks.
> Another point to note is that block scanning also reads the block packet by 
> packet. Since we know we have to read and scan the complete block anyway, 
> might it be better to load the complete block once and verify the checksums 
> against that data?
> I tried this with memory-mapped buffers: 
> I mapped the complete block once in block scanning and did the checksum 
> verification against that mapping. I saw a good improvement in the bulk write 
> scenario.
> However, there is no API to clean up the mapped buffer immediately. For my 
> experiment I just used the Cleaner class from the sun package, which would not 
> be correct to use in production. So we would have to write a JNI call to clean 
> up the mmapped buffer.
> I am not sure whether I have missed something here. Please correct me if I 
> have missed any points.
> Thoughts?
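
For illustration only, the "map the complete block once, then verify checksums" idea from the description could be sketched roughly as below. This is not the patch or the DataNode block scanner code: the class and method names (MappedBlockChecksum, checksumsViaMmap), the 512-byte chunk size, and the use of java.util.zip.CRC32 are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical helper, not part of HDFS.
final class MappedBlockChecksum {
    // Assumed chunk size for the example; the real value comes from configuration.
    static final int BYTES_PER_CHECKSUM = 512;

    /** Maps the whole file once and returns one CRC32 value per chunk. */
    static long[] checksumsViaMmap(Path blockFile) throws IOException {
        try (FileChannel ch = FileChannel.open(blockFile, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size == 0) {
                return new long[0];
            }
            if (size > Integer.MAX_VALUE) {
                throw new IOException("File too large for a single mapping");
            }
            // Single mapping of the complete block instead of packet-by-packet reads.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, size);
            int chunks = (int) ((size + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM);
            long[] sums = new long[chunks];
            CRC32 crc = new CRC32();
            for (int i = 0; i < chunks; i++) {
                int len = Math.min(BYTES_PER_CHECKSUM, buf.remaining());
                ByteBuffer chunk = buf.slice();   // view starting at the current position
                chunk.limit(len);                 // restrict the view to one chunk
                crc.reset();
                crc.update(chunk);                // checksum straight from the mapping
                sums[i] = crc.getValue();
                buf.position(buf.position() + len);
            }
            // Note: closing the channel does not unmap the buffer. The mapping is
            // released only when the buffer is garbage collected, which is the
            // cleanup problem raised in the description.
            return sums;
        }
    }
}
```

A scanner built along these lines would compare each returned value with the stored checksum for the same chunk. The sketch deliberately leaves out the unmapping step, since the standard API has no explicit call for it.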

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
