[ 
https://issues.apache.org/jira/browse/HDFS-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057618#comment-13057618
 ] 

Uma Maheswara Rao G commented on HDFS-1640:
-------------------------------------------

Hi Harsh,
 Thanks for the comments.

{quote}
@Option 2 – This can be done at the writer level itself. Much too expensive 
maintaining a per-file property for compression and implementing the same to be 
guaranteed as well.
{quote}
  I agree with you.

bq. Besides, compressing on the fly for tx would still invoke the same amount 
of IO. No savings, right?
  Here the amount of data transferred over the network becomes much smaller. 
In our tests, we noticed that we are constrained by network IO, so with this 
feature alone we could achieve good performance results. 
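
  To illustrate the point about bytes on the wire, here is a minimal sketch. It is not taken from our test setup: the class name WireSizeSketch, the sample payload, and the choice of gzip from java.util.zip are all assumptions made for illustration. The actual saving depends entirely on how compressible the data is.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not HDFS code.
public class WireSizeSketch {

    /** Gzip-compresses a buffer in memory; stands in for what a client would put on the wire. */
    public static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(raw);
        }
        return sink.toByteArray();
    }

    /** Builds a made-up, highly repetitive log-like payload (an assumption, not real data). */
    public static byte[] samplePayload(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("INFO datanode block received id=").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = samplePayload(10_000);
        byte[] wire = gzip(raw);
        // Compare how many bytes would cross the network with and without compression.
        System.out.printf("raw=%d bytes, compressed=%d bytes%n", raw.length, wire.length);
    }
}
```

  For already-compressed or random data the second number can be as large as the first, so this only helps where the payload is compressible and the network is the bottleneck.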
  



> Transmission of large files in compressed format will save the network 
> traffic and can improve the data transfer time.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1640
>                 URL: https://issues.apache.org/jira/browse/HDFS-1640
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>            Reporter: Uma Maheswara Rao G
>
> *In the write scenario:*
> DFSClient can compress the data when transmitting it over the network. Data 
> Nodes can forward the same compressed data to the other Data Nodes in the 
> pipeline, and each can decompress that data and write it to its local disk. 
> *In the read scenario:*
> A Data Node can compress the data when transmitting it over the network. 
> DFSClient can decompress it and write it to the local stream.
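
The write scenario quoted above can be sketched roughly as follows. This is only a toy model under stated assumptions: FakeDataNode, its in-memory "disk", and the use of gzip are hypothetical and are not the HDFS data transfer protocol or any existing Hadoop class. It shows the one idea from the description: the compressed packet is forwarded unchanged down the pipeline, and each node decompresses a copy for its own local write.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not HDFS code.
public class CompressedPipelineSketch {

    /** Stand-in for a Data Node; the "local disk" is just an in-memory buffer. */
    public static class FakeDataNode {
        private final ByteArrayOutputStream disk = new ByteArrayOutputStream();
        private final FakeDataNode next; // null for the last node in the pipeline

        public FakeDataNode(FakeDataNode next) {
            this.next = next;
        }

        /** Forwards the compressed packet as-is, then writes the decompressed bytes locally. */
        public void receive(byte[] compressedPacket) throws IOException {
            if (next != null) {
                next.receive(compressedPacket); // no recompression between nodes
            }
            disk.write(decompress(compressedPacket));
        }

        public byte[] storedBytes() {
            return disk.toByteArray();
        }
    }

    /** Client side: compress before sending. */
    public static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(raw);
        }
        return sink.toByteArray();
    }

    /** Receiver side: decompress before writing to the local stream or disk. */
    public static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Pipeline of three fake nodes: first -> second -> third.
        FakeDataNode third = new FakeDataNode(null);
        FakeDataNode second = new FakeDataNode(third);
        FakeDataNode first = new FakeDataNode(second);

        byte[] data = "some block data".getBytes(StandardCharsets.UTF_8);
        first.receive(compress(data)); // the client compresses once

        System.out.println(Arrays.equals(data, third.storedBytes()));
    }
}
```

The read scenario is the mirror image in this model: the sending node would call compress and the client would call decompress before writing to its local stream.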

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

