[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2015-08-27 Thread Yong Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716577#comment-14716577
 ] 

Yong Zhang commented on HDFS-4412:
--

Why not try IO throttling in a fire mode, like HADOOP-9640?

 Support HDFS IO throttling
 --

 Key: HDFS-4412
 URL: https://issues.apache.org/jira/browse/HDFS-4412
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Zhenxiao Luo

 When applications upload files to or download files from HDFS clusters, it
 would be nice if the IO could be throttled so that it does not exceed a
 specified maximum bandwidth.
 Two options to implement this IO throttling:
 #1. IO throttling happens at the FSDataInputStream and FSDataOutputStream
 level.
 Add an IO throttler to FSDataInputStream/FSDataOutputStream, and whenever a
 read/write happens, throttle it first (if a throttler is set), then do the
 actual read/write.
 We may need to add new FileSystem APIs that take an IO throttler as an input
 parameter.
 #2. IO throttling happens at the application level.
 Instead of changing FSDataInputStream/FSDataOutputStream, all IO
 throttling is done at the application level.
 In this approach, the FileSystem API remains unchanged.
 In either case, an IO throttler interface is needed, with a method such as:
 public void throttle(long numOfBytes);
 The current DataTransferThrottler could be an implementation of this IO
 throttler interface.
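
 A minimal sketch of what such an interface could look like, with one
 possible implementation. Only the throttle(long numOfBytes) signature comes
 from the description above; the names IOThrottler and AverageRateThrottler
 and the averaging strategy are assumptions made for illustration, and this
 is not the DataTransferThrottler code.

```java
/** Hypothetical interface from the description above; not an existing Hadoop API. */
interface IOThrottler {
    /** Blocks as long as needed so that the caller stays under its bandwidth limit. */
    void throttle(long numOfBytes);
}

/**
 * Illustrative implementation (an assumption, not Hadoop code): keeps the
 * average rate since construction at or below bytesPerSecond.
 */
final class AverageRateThrottler implements IOThrottler {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalBytes = 0;

    AverageRateThrottler(long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public synchronized void throttle(long numOfBytes) {
        totalBytes += numOfBytes;
        // Earliest elapsed time at which totalBytes may have been transferred.
        long allowedMillis = totalBytes * 1000L / bytesPerSecond;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long waitMillis = allowedMillis - elapsedMillis;
        if (waitMillis > 0) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller instead of swallowing it.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

 With a limit of 10,000 bytes per second, two calls of throttle(1000) should
 together take roughly 200 ms. Because the sketch averages over the whole
 lifetime of the object, a stream that has been idle can burst afterwards; a
 windowed scheme would avoid that at the cost of more state.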



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2015-08-27 Thread Yong Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716580#comment-14716580
 ] 

Yong Zhang commented on HDFS-4412:
--

sorry, fair mode



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-03-01 Thread Denis Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590743#comment-13590743
 ] 

Denis Petrov commented on HDFS-4412:


It would be nice to throttle on a per-datanode basis, limiting not the
bandwidth of the current stream but the IO bandwidth on the bottleneck
datanode.

If several throttled writes go to the same datanode, the throttling threshold
should be adjusted.
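
One way to picture the adjustment is bookkeeping that splits a fixed
per-datanode cap evenly among the throttled streams currently writing to that
datanode. The sketch below is only an illustration under that assumption: the
class DatanodeBandwidthShares, its method names, and the equal split are
hypothetical and do not correspond to any existing HDFS API.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical bookkeeping, not an HDFS class: divides one bandwidth cap per
 * datanode equally among the streams currently registered against it.
 */
final class DatanodeBandwidthShares {
    private final long capBytesPerSecond;
    private final Map<String, Integer> activeStreams = new HashMap<>();

    DatanodeBandwidthShares(long capBytesPerSecond) {
        if (capBytesPerSecond <= 0) {
            throw new IllegalArgumentException("cap must be positive");
        }
        this.capBytesPerSecond = capBytesPerSecond;
    }

    /** Records that one more throttled stream targets the given datanode. */
    synchronized void streamOpened(String datanodeId) {
        activeStreams.merge(datanodeId, 1, Integer::sum);
    }

    /** Records that a stream finished; unknown datanodes are ignored. */
    synchronized void streamClosed(String datanodeId) {
        // Returning null from the remapping function removes the entry.
        activeStreams.computeIfPresent(datanodeId, (id, count) -> count > 1 ? count - 1 : null);
    }

    /** Per-stream limit: the datanode cap divided by its active stream count. */
    synchronized long allowedBytesPerSecond(String datanodeId) {
        int count = activeStreams.getOrDefault(datanodeId, 0);
        return capBytesPerSecond / Math.max(1, count);
    }
}
```

Each stream would re-read allowedBytesPerSecond before throttling, so its
threshold shrinks as more writers arrive at the same datanode and grows back
when they finish.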



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-03-01 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590824#comment-13590824
 ] 

Colin Patrick McCabe commented on HDFS-4412:


I think you need to decide what problem you're trying to solve.  Are you trying 
to implement cluster-wide QoS (quality of service)?  Are you trying to avoid a 
problem with hot spots in the network?  (And if so, have you quantified how 
big a problem that really is?)



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-03-01 Thread Denis Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590895#comment-13590895
 ] 

Denis Petrov commented on HDFS-4412:


I am trying to avoid a problem with hot spots in the disk IO.
In particular, during major compaction of Accumulo tablets 
(https://issues.apache.org/jira/browse/ACCUMULO-1128).

This problem is difficult to solve at the application level, because the actual
disk IO can be (and often is) performed on another server, not on the server
that runs the Accumulo tablet server doing the compaction.
Two or three compactions running on different servers can result in heavy disk
IO on the same HDFS datanode, degrading query performance and latency.



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-01-17 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556461#comment-13556461
 ] 

Zhenxiao Luo commented on HDFS-4412:


Thanks Alejandro and Daryn.

I am thinking of providing an IOThrottler interface, which 
DataTransferThrottler implements.

An app-side change would extend IOUtils.copyBytes with an additional
IOThrottler parameter, which does the throttling when users run DFSShell -put
or -get.

Cluster-side throttling might need additional FileSystem open() and create()
APIs that take an IOThrottler parameter and place the IOThrottler in
FSDataInputStream/FSDataOutputStream.

As Alejandro said, if we want to enforce throttling for the cluster, we should
go cluster side; if we only want application throttling, we could go app side.
Or maybe, in general, we could support both?

Comments and suggestions are welcome.
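
A rough sketch of the app-side variant, assuming a copy loop that takes the
proposed IOThrottler as an extra argument. This is a standalone stand-in
rather than Hadoop's IOUtils: the class name ThrottledCopy, the exact
signature, the long return value, and the "null means unthrottled" rule are
all assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical interface proposed in this issue; not an existing Hadoop API. */
interface IOThrottler {
    void throttle(long numOfBytes);
}

/** Illustrative helper, not part of Hadoop. */
final class ThrottledCopy {
    private ThrottledCopy() {}

    /**
     * Copies everything from in to out, asking the throttler for permission
     * before each write. A null throttler disables throttling. The streams
     * are left open. Returns the number of bytes copied.
     */
    static long copyBytes(InputStream in, OutputStream out, int buffSize,
                          IOThrottler throttler) throws IOException {
        if (buffSize <= 0) {
            throw new IllegalArgumentException("buffSize must be positive");
        }
        byte[] buf = new byte[buffSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            if (throttler != null) {
                throttler.throttle(n); // may block to hold the rate down
            }
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

A shell command such as -put or -get could then pass a throttler built from a
user-supplied bandwidth limit, while existing callers keep the unthrottled
behavior by passing null.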



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-01-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556481#comment-13556481
 ] 

Colin Patrick McCabe commented on HDFS-4412:


I think it makes sense to create some kind of throttler class that wraps a 
generic {{OutputStream}}.  That doesn't need to be in HDFS-- in fact, that code 
should probably go in common.
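
A minimal sketch of such a wrapper, assuming it delegates rate control to the
IOThrottler interface proposed in this issue. The class name
ThrottledOutputStream and its constructor are hypothetical here; only
java.io.FilterOutputStream is an existing API.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Objects;

/** Hypothetical interface proposed in this issue; not an existing Hadoop API. */
interface IOThrottler {
    void throttle(long numOfBytes);
}

/** Illustrative wrapper: throttles every write before passing it on. */
final class ThrottledOutputStream extends FilterOutputStream {
    private final IOThrottler throttler;

    ThrottledOutputStream(OutputStream out, IOThrottler throttler) {
        super(out);
        this.throttler = Objects.requireNonNull(throttler, "throttler");
    }

    @Override
    public void write(int b) throws IOException {
        throttler.throttle(1);
        out.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        throttler.throttle(len);
        // Write the whole range at once; the inherited version goes byte by byte.
        out.write(b, off, len);
    }
}
```

Because it depends only on java.io, a class like this has nothing
HDFS-specific in it, which fits the suggestion that it belongs in common.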



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-01-16 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555390#comment-13555390
 ] 

Zhenxiao Luo commented on HDFS-4412:


Any comments are welcome. Which approach is better?



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-01-16 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555399#comment-13555399
 ] 

Alejandro Abdelnur commented on HDFS-4412:
--

Is the objective to enforce IO throttling for the cluster, or to let certain
applications be 'nice'? If the former, this must be enforced on the cluster
side, not on the client side. If the latter, apps wanting to be 'nice' could
wrap the IO streams with throttling-aware ones.
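
For the 'nice' client case, a read-side wrapper could look roughly like the
sketch below. The names IOThrottler and ThrottledInputStream are hypothetical,
and charging the throttler after each read (once the byte count is known) is
a design assumption of this sketch.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;

/** Hypothetical interface proposed in this issue; not an existing Hadoop API. */
interface IOThrottler {
    void throttle(long numOfBytes);
}

/** Illustrative wrapper: charges the throttler for every byte actually read. */
final class ThrottledInputStream extends FilterInputStream {
    private final IOThrottler throttler;

    ThrottledInputStream(InputStream in, IOThrottler throttler) {
        super(in);
        this.throttler = Objects.requireNonNull(throttler, "throttler");
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) {
            throttler.throttle(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            // Charge only what was really read, which may be less than len.
            throttler.throttle(n);
        }
        return n;
    }
}
```

Nothing stops a client from reading the underlying stream directly, so a
wrapper like this only helps applications that choose to cooperate; it does
not enforce a cluster-wide limit.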



[jira] [Commented] (HDFS-4412) Support HDFS IO throttling

2013-01-16 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555433#comment-13555433
 ] 

Daryn Sharp commented on HDFS-4412:
---

A simple app side change might be to extend IOUtils.copyBytes.
