steveloughran commented on pull request #2179:
URL: https://github.com/apache/hadoop/pull/2179#issuecomment-686411091


   LGTM, +1 *pending you make clear in the option names and docs that this is 
experimental*
   
   > These configs are tested on prod environments. 
   
   > The same can remain as a means to control the resource usage. With the 
internal discussions we had we would like to keep the same this way.
   
   We all have this problem, we all want to get a fix in.
   
   My point of view is that a shared thread pool with queue managed to stop one 
single output stream using up all the capacity is that the correct solution. I 
base this on S3ABlockOutputStream, whose pool class is in hadoop-common.
   
   I also understand, why a simple "let's do this right now" fix can address a 
situation will rapidly before the ABFS streams switch to 
`BlockingThreadPoolExecutorService`. 
   
   However, it is precisely because we know it is an interim fix that I want 
all the options to have 'experimental' in their name. That way, when they get 
removed, people won't get upset that the options they were using have gone away.
   
   I recognise that you are shipping with this fix, and that you have cluster 
configurations which use them. However, it is long-standing policy in the 
project which is "the ASF project must not have its decisions determined by the 
fact that someone has already shipped a feature in their own branch". That's 
important: we have all shipped fixes early, and then had to deal with catching 
up with production releases. I believe the HDFS IPC wire format change between 
Hadoop 2.0.205 (used in CDH) and Hadoop 2.2.0 was the most controversial here 
as it was actual protocol incompatibility. The situation here is minor in 
comparision.
   
   _Anyone is free to ship a hadoop build with their own changes, but that 
cannot be used as a veto on changes in the open source codebase itself_
   
   This makes sense, when you think of it.
   
   
   The good news, `Configuration.addDeprecations()` lets you provide an 
automated way to map from deprecated values (your current set of options) to 
ones with "experimental" in the name, such as here (S3ATestUtils)
   
   ```
     private static void addDeprecatedKeys() {
       Configuration.DeprecationDelta[] deltas = {
           // STS endpoint configuration option
           new Configuration.DeprecationDelta(
               S3ATestConstants.TEST_STS_ENDPOINT,
               ASSUMED_ROLE_STS_ENDPOINT)
       };
   
       Configuration.addDeprecations(deltas);
       Configuration.reloadExistingConfigurations();
     }
   
     static {
       addDeprecatedKeys();
     }
   ```
   
   If the old option is used, .the user will see a message logged @ info 
(unless they have turned off the deprecation log), and the value is remapped to 
the new one. 
   
   ```
   log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN
   ```
   
   So, please use experimental in the name, it gives us freedom to come up with 
a good design for how to do block upload buffering/pooling without having any 
future design decisions constrained by this intermediate patch.
   
   thanks.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to