[jira] [Created] (HADOOP-13777) Trim configuration values in `rumen`

2016-10-31 Thread Tianyin Xu (JIRA)
Tianyin Xu created HADOOP-13777:
---

 Summary: Trim configuration values in `rumen`
 Key: HADOOP-13777
 URL: https://issues.apache.org/jira/browse/HADOOP-13777
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 3.0.0-alpha1
Reporter: Tianyin Xu
Priority: Minor


The current implementation of {{ClassName.java}} in {{rumen}} does not follow 
the practice of trimming configuration values. This leads to silent, 
hard-to-diagnose errors if users set values containing spaces or 
newlines: classes that should be preserved from anonymization end up 
anonymized anyway.

See these previous commits for reference (just to list a few):
HADOOP-6578. Configuration should trim whitespace around a lot of value types
HADOOP-6534. Trim whitespace from directory lists initializing
HDFS-9708. FSNamesystem.initAuditLoggers() doesn't trim classnames
HDFS-2799. Trim fs.checkpoint.dir values.
YARN-3395. FairScheduler: Trim whitespaces when using username for queuename.
YARN-2869. CapacityScheduler should trim sub queue names when parse 
configuration.

Patch is available against trunk (tested):
{code:title=ClassName.java|borderStyle=solid}
@@ -43,15 +43,13 @@ protected String getPrefix() {

   @Override
   protected boolean needsAnonymization(Configuration conf) {
-    String[] preserves = conf.getStrings(CLASSNAME_PRESERVE_CONFIG);
-    if (preserves != null) {
-      // do a simple starts with check
-      for (String p : preserves) {
-        if (className.startsWith(p)) {
-          return false;
-        }
+    String[] preserves = conf.getTrimmedStrings(CLASSNAME_PRESERVE_CONFIG);
+    // do a simple starts with check
+    for (String p : preserves) {
+      if (className.startsWith(p)) {
+        return false;
       }
     }
     return true;
   }
{code}
(the NULL check is no longer needed because {{getTrimmedStrings}} returns an 
empty array if nothing is set)
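To see why untrimmed values fail silently here, consider a minimal, self-contained sketch of the {{startsWith}} check (plain Java, no Hadoop on the classpath; the splitting and trimming below only imitate what {{getStrings}} and {{getTrimmedStrings}} do):

```java
import java.util.Arrays;

public class TrimCheck {
    // Mimics the startsWith check in ClassName.needsAnonymization():
    // returns false (i.e. "preserve, do not anonymize") only when the
    // class name matches one of the configured prefixes.
    static boolean needsAnonymization(String className, String[] preserves) {
        for (String p : preserves) {
            if (className.startsWith(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String className = "org.apache.hadoop.examples.WordCount";
        // The value as a user might write it in XML, with stray whitespace:
        String rawValue = " org.apache.hadoop.\n";

        String[] untrimmed = rawValue.split(",");          // like getStrings()
        String[] trimmed = Arrays.stream(rawValue.split(","))
                .map(String::trim)
                .toArray(String[]::new);                   // like getTrimmedStrings()

        // With the untrimmed value the prefix never matches, so the class
        // is silently anonymized even though the user asked to preserve it.
        System.out.println(needsAnonymization(className, untrimmed)); // true
        System.out.println(needsAnonymization(className, trimmed));   // false
    }
}
```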



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-12676) Inconsistent assumptions of the default keytab file of Kerberos

2015-12-24 Thread Tianyin Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyin Xu resolved HADOOP-12676.
-
Resolution: Invalid

> Inconsistent assumptions of the default keytab file of Kerberos
> ---
>
> Key: HADOOP-12676
> URL: https://issues.apache.org/jira/browse/HADOOP-12676
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Tianyin Xu
>Assignee: Tianyin Xu
>Priority: Minor
>
> In the current implementation of {{SecurityUtil}}, we do not consider the 
> default keytab file of Kerberos (which is {{/etc/krb5.keytab}} in [MIT 
> Kerberos 
> defaults|http://web.mit.edu/kerberos/krb5-1.13/doc/mitK5defaults.html#paths]).
> If the user does not set the keytab file, an {{IOException}} will be thrown. 
> {code:title=SecurityUtil.java|borderStyle=solid}
> 230   public static void login(final Configuration conf,
> 231   final String keytabFileKey, final String userNameKey, String 
> hostname)
> 232   throws IOException { 
> ...
> 237 String keytabFilename = conf.get(keytabFileKey);
> 238 if (keytabFilename == null || keytabFilename.length() == 0) {
> 239   throw new IOException("Running in secure mode, but config doesn't 
> have a keytab");
> 240 }
> {code} 
> However, the default keytab location is assumed by some of the callers. For 
> example, in 
> [{{yarn-default.xml}}|https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml],
> ||property||default||
> |yarn.resourcemanager.keytab|/etc/krb5.keytab|
> |yarn.nodemanager.keytab|/etc/krb5.keytab|
> |yarn.timeline-service.keytab|/etc/krb5.keytab|
> On the other hand, these callers directly call the {{SecurityUtil.login}} 
> method; therefore, the documented defaults are wrong: the actual default is 
> {{null}} (as {{SecurityUtil}} has no default)...
> {code:title=NodeManager.java|borderStyle=solid}
>   protected void doSecureLogin() throws IOException {
> SecurityUtil.login(getConfig(), YarnConfiguration.NM_KEYTAB,
> YarnConfiguration.NM_PRINCIPAL);
>   }
> {code}
> I don't know if we should make {{/etc/krb5.keytab}} the default in 
> {{SecurityUtil}}, or ask the callers to correct their assumptions. I post 
> this here as a minor issue.
> Thanks!





[jira] [Created] (HADOOP-12676) Consider the default keytab file of Kerberos

2015-12-23 Thread Tianyin Xu (JIRA)
Tianyin Xu created HADOOP-12676:
---

 Summary: Consider the default keytab file of Kerberos
 Key: HADOOP-12676
 URL: https://issues.apache.org/jira/browse/HADOOP-12676
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Affects Versions: 2.6.2, 2.7.1
Reporter: Tianyin Xu
Priority: Minor


In the current implementation of {{SecurityUtil}}, we do not consider the 
default keytab file of Kerberos (which is {{/etc/krb5.keytab}} in [MIT Kerberos 
defaults|http://web.mit.edu/kerberos/krb5-1.13/doc/mitK5defaults.html#paths]).

If the user does not set the keytab file, an {{IOException}} will be thrown. 
{code:title=SecurityUtil.java|borderStyle=solid}
230   public static void login(final Configuration conf,
231   final String keytabFileKey, final String userNameKey, String hostname)
232   throws IOException { 
...
237 String keytabFilename = conf.get(keytabFileKey);
238 if (keytabFilename == null || keytabFilename.length() == 0) {
239   throw new IOException("Running in secure mode, but config doesn't 
have a keytab");
240 }
{code} 

However, the default keytab location is assumed by some of the callers. For 
example, in 
[{{yarn-default.xml}}|https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml],
 the defaults of {{yarn.resourcemanager.keytab}}, {{yarn.nodemanager.keytab}}, 
and {{yarn.timeline-service.keytab}} all point to {{/etc/krb5.keytab}}. 

On the other hand, these callers directly call the {{SecurityUtil.login}} 
method; therefore, the documented defaults are wrong: the actual default is 
{{null}} (as {{SecurityUtil}} has no default)...
{code:title=NodeManager.java|borderStyle=solid}
  protected void doSecureLogin() throws IOException {
SecurityUtil.login(getConfig(), YarnConfiguration.NM_KEYTAB,
YarnConfiguration.NM_PRINCIPAL);
  }
{code}

I don't know if we should make {{/etc/krb5.keytab}} the default in 
{{SecurityUtil}}, or ask the callers to correct their assumptions. I post this 
here as a potential improvement.
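If the first option were chosen, the lookup part of {{SecurityUtil.login}} could fall back to the MIT default. The sketch below only illustrates that fallback (the constant name and the helper are hypothetical; the real method also resolves the principal and logs in via UGI, which is omitted here):

```java
public class KeytabFallback {
    // Hypothetical constant, matching the MIT Kerberos default path.
    static final String DEFAULT_KEYTAB_FILE = "/etc/krb5.keytab";

    // Instead of throwing when the config key is unset or empty,
    // fall back to the conventional default location.
    static String resolveKeytab(String configuredValue) {
        return (configuredValue == null || configuredValue.isEmpty())
                ? DEFAULT_KEYTAB_FILE
                : configuredValue;
    }

    public static void main(String[] args) {
        System.out.println(resolveKeytab(null));             // /etc/krb5.keytab
        System.out.println(resolveKeytab("/srv/nm.keytab")); // /srv/nm.keytab
    }
}
```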

Thanks!





[jira] [Created] (HADOOP-12671) Inconsistent configuration values and incorrect comments

2015-12-22 Thread Tianyin Xu (JIRA)
Tianyin Xu created HADOOP-12671:
---

 Summary: Inconsistent configuration values and incorrect comments
 Key: HADOOP-12671
 URL: https://issues.apache.org/jira/browse/HADOOP-12671
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf, documentation, fs/s3
Affects Versions: 2.6.2, 2.7.1
Reporter: Tianyin Xu


The following values in [core-default.xml | 
https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml]
 are wrong:
{{fs.s3a.multipart.purge.age}}
{{fs.s3a.connection.timeout}}
{{fs.s3a.connection.establish.timeout}}
\\
\\

*1. {{fs.s3a.multipart.purge.age}}*
(in both {{2.6.2}} and {{2.7.1}})
In [core-default.xml | 
https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml],
 the value is {{86400}} ({{24}} hours), while in the code it is {{14400}} 
({{4}} hours).
\\
\\

*2. {{fs.s3a.connection.timeout}}*
(only appears in {{2.6.2}})
In [core-default.xml (2.6.2) | 
https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/core-default.xml],
 the value is {{5000}}, while in the code it is {{5}}.
{code}
  // seconds until we give up on a connection to s3
  public static final String SOCKET_TIMEOUT = "fs.s3a.connection.timeout";
  public static final int DEFAULT_SOCKET_TIMEOUT = 5;
{code}
\\

*3. {{fs.s3a.connection.establish.timeout}}*
(only appears in {{2.7.1}})
In [core-default.xml (2.7.1)| 
https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml],
 the value is {{5000}}, while in the code it is {{5}}.
{code}
  // seconds until we give up trying to establish a connection to s3
  public static final String ESTABLISH_TIMEOUT = 
"fs.s3a.connection.establish.timeout";
  public static final int DEFAULT_ESTABLISH_TIMEOUT = 5;
{code}
\\

Btw, the code comments are wrong! The two parameters are in units of 
*milliseconds*, not *seconds*...
{code}
-  // seconds until we give up on a connection to s3
+  // milliseconds until we give up on a connection to s3
...
-  // seconds until we give up trying to establish a connection to s3
+  // milliseconds until we give up trying to establish a connection to s3
{code}
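The mismatch is easy to see in isolation. This sketch merely simulates {{conf.getInt(key, default)}} (it is not Hadoop code); it shows that removing the XML entry would silently shrink the timeout from 5 seconds to 5 milliseconds, since the value is consumed as milliseconds:

```java
public class TimeoutUnits {
    // The constant from the code (5) versus the 5000 documented in
    // core-default.xml. Both end up being consumed as *milliseconds*.
    static final int DEFAULT_SOCKET_TIMEOUT = 5;

    // Mimics conf.getInt(key, default): the XML value wins when present.
    static int effectiveTimeoutMs(Integer xmlValue) {
        return xmlValue != null ? xmlValue : DEFAULT_SOCKET_TIMEOUT;
    }

    public static void main(String[] args) {
        // With the shipped core-default.xml entry: a sane 5-second timeout.
        System.out.println(effectiveTimeoutMs(5000)); // 5000 (ms)
        // Without the XML entry: connections time out after 5 ms.
        System.out.println(effectiveTimeoutMs(null)); // 5 (ms)
    }
}
```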






[jira] [Created] (HADOOP-12659) Incorrect usage of config parameters in token manager of KMS

2015-12-18 Thread Tianyin Xu (JIRA)
Tianyin Xu created HADOOP-12659:
---

 Summary: Incorrect usage of config parameters in token manager of 
KMS
 Key: HADOOP-12659
 URL: https://issues.apache.org/jira/browse/HADOOP-12659
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.6.2, 2.7.1
Reporter: Tianyin Xu


Hi, the usage of the following configs of the Key Management Server (KMS) is 
problematic: 
{{hadoop.kms.authentication.delegation-token.renew-interval.sec}}
{{hadoop.kms.authentication.delegation-token.removal-scan-interval.sec}}

The names indicate that the unit is seconds ({{sec}}), and the [online 
doc|https://hadoop.apache.org/docs/stable/hadoop-kms/index.html] shows that 
the default values are {{86400}} and {{3600}}, respectively. These defaults 
are also defined in
{code:title=DelegationTokenManager.java|borderStyle=solid}
 55   public static final String RENEW_INTERVAL = PREFIX + "renew-interval.sec";
 56   public static final long RENEW_INTERVAL_DEFAULT = 24 * 60 * 60;
 ...
 58   public static final String REMOVAL_SCAN_INTERVAL = PREFIX +
 59   "removal-scan-interval.sec";
 60   public static final long REMOVAL_SCAN_INTERVAL_DEFAULT = 60 * 60;
{code}

However, in {{DelegationTokenManager.java}} and 
{{ZKDelegationTokenSecretManager.java}}, these two parameters are used 
incorrectly.

1. *{{DelegationTokenManager.java}}*
{code}
 70   conf.getLong(RENEW_INTERVAL, RENEW_INTERVAL_DEFAULT) * 1000,
 71   conf.getLong(REMOVAL_SCAN_INTERVAL, 
 72   REMOVAL_SCAN_INTERVAL_DEFAULT * 1000));
{code}

Apparently, at Line 72, {{REMOVAL_SCAN_INTERVAL}} should be used in the same 
way as {{RENEW_INTERVAL}}, like
{code}
72c72
<   REMOVAL_SCAN_INTERVAL_DEFAULT * 1000));
---
>   REMOVAL_SCAN_INTERVAL_DEFAULT) * 1000);
{code}
Currently, the unit of 
{{hadoop.kms.authentication.delegation-token.removal-scan-interval.sec}} is not 
{{sec}} but {{millisec}}.
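
The misplaced parenthesis matters only when a user overrides the value. A minimal simulation of {{conf.getLong(key, default)}} (plain Java, not the Hadoop class) shows the difference for a user who sets the interval to 1800 seconds:

```java
import java.util.HashMap;
import java.util.Map;

public class IntervalScaling {
    static final long REMOVAL_SCAN_INTERVAL_DEFAULT = 60 * 60; // seconds

    // Mimics Configuration.getLong(key, default).
    static long getLong(Map<String, Long> conf, String key, long def) {
        return conf.getOrDefault(key, def);
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        conf.put("removal-scan-interval.sec", 1800L); // user asks for 30 minutes

        // Buggy form: only the *default* is scaled to milliseconds, so the
        // user-supplied 1800 is passed on as 1800 ms rather than 1800 s.
        long buggy = getLong(conf, "removal-scan-interval.sec",
                REMOVAL_SCAN_INTERVAL_DEFAULT * 1000);

        // Fixed form: whatever getLong returns is scaled to milliseconds.
        long fixed = getLong(conf, "removal-scan-interval.sec",
                REMOVAL_SCAN_INTERVAL_DEFAULT) * 1000;

        System.out.println(buggy); // 1800
        System.out.println(fixed); // 1800000
    }
}
```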

2. *{{ZKDelegationTokenSecretManager.java}}*
{code}
142 conf.getLong(DelegationTokenManager.RENEW_INTERVAL,
143 DelegationTokenManager.RENEW_INTERVAL_DEFAULT * 1000),
144 conf.getLong(DelegationTokenManager.REMOVAL_SCAN_INTERVAL,
145 DelegationTokenManager.REMOVAL_SCAN_INTERVAL_DEFAULT) * 1000);
{code}
In this class the situation is the opposite: 
{{hadoop.kms.authentication.delegation-token.renew-interval.sec}} is handled 
incorrectly while the other is correct...
A patch would look like
{code}
143c143
< DelegationTokenManager.RENEW_INTERVAL_DEFAULT * 1000),
---
> DelegationTokenManager.RENEW_INTERVAL_DEFAULT) * 1000,
{code}

Thanks!






[jira] [Created] (HADOOP-11328) ZKFailoverController.java does not log Exception and causes latent problems during failover

2014-11-23 Thread Tianyin Xu (JIRA)
Tianyin Xu created HADOOP-11328:
---

 Summary: ZKFailoverController.java does not log Exception and 
causes latent problems during failover
 Key: HADOOP-11328
 URL: https://issues.apache.org/jira/browse/HADOOP-11328
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha
Affects Versions: 2.5.1
Reporter: Tianyin Xu


In _ZKFailoverController.java_, the _Exception_ caught by the _run()_ method 
is never logged. This causes latent problems that only manifest during 
failover.

h5. The problem we encountered

An _Exception_ is thrown from the _doRun()_ method during _initHM()_ (caused by 
a configuration error). To reproduce, set 
_ha.health-monitor.connect-retry-interval.ms_ to any nonsensical value.
{code:title=ZKFailoverController.java|borderStyle=solid}
  private int doRun(String[] args)
...
initRPC();
initHM();
startRPC();

  }
{code}

The Exception is caught in the _run()_ method, as follows,
{code:title=ZKFailoverController.java|borderStyle=solid}
  public int run(final String[] args) throws Exception {
...
try {
  ...
@Override
public Integer run() {
  try {
return doRun(args);
  } catch (Exception t) {
throw new RuntimeException(t);
  } finally {
if (elector != null) {
  elector.terminateConnection();
}
  }
}
  });
} catch (RuntimeException rte) {
  throw (Exception)rte.getCause();
}
  }
{code}

Unfortunately, the Exception (causing the shutdown of the process) is *not 
logged at all*. This causes latent errors which only manifest during 
failover (because the ZKFC is dead). The tricky thing here is that everything 
looks perfectly fine: the _jps_ command shows a running DFSZKFailoverController 
process, and the two NameNodes (active and standby) work fine. 

h5. Patch

We strongly suggest adding an error log for the caught exception, such as:

--- hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ZKFailoverController.java (revision 1641307)
+++ hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ZKFailoverController.java (working copy)
{code:title=@@ -178,6 +178,7 @@|borderStyle=solid}
 }
   });
 } catch (RuntimeException rte) {
+  LOG.fatal("The failover controller encountered a runtime error: " + rte);
   throw (Exception)rte.getCause();
 }
   }
{code}
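
The log-then-rethrow pattern of this patch can be sketched standalone; the sketch below uses {{java.util.logging}} instead of Hadoop's commons-logging {{LOG}}, and a fake {{doRun()}} that fails the way a bad config value would:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class FailoverRunner {
    static final Logger LOG = Logger.getLogger(FailoverRunner.class.getName());

    // Stand-in for doRun(): fails the way a bad config value would.
    static int doRun() {
        throw new IllegalArgumentException(
                "bad ha.health-monitor.connect-retry-interval.ms");
    }

    // Mirrors the structure of run(): the cause is still rethrown,
    // but it is logged first, so the failure leaves a trace.
    static int run() throws Exception {
        try {
            try {
                return doRun();
            } catch (Exception t) {
                throw new RuntimeException(t);
            }
        } catch (RuntimeException rte) {
            LOG.log(Level.SEVERE,
                    "The failover controller encountered a runtime error", rte);
            throw (Exception) rte.getCause();
        }
    }
}
```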

Thanks!



