[ 
https://issues.apache.org/jira/browse/HBASE-21062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581607#comment-16581607
 ] 

Mingliang Liu edited comment on HBASE-21062 at 8/15/18 10:42 PM:
-----------------------------------------------------------------

The existing code implicitly employs the fact that {{defaultProvider}} is 
having {{AsyncFSWALProvider}} class value. It makes sense as the enum 
{{defaultProvider}} can not have a different value at runtime unless we change 
code. This implementation pattern is not ideal to me either - because when we 
change the {{defaultProvider}} we may forget to change {{getProviderClass}} - 
but it's tolerable. Meanwhile, the newly added test in the patch does not 
actually override the {{defaultProvider}} value and {{getDefaultProvider}} is 
not used anywhere. So the test will pass anyway: w/ or w/o the fix in 
{{getProviderClass}}, w/ or w/o the {{getDefaultProvider}}. Correct me if I'm 
wrong.

Another (orthogonal?) problem in {{getProviderClass}} is that, the try-catch 
scope is not implemented well. The reason is that, in the 
{{catch(IllegalArgumentException exception)\{...\}}} clause, we end up with 
returning the {{AsyncFSWALProvider}}. However, this time we would skip the 
process that load {{AsyncFSWALProvider}} and then falls back to 
{{FSHLogProvider}}.

Last, for any error case, it merits to have logging message.

Overall, a fix in my mind is like - NOT TESTED:
{code:java}
  @VisibleForTesting
  public Class<? extends WALProvider> getProviderClass(String key, String 
defaultValue) {
    Providers provider;
    try {
       provider = Providers.valueOf(conf.get(key, defaultValue));
    } catch (IllegalArgumentException exception) {
      // Fall back to them specifying a class name
      // Note that the passed default class shouldn't actually be used, since 
the above only fails
      // when there is a config value present.
      LOG.warn("Unrecognized WALProvider class. Falling back to default 
AsyncFSWALProvider.");
      provider = conf.getClass(key, Providers.defaultProvider.clazz, 
WALProvider.class)
    }
    if (provider.clazz == AsyncFSWALProvider.class && 
!AsyncFSWALProvider.load()) {
      // AsyncFSWAL has better performance in most cases, and also uses less 
resources, we will try
      // to use it if possible. But it deeply hacks into the internal of 
DFSClient so will be easily
      // broken when upgrading hadoop. If it is broken, then we fall back to 
use FSHLog.
      LOG.warn("Failed to load AsyncFSWALProvider. Falling back to 
FSHLogProvider.");
      return FSHLogProvider.class;
    }
    return provider.clazz;
  }
{code}
Thanks!


was (Author: liuml07):
The existing code implicitly employs the fact that {{defaultProvider}} is 
having {{AsyncFSWALProvider}} class value. It makes sense as the enum 
{{defaultProvider}} can not have a different value at runtime unless we change 
code. This implementation pattern is not ideal to me either - because when we 
change the {{defaultProvider}} we may forget to change {{getProviderClass}} - 
but it's tolerable. Meanwhile, the newly added test in the patch does not 
actually override the {{defaultProvider}} value and {{getDefaultProvider}} is 
not used anywhere. So the test will pass anyway: w/ or w/o the fix in 
{{getProviderClass}}, w/ or w/o the {{getDefaultProvider}}. Correct me if I'm 
wrong.

Another (orthogonal?) problem in {{getProviderClass}} is that, the try-catch 
scope is not implemented well. The reason is that, in the 
{{catch(IllegalArgumentException exception)\{...\}}} clause, we end up with 
returning the {{AsyncFSWALProvider}}. However, this time we would skip the 
process that load {{AsyncFSWALProvider}} and then falls back to 
{{FSHLogProvider}}.

Last, for any error case, it merits to have logging message.

Overall, a fix in my mind is like - NOT TESTED:
{code:java}
  @VisibleForTesting
  public Class<? extends WALProvider> getProviderClass(String key, String 
defaultValue) {
    Providers provider;
    try {
       provider = Providers.valueOf(conf.get(key, defaultValue));
    } catch (IllegalArgumentException exception) {
      // Fall back to them specifying a class name
      // Note that the passed default class shouldn't actually be used, since 
the above only fails
      // when there is a config value present.
      LOG.warn("Unrecognized WALProvider class. Falling back to default 
AsyncFSWALProvider.");
      provider = Providers.defaultProviderdr;
    }
    if (provider.clazz == AsyncFSWALProvider.class && 
!AsyncFSWALProvider.load()) {
      // AsyncFSWAL has better performance in most cases, and also uses less 
resources, we will try
      // to use it if possible. But it deeply hacks into the internal of 
DFSClient so will be easily
      // broken when upgrading hadoop. If it is broken, then we fall back to 
use FSHLog.
      LOG.warn("Failed to load AsyncFSWALProvider. Falling back to 
FSHLogProvider.");
      return FSHLogProvider.class;
    }
    return provider.clazz;
  }
{code}
Thanks!

> WALFactory has misleading notion of "default"
> ---------------------------------------------
>
>                 Key: HBASE-21062
>                 URL: https://issues.apache.org/jira/browse/HBASE-21062
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
>         Attachments: HBASE-21062.001.branch-2.0.patch, 
> HBASE-21062.002.branch-2.0.patch
>
>
> In WALFactory, there is an enum {{Providers}} which has a list of supported 
> WALProvider implementations. In addition to list this, there is also a 
> {{defaultProvider}} (which the Configuration defaults to), that is meant to 
> be our "advertised" default WALProvider.
> However, the implementation of {{getProviderClass}} in WALFactory doesn't 
> actually adhere to the value of this enum, instead *always* returning 
> AsyncFSWal if it can be loaded.
> Having the default value in the enum but then overriding it in the 
> implementation of {{getProviderClass}} is silly and misleading.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to