Clay,
When you are using the storm-hdfs connector you need to package the
core-site.xml and hdfs-site.xml from your cluster into your topology
jar. You can then configure the storm-hdfs bolt to pass the nameservice ID:
HdfsBolt bolt = new HdfsBolt()
        .withFsUrl("hdfs://myNameserviceID")
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy);
The above is all that is needed to use NameNode HA with storm-hdfs.
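For reference, a sketch of the packaging step assuming a Maven build
(directory names are placeholders): anything under src/main/resources is
placed on the topology jar's classpath, which is where Hadoop's
Configuration looks for these files.

    my-topology/
      src/main/java/...        <- topology and bolt code
      src/main/resources/
        core-site.xml          <- copied from a cluster gateway node
        hdfs-site.xml          <- must contain the HA nameservice settings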
-Harsha
On Thu, Feb 19, 2015, at 08:58 AM, Bobby Evans wrote:
> Hadoop has lots of different configurations in core-site.xml,
> hdfs-site.xml, ... all of which eventually get loaded into the
> Configuration object used to create a FileSystem instance. There are so
> many different configurations related to security, HA, etc. that it is
> almost impossible for me to guess exactly which ones you need to have set
> correctly to make this work. Typically what we do for storm to be able
> to talk to HDFS is to package the complete set of configs that appear on
> a Hadoop Gateway with the topology jar when it is shipped. This
> guarantees that the config is the same as on the gateway and should
> behave the same way. You can also grab them from the name node or any of
> the hadoop compute nodes.
> This will work for the HdfsBolt, which loads default configurations from
> the classpath before overriding them with any custom configuration you
> set for that bolt.
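>
> A minimal sketch of what happens under the covers (the nameservice ID
> "myNameserviceID" is just a placeholder): Hadoop's Configuration picks up
> core-site.xml from the classpath (and hdfs-site.xml once the HDFS client
> classes initialize), so as long as those files are inside the topology
> jar, the FileSystem client sees the same HA settings as the gateway:
>
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FileSystem;
>
>     // Loads core-default.xml and core-site.xml from the classpath;
>     // hdfs-site.xml is picked up when the HDFS client initializes.
>     Configuration conf = new Configuration();
>     // With HA configured, the logical nameservice URI resolves through
>     // the failover proxy provider rather than a single host:port.
>     FileSystem fs = FileSystem.get(
>         URI.create("hdfs://myNameserviceID"), conf);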
>
> - Bobby
>
>
> On Thursday, February 19, 2015 10:42 AM, clay teahouse
> <[email protected]> wrote:
>
>
> Bobby,
> What do you mean by client here? In this context, do you consider the
> HdfsBolt a client? If yes, then which configuration are you referring
> to? I've seen the following, but I am not sure if I follow.
>
> - dfs.client.failover.proxy.provider.[nameservice ID] - the Java class
>   that HDFS clients use to contact the Active NameNode.
>
>   Configure the name of the Java class which will be used by the DFS
>   Client to determine which NameNode is the current Active, and
>   therefore which NameNode is currently serving client requests. The
>   only implementation which currently ships with Hadoop is the
>   ConfiguredFailoverProxyProvider, so use this unless you are using a
>   custom one. For example:
>
>     <property>
>       <name>dfs.client.failover.proxy.provider.mycluster</name>
>       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>     </property>
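>
> For context, the proxy provider property above is only one piece: a
> client-side hdfs-site.xml for HA also names the nameservice and its
> NameNodes. A minimal sketch, where mycluster, nn1/nn2, and the host
> names are placeholders for your cluster's actual values:
>
>     <property>
>       <name>dfs.nameservices</name>
>       <value>mycluster</value>
>     </property>
>     <property>
>       <name>dfs.ha.namenodes.mycluster</name>
>       <value>nn1,nn2</value>
>     </property>
>     <property>
>       <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>       <value>namenode1.example.com:8020</value>
>     </property>
>     <property>
>       <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>       <value>namenode2.example.com:8020</value>
>     </property>
>     <property>
>       <name>dfs.client.failover.proxy.provider.mycluster</name>
>       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>     </property>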
> thanks,
> Clay
>
> On Thu, Feb 19, 2015 at 8:38 AM, Bobby Evans
> <[email protected]> wrote:
>
> HDFS HA provides failover for the NameNode; the client determines which
> NameNode is the active one, but this should be completely transparent to
> you if the client is configured correctly.
> - Bobby
>
>
> On Thursday, February 19, 2015 6:47 AM, clay teahouse
> <[email protected]> wrote:
>
>
> Hi All,
> Has anyone used HdfsBolt with hdfs in HA mode? How would you determine
> which hdfs node is the active node?
>
> thanks
> Clay