Clay,
When you are using the storm-hdfs connector you need to package the
core-site.xml and hdfs-site.xml from your cluster into your topology
jar. You can then configure the storm-hdfs bolt to pass the nameservice ID:
HdfsBolt bolt = new HdfsBolt()
        .withFsUrl("hdfs://myNameserviceID")
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy);
The above is all that is needed to use NameNode HA with storm-hdfs.
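For reference, a sketch of the packaging step assuming a Maven build
(directory names are placeholders): anything under src/main/resources is
placed on the topology jar's classpath, which is where Hadoop's
Configuration looks for these files.

    my-topology/
      src/main/java/...        <- topology and bolt code
      src/main/resources/
        core-site.xml          <- copied from a cluster gateway node
        hdfs-site.xml          <- must contain the HA nameservice settings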
-Harsha
On Thu, Feb 19, 2015, at 08:58 AM, Bobby Evans wrote:
> Hadoop has lots of different configurations in core-site.xml,
> hdfs-site.xml, ... all of which eventually get loaded into the
> Configuration object used to create a FileSystem instance. There are so
> many different configurations related to security, HA, etc. that it is
> almost impossible for me to guess exactly which ones you need to have set
> correctly to make this work. Typically what we do for storm to be able
> to talk to HDFS is to package the complete set of configs that appear on
> a Hadoop Gateway with the topology jar when it is shipped. This
> guarantees that the config is the same as on the gateway and should
> behave the same way. You can also grab them from the name node or any of
> the hadoop compute nodes.
> This will work for the HdfsBolt, which loads default configurations from
> the classpath before overriding them with any custom configuration you
> set for that bolt.
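>
> A minimal sketch of what happens under the covers (the nameservice ID
> "myNameserviceID" is just a placeholder): Hadoop's Configuration picks up
> core-site.xml from the classpath (and hdfs-site.xml once the HDFS client
> classes initialize), so as long as those files are inside the topology
> jar, the FileSystem client sees the same HA settings as the gateway:
>
>     import java.net.URI;
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FileSystem;
>
>     // Loads core-default.xml and core-site.xml from the classpath;
>     // hdfs-site.xml is picked up when the HDFS client initializes.
>     Configuration conf = new Configuration();
>     // With HA configured, the logical nameservice URI resolves through
>     // the failover proxy provider rather than a single host:port.
>     FileSystem fs = FileSystem.get(
>         URI.create("hdfs://myNameserviceID"), conf);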
>
> - Bobby
>
>
> On Thursday, February 19, 2015 10:42 AM, clay teahouse
> <[email protected]> wrote:
>
>
> Bobby,
> What do you mean by client here? In this context, do you consider the
> HdfsBolt a client? If yes, then which configuration are you referring
> to? I've seen the following, but I am not sure if I follow.
>
> - dfs.client.failover.proxy.provider.[nameservice ID] - the Java class
>   that HDFS clients use to contact the Active NameNode.
>
>   Configure the name of the Java class which will be used by the DFS
>   Client to determine which NameNode is the current Active, and
>   therefore which NameNode is currently serving client requests. The
>   only implementation which currently ships with Hadoop is the
>   ConfiguredFailoverProxyProvider, so use this unless you are using a
>   custom one. For example:
>
>     <property>
>       <name>dfs.client.failover.proxy.provider.mycluster</name>
>       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>     </property>
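>
> For context, the proxy provider property above is only one piece: a
> client-side hdfs-site.xml for HA also names the nameservice and its
> NameNodes. A minimal sketch, where mycluster, nn1/nn2, and the host
> names are placeholders for your cluster's actual values:
>
>     <property>
>       <name>dfs.nameservices</name>
>       <value>mycluster</value>
>     </property>
>     <property>
>       <name>dfs.ha.namenodes.mycluster</name>
>       <value>nn1,nn2</value>
>     </property>
>     <property>
>       <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>       <value>namenode1.example.com:8020</value>
>     </property>
>     <property>
>       <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>       <value>namenode2.example.com:8020</value>
>     </property>
>     <property>
>       <name>dfs.client.failover.proxy.provider.mycluster</name>
>       <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>     </property>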
> thanks,
> Clay
>
> On Thu, Feb 19, 2015 at 8:38 AM, Bobby Evans
> <[email protected]> wrote:
>
> HDFS HA provides failover for the NameNode; the client determines which
> NameNode is the active one, but this should be completely transparent to
> you if the client is configured correctly.
> - Bobby
>
>
> On Thursday, February 19, 2015 6:47 AM, clay teahouse
> <[email protected]> wrote:
>
>
> Hi All,
> Has anyone used HdfsBolt with hdfs in HA mode? How would you determine
> which hdfs node is the active node?
>
> thanks
> Clay