Hi Omkar R,

Thanks for reaching out to us. My understanding is that you are looking for
a way to deploy and start an Impala cluster in a multi-node environment.

At this point Impala does not have a solution to deploy or start a cluster
across multiple hosts. The start-impala-cluster.py script starts and
configures the cluster on a single host, or it can start the cluster in a
dockerised environment. Reading the script can still help you figure out
which configurations are needed to deploy a cluster.

The Impala application depends on the gflags package, so each Impala
service can be configured with a flagfile or with CLI arguments. Some
examples can be found here:
https://impala.apache.org/docs/build/html/topics/impala_config_options.html
They show how to point the Impala services at each other on the right
hosts.
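
To make the flag-based wiring a bit more concrete, here is a minimal sketch
in Python that renders one flagfile body per role. The hostnames are
placeholders I made up, the helper names are hypothetical, and the flags
(-hostname, -state_store_host, -catalog_service_host) should be checked
against the page linked above for your Impala version. Treat it as a
starting point, not as a supported deployment tool.

```python
# Hypothetical hostnames; replace them with the hosts of your own cluster.
MASTER = "master.example.com"
WORKERS = ["worker1.example.com", "worker2.example.com"]


def statestored_flags(master=MASTER):
    """Flags for the single statestored running on the master node."""
    return [f"-hostname={master}"]


def catalogd_flags(master=MASTER):
    """Flags for the single catalogd; it needs to know where statestored runs."""
    return [f"-hostname={master}", f"-state_store_host={master}"]


def impalad_flags(worker, master=MASTER):
    """Flags for one impalad on a worker, pointing at the master's services."""
    return [
        f"-hostname={worker}",
        f"-state_store_host={master}",
        f"-catalog_service_host={master}",
    ]


def render_flagfile(flags):
    """A gflags flagfile holds one flag per line."""
    return "\n".join(flags) + "\n"


if __name__ == "__main__":
    print("# statestored on " + MASTER)
    print(render_flagfile(statestored_flags()))
    print("# catalogd on " + MASTER)
    print(render_flagfile(catalogd_flags()))
    for worker in WORKERS:
        print("# impalad on " + worker)
        print(render_flagfile(impalad_flags(worker)))
```

Each rendered body could be saved on the matching host and passed to the
daemon through the gflags --flagfile option, so the same build can be used
on every node and only the flagfile differs.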

Additionally, Impala depends on multiple configuration files to work with
other Hadoop components, for example the HDFSClient inside Impala will need
core-site.xml. In a working dev environment you can find these under
'${IMPALA_HOME}/fe/src/test/resources'. For example, if you check the
core-site.xml you will see that the defaultFS config points to a local HDFS:
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:20500</value>
      </property>
You will need to create these config files with the right Hadoop cluster
configurations on every node.
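
As a rough illustration of the core-site.xml point, the following Python
sketch builds a minimal file whose fs.defaultFS points at a remote NameNode
instead of localhost. The NameNode address is a made-up placeholder and
build_core_site is a hypothetical helper; a real cluster will usually need
more properties than this one.

```python
import xml.etree.ElementTree as ET


def build_core_site(default_fs):
    """Return a minimal core-site.xml document as a string.

    Only fs.defaultFS is set here; copy any other required properties
    from the configuration of your actual Hadoop cluster.
    """
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = default_fs
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    # Placeholder address (assumption): use your own NameNode host and port.
    print(build_core_site("hdfs://namenode.example.com:8020"))
```

The same generated file would then be copied to every node that runs an
Impala service, so that all of them resolve the same HDFS.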

Hope this helps, let me know if you get stuck during the process.

Best regards,
Tamas


On Mon, 24 Apr 2023 at 09:15, Omkar Rohadkar <omkar.rohad...@ellicium.com>
wrote:

> Hi Team,
>
> I've successfully done installation of open source impala and integrated
> Apache services like hadoop ,hive, hbase on a single node . This setup is
> working fine.
>
> But now I want to install  Impala in a multi-node environment and integrate
> it with Apache services. I have a few questions about it.
>
> When we build and install Impala and start impala service using start
> impala cluster.py file it starts all the impala services like 1 catalogd 1
> statestore and 3 daemons.
>
> But now in multi node setup we only want catalog and statestore to run on
> master node and one impala daemon on all the worker nodes.
>
> May I know how we can achieve this? Do we need to build it differently? and
> in which config files or script files we need to change so that we can
> achieve the above setup.
>
> For installation of multi node setup if we build it on every node how can
> we combine it as a one cluster?
>
> Hope to get your help on the above doubts.
>
> Thanks and regards,
> Omkar R
>
