Hi Arvind

I can't fully answer your questions on how to install Hadoop in pseudo-distributed mode, but I can at least address your cons: by running sudo su <user> in your shell, you can easily switch users during a session. Giving the hadoop user access to your directories should then take two minutes at most...
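A rough sketch of what I mean, not a tested recipe. The account name hduser and the directory ~/hadoop-jobs are made-up placeholders, and I'm assuming your home directory is traversable by other users (which I believe is the Ubuntu default) and that setfacl from the acl package is installed. The script only prints the commands unless you set DRY_RUN=0:

```shell
#!/bin/sh
# Placeholder names, adjust to your own setup (both are assumptions).
HADOOP_USER="hduser"            # hypothetical dedicated Hadoop account
SHARED_DIR="$HOME/hadoop-jobs"  # hypothetical directory holding your .jar files

# Print each command by default; set DRY_RUN=0 to really execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Let the Hadoop account read files and traverse directories under SHARED_DIR.
run mkdir -p "$SHARED_DIR"
run setfacl -R -m "u:${HADOOP_USER}:rX" "$SHARED_DIR"

# Open a login shell as the Hadoop account; leave it again with 'exit'.
run sudo su - "$HADOOP_USER"
```

With that in place you keep editing and building as your regular user, and the Hadoop account reads the jars straight from the shared directory instead of needing its own copies.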

Cheers,
alex


On 05.08.2015 19:32, Arvind Sundararajan wrote:

Hi All,

I have a laptop running Ubuntu 14.04 LTS and am trying to install hadoop 2.7.1 (current stable version) in pseudo-distributed mode.

I have a regular user account on my laptop, but am unsure whether I should install hadoop using a dedicated hadoop user on my laptop. NOTE: By 'regular user', I mean the linux user account that I use for day-to-day personal work.

The current hadoop documentation at [1] does not mention setting up a dedicated user for hadoop installation.

However, the hadoop installation tutorial at [2] mentions setting up a dedicated user for hadoop installation in pseudo-distributed mode on a single machine. This tutorial references an outdated hadoop installation tutorial [3], which also mentions setting up a dedicated user for hadoop installation in pseudo-distributed mode on a single machine.

I found several tutorials online which all seem to mention setting up a dedicated user for hadoop installation in pseudo-distributed mode on a single machine, without mentioning why we should set up a dedicated user.

My questions are as follows:

a) Is it possible for me to execute hadoop programs as a regular user even if hadoop is installed in pseudo-distributed mode via a dedicated 'hadoop' user? If yes, what linux filesystem folder permissions and HDFS permissions do I need to give to the regular user for executing hadoop programs?

b) Quoting from the outdated hadoop installation tutorial [3]:

      "We will use a dedicated Hadoop user account for running Hadoop.
      While that's not required it is recommended because it helps to separate
      the Hadoop installation from other software applications and
      user accounts running on the same machine
      (think: security, permissions, backups, etc)."

Can someone elaborate on this? What are the issues regarding security, permissions, and backups when running hadoop in pseudo-distributed mode on a single laptop which will most likely have only one user account (my current user account)?

c) Can someone please elaborate on the pros and cons of running hadoop in pseudo-distributed mode on a single machine as the regular user versus creating a dedicated user?

My thoughts on the cons thus far have been:

     i) If hadoop is unable to execute from a 'regular user' and
     only works from the dedicated hadoop user account, then I
     will have to edit my hadoop java programs from my
     'regular user' account where I have my development environment
     and IDE/text editor set up, copy the .jar files to the
     dedicated hadoop user account and execute. If any error occurs,
     I have to go back to the 'regular user' account, edit and
     then copy the new .jar files and execute again. This moving
     back and forth between accounts is a definite pain while
     working in pseudo-distributed mode, and I have experienced
     this while working with Hadoop 1.x.

     ii) If hadoop is unable to execute from a 'regular user' and
     only works from the dedicated hadoop user account, then
     the hadoop operations copyFromLocal and copyToLocal will
     require a shared folder for both user accounts.

P.S. I also referred to [4] and [5] before asking this question.

References:

[1] http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/SingleCluster.html
[2] http://dogdogfish.com/big-data/installing-hadoop-2-4-on-ubuntu-14-04/
[3] http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
[4] http://stackoverflow.com/questions/20192140/hadoop-pseudo-distributed-mode-for-multiple-users
[5] http://stackoverflow.com/questions/23807486/hadoop-development-dedicated-user-in-ubuntu-how-to-access-hadoop-node-running

