Several questions to ask about Hadoop:

- When I install Hadoop under /usr/local/hadoop/hadoop on the master, can the slaves be installed under different directories, or must every machine use the same installation directory?

- After I have successfully installed Hadoop, how do I run real jobs on it? Does every job have to be written in Java, like the WordCount example?

- When you tell Hadoop where to store files (and temp space), can you also limit how much space it uses? That is, we can create one or more directories for Hadoop to use, but sometimes those directories will be on filesystems that are shared with other uses, and we will need to limit how much Hadoop can consume. (See my guess at a configuration below.)

- For machines with multiple CPU cores, can you control how many tasks are run simultaneously? For example, on desktops it would be nice to make sure that only one Hadoop task is running at a time, so that the machine stays responsive to the user. (Again, see my guess below.)

- A similar question regarding RAM: some machines have more RAM than others. Can we restrict how much RAM is available to Hadoop tasks? (Sketch below as well.)

Thanks a lot for your help.
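P.S. To make the disk question concrete, here is the sort of thing I am imagining in conf/hadoop-site.xml. The property name is my guess from skimming hadoop-default.xml, and I am not sure whether it also covers the temp space used by map/reduce, so please correct me:

<configuration>
  <!-- My guess: always keep 10 GB (in bytes) free on each volume for non-HDFS use -->
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>10737418240</value>
  </property>
</configuration>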
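For the CPU question, I suspect there are per-TaskTracker knobs along these lines. Again, the property names below are my guesses, and I assume they would go inside <configuration> in each slave's conf/hadoop-site.xml:

  <!-- My guess: allow each slave to run only one map and one reduce task at a time -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>

Since these look like per-machine settings, I assume the desktops could use 1 while dedicated servers use a higher value. Is that right?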
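For the RAM question, I am guessing the relevant knob is the heap size passed to each task's child JVM, something like the following (also a guess on my part). Is this the intended way to cap memory per machine, or is there something better?

  <!-- My guess: cap each task JVM at 512 MB of heap -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>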
