How does one set up pseudo-distributed mode with a local filesystem? Are you
saying fs.default.name can be left as file:/// instead of being set to
hdfs://? Do you then set mapred.job.tracker to file:/// as well?

Thanks,
Steve Cohen

On Tue, Sep 28, 2010 at 10:26 AM, Andrzej Bialecki <[email protected]> wrote:

> On 2010-09-28 14:27, Markus Jelsma wrote:
>
>> Thanks Andrzej,
>>
>> I will make an effort in getting it to run on Hadoop but I'd rather go for a
>> fully distributed setup (although with only a single node for now) so I can
>> add more machines later.
>>
>
> That's what I meant, sorry for using jargon - pseudo-distributed is a
> "fully distributed Hadoop that runs on a single node". Please note that you
> don't have to use HDFS then - all nodes :) have direct access to the same
> local file system.
>
>
>> Will the HadoopNutch tutorial on the wiki allow me to
>> set up for a cluster on a single node? Also, will it then still make use of
>> multiple cores?
>>
>
> Yes, because there will be multiple tasks running in parallel, in multiple
> processes, which will likely run on different cores.
>
> As I said, the main big difference between using LocalJobTracker and a real
> JobTracker is that with LocalJobTracker:
>
> * all map tasks are run sequentially, there is no parallelism.
> * there is always one reduce task - if your dataset is large then this
> single task will have to handle the sorting of the whole dataset, which may
> take disproportionately longer than if the data were split among multiple
> reduce tasks.
>
> Whereas with the JobTracker/TaskTracker, even when running on a single
> node:
>
> * tasks are run in separate processes and execute in parallel
> * there are many reduce tasks (as many as you configured), which handle
> portions of the output dataset, and which also execute in parallel.
>
> So even on a single node a pseudo-distributed setup should be faster than
> running in local mode.
>
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
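
To make the quoted point about reduce tasks concrete, here is a rough sketch
in plain Python (not Hadoop code) of how records get split among reduce tasks
by key. The function names, the sample records and the use of CRC32 as the
hash are all assumptions made for illustration; Hadoop's own partitioner works
on the key's Java hash code instead. With one reduce task a single sort has to
cover every record, whereas with several each task sorts only its own portion.

```python
import zlib


def partition_for(key, num_reduce_tasks):
    # Deterministic stand-in for a hash partitioner (CRC32 is an assumption,
    # chosen only because it gives the same result on every run).
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks


def shuffle(records, num_reduce_tasks):
    # Route every (key, value) record to exactly one reduce task ...
    partitions = [[] for _ in range(num_reduce_tasks)]
    for key, value in records:
        partitions[partition_for(key, num_reduce_tasks)].append((key, value))
    # ... and sort each task's share independently of the others.
    return [sorted(p) for p in partitions]


# Hypothetical sample data: (host, count) pairs.
records = [
    ("nutch.apache.org", 1),
    ("example.com", 1),
    ("sigram.com", 1),
    ("example.com", 1),
]

single = shuffle(records, 1)  # local mode: one reduce task sorts everything
multi = shuffle(records, 3)   # three reduce tasks, each with a portion

print("one reduce task:", [len(p) for p in single])
print("three reduce tasks:", [len(p) for p in multi])
```

All records with the same key land in the same partition, so each reduce task
can work on its share without looking at the others. In a real
JobTracker/TaskTracker setup those shares would also be processed in separate
processes, which this sketch does not attempt to show.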
