Re: Accessing the local filesystem from AbstractJob

Dan Filimon Thu, 14 Feb 2013 09:19:06 -0800

Yes, that's right. I tried it and it worked, but I forgot to e-mail to say so.
Thanks!
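
For anyone who finds this thread later, here is a rough sketch of the "look on the local file system first, otherwise fall back to HDFS" behaviour I originally asked about. It uses only the JDK. The class name InputResolver and the method resolve are made up for illustration and are not part of Mahout or Hadoop. The assumption is that the returned string is then handed to whatever sets the job's input path, and that an unqualified path is resolved by Hadoop against the configured default file system (HDFS on a cluster).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Mahout or Hadoop.
public class InputResolver {

    // Returns a "file:" URI if the path exists on the local file system.
    // Otherwise returns the argument unchanged, on the assumption that
    // Hadoop will resolve it against its default file system.
    public static String resolve(String path) {
        // Already a qualified URI (hdfs://..., file://...): leave it alone.
        if (path.contains("://")) {
            return path;
        }
        Path local = Paths.get(path).toAbsolutePath().normalize();
        if (Files.exists(local)) {
            // Path.toUri() produces a "file:" URI for local paths.
            return local.toUri().toString();
        }
        return path;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("input", ".txt");
        try {
            // Exists locally, so this prints a file: URI.
            System.out.println(resolve(tmp.toString()));
            // Does not exist locally, so the path is printed unchanged.
            System.out.println(resolve("/no/such/local/file"));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Keep Sean's caveat in mind: this only makes sense when the job runs on a single machine, since the tasks on a real cluster will not see the driver's local files.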


On Thu, Feb 14, 2013 at 7:16 PM, Ted Dunning <[email protected]> wrote:
> I think that file: is the right way to access the local file system.
>
> On Wed, Feb 13, 2013 at 4:14 AM, Sean Owen <[email protected]> wrote:
>
>> Hmm I think it will work if you use "file:///..." URIs? I haven't tried in
>> a long time though.
>>
>>
>> On Wed, Feb 13, 2013 at 12:12 PM, Dan Filimon <[email protected]> wrote:
>>
>> > I see. Well, my use case was wanting to run the job on one machine,
>> > being lazy and not wanting to put the files on HDFS. :)
>> >
>> > On Tue, Feb 12, 2013 at 8:27 PM, Sean Owen <[email protected]> wrote:
>> > > Yes because the input path is something processed by the jobtracker and
>> > > later the tasktrackers themselves, which won't be on your machine
>> > > (necessarily).
>> > >
>> > > Mappers can read the local file system but it's not clear what may or may
>> > > not be there. Consider the distributed cache for smallish data.
>> > >
>> > >
>> > > On Tue, Feb 12, 2013 at 7:05 PM, Dan Filimon <[email protected]> wrote:
>> > >
>> > >> When creating my own job driver, I'm unable to give it any inputs from
>> > >> the local file system. An exception gets thrown when starting the job
>> > >> (and trying to get the splits).
>> > >> Apparently the files have to be on HDFS.
>> > >>
>> > >> Is there any way around this (ideally, I'd like it to first look for
>> > >> the file on the local file system and if no file is found, look at
>> > >> HDFS)?
>> > >>
>> >
>>
