Do not use the InputSplit's getLocations() API to supply your file
path; it is not intended for that, if that's what you've done in
your current InputFormat implementation. getLocations() is expected to
return the hostnames of the nodes holding the split's data, which is
why the RackResolver rejects a value containing "/".

If you're looking to store a single file path, use the FileSplit
class, or if your case is not as simple as that, use it as a base
reference to build your own Path-based InputSplit derivative. Its
sources are at
https://github.com/apache/hadoop-common/blob/release-2.4.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileSplit.java.
Look at the Writable method overrides (write(DataOutput) and
readFields(DataInput)) in particular to understand how to serialize
custom fields.
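Roughly, the pattern looks like this. This is a standalone sketch using plain java.io rather than the real Hadoop classes (in practice you would extend org.apache.hadoop.mapreduce.lib.input.FileSplit or implement InputSplit and Writable); the class name, fields, and hostnames below are made up for illustration:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Sketch of the Writable serialization pattern FileSplit uses: the split
// carries the file path plus any custom fields, and write()/readFields()
// must serialize and deserialize them symmetrically, in the same order.
public class NetcdfSplitSketch {
    private String path;   // the file path belongs here, NOT in getLocations()
    private long offset;   // example custom field
    private long length;   // example custom field

    // Writable implementations need a no-arg constructor for deserialization.
    public NetcdfSplitSketch() {}

    public NetcdfSplitSketch(String path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }

    // Equivalent of Writable.write(DataOutput): write out every custom field.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(path);
        out.writeLong(offset);
        out.writeLong(length);
    }

    // Equivalent of Writable.readFields(DataInput): read back in the same order.
    public void readFields(DataInput in) throws IOException {
        path = in.readUTF();
        offset = in.readLong();
        length = in.readLong();
    }

    // Equivalent of InputSplit.getLocations(): return hostnames of nodes
    // holding the data, never the file path. Hostnames here are placeholders.
    public String[] getLocations() {
        return new String[] { "datanode-1", "datanode-2" };
    }

    public String getPath() { return path; }
    public long getOffset() { return offset; }
    public long getLength() { return length; }
}
```

The key point is that write() and readFields() must mirror each other exactly, since the framework serializes the split in the AM and reconstructs it in the task via the no-arg constructor plus readFields().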

On Thu, Apr 10, 2014 at 9:54 PM, Patcharee Thongtra
<patcharee.thong...@uni.no> wrote:
> Hi,
>
> I wrote a custom InputFormat and InputSplit to handle netcdf files, which I
> use with a custom Pig Load function. When I submitted a job by running a Pig
> script, I got the error below. From the error log, the network location name
> is "hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02" -
> my input file path - which contains "/", and Hadoop does not allow that.
>
> It could be that something is missing in my custom InputFormat and InputSplit.
> Any ideas? Any help is appreciated,
>
> Patcharee
>
>
> 2014-04-10 17:09:01,854 INFO [CommitterEvent Processor #0]
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing
> the event EventType: JOB_SETUP
>
> 2014-04-10 17:09:01,918 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl:
> job_1387474594811_0071Job Transitioned from SETUP to RUNNING
>
> 2014-04-10 17:09:01,982 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.yarn.util.RackResolver: Resolved
> hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 to
> /default-rack
>
> 2014-04-10 17:09:01,984 FATAL [AsyncDispatcher event handler]
> org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.IllegalArgumentException: Network location name contains /:
> hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02
>     at org.apache.hadoop.net.NodeBase.set(NodeBase.java:87)
>     at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:65)
>     at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:111)
>     at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.<init>(TaskAttemptImpl.java:548)
>     at org.apache.hadoop.mapred.MapTaskAttemptImpl.<init>(MapTaskAttemptImpl.java:47)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.MapTaskImpl.createAttempt(MapTaskImpl.java:62)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAttempt(TaskImpl.java:594)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAndScheduleAttempt(TaskImpl.java:581)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.access$1300(TaskImpl.java:100)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:871)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:866)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:632)
>     at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:99)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1237)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)
>     at java.lang.Thread.run(Thread.java:662)
> 2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.



-- 
Harsh J
