[ https://issues.apache.org/jira/browse/HAMA-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661208#comment-14661208 ]
Edward J. Yoon commented on HAMA-963:
-------------------------------------
I wrote a very small Hama program and ran it on a YARN cluster on my
laptop to isolate the problem:
{code}
public final class BSPTest
    extends BSP<LongWritable, Text, LongWritable, Text, Text> {

  @Override
  public final void bsp(BSPPeer<LongWritable, Text, LongWritable, Text, Text> peer)
      throws IOException, InterruptedException, SyncException {
    LongWritable key = new LongWritable();
    Text value = new Text();
    peer.readNext(key, value);
    peer.write(key, value);
  }

  public static void main(String[] args) throws Exception {
    HamaConfiguration conf = new HamaConfiguration();
    conf.set("yarn.resourcemanager.address", "localhost:8032");
    YARNBSPJob job = new YARNBSPJob(conf);
    job.setMemoryUsedPerTaskInMb(500);
    job.setNumBspTask(4);
    job.setJobName("test");
    job.setBspClass(BSPTest.class);
    job.setJarByClass(BSPTest.class);
    job.setInputKeyClass(LongWritable.class);
    job.setInputValueClass(Text.class);
    job.setInputPath(new Path("in"));
    job.setInputFormat(TextInputFormat.class);
    job.setPartitioner(org.apache.hama.bsp.HashPartitioner.class);
    job.set("bsp.min.split.size", Long.toString(1000));
    job.setOutputPath(new Path("out"));
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    job.setOutputFormat(TextOutputFormat.class);
    job.waitForCompletion(true);
  }
}
{code}
where "in" is a small text file stored on HDFS. The job partitions the file
into 4 files, but then fails with the same error:
{code}
15/07/26 06:46:25 INFO ipc.Server: IPC Server handler 0 on 10000, call getTask(attempt_appattempt_1437858941768_0042_000001_0000_000004_4) from 127.0.0.1:54752: error: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
    at org.apache.hama.bsp.ApplicationMaster.getTask(ApplicationMaster.java:950)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hama.ipc.RPC$Server.call(RPC.java:615)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1211)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1207)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
{code}
I get the same error even when I remove the partitioning and use 1 task.
> ArrayIndexOutOfBoundsException occurs when tasks are greater than splits
> ------------------------------------------------------------------------
>
> Key: HAMA-963
> URL: https://issues.apache.org/jira/browse/HAMA-963
> Project: Hama
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Edward J. Yoon
> Priority: Blocker
> Fix For: 0.7.1
>
>
> ArrayIndexOutOfBoundsException occurs at line 950 of ApplicationMaster when the
> number of tasks is greater than the number of splits:
> {code}
> assignedSplit = splits[taskid.id];
> {code}
> There are two options:
> Option 1. Launch the additional tasks without an input split.
> Option 2. Adjust the number of tasks to match the number of input splits.
> I prefer option 1.
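
For illustration only, the two options could be sketched roughly as below. This is not the actual ApplicationMaster code: the class name SplitAssignmentSketch, the methods splitForTask and adjustedTaskCount, and the use of plain strings in place of real split objects are all hypothetical. The only point is that the array lookup is guarded by a bounds check (Option 1), or that the task count is capped before any task is launched (Option 2).

```java
import java.util.Optional;

/** Hypothetical sketch of the two options; not the real ApplicationMaster code. */
final class SplitAssignmentSketch {

    /**
     * Option 1: a task whose id has no matching split gets no input
     * instead of triggering ArrayIndexOutOfBoundsException.
     * Splits are modelled as plain strings here (an assumption).
     */
    static Optional<String> splitForTask(String[] splits, int taskId) {
        if (splits == null || taskId < 0 || taskId >= splits.length) {
            return Optional.empty();
        }
        return Optional.of(splits[taskId]);
    }

    /** Option 2: never launch more tasks than there are input splits. */
    static int adjustedTaskCount(int requestedTasks, int numSplits) {
        return Math.min(requestedTasks, numSplits);
    }

    public static void main(String[] args) {
        String[] splits = {"part-0", "part-1", "part-2", "part-3"};

        // Task id 4 is out of range for 4 splits, as in the reported trace.
        for (int taskId = 0; taskId <= 4; taskId++) {
            String assigned = splitForTask(splits, taskId).orElse("(no input split)");
            System.out.println("task " + taskId + " -> " + assigned);
        }

        System.out.println("adjusted tasks: " + adjustedTaskCount(6, splits.length));
    }
}
```

With Option 1 the extra tasks still start and can take part in messaging and synchronization, whereas Option 2 silently changes the number of tasks the user asked for.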