[ 
https://issues.apache.org/jira/browse/HAMA-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661208#comment-14661208
 ] 

Edward J. Yoon commented on HAMA-963:
-------------------------------------

{code}
I wrote a very small Hama program to isolate the problem and tested it on a 
YARN cluster running on my laptop:

public final class BSPTest extends BSP<LongWritable, Text, LongWritable, Text, Text> {

    @Override
    public final void bsp(BSPPeer<LongWritable, Text, LongWritable, Text, Text> peer)
            throws IOException, InterruptedException, SyncException {
        LongWritable key = new LongWritable();
        Text value = new Text();
        peer.readNext(key,value);
        peer.write(key,value);
    }

    public static void main ( String[] args ) throws Exception {
        HamaConfiguration conf = new HamaConfiguration();
        conf.set("yarn.resourcemanager.address","localhost:8032");
        YARNBSPJob job = new YARNBSPJob(conf);
        job.setMemoryUsedPerTaskInMb(500);
        job.setNumBspTask(4);
        job.setJobName("test");
        job.setBspClass(BSPTest.class);
        job.setJarByClass(BSPTest.class);
        job.setInputKeyClass(LongWritable.class);
        job.setInputValueClass(Text.class);
        job.setInputPath(new Path("in"));
        job.setInputFormat(TextInputFormat.class);
        job.setPartitioner(org.apache.hama.bsp.HashPartitioner.class);
        job.set("bsp.min.split.size",Long.toString(1000));
        job.setOutputPath(new Path("out"));
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormat(TextOutputFormat.class);
        job.waitForCompletion(true);
    }
}

where "in" is a small text file stored on HDFS. The job partitions the input 
into 4 files, but then it fails with the same error:

15/07/26 06:46:25 INFO ipc.Server: IPC Server handler 0 on 10000, call getTask(attempt_appattempt_1437858941768_0042_000001_0000_000004_4) from 127.0.0.1:54752: error: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 4
    at org.apache.hama.bsp.ApplicationMaster.getTask(ApplicationMaster.java:950)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hama.ipc.RPC$Server.call(RPC.java:615)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1211)
    at org.apache.hama.ipc.Server$Handler$1.run(Server.java:1207)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)

I get the same error even when I remove the partitioning and use a single task.
{code}

> ArrayIndexOutOfBoundsException occurs when tasks are greater than splits
> ------------------------------------------------------------------------
>
>                 Key: HAMA-963
>                 URL: https://issues.apache.org/jira/browse/HAMA-963
>             Project: Hama
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Edward J. Yoon
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> ArrayIndexOutOfBoundsException occurs when the number of tasks is greater 
> than the number of input splits, at line 950 of ApplicationMaster:
> {code}
>       assignedSplit = splits[taskid.id];
> {code}
> There are two options:
> Option 1: launch the additional tasks without an input split.
> Option 2: adjust the number of tasks to match the number of input splits.
> I prefer option 1.
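
The two options in the issue description can be illustrated with a minimal, standalone sketch. This is not the actual ApplicationMaster code: the class and method names (`SplitAssignmentSketch`, `splitFor`, `clampTaskCount`) are hypothetical, and plain strings stand in for input splits. It only shows the bounds check that the unguarded `splits[taskid.id]` lookup lacks.

```java
// Hypothetical sketch; none of these names exist in Hama.
final class SplitAssignmentSketch {

    // Option 1: a task whose id falls outside the split array
    // is still launched, but receives no input split (null).
    static <T> T splitFor(T[] splits, int taskId) {
        if (taskId < 0 || taskId >= splits.length) {
            return null;
        }
        return splits[taskId];
    }

    // Option 2: never launch more tasks than there are input splits.
    static int clampTaskCount(int requestedTasks, int numSplits) {
        return Math.min(requestedTasks, numSplits);
    }

    public static void main(String[] args) {
        // Two splits, four requested tasks: the situation described in the issue.
        String[] splits = {"split-0", "split-1"};
        for (int taskId = 0; taskId < 4; taskId++) {
            String assigned = splitFor(splits, taskId);
            System.out.println("task " + taskId + " -> "
                    + (assigned == null ? "no input split" : assigned));
        }
        System.out.println("adjusted task count: " + clampTaskCount(4, splits.length));
    }
}
```

With option 1, callers of such a lookup would have to tolerate a missing split; with option 2, the requested task count is silently reduced, which may surprise a job that relies on a fixed number of peers.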


