[ https://issues.apache.org/jira/browse/MAPREDUCE-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shilun Fan updated MAPREDUCE-6996:
----------------------------------
    Target Version/s: 3.5.0  (was: 3.4.0)

> FileInputFormat#getBlockIndex should include file name in the exception.
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6996
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6996
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Rushabh Shah
>            Priority: Minor
>              Labels: newbie++
>
> {code:title=FileInputFormat.java|borderStyle=solid}
>   // Some comments here
>   protected int getBlockIndex(BlockLocation[] blkLocations,
>                               long offset) {
>     ...
>     ...
>     BlockLocation last = blkLocations[blkLocations.length - 1];
>     long fileLength = last.getOffset() + last.getLength() - 1;
>     throw new IllegalArgumentException("Offset " + offset +
>                                        " is outside of file (0.." +
>                                        fileLength + ")");
>   }
> {code}
> When the file is open for writing, {{last.getLength()}} and
> {{last.getOffset()}} are both zero, so {{fileLength}} evaluates to -1 and we
> see the following exception stack trace.
> {noformat}
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:288)
> Caused by: java.lang.IllegalArgumentException: Offset 0 is outside of file (0..-1)
>       at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:453)
>       at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:413)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
>       ... 18 more
> {noformat}
> It's difficult to tell from this message which file was open, so I am
> creating this ticket to include the file name in the exception.
> Since {{FileInputFormat#getBlockIndex}} is protected, we can't change its
> signature to add the file name as an argument.
> The only way I can think of to fix this is:
> {code:title=FileInputFormat.java|borderStyle=solid}
>   public InputSplit[] getSplits(JobConf job, int numSplits)
>     throws IOException {
>     ...
>     ...
>     for (FileStatus file: files) {
>       Path path = file.getPath();
>       long length = file.getLen();
>       if (length != 0) {
>         FileSystem fs = path.getFileSystem(job);
>         BlockLocation[] blkLocations;
>         if (file instanceof LocatedFileStatus) {
>           blkLocations = ((LocatedFileStatus) file).getBlockLocations();
>         } else {
>           blkLocations = fs.getFileBlockLocations(file, 0, length);
>         }
>         if (isSplitable(fs, path)) {
>           long blockSize = file.getBlockSize();
>           long splitSize = computeSplitSize(goalSize, minSize, blockSize);
>           long bytesRemaining = length;
>           while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
>             String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
>                 length-bytesRemaining, splitSize, clusterMap);
>             splits.add(makeSplit(path, length-bytesRemaining, splitSize,
>                 splitHosts[0], splitHosts[1]));
>             bytesRemaining -= splitSize;
>           }
>           if (bytesRemaining != 0) {
>             String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
>                 length - bytesRemaining, bytesRemaining, clusterMap);
>             splits.add(makeSplit(path, length - bytesRemaining,
>                 bytesRemaining, splitHosts[0], splitHosts[1]));
>           }
>         } else {
>           String[][] splitHosts =
>               getSplitHostsAndCachedHosts(blkLocations, 0, length, clusterMap);
>           splits.add(makeSplit(path, 0, length, splitHosts[0],
>               splitHosts[1]));
>         }
>       } else {
>         // Create empty hosts array for zero length files
>         splits.add(makeSplit(path, 0, length, new String[0]));
>       }
>     }
> {code}
> Wrap the code chunk above in a try-catch block, catch
> {{IllegalArgumentException}}, and check whether the message is {{Offset 0 is
> outside of file (0..-1)}}.
> If it is, add the file name and rethrow the {{IllegalArgumentException}}.
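
The catch-and-rethrow idea proposed above can be sketched in isolation, without Hadoop on the classpath. Everything in this sketch is a hypothetical stand-in rather than Hadoop code: the {{Block}} class replaces {{BlockLocation}}, the lookup loop in {{getBlockIndex}} is an assumed placeholder for the part elided as "..." in the issue, and {{getBlockIndexFor}} and the file paths are invented for illustration. Only the exception message format is taken from the issue text.

```java
// Hypothetical, Hadoop-free sketch of the proposed fix. None of these types
// are Hadoop classes; only the exception message format comes from the issue.
class BlockIndexSketch {

    /** Stand-in for BlockLocation: just an offset and a length. */
    static final class Block {
        final long offset;
        final long length;

        Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Stand-in for the protected getBlockIndex. The lookup loop is an assumed
     * placeholder; the final throw mirrors the snippet quoted in the issue.
     */
    static int getBlockIndex(Block[] blocks, long offset) {
        for (int i = 0; i < blocks.length; i++) {
            if (blocks[i].offset <= offset
                    && offset < blocks[i].offset + blocks[i].length) {
                return i;
            }
        }
        Block last = blocks[blocks.length - 1];
        // For a file open for writing, offset and length are both 0, so this is -1.
        long fileLength = last.offset + last.length - 1;
        throw new IllegalArgumentException("Offset " + offset
                + " is outside of file (0.." + fileLength + ")");
    }

    /**
     * Caller-side wrapper: the signature of getBlockIndex stays unchanged, and
     * the file name is appended only when the message matches the range error.
     */
    static int getBlockIndexFor(String fileName, Block[] blocks, long offset) {
        try {
            return getBlockIndex(blocks, offset);
        } catch (IllegalArgumentException e) {
            String msg = e.getMessage();
            if (msg != null && msg.contains("is outside of file")) {
                // Keep the original exception as the cause to preserve its stack trace.
                throw new IllegalArgumentException(msg + " for file " + fileName, e);
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        Block[] openFile = { new Block(0, 0) };
        try {
            getBlockIndexFor("/data/still-being-written.log", openFile, 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Matching on a fragment of the message is brittle if the wording ever changes, which is the trade-off of leaving the protected method's signature untouched; passing the original exception as the cause keeps the original frames visible under "Caused by".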