[ https://issues.apache.org/jira/browse/CRUNCH-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomáš Čechal updated CRUNCH-577:
--------------------------------
    Attachment: 0001-CRUNCH-577-Use-getLongBytes-to-correctly-parse-dfs-b.patch

One-liner that fixes the problem.

> NumberFormatException when parsing dfs.block.size
> -------------------------------------------------
>
>                 Key: CRUNCH-577
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-577
>             Project: Crunch
>          Issue Type: Bug
>          Components: IO
>    Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.8.2, 0.10.0, 0.8.3, 0.8.4, 0.11.0, 0.12.0
>            Reporter: Tomáš Čechal
>            Priority: Minor
>              Labels: patch
>         Attachments: 0001-CRUNCH-577-Use-getLongBytes-to-correctly-parse-dfs-b.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When a file size abbreviation (like "128m") is used for the HDFS
> configuration property "dfs.block.size", the Crunch job crashes with a
> NumberFormatException. According to the Hadoop documentation
> (https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml),
> this style of abbreviation should be supported.
> The problem occurs at line 38 in CrunchCombineFileInputFormat.java, where the
> configuration property is parsed using the getLong() method instead of the
> getLongBytes() method. Furthermore, the obsolete configuration key
> "dfs.block.size" is used instead of "dfs.blocksize" (see
> https://issues.apache.org/jira/browse/HDFS-631), which leads to a warning
> message being emitted when starting an MR pipeline.
> The proposed solution, discussed on the crunch-users mailing list
> (http://mail-archives.apache.org/mod_mbox/crunch-user/201511.mbox/browser), is
> to use the getLongBytes() method and the new config key "dfs.blocksize".

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
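
For illustration only, the failure described in the issue can be reproduced without Hadoop on the classpath. The sketch below is not the attached patch and not Hadoop's implementation: the class BlockSizeParsing and the helper parseBytes are hypothetical names that only imitate the kind of suffix-aware parsing getLongBytes() is described as providing, and the set of accepted suffixes (k, m, g, t as powers of 1024) is an assumption made for this example.

```java
import java.util.Locale;

// Hypothetical stand-alone illustration; not part of Crunch or Hadoop.
final class BlockSizeParsing {

    // Assumed behaviour: an optional binary suffix (k, m, g, t) scales the
    // number by a power of 1024; a value without a suffix is a plain long.
    static long parseBytes(String value) {
        String s = value.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new NumberFormatException("empty size string");
        }
        int shift;
        switch (s.charAt(s.length() - 1)) {
            case 'k': shift = 10; break;
            case 'm': shift = 20; break;
            case 'g': shift = 30; break;
            case 't': shift = 40; break;
            default:  return Long.parseLong(s); // no suffix present
        }
        long number = Long.parseLong(s.substring(0, s.length() - 1));
        // multiplyExact throws ArithmeticException instead of overflowing silently.
        return Math.multiplyExact(number, 1L << shift);
    }

    // Returns true when a plain numeric parse rejects the value, which is
    // the kind of failure the issue reports for "128m".
    static boolean plainParseFails(String value) {
        try {
            Long.parseLong(value);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("plain parse fails for 128m: " + plainParseFails("128m"));
        System.out.println("suffix-aware parse of 128m: " + parseBytes("128m"));
    }
}
```

In Crunch itself the change proposed in the issue is smaller: read the "dfs.blocksize" key with getLongBytes() rather than reading "dfs.block.size" with getLong().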