[
https://issues.apache.org/jira/browse/PIG-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Satish Subhashrao Saley updated PIG-5355:
-----------------------------------------
Attachment: PIG-5355-1.patch
> Negative progress report by HBaseTableRecordReader
> --------------------------------------------------
>
> Key: PIG-5355
> URL: https://issues.apache.org/jira/browse/PIG-5355
> Project: Pig
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Major
> Attachments: PIG-5355-1.patch
>
>
> The logic for padding the current row does not take the already padded row
> into account in the second comparison, so {{lastPadded}} can end up with a
> different length than expected. This results in a negative value for
> {{processed}}.
> {code}
> byte[] lastPadded = currRow_;
> if (currRow_.length < endRow_.length) {
>     lastPadded = Bytes.padTail(currRow_, endRow_.length - currRow_.length);
> }
> if (currRow_.length < startRow_.length) {
>     lastPadded = Bytes.padTail(currRow_, startRow_.length - currRow_.length);
> }
> byte [] prependHeader = {1, 0};
> BigInteger bigLastRow = new BigInteger(Bytes.add(prependHeader, lastPadded));
> if (bigLastRow.compareTo(bigEnd_) > 0) {
>     return progressSoFar_;
> }
> BigDecimal processed = new BigDecimal(bigLastRow.subtract(bigStart_));
> {code}
> The fix is to use {{lastPadded}} instead of {{currRow_}} in the second
> {{if}} comparison and in the {{Bytes.padTail}} call inside that {{if}}.
> PIG-4700 added progress reporting, which enabled ProgressHelper in Tez. It
> calls {{getProgress}} [here
> |https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/common/ProgressHelper.java#L50]
> on {{PigRecordReader}}
> (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigRecordReader.java#L159).
> Since Pig reports negative progress, the job is killed by the AM.
>
>
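The length mismatch described above can be reproduced with a small standalone sketch. It is not the attached patch. The class PaddingSketch and its methods are hypothetical names, the local padTail is a stand-in for HBase's Bytes.padTail (which appends zero bytes), and the sketch assumes the start row is padded to the common length before bigStart_ is computed. The sample row keys are made up so that the current row is shorter than the start row, which in turn is shorter than the end row.

```java
import java.math.BigInteger;
import java.util.Arrays;

class PaddingSketch {
    // Stand-in for HBase's Bytes.padTail: appends `length` zero bytes to `a`.
    static byte[] padTail(byte[] a, int length) {
        return Arrays.copyOf(a, a.length + length);
    }

    // Padding as quoted in the report: the second check still looks at currRow,
    // so it can overwrite the longer result of the first check.
    static byte[] padAsReported(byte[] currRow, byte[] startRow, byte[] endRow) {
        byte[] lastPadded = currRow;
        if (currRow.length < endRow.length) {
            lastPadded = padTail(currRow, endRow.length - currRow.length);
        }
        if (currRow.length < startRow.length) {
            lastPadded = padTail(currRow, startRow.length - currRow.length);
        }
        return lastPadded;
    }

    // Padding as described in the fix: the second check uses lastPadded.
    static byte[] padAsProposed(byte[] currRow, byte[] startRow, byte[] endRow) {
        byte[] lastPadded = currRow;
        if (currRow.length < endRow.length) {
            lastPadded = padTail(currRow, endRow.length - currRow.length);
        }
        if (lastPadded.length < startRow.length) {
            lastPadded = padTail(lastPadded, startRow.length - lastPadded.length);
        }
        return lastPadded;
    }

    // Prepends the {1, 0} header, as in the quoted code, and converts to BigInteger.
    static BigInteger withHeader(byte[] row) {
        byte[] out = new byte[row.length + 2];
        out[0] = 1;
        out[1] = 0;
        System.arraycopy(row, 0, out, 2, row.length);
        return new BigInteger(out);
    }

    // processed = bigLastRow - bigStart, with the start row padded to the longer
    // of the two boundary lengths (an assumption about how bigStart_ is built).
    static BigInteger processed(byte[] lastPadded, byte[] startRow, byte[] endRow) {
        int common = Math.max(startRow.length, endRow.length);
        byte[] startPadded = padTail(startRow, common - startRow.length);
        return withHeader(lastPadded).subtract(withHeader(startPadded));
    }

    public static void main(String[] args) {
        byte[] start = {1, 0};
        byte[] curr = {2};
        byte[] end = {3, 0, 0, 0};
        System.out.println("reported: " + processed(padAsReported(curr, start, end), start, end));
        System.out.println("proposed: " + processed(padAsProposed(curr, start, end), start, end));
    }
}
```

With these sample keys the reported logic leaves lastPadded two bytes long while the boundaries are four bytes long, so the subtraction goes negative. The proposed logic keeps the four-byte padding and the difference stays positive.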