[ https://issues.apache.org/jira/browse/HDDS-2359?focusedWorklogId=339153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-339153 ]

ASF GitHub Bot logged work on HDDS-2359:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Nov/19 05:26
            Start Date: 06/Nov/19 05:26
    Worklog Time Spent: 10m 
      Work Description: bharatviswa504 commented on pull request #82: HDDS-2359. Seeking randomly in a key with more than 2 blocks of data leads to inconsistent reads
URL: https://github.com/apache/hadoop-ozone/pull/82

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 339153)
    Time Spent: 20m  (was: 10m)

> Seeking randomly in a key with more than 2 blocks of data leads to inconsistent reads
> -------------------------------------------------------------------------------------
>
>                 Key: HDDS-2359
>                 URL: https://issues.apache.org/jira/browse/HDDS-2359
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Istvan Fajth
>            Assignee: Shashikant Banerjee
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.5.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> During Hive testing we found the following exception:
> {code}
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1569246922012_0214_1_03_000000_3:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: error iterating
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>     at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>     at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: error iterating
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>     ... 16 more
> Caused by: java.io.IOException: java.io.IOException: error iterating
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:366)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>     at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>     ... 18 more
> Caused by: java.io.IOException: error iterating
>     at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:835)
>     at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:74)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
>     ... 24 more
> Caused by: java.io.IOException: Error reading file: o3fs://hive.warehouse.vc0136.halxg.cloudera.com:9862/data/inventory/delta_0000001_0000001_0000/bucket_00000
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1283)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:156)
>     at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:150)
>     at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader$1.next(VectorizedOrcAcidRowBatchReader.java:146)
>     at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:831)
>     ... 26 more
> Caused by: java.io.IOException: Inconsistent read for blockID=conID: 2 locID: 102851451236759576 bcsId: 14608 length=26398272 numBytesRead=6084153
>     at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:176)
>     at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:52)
>     at org.apache.hadoop.fs.FSInputStream.read(FSInputStream.java:75)
>     at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at org.apache.orc.impl.RecordReaderUtils.readDiskRanges(RecordReaderUtils.java:557)
>     at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readFileData(RecordReaderUtils.java:276)
>     at org.apache.orc.impl.RecordReaderImpl.readPartialDataStreams(RecordReaderImpl.java:1189)
>     at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1057)
>     at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1208)
>     at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1243)
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1279)
>     ... 30 more
> {code}
> Evaluating the code path, the issue is the following:
> given a file with more than 2 blocks of data,
> when there are random seeks in the file, to the end and then back to the beginning,
> then the read fails with the final cause shown in the exception above.
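> To make the reproduction concrete, here is a minimal sketch through the Hadoop FileSystem API. The o3fs URI and buffer size are made up for illustration; any key spanning more than 2 blocks should trigger it:
> {code}
> import java.util.Arrays;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataInputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class MultiBlockSeekRepro {
>   public static void main(String[] args) throws Exception {
>     // Hypothetical URI; substitute any key larger than 2 blocks.
>     Path path = new Path("o3fs://bucket.volume.host:9862/data/bigkey");
>     FileSystem fs = path.getFileSystem(new Configuration());
>     long len = fs.getFileStatus(path).getLen();
>
>     byte[] first = new byte[1024];
>     byte[] again = new byte[1024];
>     try (FSDataInputStream in = fs.open(path)) {
>       in.readFully(first);   // read the beginning of the key
>       in.seek(len - 1);      // random seek into the last block
>       in.read();             // force the last block's stream to open
>       in.seek(0);            // seek back to the beginning
>       in.readFully(again);   // re-read the same range
>     }
>     // On an affected build the second read either throws
>     // "Inconsistent read for blockID=... length=... numBytesRead=..."
>     // or returns different bytes than the first read.
>     System.out.println(Arrays.equals(first, again)
>         ? "reads consistent" : "inconsistent read");
>   }
> }
> {code}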
> [~shashikant] already has a solution for this issue, which we have
> successfully tested internally with Hive; I am assigning this JIRA to him so
> he can post the PR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
