[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11
[ https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edwin Chiu updated HIVE-5235:
-----------------------------

    Affects Version/s: 0.12.0

> Infinite loop with ORC file and Hive 0.11
> -----------------------------------------
>
>                 Key: HIVE-5235
>                 URL: https://issues.apache.org/jira/browse/HIVE-5235
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0, 0.12.0
>         Environment: Gentoo linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
>            Reporter: Iván de Prado
>            Priority: Blocker
>         Attachments: gendata.py
>
>
> We are using Hive 0.11 with the ORC file format, and some tasks get blocked in
> some kind of infinite loop. They keep working indefinitely when we set a huge
> task expiry timeout. If we set the expiry time to 600 seconds, the tasks fail
> because they do not report progress, and finally the job fails.
> The behavior is not consistent, and it sometimes changes between job
> executions. It happens for different queries.
> We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The task
> that is blocked keeps consuming 100% CPU, and the stack trace is always the
> same. Everything points to some kind of infinite loop. My guess is that it is
> related to the ORC file. Maybe some pointer is written incorrectly, which
> causes an infinite loop when reading. Or maybe there is a bug in the reading
> stage.
> More information below.
> The stack trace:
> {noformat}
> "main" prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000]
>    java.lang.Thread.State: RUNNABLE
>     at java.util.zip.Inflater.inflateBytes(Native Method)
>     at java.util.zip.Inflater.inflate(Inflater.java:256)
>     - locked <0xf42a6ca0> (a java.util.zip.ZStreamRef)
>     at org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
>     at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128)
>     at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143)
>     at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54)
>     at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65)
>     at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66)
>     at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>     - eliminated <0xe1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>     - locked <0xe1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {noformat}
> We have seen the same stack trace repeatedly for several executions of jstack.
> The log file for this kind of
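The trace is pinned inside `Inflater.inflate`, called from `ZlibCodec.decompress`. The report does not identify the cause, so the following is only a sketch of one general way a decompression loop around `java.util.zip.Inflater` can spin: if `inflate` returns 0 while `finished()`, `needsInput()` and `needsDictionary()` all stay false (for example, when the output buffer has no room left), a loop conditioned only on those flags never exits. `GuardedInflate` and its methods are hypothetical names used for illustration; this is not Hive code, and nothing here is claimed to be the actual defect behind HIVE-5235.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration, not taken from Hive: an inflate loop that
// fails fast instead of spinning when no progress is possible.
final class GuardedInflate {

    /** Inflates zlib-wrapped data into out and returns the number of bytes written. */
    static int inflate(byte[] compressed, byte[] out) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            int written = 0;
            while (!isTerminal(inflater)) {
                int n = inflater.inflate(out, written, out.length - written);
                if (n == 0 && !isTerminal(inflater)) {
                    // Nothing was produced and no flag changed. Without this
                    // check the loop would repeat the same call forever.
                    throw new IllegalStateException(
                        "inflate made no progress at output offset " + written);
                }
                written += n;
            }
            // Note: a truncated stream ends with needsInput() == true and
            // finished() == false; a real caller would want to check for that.
            return written;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end(); // release native zlib state
        }
    }

    /** True when the Inflater reports a state in which the loop should stop. */
    private static boolean isTerminal(Inflater inflater) {
        return inflater.finished() || inflater.needsInput() || inflater.needsDictionary();
    }

    /** Demo helper: zlib-compresses a small input in a single call. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[data.length + 64]; // sized for small demo inputs only
        int n = deflater.deflate(buf);
        deflater.end();
        return Arrays.copyOf(buf, n);
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        Arrays.fill(data, (byte) 'x');
        byte[] compressed = deflate(data);

        // Large enough output buffer: the stream finishes normally.
        System.out.println("inflated bytes: " + inflate(compressed, new byte[2000]));

        // Output buffer too small: the guard turns a would-be spin into an error.
        try {
            inflate(compressed, new byte[10]);
        } catch (IllegalStateException e) {
            System.out.println("guard tripped: " + e.getMessage());
        }
    }
}
```

A guard like this does not fix a corrupt file or a reader bug, but it turns a task that burns 100% CPU until the expiry timeout into an immediate failure with an offset that can be investigated.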
[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11
[ https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated HIVE-5235:
-----------------------------

    Attachment: gendata.py

> Infinite loop with ORC file and Hive 0.11
> -----------------------------------------
>
>                 Key: HIVE-5235
>                 URL: https://issues.apache.org/jira/browse/HIVE-5235
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>         Environment: Gentoo linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
>            Reporter: Iván de Prado
>            Priority: Blocker
>         Attachments: gendata.py
[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11
[ https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated HIVE-5235:
-----------------------------

    Summary: Infinite loop with ORC file and Hive 0.11  (was: Infinite loop with ORC file and Hive 0.1)

> Infinite loop with ORC file and Hive 0.11
> -----------------------------------------
>
>                 Key: HIVE-5235
>                 URL: https://issues.apache.org/jira/browse/HIVE-5235
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>         Environment: Gentoo linux with Hortonworks Hadoop hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
>            Reporter: Iván de Prado
>            Priority: Blocker