[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2014-04-14 Thread Edwin Chiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edwin Chiu updated HIVE-5235:
-

Affects Version/s: 0.12.0

> Infinite loop with ORC file and Hive 0.11
> -----------------------------------------
>
> Key: HIVE-5235
> URL: https://issues.apache.org/jira/browse/HIVE-5235
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0, 0.12.0
> Environment: Gentoo linux with Hortonworks Hadoop 
> hadoop-1.1.2.23.tar.gz and Apache Hive 0.11d
>Reporter: Iván de Prado
>Priority: Blocker
> Attachments: gendata.py
>
>
> We are using Hive 0.11 with the ORC file format, and some tasks get stuck in 
> what looks like an infinite loop. They keep running indefinitely when we set a 
> huge task expiry timeout. If we set the expiry timeout to 600 seconds, the 
> tasks fail for not reporting progress, and finally the job fails. 
> The behavior is not consistent: it sometimes changes between executions of 
> the same job, and it happens for different queries.
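
The expiry behavior described above can be modeled with a small sketch (this is an illustration, not Hadoop's actual tracker code; in Hadoop 1.x the real knob is `mapred.task.timeout`, in milliseconds). It shows why a huge timeout lets a stuck task spin forever while a 600 s timeout fails it:

```python
import time

def supervise(step, timeout_s=600.0):
    """Toy model of task-expiry: run step() repeatedly; if the task stops
    reporting progress for timeout_s seconds, fail the attempt.
    step() returns True (progress), False (no progress), or None (done)."""
    last_report = time.monotonic()
    while True:
        progressed = step()          # one unit of task work
        if progressed is None:       # task finished normally
            return
        if progressed:
            last_report = time.monotonic()
        elif time.monotonic() - last_report > timeout_s:
            raise TimeoutError("Task failed to report status")

# A step that never makes progress models the stuck decompression loop:
# with timeout_s=600 the attempt is killed after ~10 minutes; with a huge
# timeout it burns 100% CPU indefinitely, matching the observed behavior.
```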
> We are using Hive 0.11 with Hadoop hadoop-1.1.2.23 from Hortonworks. The 
> blocked task keeps consuming 100% CPU, and the stack trace is consistently 
> the same. Everything points to some kind of infinite loop. My guess is that 
> it is related to the ORC file: maybe a pointer was written incorrectly, 
> causing an infinite loop when reading, or maybe there is a bug in the 
> reading stage.
> More information below. The stack trace:
> {noformat}
> "main" prio=10 tid=0x7f20a000a800 nid=0x1ed2 runnable [0x7f20a8136000]
>    java.lang.Thread.State: RUNNABLE
>   at java.util.zip.Inflater.inflateBytes(Native Method)
>   at java.util.zip.Inflater.inflate(Inflater.java:256)
>   - locked <0xf42a6ca0> (a java.util.zip.ZStreamRef)
>   at org.apache.hadoop.hive.ql.io.orc.ZlibCodec.decompress(ZlibCodec.java:64)
>   at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:128)
>   at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:143)
>   at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVulong(SerializationUtils.java:54)
>   at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readVslong(SerializationUtils.java:65)
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.readValues(RunLengthIntegerReader.java:66)
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReader.next(RunLengthIntegerReader.java:81)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:332)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:802)
>   at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1214)
>   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:71)
>   at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:46)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>   at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:300)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>   - eliminated <0xe1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>   - locked <0xe1459700> (a org.apache.hadoop.mapred.MapTask$TrackedRecordReader)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1178)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {noformat}
> We have seen the same stack trace repeatedly for several executions of jstack.
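
The trace bottoms out in `SerializationUtils.readVulong`, which loops pulling bytes from the decompressed stream. A hypothetical sketch (not Hive's source; names and the guard are illustrative) of that varint decode shows how malformed input can make such a loop spin unless the reader bounds it:

```python
def read_vulong(next_byte):
    """Decode one base-128 varint, low 7 bits first (the encoding ORC's
    readVulong consumes). next_byte() returns the next byte, or -1 at EOF."""
    result, shift = 0, 0
    while True:
        b = next_byte()
        if b < 0:                      # end of stream mid-value
            raise EOFError("truncated varint")
        result |= (b & 0x7F) << shift  # accumulate 7 payload bits
        if b < 0x80:                   # high bit clear: final byte
            return result
        shift += 7
        if shift > 63:                 # guard: malformed input cannot spin
            raise ValueError("varint too long")

# Decoding 300, encoded as the two bytes 0xAC 0x02:
data = iter([0xAC, 0x02])
value = read_vulong(lambda: next(data, -1))  # value == 300
```

If the compressed-stream layer below this ever returned data without advancing its position, the enclosing read loop would never terminate, which would match the 100% CPU symptom reported above.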
> The log file for this kind of

[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2013-10-08 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5235:
-

Attachment: gendata.py


[jira] [Updated] (HIVE-5235) Infinite loop with ORC file and Hive 0.11

2013-09-13 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-5235:
-

Summary: Infinite loop with ORC file and Hive 0.11  (was: Infinite loop 
with ORC file and Hive 0.1)
