[jira] [Commented] (HIVE-20593) Load Data for partitioned ACID tables fails with bucketId out of range: -1
[ https://issues.apache.org/jira/browse/HIVE-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176849#comment-17176849 ]

Bernard commented on HIVE-20593:

Hi,

Is there a workaround for this one without updating Hive? We've tried recreating the table, but we're still getting this error.

Thanks,
Bernard

> Load Data for partitioned ACID tables fails with bucketId out of range: -1
> --------------------------------------------------------------------------
>
>                 Key: HIVE-20593
>                 URL: https://issues.apache.org/jira/browse/HIVE-20593
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 3.1.0
>            Reporter: Deepak Jaiswal
>            Assignee: Deepak Jaiswal
>            Priority: Major
>             Fix For: 4.0.0, 3.2.0, 3.1.2
>
>         Attachments: HIVE-20593.1.patch, HIVE-20593.2.patch, HIVE-20593.3.patch
>
>
> Load Data for ACID tables fails to load ORC files when it is converted to an IAS job.
>
> The tempTblObj is inherited from the target table. However, the only table property which needs to be inherited is the bucketing version. Properties like transactional, etc., should be ignored.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
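[Editorial note] One workaround pattern sometimes used while LOAD DATA into ACID tables is broken is to stage the files behind an external table and let an INSERT rewrite them. This is only a hedged sketch, not verified against this bug; the table names, columns, partition key, and HDFS path below are all hypothetical:

```sql
-- Hypothetical sketch: bypass LOAD DATA by staging the ORC files behind an
-- external table, then letting Hive write them into the ACID table itself.
-- Names, columns, partition key, and path are illustrative assumptions.
CREATE EXTERNAL TABLE staging_tbl (id INT, name STRING)
STORED AS ORC
LOCATION '/tmp/staging_orc_files';

-- INSERT ... SELECT instead of LOAD DATA ... INTO TABLE ... PARTITION (...).
INSERT INTO TABLE acid_tbl PARTITION (ds = '2018-09-20')
SELECT id, name FROM staging_tbl;
```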
[jira] [Commented] (HIVE-22318) Java.io.exception:Two readers for
[ https://issues.apache.org/jira/browse/HIVE-22318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021685#comment-17021685 ]

Bernard commented on HIVE-22318:

I've observed that this bug occurs only when the destination table isn't partitioned or bucketed. Can someone confirm?

> Java.io.exception:Two readers for
> ---------------------------------
>
>                 Key: HIVE-22318
>                 URL: https://issues.apache.org/jira/browse/HIVE-22318
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: max_c
>            Priority: Major
>         Attachments: hiveserver2 for exception.log
>
>
> I created an ACID table with the ORC format:
>
> {noformat}
> CREATE TABLE `some.TableA`(
> )
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> TBLPROPERTIES (
>   'bucketing_version'='2',
>   'orc.compress'='SNAPPY',
>   'transactional'='true',
>   'transactional_properties'='default'){noformat}
> After executing a MERGE INTO operation:
> {noformat}
> MERGE INTO some.TableA AS a
> USING (SELECT vend_no FROM some.TableB
>        UNION ALL
>        SELECT vend_no FROM some.TableC) AS b
> ON a.vend_no = b.vend_no
> WHEN MATCHED THEN DELETE
> {noformat}
> the problem happened (the exception also occurs when selecting from TableA):
> {noformat}
> java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_1, 9223372036854775807)], old [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0{noformat}
> Using orc-tools I scanned all the files (bucket_0, bucket_1, bucket_2) under delete_delta and found that the rows in every file are the same. I think this causes the same key (RecordIdentifier) to be seen when bucket_1 is scanned after bucket_0, but I don't know why all the rows are the same in these bucket files.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
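[Editorial note] To inspect delete_delta contents the way the reporter describes, the ORC project's orc-tools uber jar (or Hive's bundled ORC file dumper) can print the rows of each bucket file for comparison. A hedged sketch; the jar version and the HDFS paths below are assumptions taken from the stack trace above:

```
# Dump the rows of a delete_delta bucket file so the three buckets can be
# compared. The orc-tools jar version is an illustrative assumption.
java -jar orc-tools-1.5.6-uber.jar data \
  hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0

# Equivalent inspection via Hive's bundled dumper (-d prints row data):
hive --orcfiledump -d \
  hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0
```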
[jira] [Commented] (HIVE-12718) skip.footer.line.count misbehaves on larger text files
[ https://issues.apache.org/jira/browse/HIVE-12718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859141#comment-15859141 ]

Charles Bernard commented on HIVE-12718:

We are experiencing the same issue running CDH 5.8.0. Our problem is that the wrong line (not the last one) is being skipped. Forcing one mapper only does not help.

> skip.footer.line.count misbehaves on larger text files
> ------------------------------------------------------
>
>                 Key: HIVE-12718
>                 URL: https://issues.apache.org/jira/browse/HIVE-12718
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>         Environment: The bug was discovered and reproduced on a Cloudera Hadoop 5.4 distribution running on CentOS 6.4.
>            Reporter: Gergely Nagy
>            Priority: Minor
>
> We noticed that when working on a table backed by a larger text file (large enough to require splitting), the {{skip.footer.line.count}} property of the table misbehaves: the footer is not ignored.
>
> To reproduce, follow these steps:
> 1) Create a large file: {{for i in $(seq 1 100); do cat /usr/share/dict/words; done >large.txt}}
> 2) Upload it to HDFS (e.g., as {{/tmp/words}})
> 3) Create an external table with {{skip.footer.line.count}} set:
> {quote}
> CREATE EXTERNAL TABLE ext_words (word STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE LOCATION '/tmp/words'
> tblproperties("skip.header.line.count"="1", "skip.footer.line.count"="1");
> {quote}
> 4) Count the number of times the last line (in this example, I assume that to be {{ZZZ}}) appears: {{SELECT COUNT(*) FROM ext_words WHERE word = 'ZZZ';}}
> 5) Observe that it returns 100 instead of 99.
>
> Investigation showed that this happens when more than one mapper is used for the job. When we increased the split size to force a single mapper, the problem did not occur.
> There may be other related issues as well, like the wrong line being skipped -- but we did not reproduce those yet.
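[Editorial note] The single-mapper workaround mentioned in the report can be sketched as session settings that enlarge the minimum split size so the whole file lands in one split. A hedged sketch only: the property name assumes an MR-based execution engine, and the 10 GB value is illustrative, not prescriptive:

```sql
-- Force one mapper by making the minimum split size larger than the file.
-- The value (10 GB) is an illustrative assumption.
SET mapreduce.input.fileinputformat.split.minsize=10737418240;

-- Re-run the count from the reproduction steps; with a single mapper the
-- reporter observed that the footer line is correctly skipped.
SELECT COUNT(*) FROM ext_words WHERE word = 'ZZZ';
```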
-- This message was sent by Atlassian JIRA (v6.3.15#6346)