I did some research on that issue.
The problem is in the ValidCompactorTxnList::isTxnRangeValid method:
<https://github.com/apache/hive/blob/release-1.2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/ValidCompactorTxnList.java>

Here's the code:

@Override
public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
  if (highWatermark < minTxnId) {
    return RangeResponse.NONE;
  } else if (minOpenTxn < 0) {
    return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  } else {
    return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  }
}


In my case this method returned RangeResponse.NONE for most of the delta files.
With that value a delta file is not included in compaction.

The last 'else' block compares minOpenTxn to maxTxnId and, if maxTxnId is
bigger, returns RangeResponse.NONE. That's a problem for me because I'm using
the Storm Hive Bolt: the bolt acquires a transaction and keeps it open with
heartbeats until there's data to commit.
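
To illustrate, here's a standalone sketch of that branch logic (my own toy
class, not the Hive one; the transaction ids are hypothetical, picked to match
the delta directory names quoted below):

// Standalone sketch, not the Hive class. Transaction ids are hypothetical,
// chosen to resemble the delta directory names further down in this thread.
public class CompactorRangeSketch {

  enum RangeResponse { NONE, SOME, ALL }

  static RangeResponse isTxnRangeValid(long highWatermark, long minOpenTxn,
                                       long minTxnId, long maxTxnId) {
    if (highWatermark < minTxnId) {
      return RangeResponse.NONE;
    } else if (minOpenTxn < 0) {
      // no open transactions at all
      return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
    } else {
      // a single open transaction below the delta's range is enough for NONE
      return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
    }
  }

  public static void main(String[] args) {
    long highWatermark = 72_100_000L;  // hypothetical
    long minOpenTxn = 71_741_000L;     // hypothetical long-lived transaction held open by the bolt

    // delta_71741256_71741355 comes back NONE, and so does every later delta:
    System.out.println(isTxnRangeValid(highWatermark, minOpenTxn, 71_741_256L, 71_741_355L));
    System.out.println(isTxnRangeValid(highWatermark, minOpenTxn, 72_048_756L, 72_048_855L));
  }
}

As long as minOpenTxn stays below the oldest delta's maxTxnId, every range
comes back NONE, which matches what I see from the Initiator.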

So if I acquire a transaction and keep it open, all compactions stop. Is this
incorrect Hive behavior, or should Storm close the transaction?
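
For reference, here's roughly what the transaction-batch lifecycle looks like
on the client side of the Hive Streaming API (just a sketch; the metastore URI,
table, columns and partition value are made up). The question is whether the
bolt should close() an idle batch instead of only sending heartbeat():

import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class StreamingLifecycleSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder endpoint: metastore URI, table, columns and partition value are made up.
    HiveEndPoint endPoint = new HiveEndPoint("thrift://metastore-host:9083",
        "default", "data_aaa", Arrays.asList("20160726"));
    StreamingConnection conn = endPoint.newConnection(true);
    DelimitedInputWriter writer =
        new DelimitedInputWriter(new String[] {"col1", "col2"}, ",", endPoint);

    TransactionBatch batch = conn.fetchTransactionBatch(100, writer);
    batch.beginNextTransaction();
    batch.write("a,b".getBytes());
    batch.commit();

    // While the batch is only heartbeated, its remaining transactions stay open,
    // so minOpenTxn does not advance (which is my problem above).
    batch.heartbeat();

    // Closing the batch releases its unused transactions.
    batch.close();
    conn.close();
  }
}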




On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:

> Thanks for the reply, Alan. My guess about Storm was wrong. Today I got
> the same behavior with the Storm topology running.
> Anyway, I'd like to know: how can I check that the transaction batch was
> closed correctly?
>
> On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:
>
>> I don’t know the details of how the storm application that streams into
>> Hive works, but this sounds like the transaction batches weren’t getting
>> closed.  Compaction can’t happen until those batches are closed.  Do you
>> know how you had storm configured?  Also, you might ask separately on the
>> storm list to see if people have seen this issue before.
>>
>> Alan.
>>
>> > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
>> >
>> > One more thing. I'm using Apache Storm to stream data into Hive. And
>> when I turned off the Storm topology, compactions started to work properly.
>> >
>> > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com>
>> wrote:
>> > I'm using a Hive 1.2.1 transactional table, inserting data into it via
>> the Hive Streaming API. After some time I expected compaction to start,
>> but it didn't happen:
>> >
>> > Here's the part of the log which shows that the compactor initiator
>> thread doesn't see any delta files:
>> > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
>> (Initiator.java:run(89)) - Checking to see if we should compact
>> default.data_aaa.dt=20160726
>> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
>> (AcidUtils.java:getAcidState(432)) - in directory hdfs://
>> sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
>> base = null deltas = 0
>> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
>> (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
>> threshold: 0.1 will major compact: false
>> >
>> > But in that directory there are actually 23 items:
>> >
>> > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
>> > Found 23 items
>> > -rw-r--r--   3 storm hdfs          4 2016-07-26 17:20
>> /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:22
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:23
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:25
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:26
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:27
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:29
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:30
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:32
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:33
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:34
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:36
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:37
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:39
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:40
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:41
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:43
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:44
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:46
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:47
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:48
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
>> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
>> >
>> > Full log here.
>> >
>> > What could go wrong?
>> >
>>
>>
>
