Hive is doing the right thing there, as it cannot compact the deltas into a 
base file while there are still open transactions in the delta.  Storm should 
be committing on some frequency even if it doesn’t have enough data to commit.

Alan.

> On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote:
> 
> I made some research on that issue.
> The problem is in ValidCompactorTxnList::isTxnRangeValid method.
> 
> Here's code:
> @Override
> public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
>   if (highWatermark < minTxnId) {
>     return RangeResponse.NONE;
>   } else if (minOpenTxn < 0) {
>     return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>   } else {
>     return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
>   }
> }
> 
> In my case this method returned RangeResponce.NONE for most of delta files. 
> With this value delta file doesn't include in compaction.
> 
> Last 'else' bock compare minOpenTxn to maxTxnId and if maxTxnId bigger return 
> RangeResponce.NONE, thats a problem for me, because of using Storm Hive Bolt. 
> Hive Bolt gets transaction and maintain it open with heartbeat until there's 
> data to commit.
> 
> So if i get transaction and maintain it open all compactions will stop. Is it 
> incorrect Hive behavior, or Storm should close transaction?
> 
> 
> 
> 
> On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> Thanks for reply, Alan. My guess with Storm was wrong. Today I get same 
> behavior with running Storm topology. 
> Anyway, I'd like to know, how can I check that transaction batch was closed 
> correctly?
> 
> On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:
> I don’t know the details of how the storm application that streams into Hive 
> works, but this sounds like the transaction batches weren’t getting closed.  
> Compaction can’t happen until those batches are closed.  Do you know how you 
> had storm configured?  Also, you might ask separately on the storm list to 
> see if people have seen this issue before.
> 
> Alan.
> 
> > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > One more thing. I'm using Apache Storm to stream data in Hive. And when I 
> > turned off Storm topology compactions started to work properly.
> >
> > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive 
> > Streaming API. After some time i expect compaction to start but it didn't 
> > happen:
> >
> > Here's part of log, which shows that compactor initiator thread doesn't see 
> > any delta files:
> > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator 
> > (Initiator.java:run(89)) - Checking to see if we should compact 
> > default.data_aaa.dt=20160726
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils 
> > (AcidUtils.java:getAcidState(432)) - in directory 
> > hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
> >  base = null deltas = 0
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator 
> > (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0 
> > threshold: 0.1 will major compact: false
> >
> > But in that directory there's actually 23 files:
> >
> > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > Found 23 items
> > -rw-r--r--   3 storm hdfs          4 2016-07-26 17:20 
> > /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:22 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:23 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:25 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:26 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:27 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:29 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:30 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:32 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:33 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:34 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:36 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:37 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:39 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:40 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:41 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:43 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:44 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:46 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:47 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:48 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> > drwxrwxrwx   - storm hdfs          0 2016-07-26 17:50 
> > /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
> >
> > Full log here.
> >
> > What could go wrong?
> >
> 
> 
> 

Reply via email to