Hive LIMIT clause slows query

2017-11-02 Thread Igor Kuzmenko
I'm using HDP 2.5.0 with Hive 1.2.1.
While performing some tests I noticed that my query runs faster if I don't use a
LIMIT clause.

My query is:

insert into table results_table partition (task_id=xxx)
select * from data_table
where dt=20171102
and ...
limit 100


This query runs in about 30 seconds, but without the LIMIT clause it takes
about 20 seconds.

Query execution plan with limit, and without.


I can't remove the LIMIT clause because in some cases there are too many results
and I don't want to store them all in the result table.
Why does LIMIT affect performance so much? Intuitively, it seems that with a
LIMIT clause the query should run faster. What can I do to improve performance?
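
One way to see where the extra ten seconds go is to compare the two plans
directly: a global LIMIT in Hive usually adds a Limit operator plus an extra
single-task stage that all selected rows funnel through. A sketch, with the
filter condition and task_id left as placeholders as in the query above:

explain
insert into table results_table partition (task_id=xxx)
select * from data_table
where dt=20171102
limit 100;

explain
insert into table results_table partition (task_id=xxx)
select * from data_table
where dt=20171102;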


Re: Hive locking mechanism on read partition.

2017-10-13 Thread Igor Kuzmenko
That's exactly what I'm looking for. Thank you, Eugene.
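
For reference, a minimal sketch of applying the property Eugene points to in the
reply quoted below. hive.txn.strict.locking.mode was added in a Hive release
newer than the 1.2.1 discussed here, so this assumes a version that ships it,
and the task_id value is only illustrative:

set hive.txn.strict.locking.mode=false;

-- with the property relaxed, the non-ACID insert should take a shared (S)
-- lock instead of an exclusive (X) one, per the quoted reply
insert into table results_table partition (task_id=5557)
select * from data_table where dt=20171102;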

On Fri, Oct 13, 2017 at 8:21 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> I don’t think there is any way for you to get rid of the table level
> shared lock (though it may be a reasonable improvement to make).
>
> You could use
> https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.txn.strict.locking.mode
>
> to change the X lock on write to an S lock to get around this, but this may not
> be appropriate for the rest of your logic.
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Friday, October 13, 2017 at 2:16 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive locking mechanism on read partition.
>
>
>
> Hi, Eugene.
>
>
>
> Tables are not transactional and locks are backed by DbTxnManager.
>
>
>
> On Fri, Oct 13, 2017 at 2:30 AM, Eugene Koifman <ekoif...@hortonworks.com>
> wrote:
>
> Which lock manager are you using?
>
> Do you have acid enabled and if so are these tables transactional?
>
>
>
> Eugene
>
>
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Thursday, October 12, 2017 at 3:58 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Hive locking mechanism on read partition.
>
>
>
> Hello, I'm using HDP 2.5.0.0  with included hive 1.2.1. And I have problem
> with locking mechanism.
>
>
>
> Most of my queries to hive looks like this.
>
>
>
> *(1)insert into table results_table partition(task_id=${task_id})*
>
> *select * from data_table  where ;*
>
>
>
> results_table partitioned by task_id field and i expect to get exclusive
> lock on corresponding partition.Which is true:
>
>
>
> Lock ID      Database  Table          Partition     State     Blocked By  Type       Transaction ID
> 136639682.4  default   results_table  task_id=5556  ACQUIRED              EXCLUSIVE  NULL
>
>
>
>
>
>
>
> Another type of query is fetching data from results_table:
>
>
>
> *(2)  select * from results_table where task_id = ${task_id}*
>
>
>
> This select doesn't require any map reduce and executes fast. This is
> exactly what I want.
>
> But if I execute this two queries at the same time I can't perform read
> from result_table partition while inserting data into another.
>
>
>
> Locks looks like this:
>
>
>
> Lock ID      Database  Table          Partition     State     Blocked By   Type         Transaction ID
> 136639682.4  default   results_table  task_id=5556  ACQUIRED               EXCLUSIVE    NULL
> 136639700.1  default   results_table  NULL          WAITING   136639682.4  SHARED_READ  NULL
> 136639700.2  default   results_table  task_id=5535  WAITING                SHARED_READ  NULL
>
>
>
>
>
> Reading data from specified partition requires shared lock on whole table.
> This prevents me to get data untill first query completes.
>
>
>
> As I can see on this page
> <https://cwiki.apache.org/confluence/display/Hive/Locking#Locking-UseCases> 
> this
> is expected behaivor. But I don't understand why we need lock on table.
>
> Can I get rid of shared lock on whole table, while still having shared
> lock on specific partition?
>
>
>
>
>
>
>
>
>


Re: Hive locking mechanism on read partition.

2017-10-13 Thread Igor Kuzmenko
Hi, Eugene.

Tables are not transactional and locks are backed by DbTxnManager.

On Fri, Oct 13, 2017 at 2:30 AM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> Which lock manager are you using?
>
> Do you have acid enabled and if so are these tables transactional?
>
>
>
> Eugene
>
>
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Thursday, October 12, 2017 at 3:58 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Hive locking mechanism on read partition.
>
>
>
> Hello, I'm using HDP 2.5.0.0  with included hive 1.2.1. And I have problem
> with locking mechanism.
>
>
>
> Most of my queries to hive looks like this.
>
>
>
> *(1)insert into table results_table partition(task_id=${task_id})*
>
> *select * from data_table  where ;*
>
>
>
> results_table partitioned by task_id field and i expect to get exclusive
> lock on corresponding partition.Which is true:
>
>
>
> Lock ID      Database  Table          Partition     State     Blocked By  Type       Transaction ID
> 136639682.4  default   results_table  task_id=5556  ACQUIRED              EXCLUSIVE  NULL
>
>
>
>
>
>
>
> Another type of query is fetching data from results_table:
>
>
>
> *(2)  select * from results_table where task_id = ${task_id}*
>
>
>
> This select doesn't require any map reduce and executes fast. This is
> exactly what I want.
>
> But if I execute this two queries at the same time I can't perform read
> from result_table partition while inserting data into another.
>
>
>
> Locks looks like this:
>
>
>
> Lock ID      Database  Table          Partition     State     Blocked By   Type         Transaction ID
> 136639682.4  default   results_table  task_id=5556  ACQUIRED               EXCLUSIVE    NULL
> 136639700.1  default   results_table  NULL          WAITING   136639682.4  SHARED_READ  NULL
> 136639700.2  default   results_table  task_id=5535  WAITING                SHARED_READ  NULL
>
>
>
>
>
> Reading data from specified partition requires shared lock on whole table.
> This prevents me to get data untill first query completes.
>
>
>
> As I can see on this page
> <https://cwiki.apache.org/confluence/display/Hive/Locking#Locking-UseCases> 
> this
> is expected behaivor. But I don't understand why we need lock on table.
>
> Can I get rid of shared lock on whole table, while still having shared
> lock on specific partition?
>
>
>
>
>
>
>


Hive locking mechanism on read partition.

2017-10-12 Thread Igor Kuzmenko
Hello, I'm using HDP 2.5.0.0 with the included Hive 1.2.1, and I have a problem
with the locking mechanism.

Most of my queries to Hive look like this.

(1) insert into table results_table partition(task_id=${task_id})
select * from data_table where ...;

results_table is partitioned by the task_id field, and I expect to get an
exclusive lock on the corresponding partition. Which is true:

Lock ID      Database  Table          Partition     State     Blocked By  Type       Transaction ID
136639682.4  default   results_table  task_id=5556  ACQUIRED              EXCLUSIVE  NULL



Another type of query is fetching data from results_table:

(2) select * from results_table where task_id = ${task_id}

This select doesn't require any MapReduce and executes fast. This is
exactly what I want.
But if I execute these two queries at the same time, I can't read from one
results_table partition while inserting data into another.

The locks look like this:

Lock ID      Database  Table          Partition     State     Blocked By   Type         Transaction ID
136639682.4  default   results_table  task_id=5556  ACQUIRED               EXCLUSIVE    NULL
136639700.1  default   results_table  NULL          WAITING   136639682.4  SHARED_READ  NULL
136639700.2  default   results_table  task_id=5535  WAITING                SHARED_READ  NULL


Reading data from the specified partition requires a shared lock on the whole
table. This prevents me from getting the data until the first query completes.

As I can see on this page
(https://cwiki.apache.org/confluence/display/Hive/Locking#Locking-UseCases),
this is expected behavior. But I don't understand why we need a lock on the
whole table.
Can I get rid of the shared lock on the whole table while still having a shared
lock on the specific partition?
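
For inspecting this behaviour from Hive itself rather than the metastore tables,
SHOW LOCKS with the DbTxnManager can be pointed at the table or at a single
partition. A small sketch; the partition value is taken from the example output
above:

show locks results_table extended;

show locks results_table partition (task_id=5556) extended;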


Unexpected query result

2017-08-21 Thread Igor Kuzmenko
Running a simple 'select count(*) from test_table' query returned 500,000.
But when I run 'select count(distinct field) from test_table' the query
result is 500,001.

How could it happen that a table with 500,000 records has 500,001 unique
field values?

I'm using Hive from the HDP 2.5.0 platform.
The table is stored as ORC.
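
A couple of cross-checks that may help narrow this down; a sketch that assumes
the column really is named field:

-- count distinct values a second way, via a subquery
select count(*) from (select distinct field from test_table) t;

-- count(field) ignores NULLs, so comparing it with count(*) shows whether
-- NULL values are part of the discrepancy
select count(*) as all_rows, count(field) as non_null_rows from test_table;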


Re: Hive TxnHandler::lock method run into dead lock.

2017-03-28 Thread Igor Kuzmenko
Explicit configuration is a workaround, but it doesn't solve the deadlock problem.

On Mon, Mar 27, 2017 at 8:28 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> There is an open ticket
>
> https://issues.apache.org/jira/browse/HIVE-13842
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 27, 2017 at 8:39 AM
>
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> I increased maxConnections up to 50 and recompiled metastore jar. Its
> gonna be enough for a while.
>
> *Eugene, *do you know is there any Jira item on this problem?
>
>
>
> On Sun, Mar 26, 2017 at 7:21 PM, Eugene Koifman <ekoif...@hortonworks.com>
> wrote:
>
> I see (in HikariCP-2.5.1.jar) so perhaps upgrading the library is an
> option.
>
>
>
> public HikariConfig() {
>   this.dataSourceProperties = new Properties();
>   this.healthCheckProperties = new Properties();
>   this.minIdle = -1;
>   this.maxPoolSize = -1;
>   this.maxLifetime = MAX_LIFETIME;
>   this.connectionTimeout = CONNECTION_TIMEOUT;
>   this.validationTimeout = VALIDATION_TIMEOUT;
>   this.idleTimeout = IDLE_TIMEOUT;
>   this.isAutoCommit = true;
>   this.isInitializationFailFast = true;
>   String systemProp = System.getProperty("hikaricp.configurationFile");
>   if (systemProp != null) {
>     this.loadProperties(systemProp);
>   }
> }
>
>
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Saturday, March 25, 2017 at 5:05 PM
>
>
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Hi, Eugene.
>
> I've tried hicaricp, it didn't work either. Hikari pool has same 10
> maxConnections limit. When creating pool there's no explicit set of max
> pool size and in HikariConfig constructor it's hardcoded to value of 10:
>
> public HikariConfig()
> {
>    dataSourceProperties = new Properties();
>
>    connectionTimeout = CONNECTION_TIMEOUT;
>    idleTimeout = IDLE_TIMEOUT;
>    isAutoCommit = true;
>    isJdbc4connectionTest = true;
>    minIdle = -1;
>    maxPoolSize = 10;
>    maxLifetime = MAX_LIFETIME;
>    isRecordMetrics = false;
>    transactionIsolation = -1;
>    metricsTrackerClassName = "com.zaxxer.hikari.metrics.CodaHaleMetricsTracker";
>    customizer = new IConnectionCustomizer() {
>       @Override
>       public void customize(Connection connection) throws SQLException
>       {
>       }
>    };
> }
>
>
>
> On Thu, Mar 23, 2017 at 8:29 PM, Eugene Koifman <ekoif...@hortonworks.com>
> wrote:
>
> Can you try use “hikaricp” connection pool manager?  It seems to be using
> default which is no limit.
>
>
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 20, 2017 at 2:17 PM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Sorry, misclicked.
>
>
>
> 2) TxnHandler::lock method requests a new connection when executing this line
> of code:
>
> ConnectionLockIdPair connAndLockId = enqueueLockWithRetry(rqst);
>
> 3) After that, it follows this stacktrace:
> - TxnHandler::lock
> - TxnHandler::checkLockWithRetry
> - TxnHandler::checkLock
>
> In the checkLock method we reach this line:
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
>
> 4) The acquireLock method requests another connection to the DB:
> dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
>
> So all in all, if I call the TxnHandler::lock method in 10 threads at the same
> time, at first I use up all the DB connections stored in the pool, and at
> acquireLock I will get stuck because there's no free connection.
>
> Has anyone run into this problem? How can I avoid it?
>
> Code was taken from here:
> https://github.com/hortonworks/hive-release/blob/HDP-2.5.0.0-tag/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
>
>
> I gue

Re: Hive TxnHandler::lock method run into dead lock.

2017-03-27 Thread Igor Kuzmenko
I increased maxConnections up to 50 and recompiled the metastore jar. It's going
to be enough for a while.
Eugene, do you know if there is a Jira item for this problem?

On Sun, Mar 26, 2017 at 7:21 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> I see (in HikariCP-2.5.1.jar) so perhaps upgrading the library is an
> option.
>
>
>
> public HikariConfig() {
>   this.dataSourceProperties = new Properties();
>   this.healthCheckProperties = new Properties();
>   this.minIdle = -1;
>   this.maxPoolSize = -1;
>   this.maxLifetime = MAX_LIFETIME;
>   this.connectionTimeout = CONNECTION_TIMEOUT;
>   this.validationTimeout = VALIDATION_TIMEOUT;
>   this.idleTimeout = IDLE_TIMEOUT;
>   this.isAutoCommit = true;
>   this.isInitializationFailFast = true;
>   String systemProp = System.getProperty("hikaricp.configurationFile");
>   if (systemProp != null) {
>     this.loadProperties(systemProp);
>   }
> }
>
>
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Saturday, March 25, 2017 at 5:05 PM
>
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Hi, Eugene.
>
> I've tried hicaricp, it didn't work either. Hikari pool has same 10
> maxConnections limit. When creating pool there's no explicit set of max
> pool size and in HikariConfig constructor it's hardcoded to value of 10:
>
> public HikariConfig()
> {
>    dataSourceProperties = new Properties();
>
>    connectionTimeout = CONNECTION_TIMEOUT;
>    idleTimeout = IDLE_TIMEOUT;
>    isAutoCommit = true;
>    isJdbc4connectionTest = true;
>    minIdle = -1;
>    maxPoolSize = 10;
>    maxLifetime = MAX_LIFETIME;
>    isRecordMetrics = false;
>    transactionIsolation = -1;
>    metricsTrackerClassName = "com.zaxxer.hikari.metrics.CodaHaleMetricsTracker";
>    customizer = new IConnectionCustomizer() {
>       @Override
>       public void customize(Connection connection) throws SQLException
>       {
>       }
>    };
> }
>
>
>
> On Thu, Mar 23, 2017 at 8:29 PM, Eugene Koifman <ekoif...@hortonworks.com>
> wrote:
>
> Can you try use “hikaricp” connection pool manager?  It seems to be using
> default which is no limit.
>
>
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 20, 2017 at 2:17 PM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Sorry, misclicked.
>
>
>
> 2) TxnHandler::lock method requests a new connection when executing this line
> of code:
>
> ConnectionLockIdPair connAndLockId = enqueueLockWithRetry(rqst);
>
> 3) After that, it follows this stacktrace:
> - TxnHandler::lock
> - TxnHandler::checkLockWithRetry
> - TxnHandler::checkLock
>
> In the checkLock method we reach this line:
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
>
> 4) The acquireLock method requests another connection to the DB:
> dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
>
> So all in all, if I call the TxnHandler::lock method in 10 threads at the same
> time, at first I use up all the DB connections stored in the pool, and at
> acquireLock I will get stuck because there's no free connection.
>
> Has anyone run into this problem? How can I avoid it?
>
> Code was taken from here:
> https://github.com/hortonworks/hive-release/blob/HDP-2.5.0.0-tag/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
> I guess the closest branch in the apache repo is:
> https://github.com/apache/hive/blob/branch-2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
>
>
> On Tue, Mar 21, 2017 at 12:07 AM, Igor Kuzmenko <f1she...@gmail.com>
> wrote:
>
> Hello I'm running Hortonworks data platform 2.5.0.0 with included hive.
>
> I'm using storm hive bolt to load data into my hive. But launching many
> hive bolt always leads me to TimeoutException on calling hive metastore.
> Metastore logs full of Exception like this:
>
>
>
> 2017-03-15 18:46:12,436 ERROR [pool-5-thread-11]: txn.TxnHandler
> (TxnHandle

Re: Hive TxnHandler::lock method run into dead lock.

2017-03-25 Thread Igor Kuzmenko
Hi, Eugene.
I've tried HikariCP; it didn't work either. The Hikari pool has the same limit
of 10 max connections: when creating the pool there's no explicit setting of the
max pool size, and in the HikariConfig constructor it's hardcoded to 10:

public HikariConfig()
{
   dataSourceProperties = new Properties();

   connectionTimeout = CONNECTION_TIMEOUT;
   idleTimeout = IDLE_TIMEOUT;
   isAutoCommit = true;
   isJdbc4connectionTest = true;
   minIdle = -1;
   maxPoolSize = 10;
   maxLifetime = MAX_LIFETIME;
   isRecordMetrics = false;
   transactionIsolation = -1;
   metricsTrackerClassName = "com.zaxxer.hikari.metrics.CodaHaleMetricsTracker";
   customizer = new IConnectionCustomizer() {
  @Override
  public void customize(Connection connection) throws SQLException
  {
  }
   };
}


On Thu, Mar 23, 2017 at 8:29 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> Can you try use “hikaricp” connection pool manager?  It seems to be using
> default which is no limit.
>
>
>
>
>
> Eugene
>
>
>
> *From: *Igor Kuzmenko <f1she...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Monday, March 20, 2017 at 2:17 PM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: Hive TxnHandler::lock method run into dead lock.
>
>
>
> Sorry, misclicked.
>
>
>
> 2) TxnHandler::lock method requests a new connection when executing this line
> of code:
>
> ConnectionLockIdPair connAndLockId = enqueueLockWithRetry(rqst);
>
> 3) After that, it follows this stacktrace:
> - TxnHandler::lock
> - TxnHandler::checkLockWithRetry
> - TxnHandler::checkLock
>
> In the checkLock method we reach this line:
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
>
> 4) The acquireLock method requests another connection to the DB:
> dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
>
> So all in all, if I call the TxnHandler::lock method in 10 threads at the same
> time, at first I use up all the DB connections stored in the pool, and at
> acquireLock I will get stuck because there's no free connection.
>
> Has anyone run into this problem? How can I avoid it?
>
> Code was taken from here:
> https://github.com/hortonworks/hive-release/blob/HDP-2.5.0.0-tag/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
> I guess the closest branch in the apache repo is:
> https://github.com/apache/hive/blob/branch-2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>
>
>
> On Tue, Mar 21, 2017 at 12:07 AM, Igor Kuzmenko <f1she...@gmail.com>
> wrote:
>
> Hello I'm running Hortonworks data platform 2.5.0.0 with included hive.
>
> I'm using storm hive bolt to load data into my hive. But launching many
> hive bolt always leads me to TimeoutException on calling hive metastore.
> Metastore logs full of Exception like this:
>
>
>
> 2017-03-15 18:46:12,436 ERROR [pool-5-thread-11]: txn.TxnHandler
> (TxnHandler.java:getDbConn(1834)) - There is a problem with a connection
> from the pool, retrying(rc=7): Timed out waiting for a free available
> connection. (SQLState=08001, ErrorCode=0)
> java.sql.SQLException: Timed out waiting for a free available connection.
> at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(
> DefaultConnectionStrategy.java:88)
> at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(
> AbstractConnectionStrategy.java:90)
> at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
> at com.jolbox.bonecp.BoneCPDataSource.getConnection(
> BoneCPDataSource.java:131)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.
> getDbConn(TxnHandler.java:1827)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(
> TxnHandler.java:873)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(
> TxnHandler.java:814)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$
> HMSHandler.lock(HiveMetaStore.java:5751)
>
>
>
> After looking through code I found out
>
>
>
> 1) TxnHandler  class uses connection pool to get db connections and it's
> size is 10.
>
> 2) TxnHandler::lock method requset new connection whe executing this line
> of code:
>
>
>
>
>


Hive TxnHandler::lock method run into dead lock.

2017-03-20 Thread Igor Kuzmenko
Hello, I'm running Hortonworks Data Platform 2.5.0.0 with the included Hive.
I'm using the Storm Hive bolt to load data into Hive, but launching many Hive
bolts always leads to a TimeoutException when calling the Hive metastore.
The metastore logs are full of exceptions like this:

2017-03-15 18:46:12,436 ERROR [pool-5-thread-11]: txn.TxnHandler
(TxnHandler.java:getDbConn(1834)) - There is a problem with a connection
from the pool, retrying(rc=7): Timed out waiting for a free available
connection. (SQLState=08001, ErrorCode=0)
java.sql.SQLException: Timed out waiting for a free available connection.
at
com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:88)
at
com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(AbstractConnectionStrategy.java:90)
at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
at
com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:131)
at
org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:1827)
at
org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:873)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:814)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:5751)

After looking through the code I found out:

1) The TxnHandler class uses a connection pool to get DB connections, and its
size is 10.
2) The TxnHandler::lock method requests a new connection when executing this line
of code:


Re: Hive TxnHandler::lock method run into dead lock.

2017-03-20 Thread Igor Kuzmenko
Sorry, misclicked.

2) The TxnHandler::lock method requests a new connection when executing this line
of code:
ConnectionLockIdPair connAndLockId = enqueueLockWithRetry(rqst);

3) After that, it follows this stacktrace:
- TxnHandler::lock
- TxnHandler::checkLockWithRetry
- TxnHandler::checkLock

In the checkLock method we reach this line:
handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());

4) The acquireLock method requests another connection to the DB:
dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);

So, all in all, if I call the TxnHandler::lock method in 10 threads at the same
time, at first I use up all the DB connections stored in the pool, and at
acquireLock I will get stuck because there's no free connection.

Has anyone run into this problem? How can I avoid it?

Code was taken from here:
https://github.com/hortonworks/hive-release/blob/HDP-2.5.0.0-tag/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java

I guess the closest branch in the apache repo is:
https://github.com/apache/hive/blob/branch-2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java

On Tue, Mar 21, 2017 at 12:07 AM, Igor Kuzmenko <f1she...@gmail.com> wrote:

> Hello I'm running Hortonworks data platform 2.5.0.0 with included hive.
> I'm using storm hive bolt to load data into my hive. But launching many
> hive bolt always leads me to TimeoutException on calling hive metastore.
> Metastore logs full of Exception like this:
>
> 2017-03-15 18:46:12,436 ERROR [pool-5-thread-11]: txn.TxnHandler
> (TxnHandler.java:getDbConn(1834)) - There is a problem with a connection
> from the pool, retrying(rc=7): Timed out waiting for a free available
> connection. (SQLState=08001, ErrorCode=0)
> java.sql.SQLException: Timed out waiting for a free available connection.
> at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(
> DefaultConnectionStrategy.java:88)
> at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(
> AbstractConnectionStrategy.java:90)
> at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
> at com.jolbox.bonecp.BoneCPDataSource.getConnection(
> BoneCPDataSource.java:131)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.
> getDbConn(TxnHandler.java:1827)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(
> TxnHandler.java:873)
> at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(
> TxnHandler.java:814)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$
> HMSHandler.lock(HiveMetaStore.java:5751)
>
> After looking through code I found out
>
> 1) TxnHandler  class uses connection pool to get db connections and it's
> size is 10.
> 2) TxnHandler::lock method requset new connection whe executing this line
> of code:
>
>


Re: How to remove Hive table property?

2016-08-24 Thread Igor Kuzmenko
Creating a new table with a subsequent rename seems good to me.
Thanks for the advice, Mich.

On Tue, Aug 23, 2016 at 3:12 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Has table got data in it?
>
> Can you create a new table WITHOUT serialization.null.format and
> INSERT/SELECT from old to new, drop old and rename new to old.
>
> If the data is already there then the setting will apply to new rows only.
> That may be acceptable.
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 23 August 2016 at 12:42, Igor Kuzmenko <f1she...@gmail.com> wrote:
>
>> I've created a Hive table with property "serialization.null.format"="null"
>> to interpret string "null' as null. Now it's unnecessary for me. How can I
>> remove it?
>>
>> Alter table properties page
>> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTableProperties>
>>  explain
>> how to set new value, but I need completely remove it.
>>
>> Or setting "serialization.null.format"=null  is a correct way to do that?
>>
>
>
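
A sketch of the copy-and-rename approach Mich describes above; the table name and
columns are placeholders, and the new table is simply created without the
serialization.null.format property:

-- placeholder schema: substitute the real column list and storage format
create table my_table_new (id bigint, payload string)
stored as orc;

insert into table my_table_new select * from my_table;

drop table my_table;
alter table my_table_new rename to my_table;

Newer Hive releases also have ALTER TABLE ... UNSET TBLPROPERTIES, which avoids
the copy if your version supports it.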


Re: Hive transaction doesn't release lock.

2016-08-24 Thread Igor Kuzmenko
Oh, my copy paste skills failed me.
Here's output from HIVE_LOCKS:

# HL_LOCK_EXT_ID, HL_LOCK_INT_ID, HL_TXNID, HL_DB, HL_TABLE, HL_PARTITION,
HL_LOCK_STATE, HL_LOCK_TYPE, HL_LAST_HEARTBEAT, HL_ACQUIRED_AT, HL_USER,
HL_HOST
'8496355', '1', '78461824', 'default', 'data_http', 'dt=20160821', 'a',
'r', '0', '1471768489000', 'storm', 'sorm-data03.msk.mts.ru'

TXNS, TXN_COMPONENTS:
# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
78461824, default, data_http, dt=201608213


Good news: I've found the log from when it all started:

Here comes heartbeat:
2016-08-21 11:34:48,934 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:heartbeatTxn(1882)) - Going to execute update 

Concurrently, we are trying to abort the transaction:
2016-08-21 11:34:48,907 DEBUG [DeadTxnReaper-0]: txn.TxnHandler
(TxnHandler.java:abortTxns(1429)) - Going to execute update 
2016-08-21 11:34:48,943 DEBUG [DeadTxnReaper-0]: txn.TxnHandler
(TxnHandler.java:abortTxns(1446)) - Going to execute update 
2016-08-21 11:34:48,957 DEBUG [pool-3-thread-11]: txn.TxnHandler
(TxnHandler.java:abortTxns(1429)) - Going to execute update 
2016-08-21 11:34:48,979 DEBUG [pool-3-thread-11]: txn.TxnHandler
(TxnHandler.java:abortTxns(1446)) - Going to execute update 

At this point the transaction is aborted and there's no lock.
But the heartbeat thread is still working, and:
2016-08-21 11:34:49,025 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:lock(1546)) - Going to execute update 
2016-08-21 11:34:49,027 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:lock(1576)) - Going to execute update 
2016-08-21 11:34:49,029 DEBUG [pool-3-thread-155]: txn.TxnHandler
(TxnHandler.java:checkLock(1696)) -   lockid:8496355 intLockId:1
txnid:78461824 db:default table:data_http partition:dt=20160821
state:WAITING type:SHARED_READ


So I guess that it's a race condition between the heartbeat thread and the
TxnReaper thread. The last-heartbeat information in the HIVE_LOCKS table differs
from the TXNS table.
Full log here
<https://drive.google.com/open?id=0ByB92PAoAkrKSFFaQjBNSnNXZFk>.
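
For reference, the reaper involved here aborts transactions whose last heartbeat
is older than hive.txn.timeout, so that window bounds how tight the race can get.
A sketch of inspecting it; the property is normally set in hive-site.xml on the
metastore, so a session-level SET may only display the value rather than change
the reaper's behaviour:

set hive.txn.timeout;

-- illustrative value only: widening the window narrows, but does not remove, the race
set hive.txn.timeout=600;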

On Tue, Aug 23, 2016 at 8:20 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> your query is “select * from HIVE_LOCKS” but the output is not from
> HIVE_LOCKS.
> What entries do you have in HIVE_LOCKS for this txn_id?
>
> If all you see is an entry in TXN table in ‘a’ state – that is OK.  that
> just mean that this transaction was aborted.
>
> Eugene
>
> From: Igor Kuzmenko <f1she...@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Monday, August 22, 2016 at 8:27 AM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Hive transaction doesn't release lock.
>
> Hello, I'm using Apache Hive 1.2.1 and Apache Storm to stream data in hive
> table.
> After making some tests I tried to truncate my table, but sql execution
> doesn't complete because of the lock on table:
>
>
> select * from HIVE_LOCKS;
>
>
> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
> '78461824', 'a', '1471762974000', '1471768488000', 'storm', 'sorm-data03.
> msk.mts.ru', '78461824', 'default', 'data_http', 'dt=20160821'
>
>
>
> At the same time:
>
> select * from TXNS, TXN_COMPONENTS
> where 1=1
> and TXNS.TXN_ID = TXN_COMPONENTS.TC_TXNID
> and TXNS.TXN_ID = 78461824
> ;
>
>
> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
> 78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
> 78461824, default, data_http, dt=20160821
>
>
> Txn 78461824 is in aborted state, but still keep lock on table
> 'data_http'. How can I avoid this? What should I do to get rid of that
> lock?
>
>


Re: Hive transaction doesn't release lock.

2016-08-23 Thread Igor Kuzmenko
Thanks for the reply, Mich.
Is there an open item in Jira for this?

On Mon, Aug 22, 2016 at 9:28 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> there are issues with locks not being released even when the transaction
> is aborted. There are still entries in hive_locks.
>
> I ended up deleting the row from hive_locks table manually. Not ideal but
> you know that the lock should not be there as the table is dropped.
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * 
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 22 August 2016 at 16:27, Igor Kuzmenko <f1she...@gmail.com> wrote:
>
>> Hello, I'm using Apache Hive 1.2.1 and Apache Storm to stream data in
>> hive table.
>> After making some tests I tried to truncate my table, but sql execution
>> doesn't complete because of the lock on table:
>>
>>
>> select * from HIVE_LOCKS;
>>
>>
>> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
>> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
>> '78461824', 'a', '1471762974000', '1471768488000', 'storm', 'sorm-data03.
>> msk.mts.ru', '78461824', 'default', 'data_http', 'dt=20160821'
>>
>>
>>
>> At the same time:
>>
>> select * from TXNS, TXN_COMPONENTS
>> where 1=1
>> and TXNS.TXN_ID = TXN_COMPONENTS.TC_TXNID
>> and TXNS.TXN_ID = 78461824
>> ;
>>
>>
>> # TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
>> TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
>> 78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
>> 78461824, default, data_http, dt=20160821
>>
>>
>> Txn 78461824 is in aborted state, but still keep lock on table
>> 'data_http'. How can I avoid this? What should I do to get rid of that
>> lock?
>>
>>
>
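
A sketch of the manual cleanup Mich describes above, run against the metastore
database rather than through Hive; this is last-resort surgery on metastore
state, so only do it once you are sure the transaction is really dead:

-- metastore-side SQL, using the txn id from the output above; check first
select * from HIVE_LOCKS where HL_TXNID = 78461824;

-- then remove the stale lock row
delete from HIVE_LOCKS where HL_TXNID = 78461824;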


Hive transaction doesn't release lock.

2016-08-22 Thread Igor Kuzmenko
Hello, I'm using Apache Hive 1.2.1 and Apache Storm to stream data into a Hive
table.
After running some tests I tried to truncate my table, but the SQL execution
doesn't complete because of a lock on the table:


select * from HIVE_LOCKS;


# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
'78461824', 'a', '1471762974000', '1471768488000', 'storm', 'sorm-data03.msk
.mts.ru', '78461824', 'default', 'data_http', 'dt=20160821'



At the same time:

select * from TXNS, TXN_COMPONENTS
where 1=1
and TXNS.TXN_ID = TXN_COMPONENTS.TC_TXNID
and TXNS.TXN_ID = 78461824
;


# TXN_ID, TXN_STATE, TXN_STARTED, TXN_LAST_HEARTBEAT, TXN_USER, TXN_HOST,
TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION
78461824, a, 1471762974000, 1471768488000, storm, sorm-data03.msk.mts.ru,
78461824, default, data_http, dt=20160821


Txn 78461824 is in the aborted state but still keeps a lock on table 'data_http'.
How can I avoid this? What should I do to get rid of that lock?


Re: Malformed orc file

2016-08-05 Thread Igor Kuzmenko
Unfortunately, I can't provide more information; I got this file from our
tester and he has already dropped the table.

On Thu, Aug 4, 2016 at 9:16 PM, Prasanth Jayachandran <
pjayachand...@hortonworks.com> wrote:

> Hi
>
> In case of streaming, when a transaction is open orc file is not closed
> and hence may not be flushed completely. Did the transaction commit
> successfully? Or was there any exception thrown during writes/commit?
>
> Thanks
> Prasanth
>
> On Aug 3, 2016, at 6:09 AM, Igor Kuzmenko <f1she...@gmail.com> wrote:
>
> Hello, I've got a malformed ORC file in my Hive table. File was created by
> Hive Streaming API and I have no idea under what circumstances it
> became corrupted.
>
> File on google drive: link
> <https://drive.google.com/file/d/0ByB92PAoAkrKeFFZRUN4WWVQY1U/view?usp=sharing>
>
> Exception message when trying to perform select from table:
>
> ERROR : Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1468498236400_1106_6_00,
> diagnostics=[Task failed, taskId=task_1468498236400_1106_6_00_00,
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running
> task:java.lang.RuntimeException: java.lang.RuntimeException:
> java.io.IOException: org.apache.hadoop.hive.ql.io.FileFormatException:
> Malformed ORC file hdfs://sorm-master01.msk.mts.
> ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/
> directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
> Invalid postscript length 0
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.
> initializeAndRunProcessor(TezProcessor.java:173)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(
> TezProcessor.java:139)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(
> LogicalIOProcessorRuntimeTask.java:344)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(
> TezTaskRunner.java:181)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(
> TezTaskRunner.java:172)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1657)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.
> callInternal(TezTaskRunner.java:172)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.
> callInternal(TezTaskRunner.java:168)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.IOException:
> org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file
> hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/
> pstn_connections/dt=20160711/directory_number_last_digit=5/
> delta_71700156_71700255/bucket_0. Invalid postscript length 0
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$
> TezGroupedSplitsRecordReader.initNextRecordReader(
> TezGroupedSplitsInputFormat.java:196)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$
> TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:142)
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(
> MRReaderMapred.java:113)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.
> pushRecord(MapRecordSource.java:61)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.
> run(MapRecordProcessor.java:326)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.
> initializeAndRunProcessor(TezProcessor.java:150)
> ... 14 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.ql.io.FileFormatException:
> Malformed ORC file hdfs://sorm-master01.msk.mts.
> ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/
> directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
> Invalid postscript length 0
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.
> handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.
> handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(
> HiveInputFormat.java:251)
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$
> TezGroupedSplitsRecordReader.initNextRecordReader(
> TezGroupedSplitsInputFormat.java:193)
> ... 19 more
> Caused by: org.apache.hadoop.hive.ql.io.FileFormatException: Malformed
> ORC file hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/

Re: Hive LIKE predicate. '_' wildcard decrease perfomance

2016-08-05 Thread Igor Kuzmenko
Thanks for the reply, Gopal. Very helpful.

On Thu, Aug 4, 2016 at 10:15 PM, Gopal Vijayaraghavan 
wrote:

> > where res_url like '%mts.ru%'
> ...
> > where res_url like '%mts_ru%'
> ...
> > Why '_' wildcard decrease perfomance?
>
> Because it misses the fast path by just one "_".
>
> ORC vectorized reader has a zero-copy check for 3 patterns - prefix,
> suffix and middle.
>
> That means "https://%", "%.html", "%mts.ru%" will hit the fast path -
> which uses StringExpr::equal() which JITs into the following.
>
> https://issues.apache.org/jira/secure/attachment/12748720/string-intrinsic-sse.png
>
>
> In Hive-2.0, you can mix these up too to get "https:%mts%.html" in a
> ChainedChecker.
>
>
> Anything other than these 3 cases becomes a Regex and takes the slow path.
>
> The pattern you mentioned gets rewritten into ".*mts.ru.*" and the inner
> loop has a new String() as the input to the matcher + matcher.matches() in
> it.
>
> I've put in some patches recently which rewrite it Lazy regexes like
> ".?*mts.ru.?*", so the regex DFA will be smaller (HIVE-13196).
>
> That improves the case where the pattern is found, but does nothing to
> improve the performance of the new String() GC garbage.
>
> Cheers,
> Gopal
>
>
>
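
Restating Gopal's point as queries against the data_http table from the
question: the three zero-copy patterns stay on the vectorized fast path, while
anything else, including a '_' wildcard, falls back to the regex path:

-- fast path: prefix, suffix and middle patterns
select count(*) from data_http where res_url like 'https://%';
select count(*) from data_http where res_url like '%.html';
select count(*) from data_http where res_url like '%mts.ru%';

-- slow path: the '_' wildcard turns the predicate into the regex ".*mts.ru.*"
select count(*) from data_http where res_url like '%mts_ru%';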


Hive LIKE predicate. '_' wildcard decrease perfomance

2016-08-04 Thread Igor Kuzmenko
I've got a Hive transactional table 'data_http' in ORC format, containing
around 100,000,000 rows.

When I execute query:

select * from data_http
where res_url like '%mts.ru%'

it completes in 10 seconds.

But executing query

select * from data_http
where res_url like '%mts_ru%'


takes more than 30 minutes.

Why does the '_' wildcard decrease performance?


Malformed orc file

2016-08-03 Thread Igor Kuzmenko
Hello, I've got a malformed ORC file in my Hive table. The file was created by
the Hive Streaming API and I have no idea under what circumstances it
became corrupted.

File on google drive: link


Exception message when trying to perform select from table:

ERROR : Vertex failed, vertexName=Map 1,
vertexId=vertex_1468498236400_1106_6_00, diagnostics=[Task failed,
taskId=task_1468498236400_1106_6_00_00, diagnostics=[TaskAttempt 0
failed, info=[Error: Failure while running task:java.lang.RuntimeException:
java.lang.RuntimeException: java.io.IOException:
org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://
sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
Invalid postscript length 0
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException:
org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://
sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
Invalid postscript length 0
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:142)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:326)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: java.io.IOException:
org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://
sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
Invalid postscript length 0
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
... 19 more
Caused by: org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC
file hdfs://
sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/pstn_connections/dt=20160711/directory_number_last_digit=5/delta_71700156_71700255/bucket_0.
Invalid postscript length 0
at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.ensureOrcFooter(ReaderImpl.java:236)
at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:376)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:317)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:238)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1259)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
... 20 more

Has anyone encountered such a situation?


Re: Hive compaction didn't launch

2016-07-29 Thread Igor Kuzmenko
Here's how Storm works right now:

After receiving a new message, Storm determines which partition it should be
written to. Then it checks whether there is an open connection to that HiveEndPoint
<https://github.com/apache/hive/blob/release-1.2.1/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java>;
if not, it creates one and fetches a new transaction batch. Here I assume that
these transactions can only be used to write data to that one HiveEndPoint,
because when we fetch the transaction batch we pass a RecordWriter to the fetch
method (StreamingConnection::fetchTransactionBatch
<https://github.com/apache/hive/blob/release-1.2.1/hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/StreamingConnection.java>).
So I don't see the case "After they write to A they may choose to write to B
and then commit"; it seems that the Streaming API doesn't support this feature.

Storm keeps receiving messages, and when its message buffer is full or after a
fixed period of time it flushes all the messages (performing a commit in terms
of Hive streaming).
And here's the interesting part: if there's nothing to flush, Storm will do
nothing (HiveWriter
<https://github.com/apache/storm/blob/v1.0.0/external/storm-hive/src/main/java/org/apache/storm/hive/common/HiveWriter.java>
):

public void flush(boolean rollToNext)
    throws CommitFailure, TxnBatchFailure, TxnFailure, InterruptedException {
  // if there are no records do not call flush
  if (totalRecords <= 0) return;
  try {
    synchronized (txnBatchLock) {
      commitTxn();
      nextTxn(rollToNext);
      totalRecords = 0;
      lastUsed = System.currentTimeMillis();
    }
  } catch (StreamingException e) {
    throw new TxnFailure(txnBatch, e);
  }
}


At the same time Storm keeps all fetched transactions alive in a separate thread
by sending heartbeats (HiveBolt
<https://github.com/apache/storm/blob/v1.0.0/external/storm-hive/src/main/java/org/apache/storm/hive/bolt/HiveBolt.java>
):

private void setupHeartBeatTimer() {
  if (options.getHeartBeatInterval() > 0) {
    heartBeatTimer.schedule(new TimerTask() {
      @Override
      public void run() {
        try {
          if (sendHeartBeat.get()) {
            LOG.debug("Start sending heartbeat on all writers");
            sendHeartBeatOnAllWriters();
            setupHeartBeatTimer();
          }
        } catch (Exception e) {
          LOG.warn("Failed to heartbeat on HiveWriter ", e);
        }
      }
    }, options.getHeartBeatInterval() * 1000);
  }
}

The only way an idle connection will be closed is the excess-connections limit,
which is a configurable parameter, but I can't control this event explicitly.
Making the transaction batch smaller doesn't help either: even if the batch size
is 1, after flushing the data Storm will fetch another transaction batch and
wait for new messages, which may not come for a long time.

I don't see any way to fix this problem with configuration alone; I need to make
changes in Hive or Storm code. The question is where that is more appropriate.




On Fri, Jul 29, 2016 at 8:15 AM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> I think Storm has some timeout parameter that will close the transaction
> if there are no events for a certain amount of time.
> How many transactions do you per transaction batch?  Perhaps making the
> batches smaller will make them close sooner.
>
> Eugene
>
>
> On 7/28/16, 3:59 PM, "Alan Gates" <alanfga...@gmail.com> wrote:
>
> >But until those transactions are closed you don't know that they won't
> >write to partition B.  After they write to A they may choose to write to
> >B and then commit.  The compactor can not make any assumptions about what
> >sessions with open transactions will do in the future.
> >
> >Alan.
> >
> >> On Jul 28, 2016, at 09:19, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >>
> >> But this minOpenTxn value isn't from from delta I want to compact.
> >>minOpenTxn can point on transaction in partition A while in partition B
> >>there's deltas ready for compaction. If minOpenTxn is less than txnIds
> >>in partition B deltas, compaction won't happen. So open transaction in
> >>partition A blocks compaction in partition B. That's seems wrong to me.
> >>
> >> On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com>
> >>wrote:
> >> Hive is doing the right thing there, as it cannot compact the deltas
> >>into a base file while there are still open transactions in the delta.
> >> Storm should be committing on some frequency even if it doesn't have
> >>enough data to commit.
> >>
> >> 

Re: Hive compaction didn't launch

2016-07-28 Thread Igor Kuzmenko
But this minOpenTxn value isn't from the delta I want to compact.
minOpenTxn can point to a transaction in partition A while partition B
has deltas ready for compaction. If minOpenTxn is less than the txnIds
in partition B's deltas, compaction won't happen. So an open transaction in
partition A blocks compaction in partition B. That seems wrong to me.

On Thu, Jul 28, 2016 at 7:06 PM, Alan Gates <alanfga...@gmail.com> wrote:

> Hive is doing the right thing there, as it cannot compact the deltas into
> a base file while there are still open transactions in the delta.  Storm
> should be committing on some frequency even if it doesn’t have enough data
> to commit.
>
> Alan.
>
> > On Jul 28, 2016, at 05:36, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > I made some research on that issue.
> > The problem is in ValidCompactorTxnList::isTxnRangeValid method.
> >
> > Here's code:
> > @Override
> > public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
> >   if (highWatermark < minTxnId) {
> > return RangeResponse.NONE;
> >   } else if (minOpenTxn < 0) {
> > return highWatermark >= maxTxnId ? RangeResponse.ALL :
> RangeResponse.NONE;
> >   } else {
> > return minOpenTxn > maxTxnId ? RangeResponse.ALL :
> RangeResponse.NONE;
> >   }
> > }
> >
> > In my case this method returned RangeResponce.NONE for most of delta
> files. With this value delta file doesn't include in compaction.
> >
> > Last 'else' bock compare minOpenTxn to maxTxnId and if maxTxnId bigger
> return RangeResponce.NONE, thats a problem for me, because of using Storm
> Hive Bolt. Hive Bolt gets transaction and maintain it open with heartbeat
> until there's data to commit.
> >
> > So if i get transaction and maintain it open all compactions will stop.
> Is it incorrect Hive behavior, or Storm should close transaction?
> >
> >
> >
> >
> > On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com>
> wrote:
> > Thanks for reply, Alan. My guess with Storm was wrong. Today I get same
> behavior with running Storm topology.
> > Anyway, I'd like to know, how can I check that transaction batch was
> closed correctly?
> >
> > On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com>
> wrote:
> > I don’t know the details of how the storm application that streams into
> Hive works, but this sounds like the transaction batches weren’t getting
> closed.  Compaction can’t happen until those batches are closed.  Do you
> know how you had storm configured?  Also, you might ask separately on the
> storm list to see if people have seen this issue before.
> >
> > Alan.
> >
> > > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
> > >
> > > One more thing. I'm using Apache Storm to stream data in Hive. And
> when I turned off Storm topology compactions started to work properly.
> > >
> > > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com>
> wrote:
> > > I'm using Hive 1.2.1 transactional table. Inserting data in it via
> Hive Streaming API. After some time i expect compaction to start but it
> didn't happen:
> > >
> > > Here's part of log, which shows that compactor initiator thread
> doesn't see any delta files:
> > > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
> (Initiator.java:run(89)) - Checking to see if we should compact
> default.data_aaa.dt=20160726
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
> (AcidUtils.java:getAcidState(432)) - in directory hdfs://
> sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
> base = null deltas = 0
> > > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
> (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
> threshold: 0.1 will major compact: false
> > >
> > > But in that directory there's actually 23 files:
> > >
> > > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > > Found 23 items
> > > -rw-r--r--   3 storm hdfs  4 2016-07-26 17:20
> /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:22
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:23
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:25
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > 

Re: Hive compaction didn't launch

2016-07-28 Thread Igor Kuzmenko
I did some research on this issue.
The problem is in the ValidCompactorTxnList::isTxnRangeValid
<https://github.com/apache/hive/blob/release-1.2.1/metastore/src/java/org/apache/hadoop/hive/metastore/txn/ValidCompactorTxnList.java>
method.

Here's code:

@Override
public RangeResponse isTxnRangeValid(long minTxnId, long maxTxnId) {
  if (highWatermark < minTxnId) {
    return RangeResponse.NONE;
  } else if (minOpenTxn < 0) {
    return highWatermark >= maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  } else {
    return minOpenTxn > maxTxnId ? RangeResponse.ALL : RangeResponse.NONE;
  }
}


In my case this method returned RangeResponse.NONE for most of the delta files.
With this value the delta file isn't included in the compaction.

The last 'else' block compares minOpenTxn to maxTxnId and returns
RangeResponse.NONE if maxTxnId is bigger. That's a problem for me because I'm
using the Storm Hive Bolt: the Hive Bolt gets a transaction and keeps it open
with heartbeats until there's data to commit.

So if I get a transaction and keep it open, all compactions will stop. Is this
incorrect Hive behavior, or should Storm close the transaction?




On Wed, Jul 27, 2016 at 8:46 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:

> Thanks for reply, Alan. My guess with Storm was wrong. Today I get same
> behavior with running Storm topology.
> Anyway, I'd like to know, how can I check that transaction batch was
> closed correctly?
>
> On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:
>
>> I don’t know the details of how the storm application that streams into
>> Hive works, but this sounds like the transaction batches weren’t getting
>> closed.  Compaction can’t happen until those batches are closed.  Do you
>> know how you had storm configured?  Also, you might ask separately on the
>> storm list to see if people have seen this issue before.
>>
>> Alan.
>>
>> > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
>> >
>> > One more thing. I'm using Apache Storm to stream data in Hive. And when
>> I turned off Storm topology compactions started to work properly.
>> >
>> > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com>
>> wrote:
>> > I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive
>> Streaming API. After some time i expect compaction to start but it didn't
>> happen:
>> >
>> > Here's part of log, which shows that compactor initiator thread doesn't
>> see any delta files:
>> > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
>> (Initiator.java:run(89)) - Checking to see if we should compact
>> default.data_aaa.dt=20160726
>> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
>> (AcidUtils.java:getAcidState(432)) - in directory hdfs://
>> sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
>> base = null deltas = 0
>> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
>> (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
>> threshold: 0.1 will major compact: false
>> >
>> > But in that directory there's actually 23 files:
>> >
>> > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
>> > Found 23 items
>> > -rw-r--r--   3 storm hdfs  4 2016-07-26 17:20
>> /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:22
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:23
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:25
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:26
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:27
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:29
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:30
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:32
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:33
>> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
>> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:

Re: Hive compaction didn't launch

2016-07-27 Thread Igor Kuzmenko
Thanks for the reply, Alan. My guess about Storm was wrong. Today I got the
same behavior with the Storm topology running.
Anyway, I'd like to know: how can I check that a transaction batch was closed
correctly?

On Wed, Jul 27, 2016 at 8:09 PM, Alan Gates <alanfga...@gmail.com> wrote:

> I don’t know the details of how the storm application that streams into
> Hive works, but this sounds like the transaction batches weren’t getting
> closed.  Compaction can’t happen until those batches are closed.  Do you
> know how you had storm configured?  Also, you might ask separately on the
> storm list to see if people have seen this issue before.
>
> Alan.
>
> > On Jul 27, 2016, at 03:31, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > One more thing. I'm using Apache Storm to stream data in Hive. And when
> I turned off Storm topology compactions started to work properly.
> >
> > On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com>
> wrote:
> > I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive
> Streaming API. After some time i expect compaction to start but it didn't
> happen:
> >
> > Here's part of log, which shows that compactor initiator thread doesn't
> see any delta files:
> > 2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
> (Initiator.java:run(89)) - Checking to see if we should compact
> default.data_aaa.dt=20160726
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
> (AcidUtils.java:getAcidState(432)) - in directory hdfs://
> sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
> base = null deltas = 0
> > 2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
> (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
> threshold: 0.1 will major compact: false
> >
> > But in that directory there's actually 23 files:
> >
> > hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> > Found 23 items
> > -rw-r--r--   3 storm hdfs  4 2016-07-26 17:20
> /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:22
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:23
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:25
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:26
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:27
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:29
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:30
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:32
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:33
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:34
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:36
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:37
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:39
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:40
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:41
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:43
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:44
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:46
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:47
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:48
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> > drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
> >
> > Full log here.
> >
> > What could go wrong?
> >
>
>


Re: Hive compaction didn't launch

2016-07-27 Thread Igor Kuzmenko
One more thing. I'm using Apache Storm to stream data into Hive, and when I
turned off the Storm topology, compactions started to work properly.

On Tue, Jul 26, 2016 at 6:28 PM, Igor Kuzmenko <f1she...@gmail.com> wrote:

> I'm using Hive 1.2.1 transactional table. Inserting data in it via Hive
> Streaming API. After some time i expect compaction to start but it didn't
> happen:
>
> Here's part of log, which shows that compactor initiator thread doesn't
> see any delta files:
> *2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
> (Initiator.java:run(89)) - Checking to see if we should compact
> default.data_aaa.dt=20160726*
> *2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
> (AcidUtils.java:getAcidState(432)) - in directory
> hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
> base = null deltas = 0*
>
> *2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
> (Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
> threshold: 0.1 will major compact: false*
>
> But in that directory there's actually 23 files:
>
> hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
> Found 23 items
> -rw-r--r--   3 storm hdfs  4 2016-07-26 17:20
> /apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:22
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:23
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:25
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:26
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:27
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:29
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:30
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:32
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:33
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:34
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:36
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:37
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:39
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:40
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:41
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:43
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:44
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:46
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:47
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:48
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
> drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
> /apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955
>
>
> Full log here <http://pastebin.com/gHwvgRUV>.
>
> What could go wrong?
>


Hive compaction didn't launch

2016-07-26 Thread Igor Kuzmenko
I'm using a Hive 1.2.1 transactional table and inserting data into it via the
Hive Streaming API. After some time I expected compaction to start, but it
didn't happen:

Here's the part of the log which shows that the compactor initiator thread
doesn't see any delta files:
*2016-07-26 18:06:52,459 INFO  [Thread-8]: compactor.Initiator
(Initiator.java:run(89)) - Checking to see if we should compact
default.data_aaa.dt=20160726*
*2016-07-26 18:06:52,496 DEBUG [Thread-8]: io.AcidUtils
(AcidUtils.java:getAcidState(432)) - in directory
hdfs://sorm-master01.msk.mts.ru:8020/apps/hive/warehouse/data_aaa/dt=20160726
base = null deltas = 0*

*2016-07-26 18:06:52,496 DEBUG [Thread-8]: compactor.Initiator
(Initiator.java:determineCompactionType(271)) - delta size: 0 base size: 0
threshold: 0.1 will major compact: false*

But that directory actually contains 23 files:

hadoop fs -ls /apps/hive/warehouse/data_aaa/dt=20160726
Found 23 items
-rw-r--r--   3 storm hdfs  4 2016-07-26 17:20
/apps/hive/warehouse/data_aaa/dt=20160726/_orc_acid_version
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:22
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71741256_71741355
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:23
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71762456_71762555
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:25
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71787756_71787855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:26
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71795756_71795855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:27
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71804656_71804755
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:29
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71828856_71828955
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:30
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71846656_71846755
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:32
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71850756_71850855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:33
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71867356_71867455
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:34
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71891556_71891655
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:36
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71904856_71904955
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:37
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71907256_71907355
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:39
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71918756_71918855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:40
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71947556_71947655
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:41
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71960656_71960755
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:43
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71963156_71963255
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:44
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71964556_71964655
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:46
/apps/hive/warehouse/data_aaa/dt=20160726/delta_71987156_71987255
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:47
/apps/hive/warehouse/data_aaa/dt=20160726/delta_72015756_72015855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:48
/apps/hive/warehouse/data_aaa/dt=20160726/delta_72021356_72021455
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
/apps/hive/warehouse/data_aaa/dt=20160726/delta_72048756_72048855
drwxrwxrwx   - storm hdfs  0 2016-07-26 17:50
/apps/hive/warehouse/data_aaa/dt=20160726/delta_72070856_72070955


Full log here <http://pastebin.com/gHwvgRUV>.

What could go wrong?


Does HIVE JDBC return same sequence of records?

2016-07-04 Thread Igor Kuzmenko
If I run the query "*SELECT * FROM table t WHERE t.partition = value*" with
Hive JDBC several times, is there a guarantee that when I iterate through the
result set I get the records in the same order every time?
Intuitively it feels like yes, because that query needs no MapReduce and Hive
just reads the data from the HDFS directory.
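
As a general SQL rule, row order is only guaranteed when it is made explicit with an ORDER BY; a hedged JDBC sketch of that (the URL, table, and record_id column are placeholders, not from the original mail), with the caveat that adding ORDER BY typically turns the simple fetch into a full job:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OrderedFetchSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical JDBC URL and credentials -- adjust to your cluster.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://example-host:10000/default", "hdfs", "");
             Statement stmt = conn.createStatement();
             // ORDER BY on a unique key is the only portable way to get a stable
             // row order; without it the order is whatever the reader returns.
             ResultSet rs = stmt.executeQuery(
                     "SELECT * FROM my_table t WHERE t.dt = 20151124 ORDER BY t.record_id")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}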


What is the best way to store IPv6 address in Hive?

2016-06-28 Thread Igor Kuzmenko
Currently I'm using ORC transactional tables, and I need to store a lot of
data containing IP addresses.
With IPv4 it can be an integer (exactly 4 bytes), but what about IPv6?
Obviously it should be space efficient and easy to search for an exact match.
As an extra feature it would be good to do a fast search with a mask (10.120.*.*).
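
One common approach, sketched below as a client-side helper (the class and method names are illustrative, not an established API): normalize every address to a fixed 16-byte form, mapping IPv4 into the ::ffff:a.b.c.d range, and store it either as one Hive BINARY column or as two BIGINT columns, which keeps exact-match lookups simple.

import java.net.InetAddress;
import java.nio.ByteBuffer;

public class IpToFixedWidthSketch {
    // Returns a 16-byte, big-endian representation; IPv4 addresses are mapped
    // into the ::ffff:a.b.c.d space so both families share one layout.
    static byte[] to16Bytes(String ip) throws Exception {
        byte[] raw = InetAddress.getByName(ip).getAddress(); // 4 or 16 bytes
        if (raw.length == 16) return raw;
        byte[] out = new byte[16];
        out[10] = (byte) 0xff;
        out[11] = (byte) 0xff;
        System.arraycopy(raw, 0, out, 12, 4);
        return out;
    }

    // Alternative: split the 16 bytes into two BIGINT columns (ip_hi, ip_lo),
    // which ORC stores compactly and which still allows exact-match predicates.
    static long[] toTwoLongs(String ip) throws Exception {
        ByteBuffer bb = ByteBuffer.wrap(to16Bytes(ip));
        return new long[] { bb.getLong(), bb.getLong() };
    }

    public static void main(String[] args) throws Exception {
        long[] v6 = toTwoLongs("2001:db8::1");
        long[] v4 = toTwoLongs("10.120.3.7");
        System.out.printf("v6: %x %x%n", v6[0], v6[1]);
        System.out.printf("v4: %x %x%n", v4[0], v4[1]);
    }
}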


Re: Delete hive partition while executing query.

2016-06-09 Thread Igor Kuzmenko
I've opened jira issue <https://issues.apache.org/jira/browse/HIVE-13978>

On Wed, Jun 8, 2016 at 9:01 PM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> This looks like proof of a bug.  The reads locks 179730 and 179731 should
> have been blocked by 179729.
> As Alan said this won’t prevent the exception you are getting but it needs
> to be fixed to prevent a partition from disappearing while query 3 and 4
> are in progress.
>
> Could you file a Jira please?
>
> thanks,
> Eugene
>
> From: Igor Kuzmenko <f1she...@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Wednesday, June 8, 2016 at 7:30 AM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Re: Delete hive partition while executing query.
>
> Hi, thanks for reply, Alan.
> Here's one more test.
>
> Wed Jun 08 16:36:02 MSK 2016 Start thread 1
> Wed Jun 08 16:36:05 MSK 2016 Start thread 2
> Wed Jun 08 16:36:08 MSK 2016 Start thread 3
> Wed Jun 08 16:36:11 MSK 2016 Start thread 4
> Wed Jun 08 16:36:17 MSK 2016 Finish thread 1
> Wed Jun 08 16:36:17 MSK 2016 Thread 1 result: '344186'
> Wed Jun 08 16:36:17 MSK 2016 Thread 1 completed in 14443 ms
>
> Wed Jun 08 16:36:19 MSK 2016 Finished 2
> Wed Jun 08 16:36:19 MSK 2016 Thread 2 completed in 13967 ms
>
> Wed Jun 08 16:36:20 MSK 2016 Finish thread 3
> Wed Jun 08 16:36:20 MSK 2016 Thread 3 result: '344186'
> Wed Jun 08 16:36:20 MSK 2016 Thread 3 completed in 11737 ms
>
> java.sql.SQLException: Error while processing statement: FAILED: Execution
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
> Vertex failed, vertexName=Map 1, vertexId=vertex_1461923723503_0931_1_00,
> diagnostics=[Vertex vertex_1461923723503_0931_1_00 [Map 1] killed/failed
> due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: mobile_connections
> initializer failed, vertex=vertex_1461923723503_0931_1_00 [Map 1],
> java.lang.RuntimeException: serious problem
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)
>
>
>
> HiveMetaStore.log
> <https://drive.google.com/open?id=0ByB92PAoAkrKVU1rT25OcjJuOEk>
> HiveServer2.log
> <https://drive.google.com/open?id=0ByB92PAoAkrKakhqYXhmdzlidkk>
>
> I didn't find anything intresing in metastore log, but HiveServer2 log
> contains this:
>
> Line 1023: 2016-06-08 16:36:04,456 INFO  [HiveServer2-Background-Pool:
> Thread-42]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) -
> Requesting:
> queryId=hive_20160608163602_542056d9-c524-4df4-af18-6aa5e906284f
> LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE,
> dbname:default, tablename:mobile_connections),
> LockComponent(type:SHARED_READ, level:PARTITION, dbname:default,
> tablename:mobile_connections,
> partitionname:dt=20151123/msisdn_last_digit=3)], txnid:0, user:hdfs,
> hostname:mercury)
> Line 1043: 2016-06-08 16:36:04,546 INFO  [HiveServer2-Background-Pool:
> Thread-42]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response
> to queryId=hive_20160608163602_542056d9-c524-4df4-af18-6aa5e906284f
> LockResponse(lockid:179728, state:ACQUIRED)
> Line 1349: 2016-06-08 16:36:05,214 INFO  [HiveServer2-Background-Pool:
> Thread-50]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) -
> Requesting:
> queryId=hive_20160608163604_832abbff-6199-497e-b969-fd8ac1465abc
> LockRequest(component:[LockComponent(type:EXCLUSIVE, level:PARTITION,
> dbname:default, tablename:mobile_connections,
> partitionname:dt=20151123/msisdn_last_digit=3)], txnid:0, user:hdfs,
> hostname:mercury)
> Line 1390: 2016-06-08 16:36:05,270 INFO  [HiveServer2-Background-Pool:
> Thread-50]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response
> to queryId=hive_20160608163604_832abbff-6199-497e-b969-fd8ac1465abc
> LockResponse(lockid:179729, state:WAITING)
> Line 2346: 2016-06-08 16:36:08,028 INFO  [HiveServer2-Background-Pool:
> Thread-68]: lockmgr.DbLockManager (DbLockManager.java:lock(98)) -
> Requesting:
> queryId=hive_20160608163607_7b18da12-6f86-41c9-b4b1-be45252c18c2
> LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE,
> dbname:default, tablename:mobile_connections),
> LockComponent(type:SHARED_READ, level:PARTITION, dbname:default,
> tablename:mobile_connections,
> partitionname:dt=20151123/msisdn_last_digit=3)], txnid:0, user:hdfs,
> hostname:mercury)
> Line 2370: 2016-06-08 16:36:08,069 INFO  [HiveServer2-Background-Pool:
> Thread-68]: lockmgr.DbLockManager (DbLockManager.java:lock(101)) - Response
> to queryId=hive_20160608163607_7b18da12-6f86-41c9-b4b1-be45252c18c2
> LockResponse(lockid:179730, state:ACQUIRED)
> Line 3561: 2016-06-08 16:36:11,000 INFO  [HiveServer2-

Re: Delete hive partition while executing query.

2016-06-08 Thread Igor Kuzmenko
[HiveServer2-Background-Pool:
Thread-50]: lockmgr.DbLockManager (DbLockManager.java:unlock(185)) -
Removed a lock true
Line 4976: 2016-06-08 16:36:19,439 DEBUG [HiveServer2-Background-Pool:
Thread-68]: lockmgr.DbLockManager (DbLockManager.java:unlock(182)) -
Unlocking lockid:179730
Line 4979: 2016-06-08 16:36:19,486 DEBUG [HiveServer2-Background-Pool:
Thread-68]: lockmgr.DbLockManager (DbLockManager.java:unlock(185)) -
Removed a lock true
Line 5467: 2016-06-08 16:36:28,090 DEBUG [HiveServer2-Background-Pool:
Thread-91]: lockmgr.DbLockManager (DbLockManager.java:unlock(182)) -
Unlocking lockid:179731
Line 5470: 2016-06-08 16:36:28,131 DEBUG [HiveServer2-Background-Pool:
Thread-91]: lockmgr.DbLockManager (DbLockManager.java:unlock(185)) -
Removed a lock true


On Tue, Jun 7, 2016 at 9:01 PM, Alan Gates <alanfga...@gmail.com> wrote:

> Eugene Koifman pointed out to me that there is one other possibility.  If
> we ignore query 3 for a minute, I can explain this with the following
> timeline:
>
> 1) Query one requests and gets a read lock, starts its select.
> 2) Query two (alter table…) requests an exclusive lock, and then blocks
> until query one has completed.
> 3) Query four requests a read lock, and then blocks until query two has
> completed.  Once it can run it finds that part of its inputs have gone away
> and thus fails with an error.
>
> Hive’s metadata is not part of the transactional system.  So we don’t keep
> the old dropped partition around after query two drops it.  So the error
> from query four is actually the correct thing to do.  If four ignored the
> fact that the partition had vanished and returned results minus that
> partition it would not be returning correct results as of the time its
> transaction started.  Because the underlying data has been dropped the
> query is now unanswerable an error is the best possible answer.
>
> This situation cannot be fixed until Hive DDL operations take part in
> transactions.  I don’t know when that will happen.
>
> However, this explanation is incomplete because it doesn’t account for
> query three.  That query should also have blocked.  The code is not
> supposed to allow “jumping” exclusive locks.  That is, query three
> shouldn’t be allowed to acquire the read lock along with query one because
> query two’s exclusive lock is in between.  It’s possible there’s a bug in
> that code.  It’s also possible that the above explanation is not what’s
> happening.
>
> Would it be possible for you to turn on debug logging on your thrift
> metastore process and rerun this test and post the logs somewhere?  Apache
> lists strip attachments so you won’t be able to attach them here, you’ll
> have to put them up on paste bin or something.
>
> Alan.
>
> > On Jun 7, 2016, at 04:00, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> >   • hive.support.concurrency – true
> >   • hive.enforce.bucketing – true (Not required as of Hive 2.0)
> >   • hive.exec.dynamic.partition.mode – nonstrict
> >   • hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
> >   • hive.compactor.initiator.on – true (for exactly one instance of
> the Thrift metastore service)
> >   • hive.compactor.worker.threads – a positive number on at least
> one instance of the Thrift metastore service
> > I've double check all settings and restart hive. The problem remained.
> > I've complete the same test with little modification in console output.
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=1", 1);
> > Thread.sleep(3000);
> > asyncExecute("alter table mobile_connections drop if exists partition
> (dt=20151124, msisdn_last_digit=1) purge", 2);
> > Thread.sleep(3000);
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=1", 3);
> > Thread.sleep(3000);
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=1", 4);
> >
> > Like in previous test four queries executing in parallel. First, third
> and fourth are simple selects. Second is drop partition command.
> > Here's out put:
> >
> > Tue Jun 07 13:45:18 MSK 2016 Start thread 1
> > Tue Jun 07 13:45:21 MSK 2016 Start thread 2
> > Tue Jun 07 13:45:24 MSK 2016 Start thread 3
> > Tue Jun 07 13:45:27 MSK 2016 Start thread 4
> >
> > Tue Jun 07 13:45:32 MSK 2016 Finish thread 1
> > Tue Jun 07 13:45:32 MSK 2016 Thread 1 result: '210802'
> > Tue Jun 07 13:45:32 MSK 2016 Thread 1 completed in 13810 ms
> >
> > Tue Jun 07 13:

Re: Delete hive partition while executing query.

2016-06-07 Thread Igor Kuzmenko
rt(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1036)
... 15 more

So the second thread definitely waits until the first thread completes and
only then drops the partition. Then, somehow, after the partition was dropped,
the third query completes and shows a result. The fourth query doesn't
complete at all and throws the exception.
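
One way to watch the lock transitions described above from a second session is to poll SHOW LOCKS over JDBC while the test runs; a minimal sketch (connection details are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ShowLocksSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://example-host:10000/default", "hdfs", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW LOCKS")) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= columns; i++) {
                    row.append(rs.getString(i)).append('\t');
                }
                System.out.println(row);  // lock id, table, partition, state, type, ...
            }
        }
    }
}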






On Mon, Jun 6, 2016 at 8:30 PM, Alan Gates <alanfga...@gmail.com> wrote:

> Do you have the system configured to use the DbTxnManager?  See
> https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
> for details on how to set this up.  The transaction manager is what manages
> locking and makes sure that your queries don’t stomp each other.
>
> Alan.
>
> > On Jun 6, 2016, at 06:01, Igor Kuzmenko <f1she...@gmail.com> wrote:
> >
> > Hello, I'm trying to find a safe way to delete partition with all data
> it includes.
> >
> > I'm using Hive 1.2.1, Hive JDBC driver 1.2.1 and perform simple test on
> transactional table:
> >
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=2", 1);
> > Thread.sleep(3000);
> > asyncExecute("alter table mobile_connections drop if exists partition
> (dt=20151124, msisdn_last_digit=2) purge", 2);
> > Thread.sleep(3000);
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=2", 3);
> > Thread.sleep(3000);
> > asyncExecute("Select count(distinct in_info_msisdn) from
> mobile_connections where dt=20151124 and msisdn_last_digit=2", 4);
> > (full code here)
> >
> > I cretate several threads, each execute query async. First is querying
> partition. Second drop partition. Others are the same as first. First query
> takes about 10-15 seconds to complete, so "alter table" query starts before
> first query completes.
> > As a result i get:
> >   • First query - successfully completes
> >   • Second query - successfully completes
> >   • Third query - successfully completes
> >   • Fourth query - throw exception:
> > java.sql.SQLException: Error while processing statement: FAILED:
> Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map
> 1, vertexId=vertex_1461923723503_0189_1_00, diagnostics=[Vertex
> vertex_1461923723503_0189_1_00 [Map 1] killed/failed due
> to:ROOT_INPUT_INIT_FAILURE, Vertex Input: mobile_connections initializer
> failed, vertex=vertex_1461923723503_0189_1_00 [Map 1],
> java.lang.RuntimeException: serious problem
> >   at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)
> >   at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1086)
> >   at
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
> >   at
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
> >   at
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
> >   at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:255)
> >   at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:248)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:422)
> >   at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >   at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:248)
> >   at
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:235)
> >   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >   at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.util.concurrent.ExecutionException:
> java.io.FileNotFoundException: File
> hdfs://jupiter.bss:8020/apps/hive/warehouse/mobile_connections/dt=20151124/msisdn_last_digit=2
> does not exist.
> >   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> >   at java.util.concurrent

Delete hive partition while executing query.

2016-06-06 Thread Igor Kuzmenko
Hello, I'm trying to find a safe way to delete a partition together with all
the data it includes.

I'm using Hive 1.2.1 and Hive JDBC driver 1.2.1, and I'm running a simple test
on a transactional table:

asyncExecute("Select count(distinct in_info_msisdn) from
mobile_connections where dt=20151124 and msisdn_last_digit=2", 1);
Thread.sleep(3000);
asyncExecute("alter table mobile_connections drop if exists partition
(dt=20151124, msisdn_last_digit=2) purge", 2);
Thread.sleep(3000);
asyncExecute("Select count(distinct in_info_msisdn) from
mobile_connections where dt=20151124 and msisdn_last_digit=2", 3);
Thread.sleep(3000);
asyncExecute("Select count(distinct in_info_msisdn) from
mobile_connections where dt=20151124 and msisdn_last_digit=2", 4);

(full code here )
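
The linked code is not reproduced in the archive; below is a minimal sketch of what such an asyncExecute helper might look like, assuming plain Hive JDBC with one connection per thread (URL, credentials, and logging format are placeholders, not the original code):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncExecuteSketch {
    private static final ExecutorService pool = Executors.newCachedThreadPool();
    private static final String URL = "jdbc:hive2://example-host:10000/default"; // placeholder

    // Runs the statement on its own connection and logs start/finish,
    // mirroring the timestamps shown in the thread output above.
    static void asyncExecute(String sql, int id) {
        pool.submit(() -> {
            long start = System.currentTimeMillis();
            System.out.println(new Date() + " Start thread " + id);
            try (Connection conn = DriverManager.getConnection(URL, "hdfs", "");
                 Statement stmt = conn.createStatement()) {
                if (stmt.execute(sql)) {                      // true if there is a ResultSet
                    try (ResultSet rs = stmt.getResultSet()) {
                        if (rs.next()) {
                            System.out.println(new Date() + " Thread " + id
                                    + " result: '" + rs.getString(1) + "'");
                        }
                    }
                }
                System.out.println(new Date() + " Thread " + id + " completed in "
                        + (System.currentTimeMillis() - start) + " ms");
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }
}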

I create several threads, each executing a query asynchronously. The first
queries a partition, the second drops that partition, and the others are the
same as the first. The first query takes about 10-15 seconds to complete, so
the "alter table" query starts before the first query completes.
As a result I get:

   - First query - successfully completes
   - Second query - successfully completes
   - Third query - successfully completes
   - Fourth query - throws an exception:

java.sql.SQLException: Error while processing statement: FAILED: Execution
Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex failed, vertexName=Map 1, vertexId=vertex_1461923723503_0189_1_00,
diagnostics=[Vertex vertex_1461923723503_0189_1_00 [Map 1] killed/failed
due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: mobile_connections
initializer failed, vertex=vertex_1461923723503_0189_1_00 [Map 1],
java.lang.RuntimeException: serious problem
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1086)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
at
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:255)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:248)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:248)
at
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:235)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException:
java.io.FileNotFoundException: File
hdfs://jupiter.bss:8020/apps/hive/warehouse/mobile_connections/dt=20151124/msisdn_last_digit=2
does not exist.
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1036)
... 15 more
Caused by: java.io.FileNotFoundException: File
hdfs://jupiter.bss:8020/apps/hive/warehouse/mobile_connections/dt=20151124/msisdn_last_digit=2
does not exist.
at
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:958)
at
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:937)
at
org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:882)
at
org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:878)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:878)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1694)
at
org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:690)
at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:366)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:648)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
... 4 more
]Vertex killed, vertexName=Reducer 3,
vertexId=vertex_1461923723503_0189_1_02, diagnostics=[Vertex received Kill
in INITED state., Vertex vertex_1461923723503_0189_1_02 [Reducer 3]
killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Reducer
2, 

Hive Hcatalog Streaming. Why hive table must be bucketed?

2016-04-08 Thread Igor Kuzmenko
Hello, I've got a few questions about Hive HCatalog streaming.
This feature has the requirement:
"*The Hive table must be bucketed, but not sorted. So something like
“clustered by (colName) into 10 buckets” must be specified during table
creation. The number of buckets is ideally the same as the number of
streaming writers*."

1) I wonder why this is a required condition for streaming?
2) How many buckets should I create when the number of streaming writers
changes over time (for example from 1 to 10)?
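
For reference, a sketch of roughly what a single streaming writer looks like against such a bucketed table, based on the published HCatalog streaming interfaces; the metastore URI, table, partition value, and column names are placeholders, and error handling is omitted:

import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class StreamingWriterSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint: metastore URI, database, table, partition values.
        HiveEndPoint endPoint = new HiveEndPoint(
                "thrift://metastore-host:9083", "default", "my_stream_table",
                Arrays.asList("20160408"));

        StreamingConnection connection = endPoint.newConnection(true);
        DelimitedInputWriter writer =
                new DelimitedInputWriter(new String[]{"col1", "col2"}, ",", endPoint);

        // Each writer works through transaction batches; a batch that is never
        // committed or closed is what keeps txn ids open on the metastore side.
        TransactionBatch txnBatch = connection.fetchTransactionBatch(10, writer);
        try {
            txnBatch.beginNextTransaction();
            txnBatch.write("a,1".getBytes("UTF-8"));
            txnBatch.write("b,2".getBytes("UTF-8"));
            txnBatch.commit();
        } finally {
            txnBatch.close();      // closes the remaining transactions in the batch
            connection.close();
        }
    }
}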


Hive StreamingAPI leaves table in not consistent state

2016-03-10 Thread Igor Kuzmenko
Hello, I'm using Hortonworks Data Platform 2.3.4, which includes Apache Hive
1.2.1 and Apache Storm 0.10.
I've built a Storm topology using the Hive Bolt, which eventually uses the
Hive Streaming API to stream data into a Hive table.
In Hive I've created a transactional table:


CREATE EXTERNAL TABLE cdr1 (

)
PARTITIONED BY (dt INT)
CLUSTERED BY (telcoId) INTO 5 buckets
STORED AS ORC
LOCATION '/data/sorm3/cdr/cdr1'
TBLPROPERTIES ("transactional"="true")


Hive settings:


hive.support.concurrency=true
hive.enforce.bucketing=true
hive.exec.dynamic.partition.mode=nonstrict
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on=true
hive.compactor.worker.threads=1


When I run my Storm topology it fails with an OutOfMemoryException. The Storm
exception doesn't bother me, it was just a test. But after the topology failed,
my Hive table is left in an inconsistent state.
A simple select from the table leads to an exception:

SELECT COUNT(*) FROM cdr1
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1,
vertexId=vertex_1453891518300_0098_1_00, diagnostics=[Task failed,
taskId=task_1453891518300_0098_1_00_00, diagnostics=[TaskAttempt 0
failed, info=[Error: Failure while running task:java.lang.RuntimeException:
java.lang.RuntimeException: java.io.IOException: java.io.EOFException

Caused by: java.io.IOException: java.io.EOFException
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
... 19 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:370)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:317)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:238)
at
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:460)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1269)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
... 20 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1
killedTasks:0, Vertex vertex_1453891518300_0098_1_00 [Map 1] killed/failed
due to:OWN_TASK_FAILURE]
ERROR : Vertex killed, vertexName=Reducer 2,
vertexId=vertex_1453891518300_0098_1_01, diagnostics=[Vertex received Kill
while in RUNNING state., Vertex did not succeed due to
OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex
vertex_1453891518300_0098_1_01 [Reducer 2] killed/failed due
to:OTHER_VERTEX_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1
killedVertices:1



Compaction fails with same exception:

2016-03-10 13:20:54,550 WARN [main] org.apache.hadoop.mapred.YarnChild:
Exception running child : java.io.EOFException: Cannot seek after EOF
at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1488)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
at
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:368)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:317)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:238)
at
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:460)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1362)
at
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:565)
at
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:544)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)



Looking through the files that were created by streaming, I've found several
zero-sized ORC files. Probably these files lead to the exception.


Is this normal for a Hive transactional table? How can I prevent such behavior?
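
In case it helps to confirm which files are the culprits, here is a small sketch that lists zero-length files under the partition directory using the Hadoop FileSystem API; the partition path is a placeholder taken from the table definition above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class FindEmptyOrcFiles {
    public static void main(String[] args) throws Exception {
        // Placeholder partition path under the table location above.
        Path partition = new Path("/data/sorm3/cdr/cdr1/dt=20160310");
        FileSystem fs = FileSystem.get(new Configuration());

        RemoteIterator<LocatedFileStatus> it = fs.listFiles(partition, true);
        while (it.hasNext()) {
            LocatedFileStatus status = it.next();
            // Zero-length bucket files inside delta_* directories are the ones the
            // ORC reader trips over when it tries to read the footer.
            if (status.isFile() && status.getLen() == 0
                    && !status.getPath().getName().startsWith("_")) {
                System.out.println("empty: " + status.getPath());
            }
        }
    }
}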