[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-04-05 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517302#comment-17517302
 ] 

mahesh kumar behera commented on HIVE-25540:


[~zabetak] 

The batch update has been tested at scale only with the MySQL and Postgres backends. 

> Enable batch update of column stats only for MySql and Postgres 
> 
>
> Key: HIVE-25540
> URL: https://issues.apache.org/jira/browse/HIVE-25540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The batch update of partition column stats using direct SQL is tested only 
> for MySQL and Postgres.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-03-21 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509772#comment-17509772
 ] 

Stamatis Zampetakis commented on HIVE-25540:


[~pvary] The changes in HIVE-26040 do seem reasonable for the problem I 
discovered, many thanks for the quick fix.

However, I found this issue just by running one random qtest on a metastore 
using mssql, so I cannot say with confidence that we now have sufficient test 
coverage to claim that this feature (HIVE-25181) is production ready in 
*all* databases; I will let [~maheshk114] answer this question.



[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-03-16 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507813#comment-17507813
 ] 

Peter Vary commented on HIVE-25540:

[~zabetak], [~maheshk114]: I have a fix for the issue above; see HIVE-26040. 
Could you please check whether successful runs of the commands below are 
enough to confirm that the fix lets us close this Jira?
{code}
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mssql
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=oracle
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=postgres
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=derby
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mysql
{code}



[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-03-16 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17507810#comment-17507810
 ] 

Peter Vary commented on HIVE-25540:

Created a Jira to fix the issue mentioned above: HIVE-26040



[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-03-15 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506975#comment-17506975
 ] 

Stamatis Zampetakis commented on HIVE-25540:


I think it would be good to solve this JIRA before releasing 4.0.0-alpha-1 to 
avoid failures like the one outlined above.



[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-03-15 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506973#comment-17506973
 ] 

Stamatis Zampetakis commented on HIVE-25540:


Today I was running a few tests (over commit 
https://github.com/apache/hive/commit/d696b34a5765fe950ebe4bfffd36b9ea914dfaab) 
with various kinds of metastore backends (e.g., Microsoft SQL Server) for 
another JIRA case and I ran into exceptions with direct SQL when updating 
statistics, which I think are related to, or can be solved by, this JIRA.
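
For context, SQL Server does not accept a trailing FOR UPDATE on a plain SELECT (the log below shows the resulting error); row locks there are usually requested with the WITH (UPDLOCK) table hint instead. The following is a minimal sketch of building the locking query per backend. It is not taken from any Hive patch, and the class name, the method name, the "mssql" backend label, and the table and column names in the usage note are all hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the Hive code base. */
final class LockingSelect {
    /**
     * Builds "SELECT columns FROM table WHERE predicate" with a row-lock request.
     * dbType is a backend label such as "mssql" or "postgres" (assumed naming).
     */
    static String build(String dbType, String columns, String table, String predicate) {
        boolean sqlServer = dbType != null
                && dbType.toLowerCase(Locale.ROOT).equals("mssql");
        if (sqlServer) {
            // SQL Server: request the lock with a table hint right after the table name.
            return "SELECT " + columns + " FROM " + table
                    + " WITH (UPDLOCK) WHERE " + predicate;
        }
        // Every other backend in this sketch: append the FOR UPDATE clause.
        return "SELECT " + columns + " FROM " + table
                + " WHERE " + predicate + " FOR UPDATE";
    }
}
```

For example, LockingSelect.build("mssql", "NEXT_VAL", "SEQ_TABLE", "SEQ_NAME = ?") yields "SELECT NEXT_VAL FROM SEQ_TABLE WITH (UPDLOCK) WHERE SEQ_NAME = ?", while the same call with "postgres" yields "SELECT NEXT_VAL FROM SEQ_TABLE WHERE SEQ_NAME = ? FOR UPDATE".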

{code:bash}
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=list_bucket_dml_9.q -Dtest.metastore.db=mssql
{code}

{noformat}
2022-03-15T07:57:17,078 ERROR [2b933b88-6083-4750-b151-2d2c7e04ccce main] metastore.DirectSqlUpdateStat: Unable to getNextCSIdForMPartitionColumnStatistics
com.microsoft.sqlserver.jdbc.SQLServerException: Line 1: FOR UPDATE clause allowed only for DECLARE CURSOR.
    at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:258) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1535) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:845) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:752) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7151) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2478) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:219) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:199) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:654) ~[mssql-jdbc-6.2.1.jre8.jar:?]
    at com.zaxxer.hikari.pool.ProxyStatement.executeQuery(ProxyStatement.java:108) ~[HikariCP-2.6.1.jar:?]
    at com.zaxxer.hikari.pool.HikariProxyStatement.executeQuery(HikariProxyStatement.java) ~[HikariCP-2.6.1.jar:?]
    at org.apache.hadoop.hive.metastore.DirectSqlUpdateStat.getNextCSIdForMPartitionColumnStatistics(DirectSqlUpdateStat.java:676) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.updatePartitionColumnStatisticsBatch(MetaStoreDirectSql.java:2966) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatisticsInBatch(ObjectStore.java:9849) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at com.sun.proxy.$Proxy60.updatePartitionColumnStatisticsInBatch(Unknown Source) [?:?]
    at org.apache.hadoop.hive.metastore.HMSHandler.updatePartitionColStatsForOneBatch(HMSHandler.java:7060) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.HMSHandler.updatePartitionColStatsInBatch(HMSHandler.java:7113) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.HMSHandler.set_aggr_stats_for(HMSHandler.java:9137) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:146) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) [hive-standalone-metastore-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at com.sun.proxy.$Proxy61.set_aggr_stats_for(Unknown 

[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2021-12-14 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459241#comment-17459241
 ] 

mahesh kumar behera commented on HIVE-25540:


[~zabetak] 

The batch update uses direct SQL to reduce the number of backend database 
calls. Some of the SQL statements used are not supported by Oracle, so we need 
to add a check that goes via DN (DataNucleus) when the backend DB is Oracle. 
Currently we have tested only on MySQL and Postgres. The batch update feature 
has not shipped yet.
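
The check described above could be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class BatchStatsGate, the Backend enum, and both method names are hypothetical. The allow-list holds only MySQL and Postgres because those are the backends reported as tested; every other backend is assumed to fall back to the DataNucleus path.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch; names are illustrative and not taken from Hive. */
final class BatchStatsGate {
    /** Metastore backends mentioned in this thread, plus a catch-all. */
    enum Backend { DERBY, MSSQL, MYSQL, ORACLE, POSTGRES, OTHER }

    // Only the backends reported as tested with the direct SQL batch update.
    private static final Set<Backend> BATCH_TESTED =
            EnumSet.of(Backend.MYSQL, Backend.POSTGRES);

    /** Maps a backend label such as "mssql" to a Backend; unknown labels map to OTHER. */
    static Backend parse(String name) {
        if (name == null) {
            return Backend.OTHER;
        }
        try {
            return Backend.valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return Backend.OTHER;
        }
    }

    /** True: use the direct SQL batch path. False: fall back to DataNucleus. */
    static boolean useDirectSqlBatch(Backend backend) {
        return BATCH_TESTED.contains(backend);
    }
}
```

An allow-list rather than an Oracle-only deny-list means that an untested backend such as mssql or derby is routed to the slower but already exercised path by default.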
