[jira] [Commented] (HIVE-19360) CBO: Add an "optimizedSQL" to QueryPlan object

2018-04-30 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459137#comment-16459137
 ] 

Rajesh Balamohan commented on HIVE-19360:
-

Would this be ported to 2.x as well by any chance?

> CBO: Add an "optimizedSQL" to QueryPlan object 
> ---
>
> Key: HIVE-19360
> URL: https://issues.apache.org/jira/browse/HIVE-19360
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Diagnosability
>Affects Versions: 3.1.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-19360.1.patch, HIVE-19360.2.patch
>
>
> Calcite RelNodes can be converted back into SQL (as the new JDBC storage 
> handler does), which allows Hive to print out the post-CBO plan as a SQL 
> query instead of having to guess the join orders from the subsequent Tez plan.
> The generated query might not always be valid SQL at this point, but it is a 
> world ahead of DAG plans in readability.
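> For illustration, a minimal sketch of the Calcite API involved (not Hive's 
> actual integration; the dialect choice and class wiring are assumptions, and 
> the method names follow recent Calcite releases):
> {code}
> import org.apache.calcite.rel.RelNode;
> import org.apache.calcite.rel.rel2sql.RelToSqlConverter;
> import org.apache.calcite.sql.SqlDialect;
> import org.apache.calcite.sql.SqlNode;
> 
> public final class RelToSql {
>   // Unparse an optimized RelNode tree back into SQL text.
>   public static String toSql(RelNode optimizedPlan, SqlDialect dialect) {
>     RelToSqlConverter converter = new RelToSqlConverter(dialect);
>     SqlNode sqlNode = converter.visitRoot(optimizedPlan).asStatement();
>     return sqlNode.toSqlString(dialect).getSql();
>   }
> }
> {code}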
> E.g., the tpc-ds Query4 CTEs get expanded to
> {code}
> SELECT t16.$f3 customer_preferred_cust_flag
> FROM
>   (SELECT t0.c_customer_id $f0,
>SUM((t2.ws_ext_list_price - 
> t2.ws_ext_wholesale_cost - t2.ws_ext_discount_amt + t2.ws_ext_sales_price) / 
> CAST(2 AS DECIMAL(10, 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t0
>INNER JOIN (
>  (SELECT ws_sold_date_sk,
>  ws_bill_customer_sk,
>  ws_ext_discount_amt,
>  ws_ext_sales_price,
>  ws_ext_wholesale_cost,
>  ws_ext_list_price
>   FROM default.web_sales
>   WHERE ws_bill_customer_sk IS NOT NULL
> AND ws_sold_date_sk IS NOT NULL) t2
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t4 ON t2.ws_sold_date_sk = 
> t4.d_date_sk) ON t0.c_customer_sk = t2.ws_bill_customer_sk
>GROUP BY t0.c_customer_id,
> t0.c_first_name,
> t0.c_last_name,
> t0.c_preferred_cust_flag,
> t0.c_birth_country,
> t0.c_login,
> t0.c_email_address) t7
> INNER JOIN (
>   (SELECT t9.c_customer_id $f0,
>t9.c_preferred_cust_flag $f3,
> 
> SUM((t11.ss_ext_list_price - t11.ss_ext_wholesale_cost - 
> t11.ss_ext_discount_amt + t11.ss_ext_sales_price) / CAST(2 AS DECIMAL(10, 
> 0))) $f8
>FROM
>  (SELECT c_customer_sk,
>  c_customer_id,
>  c_first_name,
>  c_last_name,
>  c_preferred_cust_flag,
>  c_birth_country,
>  c_login,
>  c_email_address
>   FROM default.customer
>   WHERE c_customer_sk IS NOT NULL
> AND c_customer_id IS NOT NULL) t9
>INNER JOIN (
>  (SELECT ss_sold_date_sk,
>  ss_customer_sk,
>  ss_ext_discount_amt,
>  ss_ext_sales_price,
>  ss_ext_wholesale_cost,
>  ss_ext_list_price
>   FROM default.store_sales
>   WHERE ss_customer_sk IS NOT NULL
> AND ss_sold_date_sk IS NOT NULL) t11
>INNER JOIN
>  (SELECT d_date_sk,
>  CAST(2002 AS INTEGER) d_year
>   FROM default.date_dim
>   WHERE d_year = 2002
> AND d_date_sk IS NOT NULL) t13 ON 
> t11.ss_sold_date_sk = t13.d_date_sk) ON t9.c_customer_sk = t11.ss_customer_sk
>GROUP BY t9.c_customer_id,
> t9.c_first_name,
> t9.c_last_name,
> t9.c_preferred_cust_flag,
> t9.c_birth_country,
> t9.c_login,
>

[jira] [Updated] (HIVE-26699) Iceberg: S3 fadvise can hurt JSON parsing significantly

2022-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26699:

Summary: Iceberg: S3 fadvise can hurt JSON parsing significantly  (was: 
Iceberg: S3 fadvise can hurt JSON parsing significantly in DWX)

> Iceberg: S3 fadvise can hurt JSON parsing significantly
> ---
>
> Key: HIVE-26699
> URL: https://issues.apache.org/jira/browse/HIVE-26699
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Hive reads JSON metadata information (TableMetadataParser::read()) multiple 
> times, e.g. during query compilation, AM split computation, stats computation, 
> and during commits.
>  
> With large JSON files (due to multiple inserts), reads take a lot longer with 
> the S3 FS when "fs.s3a.experimental.input.fadvise" is set to "random" (e.g. in 
> the order of 10x). To be on the safer side, it would be good to set this to 
> "normal" mode in configs when reading iceberg tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26902) Failed to close AbstractFileMergeOperator

2023-01-03 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17653882#comment-17653882
 ] 

Rajesh Balamohan commented on HIVE-26902:
-

Hive on Spark is no longer supported, so it is highly unlikely that this ticket will get fixed.

> Failed to close AbstractFileMergeOperator
> -
>
> Key: HIVE-26902
> URL: https://issues.apache.org/jira/browse/HIVE-26902
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 3.1.2
> Environment: hadoop:3.2.1
> hive:3.1.2
> spark:2.4.6
> hive on spark
>Reporter: zhenkuan_zhang
>Priority: Major
>
> When I set hive.merge.sparkfiles to true, an error is sometimes reported while 
> SQL is running. The error log is as follows:
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close 
> AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:115)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:2212)
>   at 
> org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:2212)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>   at org.apache.spark.scheduler.Task.run(Task.scala:123)
>   at 
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close 
> AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:315)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:265)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMergeFileRecordHandler.close(SparkMergeFileRecordHandler.java:113)
>   ... 17 more
> Caused by: java.io.IOException: Unable to rename 
> hdfs://olapCluster/user/hive/warehouse/bi_dw.db/kpy_sfc_fyd_parts_d74_hour_temp/.hive-staging_hive_2023-01-03_13-15-16_144_4347904191947316325-50073/_task_tmp.-ext-1/_tmp.03_0
>  to 
> hdfs://olapCluster/user/hive/warehouse/bi_dw.db/sfc__temp/.hive-staging_hive_2023-01-03_13-15-16_144_4347904191947316325-50073/_tmp.-ext-1/03_0
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:254)
>   ... 19 more



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26913) HiveVectorizedReader::parquetRecordReader should reuse footer information

2023-01-09 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26913:

Attachment: Screenshot 2023-01-09 at 4.01.14 PM.png

> HiveVectorizedReader::parquetRecordReader should reuse footer information
> -
>
> Key: HIVE-26913
> URL: https://issues.apache.org/jira/browse/HIVE-26913
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance, stability
> Attachments: Screenshot 2023-01-09 at 4.01.14 PM.png
>
>
> HiveVectorizedReader::parquetRecordReader should reuse the details of the 
> parquet footer instead of reading it again.
>  
> It reads the parquet footer here:
> [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L230-L232]
> It reads the footer again here when constructing the vectorized record reader:
> [https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveVectorizedReader.java#L249]
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java#L50]
>  
> Check the codepath of 
> VectorizedParquetRecordReader::setupMetadataAndParquetSplit
> [https://github.com/apache/hive/blob/6b0139188aba6a95808c8d1bec63a651ec9e4bdc/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedParquetRecordReader.java#L180]
>  
> It should be possible to share "ParquetMetadata" in 
> VectorizedParquetRecordReader.
>  
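> A minimal sketch of reading the footer once and reusing it (the Parquet calls 
> are real APIs; plumbing the ParquetMetadata into the record reader is the 
> assumed part):
> {code}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.parquet.hadoop.ParquetFileReader;
> import org.apache.parquet.hadoop.metadata.ParquetMetadata;
> import org.apache.parquet.hadoop.util.HadoopInputFile;
> 
> public final class FooterOnce {
>   // Read the footer a single time, then share the ParquetMetadata with both
>   // callers instead of re-opening the file for each one.
>   public static ParquetMetadata readFooter(Path path, Configuration conf)
>       throws java.io.IOException {
>     try (ParquetFileReader reader =
>         ParquetFileReader.open(HadoopInputFile.fromPath(path, conf))) {
>       return reader.getFooter();
>     }
>   }
> }
> {code}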



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26928) LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata cache is disabled

2023-01-10 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26928:

Description: 
When the metadata / LLAP cache is disabled ("hive.llap.io.memory.mode=none"), 
"iceberg + parquet" throws the following error.

It should check for "metadatacache" correctly or fix it in LlapIoImpl.

 
{noformat}
Caused by: java.lang.NullPointerException: Metadata cache must not be null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
    at 
org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266)
    at 
org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.(AbstractMapredIcebergRecordReader.java:40)
    at 
org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.(HiveIcebergVectorizedRecordReader.java:41)
 {noformat}
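A sketch of the kind of guard being asked for (hypothetical shape; the method 
and field names are taken from the stack trace, and readFooterThroughCache is 
an invented placeholder for the existing cached path):
{code}
// In LlapIoImpl: when hive.llap.io.memory.mode=none the metadata cache is null,
// so return null and let the caller read the footer from the file itself,
// instead of failing Preconditions.checkNotNull with an NPE.
Object getParquetFooterBuffersFromCache(Path path, Configuration conf) {
  if (metadataCache == null) {
    return null; // cache disabled: fall back to a direct footer read
  }
  return readFooterThroughCache(metadataCache, path, conf);
}
{code}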

  was:
When metadata / LLAP cache is disabled, "iceberg + parquet" throws the 
following error.

It should check for "metadatacache" correctly or fix it in LlapIoImpl.

 
{noformat}

Caused by: java.lang.NullPointerException: Metadata cache must not be null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
    at 
org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl.getParquetFooterBuffersFromCache(LlapIoImpl.java:467)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.parquetRecordReader(HiveVectorizedReader.java:227)
    at 
org.apache.iceberg.mr.hive.vector.HiveVectorizedReader.reader(HiveVectorizedReader.java:162)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
    at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
    at 
org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
    at 
org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:196)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.openVectorized(IcebergInputFormat.java:331)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.open(IcebergInputFormat.java:377)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.nextTask(IcebergInputFormat.java:270)
    at 
org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:266)
    at 
org.apache.iceberg.mr.mapred.AbstractMapredIcebergRecordReader.(AbstractMapredIcebergRecordReader.java:40)
    at 
org.apache.iceberg.mr.hive.vector.HiveIcebergVectorizedRecordReader.(HiveIcebergVectorizedRecordReader.java:41)
 {noformat}


> LlapIoImpl::getParquetFooterBuffersFromCache throws exception when metadata 
> cache is disabled
> -
>
> Key: HIVE-26928
> URL: https://issues.apache.org/jira/browse/HIVE-26928
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Priority: Major
>
> When the metadata / LLAP cache is disabled ("hive.llap.io.memory.mode=none"), 
> "iceberg + parquet" throws the following error.
> It should check for "metadatacache" correctly or fix it 

[jira] [Updated] (HIVE-26944) FileSinkOperator shouldn't check for compactiontable for every row being processed

2023-01-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26944:

Attachment: Screenshot 2023-01-16 at 10.32.24 AM.png

> FileSinkOperator shouldn't check for compactiontable for every row being 
> processed
> --
>
> Key: HIVE-26944
> URL: https://issues.apache.org/jira/browse/HIVE-26944
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: iceberg
> Attachments: Screenshot 2023-01-16 at 10.32.24 AM.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26944) FileSinkOperator shouldn't check for compactiontable for every row being processed

2023-01-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26944:

Description: !Screenshot 2023-01-16 at 10.32.24 AM.png!

> FileSinkOperator shouldn't check for compactiontable for every row being 
> processed
> --
>
> Key: HIVE-26944
> URL: https://issues.apache.org/jira/browse/HIVE-26944
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: iceberg
> Attachments: Screenshot 2023-01-16 at 10.32.24 AM.png
>
>
> !Screenshot 2023-01-16 at 10.32.24 AM.png!
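> A sketch of the kind of change the title suggests (hypothetical names, not the 
> actual patch): evaluate the invariant once per operator rather than per row.
> {code}
> // In FileSinkOperator: cache the compaction-table check in initializeOp()
> // so process(row, tag) reads a boolean instead of recomputing it per record.
> private transient boolean isCompactionTable;
> 
> @Override
> protected void initializeOp(Configuration hconf) throws HiveException {
>   super.initializeOp(hconf);
>   isCompactionTable = conf.isCompactionTable(); // once per operator lifetime
> }
> {code}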



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26944) FileSinkOperator shouldn't check for compactiontable for every row being processed

2023-01-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-26944:
---

Assignee: Rajesh Balamohan

> FileSinkOperator shouldn't check for compactiontable for every row being 
> processed
> --
>
> Key: HIVE-26944
> URL: https://issues.apache.org/jira/browse/HIVE-26944
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: iceberg, pull-request-available
> Attachments: Screenshot 2023-01-16 at 10.32.24 AM.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> !Screenshot 2023-01-16 at 10.32.24 AM.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-24645) UDF configure not called when fetch task conversion occurs

2023-01-19 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678607#comment-17678607
 ] 

Rajesh Balamohan commented on HIVE-24645:
-

Just noted that this is causing a perf issue in query compilation. Depending on 
the query complexity, the compiler invokes this multiple times, causing a high 
perf regression in short-running queries.

> UDF configure not called when fetch task conversion occurs
> --
>
> Key: HIVE-24645
> URL: https://issues.apache.org/jira/browse/HIVE-24645
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When hive.fetch.task.conversion kicks in - UDF configure is not called.
> This is likely due to MapredContext not being available when this conversion 
> occurs.
> The approach I suggest is to create a dummy MapredContext and provide it with 
> the current configuration from ExprNodeGenericFuncEvaluator.
> It is slightly unfortunate that the UDF API relies on MapredContext, since 
> some aspects of the context do not apply to the variety of engines and 
> invocation paths for UDFs. That makes it difficult to build a fully formed 
> dummy object, e.g. the Reporter objects and the boolean indicating whether it 
> is a Map context.
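> A rough sketch of that approach (MapredContext.init and GenericUDF.configure 
> are real Hive APIs; the exact call site inside ExprNodeGenericFuncEvaluator is 
> the assumption):
> {code}
> // When no MapredContext exists (fetch-task conversion), build a dummy one
> // so that udf.configure(...) still runs with the current configuration.
> MapredContext context = MapredContext.get();
> if (context == null) {
>   context = MapredContext.init(false /* not a map context */, new JobConf(conf));
> }
> genericUDF.configure(context);
> {code}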



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26975) Iceberg: MERGE: Wrong reducer estimate causing smaller files to be created

2023-01-23 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-26975:

Summary: Iceberg: MERGE: Wrong reducer estimate causing smaller files to be 
created  (was: MERGE: Wrong reducer estimate causing smaller files to be 
created)

> Iceberg: MERGE: Wrong reducer estimate causing smaller files to be created
> --
>
> Key: HIVE-26975
> URL: https://issues.apache.org/jira/browse/HIVE-26975
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> * "Merge into" estimates wrong number of reducers causing more number of 
> small files to be created.* e.g 400+ files in 3+ MB file each.*
>  * This can be reproduced by writing data into "store_sales" table in iceberg 
> format via another source table (using merge-into).
>  ** e.g  Running this few times will create wrong number of reduce tasks 
> causing lot of small files to be created in iceberg table.
> {noformat}
> MERGE INTO store_sales_t t
> using ssv s
> ON ( t.ss_item_sk = s.ss_item_sk
>  AND t.ss_customer_sk = s.ss_customer_sk
>  AND t.ss_sold_date_sk = "2451181"
>  AND ( ( Floor(( s.ss_item_sk ) / 1000) * 1000 ) BETWEEN 1000 AND 2000 )
>  AND s.ss_ext_discount_amt < 0.0 )
> WHEN matched AND t.ss_ext_discount_amt IS NULL THEN
>   UPDATE SET ss_ext_discount_amt = 0.0
> WHEN NOT matched THEN
>   INSERT ( ss_sold_time_sk,
>    ss_item_sk,
>    ss_customer_sk,
>    ss_cdemo_sk,
>    ss_hdemo_sk,
>    ss_addr_sk,
>    ss_store_sk,
>    ss_promo_sk,
>    ss_ticket_number,
>    ss_quantity,
>    ss_wholesale_cost,
>    ss_list_price,
>    ss_sales_price,
>    ss_ext_discount_amt,
>    ss_ext_sales_price,
>    ss_ext_wholesale_cost,
>    ss_ext_list_price,
>    ss_ext_tax,
>    ss_coupon_amt,
>    ss_net_paid,
>    ss_net_paid_inc_tax,
>    ss_net_profit,
>    ss_sold_date_sk )
>   VALUES ( s.ss_sold_time_sk,
>    s.ss_item_sk,
>    s.ss_customer_sk,
>    s.ss_cdemo_sk,
>    s.ss_hdemo_sk,
>    s.ss_addr_sk,
>    s.ss_store_sk,
>    s.ss_promo_sk,
>    s.ss_ticket_number,
>    s.ss_quantity,
>    s.ss_wholesale_cost,
>    s.ss_list_price,
>    s.ss_sales_price,
>    s.ss_ext_discount_amt,
>    s.ss_ext_sales_price,
>    s.ss_ext_wholesale_cost,
>    s.ss_ext_list_price,
>    s.ss_ext_tax,
>    s.ss_coupon_amt,
>    s.ss_net_paid,
>    s.ss_net_paid_inc_tax,
>    s.ss_net_profit,
>    "2451181") 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27010) Reduce compilation time

2023-01-31 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-27010:
---

Assignee: Rajesh Balamohan

> Reduce compilation time
> ---
>
> Key: HIVE-27010
> URL: https://issues.apache.org/jira/browse/HIVE-27010
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Context: Post HIVE-24645, compilation time for queries has increased.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27010) Reduce compilation time

2023-02-01 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan resolved HIVE-27010.
-
Resolution: Fixed

> Reduce compilation time
> ---
>
> Key: HIVE-27010
> URL: https://issues.apache.org/jira/browse/HIVE-27010
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Context: Post HIVE-24645, compilation time for queries has increased.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27013) Provide an option to enable iceberg manifest caching via table properties

2023-02-01 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-27013:

Description: 
I tried the following, thinking that it would work with iceberg manifest 
caching, but it didn't.
{noformat}
alter table store_sales set 
tblproperties('io.manifest.cache-enabled'='true');{noformat}


Creating this ticket as a placeholder to fix the same.

 

  was:
I tried the following thinking that it would work with iceberg 
manifest caching; but it didn't.
{noformat}
alter table store_sales set 
tblproperties('io.manifest.cache-enabled'='true'); 
{noformat}
Creating this ticket as a placeholder to fix the same.


> Provide an option to enable iceberg manifest caching via table properties
> -
>
> Key: HIVE-27013
> URL: https://issues.apache.org/jira/browse/HIVE-27013
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Priority: Major
>
> I tried the following, thinking that it would work with iceberg manifest 
> caching, but it didn't.
> {noformat}
> alter table store_sales set 
> tblproperties('io.manifest.cache-enabled'='true');{noformat}
> Creating this ticket as a placeholder to fix the same.
>  
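> For reference, manifest caching is an Iceberg catalog/FileIO-level property 
> ("io.manifest.cache-enabled" in Iceberg's CatalogProperties) rather than a 
> table property, so the likely shape of a workaround is catalog-level config 
> along these lines (the Hive key prefix and catalog name are assumptions):
> {noformat}
> iceberg.catalog.<catalog_name>.io.manifest.cache-enabled=true
> {noformat}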



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27115) HiveInputFormat column project push down wrong fields (MR)

2023-03-06 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17696980#comment-17696980
 ] 

Rajesh Balamohan commented on HIVE-27115:
-

I haven't tried it in MR (since it is completely deprecated, [~yigress]). The 
codepath where the patch has been added seems to be common to both MR and the 
other engines. I tried it in Tez, which works as expected. It may be easier to 
add a .q file to show the diff.

> HiveInputFormat column project push down wrong fields (MR)
> --
>
> Key: HIVE-27115
> URL: https://issues.apache.org/jira/browse/HIVE-27115
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.3, 4.0.0-alpha-2
>Reporter: Yi Zhang
>Assignee: Yi Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For a query such as 
> select * from (
> select r_name from r
> union all
> select t_name from t
> ) unioned
>  
> in MR execution, when column projection is pushed down for splits, t_name gets 
> pushed down to table r. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-25073) Optimise HiveAlterHandler::alterPartitions

2023-03-07 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697678#comment-17697678
 ] 

Rajesh Balamohan commented on HIVE-25073:
-

Thanks, [~VenuReddy]. Confirmed in the latest repo that it is fixed. Closing this.

> Optimise HiveAlterHandler::alterPartitions
> --
>
> Key: HIVE-25073
> URL: https://issues.apache.org/jira/browse/HIVE-25073
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>
> Table details are populated again and again for each partition, which can be 
> avoided.
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L5892
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L808
> The following stacktrace may be relevant for apache master as well.
> {noformat}
>   at org.datanucleus.store.query.Query.executeWithArray(Query.java:1744)
>   at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368)
>   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:255)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2113)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2152)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.alterPartitionNoTxn(ObjectStore.java:4951)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.alterPartitions(ObjectStore.java:5057)
>   at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>   at com.sun.proxy.$Proxy27.alterPartitions(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:798)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:5695)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_req(HiveMetaStore.java:5647)
>   at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy28.alter_partitions_req(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:18557)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:18541)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:643)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:638)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
> {noformat}
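> A sketch of the suggested shape of the fix (hypothetical signatures, based on 
> the ObjectStore calls in the stack trace):
> {code}
> // In ObjectStore.alterPartitions: resolve the MTable once, outside the
> // per-partition loop, instead of re-querying it inside alterPartitionNoTxn.
> MTable table = getMTable(catName, dbName, tableName); // single lookup
> for (Partition newPart : newParts) {
>   alterPartitionNoTxn(catName, dbName, tableName, table, newPart); // reuse it
> }
> {code}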



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27144) Alter table partitions need not DBNotificationListener for external tables

2023-03-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan resolved HIVE-27144.
-
Resolution: Invalid

> Alter table partitions need not DBNotificationListener for external tables
> --
>
> Key: HIVE-27144
> URL: https://issues.apache.org/jira/browse/HIVE-27144
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> DBNotificationListener for external tables may not be needed. 
> Even for "analyze table blah compute statistics for columns" for external 
> partitioned tables, it invokes DBNotificationListener for all partitions. 
> {noformat}
> at org.datanucleus.store.query.Query.execute(Query.java:1726)
>   at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:374)
>   at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:216)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:11774)
>   at jdk.internal.reflect.GeneratedMethodAccessor135.invoke(Unknown Source)
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(java.base@11.0.18/Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>   at com.sun.proxy.$Proxy33.addNotificationEvent(Unknown Source)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.process(DbNotificationListener.java:1308)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onAlterPartition(DbNotificationListener.java:458)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$14.notify(MetaStoreListenerNotifier.java:161)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:328)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:390)
>   at 
> org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:863)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:6253)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_req(HiveMetaStore.java:6201)
>   at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.18/Native
>  Method)
>   at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.18/NativeMethodAccessorImpl.java:62)
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(java.base@11.0.18/Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy34.alter_partitions_req(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:21532)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_partitions_req.getResult(ThriftHiveMetastore.java:21511)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:652)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:647)
>   at java.security.AccessController.doPrivileged(java.base@11.0.18/Native 
> Method)
> {noformat}
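> A sketch of the kind of guard this ticket contemplated before it was resolved 
> as Invalid (TableType is a real metastore enum; the helper is hypothetical):
> {code}
> boolean isExternal = TableType.EXTERNAL_TABLE.name().equals(tbl.getTableType());
> if (!isExternal) {
>   // only managed tables fan out alter-partition events to the listeners
>   notifyAlterPartitionListeners(tbl, oldPart, newPart); // hypothetical helper
> }
> {code}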



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-26091) Support DecimalFilterPredicateLeafBuilder for parquet

2023-03-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan resolved HIVE-26091.
-
Resolution: Duplicate

Closing this as a dup of HIVE-27159. (HIVE-27159 has more info).

> Support DecimalFilterPredicateLeafBuilder for parquet
> -
>
> Key: HIVE-26091
> URL: https://issues.apache.org/jira/browse/HIVE-26091
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
>
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java#L41
> It would be nice to have a DecimalFilterPredicateLeafBuilder. This will help 
> in supporting SARG pushdowns.
> {noformat}
> 2022-03-30 08:59:50,040 [ERROR] [TezChild] 
> |read.ParquetFilterPredicateConverter|: fail to build predicate filter leaf 
> with errorsorg.apache.hadoop.hive.ql.metadata.HiveException: Conversion to 
> Parquet FilterPredicate not supported for DECIMAL
> org.apache.hadoop.hive.ql.metadata.HiveException: Conversion to Parquet 
> FilterPredicate not supported for DECIMAL
> at 
> org.apache.hadoop.hive.ql.io.parquet.LeafFilterFactory.getLeafFilterBuilderByType(LeafFilterFactory.java:223)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.buildFilterPredicateFromPredicateLeaf(ParquetFilterPredicateConverter.java:130)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:111)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:97)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:71)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.translate(ParquetFilterPredicateConverter.java:88)
> at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetFilterPredicateConverter.toFilterPredicate(ParquetFilterPredicateConverter.java:57)
> at 
> org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.setFilter(ParquetRecordReaderBase.java:184)
> at 
> org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:124)
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.(VectorizedParquetRecordReader.java:158)
> at 
> org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
> at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
> at 
> org.apache.hadoop.hive.ql.io.RecordReaderWrapper.create(RecordReaderWrapper.java:72)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:429)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:437)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:282)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:265)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> at 
> com.google.common.util.concurrent.InterruptibleTask.run(

[jira] [Updated] (HIVE-27183) Iceberg: Table information is loaded multiple times

2023-03-27 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-27183:

Attachment: hs2_iceberg_load.html

> Iceberg: Table information is loaded multiple times
> ---
>
> Key: HIVE-27183
> URL: https://issues.apache.org/jira/browse/HIVE-27183
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
> Attachments: hs2_iceberg_load.html
>
>
> HMS::getTable invokes "HiveIcebergMetaHook::postGetTable", which internally 
> loads the iceberg table again.
> If this isn't needed, or is needed only for show-create-table, do not load the 
> table again.
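> A generic illustration of the memoization that would avoid the repeated loads 
> (names are hypothetical, though Catalogs.loadTable is the real Iceberg entry 
> point; this is not the actual Hive code):
> {code}
> // Cache loaded tables for the lifetime of one planning pass so repeated
> // HMS::getTable calls don't re-read the metadata JSON from storage.
> private final ConcurrentMap<String, Table> loadedTables = new ConcurrentHashMap<>();
> 
> Table getTable(Configuration conf, Properties props) {
>   String key = props.getProperty(Catalogs.NAME); // table identifier as the key
>   return loadedTables.computeIfAbsent(key, k -> Catalogs.loadTable(conf, props));
> }
> {code}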
> {noformat}
>     at jdk.internal.misc.Unsafe.park(java.base@11.0.18/Native Method)
>     - parking to wait for  <0x00066f84eef0> (a 
> java.util.concurrent.CompletableFuture$Signaller)
>     at 
> java.util.concurrent.locks.LockSupport.park(java.base@11.0.18/LockSupport.java:194)
>     at 
> java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.18/CompletableFuture.java:1796)
>     at 
> java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.18/ForkJoinPool.java:3128)
>     at 
> java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.18/CompletableFuture.java:1823)
>     at 
> java.util.concurrent.CompletableFuture.get(java.base@11.0.18/CompletableFuture.java:1998)
>     at 
> org.apache.hadoop.util.functional.FutureIO.awaitFuture(FutureIO.java:77)
>     at 
> org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:196)
>     at 
> org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:263)
>     at 
> org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:258)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:177)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$609/0x000840e18040.apply(Unknown
>  Source)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:191)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$610/0x000840e18440.run(Unknown
>  Source)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:191)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:176)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:171)
>     at 
> org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:153)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:96)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:79)
>     at 
> org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44)
>     at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
>     at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$1(IcebergTableUtil.java:99)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$552/0x000840d59840.apply(Unknown
>  Source)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$4(IcebergTableUtil.java:111)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$557/0x000840d58c40.get(Unknown
>  Source)
>     at java.util.Optional.orElseGet(java.base@11.0.18/Optional.java:369)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:108)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:69)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:73)
>     at 
> org.apache.iceberg.mr.hive.HiveIcebergMetaHook.postGetTable(HiveIcebergMetaHook.java:931)
>     at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.executePostGetTableHook(HiveMetaStoreClient.java:2638)
>     at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:2624)
>     at 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:267)
>     at jdk.internal.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
>     at 
> jdk.internal.reflect.DelegatingMethodA

[jira] [Updated] (HIVE-27183) Iceberg: Table information is loaded multiple times

2023-03-27 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-27183:

Attachment: Screenshot 2023-03-28 at 8.13.52 AM.png

> Iceberg: Table information is loaded multiple times
> ---
>
> Key: HIVE-27183
> URL: https://issues.apache.org/jira/browse/HIVE-27183
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
> Attachments: Screenshot 2023-03-28 at 8.13.52 AM.png, 
> hs2_iceberg_load.html
>
>
> HMS::getTable invokes "HiveIcebergMetaHook::postGetTable", which internally 
> loads the iceberg table again.
> If this isn't needed, or is needed only for show-create-table, do not load the 
> table again.
>  
> Note: It looks like it invokes loadTable around 6 times during the entire 
> planning (semAnalyzer, stats, etc.). Attached the snapshot for reference.
>  
> {noformat}
>     at jdk.internal.misc.Unsafe.park(java.base@11.0.18/Native Method)
>     - parking to wait for  <0x00066f84eef0> (a 
> java.util.concurrent.CompletableFuture$Signaller)
>     at 
> java.util.concurrent.locks.LockSupport.park(java.base@11.0.18/LockSupport.java:194)
>     at 
> java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.18/CompletableFuture.java:1796)
>     at 
> java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.18/ForkJoinPool.java:3128)
>     at 
> java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.18/CompletableFuture.java:1823)
>     at 
> java.util.concurrent.CompletableFuture.get(java.base@11.0.18/CompletableFuture.java:1998)
>     at 
> org.apache.hadoop.util.functional.FutureIO.awaitFuture(FutureIO.java:77)
>     at 
> org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:196)
>     at 
> org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:263)
>     at 
> org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:258)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:177)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$609/0x000840e18040.apply(Unknown
>  Source)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:191)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$610/0x000840e18440.run(Unknown
>  Source)
>     at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
>     at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
>     at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:191)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:176)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:171)
>     at 
> org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:153)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:96)
>     at 
> org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:79)
>     at 
> org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44)
>     at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
>     at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$1(IcebergTableUtil.java:99)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$552/0x000840d59840.apply(Unknown
>  Source)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$4(IcebergTableUtil.java:111)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$557/0x000840d58c40.get(Unknown
>  Source)
>     at java.util.Optional.orElseGet(java.base@11.0.18/Optional.java:369)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:108)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:69)
>     at 
> org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:73)
>     at 
> org.apache.iceberg.mr.hive.HiveIcebergMetaHook.postGetTable(HiveIcebergMetaHook.java:931)
>     at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.executePostGetTableHook(HiveMetaStoreClient.java:2638)
>     at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:2624)
>     at 
> org.apache.hadoop.hiv

[jira] [Updated] (HIVE-27183) Iceberg: Table information is loaded multiple times

2023-03-27 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-27183:

Description: 
HMS::getTable invokes "HiveIcebergMetaHook::postGetTable", which internally 
loads the iceberg table again.

If this isn't needed, or is needed only for show-create-table, do not load the 
table again.

 

Note: It looks like it invokes loadTable around 6 times during the entire 
planning (semAnalyzer, stats, etc.). Attached the snapshot for reference.

 
{noformat}
    at jdk.internal.misc.Unsafe.park(java.base@11.0.18/Native Method)
    - parking to wait for  <0x00066f84eef0> (a 
java.util.concurrent.CompletableFuture$Signaller)
    at 
java.util.concurrent.locks.LockSupport.park(java.base@11.0.18/LockSupport.java:194)
    at 
java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.18/CompletableFuture.java:1796)
    at 
java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.18/ForkJoinPool.java:3128)
    at 
java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.18/CompletableFuture.java:1823)
    at 
java.util.concurrent.CompletableFuture.get(java.base@11.0.18/CompletableFuture.java:1998)
    at org.apache.hadoop.util.functional.FutureIO.awaitFuture(FutureIO.java:77)
    at 
org.apache.iceberg.hadoop.HadoopInputFile.newStream(HadoopInputFile.java:196)
    at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:263)
    at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:258)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:177)
    at 
org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$609/0x000840e18040.apply(Unknown
 Source)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:191)
    at 
org.apache.iceberg.BaseMetastoreTableOperations$$Lambda$610/0x000840e18440.run(Unknown
 Source)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:191)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:176)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:171)
    at 
org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:153)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:96)
    at 
org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:79)
    at 
org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44)
    at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115)
    at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$1(IcebergTableUtil.java:99)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$552/0x000840d59840.apply(Unknown
 Source)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil.lambda$getTable$4(IcebergTableUtil.java:111)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil$$Lambda$557/0x000840d58c40.get(Unknown
 Source)
    at java.util.Optional.orElseGet(java.base@11.0.18/Optional.java:369)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:108)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:69)
    at 
org.apache.iceberg.mr.hive.IcebergTableUtil.getTable(IcebergTableUtil.java:73)
    at 
org.apache.iceberg.mr.hive.HiveIcebergMetaHook.postGetTable(HiveIcebergMetaHook.java:931)
    at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.executePostGetTableHook(HiveMetaStoreClient.java:2638)
    at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:2624)
    at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:267)
    at jdk.internal.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
    at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(java.base@11.0.18/Method.java:566)
    at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:216)
    at com.sun.proxy.$Proxy56.getTable(Unknown Source)
    at jdk.internal.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)
    at 
jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.18/De

[jira] [Commented] (HIVE-22411) Performance degradation on single row inserts

2019-11-11 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971895#comment-16971895
 ] 

Rajesh Balamohan commented on HIVE-22411:
-

One side of the issue is stats computation. But split computation can also slow 
down heavily due to {{AcidUtils::getAcidState}} on non-partitioned tables, 
depending on the number of delta dirs. 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1232].
 It can be a separate ticket though.

> Performance degradation on single row inserts
> -
>
> Key: HIVE-22411
> URL: https://issues.apache.org/jira/browse/HIVE-22411
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22411.2.patch, HIVE-22411.3.patch, Screen Shot 
> 2019-10-17 at 8.40.50 PM.png
>
>
> Executing single insert statements on a transactional table affects write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics like the number of files in the 
> table and the total size of the table. To calculate these, it traverses the 
> directory recursively. During the recursion, a separate listStatus call is 
> executed for each path. In the end, the more delta directories you have, the 
> more time it takes to calculate the statistics.
> Therefore insertion time goes up linearly:
> !Screen Shot 2019-10-17 at 8.40.50 PM.png|width=601,height=436!
> The fix is to use fs.listFiles(path, /**recursive**/ true) instead of the 
> handcrafted recursive method.
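> For illustration, the proposed single-call listing (FileSystem.listFiles is a 
> real Hadoop API; variable names assumed):
> {code}
> // One recursive listing instead of a listStatus call per directory; on S3
> // this avoids a round trip for every delta directory.
> RemoteIterator<LocatedFileStatus> files = fs.listFiles(path, true /* recursive */);
> long numFiles = 0, totalSize = 0;
> while (files.hasNext()) {
>   LocatedFileStatus status = files.next();
>   numFiles++;
>   totalSize += status.getLen();
> }
> {code}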



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22482) o.a.h.hive.q.i.AcidUtils.isInsertOnlyTable should not be computed in FileSinkOperator for every record

2019-11-12 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972380#comment-16972380
 ] 

Rajesh Balamohan commented on HIVE-22482:
-

This was observed when copying data from an ORC table to a Parquet table.

> o.a.h.hive.q.i.AcidUtils.isInsertOnlyTable should not be computed in 
> FileSinkOperator for every record
> ---
>
> Key: HIVE-22482
> URL: https://issues.apache.org/jira/browse/HIVE-22482
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>
>  
>  
> {noformat}
>   at java.util.Hashtable.get(Hashtable.java:367)
>   - locked <0x0006f4827098> (a java.util.Properties)
>   at java.util.Properties.getProperty(Properties.java:969)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.isInsertOnlyTable(AcidUtils.java:2104)
>   at 
> org.apache.hadoop.hive.ql.plan.FileSinkDesc.isMmTable(FileSinkDesc.java:333)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.areAllTrue(FileSinkOperator.java:1047)
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:966)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) 
> {noformat}
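> 
> A hedged sketch of the usual remedy (class and supplier names invented): 
> evaluate the flag once per operator and reuse it, instead of hitting the 
> synchronized {{Properties}} table for every record:
> {code:java}
> import java.util.function.BooleanSupplier;
> 
> public class CachedFlagSketch {
>   private final BooleanSupplier compute;  // e.g. () -> fileSinkDesc.isMmTable()
>   private Boolean cached;
> 
>   public CachedFlagSketch(BooleanSupplier compute) {
>     this.compute = compute;
>   }
> 
>   public boolean get() {
>     if (cached == null) {
>       cached = compute.getAsBoolean();  // single Properties lookup, not one per record
>     }
>     return cached;
>   }
> }
> {code}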



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-11-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22485:

Attachment: HIVE-22485.1.patch

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545
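> 
> A minimal sketch of the fix, with placeholder key/value/partitioner class 
> names; {{setFromConfiguration}} is what carries the tez.runtime.* options 
> (including SSL) onto the edge:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.tez.runtime.library.conf.UnorderedPartitionedKVEdgeConfig;
> 
> public class CrossProductEdgeSketch {
>   static UnorderedPartitionedKVEdgeConfig buildEdge(Configuration conf) {
>     return UnorderedPartitionedKVEdgeConfig
>         .newBuilder("org.example.KeyClass", "org.example.ValueClass",
>             "org.example.PartitionerClass")  // placeholders, not Hive's real classes
>         .setFromConfiguration(conf)          // without this, SSL/shuffle opts are dropped
>         .build();
>   }
> }
> {code}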



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22379) Reduce db lookups during dynamic partition loading

2019-11-26 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983180#comment-16983180
 ] 

Rajesh Balamohan commented on HIVE-22379:
-

LGTM. +1.

Please remove the commented-out "msdb.getPartition" before check-in.

> Reduce db lookups during dynamic partition loading
> --
>
> Key: HIVE-22379
> URL: https://issues.apache.org/jira/browse/HIVE-22379
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
> Attachments: HIVE-22379.1.patch, HIVE-22379.2.patch, 
> HIVE-22379.3.patch, HIVE-22379.4.patch, HIVE-22379.5.patch, HIVE-22379.6.patch
>
>
> {{HiveAlterHandler::alterPartitions}} could look up all partition details via 
> a single call instead of multiple lookups.
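> 
> A sketch of the direction, with method shapes assumed from the metastore's 
> {{RawStore}} and {{Warehouse}} APIs: build all partition names first, then 
> fetch them in one round trip:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.hadoop.hive.metastore.RawStore;
> import org.apache.hadoop.hive.metastore.Warehouse;
> import org.apache.hadoop.hive.metastore.api.FieldSchema;
> import org.apache.hadoop.hive.metastore.api.MetaException;
> import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
> import org.apache.hadoop.hive.metastore.api.Partition;
> 
> public class BatchedPartitionLookupSketch {
>   static List<Partition> lookupAll(RawStore msdb, String catName, String dbName,
>       String tblName, List<FieldSchema> partCols, List<Partition> newParts)
>       throws MetaException, NoSuchObjectException {
>     List<String> partNames = new ArrayList<>(newParts.size());
>     for (Partition p : newParts) {
>       partNames.add(Warehouse.makePartName(partCols, p.getValues()));
>     }
>     // one DB call for all partitions, instead of one getPartition() per partition
>     return msdb.getPartitionsByNames(catName, dbName, tblName, partNames);
>   }
> }
> {code}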



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-11 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.1.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.
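> 
> A hedged sketch of the idea: resolve each parent dir's {{FileStatus}} once and 
> reuse it for sibling files:
> {code:java}
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> 
> public class ParentStatusCacheSketch {
>   private final Map<Path, FileStatus> cache = new HashMap<>();
> 
>   FileStatus parentStatus(FileSystem fs, Path file) throws IOException {
>     Path parent = file.getParent();
>     FileStatus status = cache.get(parent);
>     if (status == null) {
>       status = fs.getFileStatus(parent);  // one RPC per directory, not per child file
>       cache.put(parent, status);
>     }
>     return status;
>   }
> }
> {code}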



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-11 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Status: Patch Available  (was: Open)

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22548) Optimise Utilities.removeTempOrDuplicateFiles when moving files to final location

2019-12-11 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994129#comment-16994129
 ] 

Rajesh Balamohan commented on HIVE-22548:
-

LGTM. +1. 

Please add the {{LOG.debug}} statement before commit.



> Optimise Utilities.removeTempOrDuplicateFiles when moving files to final 
> location
> -
>
> Key: HIVE-22548
> URL: https://issues.apache.org/jira/browse/HIVE-22548
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
> Attachments: HIVE-22548.01.patch, HIVE-22548.02.patch
>
>
> {{Utilities.removeTempOrDuplicateFiles}}
> is very slow with cloud storage, as it executes {{listStatus}} twice and also 
> runs in single-threaded mode.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L1629
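> 
> A hedged sketch of the optimisation direction (not the attached patch): list 
> each output directory once, reuse that listing for both checks, and inspect 
> directories in parallel:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.concurrent.ExecutionException;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.Future;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> 
> public class ParallelCleanupSketch {
>   static void inspectDirs(FileSystem fs, List<Path> dirs, int threads)
>       throws InterruptedException, ExecutionException {
>     ExecutorService pool = Executors.newFixedThreadPool(threads);
>     try {
>       List<Future<?>> futures = new ArrayList<>();
>       for (Path dir : dirs) {
>         futures.add(pool.submit(() -> {
>           FileStatus[] listing = fs.listStatus(dir);  // single listing per dir
>           // ... inspect `listing` once for temp files and duplicates ...
>           return listing;
>         }));
>       }
>       for (Future<?> f : futures) {
>         f.get();  // surface any IOException from the workers
>       }
>     } finally {
>       pool.shutdown();
>     }
>   }
> }
> {code}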



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-11 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.2.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22485:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ashutoshc]. Committed to master. 

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21971) HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-21971:

Attachment: HIVE-21971.2.patch

> HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with 
> temporary functions + GenericUDF
> ---
>
> Key: HIVE-21971
> URL: https://issues.apache.org/jira/browse/HIVE-21971
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Critical
> Attachments: HIVE-21971.1.patch, HIVE-21971.2.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
> hadoop's ReflectionUtils constructor cache issue 
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is in use, 
> and this causes a gradual build-up of memory in HS2.
> I have observed this in Hive 2.3, but the codepath in master for this has not 
> changed much.
> The easiest way to repro is to add a temp function which extends 
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this would 
> end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}}, which in 
> turn ends up in the CONSTRUCTOR_CACHE of ReflectionUtils. 
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
> 'file:///home/test/udf/dummy.jar';
> select dummy();
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection-based invocation of hadoop's {{ReflectionUtils::clear}} was 
> removed in 2.x. 
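> 
> A hedged sketch of the workaround: instantiate the class through its 
> constructor directly, so nothing lands in ReflectionUtils' static cache (which 
> pins the temp function's classloader):
> {code:java}
> public final class NoCacheReflectionSketch {
>   // Unlike org.apache.hadoop.util.ReflectionUtils.newInstance, this keeps no
>   // process-wide Constructor cache, so the UDF's classloader can be collected.
>   static <T> T newInstance(Class<T> clazz) {
>     try {
>       java.lang.reflect.Constructor<T> ctor = clazz.getDeclaredConstructor();
>       ctor.setAccessible(true);
>       return ctor.newInstance();
>     } catch (ReflectiveOperationException e) {
>       throw new RuntimeException("Could not instantiate " + clazz.getName(), e);
>     }
>   }
> }
> {code}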



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21971) HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF

2019-12-15 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16996890#comment-16996890
 ] 

Rajesh Balamohan commented on HIVE-21971:
-

Thanks [~ashutoshc]. Uploading .2 version (same as .1) to trigger tests. Will 
commit it to master after the tests complete.

> HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with 
> temporary functions + GenericUDF
> ---
>
> Key: HIVE-21971
> URL: https://issues.apache.org/jira/browse/HIVE-21971
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Critical
> Attachments: HIVE-21971.1.patch, HIVE-21971.2.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
> hadoop's ReflectionUtils constructor cache issue 
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is in use, 
> and this causes a gradual build-up of memory in HS2.
> I have observed this in Hive 2.3, but the codepath in master for this has not 
> changed much.
> The easiest way to repro is to add a temp function which extends 
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this would 
> end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}}, which in 
> turn ends up in the CONSTRUCTOR_CACHE of ReflectionUtils. 
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
> 'file:///home/test/udf/dummy.jar';
> select dummy();
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection-based invocation of hadoop's {{ReflectionUtils::clear}} was 
> removed in 2.x. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.3.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22654) ACID: Allow TxnHandler::checkLock to chunk partitions by 1000

2019-12-17 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-22654:
---

Assignee: Rajesh Balamohan

> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000 
> --
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22654.1.patch
>
>
> The following loop can end up with too many entries within the IN clause, 
> causing the database to throw:
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
>         // If any of the partition requests are null, then I need to pull all
>         // partition locks for this table.
>         sawNull = false;
>         strings.clear();
>         for (LockInfo info : locksBeingChecked) {
>           if (info.partition == null) {
>             sawNull = true;
>             break;
>           } else {
>             strings.add(info.partition);
>           }
>         } 
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - 
> MetaException(message:Unable to update transaction database 
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
> {code}
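> 
> A hedged sketch of the chunking fix (helper shape invented): cap each IN (...) 
> list at 1000 entries and issue one query per chunk:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> 
> public class InClauseChunkSketch {
>   static final int MAX_IN_LIST = 1000;  // ORA-01795 limit
> 
>   static List<String> buildQueries(String prefix, List<String> partitions) {
>     List<String> queries = new ArrayList<>();
>     for (int i = 0; i < partitions.size(); i += MAX_IN_LIST) {
>       List<String> chunk =
>           partitions.subList(i, Math.min(i + MAX_IN_LIST, partitions.size()));
>       StringBuilder sb = new StringBuilder(prefix).append(" IN (");
>       for (int j = 0; j < chunk.size(); j++) {
>         if (j > 0) sb.append(", ");
>         sb.append('?');  // bind partition names rather than inlining them
>       }
>       queries.add(sb.append(')').toString());
>     }
>     return queries;
>   }
> }
> {code}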



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22654) ACID: Allow TxnHandler::checkLock to chunk partitions by 1000

2019-12-17 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22654:

Attachment: HIVE-22654.1.patch

> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000 
> --
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22654.1.patch
>
>
> The following loop can end up with too many entries within the IN clause, 
> causing the database to throw:
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
>         // If any of the partition requests are null, then I need to pull all
>         // partition locks for this table.
>         sawNull = false;
>         strings.clear();
>         for (LockInfo info : locksBeingChecked) {
>           if (info.partition == null) {
>             sawNull = true;
>             break;
>           } else {
>             strings.add(info.partition);
>           }
>         } 
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - 
> MetaException(message:Unable to update transaction database 
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22654) ACID: Allow TxnHandler::checkLock to chunk partitions by 1000

2019-12-17 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22654:

Status: Patch Available  (was: Open)

> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000 
> --
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22654.1.patch
>
>
> The following loop can end up with too many entries within the IN clause, 
> causing the database to throw:
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
>         // If any of the partition requests are null, then I need to pull all
>         // partition locks for this table.
>         sawNull = false;
>         strings.clear();
>         for (LockInfo info : locksBeingChecked) {
>           if (info.partition == null) {
>             sawNull = true;
>             break;
>           } else {
>             strings.add(info.partition);
>           }
>         } 
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - 
> MetaException(message:Unable to update transaction database 
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22657) Add log message when stats have to be computed during calcite

2019-12-18 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999138#comment-16999138
 ] 

Rajesh Balamohan commented on HIVE-22657:
-

Yes, and it involves reading from storage as well.

> Add log message when stats have to be computed during calcite
> -
>
> Key: HIVE-22657
> URL: https://issues.apache.org/jira/browse/HIVE-22657
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> When stats are not available, {{RelOptHiveTable::getColStat}} computes stats 
> on the fly. However, this turns out to be a lot slower in the cloud.
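> 
> A sketch of what such a log line might look like (hook and names assumed, not 
> the actual change):
> {code:java}
> import java.util.List;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> 
> public class StatsLogSketch {
>   private static final Logger LOG = LoggerFactory.getLogger(StatsLogSketch.class);
> 
>   // Hypothetical hook for the point where getColStat falls back to on-the-fly stats.
>   static void logOnTheFlyStats(String tableName, List<String> missingCols) {
>     LOG.warn("Column stats missing for table {} (columns {}); computing on the fly,"
>         + " which reads from storage and can be slow", tableName, missingCols);
>   }
> }
> {code}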



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22654) ACID: Allow TxnHandler::checkLock to chunk partitions by 1000

2019-12-18 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22654:

Attachment: HIVE-22654.2.patch

> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000 
> --
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22654.1.patch, HIVE-22654.2.patch
>
>
> The following loop can end up with too many entries within the IN clause, 
> causing the database to throw:
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
>         // If any of the partition requests are null, then I need to pull all
>         // partition locks for this table.
>         sawNull = false;
>         strings.clear();
>         for (LockInfo info : locksBeingChecked) {
>           if (info.partition == null) {
>             sawNull = true;
>             break;
>           } else {
>             strings.add(info.partition);
>           }
>         } 
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - 
> MetaException(message:Unable to update transaction database 
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.4.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch, HIVE-22609.4.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22654) ACID: Allow TxnHandler::checkLock to chunk partitions by 1000

2020-01-06 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008950#comment-17008950
 ] 

Rajesh Balamohan commented on HIVE-22654:
-

{{buildQueryWithINClause}} works on a prefix and suffix for the query. Computing 
the suffix for checking locks involves multiple computations. The 
{{Batchable.runBatched}} model is more appropriate in this case.
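
For reference, the batching pattern looks roughly like this (a generic sketch; 
signatures assumed, not copied from Hive's {{Batchable}}):
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class RunBatchedSketch {
  // Apply `batch` to fixed-size slices of `input` and concatenate the results.
  static <I, R> List<R> runBatched(int batchSize, List<I> input,
      Function<List<I>, List<R>> batch) {
    List<R> out = new ArrayList<>();
    for (int i = 0; i < input.size(); i += batchSize) {
      out.addAll(batch.apply(input.subList(i, Math.min(i + batchSize, input.size()))));
    }
    return out;
  }
}
{code}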

> ACID: Allow TxnHandler::checkLock to chunk partitions by 1000 
> --
>
> Key: HIVE-22654
> URL: https://issues.apache.org/jira/browse/HIVE-22654
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22654.1.patch, HIVE-22654.2.patch
>
>
> The following loop can end up with too many entries within the IN clause, 
> causing the database to throw:
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L4428
> {code:java}
>         // If any of the partition requests are null, then I need to pull all
>         // partition locks for this table.
>         sawNull = false;
>         strings.clear();
>         for (LockInfo info : locksBeingChecked) {
>           if (info.partition == null) {
>             sawNull = true;
>             break;
>           } else {
>             strings.add(info.partition);
>           }
>         } 
> {code}
> {code}
> 2019-12-17T04:28:57,991 ERROR [pool-8-thread-143]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - 
> MetaException(message:Unable to update transaction database 
> java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in 
> a list is 1000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2020-01-07 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.5.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch, HIVE-22609.4.patch, HIVE-22609.5.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2020-01-07 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-22609:
---

Assignee: Rajesh Balamohan

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch, HIVE-22609.4.patch, HIVE-22609.5.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2020-01-07 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ashutoshc]. Committed to master.

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch, HIVE-22609.4.patch, HIVE-22609.5.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> An ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, the parent dir is the same. The number of getFileStatus 
> calls in such cases can be cut in half.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22719) Remove Log from HiveConf::getLogIdVar

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22719:

Attachment: HIVE-22719.1.patch

> Remove Log from HiveConf::getLogIdVar
> -
>
> Key: HIVE-22719
> URL: https://issues.apache.org/jira/browse/HIVE-22719
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: ExecuteStatemt_getResult.jpg, HIVE-22719.1.patch
>
>
> The log statement gets in the hot path when executing a large number of tiny 
> SQL statements.
> !ExecuteStatemt_getResult.jpg|width=260,height=177!
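> 
> The fix is simply to remove the statement; for the general hot-path pattern, a 
> hedged sketch (names assumed):
> {code:java}
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
> 
> public class HotPathLogSketch {
>   private static final Logger LOG = LoggerFactory.getLogger(HotPathLogSketch.class);
> 
>   static String getLogIdVar(String sessionId) {
>     // Guard (or drop) per-call logging so it costs nothing when the level is disabled.
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Using session id {} for the log id", sessionId);
>     }
>     return sessionId;
>   }
> }
> {code}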



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22719) Remove Log from HiveConf::getLogIdVar

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22719:

Status: Patch Available  (was: Open)

> Remove Log from HiveConf::getLogIdVar
> -
>
> Key: HIVE-22719
> URL: https://issues.apache.org/jira/browse/HIVE-22719
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: ExecuteStatemt_getResult.jpg, HIVE-22719.1.patch
>
>
> The log statement gets in the hot path when executing a large number of tiny 
> SQL statements.
> !ExecuteStatemt_getResult.jpg|width=260,height=177!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Summary: Optimise AuthenticationProviderFactory::getAuthenticationProvider  
(was: AuthenticationProviderFactory shouldn)

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: Screenshot 2020-01-13 at 10.07.34 AM.jpg
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Attachment: Screenshot 2020-01-13 at 10.07.34 AM.jpg

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: Screenshot 2020-01-13 at 10.07.34 AM.jpg
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Description: !Screenshot 2020-01-13 at 10.07.34 AM.jpg!

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: Screenshot 2020-01-13 at 10.07.34 AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Description: !Screenshot 2020-01-13 at 10.07.34 
AM.jpg|width=439,height=269!  (was: !Screenshot 2020-01-13 at 10.07.34 AM.jpg!)

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22720.1.patch, Screenshot 2020-01-13 at 10.07.34 
> AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Attachment: HIVE-22720.1.patch

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22720.1.patch, Screenshot 2020-01-13 at 10.07.34 
> AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-12 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Status: Patch Available  (was: Open)

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22720.1.patch, Screenshot 2020-01-13 at 10.07.34 
> AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Attachment: HIVE-22720.2.patch

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22720.1.patch, HIVE-22720.2.patch, Screenshot 
> 2020-01-13 at 10.07.34 AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22719) Remove Log from HiveConf::getLogIdVar

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22719:

Attachment: HIVE-22719.2.patch

> Remove Log from HiveConf::getLogIdVar
> -
>
> Key: HIVE-22719
> URL: https://issues.apache.org/jira/browse/HIVE-22719
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: ExecuteStatemt_getResult.jpg, HIVE-22719.1.patch, 
> HIVE-22719.2.patch
>
>
> The log statement gets in the hot path when executing a large number of tiny 
> SQL statements.
> !ExecuteStatemt_getResult.jpg|width=260,height=177!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) ObjectStore: Reduce number of DB calls

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Description: !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!

> ObjectStore: Reduce number of DB calls
> --
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: Screenshot 2020-01-14 at 4.55.12 AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) ObjectStore: Reduce number of DB calls

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Attachment: Screenshot 2020-01-14 at 4.55.12 AM.png

> ObjectStore: Reduce number of DB calls
> --
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: Screenshot 2020-01-14 at 4.55.12 AM.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) ObjectStore: Reduce number of DB calls

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Attachment: HIVE-22724.1.patch

> ObjectStore: Reduce number of DB calls
> --
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) ObjectStore: Reduce number of DB calls

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Status: Patch Available  (was: Open)

> ObjectStore: Reduce number of DB calls
> --
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Summary: Reduce number of DB calls in ObjectStore, TxnHandler  (was: 
ObjectStore: Reduce number of DB calls)

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Attachment: HIVE-22724.2.patch

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, HIVE-22724.2.patch, Screenshot 
> 2020-01-14 at 4.55.12 AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Attachment: HIVE-22724.3.patch

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, HIVE-22724.2.patch, 
> HIVE-22724.3.patch, Screenshot 2020-01-14 at 4.55.12 AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Assignee: Rajesh Balamohan
  Status: Patch Available  (was: Open)

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: HIVE-22725.1.patch

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: Screenshot 2020-01-14 at 6.32.36 AM.png

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, Screenshot 2020-01-14 at 6.32.36 
> AM.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: (was: Screenshot 2020-01-14 at 6.32.36 AM.png)

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: image-2020-01-14-13-22-54-483.png

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default. 
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-13 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Description: 
"TransactionalValidationListener" gets added in the pre-event listeners of HMS 
by default.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]

This causes issues in short select queries, as table details are computed for 
any partition lookups.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]

 

!image-2020-01-14-13-22-54-483.png|width=579,height=202!

 

It would be good to lazily evaluate the table lookup in this codepath.

  was:
"TransactionalValidationListener" gets added in the pre-event listeners of HMS 
by default. 

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]

This causes issues in short select queries, as table details are computed for 
any partition lookups.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]

It would be good to lazily evaluate the table lookup in this codepath.


> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-14 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: HIVE-22725.2.patch

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-14 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014922#comment-17014922
 ] 

Rajesh Balamohan commented on HIVE-22725:
-

Added Guava's Suppliers::memoize in patch .2.
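
The memoized lookup looks roughly like this (a sketch, not the patch itself; the 
fetch callback stands in for the actual metastore getTable call):
{code:java}
import com.google.common.base.Supplier;
import com.google.common.base.Suppliers;
import org.apache.hadoop.hive.metastore.api.Table;

public class LazyTableSketch {
  // Wraps the (possibly expensive) table lookup so it runs at most once,
  // and only if a pre-event listener actually asks for the table.
  static Supplier<Table> lazyTable(java.util.function.Supplier<Table> fetch) {
    return Suppliers.memoize(fetch::get);
  }
}
{code}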

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-14 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015625#comment-17015625
 ] 

Rajesh Balamohan commented on HIVE-22725:
-

Fixing test cases.

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> HIVE-22725.3.patch, image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this codepath.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-14 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: HIVE-22725.3.patch

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> HIVE-22725.3.patch, image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> any partition lookups.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this code path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-14 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015667#comment-17015667
 ] 

Rajesh Balamohan commented on HIVE-22724:
-

The test failure is unrelated to this patch, and the test passes locally. 
Resubmitting the same patch for one more run.

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, HIVE-22724.2.patch, 
> HIVE-22724.3.patch, HIVE-22724.4.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-14 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22724:

Attachment: HIVE-22724.4.patch

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, HIVE-22724.2.patch, 
> HIVE-22724.3.patch, HIVE-22724.4.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22724) Reduce number of DB calls in ObjectStore, TxnHandler

2020-01-15 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016500#comment-17016500
 ] 

Rajesh Balamohan commented on HIVE-22724:
-

The test error is not related to this patch (connection refused error).

> Reduce number of DB calls in ObjectStore, TxnHandler
> 
>
> Key: HIVE-22724
> URL: https://issues.apache.org/jira/browse/HIVE-22724
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22724.1.patch, HIVE-22724.2.patch, 
> HIVE-22724.3.patch, HIVE-22724.4.patch, Screenshot 2020-01-14 at 4.55.12 
> AM.png
>
>
> !Screenshot 2020-01-14 at 4.55.12 AM.png|width=668,height=310!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-20974) TezTask should set task exception on failures

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-20974:

Attachment: HIVE-20974.2.patch

> TezTask should set task exception on failures
> -
>
> Key: HIVE-20974
> URL: https://issues.apache.org/jira/browse/HIVE-20974
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-20974.1.patch, HIVE-20974.2.patch
>
>
> TezTask logs the error as "Failed to execute tez graph" and proceeds further. 
> The "TaskRunner.runSequential()" code would not be able to get these 
> exceptions for TezTask. If there are any failure hooks configured, these 
> exceptions wouldn't show up.
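
A minimal sketch of the idea (assumed shape, not the committed patch): record the exception on the task instead of only logging it, so a sequential runner and any configured failure hooks can observe the cause.

{code}
public class TaskSketch {
  private Throwable exception; // what a runner / failure hook would read

  public void setException(Throwable t) { this.exception = t; }
  public Throwable getException() { return exception; }

  public int execute() {
    try {
      throw new RuntimeException("simulated DAG submission failure");
    } catch (Exception e) {
      System.err.println("Failed to execute tez graph: " + e.getMessage());
      setException(e); // the step this ticket asks for
      return 1;        // non-zero rc, as before
    }
  }

  public static void main(String[] args) {
    TaskSketch task = new TaskSketch();
    int rc = task.execute();
    // A failure hook can now see the cause instead of just rc=1.
    System.out.println("rc=" + rc + ", exception=" + task.getException());
  }
}
{code}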



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: HIVE-22725.4.patch

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> HIVE-22725.3.patch, HIVE-22725.4.patch, image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> every partition lookup.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this code path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Priority: Minor  (was: Major)

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22720.1.patch, HIVE-22720.2.patch, Screenshot 
> 2020-01-13 at 10.07.34 AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22720) Optimise AuthenticationProviderFactory::getAuthenticationProvider

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22720:

Fix Version/s: 4.0.0
 Assignee: Rajesh Balamohan
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ashutoshc], [~gopalv]. Committed to master.

> Optimise AuthenticationProviderFactory::getAuthenticationProvider
> -
>
> Key: HIVE-22720
> URL: https://issues.apache.org/jira/browse/HIVE-22720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22720.1.patch, HIVE-22720.2.patch, Screenshot 
> 2020-01-13 at 10.07.34 AM.jpg
>
>
> !Screenshot 2020-01-13 at 10.07.34 AM.jpg|width=439,height=269!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22719) Remove Log from HiveConf::getLogIdVar

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22719:

  Assignee: Rajesh Balamohan
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~ashutoshc], [~gopalv]. Committed to master.

> Remove Log from HiveConf::getLogIdVar
> -
>
> Key: HIVE-22719
> URL: https://issues.apache.org/jira/browse/HIVE-22719
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: ExecuteStatemt_getResult.jpg, HIVE-22719.1.patch, 
> HIVE-22719.2.patch
>
>
> The log statement gets in the hot path when executing a large number of tiny 
> SQL statements.
> !ExecuteStatemt_getResult.jpg|width=260,height=177!
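
A minimal sketch of the fix pattern (a generic slf4j illustration under assumed names; the simplest fix is to drop the statement entirely): guard or remove the log call so the per-statement hot path does not pay for it.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HotPathLoggingSketch {
  private static final Logger LOG = LoggerFactory.getLogger(HotPathLoggingSketch.class);

  static String getLogIdVar(String sessionId) {
    // Guarding keeps the hot path free of logging work when DEBUG is off;
    // removing the statement avoids the cost entirely.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Using the log id from session {}", sessionId);
    }
    return sessionId;
  }

  public static void main(String[] args) {
    System.out.println(getLogIdVar("session-1"));
  }
}
{code}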



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22719) Remove Log from HiveConf::getLogIdVar

2020-01-16 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22719:

Fix Version/s: 4.0.0

> Remove Log from HiveConf::getLogIdVar
> -
>
> Key: HIVE-22719
> URL: https://issues.apache.org/jira/browse/HIVE-22719
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: ExecuteStatemt_getResult.jpg, HIVE-22719.1.patch, 
> HIVE-22719.2.patch
>
>
> The log statement gets in the hot path when executing a large number of tiny 
> SQL statements.
> !ExecuteStatemt_getResult.jpg|width=260,height=177!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-20974) TezTask should set task exception on failures

2020-01-17 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-20974:

Attachment: HIVE-20974.3.patch

> TezTask should set task exception on failures
> -
>
> Key: HIVE-20974
> URL: https://issues.apache.org/jira/browse/HIVE-20974
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-20974.1.patch, HIVE-20974.2.patch, 
> HIVE-20974.3.patch
>
>
> TezTask logs the error as "Failed to execute tez graph" and proceeds further. 
> The "TaskRunner.runSequential()" code would not be able to get these 
> exceptions for TezTask. If there are any failure hooks configured, these 
> exceptions wouldn't show up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-17 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22725:

Attachment: HIVE-22725.5.patch

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> HIVE-22725.3.patch, HIVE-22725.4.patch, HIVE-22725.5.patch, 
> image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> every partition lookup.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this code path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22725) Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation

2020-01-17 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017893#comment-17017893
 ] 

Rajesh Balamohan commented on HIVE-22725:
-

Thanks [~gopalv]. Uploaded .5 with the comment added. Will commit shortly.

> Lazy evaluate HiveMetastore::fireReadTablePreEvent table computation
> 
>
> Key: HIVE-22725
> URL: https://issues.apache.org/jira/browse/HIVE-22725
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22725.1.patch, HIVE-22725.2.patch, 
> HIVE-22725.3.patch, HIVE-22725.4.patch, HIVE-22725.5.patch, 
> image-2020-01-14-13-22-54-483.png
>
>
> "TransactionalValidationListener" gets added in the pre-event listeners of 
> HMS by default.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L559]
> This causes issues in short select queries, as table details are computed for 
> every partition lookup.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4984]
>  
> !image-2020-01-14-13-22-54-483.png|width=579,height=202!
>  
> It would be good to lazily evaluate the table lookup in this code path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22751) Move locking in HiveServer2::isDeregisteredWithZooKeeper to ZooKeeperHiveHelper

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22751:

Attachment: HIVE-22751.1.patch

> Move locking in HiveServer2::isDeregisteredWithZooKeeper to 
> ZooKeeperHiveHelper
> ---
>
> Key: HIVE-22751
> URL: https://issues.apache.org/jira/browse/HIVE-22751
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22751.1.patch
>
>
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/HiveServer2.java#L620]
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/session/SessionManager.java#L597]
>  
> When queries are run in beeline and closed, it causes unwanted delays in 
> shutting down beeline. Here is the thread dump from the server side, which 
> shows HiveServer2 lock contention.
>  
> It would be good to move synchronization to 
> "zooKeeperHelper.isDeregisteredWithZooKeeper"
>  
> {noformat}
> "main" #1 prio=5 os_prio=0 tid=0x7f78b0078800 nid=0x2d1c waiting on 
> condition [0x7f78b968c000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xac8d5ff0> (a 
> java.util.concurrent.FutureTask)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:187)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123)
>   - locked <0xa9c5f2a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:115)
>   at 
> org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:790)
>   at 
> org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:763)
>   at 
> org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:687)
>   - locked <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1016)
>   at 
> org.apache.hive.service.server.HiveServer2.access$1400(HiveServer2.java:137)
>   at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1294)
>   at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> "HiveServer2-HttpHandler-Pool: Thread-50" #50 prio=5 os_prio=0 
> tid=0x7f78b3e60800 nid=0x2fa7 waiting for monitor entry 
> [0x7f7884edf000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hive.service.server.HiveServer2.isDeregisteredWithZooKeeper(HiveServer2.java:600)
>   - waiting to lock <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSessionInternal(SessionManager.java:631)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:621)
>   - locked <0xaa1970b0> (a 
> org.apache.hive.service.cli.session.SessionManager)
>   at 
> org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:244)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:527)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1517)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1502)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
>   at 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:237)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java
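
A minimal sketch of the proposed refactor (assumed shapes, not the committed patch): keep the ZooKeeper deregistration flag and its lock inside the helper, so callers no longer contend on the HiveServer2 instance lock that is held during long-running start/stop work.

{code}
class ZooKeeperHelperSketch {
  private boolean deregisteredWithZooKeeper = false;

  // Short-lived lock on the helper only.
  synchronized void setDeregisteredWithZooKeeper(boolean value) {
    this.deregisteredWithZooKeeper = value;
  }

  synchronized boolean isDeregisteredWithZooKeeper() {
    return deregisteredWithZooKeeper;
  }
}

public class HiveServer2Sketch {
  private final ZooKeeperHelperSketch zooKeeperHelper = new ZooKeeperHelperSketch();

  // No longer synchronized on the server object, so a session-close path
  // does not block behind a long-running synchronized start()/stop().
  boolean isDeregisteredWithZooKeeper() {
    return zooKeeperHelper.isDeregisteredWithZooKeeper();
  }

  public static void main(String[] args) {
    System.out.println(new HiveServer2Sketch().isDeregisteredWithZooKeeper());
  }
}
{code}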

[jira] [Updated] (HIVE-22751) Move locking in HiveServer2::isDeregisteredWithZooKeeper to ZooKeeperHiveHelper

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22751:

Assignee: Rajesh Balamohan
  Status: Patch Available  (was: Open)

> Move locking in HiveServer2::isDeregisteredWithZooKeeper to 
> ZooKeeperHiveHelper
> ---
>
> Key: HIVE-22751
> URL: https://issues.apache.org/jira/browse/HIVE-22751
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22751.1.patch
>
>
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/HiveServer2.java#L620]
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/session/SessionManager.java#L597]
>  
> When queries are run in beeline and closed, it causes unwanted delays in 
> shutting down beeline. Here is the thread dump from the server side, which 
> shows HiveServer2 lock contention.
>  
> It would be good to move synchronization to 
> "zooKeeperHelper.isDeregisteredWithZooKeeper"
>  
> {noformat}
> "main" #1 prio=5 os_prio=0 tid=0x7f78b0078800 nid=0x2d1c waiting on 
> condition [0x7f78b968c000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xac8d5ff0> (a 
> java.util.concurrent.FutureTask)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:187)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123)
>   - locked <0xa9c5f2a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:115)
>   at 
> org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:790)
>   at 
> org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:763)
>   at 
> org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:687)
>   - locked <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1016)
>   at 
> org.apache.hive.service.server.HiveServer2.access$1400(HiveServer2.java:137)
>   at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1294)
>   at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> "HiveServer2-HttpHandler-Pool: Thread-50" #50 prio=5 os_prio=0 
> tid=0x7f78b3e60800 nid=0x2fa7 waiting for monitor entry 
> [0x7f7884edf000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hive.service.server.HiveServer2.isDeregisteredWithZooKeeper(HiveServer2.java:600)
>   - waiting to lock <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSessionInternal(SessionManager.java:631)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:621)
>   - locked <0xaa1970b0> (a 
> org.apache.hive.service.cli.session.SessionManager)
>   at 
> org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:244)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:527)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1517)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1502)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
>   at 
> org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:237)
>   at javax.servlet.http.HttpServlet.service(HttpServl

[jira] [Updated] (HIVE-22752) HiveMetastore addWriteNotificationLog should be invoked only when listeners are enabled

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22752:

Description: 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8109]

 

Even when listeners are turned off, this code gets executed and adds load on 
the system. It should be guarded by listener checks.

 
{noformat}
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1589)
at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1700)
at 
org.apache.hadoop.hive.ql.metadata.Hive.addInsertFileInformation(Hive.java:3185)
at 
org.apache.hadoop.hive.ql.metadata.Hive.addWriteNotificationLog(Hive.java:3138)
at 
org.apache.hadoop.hive.ql.metadata.Hive.addWriteNotificationLog(Hive.java:3123)
at 
org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:2238){noformat}

  was:
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8109]

 

Even when listeners are turned off, this code gets executed and adds load on 
the system. It should be guarded by listener checks.


> HiveMetastore addWriteNotificationLog should be invoked only when listeners 
> are enabled
> ---
>
> Key: HIVE-22752
> URL: https://issues.apache.org/jira/browse/HIVE-22752
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8109]
>  
> Even when listeners are turned off, this code gets executed and adds load on 
> the system. It should be guarded by listener checks.
>  
> {noformat}
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1589)
>   at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1700)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.addInsertFileInformation(Hive.java:3185)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.addWriteNotificationLog(Hive.java:3138)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.addWriteNotificationLog(Hive.java:3123)
>   at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:2238){noformat}
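
A minimal sketch of the suggested guard (assumed names, not the committed patch): check for registered listeners before doing the expensive file walk and event construction.

{code}
import java.util.Collections;
import java.util.List;

public class WriteNotificationSketch {
  private final List<Object> transactionalListeners;

  WriteNotificationSketch(List<Object> transactionalListeners) {
    this.transactionalListeners = transactionalListeners;
  }

  void addWriteNotificationLog(String partition) {
    if (transactionalListeners.isEmpty()) {
      return; // no consumers: skip listing inserted files / getFileStatus calls
    }
    System.out.println("firing write notification for " + partition);
  }

  public static void main(String[] args) {
    // With no listeners configured, the load-partition path stays cheap.
    new WriteNotificationSketch(Collections.emptyList())
        .addWriteNotificationLog("ds=2020-01-20");
  }
}
{code}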



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Description: 
In case of an exception in SQLOperation, the operation log does not get cleaned 
up. This causes a gradual build-up of HushableRandomAccessFileAppender 
instances, causing HS2 to OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!


 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=698,height=209!

 

Each HushableRandomAccessFileAppender holds an internal reference to a 
RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
leak.

Related ticket: HIVE-18820

  was:
In case of an exception in SQLOperation, the operation log does not get cleaned 
up. This causes a gradual build-up of HushableRandomAccessFileAppender 
instances, causing HS2 to OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!


 

Prod instance mem


 

Each HushableRandomAccessFileAppender holds an internal reference to a 
RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
leak.

Related ticket: HIVE-18820


> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820
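
A minimal sketch of the cleanup idea (assumed method names, not the attached patch): release the per-operation appender on the error path too, e.g. via a finally block, so a failed operation does not leave an appender (and its 256 KB buffer) registered forever.

{code}
public class OperationLogCleanupSketch {
  static void execute() {
    throw new RuntimeException("simulated query failure");
  }

  static void cleanupOperationLog() {
    // Stop the appender and release its buffer on success *and* failure.
    System.out.println("appender stopped, buffer released");
  }

  public static void main(String[] args) {
    try {
      execute();
    } catch (RuntimeException e) {
      System.err.println("operation failed: " + e.getMessage());
    } finally {
      cleanupOperationLog();
    }
  }
}
{code}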



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Attachment: image-2020-01-21-11-17-59-279.png

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Prod instance mem
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Attachment: image-2020-01-21-11-18-37-294.png

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Description: 
In case of an exception in SQLOperation, the operation log does not get cleaned 
up. This causes a gradual build-up of HushableRandomAccessFileAppender 
instances, causing HS2 to OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!

 

Allocation tree

!image-2020-01-21-11-18-37-294.png|width=425,height=178!

 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=698,height=209!

 

Each HushableRandomAccessFileAppender holds an internal reference to a 
RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
leak.

Related ticket: HIVE-18820

  was:
In case of an exception in SQLOperation, the operation log does not get cleaned 
up. This causes a gradual build-up of HushableRandomAccessFileAppender 
instances, causing HS2 to OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!


 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=698,height=209!

 

Each HushableRandomAccessFileAppender holds an internal reference to a 
RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
leak.

Related ticket: HIVE-18820


> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Attachment: HIVE-22753.1.patch

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Assignee: Rajesh Balamohan
  Status: Patch Available  (was: Open)

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-20 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019904#comment-17019904
 ] 

Rajesh Balamohan commented on HIVE-22753:
-

Yeah, that could end up closing the operation log way too soon. Will check and 
revise the patch.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-21 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Status: Open  (was: Patch Available)

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-21 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020723#comment-17020723
 ] 

Rajesh Balamohan commented on HIVE-22753:
-

[~maheshk114]: BTW, I have tried HIVE-22733 and the issue persists.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-21 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020794#comment-17020794
 ] 

Rajesh Balamohan commented on HIVE-22753:
-

Did some more debugging.

1. https://issues.apache.org/jira/browse/HIVE-22733 does not fix this issue. 
Observed the memory leak with this fix as well.

2. HushableRandomAccessFileAppender stop() is getting invoked correctly as part 
of "Operation.cleanupOperationLog --> LogUtils.stopQueryAppender".

However, due to some residual message in BatchEventProcessor, a 
"HushableRandomAccessFileAppender" with the same filename gets recreated 
immediately after stop() is invoked. E.g.:

{noformat}
at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136)
at 
org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:958)
at 
org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:898)
at 
org.apache.logging.log4j.core.appender.routing.RoutingAppender.createAppender(RoutingAppender.java:271)
at 
org.apache.logging.log4j.core.appender.routing.RoutingAppender.getControl(RoutingAppender.java:255)
at 
org.apache.logging.log4j.core.appender.routing.RoutingAppender.append(RoutingAppender.java:225)
at 
org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)
at 
org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)
at 
org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)
at 
org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)
at 
org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:448)
at 
org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:433)
at 
org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:417)
at 
org.apache.logging.log4j.core.config.AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:79)
at 
org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:380)
at 
org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:152)
at 
org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:45)
at 
org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:29)
at 
com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:129) 
 {noformat}

So this leaves the object forever in the map, causing the memory leak. Yet to 
check how to prevent it from being reinstantiated immediately.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-21 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020863#comment-17020863
 ] 

Rajesh Balamohan commented on HIVE-22753:
-

Another way could be to add "IdlePurgePolicy" to "RoutingAppender" to get rid 
of unused appenders.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-22 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021041#comment-17021041
 ] 

Rajesh Balamohan commented on HIVE-22753:
-

There is a race between BatchEventProcessor and the cleanup operation performed 
by the HS2 thread. So even though stop() is invoked and the file is closed, the 
same filename is recreated by BatchEventProcessor within 1 ms due to the race. 
This instance never gets cleaned up, causing the leak. This also creates stale 
directories/files in the ops log folder. Another option, taken in the .2 patch, 
is to track the files which are genuinely being closed and prevent them from 
getting recreated within seconds.
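
A minimal sketch of that tracking idea (assumed structure, not the exact .2 patch): remember filenames that were deliberately closed and refuse to recreate an appender for them within a short grace window.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClosedFileTrackerSketch {
  private static final long GRACE_MS = 2_000; // assumed window

  private final Map<String, Long> closedAt = new ConcurrentHashMap<>();

  public void markClosed(String fileName) {
    closedAt.put(fileName, System.currentTimeMillis());
  }

  /** Whether an appender for this file may be (re)created right now. */
  public boolean mayCreate(String fileName) {
    Long t = closedAt.get(fileName);
    if (t == null) {
      return true; // never closed: fine to create
    }
    if (System.currentTimeMillis() - t < GRACE_MS) {
      return false; // a residual event racing a deliberate close
    }
    closedAt.remove(fileName); // window expired: allow again
    return true;
  }

  public static void main(String[] args) {
    ClosedFileTrackerSketch tracker = new ClosedFileTrackerSketch();
    tracker.markClosed("/tmp/ops/query1.log");
    System.out.println(tracker.mayCreate("/tmp/ops/query1.log")); // false
    System.out.println(tracker.mayCreate("/tmp/ops/query2.log")); // true
  }
}
{code}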

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> image-2020-01-21-11-14-37-911.png, image-2020-01-21-11-17-59-279.png, 
> image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-22 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Attachment: HIVE-22753.2.patch

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> image-2020-01-21-11-14-37-911.png, image-2020-01-21-11-17-59-279.png, 
> image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-22 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22753:

Status: Patch Available  (was: Open)

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> image-2020-01-21-11-14-37-911.png, image-2020-01-21-11-17-59-279.png, 
> image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-01-22 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021041#comment-17021041
 ] 

Rajesh Balamohan edited comment on HIVE-22753 at 1/22/20 1:16 PM:
--

There is a race between BatchEventProcessor and the cleanup operation performed 
by the HS2 thread. So even though stop() is invoked and the file is closed, the 
same filename is recreated by BatchEventProcessor within 1 ms due to the race. 
This instance never gets cleaned up, causing the leak. This also creates stale 
directories/files in the ops log folder. Another option, taken in the .2 patch, 
is to track the files which are genuinely being closed and prevent them from 
getting recreated within seconds. Verified that this fixes the leak in the 
cluster.


was (Author: rajesh.balamohan):
There is a race between BatchEventProcessor and the cleanup operation performed 
by the HS2 thread. So even though stop() is invoked and the file is closed, the 
same filename is recreated by BatchEventProcessor within 1 ms due to the race. 
This instance never gets cleaned up, causing the leak. This also creates stale 
directories/files in the ops log folder. Another option, taken in the .2 patch, 
is to track the files which are genuinely being closed and prevent them from 
getting recreated within seconds.

> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> image-2020-01-21-11-14-37-911.png, image-2020-01-21-11-17-59-279.png, 
> image-2020-01-21-11-18-37-294.png
>
>
> In case of an exception in SQLOperation, the operation log does not get 
> cleaned up. This causes a gradual build-up of 
> HushableRandomAccessFileAppender instances, causing HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=698,height=209!
>  
> Each HushableRandomAccessFileAppender holds an internal reference to a 
> RandomAccessFileAppender, which holds a 256 KB ByteBuffer, causing the memory 
> leak.
> Related ticket: HIVE-18820



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22751) Move locking in HiveServer2::isDeregisteredWithZooKeeper to ZooKeeperHiveHelper

2020-01-23 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22751:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~anishek] for the review. Committed to master.

> Move locking in HiveServer2::isDeregisteredWithZooKeeper to 
> ZooKeeperHiveHelper
> ---
>
> Key: HIVE-22751
> URL: https://issues.apache.org/jira/browse/HIVE-22751
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22751.1.patch
>
>
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/HiveServer2.java#L620]
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/session/SessionManager.java#L597]
>  
> When queries are run in beeline and closed, it causes unwanted delays in 
> shutting down beeline. Here is the thread dump from the server side, which 
> shows HiveServer2 lock contention.
>  
> It would be good to move synchronization to 
> "zooKeeperHelper.isDeregisteredWithZooKeeper"
>  
> {noformat}
> "main" #1 prio=5 os_prio=0 tid=0x7f78b0078800 nid=0x2d1c waiting on 
> condition [0x7f78b968c000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0xac8d5ff0> (a 
> java.util.concurrent.FutureTask)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.startUnderInitLock(TezSessionPool.java:187)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPool.start(TezSessionPool.java:123)
>   - locked <0xa9c5f2a8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.startPool(TezSessionPoolManager.java:115)
>   at 
> org.apache.hive.service.server.HiveServer2.initAndStartTezSessionPoolManager(HiveServer2.java:790)
>   at 
> org.apache.hive.service.server.HiveServer2.startOrReconnectTezSessions(HiveServer2.java:763)
>   at 
> org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:687)
>   - locked <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:1016)
>   at 
> org.apache.hive.service.server.HiveServer2.access$1400(HiveServer2.java:137)
>   at 
> org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:1294)
>   at 
> org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:1138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> "HiveServer2-HttpHandler-Pool: Thread-50" #50 prio=5 os_prio=0 
> tid=0x7f78b3e60800 nid=0x2fa7 waiting for monitor entry 
> [0x7f7884edf000]
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at 
> org.apache.hive.service.server.HiveServer2.isDeregisteredWithZooKeeper(HiveServer2.java:600)
>   - waiting to lock <0xa99bd568> (a 
> org.apache.hive.service.server.HiveServer2)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSessionInternal(SessionManager.java:631)
>   at 
> org.apache.hive.service.cli.session.SessionManager.closeSession(SessionManager.java:621)
>   - locked <0xaa1970b0> (a 
> org.apache.hive.service.cli.session.SessionManager)
>   at 
> org.apache.hive.service.cli.CLIService.closeSession(CLIService.java:244)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.CloseSession(ThriftCLIService.java:527)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1517)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseSession.getResult(TCLIService.java:1502)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
>   at 
> org.apache.hive.service.cli.thri
