[jira] [Work logged] (HIVE-19253) HMS ignores tableType property for external tables

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19253?focusedWorklogId=504880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504880
 ]

ASF GitHub Bot logged work on HIVE-19253:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 19:22
Start Date: 26/Oct/20 19:22
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1537:
URL: https://github.com/apache/hive/pull/1537#issuecomment-716770245


   @szehonCriteo Hi Szehon, how have you been? 
   Yeah, it does seem a bit odd that the test sets tableType=EXTERNAL_TABLE 
but adds capabilities that are typically set for MANAGED_TABLEs. 
   I do think that changing the test output to what's in the fix is incorrect. 
The API is supposed to return only the fields that are not shielded by a bit 
mask.
   
   If you change the tableType to MANAGED_TABLE, the test should pass as-is. 
Could you give that a try please?
   
   Also, should HiveMetastore.isExternal() be removed entirely in favor of 
MetastoreUtils.isExternal()?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504880)
Time Spent: 1h  (was: 50m)

> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API, they may assume that 
> setting tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - 
> their table is gone later, because HMS silently changes it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>       MetaException {
>     ...
>     // If the table has property EXTERNAL set, update table type
>     // accordingly
>     String tableType = tbl.getTableType();
>     boolean isExternal = Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
>     if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>       if (isExternal) {
>         tableType = TableType.EXTERNAL_TABLE.toString();
>       }
>     }
>     if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>       if (!isExternal) { // Here!
>         tableType = TableType.MANAGED_TABLE.toString();
>       }
>     }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place - which is wrong.
> Moreover, some places in the code look at the table property to decide the 
> table type while other places look at the parameter. HMS should really make 
> up its mind about which one to use.
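The silent flip described above is easy to reproduce in isolation. The following is a hypothetical, stripped-down sketch of the reconciliation logic; the class name and string constants are mine, not Hive's:

```java
// Hypothetical stand-alone reconstruction of the reconciliation logic in
// convertToMTable; not actual Hive code.
public class TableTypeResolver {
    public static String resolve(String tableType, String externalParam) {
        // Boolean.parseBoolean(null) is false, which is exactly how a
        // missing "EXTERNAL" table parameter triggers the silent flip.
        boolean isExternal = Boolean.parseBoolean(externalParam);
        if ("MANAGED_TABLE".equals(tableType) && isExternal) {
            return "EXTERNAL_TABLE"; // EXTERNAL=true overrides MANAGED
        }
        if ("EXTERNAL_TABLE".equals(tableType) && !isExternal) {
            return "MANAGED_TABLE";  // the surprising flip described above
        }
        return tableType;
    }
}
```

With this logic, a Thrift client that sets tableType=EXTERNAL_TABLE but omits the EXTERNAL parameter silently gets a managed table back.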



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504904&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504904
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 20:16
Start Date: 26/Oct/20 20:16
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512240902



##
File path: 
service/src/java/org/apache/hive/service/cli/session/SessionManager.java
##
@@ -135,6 +138,13 @@ public synchronized void init(HiveConf hiveConf) {
     userIpAddressLimit = hiveConf.getIntVar(ConfVars.HIVE_SERVER2_LIMIT_CONNECTIONS_PER_USER_IPADDRESS);
     LOG.info("Connections limit are user: {} ipaddress: {} user-ipaddress: {}", userLimit, ipAddressLimit,
         userIpAddressLimit);
+
+    int cleanupThreadCount = hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_THREAD_COUNT);
+    int cleanupQueueSize = hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_QUEUE_SIZE);
+    if (cleanupThreadCount > 0) {
+      cleanupService = new EventualCleanupService(cleanupThreadCount, cleanupQueueSize);
+    }
+    cleanupService.start();

Review comment:
   fixed
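The fix being discussed can be sketched as follows; the class names mirror the PR, but the bodies are hypothetical stand-ins, not the actual Hive implementation:

```java
// Hypothetical stand-ins for the services in the PR: pick an implementation
// before calling start(), so start() is never invoked on a null field.
interface CleanupService {
    void start();
}

class EventualCleanupService implements CleanupService {
    EventualCleanupService(int threadCount, int queueSize) {
        // a real implementation would size a worker pool and a bounded queue
    }
    @Override
    public void start() {
        // a real implementation would spawn background cleanup workers
    }
}

class SyncCleanupService implements CleanupService {
    @Override
    public void start() {
        // nothing to start: deletions run inline on the caller's thread
    }
}

public class SessionManagerSketch {
    public static CleanupService chooseCleanupService(int threadCount, int queueSize) {
        // Never leave the field null: choose an implementation, then start it.
        CleanupService svc = threadCount > 0
                ? new EventualCleanupService(threadCount, queueSize)
                : new SyncCleanupService();
        svc.start(); // safe: svc is never null
        return svc;
    }
}
```

The diff as shown would throw a NullPointerException on `cleanupService.start()` when the thread count is zero; falling back to a synchronous service keeps the call site unconditional.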







Issue Time Tracking
---

Worklog Id: (was: 504904)
Time Spent: 2h 10m  (was: 2h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In a cloud environment, scratchdir cleanup at the end of a query may take a 
> long time. This causes the client to hang for up to a minute even after the 
> results have been streamed back; during this time the client just waits for 
> the cleanup to finish. The cleanup can instead take place in the background 
> in HiveServer.
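The idea above can be sketched with a plain executor; this is a minimal illustration, not Hive's actual implementation:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of moving scratch-directory deletion off the query path:
// the caller enqueues the delete and returns immediately, while a small
// worker pool performs the slow filesystem calls in the background.
public class AsyncScratchDirCleaner {
    private final ExecutorService pool;

    public AsyncScratchDirCleaner(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Enqueue a cleanup task; does not block on the actual delete. */
    public Future<?> deleteAsync(Runnable deleteScratchDir) {
        return pool.submit(deleteScratchDir);
    }

    public void shutdown() {
        pool.shutdown();
    }

    /** Tiny self-check: the enqueued task really runs. */
    public static boolean demo() throws Exception {
        AsyncScratchDirCleaner cleaner = new AsyncScratchDirCleaner(1);
        AtomicBoolean ran = new AtomicBoolean(false);
        cleaner.deleteAsync(() -> ran.set(true)).get(); // wait only in the demo
        cleaner.shutdown();
        return ran.get();
    }
}
```

The query path only pays for the `submit()` call; the slow delete no longer delays the client after results are returned.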





[jira] [Commented] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception

2020-10-26 Thread Chao Gao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220502#comment-17220502
 ] 

Chao Gao commented on HIVE-24066:
-

I also would like to share some findings from my investigation.

*After copying the original data into a new table, using any format (ORC, 
TEXTFILE, or PARQUET), the query works fine.*

1. Create exactly the same table as in the JIRA description, using the ORC, 
TEXTFILE, and PARQUET formats, with table names *sample_orc_table_copy*, 
*sample_text_table_copy* and *sample_parquet_table_copy*, respectively.

2. Populate the new tables with data from *sample_parquet_table.*
{code:java}
INSERT INTO sample_orc_table_copy SELECT * FROM sample_parquet_table;
INSERT INTO sample_text_table_copy SELECT * FROM sample_parquet_table;
INSERT INTO sample_parquet_table_copy SELECT * FROM sample_parquet_table;{code}
3. Run the problematic query again; it works well on the new tables.
{code:java}
hive> SELECT context.os.name from sample_orc_table_copy;
hive> SELECT context.os.name from sample_text_table_copy;
hive> SELECT context.os.name from sample_parquet_table_copy;
OK
NULL
NULL
NULL
NULL
NULL{code}
 

> Hive query on parquet data should identify if column is not present in file 
> schema and show NULL value instead of Exception
> ---
>
> Key: HIVE-24066
> URL: https://issues.apache.org/jira/browse/HIVE-24066
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.2, 2.3.5
>Reporter: Jainik Vora
>Priority: Major
> Attachments: day_01.snappy.parquet
>
>
> I created a hive table containing columns with struct data type 
>   
> {code:java}
> CREATE EXTERNAL TABLE test_dwh.sample_parquet_table (
>   `context` struct<
> `app`: struct<
> `build`: string,
> `name`: string,
> `namespace`: string,
> `version`: string
> >,
> `device`: struct<
> `adtrackingenabled`: boolean,
> `advertisingid`: string,
> `id`: string,
> `manufacturer`: string,
> `model`: string,
> `type`: string
> >,
> `locale`: string,
> `library`: struct<
> `name`: string,
> `version`: string
> >,
> `os`: struct<
> `name`: string,
> `version`: string
> >,
> `screen`: struct<
> `height`: bigint,
> `width`: bigint
> >,
> `network`: struct<
> `carrier`: string,
> `cellular`: boolean,
> `wifi`: boolean
>  >,
> `timezone`: string,
> `userAgent`: string
> >
> ) PARTITIONED BY (day string)
> STORED as PARQUET
> LOCATION 's3://xyz/events'{code}
>  
>  All columns are nullable hence the parquet files read by the table don't 
> always contain all columns. If any file in a partition doesn't have 
> "context.os" struct and if "context.os.name" is queried, Hive throws an 
> exception as below. Same for "context.screen" as well.
>   
> {code:java}
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]java.io.IOException: 
> java.lang.RuntimeException: Primitive type osshould not doesn't match 
> typeos[name] 
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2208)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Commented] (HIVE-24066) Hive query on parquet data should identify if column is not present in file schema and show NULL value instead of Exception

2020-10-26 Thread Chao Gao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220516#comment-17220516
 ] 

Chao Gao commented on HIVE-24066:
-

Next I wrote some Python code to analyze the metadata inside the PARQUET file.
{code:java}
>>> import pyarrow.parquet as pq
>>>
>>> metadata = pq.read_table('day_01.snappy.parquet')
>>> print(metadata.schema)
context: struct, device: struct, 
library: struct, locale: string, network: 
struct, screen: 
struct, timezone: string, 
userAgent: string>
..
..
-- schema metadata --
org.apache.spark.sql.parquet.row.metadata: '{"type":"struct","fields":[{"' + 
1937

>>> data = metadata.to_pandas()
>>> print(data.to_string())
0  {'app': {'build': '123', 'name': 'User App', 'namespace': 'com.abc.xyz', 
'version': '1.0.0'}, 'device': {'adTrackingEnabled': True, 'advertisingId': 
'test', 'id': '1c61295af65611b6', 'manufacturer': 'Quanta', 'model': 'QTAIR7', 
'name': 'QTAIR7'}, 'library': {'name': 'analytics-android', 'version': 
'1.0.0'}, 'locale': 'en-US', 'network': {'bluetooth': False, 'carrier': '', 
'cellular': False, 'wifi': False}, 'screen': {'density': 1.5, 'height': 1128, 
'width': 1920}, 'timezone': 'America/Los_Angeles', 'userAgent': 'Dalvik/2.1.0 
(Linux; U; Android 5.1; QTAIR7 Build/LMY47D)'}
..
..{code}
*From the Parquet analysis, we can see that the original parquet metadata was 
generated by Spark SQL, and that there is NO column* *_`os`: 
struct<`name`: string,`version`: string>_*

 

+Could anyone from the open source community help answer the following 
question, please?+

If the original Parquet file schema does NOT contain some STRUCT column, while 
the Hive table schema DOES contain that STRUCT column (with NULL data), is 
Hive expected to handle such a query gracefully? I.e., should _*select 
context.os.name from sample_parquet_table;*_ show NULL instead of an exception?
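For illustration only, here is a hypothetical sketch of the behaviour being asked for: resolve the requested nested field against the file schema and yield NULL when the file lacks it. The data model (a map of struct name to field names) is invented for the example and is not Hive's or Parquet's API:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of schema-evolution-tolerant field resolution.
public class MissingColumnSketch {
    /**
     * Returns the value of struct.field from the row, or null when the
     * file schema does not contain that field at all.
     */
    public static Object readField(Map<String, Set<String>> fileSchema,
                                   Map<String, Object> row,
                                   String struct, String field) {
        Set<String> fields = fileSchema.get(struct);
        if (fields == null || !fields.contains(field)) {
            return null; // absent in the file: surface NULL, don't throw
        }
        return row.get(struct + "." + field);
    }
}
```

The point is that the reader should compare the requested path against the file's own schema before descending into it, rather than assuming the table schema and the file schema match.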

> Hive query on parquet data should identify if column is not present in file 
> schema and show NULL value instead of Exception
> ---
>
> Key: HIVE-24066
> URL: https://issues.apache.org/jira/browse/HIVE-24066
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.2, 2.3.5
>Reporter: Jainik Vora
>Priority: Major
> Attachments: day_01.snappy.parquet
>
>
> I created a hive table containing columns with struct data type 
>   
> {code:java}
> CREATE EXTERNAL TABLE test_dwh.sample_parquet_table (
>   `context` struct<
> `app`: struct<
> `build`: string,
> `name`: string,
> `namespace`: string,
> `version`: string
> >,
> `device`: struct<
> `adtrackingenabled`: boolean,
> `advertisingid`: string,
> `id`: string,
> `manufacturer`: string,
> `model`: string,
> `type`: string
> >,
> `locale`: string,
> `library`: struct<
> `name`: string,
> `version`: string
> >,
> `os`: struct<
> `name`: string,
> `version`: string
> >,
> `screen`: struct<
> `height`: bigint,
> `width`: bigint
> >,
> `network`: struct<
> `carrier`: string,
> `cellular`: boolean,
> `wifi`: boolean
>  >,
> `timezone`: string,
> `userAgent`: string
> >
> ) PARTITIONED BY (day string)
> STORED as PARQUET
> LOCATION 's3://xyz/events'{code}
>  
>  All columns are nullable hence the parquet files read by the table don't 
> always contain all columns. If any file in a partition doesn't have 
> "context.os" struct and if "context.os.name" is queried, Hive throws an 
> exception as below. Same for "context.screen" as well.
>   
> {code:java}
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]
> 2020-10-23T00:44:10,496 ERROR [db58bfe6-d0ca-4233-845a-8a10916c3ff1 
> main([])]: CliDriver (SessionState.java:printError(1126)) - Failed with 
> exception java.io.IOException:java.lang.RuntimeException: Primitive type 
> osshould not doesn't match typeos[name]java.io.IOException: 
> java.lang.RuntimeException: Primitive type osshould not doesn't match 
> typeos[name] 
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>   at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>   at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
>   at 

[jira] [Assigned] (HIVE-16490) Hive should not use private HDFS APIs for encryption

2020-10-26 Thread Uma Maheswara Rao G (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HIVE-16490:
--

Assignee: Uma Maheswara Rao G  (was: Naveen Gangam)

> Hive should not use private HDFS APIs for encryption
> 
>
> Key: HIVE-16490
> URL: https://issues.apache.org/jira/browse/HIVE-16490
> Project: Hive
>  Issue Type: Improvement
>  Components: Encryption
>Affects Versions: 2.2.0
>Reporter: Andrew Wang
>Assignee: Uma Maheswara Rao G
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When compiling against bleeding edge versions of Hive and Hadoop, we 
> discovered that HIVE-16047 references a private HDFS API, DFSClient, to get 
> at various encryption related information. The private API was recently 
> changed by HADOOP-14104, which broke Hive compilation.
> It'd be better to instead use publicly supported APIs. HDFS-11687 has been 
> filed to add whatever encryption APIs are needed by Hive. This JIRA is to 
> move Hive over to these new APIs.





[jira] [Commented] (HIVE-24108) LlapDaemon should use TezClassLoader

2020-10-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220600#comment-17220600
 ] 

László Bodor commented on HIVE-24108:
-

According to an earlier Slack topic, I found that ADD JAR is not supposed to 
be supported in LLAP, so I'm about to change the first patch to use the 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test that failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in my 
opinion, so I'm moving it to TestMiniTezCliDriver.
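The inheritance property that the classloader change relies on can be demonstrated in plain Java; the class below is a standalone illustration, unrelated to Hive or Tez code:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Demonstrates that a child thread inherits its creator's context class
// loader, which is why installing a loader early in a daemon's startup
// covers all worker threads spawned afterwards.
public class ContextLoaderDemo {
    public static boolean childInheritsContextLoader() throws Exception {
        ClassLoader previous = Thread.currentThread().getContextClassLoader();
        ClassLoader custom = new URLClassLoader(new URL[0], previous);
        Thread.currentThread().setContextClassLoader(custom);
        final ClassLoader[] seen = new ClassLoader[1];
        try {
            Thread child = new Thread(() ->
                    seen[0] = Thread.currentThread().getContextClassLoader());
            child.start();
            child.join();
        } finally {
            Thread.currentThread().setContextClassLoader(previous);
        }
        return seen[0] == custom; // captured at Thread construction time
    }
}
```

Threads created before the custom loader is installed keep the old one, which is why initializing the loader early (before worker threads are spawned) matters.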

> LlapDaemon should use TezClassLoader
> 
>
> Key: HIVE-24108
> URL: https://issues.apache.org/jira/browse/HIVE-24108
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-24108.01.patch, hive_log_llap.log
>
>
> TEZ-4228 fixes an issue on the Tez side by using TezClassLoader 
> instead of the system classloader. However, there are some codepaths, e.g. in 
>  [^hive_log_llap.log], which show that the system class loader is still used. 
> As thread context classloaders are inherited, the easier solution is to 
> early-initialize TezClassLoader in LlapDaemon and let all threads use that 
> as the context class loader, so this solution is more like TEZ-4223 for llap 
> daemons.
> {code}
> 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288)
>   ... 16 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
>   ... 18 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76)
>   ... 21 more
> {code}





[jira] [Comment Edited] (HIVE-24108) AddToClassPathAction should use TezClassLoader

2020-10-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220600#comment-17220600
 ] 

László Bodor edited comment on HIVE-24108 at 10/26/20, 10:29 AM:
-

According to an earlier Slack topic, I found that ADD JAR is not supposed to 
be supported in LLAP, so I'm about to change the first patch to use the 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test that failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in my 
opinion, so I'm moving it to TestMiniTezCliDriver.

I think the reason mapjoin_addjar.q worked with TestMiniLlapLocalCliDriver 
(before this TezClassLoader change and tez 0.10) is that it runs the llap 
daemon in the same JVM (and somehow also had the custom TestSerDe class on the 
classpath), but that doesn't reflect a real-life scenario (confirmed with 
TestMiniLlapCliDriver, where the daemon runs in a separate JVM and it fails as 
expected).

cc: [~harishjp]: what do you think about [^HIVE-24108.02.patch]? It doesn't 
use the global TezClassLoader... I'm testing it in the scope of HIVE-23930.


was (Author: abstractdog):
according to an earlier slack topic, I found that ADD JAR is not supposed to be 
supported in LLAP, so I'm about to change the first patch and use 
TezClassloader only in FunctionLocalizer/AddToClassPathAction

the only test which failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver, and that's invalid in 
my opinion, I'm moving it to TestMiniTezCliDriver

I think the reason why mapjoin_addjar.q is worked with 
TestMiniLlapLocalCliDriver (before this TezClassloader stuff and tez 0.10) is 
that it runs the llap daemon in the same JVM (and somehow also had the custom 
TestSerDe class on the classpath), but that doesn't reflect a real-life 
scenario (confirmed with TestMiniLlapCliDriver where the daemon runs in 
separate JVM and it fails as expected)

> AddToClassPathAction should use TezClassLoader
> --
>
> Key: HIVE-24108
> URL: https://issues.apache.org/jira/browse/HIVE-24108
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, 
> hive_log_llap.log
>
>
> TEZ-4228 fixes an issue on the Tez side by using TezClassLoader 
> instead of the system classloader. However, there are some codepaths, e.g. in 
>  [^hive_log_llap.log], which show that the system class loader is still used. 
> As thread context classloaders are inherited, the easier solution is to 
> early-initialize TezClassLoader in LlapDaemon and let all threads use that 
> as the context class loader, so this solution is more like TEZ-4223 for llap 
> daemons.
> {code}
> 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> 

[jira] [Updated] (HIVE-24108) AddToClassPathAction should use TezClassLoader

2020-10-26 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24108:

Summary: AddToClassPathAction should use TezClassLoader  (was: LlapDaemon 
should use TezClassLoader)

> AddToClassPathAction should use TezClassLoader
> --
>
> Key: HIVE-24108
> URL: https://issues.apache.org/jira/browse/HIVE-24108
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-24108.01.patch, hive_log_llap.log
>
>
> TEZ-4228 fixes an issue on the Tez side by using TezClassLoader 
> instead of the system classloader. However, there are some codepaths, e.g. in 
>  [^hive_log_llap.log], which show that the system class loader is still used. 
> As thread context classloaders are inherited, the easier solution is to 
> early-initialize TezClassLoader in LlapDaemon and let all threads use that 
> as the context class loader, so this solution is more like TEZ-4223 for llap 
> daemons.
> {code}
> 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288)
>   ... 16 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
>   ... 18 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76)
>   ... 21 more
> {code}





[jira] [Updated] (HIVE-24108) AddToClassPathAction should use TezClassLoader

2020-10-26 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24108:

Attachment: HIVE-24108.02.patch

> AddToClassPathAction should use TezClassLoader
> --
>
> Key: HIVE-24108
> URL: https://issues.apache.org/jira/browse/HIVE-24108
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, 
> hive_log_llap.log
>
>
> TEZ-4228 fixes an issue on the Tez side by using TezClassLoader 
> instead of the system classloader. However, there are some codepaths, e.g. in 
>  [^hive_log_llap.log], which show that the system class loader is still used. 
> As thread context classloaders are inherited, the easier solution is to 
> early-initialize TezClassLoader in LlapDaemon and let all threads use that 
> as the context class loader, so this solution is more like TEZ-4223 for llap 
> daemons.
> {code}
> 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: 
> java.lang.RuntimeException: Map operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288)
>   ... 16 more
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
>   ... 18 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.serde2.TestSerDe
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76)
>   ... 21 more
> {code}
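The inheritance property the description relies on can be demonstrated with a few lines of plain JDK code. This is a minimal sketch, not Hive/Tez code; `ContextLoaderDemo` and the loader created here are illustrative stand-ins for TezClassLoader:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderDemo {
    // Install a custom loader on the current ("daemon main") thread, then spawn
    // a worker: the worker's context classloader is copied from its parent at
    // construction time, so it sees the custom loader without any extra wiring.
    public static boolean childInherits() throws Exception {
        ClassLoader custom = new URLClassLoader(new URL[0],
                ContextLoaderDemo.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(custom);

        final ClassLoader[] seen = new ClassLoader[1];
        Thread worker = new Thread(() ->
                seen[0] = Thread.currentThread().getContextClassLoader());
        worker.start();
        worker.join();
        return seen[0] == custom;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childInherits()); // true: the worker inherited the loader
    }
}
```

This is why initializing the loader early in LlapDaemon, before any thread pools are created, is enough for all later threads to pick it up by default.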



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-24108) AddToClassPathAction should use TezClassLoader

2020-10-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220600#comment-17220600
 ] 

László Bodor edited comment on HIVE-24108 at 10/26/20, 10:26 AM:
-

According to an earlier Slack topic, I found that ADD JAR is not supposed to be 
supported in LLAP, so I'm about to change the first patch and use 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test which failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in 
my opinion, so I'm moving it to TestMiniTezCliDriver.

I think the reason why mapjoin_addjar.q used to work with 
TestMiniLlapLocalCliDriver (before this TezClassLoader stuff and tez 0.10) is 
that it runs the llap daemon in the same JVM (and somehow also had the custom 
TestSerDe class on the classpath), but that doesn't reflect a real-life 
scenario (confirmed with TestMiniLlapCliDriver, where the daemon runs in a 
separate JVM and it fails as expected)


was (Author: abstractdog):
According to an earlier Slack topic, I found that ADD JAR is not supposed to be 
supported in LLAP, so I'm about to change the first patch and use 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test which failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in 
my opinion, so I'm moving it to TestMiniTezCliDriver.

I think the reason why mapjoin_addjar.q used to work with 
TestMiniLlapLocalCliDriver is that it runs the llap daemon in the same JVM (and 
somehow also had the custom TestSerDe class on the classpath), but that doesn't 
reflect a real-life scenario (confirmed with TestMiniLlapCliDriver, where the 
daemon runs in a separate JVM and it fails as expected)

> 

[jira] [Comment Edited] (HIVE-24108) AddToClassPathAction should use TezClassLoader

2020-10-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220600#comment-17220600
 ] 

László Bodor edited comment on HIVE-24108 at 10/26/20, 10:24 AM:
-

According to an earlier Slack topic, I found that ADD JAR is not supposed to be 
supported in LLAP, so I'm about to change the first patch and use 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test which failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in 
my opinion, so I'm moving it to TestMiniTezCliDriver.

I think the reason why mapjoin_addjar.q used to work with 
TestMiniLlapLocalCliDriver is that it runs the llap daemon in the same JVM (and 
somehow also had the custom TestSerDe class on the classpath), but that doesn't 
reflect a real-life scenario (confirmed with TestMiniLlapCliDriver, where the 
daemon runs in a separate JVM and it fails as expected)


was (Author: abstractdog):
According to an earlier Slack topic, I found that ADD JAR is not supposed to be 
supported in LLAP, so I'm about to change the first patch and use 
TezClassLoader only in FunctionLocalizer/AddToClassPathAction.

The only test which failed with the new TezClassLoader is mapjoin_addjar.q, 
which is an "ADD JAR" test on TestMiniLlapLocalCliDriver; that's invalid in 
my opinion, so I'm moving it to TestMiniTezCliDriver.


[jira] [Work logged] (HIVE-12371) Adding a timeout connection parameter for JDBC

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12371?focusedWorklogId=504707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504707
 ]

ASF GitHub Bot logged work on HIVE-12371:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 13:19
Start Date: 26/Oct/20 13:19
Worklog Time Spent: 10m 
  Work Description: jshmchenxi opened a new pull request #1611:
URL: https://github.com/apache/hive/pull/1611


   
   
   ### What changes were proposed in this pull request?
   
   Improvement [HIVE-12371](https://issues.apache.org/jira/browse/HIVE-12371): 
Adding a timeout connection parameter for JDBC
   
   ### Why are the changes needed?
   
   We ran into the problem described in 
[HIVE-22196](https://issues.apache.org/jira/browse/HIVE-22196): socket timeouts 
happen when other drivers set DriverManager.loginTimeout.
   Adding a timeout parameter to the HiveServer2 JDBC URL lets us avoid the 
global DriverManager.loginTimeout, which may be set unexpectedly by other 
components in the same program.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Tested by installing the jar locally.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504707)
Remaining Estimate: 0h
Time Spent: 10m

> Adding a timeout connection parameter for JDBC
> --
>
> Key: HIVE-12371
> URL: https://issues.apache.org/jira/browse/HIVE-12371
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Nemon Lou
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There are some timeout settings on the server side:
> HIVE-4766
> HIVE-6679
> Adding a timeout connection parameter for JDBC is useful in some scenarios:
> 1. Beeline (which cannot set a timeout manually)
> 2. Customizing timeouts for different connections (across Hive or RDBMSs, which 
> cannot be done via DriverManager.setLoginTimeout())
> Just like PostgreSQL,
> {noformat}
> jdbc:postgresql://localhost/test?user=fred&password=secret&ssl=true&loginTimeout=0
> {noformat}
> or mysql
> {noformat}
> jdbc:mysql://xxx.xx.xxx.xxx:3306/database?connectTimeout=6&socketTimeout=6
> {noformat}
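As a sketch of what driver-side handling of such a parameter involves, the snippet below pulls a timeout out of the URL's query string and falls back to a default instead of the process-global value. The class name and the parameter name `loginTimeout` are illustrative assumptions, not what the patch actually uses (real HiveServer2 URLs also use `;`-separated session variables):

```java
import java.util.HashMap;
import java.util.Map;

public class UrlTimeout {
    // split the query string of a JDBC URL into key/value pairs
    static Map<String, String> parseParams(String url) {
        Map<String, String> params = new HashMap<>();
        int q = url.indexOf('?');
        if (q < 0) {
            return params;
        }
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // per-connection timeout from the URL, falling back to a caller-supplied
    // default rather than the global DriverManager.getLoginTimeout()
    static int loginTimeoutSeconds(String url, int defaultSeconds) {
        String v = parseParams(url).get("loginTimeout");
        return v == null ? defaultSeconds : Integer.parseInt(v);
    }

    public static void main(String[] args) {
        System.out.println(loginTimeoutSeconds(
                "jdbc:hive2://host:10000/default?loginTimeout=30", 0)); // 30
    }
}
```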



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-12371) Adding a timeout connection parameter for JDBC

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-12371:
--
Labels: pull-request-available  (was: )

> Adding a timeout connection parameter for JDBC
> --
>
> Key: HIVE-12371
> URL: https://issues.apache.org/jira/browse/HIVE-12371
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Nemon Lou
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=504740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504740
 ]

ASF GitHub Bot logged work on HIVE-24259:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 14:32
Start Date: 26/Oct/20 14:32
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1610:
URL: https://github.com/apache/hive/pull/1610#discussion_r511995335



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2836,14 +2836,32 @@ long getPartsFound() {
   @Override
   public SQLAllTableConstraints getAllTableConstraints(String catName, String 
dbName, String tblName)
   throws MetaException, NoSuchObjectException {
-SQLAllTableConstraints sqlAllTableConstraints = new 
SQLAllTableConstraints();
-sqlAllTableConstraints.setPrimaryKeys(getPrimaryKeys(catName, dbName, 
tblName));
-sqlAllTableConstraints.setForeignKeys(getForeignKeys(catName, null, null, 
dbName, tblName));
-sqlAllTableConstraints.setUniqueConstraints(getUniqueConstraints(catName, 
dbName, tblName));
-
sqlAllTableConstraints.setDefaultConstraints(getDefaultConstraints(catName, 
dbName, tblName));
-sqlAllTableConstraints.setCheckConstraints(getCheckConstraints(catName, 
dbName, tblName));
-
sqlAllTableConstraints.setNotNullConstraints(getNotNullConstraints(catName, 
dbName, tblName));
-return sqlAllTableConstraints;
+
+catName = StringUtils.normalizeIdentifier(catName);
+dbName = StringUtils.normalizeIdentifier(dbName);
+tblName = StringUtils.normalizeIdentifier(tblName);
+if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && 
rawStore.isActiveTransaction())) {
+  return rawStore.getAllTableConstraints(catName, dbName, tblName);
+}
+
+Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
+if (tbl == null) {
+  // The table containing the constraints is not yet loaded in cache
+  return rawStore.getAllTableConstraints(catName, dbName, tblName);
+}
+SQLAllTableConstraints constraints = 
sharedCache.listCachedAllTableConstraints(catName, dbName, tblName);
+
+// if any of the constraint value is missing then there might be the case 
of partial constraints are stored in cached.
+// So fall back to raw store for correct values
+if (constraints != null && 
CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && CollectionUtils

Review comment:
   Majority of the calls are likely to be redirected to RawStore and lose 
the advantage of cache. How about an optional flag in SQLAllTableConstraints 
and TableWrapper to mark if it is a complete snapshot of constraints?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504740)
Time Spent: 20m  (was: 10m)

> [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 cache call
> --
>
> Key: HIVE-24259
> URL: https://issues.apache.org/jira/browse/HIVE-24259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Description:
> Currently, in order to get all constraints from the CachedStore, 6 different 
> calls are made to the store. Instead, combine those 6 calls into 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24259) [CachedStore] Optimise getAlltableConstraint from 6 cache calls to 1.

2020-10-26 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-24259:

Summary: [CachedStore] Optimise getAlltableConstraint from 6 cache calls to 
1.  (was: [CachedStore] Optimise getAlltableConstraint from 6 cache call to 1 
cache call)

> [CachedStore] Optimise getAlltableConstraint from 6 cache calls to 1.
> -
>
> Key: HIVE-24259
> URL: https://issues.apache.org/jira/browse/HIVE-24259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504814&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504814
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 17:21
Start Date: 26/Oct/20 17:21
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512136737



##
File path: 
service/src/java/org/apache/hive/service/cli/session/SessionManager.java
##
@@ -135,6 +138,13 @@ public synchronized void init(HiveConf hiveConf) {
 userIpAddressLimit = 
hiveConf.getIntVar(ConfVars.HIVE_SERVER2_LIMIT_CONNECTIONS_PER_USER_IPADDRESS);
 LOG.info("Connections limit are user: {} ipaddress: {} user-ipaddress: 
{}", userLimit, ipAddressLimit,
   userIpAddressLimit);
+
+int cleanupThreadCount = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_THREAD_COUNT);
+int cleanupQueueSize = 
hiveConf.getIntVar(ConfVars.HIVE_ASYNC_CLEANUP_SERVICE_QUEUE_SIZE);
+if (cleanupThreadCount > 0) {
+  cleanupService = new EventualCleanupService(cleanupThreadCount, 
cleanupQueueSize);
+}
+cleanupService.start();

Review comment:
   it seems to me that `cleanupService` is only initialized in case 
`cleanupThreadCount>0` - wouldn't there be an `NPE` when that is not satisfied?
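One way to address this concern without sprinkling null checks is a synchronous/no-op fallback object, so `start()` is always safe to call. This is a sketch only; `CleanupInit` and its nested `CleanupService` interface are illustrative stand-ins, not the actual Hive classes:

```java
public class CleanupInit {
    interface CleanupService {
        String start();
    }

    // Always return a non-null service: an async one when worker threads are
    // configured, otherwise a synchronous fallback. The caller can then invoke
    // start() unconditionally without risking a NullPointerException.
    static CleanupService create(int threadCount) {
        if (threadCount > 0) {
            return () -> "async(" + threadCount + " threads)";
        }
        return () -> "sync";
    }

    public static void main(String[] args) {
        System.out.println(create(0).start());
        System.out.println(create(4).start());
    }
}
```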

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/cleanup/EventualCleanupService.java
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.cleanup;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadFactory;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class EventualCleanupService implements CleanupService {
+  private final int threadCount;
+  private final int queueSize;
+  private final ThreadFactory factory;
+  private final Logger LOG = 
LoggerFactory.getLogger(EventualCleanupService.class.getName());
+  private final AtomicBoolean isRunning = new AtomicBoolean(true);
+  private final BlockingQueue deleteActions;
+  private ExecutorService cleanerExecutorService;
+
+  public EventualCleanupService(int threadCount, int queueSize) {
+if (queueSize < threadCount) {
+  throw new IllegalArgumentException("Queue size should be greater or 
equal to thread count. Queue size: "
+  + queueSize + ", thread count: " + threadCount);
+}
+this.factory = new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("EventualCleanupService 
thread %d").build();
+this.threadCount = threadCount;
+this.queueSize = queueSize;
+this.deleteActions = new LinkedBlockingQueue<>(queueSize);
+  }
+
+  @Override
+  public synchronized void start() {
+if (cleanerExecutorService != null) {
+  LOG.debug("EventualCleanupService is already running.");
+  return;
+}
+cleanerExecutorService = Executors.newFixedThreadPool(threadCount, 
factory);
+for (int i = 0; i < threadCount; i++) {
+  cleanerExecutorService.submit(new CleanupRunnable());

Review comment:
   I might be wrong but:
   * I think these threads will be there all the time - even if there is no 
work to be done (let's use `N` for this)
   * it seems to me that every session will have these cleaner threads for them 
(current session count: `M`)
   
   ...so there will be `N*M` threads running all the time?
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above 

[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=504820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504820
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 17:32
Start Date: 26/Oct/20 17:32
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r512144358



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/cleanup/EventualCleanupService.java
##
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.cleanup;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.util.concurrent.ThreadFactoryBuilder;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.ThreadFactory;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public class EventualCleanupService implements CleanupService {
+  private final int threadCount;
+  private final int queueSize;
+  private final ThreadFactory factory;
+  private final Logger LOG = 
LoggerFactory.getLogger(EventualCleanupService.class.getName());
+  private final AtomicBoolean isRunning = new AtomicBoolean(true);
+  private final BlockingQueue deleteActions;
+  private ExecutorService cleanerExecutorService;
+
+  public EventualCleanupService(int threadCount, int queueSize) {
+if (queueSize < threadCount) {
+  throw new IllegalArgumentException("Queue size should be greater or 
equal to thread count. Queue size: "
+  + queueSize + ", thread count: " + threadCount);
+}
+this.factory = new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("EventualCleanupService 
thread %d").build();
+this.threadCount = threadCount;
+this.queueSize = queueSize;
+this.deleteActions = new LinkedBlockingQueue<>(queueSize);
+  }
+
+  @Override
+  public synchronized void start() {
+if (cleanerExecutorService != null) {
+  LOG.debug("EventualCleanupService is already running.");
+  return;
+}
+cleanerExecutorService = Executors.newFixedThreadPool(threadCount, 
factory);
+for (int i = 0; i < threadCount; i++) {
+  cleanerExecutorService.submit(new CleanupRunnable());

Review comment:
CleanupService is one per HiveServer, not per session. So there will be 
N threads running all the time (actually blocked on the blocking queue when there 
is no cleanup to be done, so not scheduled when unnecessary).
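The idle behaviour being discussed, N workers parked on one shared queue, costing no CPU until work arrives, can be sketched with plain `java.util.concurrent` primitives. Names here are illustrative, not the Hive implementation:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueWorkers {
    // Start `threads` workers on one shared queue, push `tasks` jobs through,
    // and return how many ran. take() blocks idle workers instead of spinning.
    public static int drain(int threads, int tasks) throws Exception {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger done = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(tasks);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        queue.take().run(); // parked here while idle
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // pool shutdown
                }
            });
        }
        for (int i = 0; i < tasks; i++) {
            queue.put(() -> {
                done.incrementAndGet();
                latch.countDown();
            });
        }
        latch.await();
        pool.shutdownNow();
        return done.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(drain(4, 10)); // 10
    }
}
```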





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504820)
Time Spent: 2h  (was: 1h 50m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In a cloud environment, scratchdir cleanup at the end of a query may take a 
> long time. This causes the client to hang for up to 1 minute even after the 
> results were streamed back. During this time the client just waits for cleanup 
> to finish. Cleanup can instead take place in the background in HiveServer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24312) Use column stats to remove "x is not null" filter conditions if they are redundant

2020-10-26 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24312:
---


> Use column stats to remove "x is not null" filter conditions if they are 
> redundant
> --
>
> Key: HIVE-24312
> URL: https://issues.apache.org/jira/browse/HIVE-24312
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> With HIVE-24241, SharedWorkOptimizer could further merge branches for some 
> queries (ex: 
> [query32|https://github.com/apache/hive/blob/db895f374bf63b77b683574fdf678bfac91a5ac6/ql/src/test/results/clientpositive/perf/tez/query32.q.out#L118-L163]
>  )
> ...but a small `is not null` difference prevents it from proceeding.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24274:
--
Labels: pull-request-available  (was: )

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. 
> If we find a match, the original query's logical plan can be replaced by a 
> scan on the materialized view.
> - Only materialized views which are enabled for rewrite can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object 
> by adding a lookup method by query text.
> - There might be more than one materialized view with the same query text. In 
> this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can 
> be matched.
> - Use the expanded query text (fully qualified column and table names) for 
> comparison.
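The lookup flow above can be sketched like this. The types and method names here are hypothetical stand-ins; the real code would go through `HiveMaterializedViewsRegistry` and validate candidates via `Hive.validateMaterializedViewsFromRegistry()`:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class MvTextLookup {

    /**
     * Returns the first still-valid materialized view whose stored (expanded)
     * query text equals the incoming query's expanded text, or null when no
     * textual match exists and normal planning should proceed.
     */
    public static String firstValidMatch(Map<String, List<String>> viewsByQueryText,
                                         String expandedQueryText,
                                         Predicate<String> isStillValid) {
        List<String> candidates = viewsByQueryText.get(expandedQueryText);
        if (candidates == null) {
            return null;
        }
        // Several views may share the same query text; choose the first valid one.
        for (String view : candidates) {
            if (isStillValid.test(view)) {
                return view;
            }
        }
        return null;
    }
}
```

Keying the map on the expanded text (fully qualified column and table names) is what makes the comparison safe across sessions with different current databases.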



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24274) Implement Query Text based MaterializedView rewrite

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24274?focusedWorklogId=504808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504808
 ]

ASF GitHub Bot logged work on HIVE-24274:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 17:11
Start Date: 26/Oct/20 17:11
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1561:
URL: https://github.com/apache/hive/pull/1561#discussion_r512129871



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
##
@@ -472,7 +474,7 @@ public RelNode genLogicalPlan(ASTNode ast) throws 
SemanticException {
 }
 profilesCBO = obtainCBOProfiles(queryProperties);
 disableJoinMerge = true;
-final RelNode resPlan = logicalPlan();
+final RelNode resPlan = logicalPlan(ast);

Review comment:
   Changed to `SemanticAnalyzer.ast`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504808)
Remaining Estimate: 0h
Time Spent: 10m

> Implement Query Text based MaterializedView rewrite
> ---
>
> Key: HIVE-24274
> URL: https://issues.apache.org/jira/browse/HIVE-24274
> Project: Hive
>  Issue Type: Improvement
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Besides the way queries are currently rewritten to use materialized views in 
> Hive, this project provides an alternative:
> Compare the query text with the stored query text of the materialized views. 
> If we find a match, the original query's logical plan can be replaced by a 
> scan on the materialized view.
> - Only materialized views which are enabled for rewrite can participate.
> - Use the existing *HiveMaterializedViewsRegistry* through the *Hive* object 
> by adding a lookup method by query text.
> - There might be more than one materialized view with the same query text. In 
> this case choose the first valid one.
> - Validation can be done by calling 
> *Hive.validateMaterializedViewsFromRegistry()*.
> - The scope of this first patch is limited to queries whose entire text can 
> be matched.
> - Use the expanded query text (fully qualified column and table names) for 
> comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504755
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:02
Start Date: 26/Oct/20 15:02
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r511984518



##
File path: 
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
##
@@ -109,6 +109,20 @@ CREATE TABLE IF NOT EXISTS REPLICATION_METRICS (
 CREATE INDEX POLICY_IDX ON REPLICATION_METRICS (RM_POLICY);
 CREATE INDEX DUMP_IDX ON REPLICATION_METRICS (RM_DUMP_EXECUTION_ID);
 
+-- Create stored procedure tables
+CREATE TABLE STORED_PROCS (
+  `SP_ID` BIGINT(20) NOT NULL,
+  `CREATE_TIME` INT(11) NOT NULL,
+  `DB_ID` BIGINT(20) NOT NULL,
+  `NAME` VARCHAR(256) NOT NULL,
+  `OWNER_NAME` VARCHAR(128) NOT NULL,
+  `SOURCE` LONGTEXT NOT NULL,
+  PRIMARY KEY (`SP_ID`)
+);
+
+CREATE UNIQUE INDEX UNIQUESTOREDPROC ON STORED_PROCS (NAME, DB_ID);
+ALTER TABLE `STORED_PROCS` ADD CONSTRAINT `STOREDPROC_FK1` FOREIGN KEY 
(`DB_ID`) REFERENCES DBS (`DB_ID`);

Review comment:
   What will happen when the DB is dropped? Wouldn't this FK restrict 
the DB from being dropped?

##
File path: hplsql/src/main/java/org/apache/hive/hplsql/functions/Function.java
##
@@ -1,780 +1,30 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
  *
- * http://www.apache.org/licenses/LICENSE-2.0
+ *  * Licensed to the Apache Software Foundation (ASF) under one

Review comment:
   these lines start with `* *`

##
File path: 
hplsql/src/main/java/org/apache/hive/hplsql/functions/BuiltinFunctions.java
##
@@ -0,0 +1,435 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance
+ *  * with the License.  You may obtain a copy of the License at
+ *  *
+ *  * http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
+ *
+ */
+
+package org.apache.hive.hplsql.functions;
+
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.text.SimpleDateFormat;
+import java.util.Calendar;
+import java.util.HashMap;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import org.antlr.v4.runtime.ParserRuleContext;
+import org.apache.hive.hplsql.Exec;
+import org.apache.hive.hplsql.HplsqlParser;
+import org.apache.hive.hplsql.Query;
+import org.apache.hive.hplsql.Utils;
+import org.apache.hive.hplsql.Var;
+
+public class BuiltinFunctions {

Review comment:
   Do we really need to define these functions differently from the others? 
I've taken a look at `MIN_PART_STRING` and it seems like it's an ordinary 
function, so it could probably use the registry approach.

##
File path: hplsql/src/main/java/org/apache/hive/hplsql/functions/Function.java
##
@@ -1,780 +1,30 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
  *
- * http://www.apache.org/licenses/LICENSE-2.0
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance

[jira] [Work logged] (HIVE-19253) HMS ignores tableType property for external tables

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19253?focusedWorklogId=504760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504760
 ]

ASF GitHub Bot logged work on HIVE-19253:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:11
Start Date: 26/Oct/20 15:11
Worklog Time Spent: 10m 
  Work Description: szehonCriteo commented on pull request #1537:
URL: https://github.com/apache/hive/pull/1537#issuecomment-716611935


   @nrg4878 Hello Naveen, how are you?  As we're not super familiar with the 
spec of the MetastoreTransformer, do you think changing the test this way is 
OK?  I put an explanation above; it seems that before, the test was actually 
creating a MANAGED table instead of an EXTERNAL one.  Thanks



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504760)
Time Spent: 50m  (was: 40m)

> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API, they may think that 
> setting tableType to {{EXTERNAL_TABLE}} creates an external table. And boom: 
> their table is gone later because HMS silently changes it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>   MetaException {
> ...
> // If the table has property EXTERNAL set, update table type
> // accordingly
> String tableType = tbl.getTableType();
> boolean isExternal = 
> Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
> if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>   if (isExternal) {
> tableType = TableType.EXTERNAL_TABLE.toString();
>   }
> }
> if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>   if (!isExternal) { // Here!
> tableType = TableType.MANAGED_TABLE.toString();
>   }
> }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place, which is wrong.
> Moreover, in some places the code looks at the table property to decide the 
> table type, and in other places it looks at the parameter. HMS should really 
> make up its mind which one to use.
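One possible fix direction for the quoted logic can be sketched as below. This is an illustration, not the committed patch: the key change is that an explicitly declared `EXTERNAL_TABLE` type is honored even when the `EXTERNAL` parameter was never set, instead of being silently downgraded to managed.

```java
public class TableTypeResolver {

    /**
     * Resolves the effective table type from the declared type and the
     * optional "EXTERNAL" table parameter (may be null or absent).
     */
    public static String resolve(String declaredType, String externalParam) {
        // Boolean.parseBoolean is null-safe and returns false for null.
        boolean isExternal = Boolean.parseBoolean(externalParam);
        if ("MANAGED_TABLE".equals(declaredType) && isExternal) {
            // The EXTERNAL parameter may still upgrade a managed table.
            return "EXTERNAL_TABLE";
        }
        // Key change: an explicit EXTERNAL_TABLE declaration is kept even
        // when the EXTERNAL parameter is missing, rather than flipped to
        // MANAGED_TABLE as in the offending code above.
        return declaredType;
    }
}
```

Under this sketch, a Thrift client that sets only `tableType=EXTERNAL_TABLE` keeps its external table.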



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504773
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:41
Start Date: 26/Oct/20 15:41
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512060541



##
File path: 
hplsql/src/main/java/org/apache/hive/hplsql/functions/HmsFunction.java
##
@@ -0,0 +1,232 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance
+ *  * with the License.  You may obtain a copy of the License at
+ *  *
+ *  * http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
+ *
+ */
+
+package org.apache.hive.hplsql.functions;
+
+import static 
org.apache.hive.hplsql.functions.InMemoryFunction.setCallParameters;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+import org.antlr.v4.runtime.ANTLRInputStream;
+import org.antlr.v4.runtime.CommonTokenStream;
+import org.antlr.v4.runtime.ParserRuleContext;
+import org.apache.hadoop.hive.metastore.IMetaStoreClient;
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
+import org.apache.hadoop.hive.metastore.api.StoredProcedure;
+import org.apache.hadoop.hive.metastore.api.StoredProcedureRequest;
+import org.apache.hive.hplsql.Exec;
+import org.apache.hive.hplsql.HplsqlBaseVisitor;
+import org.apache.hive.hplsql.HplsqlLexer;
+import org.apache.hive.hplsql.HplsqlParser;
+import org.apache.hive.hplsql.Scope;
+import org.apache.hive.hplsql.Var;
+import org.apache.thrift.TException;
+
+public class HmsFunction implements Function {

Review comment:
   I agree, but this is how it used to work, and I didn't want to address 
these things as part of this patch. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504773)
Time Spent: 3h 50m  (was: 3h 40m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504772&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504772
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:41
Start Date: 26/Oct/20 15:41
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512060372



##
File path: 
hplsql/src/main/java/org/apache/hive/hplsql/functions/FunctionDatetime.java
##
@@ -36,20 +36,20 @@ public FunctionDatetime(Exec e) {
* Register functions
*/
   @Override
-  public void register(Function f) {
-f.map.put("DATE", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { date(ctx); }});
-f.map.put("FROM_UNIXTIME", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { fromUnixtime(ctx); }});
-f.map.put("NOW", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { now(ctx); }});
-f.map.put("TIMESTAMP_ISO", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { timestampIso(ctx); }});
-f.map.put("TO_TIMESTAMP", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { toTimestamp(ctx); }});
-f.map.put("UNIX_TIMESTAMP", new FuncCommand() { public void 
run(HplsqlParser.Expr_func_paramsContext ctx) { unixTimestamp(ctx); }});
+  public void register(BuiltinFunctions f) {
+f.map.put("DATE", this::date);

Review comment:
   This is a general issue; many things are package-private. We can 
refactor that little by little.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504772)
Time Spent: 3h 40m  (was: 3.5h)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504775
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:41
Start Date: 26/Oct/20 15:41
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512060858



##
File path: standalone-metastore/metastore-server/src/main/resources/package.jdo
##
@@ -1549,6 +1549,31 @@
 
   
 
+
+

Review comment:
   I feel like the similarity is accidental, and in the future we can expect 
more divergence between the two. We would likely have columns which are only 
applicable to one case and not the other. There are already things like 
className, resourceUri, and resourceType in MFunction. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504775)
Time Spent: 4h  (was: 3h 50m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504776
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:41
Start Date: 26/Oct/20 15:41
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512060986



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Exec.java
##
@@ -1659,13 +1665,70 @@ public Integer 
visitExpr_func(HplsqlParser.Expr_funcContext ctx) {
 }
 return 0;
   }
-  
+
+  /**
+   * User-defined function in a SQL query
+   */
+  public void execSql(String name, HplsqlParser.Expr_func_paramsContext ctx) {
+if (execUserSql(ctx, name)) {
+  return;
+}
+StringBuilder sql = new StringBuilder();
+sql.append(name);
+sql.append("(");
+if (ctx != null) {
+  int cnt = ctx.func_param().size();
+  for (int i = 0; i < cnt; i++) {
+sql.append(evalPop(ctx.func_param(i).expr()));
+if (i + 1 < cnt) {
+  sql.append(", ");
+}
+  }
+}
+sql.append(")");
+exec.stackPush(sql);
+  }
+
+  /**
+   * Execute a HPL/SQL user-defined function in a query
+   */
+  private boolean execUserSql(HplsqlParser.Expr_func_paramsContext ctx, String 
name) {
+if (!function.exists(name.toUpperCase())) {
+  return false;
+}
+StringBuilder sql = new StringBuilder();
+sql.append("hplsql('");

Review comment:
   I didn't check, since this is not new code, and fixing every issue in the 
existing code was not in the scope of this issue.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504776)
Time Spent: 4h 10m  (was: 4h)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files

2020-10-26 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-24302:
-
Summary: Cleaner shouldn't run if it can't remove obsolete files  (was: 
Cleaner should not mark compaction queue entry as cleaned if it doesn't remove 
obsolete files)

> Cleaner shouldn't run if it can't remove obsolete files
> ---
>
> Key: HIVE-24302
> URL: https://issues.apache.org/jira/browse/HIVE-24302
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>
> Example:
>  # open txn 5, leave it open (maybe it's a long-running compaction)
>  # insert into table t in txns 6, 7 with writeids 1, 2
>  # compactor.Worker runs on table t and compacts writeids 1, 2
>  # compactor.Cleaner picks up the compaction queue entry, but doesn't delete 
> any files because the min global open txnid is 5, which cannot see writeIds 
> 1, 2.
>  # Cleaner marks the compactor queue entry as cleaned and removes the entry 
> from the queue.
> delta_1 and delta_2 will remain in the file system until another compaction 
> is run on table t.
> Step 5 should not happen; we should skip calling markCleaned() and leave the 
> entry in the queue in the "ready to clean" state. markCleaned() should be 
> called only after txn 5 is closed and, following that, the cleaner runs 
> successfully.
> This will potentially slow down the cleaner, but on the other hand it won't 
> silently "fail", i.e. not do its job.
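The gating condition the description calls for can be sketched as follows, with hypothetical names (the real Cleaner compares transaction/write-id watermarks from the metastore): obsolete deltas may only be removed once no open transaction could still read them; otherwise the queue entry stays in the "ready to clean" state.

```java
public class CleanerGate {

    /**
     * @param minOpenTxnId        lowest transaction id still open globally
     * @param compactionCommitTxn highest transaction id whose writes the
     *                            compaction made obsolete
     * @return true when it is safe to delete obsolete files and call
     *         markCleaned(); false means the entry should stay queued
     */
    public static boolean canClean(long minOpenTxnId, long compactionCommitTxn) {
        // In the example above, txn 5 is still open while the compaction
        // covers writes committed in txns 6 and 7, so cleaning must wait
        // until txn 5 closes.
        return minOpenTxnId > compactionCommitTxn;
    }
}
```

With txn 5 open and the compaction covering txns 6-7, `canClean(5, 7)` is false and the entry is retried later; once the minimum open txn advances past 7, cleaning proceeds.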



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24302?focusedWorklogId=504793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504793
 ]

ASF GitHub Bot logged work on HIVE-24302:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 16:24
Start Date: 26/Oct/20 16:24
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1612:
URL: https://github.com/apache/hive/pull/1612


   # What changes were proposed in this pull request?
   If the global min open txnid blocks the cleaner for a certain table, then 
the cleaner will skip that compaction queue entry and leave it in the queue 
for later.
   
   ### Why are the changes needed?
   If the global min open txnid blocks the cleaner for a certain table, then 
the cleaner "runs", doesn't delete any files, and marks the compaction as 
succeeded as if everything were fine.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, if the global min open txnid blocks the cleaner for a certain table, 
then the cleaner will skip that compaction queue entry and leave it in the 
queue for later.
   
   ### How was this patch tested?
   Unit tests



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504793)
Remaining Estimate: 0h
Time Spent: 10m

> Cleaner shouldn't run if it can't remove obsolete files
> ---
>
> Key: HIVE-24302
> URL: https://issues.apache.org/jira/browse/HIVE-24302
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Example:
>  # open txn 5, leave it open (maybe it's a long-running compaction)
>  # insert into table t in txns 6, 7 with writeids 1, 2
>  # compactor.Worker runs on table t and compacts writeids 1, 2
>  # compactor.Cleaner picks up the compaction queue entry, but doesn't delete 
> any files because the min global open txnid is 5, which cannot see writeIds 
> 1, 2.
>  # Cleaner marks the compactor queue entry as cleaned and removes the entry 
> from the queue.
> delta_1 and delta_2 will remain in the file system until another compaction 
> is run on table t.
> Step 5 should not happen; we should skip calling markCleaned() and leave the 
> entry in the queue in the "ready to clean" state. markCleaned() should be 
> called only after txn 5 is closed and, following that, the cleaner runs 
> successfully.
> This will potentially slow down the cleaner, but on the other hand it won't 
> silently "fail", i.e. not do its job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24302:
--
Labels: pull-request-available  (was: )

> Cleaner shouldn't run if it can't remove obsolete files
> ---
>
> Key: HIVE-24302
> URL: https://issues.apache.org/jira/browse/HIVE-24302
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Example:
>  # open txn 5, leave it open (maybe it's a long-running compaction)
>  # insert into table t in txns 6, 7 with writeids 1, 2
>  # compactor.Worker runs on table t and compacts writeids 1, 2
>  # compactor.Cleaner picks up the compaction queue entry, but doesn't delete 
> any files because the min global open txnid is 5, which cannot see writeIds 
> 1, 2.
>  # Cleaner marks the compactor queue entry as cleaned and removes the entry 
> from the queue.
> delta_1 and delta_2 will remain in the file system until another compaction 
> is run on table t.
> Step 5 should not happen; we should skip calling markCleaned() and leave the 
> entry in the queue in the "ready to clean" state. markCleaned() should be 
> called only after txn 5 is closed and, following that, the cleaner runs 
> successfully.
> This will potentially slow down the cleaner, but on the other hand it won't 
> silently "fail", i.e. not do its job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24302?focusedWorklogId=504794&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504794
 ]

ASF GitHub Bot logged work on HIVE-24302:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 16:26
Start Date: 26/Oct/20 16:26
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #1612:
URL: https://github.com/apache/hive/pull/1612#issuecomment-716664237


   Not sure about this bit (since HIVE-23048)
   
   `\"CQ_NEXT_TXN_ID\" = "
   + "(SELECT MAX(\"TXN_ID\") + 1 FROM \"TXNS\" "`



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504794)
Time Spent: 20m  (was: 10m)

> Cleaner shouldn't run if it can't remove obsolete files
> ---
>
> Key: HIVE-24302
> URL: https://issues.apache.org/jira/browse/HIVE-24302
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Example:
>  # open txn 5, leave it open (maybe it's a long-running compaction)
>  # insert into table t in txns 6, 7 with writeids 1, 2
>  # compactor.Worker runs on table t and compacts writeids 1, 2
>  # compactor.Cleaner picks up the compaction queue entry, but doesn't delete 
> any files because the min global open txnid is 5, which cannot see writeIds 
> 1, 2.
>  # Cleaner marks the compactor queue entry as cleaned and removes the entry 
> from the queue.
> delta_1 and delta_2 will remain in the file system until another compaction 
> is run on table t.
> Step 5 should not happen; we should skip calling markCleaned() and leave the 
> entry in the queue in the "ready to clean" state. markCleaned() should be 
> called only after txn 5 is closed and, following that, the cleaner runs 
> successfully.
> This will potentially slow down the cleaner, but on the other hand it won't 
> silently "fail", i.e. not do its job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504767
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:38
Start Date: 26/Oct/20 15:38
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512058024



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/functions/Function.java
##
@@ -1,780 +1,30 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
  *
- * http://www.apache.org/licenses/LICENSE-2.0
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance
+ *  * with the License.  You may obtain a copy of the License at
+ *  *
+ *  * http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
  *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
  */
 
 package org.apache.hive.hplsql.functions;
 
-import java.sql.ResultSet;
-import java.sql.Date;
-import java.sql.SQLException;
-import java.text.SimpleDateFormat;
-import java.util.ArrayList;
-import java.util.Calendar;
-import java.util.HashMap;
-import java.util.Map;
-import java.util.TimeZone;
-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
+import org.apache.hive.hplsql.HplsqlParser;
 
-import org.apache.commons.lang3.StringUtils;
-import org.antlr.v4.runtime.ParserRuleContext;
-import org.apache.hive.hplsql.*;
-
-interface FuncCommand {
-  void run(HplsqlParser.Expr_func_paramsContext ctx);
-}
-
-interface FuncSpecCommand {
-  void run(HplsqlParser.Expr_spec_funcContext ctx);
-}
-
-/**
- * HPL/SQL functions
- */
-public class Function {
-  Exec exec;
-  HashMap<String, FuncCommand> map = new HashMap<String, FuncCommand>();  
-  HashMap<String, FuncSpecCommand> specMap = new HashMap<String, FuncSpecCommand>();
-  HashMap<String, FuncSpecCommand> specSqlMap = new HashMap<String, FuncSpecCommand>();
-  HashMap<String, HplsqlParser.Create_function_stmtContext> userMap = new 
HashMap<String, HplsqlParser.Create_function_stmtContext>();
-  HashMap<String, HplsqlParser.Create_procedure_stmtContext> procMap = new 
HashMap<String, HplsqlParser.Create_procedure_stmtContext>();
-  boolean trace = false; 
-  
-  public Function(Exec e) {
-exec = e;  
-trace = exec.getTrace();
-  }
-  
-  /** 
-   * Register functions
-   */
-  public void register(Function f) {
-  }
-  
-  /**
-   * Execute a function
-   */
-  public void exec(String name, HplsqlParser.Expr_func_paramsContext ctx) {
-if (execUser(name, ctx)) {
-  return;
-}
-else if (isProc(name) && execProc(name, ctx, null)) {
-  return;
-}
-if (name.indexOf(".") != -1) {   // Name can be qualified and 
spaces are allowed between parts
-  String[] parts = name.split("\\.");
-  StringBuilder str = new StringBuilder();
-  for (int i = 0; i < parts.length; i++) {
-if (i > 0) {
-  str.append(".");
-}
-str.append(parts[i].trim());
-  }
-  name = str.toString();  
-} 
-if (trace && ctx != null && ctx.parent != null && ctx.parent.parent 
instanceof HplsqlParser.Expr_stmtContext) {
-  trace(ctx, "FUNC " + name);  
-}
-FuncCommand func = map.get(name.toUpperCase());
-if (func != null) {
-  func.run(ctx);
-}
-else {
-  info(ctx, "Function not found: " + name);
-  evalNull();
-}
-  }
-  
-  /**
-   * User-defined function in a SQL query
-   */
-  public void execSql(String name, HplsqlParser.Expr_func_paramsContext ctx) {
-if (execUserSql(ctx, name)) {
-  return;
-}
-StringBuilder sql = new StringBuilder();
-sql.append(name);
-sql.append("(");
-if (ctx != null) {
-  int cnt = ctx.func_param().size();
-  for (int i = 0; i < cnt; i++) {
-

[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504768&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504768
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:38
Start Date: 26/Oct/20 15:38
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512058291



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/functions/Function.java
##
@@ -1,780 +1,30 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
  *
- * http://www.apache.org/licenses/LICENSE-2.0
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance
+ *  * with the License.  You may obtain a copy of the License at
+ *  *
+ *  * http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
  *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
  */
 
 package org.apache.hive.hplsql.functions;
 
-import java.sql.ResultSet;
-import java.sql.Date;
-import java.sql.SQLException;
-import java.text.SimpleDateFormat;
-import java.util.ArrayList;
-import java.util.Calendar;
-import java.util.HashMap;
-import java.util.Map;
-import java.util.TimeZone;
-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
+import org.apache.hive.hplsql.HplsqlParser;
 
-import org.apache.commons.lang3.StringUtils;
-import org.antlr.v4.runtime.ParserRuleContext;
-import org.apache.hive.hplsql.*;
-
-interface FuncCommand {
-  void run(HplsqlParser.Expr_func_paramsContext ctx);
-}
-
-interface FuncSpecCommand {
-  void run(HplsqlParser.Expr_spec_funcContext ctx);
-}
-
-/**
- * HPL/SQL functions
- */
-public class Function {
-  Exec exec;
-  HashMap<String, FuncCommand> map = new HashMap<String, FuncCommand>();  
-  HashMap<String, FuncSpecCommand> specMap = new HashMap<String, FuncSpecCommand>();
-  HashMap<String, FuncSpecCommand> specSqlMap = new HashMap<String, FuncSpecCommand>();
-  HashMap<String, HplsqlParser.Create_function_stmtContext> userMap = new 
HashMap<String, HplsqlParser.Create_function_stmtContext>();
-  HashMap<String, HplsqlParser.Create_procedure_stmtContext> procMap = new 
HashMap<String, HplsqlParser.Create_procedure_stmtContext>();
-  boolean trace = false; 
-  
-  public Function(Exec e) {
-exec = e;  
-trace = exec.getTrace();
-  }
-  
-  /** 
-   * Register functions
-   */
-  public void register(Function f) {
-  }
-  
-  /**
-   * Execute a function
-   */
-  public void exec(String name, HplsqlParser.Expr_func_paramsContext ctx) {
-if (execUser(name, ctx)) {
-  return;
-}
-else if (isProc(name) && execProc(name, ctx, null)) {
-  return;
-}
-if (name.indexOf(".") != -1) {   // Name can be qualified and 
spaces are allowed between parts
-  String[] parts = name.split("\\.");
-  StringBuilder str = new StringBuilder();
-  for (int i = 0; i < parts.length; i++) {
-if (i > 0) {
-  str.append(".");
-}
-str.append(parts[i].trim());
-  }
-  name = str.toString();  
-} 
-if (trace && ctx != null && ctx.parent != null && ctx.parent.parent 
instanceof HplsqlParser.Expr_stmtContext) {
-  trace(ctx, "FUNC " + name);  
-}
-FuncCommand func = map.get(name.toUpperCase());
-if (func != null) {
-  func.run(ctx);
-}
-else {
-  info(ctx, "Function not found: " + name);
-  evalNull();
-}
-  }
-  
-  /**
-   * User-defined function in a SQL query
-   */
-  public void execSql(String name, HplsqlParser.Expr_func_paramsContext ctx) {
-if (execUserSql(ctx, name)) {
-  return;
-}
-StringBuilder sql = new StringBuilder();
-sql.append(name);
-sql.append("(");
-if (ctx != null) {
-  int cnt = ctx.func_param().size();
-  for (int i = 0; i < cnt; i++) {
-

[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504770&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504770
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:39
Start Date: 26/Oct/20 15:39
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512058846



##
File path: 
hplsql/src/main/java/org/apache/hive/hplsql/functions/FunctionDatetime.java
##
@@ -27,7 +27,7 @@
 import org.apache.commons.lang3.StringUtils;
 import org.apache.hive.hplsql.*;
 
-public class FunctionDatetime extends Function {
+public class FunctionDatetime extends BuiltinFunctions {

Review comment:
   It used to extend Function, which was a class before. I didn't really 
change how it works: it still uses implementation inheritance, which I 
personally don't like, but I didn't want to change that as part of this 
patch. We might want to move built-in functions into the DB later on, making 
this class unnecessary in the future.
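For illustration, the composition-based alternative to implementation inheritance mentioned here (function classes registering into a shared registry instead of extending a base class) could look like the following sketch. All names are hypothetical and not part of the HPL/SQL codebase:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// A registry the built-in function groups register into; callers delegate
// lookup and invocation to it rather than inheriting its internals.
class BuiltinRegistry {
    private final Map<String, Function<String, String>> fns = new HashMap<>();

    void register(String name, Function<String, String> fn) {
        fns.put(name.toUpperCase(), fn);  // case-insensitive lookup key
    }

    String call(String name, String arg) {
        Function<String, String> fn = fns.get(name.toUpperCase());
        if (fn == null) {
            throw new IllegalArgumentException("Function not found: " + name);
        }
        return fn.apply(arg);
    }
}

public class RegistrySketch {
    public static void main(String[] args) {
        BuiltinRegistry registry = new BuiltinRegistry();
        // a function group registers its entries instead of subclassing
        registry.register("upper", s -> s.toUpperCase());
        System.out.println(registry.call("UPPER", "hplsql")); // prints HPLSQL
    }
}
```

With this shape, moving built-ins into the DB later would mean swapping the registry's backing store without touching the function classes.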





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504770)
Time Spent: 3.5h  (was: 3h 20m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504769&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504769
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:38
Start Date: 26/Oct/20 15:38
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512058576



##
File path: 
hplsql/src/main/java/org/apache/hive/hplsql/functions/BuiltinFunctions.java
##
@@ -0,0 +1,435 @@
+/*
+ *
+ *  * Licensed to the Apache Software Foundation (ASF) under one
+ *  * or more contributor license agreements.  See the NOTICE file
+ *  * distributed with this work for additional information
+ *  * regarding copyright ownership.  The ASF licenses this file
+ *  * to you under the Apache License, Version 2.0 (the
+ *  * "License"); you may not use this file except in compliance
+ *  * with the License.  You may obtain a copy of the License at
+ *  *
+ *  * http://www.apache.org/licenses/LICENSE-2.0
+ *  *
+ *  * Unless required by applicable law or agreed to in writing, software
+ *  * distributed under the License is distributed on an "AS IS" BASIS,
+ *  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  * See the License for the specific language governing permissions and
+ *  * limitations under the License.
+ *
+ */
+
+package org.apache.hive.hplsql.functions;
+
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.text.SimpleDateFormat;
+import java.util.Calendar;
+import java.util.HashMap;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import org.antlr.v4.runtime.ParserRuleContext;
+import org.apache.hive.hplsql.Exec;
+import org.apache.hive.hplsql.HplsqlParser;
+import org.apache.hive.hplsql.Query;
+import org.apache.hive.hplsql.Utils;
+import org.apache.hive.hplsql.Var;
+
+public class BuiltinFunctions {

Review comment:
   This was just extracted from the existing code; it's not a new 
thing. I think this is out of scope for now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504769)
Time Spent: 3h 20m  (was: 3h 10m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=504777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-504777
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 26/Oct/20 15:42
Start Date: 26/Oct/20 15:42
Worklog Time Spent: 10m 
  Work Description: zeroflag commented on a change in pull request #1542:
URL: https://github.com/apache/hive/pull/1542#discussion_r512061184



##
File path: hplsql/src/main/java/org/apache/hive/hplsql/Exec.java
##
@@ -799,30 +801,35 @@ Integer init(String[] args) throws Exception {
 select = new Select(this);
 stmt = new Stmt(this);
 converter = new Converter(this);
-
-function = new Function(this);
-new FunctionDatetime(this).register(function);
-new FunctionMisc(this).register(function);
-new FunctionString(this).register(function);
-new FunctionOra(this).register(function);
+
+builtinFunctions = new BuiltinFunctions(this);
+new FunctionDatetime(this).register(builtinFunctions);
+new FunctionMisc(this).register(builtinFunctions);
+new FunctionString(this).register(builtinFunctions);
+new FunctionOra(this).register(builtinFunctions);
+if ("hms".equalsIgnoreCase(System.getProperty("hplsql.storage"))) {
+  function = new HmsFunction(this, 
getMsc(System.getProperty("hplsq.metastore.uris", "thrift://localhost:9083")), 
builtinFunctions);

Review comment:
   This part was removed in a subsequent patch.
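The quoted diff picks a function-storage backend from a JVM system property. A sketch of that pattern, with stand-in class names (the reviewer notes this selection logic did not survive into later revisions of the patch, so this is illustrative only):

```java
// Illustrative sketch: choose a storage backend from -Dhplsql.storage,
// defaulting to an in-memory implementation. Class and property names are
// hypothetical stand-ins, not the final Hive API.
public class BackendSelectionSketch {
    interface FunctionStorage { String name(); }

    static class InMemoryStorage implements FunctionStorage {
        public String name() { return "in-memory"; }
    }

    static class HmsStorage implements FunctionStorage {
        private final String uris;
        HmsStorage(String uris) { this.uris = uris; }
        public String name() { return "hms @ " + uris; }
    }

    static FunctionStorage pickStorage() {
        // mirrors: "hms".equalsIgnoreCase(System.getProperty("hplsql.storage"))
        if ("hms".equalsIgnoreCase(System.getProperty("hplsql.storage"))) {
            return new HmsStorage(
                System.getProperty("hplsql.metastore.uris",
                                   "thrift://localhost:9083"));
        }
        return new InMemoryStorage();
    }

    public static void main(String[] args) {
        // prints "in-memory" unless run with -Dhplsql.storage=hms
        System.out.println(pickStorage().name());
    }
}
```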





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 504777)
Time Spent: 4h 20m  (was: 4h 10m)

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)