[jira] [Commented] (HIVE-26610) Upgrade calcite-core to 1.32.0

2022-12-13 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17646953#comment-17646953
 ] 

Anishek Agarwal commented on HIVE-26610:


[~zabetak] the 
[CVE-2020-13955|https://cve.mitre.org/cgi-bin/cvename.cgi?name=2020-13955] is 
not relevant, since Calcite does not talk to an external system for 
computations in Hive, correct? The CVE also mentions only Splunk and Druid; 
does it apply when Druid data is served via Hive?

> Upgrade calcite-core to 1.32.0
> --
>
> Key: HIVE-26610
> URL: https://issues.apache.org/jira/browse/HIVE-26610
> Project: Hive
>  Issue Type: Bug
>Reporter: Sai Hemanth Gantasala
>Assignee: Stamatis Zampetakis
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-25857) Replication fails in case of Control Character in the table description

2022-01-17 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-25857.

Resolution: Fixed

+1, merged to master. Thanks for the patch [~ayushtkn]

> Replication fails in case of Control Character in the table description
> ---
>
> Key: HIVE-25857
> URL: https://issues.apache.org/jira/browse/HIVE-25857
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If there is a control character in the table metadata, the LOAD fails 
> while decoding the JSON.
> *Exception:*
> {noformat}
> Caused by: com.fasterxml.jackson.core.JsonParseException: Illegal unquoted 
> character ((CTRL-CHAR, code 24)): has to be escaped using backslash to be 
> included in string value
>  at [Source: 
> (String)"{"server":"","servicePrincipal":"","db":"sampletestreplic","table":"testlmo","tableType":"MANAGED_TABLE","tableObjBeforeJson":"{\"1\":{\"str\":\"testlmo\"},\"2\":{\"str\":\"sampletestreplic\"},\"3\":{\"str\":\"hive\"},\"4\":{\"i32\":1641717786},\"5\":{\"i32\":0},\"6\":{\"i32\":0},\"7\":{\"rec\":{\"1\":{\"lst\":[\"rec\",1,{\"1\":{\"str\":\"dc_codeacteurcandidat\"},\"2\":{\"str\":\"string\"},\"3\":{\"str\":\"Code
>  de l'acteur de candidature (^XA' a dterminer, ^XC' conseiller ou ^XD' 
> candidat)\"}}]},\"[truncated 3054 chars]; line: 1, column: 445]
>         at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1840) 
> ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:712)
>  ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.core.base.ParserBase._throwUnquotedSpace(ParserBase.java:1046)
>  ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString2(ReaderBasedJsonParser.java:2073)
>  ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.core.json.ReaderBasedJsonParser._finishString(ReaderBasedJsonParser.java:2044)
>  ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.core.json.ReaderBasedJsonParser.getText(ReaderBasedJsonParser.java:293)
>  ~[jackson-core-2.10.5.jar:2.10.5]
>         at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:35)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:10)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.deser.impl.FieldProperty.deserializeAndSet(FieldProperty.java:138)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:288)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:151)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4218)
>  ~[jackson-databind-2.10.5.1.jar:2.10.5.1]
>         at 
> com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3214) 
>         at 
> com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3182) 
>         at 
> org.apache.hadoop.hive.metastore.messaging.json.JSONMessageDeserializer.getAlterTableMessage(JSONMessageDeserializer.java:111)
>  
>         at 
> org.apache.hadoop.hive.ql.parse.repl.load.message.TableHandler.extract(TableHandler.java:111)]
>         at 
> org.apache.hadoop.hive.ql.parse.repl.load.message.TableHandler.handle(TableHandler.java:51)
>  
>         at 
> org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.analyzeEventLoad(IncrementalLoadTasksBuilder.java:213){noformat}
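
The exception above comes from a JSON rule: RFC 8259 requires characters U+0000 through U+001F to be escaped inside a string, and character code 24 (0x18) falls in that range. The following is a minimal sketch of such escaping using only the JDK. The class and method names are hypothetical, and this is an illustration of the rule only, not the change that was merged for this issue.

```java
public class JsonControlCharEscaper {

    /**
     * Escapes a raw string so that it can be placed between double quotes
     * in a JSON document. Quotes and backslashes get a backslash prefix;
     * control characters below 0x20 are written as \\uXXXX escapes.
     */
    public static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                // Control character: emit a six-character unicode escape.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Character code 24 is the one reported in the stack trace.
        String description = "candidat (" + (char) 24 + "A' ou " + (char) 24 + "D')";
        System.out.println(escape(description));
    }
}
```

With the control characters escaped this way, a strict JSON parser sees only printable characters inside the string value.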



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25857) Replication fails in case of Control Character in the table description

2022-01-17 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477593#comment-17477593
 ] 

Anishek Agarwal commented on HIVE-25857:


Not sure why the git change didn't get linked automatically: 
https://github.com/apache/hive/pull/2935

> Replication fails in case of Control Character in the table description
> ---
>
> Key: HIVE-25857
> URL: https://issues.apache.org/jira/browse/HIVE-25857
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> If there is a control character in the table metadata, the LOAD fails 
> while decoding the JSON.





[jira] [Assigned] (HIVE-25829) Tez exec mode support for credential provider for jobs

2021-12-22 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-25829:
--

Assignee: László Bodor

> Tez exec mode support for credential provider for jobs
> --
>
> Key: HIVE-25829
> URL: https://issues.apache.org/jira/browse/HIVE-25829
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Ádám Szita
>Assignee: László Bodor
>Priority: Major
>
> HIVE-14822 introduced support to securely forward a job specific java 
> credential store path, and a corresponding password to the backend executors. 
> This is currently implemented for only MR2 and Spark execution engines. I 
> propose we extend this feature by adding Tez mode to said list.





[jira] [Assigned] (HIVE-25720) Fix flaky test TestScheduledReplicationScenarios

2021-11-18 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-25720:
--

Assignee: Haymant Mangla

> Fix flaky test TestScheduledReplicationScenarios
> 
>
> Key: HIVE-25720
> URL: https://issues.apache.org/jira/browse/HIVE-25720
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Haymant Mangla
>Priority: Major
>
> The test failed at the first attempt; the issue happened during
> {code}
> drop scheduled query repl_load_p2
> {code}
> which is in a finally block, so this exception may be shadowing another 
> exception.
> http://ci.hive.apache.org/job/hive-flaky-check/463/
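
The shadowing described above is standard Java behaviour: when a finally block throws, the exception from the try block is discarded. Below is a minimal sketch that contrasts the shadowing case with one way of keeping the original failure visible via addSuppressed. All class, method, and message names are hypothetical and are not taken from TestScheduledReplicationScenarios.

```java
public class FinallyShadowing {

    /** Stand-in for a cleanup step (such as dropping a scheduled query) that fails. */
    public static void cleanup() {
        throw new UnsupportedOperationException("cleanup failed");
    }

    /** The cleanup failure replaces the failure thrown by the try block. */
    public static void shadowing() {
        try {
            throw new IllegalStateException("test body failed");
        } finally {
            cleanup();
        }
    }

    /** The original failure propagates; the cleanup failure is attached to it. */
    public static void preserving() {
        RuntimeException primary = null;
        try {
            throw new IllegalStateException("test body failed");
        } catch (RuntimeException e) {
            primary = e;
            throw e;
        } finally {
            try {
                cleanup();
            } catch (RuntimeException cleanupFailure) {
                if (primary == null) {
                    throw cleanupFailure;
                }
                primary.addSuppressed(cleanupFailure);
            }
        }
    }
}
```

In the second variant the caller sees the IllegalStateException, with the cleanup failure available through getSuppressed().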





[jira] [Commented] (HIVE-25336) Use single call to get tables in DropDatabaseAnalyzer

2021-07-15 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381205#comment-17381205
 ] 

Anishek Agarwal commented on HIVE-25336:


I don't think you can get all table definitions at once; that can lead to 
memory pressure on HMS. Some form of batching, or lazy loading, would make 
sense, however.

> Use single call to get tables in DropDatabaseAnalyzer
> -
>
> Key: HIVE-25336
> URL: https://issues.apache.org/jira/browse/HIVE-25336
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Optimise 
> org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseAnalyzer.analyzeInternal(DropDatabaseAnalyzer.java:61),
>  where it fetches entire tables one by one. Move to a single call. This could 
> save 20+ seconds when a large number of tables are present.
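
The batching suggested in the comment sits between the two extremes of one call per table and one call for everything. A minimal sketch of that idea follows, assuming a hypothetical handler that stands in for a metastore call taking a list of table names; it is not the DropDatabaseAnalyzer code and names no real Hive API.

```java
import java.util.List;
import java.util.function.Consumer;

public class BatchedFetch {

    /**
     * Splits the table names into consecutive batches of at most batchSize
     * and passes each batch to the handler. Returns the number of handler
     * invocations, that is, the number of round trips a real client would make.
     */
    public static int forEachBatch(List<String> tableNames, int batchSize,
                                   Consumer<List<String>> handleBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < tableNames.size(); start += batchSize) {
            int end = Math.min(start + batchSize, tableNames.size());
            // subList is a view, so no names are copied per batch.
            handleBatch.accept(tableNames.subList(start, end));
            calls++;
        }
        return calls;
    }
}
```

With 7 table names and a batch size of 3, the handler is invoked 3 times with batches of 3, 3, and 1 names, so only one batch of table definitions needs to be held at a time.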



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25159) Remove support for ordered results in llap external client library

2021-06-02 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-25159.

Resolution: Fixed

+1, patch committed to master. Thanks for the patch [~ShubhamChaurasia] and 
the review [~harishjp]


> Remove support for ordered results in llap external client library
> --
>
> Key: HIVE-25159
> URL: https://issues.apache.org/jira/browse/HIVE-25159
> Project: Hive
>  Issue Type: Bug
>  Components: Clients, Hive
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25159.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when querying via llap external client framework, in case of order 
> by queries -
> 1. Due to the fact that spark-llap used to wrap actual query in a subquery as 
> mentioned in [HIVE-19794|https://issues.apache.org/jira/browse/HIVE-19794]
> a) We had to detect order by like - 
> {code}
> orderByQuery = plan.getQueryProperties().hasOrderBy() || 
> plan.getQueryProperties().hasOuterOrderBy();
> {code}
> Because of this, we recently saw an exception like the one below for a query 
> that did not have an outer order by (it had an order by in a subquery):
> {code}
> org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException:
> java.lang.IllegalStateException: Requested to generate single split. Paths 
> and fileStatuses are expected to be 1. Got paths: 1 fileStatuses: 7
> {code}
> b) Also we had to disable following optimization - 
> {code}
> HiveConf.setBoolVar(conf, ConfVars.HIVE_REMOVE_ORDERBY_IN_SUBQUERY, false);
> {code}
> 2. By default we have 
> {{hive.llap.external.splits.order.by.force.single.split=true}} which forces 
> us to generate a single split, leading to a performance bottleneck. 
> We should remove ordering support altogether from the llap external client 
> repo and let clients handle it at their end.





[jira] [Commented] (HIVE-24895) Add a DataCopyEnd stage in ReplStateLogTask for external table replication

2021-03-17 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303845#comment-17303845
 ] 

Anishek Agarwal commented on HIVE-24895:


Wouldn't that also go in the DirCopyTask execute() -> finally block? We would 
have to pass the repl context to make sure it gets logged only then, or maybe 
not, and just log these at debug level anyway, even for regular ones.

> Add a DataCopyEnd stage in ReplStateLogTask for external table replication
> --
>
> Key: HIVE-24895
> URL: https://issues.apache.org/jira/browse/HIVE-24895
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a task to mark the end of external table copy.





[jira] [Commented] (HIVE-24895) Add a DataCopyEnd task for external table replication

2021-03-17 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303833#comment-17303833
 ] 

Anishek Agarwal commented on HIVE-24895:


I thought the plan was to use the existing repl loggers for this as well. Any 
reason this went task-based?

> Add a DataCopyEnd task for external table replication
> -
>
> Key: HIVE-24895
> URL: https://issues.apache.org/jira/browse/HIVE-24895
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a task to mark the end of external table copy.





[jira] [Commented] (HIVE-24881) Abort old open replication txns

2021-03-14 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301364#comment-17301364
 ] 

Anishek Agarwal commented on HIVE-24881:


[~klcopp] are you talking about replication txns on the target? 

These txns will be mapped by the replication system. Since replication happens 
based on event streams, we cannot partially abort these txns: we would not 
know where in the event stream to go back to for replay, and it would also 
break the linear sequence of events.

Can you please elaborate a bit more on the use case affected, or on what we 
are trying to achieve?

cc [~thejas]/[~aasha]/ [~pkumarsinha]


> Abort old open replication txns
> ---
>
> Key: HIVE-24881
> URL: https://issues.apache.org/jira/browse/HIVE-24881
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should auto-abort/remove open replication txns that are older than a time 
> threshold (default: 24h).





[jira] [Updated] (HIVE-24640) Add error message in hive proto logger for failed queries.

2021-02-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24640:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, thanks for the patch [~harishjp]

> Add error message in hive proto logger for failed queries.
> --
>
> Key: HIVE-24640
> URL: https://issues.apache.org/jira/browse/HIVE-24640
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Harish JP
>Assignee: Harish JP
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-24640.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The error message is missing from the failed event in HiveProtoLoggingHook; 
> extract this information from the context and add it when available.





[jira] [Commented] (HIVE-24649) Optimise Hive::addWriteNotificationLog for large data inserts

2021-01-18 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267074#comment-17267074
 ] 

Anishek Agarwal commented on HIVE-24649:


Actually, the method has a few failure points, for example:


{code:java}

getMSC().addDynamicPartitions(parentSession.getTxnMgr().getCurrentTxnId(), 
writeId,
tbl.getDbName(), tbl.getTableName(), partNames,
AcidUtils.toDataOperationType(operation));

{code}

If it fails, the whole state is not reverted: there are now changes in the 
partitions and respective tables in the RDBMS but not in the txn_components 
table. How does this affect the read view of other txns, etc.? cc [~pvary]



> Optimise Hive::addWriteNotificationLog for large data inserts
> -
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When loading dynamic partitions with a large dataset, a lot of time is spent in 
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Though it is all for the same table, it ends up loading the table and partition 
> details for every partition and writes to the notification log.
> Also, "Partition" details may already be present in the {{PartitionDetails}} 
> object in {{Hive::loadDynamicPartitions}}. This is unnecessarily recomputed 
> in {{HiveMetaStore::add_write_notification_log}}.
>  
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>  





[jira] [Commented] (HIVE-24649) Optimise Hive::addWriteNotificationLog for large data inserts

2021-01-17 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267006#comment-17267006
 ] 

Anishek Agarwal commented on HIVE-24649:


For now, the partitions get additional data such as the catalog name added in 
the HMS call, which is not present on the HS2 side; other things may be added 
internally later (which is difficult to predict). At most, I think we can 
probably prevent reloading of the table object. This is also across the HS2 
and HMS boundary, so it would be better if caching of metadata were enabled on 
HMS; that way the round trip to the RDBMS would be small. Another possible way 
is to return the list of partitions from {{addPartitionsToMetastore}}, but 
that means a lot of network round trips: from HMS to HS2 and back to HMS in 
addWriteNotificationLog.

cc [~aasha]/[~pkumarsinha]

> Optimise Hive::addWriteNotificationLog for large data inserts
> -
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When loading dynamic partitions with a large dataset, a lot of time is spent in 
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Though it is all for the same table, it ends up loading the table and partition 
> details for every partition and writes to the notification log.
> Also, "Partition" details may already be present in the {{PartitionDetails}} 
> object in {{Hive::loadDynamicPartitions}}. This is unnecessarily recomputed 
> in {{HiveMetaStore::add_write_notification_log}}.
>  
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>  





[jira] [Resolved] (HIVE-24526) Get grouped locations of external table data using metatool.

2021-01-05 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24526.

Resolution: Fixed

merged to master, thanks for the patch and review.

> Get grouped locations of external table data using metatool.
> 
>
> Key: HIVE-24526
> URL: https://issues.apache.org/jira/browse/HIVE-24526
> Project: Hive
>  Issue Type: Task
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24526.01.patch, HIVE-24526.02.patch, 
> HIVE-24526.03.patch, HIVE-24526.04.patch, HIVE-24526.05.patch, 
> HIVE-24526.06.patch, HIVE-24526.07.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This task adds two new options to metatool.
> The first option, -listExtTblLocs, generates a JSON file containing a set of 
> locations which cover all external-table data locations for a database 
> specified by the user.
> The second option, -diffExtTblLocs, creates a diff from two JSON files 
> generated using the first option.





[jira] [Updated] (HIVE-24502) Store table level regular expression used during dump for table level replication

2021-01-03 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24502:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks for the patch [~aasha] and review [~pkumarsinha]

> Store table level regular expression used during dump for table level 
> replication
> -
>
> Key: HIVE-24502
> URL: https://issues.apache.org/jira/browse/HIVE-24502
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24502.01.patch, HIVE-24502.02.patch, 
> HIVE-24502.03.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Store the include table list and exclude table list as part of the dump metadata file.





[jira] [Commented] (HIVE-24468) Use Event Time instead of Current Time in Notification Log DB Entry

2020-12-02 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242465#comment-17242465
 ] 

Anishek Agarwal commented on HIVE-24468:


+1

> Use Event Time instead of Current Time in Notification Log DB Entry
> ---
>
> Key: HIVE-24468
> URL: https://issues.apache.org/jira/browse/HIVE-24468
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-24417) Add config options for Atlas and Ranger client timeouts

2020-12-01 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241429#comment-17241429
 ] 

Anishek Agarwal commented on HIVE-24417:


Patch committed to master, Thanks for the patch [~pkumarsinha] and review 
[~aasha]

> Add config options for Atlas and Ranger client timeouts
> ---
>
> Key: HIVE-24417
> URL: https://issues.apache.org/jira/browse/HIVE-24417
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24417.01.patch, HIVE-24417.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24417) Add config options for Atlas and Ranger client timeouts

2020-12-01 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24417:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add config options for Atlas and Ranger client timeouts
> ---
>
> Key: HIVE-24417
> URL: https://issues.apache.org/jira/browse/HIVE-24417
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24417.01.patch, HIVE-24417.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-24456) Column masking/hashing function in hive should use SH512 if FIPS mode is enabled

2020-11-30 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241303#comment-17241303
 ] 

Anishek Agarwal commented on HIVE-24456:


Just wondering: why not move to these hash functions by default? Is there 
significant performance overhead here?

> Column masking/hashing function in hive should use SH512 if FIPS mode is 
> enabled
> 
>
> Key: HIVE-24456
> URL: https://issues.apache.org/jira/browse/HIVE-24456
> Project: Hive
>  Issue Type: Wish
>  Components: HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> hive-site.xml should have the following property to indicate that FIPS mode 
> is enabled.
> <property>
>     <name>hive.masking.algo</name>
>     <value>sha512</value>
> </property>
> If this property is present, then GenericUDFMaskHash should use SHA512 
> instead of SHA256 encoding for column masking.
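
The switch described above can be sketched with the JDK's MessageDigest, which accepts the standard algorithm names "SHA-256" and "SHA-512". The class and method below are hypothetical and only illustrate choosing the digest from the configured value; they are not the GenericUDFMaskHash implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MaskHashSketch {

    /**
     * Hashes the value with SHA-512 when the configured algorithm is "sha512"
     * (case-insensitive) and with SHA-256 otherwise, including when the
     * setting is absent (null). Returns the digest as lowercase hex.
     */
    public static String maskHash(String value, String configuredAlgo) {
        String jcaName = "sha512".equalsIgnoreCase(configuredAlgo) ? "SHA-512" : "SHA-256";
        try {
            MessageDigest digest = MessageDigest.getInstance(jcaName);
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Both algorithms are required on every Java platform, so this is unexpected.
            throw new IllegalStateException(jcaName + " is not available", e);
        }
    }
}
```

The SHA-512 result is 128 hex characters long, against 64 for SHA-256, so any column or downstream consumer that stores masked values would need to allow for the longer output.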





[jira] [Commented] (HIVE-24450) DbNotificationListener Request Notification IDs in Batches

2020-11-30 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241238#comment-17241238
 ] 

Anishek Agarwal commented on HIVE-24450:


[~belugabehr] you can't get sequence IDs in blocks; replication will not work. 
It has to be one at a time. 

cc [~thejas]/[~aasha]/[~pkumarsinha]

> DbNotificationListener Request Notification IDs in Batches
> --
>
> Key: HIVE-24450
> URL: https://issues.apache.org/jira/browse/HIVE-24450
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Every time a new notification event is logged into the database, the sequence 
> number for the ID of the event is incremented by one.  It is very standard in 
> database design to instead request a block of IDs for each fetch from the 
> database.  The sequence numbers are then handed out locally until the block 
> of IDs is exhausted.  This allows for fewer database round-trips and 
> transactions, at the expense of perhaps burning a few IDs.
> Burning of IDs happens when the server is restarted in the middle of a block 
> of sequence IDs.  That is, if the HMS requests a block of 10 ids, and only 
> three have been assigned, after the restart, the HMS will request another 
> block of 10, burning (wasting) 7 IDs.  As long as the blocks are not too 
> small, and restarts are infrequent, then few IDs are lost.
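
The block allocation and ID burning described in the issue can be sketched as follows. This is a minimal in-memory model: an AtomicLong stands in for the persisted database sequence, and the class name is hypothetical. It is not DbNotificationListener code and says nothing about whether replication tolerates the resulting gaps.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BlockIdAllocator {

    private final AtomicLong databaseSequence; // stand-in for the persisted sequence
    private final int blockSize;
    private long next = 1;      // next ID to hand out locally
    private long blockEnd = 0;  // last ID of the current block; 0 means no block yet

    public BlockIdAllocator(AtomicLong databaseSequence, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.databaseSequence = databaseSequence;
        this.blockSize = blockSize;
    }

    /** Hands out IDs locally and touches the shared sequence once per block. */
    public synchronized long nextId() {
        if (next > blockEnd) {
            // One "round trip": reserve the next blockSize IDs.
            blockEnd = databaseSequence.addAndGet(blockSize);
            next = blockEnd - blockSize + 1;
        }
        return next++;
    }
}
```

Following the example in the description: with a block size of 10, an allocator that hands out IDs 1, 2, and 3 and is then replaced by a fresh allocator (modelling a restart) leaves the sequence at 10, so the new allocator starts at 11 and IDs 4 through 10 are never used.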





[jira] [Commented] (HIVE-24432) Delete Notification Events in Batches

2020-11-29 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240469#comment-17240469
 ] 

Anishek Agarwal commented on HIVE-24432:


Rather than loading them and deleting in different txns, wouldn't it be better 
to do this via a single direct SQL statement?

> Delete Notification Events in Batches
> -
>
> Key: HIVE-24432
> URL: https://issues.apache.org/jira/browse/HIVE-24432
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Notification events are loaded in batches (which reduces memory pressure on 
> the HMS), but all of the deletes happen under a single transaction and, when 
> deleting many records, can put a lot of pressure on the backend database.
> Instead, delete events in batches (in different transactions) as well.





[jira] [Resolved] (HIVE-24349) Client connection count is not printed correctly in HiveMetastoreClient

2020-11-22 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24349.

Resolution: Fixed

Committed to master, thanks for the patch [~^sharma] and review [~pkumarsinha]

> Client connection count is not printed correctly in HiveMetastoreClient
> ---
>
> Key: HIVE-24349
> URL: https://issues.apache.org/jira/browse/HIVE-24349
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24349.01.patch, HIVE-24349.02.patch, 
> HIVE-24349.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24328) Run distcp in parallel for all file entries in repl load.

2020-11-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24328:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Run distcp in parallel for all file entries in repl load.
> -
>
> Key: HIVE-24328
> URL: https://issues.apache.org/jira/browse/HIVE-24328
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24328.01.patch, HIVE-24328.02.patch, 
> HIVE-24328.03.patch, HIVE-24328.04.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24363) Current order of transactional event listeners is prone to deadlock in backend DB connections

2020-11-12 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24363:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master, thanks for the patch [~pkumarsinha] and review 
[~aasha]/[~pgaref]

> Current order of transactional event listeners is prone to deadlock in 
> backend DB connections
> -
>
> Key: HIVE-24363
> URL: https://issues.apache.org/jira/browse/HIVE-24363
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24363.01.patch, HIVE-24363.02.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently the AcidEventListener is added to the end of the list of transactional 
> event listeners. When DbNotificationListener is configured as the value of 
> 'hive.metastore.transactional.event.listeners', the final list will be formed 
> as:
> {"DbNotificationListener" , "AcidEventListener"}
> This will result in backend DB lock acquisition in this order:
> {code:java}
> lock(a) {
>   // perform some op on a
>   lock(b) {
>     // perform some op on b
>   }
> }
> {code}
> On the other hand, some HMS APIs, for example commit_txn(), call the 
> TxnHandler method directly, followed by DbNotificationListener processing. 
> This results in lock acquisition in the reverse order:
> {code:java}
> lock(b) {
>   // perform some op on b
>   lock(a) {
>     // perform some op on a
>   }
> }
> {code}
> Note: 'a' and 'b' above are backend DB locks, not JVM locks.
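
The two snippets above take the same pair of locks in opposite orders, which is the classic condition for deadlock. A common remedy is to make every code path acquire them in one agreed order. The sketch below shows that idea with JVM locks purely as an analogy: the issue concerns backend DB locks, the class OrderedLocks and its method names are hypothetical, and this is not claimed to be what the patch does.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: both code paths take lock a before lock b. */
final class OrderedLocks {
    private final ReentrantLock a = new ReentrantLock();
    private final ReentrantLock b = new ReentrantLock();
    private int counter;

    /** Stands in for the listener-driven path. */
    void listenerPath() {
        withBoth(() -> counter++);
    }

    /** Stands in for the commit_txn()-style path; same lock order as above. */
    void commitPath() {
        withBoth(() -> counter += 10);
    }

    private void withBoth(Runnable op) {
        a.lock();
        try {
            b.lock();
            try {
                op.run();
            } finally {
                b.unlock();
            }
        } finally {
            a.unlock();
        }
    }

    int counter() {
        return counter;
    }
}
```

Because neither path ever holds b while waiting for a, two threads calling these methods concurrently cannot block each other in a cycle.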





[jira] [Updated] (HIVE-24371) Ranger Replication fallback to updateIfExists

2020-11-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24371:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

merged to master, Thanks for patch [~aasha] and review [~pkumarsinha]

> Ranger Replication fallback to updateIfExists
> -
>
> Key: HIVE-24371
> URL: https://issues.apache.org/jira/browse/HIVE-24371
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24371.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Ranger Replication fallback to updateIfExists
> Add a dummy resource as a workaround while creating the deny policy to prevent it 
> from overriding the actual policy.





[jira] [Resolved] (HIVE-24366) changeMarker value sent to atlas export API is set to 0 in the 2nd repl dump call

2020-11-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24366.

Resolution: Fixed

merged to master, Thanks for patch [~^sharma] and review [~pkumarsinha]

> changeMarker value sent to atlas export API is set to 0 in the 2nd repl dump 
> call
> -
>
> Key: HIVE-24366
> URL: https://issues.apache.org/jira/browse/HIVE-24366
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24366.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-24327) AtlasServer entity may not be present during first Atlas metadata dump

2020-11-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24327.

Resolution: Fixed

Merged to master, thanks for the patch [~pkumarsinha] and review [~aasha]

> AtlasServer entity may not be present during first Atlas metadata dump
> --
>
> Key: HIVE-24327
> URL: https://issues.apache.org/jira/browse/HIVE-24327
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24327.01.patch, HIVE-24327.02.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-24173) notification cleanup interval value changes depending upon replication enabled or not.

2020-11-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24173.

Resolution: Fixed

Merged to master, Thanks for the patch [~^sharma] and review [~aasha]


> notification cleanup interval value changes depending upon replication 
> enabled or not.
> --
>
> Key: HIVE-24173
> URL: https://issues.apache.org/jira/browse/HIVE-24173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24173.01.patch, HIVE-24173.02.patch, 
> HIVE-24173.03.patch, HIVE-24173.04.patch, HIVE-24173.05.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently we use hive.metastore.event.db.listener.timetolive to determine how 
> long events are stored in the RDBMS backing HMS. We should have a separate 
> configuration for the same purpose in the context of replication, so that a 
> longer time can be configured for it; otherwise we can default to 1 day.
> hive.repl.cm.enabled can be used to identify whether replication is enabled. 
> If it is enabled, use the new configuration property to determine the TTL for 
> events in the RDBMS; otherwise use hive.metastore.event.db.listener.timetolive.
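
The selection rule described above could be sketched as follows. Several things here are assumptions: the issue text does not name the new property, so REPL_TTL is a placeholder key; values are treated as plain seconds held in a Map rather than read through Hive's configuration classes; and falling back to one day when the chosen property is unset is one reading of the description.

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: picks the notification event TTL based on replication state. */
final class EventTtlResolver {
    static final String REPL_ENABLED = "hive.repl.cm.enabled";
    static final String DEFAULT_TTL = "hive.metastore.event.db.listener.timetolive";
    // Placeholder key: the issue text does not name the new property.
    static final String REPL_TTL = "example.repl.event.timetolive";

    static final long ONE_DAY_SECONDS = TimeUnit.DAYS.toSeconds(1);

    /** Returns the TTL in seconds; values are assumed to be plain second counts. */
    static long ttlSeconds(Map<String, String> conf) {
        boolean replEnabled =
                Boolean.parseBoolean(conf.getOrDefault(REPL_ENABLED, "false"));
        String key = replEnabled ? REPL_TTL : DEFAULT_TTL;
        String value = conf.get(key);
        // Assumed fallback: one day when the selected property is not set.
        return value == null ? ONE_DAY_SECONDS : Long.parseLong(value.trim());
    }
}
```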





[jira] [Resolved] (HIVE-24330) Automate setting permissions on cmRoot directories.

2020-11-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24330.

Resolution: Fixed

patch merged to master, Thanks for the patch [~^sharma] and review [~aasha]

> Automate setting permissions on cmRoot directories.
> ---
>
> Key: HIVE-24330
> URL: https://issues.apache.org/jira/browse/HIVE-24330
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24330.01.patch, HIVE-24330.02.patch, 
> HIVE-24330.03.patch, HIVE-24330.04.patch, HIVE-24330.05.patch, 
> HIVE-24330.06.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24307) Beeline with property-file and -e parameter is failing

2020-10-29 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24307:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to master , Thanks for the patch [~ayushtkn] and review 
[~aasha]!

> Beeline with property-file and -e parameter is failing
> --
>
> Key: HIVE-24307
> URL: https://issues.apache.org/jira/browse/HIVE-24307
> Project: Hive
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24307-01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Beeline query with property file specified with -e parameter fails with :
> {noformat}
> Cannot run commands specified using -e. No current connection
> {noformat}





[jira] [Updated] (HIVE-24109) Load partitions in batches for managed tables in the bootstrap phase

2020-10-22 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24109:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Load partitions in batches for managed tables in the bootstrap phase
> 
>
> Key: HIVE-24109
> URL: https://issues.apache.org/jira/browse/HIVE-24109
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch, 
> HIVE-24109.03.patch, HIVE-24109.04.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24109) Load partitions in batches for managed tables in the bootstrap phase

2020-10-22 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24109:
---
Summary: Load partitions in batches for managed tables in the bootstrap 
phase  (was: Load partitions in parallel for managed tables in the bootstrap 
phase)

> Load partitions in batches for managed tables in the bootstrap phase
> 
>
> Key: HIVE-24109
> URL: https://issues.apache.org/jira/browse/HIVE-24109
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch, 
> HIVE-24109.03.patch, HIVE-24109.04.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24227) sys.replication_metrics table shows incorrect status for failed policies

2020-10-20 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24227:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to master, Thanks for the patch [~^sharma] and review [~aasha]!

> sys.replication_metrics table shows incorrect status for failed policies
> 
>
> Key: HIVE-24227
> URL: https://issues.apache.org/jira/browse/HIVE-24227
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24227.04.patch, HIVE-24227.05.patch, 
> HIVE-24227.06.patch, HIVE-24227.07.patch, HIVE-24227.08.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24267) RetryingClientTimeBased should always perform first invocation

2020-10-19 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24267:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

patch committed to master, Thanks for the patch [~pkumarsinha] and review 
[~aasha]!

> RetryingClientTimeBased should always perform first invocation
> --
>
> Key: HIVE-24267
> URL: https://issues.apache.org/jira/browse/HIVE-24267
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24267.01.patch, HIVE-24267.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name

2020-10-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24246:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Fix for Ranger Deny policy overriding policy with same resource name 
> -
>
> Key: HIVE-24246
> URL: https://issues.apache.org/jira/browse/HIVE-24246
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24246.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-24246) Fix for Ranger Deny policy overriding policy with same resource name

2020-10-13 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212958#comment-17212958
 ] 

Anishek Agarwal commented on HIVE-24246:


Committed to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Fix for Ranger Deny policy overriding policy with same resource name 
> -
>
> Key: HIVE-24246
> URL: https://issues.apache.org/jira/browse/HIVE-24246
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24246.01.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24244) NPE during Atlas metadata replication

2020-10-12 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24244:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

patch committed to master, Thanks for the patch [~pkumarsinha] and review 
[~aasha]

> NPE during Atlas metadata replication
> -
>
> Key: HIVE-24244
> URL: https://issues.apache.org/jira/browse/HIVE-24244
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24244.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24197) Check for write transactions for the db under replication at a frequent interval

2020-10-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24197:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch [~aasha] and review [~pkumarsinha]

> Check for write transactions for the db under replication at a frequent 
> interval
> 
>
> Key: HIVE-24197
> URL: https://issues.apache.org/jira/browse/HIVE-24197
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24197.01.patch, HIVE-24197.02.patch, 
> HIVE-24197.03.patch, HIVE-24197.04.patch, HIVE-24197.05.patch, 
> HIVE-24197.06.patch, HIVE-24197.07.patch, HIVE-24197.08.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24254) Remove setOwner call in ReplChangeManager

2020-10-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24254:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch [~aasha] and review [~pkumarsinha]. Patch committed to 
master.

> Remove setOwner call in ReplChangeManager
> -
>
> Key: HIVE-24254
> URL: https://issues.apache.org/jira/browse/HIVE-24254
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24254.01.patch, HIVE-24254.02.patch, 
> HIVE-24254.03.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-10-06 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24215:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24215.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23867) Truncate table fail with AccessControlException if doAs enabled and tbl database has source of replication

2020-10-05 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208129#comment-17208129
 ] 

Anishek Agarwal commented on HIVE-23867:


All managed table locations should be owned by hive. I don't think we should 
support otherwise. cc [~thejas]

> Truncate table fail with AccessControlException if doAs enabled and tbl 
> database has source of replication
> --
>
> Key: HIVE-23867
> URL: https://issues.apache.org/jira/browse/HIVE-23867
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, repl
>Affects Versions: 3.1.1
>Reporter: Rajkumar Singh
>Priority: Major
>
> Steps to repro:
> 1. Enable doAs.
> 2. As a user who is not a super user, create a database: 
> create database sampledb with dbproperties('repl.source.for'='1,2,3');
> 3. Create a table: create table sampledb.sampletble (id int);
> 4. Insert some data into it: insert into sampledb.sampletble values (1), 
> (2),(3);
> 5. Run the truncate command on the table, which fails with the following error:
> {code:java}
>  org.apache.hadoop.ipc.RemoteException: User username is not a super user 
> (non-super user cannot change owner).
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1907)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:866)
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:531)
>  at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
>  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1498) 
> ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at org.apache.hadoop.ipc.Client.call(Client.java:1444) 
> ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at org.apache.hadoop.ipc.Client.call(Client.java:1354) 
> ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>  ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>  ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at com.sun.proxy.$Proxy31.setOwner(Unknown Source) ~[?:?]
>  at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setOwner(ClientNamenodeProtocolTranslatorPB.java:470)
>  ~[hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
>  at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) ~[?:?]
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_232]
>  at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
>  at com.sun.proxy.$Proxy32.setOwner(Unknown Source) [?:?]
>  at org.apache.hadoop.hdfs.DFSClient.setOwner(DFSClient.java:1914) 
> [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1764)
>  [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1761)
>  [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:

[jira] [Updated] (HIVE-24187) Handle _files creation for HA config with same nameservice name on source and destination

2020-09-24 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24187:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks for the patch [~pkumarsinha] and review [~aasha]!

> Handle _files creation for HA config with same nameservice name on source and 
> destination
> -
>
> Key: HIVE-24187
> URL: https://issues.apache.org/jira/browse/HIVE-24187
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24187.01.patch, HIVE-24187.02.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently HA is supported only for different nameservices on Source and 
> Destination. We need to add support for the same nameservice on Source and 
> Destination.
> The local nameservice will be passed correctly to the repl command.
> The remote nameservice will be referred to by an arbitrary name, with 
> corresponding configs for it.
> Example:
> Clusters originally configured with ns for hdfs:
> src: ns1
> target : ns1
> We can denote remote name with some random name, say for example: nsRemote. 
> This is how the command will see the ns w.r.t source and target:
> Repl Dump : src: ns1, target: nsRemote
> Repl Load: src: nsRemote, target: ns1
> Entries in the _files (for managed table data loc) will be made with nsRemote 
> instead of ns1 (for src).
> Example: 
> hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
> In the same way, the list of external table data locations will also be modified 
> using nsRemote instead of ns1 (for src).
> New configs can control the behavior:
> *hive.repl.ha.datapath.replace.remote.nameservice = *
> *hive.repl.ha.datapath.replace.remote.nameservice.name = *
> Based on the above configs replacement of nameservice can be done.
> This will also require that 'hive.repl.rootdir' is passed accordingly during 
> dump and load:
> Repl dump:
> ||Repl Operation||Repl Command||
> |*Staging on source cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |*Staging on target cluster*|
> |Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
> |Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
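
The nameservice substitution described above (ns1 rewritten to nsRemote in a data path) could look roughly like the sketch below. It handles only a bare hdfs:// path, not the full colon-separated _files entry, and the class NameserviceRewriter and its method are hypothetical names rather than Hive code.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch: swaps the nameservice (URI authority) of an HDFS path. */
final class NameserviceRewriter {
    /**
     * Returns path with localNs replaced by remoteNs when path is an hdfs URI
     * whose authority equals localNs; otherwise returns path unchanged.
     */
    static String replaceNameservice(String path, String localNs, String remoteNs) {
        URI uri = URI.create(path);
        if (!"hdfs".equals(uri.getScheme()) || !localNs.equals(uri.getAuthority())) {
            return path;
        }
        try {
            return new URI(uri.getScheme(), remoteNs, uri.getPath(),
                           uri.getQuery(), uri.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot rewrite " + path, e);
        }
    }
}
```

Comparing the parsed authority rather than doing a plain string replace avoids touching a path component that merely happens to contain the text ns1.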





[jira] [Updated] (HIVE-24170) Add UDF resources explicitly to the classpath while handling drop function event during load.

2020-09-21 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24170:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master, Thanks for the patch [~pkumarsinha] and review [~aasha]

> Add UDF resources explicitly to the classpath while handling drop function 
> event during load.
> -
>
> Key: HIVE-24170
> URL: https://issues.apache.org/jira/browse/HIVE-24170
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24170.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-24131) Use original src location always when data copy runs on target

2020-09-14 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195366#comment-17195366
 ] 

Anishek Agarwal commented on HIVE-24131:


Committed to master , Thanks for the patch [~pkumarsinha] and review [~aasha]

> Use original src location always when data copy runs on target 
> ---
>
> Key: HIVE-24131
> URL: https://issues.apache.org/jira/browse/HIVE-24131
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24131.01.patch, HIVE-24131.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24117) Fix for not setting managed table location in incremental load

2020-09-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24117:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

patch committed to master, thanks for the patch [~aasha] and review 
[~pkumarsinha]

> Fix for not setting managed table location in incremental load
> --
>
> Key: HIVE-24117
> URL: https://issues.apache.org/jira/browse/HIVE-24117
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24117.01.patch, HIVE-24117.02.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24127) Dump events from default catalog only

2020-09-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24127:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

thanks for the patch [~aasha] and review [~pkumarsinha]

> Dump events from default catalog only
> -
>
> Key: HIVE-24127
> URL: https://issues.apache.org/jira/browse/HIVE-24127
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24127.01.patch, HIVE-24127.02.patch, 
> HIVE-24127.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Don't dump events from the spark catalog. In bootstrap we skip spark tables. In 
> incremental load we should also skip spark events.





[jira] [Updated] (HIVE-24129) Deleting the previous successful dump directory should be based on config

2020-09-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24129:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~^sharma] and review [~aasha]

> Deleting the previous successful dump directory should be based on config
> -
>
> Key: HIVE-24129
> URL: https://issues.apache.org/jira/browse/HIVE-24129
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24129.01.patch, HIVE-24129.02.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Description: Provide a policy level config defaulted to true.
> This can help debug any issue in the production.





[jira] [Updated] (HIVE-24095) Load partitions in parallel for external tables in the bootstrap phase

2020-09-08 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24095:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to master, Thanks for the patch [~aasha] and review 
[~pkumarsinha]

> Load partitions in parallel for external tables in the bootstrap phase
> --
>
> Key: HIVE-24095
> URL: https://issues.apache.org/jira/browse/HIVE-24095
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24095.01.patch, HIVE-24095.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is part 1 of the change. This will load partitions in parallel for 
> external tables. Managed table is tracked as part of 
> https://issues.apache.org/jira/browse/HIVE-24109





[jira] [Updated] (HIVE-24114) Fix Repl Load with both staging and data copy on target

2020-09-07 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24114:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~pkumarsinha] and review [~aasha]

> Fix Repl Load with both staging and data copy on target
> ---
>
> Key: HIVE-24114
> URL: https://issues.apache.org/jira/browse/HIVE-24114
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24114.01.patch, HIVE-24114.02.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24059) Llap external client - Initial changes for running in cloud environment

2020-09-02 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal resolved HIVE-24059.

Resolution: Fixed

committed to master, thanks for patch [~ShubhamChaurasia] and review 
[~prasanth_j]

> Llap external client - Initial changes for running in cloud environment
> ---
>
> Key: HIVE-24059
> URL: https://issues.apache.org/jira/browse/HIVE-24059
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24059.01.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Please see problem description in 
> https://issues.apache.org/jira/browse/HIVE-24058
> Initial changes include - 
> 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) 
> side.
> 2. Opening additional RPC port in LLAP Daemon.
> 3. JWT Based authentication on this port.
> cc [~prasanth_j] [~jdere] [~anishek] [~thejas]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation

2020-09-02 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24102:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to master, thanks for the patch [~aasha] and review [~pkumarsinha]

> Add ENGINE=InnoDB for replication mysql schema changes and not exists clause 
> for the table creation
> ---
>
> Key: HIVE-24102
> URL: https://issues.apache.org/jira/browse/HIVE-24102
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24102.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24064) Disable Materialized View Replication

2020-09-01 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24064:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master! thanks for the patch [~^sharma] and review [~aasha]

> Disable Materialized View Replication
> -
>
> Key: HIVE-24064
> URL: https://issues.apache.org/jira/browse/HIVE-24064
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24064.01.patch, HIVE-24064.02.patch, 
> HIVE-24064.03.patch, HIVE-24064.04.patch, HIVE-24064.05.patch, 
> HIVE-24064.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24067) TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop

2020-08-26 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24067:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed to master, thanks for the patch [~pkumarsinha] and review 
[~aasha]

> TestReplicationScenariosExclusiveReplica - Wrong FS error during DB drop
> 
>
> Key: HIVE-24067
> URL: https://issues.apache.org/jira/browse/HIVE-24067
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24067.01.patch, HIVE-24067.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In TestReplicationScenariosExclusiveReplica, the drop database operation 
> for the primary db leads to a wrong FS error, as the ReplChangeManager is 
> associated with the replica FS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23123) Disable export/import of views and materialized views

2020-08-25 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183958#comment-17183958
 ] 

Anishek Agarwal commented on HIVE-23123:


I agree that in the context of import/export statements we might not need to 
support views/MVs. However, the code paths for replication and export/import 
are common in a lot of places, since they do similar things during bootstrap. 
For replication, including views and MVs would be required and would work 
well, since we would copy the state of the MV based on the valid Txn List. 
Currently, replication replicates views as is (assuming the view definition 
is restricted to data in the same db) and disables MVs. We will try to 
prevent MV export/import and only enable it via replication. 

> Disable export/import of views and materialized views
> -
>
> Key: HIVE-23123
> URL: https://issues.apache.org/jira/browse/HIVE-23123
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23123.01.patch, HIVE-23123.02.patch, 
> HIVE-23123.03.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport]
>  import and export can be done by using the
> {code:java}
> export table ...
> import table ... 
> {code}
> commands. The document doesn't mention views or materialized views at all, 
> and in fact we don't support commands like
> {code:java}
> export view ...
> import view ...
> export materialized view ...
> import materialized view ... 
> {code}
> they can not be parsed at all. The word table is often used though in a 
> broader sense, when it means all table like entities, including views and 
> materialized views. For example the various Table classes may represent any 
> of these as well.
> If I try to export a view with the export table ... command, it goes fine. A 
> _metadata file will be created, but no data directory, which is what we'd 
> expect. If I try to import it back, an exception is thrown due to the lack of 
> the data dir:
> {code:java}
> java.lang.AssertionError: null==getPath() for exim_view
>  at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:3088)
>  at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:419)
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
>  at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:364)
>  at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:335)
>  at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
>  at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:722)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:491)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:485) 
> {code}
> Still the view gets imported successfully, as data movement wasn't even 
> necessary.
> If we try to export a materialized view which is transactional, then this 
> exception occurs:
> {code:java}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> exim_materialized_view_da21d41a_9fe4_4446_9c72_d251496abf9d
>  at 
> org.apache.hadoop.hive.ql.parse.AcidExportSemanticAnalyzer.analyzeAcidExport(AcidExportSemanticAnalyzer.java:163)
>  at 
> org.apache.hadoop.hive.ql.parse.AcidExportSemanticAnalyzer.analyze(AcidExportSemanticAnalyzer.java:71)
>  at 
> org.apache.hadoop.hive.ql.parse.RewriteSemanticAnalyzer.analyzeInternal(RewriteSemanticAnalyzer.java:72)
>  at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
>  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220)
>  at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
>  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:183)
>  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:601)
>  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:547)
>  at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:541) 
> {code}
> So the export process can not handle it, as the temporary table is not 
> getting created.
>  
> The import command handling has a lot of code dedicated to importing views 
> and materialized views, which suggests that we support the importing (and 
> thus also suggests implicitly that we support the exporting) of views and 
> materialized views.
>  
> So the conclusion is that we have to decide if we support exporting/importing

[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-08-25 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183861#comment-17183861
 ] 

Anishek Agarwal commented on HIVE-24070:


Can you give some more details on this, please? I thought the config 
introduced as part of HIVE-19430 allows us to control how many events we want 
to delete. However, it looks like there should be an inner loop: if the 
number of events generated during sleepTime is greater than the number 
deleted under EVENT_CLEAN_MAX_EVENTS, the db will never get cleaned.

cc [~aasha] / [~pkumarsinha]
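
A minimal sketch of the inner loop suggested above, assuming a store that can delete a bounded number of expired events per call. The names BatchedEventCleaner, EventStore, deleteExpired and inMemoryStore are hypothetical and are not Hive or ObjectStore APIs; the in-memory store exists only to exercise the loop.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustration only: hypothetical names, not the ObjectStore implementation. */
final class BatchedEventCleaner {

    /** Hypothetical store: deletes at most maxEvents expired events, returns the count deleted. */
    interface EventStore {
        int deleteExpired(int maxEvents);
    }

    /**
     * Deletes in bounded batches until a batch comes back short, so one
     * cleaner run drains the backlog without loading every event at once.
     */
    static long cleanAll(EventStore store, int maxEventsPerBatch) {
        if (maxEventsPerBatch <= 0) {
            throw new IllegalArgumentException("batch size must be positive");
        }
        long total = 0;
        int deleted;
        do {
            deleted = store.deleteExpired(maxEventsPerBatch);
            total += deleted;
        } while (deleted == maxEventsPerBatch); // a full batch means more may remain
        return total;
    }

    /** In-memory stand-in for the event table, used only to try out the loop. */
    static EventStore inMemoryStore(int pendingEvents) {
        Deque<Integer> events = new ArrayDeque<>();
        for (int i = 0; i < pendingEvents; i++) {
            events.add(i);
        }
        return maxEvents -> {
            int removed = 0;
            while (removed < maxEvents && !events.isEmpty()) {
                events.poll();
                removed++;
            }
            return removed;
        };
    }
}
```

With a batch size playing the role of EVENT_CLEAN_MAX_EVENTS, memory use per call stays bounded while the run as a whole still catches up with the backlog.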

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of 
> pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
>
> If there are a large number of events that haven't been cleaned up for some 
> reason, then ObjectStore.cleanWriteNotificationEvents() can run out of memory 
> while it loads all the events to be deleted.
>  It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23926) Flaky test TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion

2020-08-24 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23926:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1. Merged to master. Thanks for the patch [~^sharma]

> Flaky test 
> TestTableLevelReplicationScenarios.testRenameTableScenariosWithReplacePolicyDMLOperattion
> 
>
> Key: HIVE-23926
> URL: https://issues.apache.org/jira/browse/HIVE-23926
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23926.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/123/testReport/org.apache.hadoop.hive.ql.parse/TestTableLevelReplicationScenarios/Testing___split_18___Archive___testRenameTableScenariosWithReplacePolicyDMLOperattion/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24032) Remove hadoop shims dependency and use FileSystem Api directly from standalone metastore

2020-08-24 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24032:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master. Thanks for the patch [~aasha] and review [~pkumarsinha]!

> Remove hadoop shims dependency and use FileSystem Api directly from 
> standalone metastore
> 
>
> Key: HIVE-24032
> URL: https://issues.apache.org/jira/browse/HIVE-24032
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24032.01.patch, HIVE-24032.02.patch, 
> HIVE-24032.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Remove hadoop shims dependency from standalone metastore. 
> Rename hive.repl.data.copy.lazy hive conf to 
> hive.repl.run.data.copy.tasks.on.target



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23993) Handle irrecoverable errors

2020-08-13 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23993:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Handle irrecoverable errors
> ---
>
> Key: HIVE-23993
> URL: https://issues.apache.org/jira/browse/HIVE-23993
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23993.01.patch, HIVE-23993.02.patch, 
> HIVE-23993.03.patch, HIVE-23993.04.patch, HIVE-23993.05.patch, 
> HIVE-23993.06.patch, HIVE-23993.07.patch, HIVE-23993.08.patch, Retry Logic 
> for Replication.pdf
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23995) Don't set location for managed tables in case of replication

2020-08-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23995:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch [~aasha] and review [~pkumarsinha]

> Don't set location for managed tables in case of replication
> 
>
> Key: HIVE-23995
> URL: https://issues.apache.org/jira/browse/HIVE-23995
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23995.01.patch, HIVE-23995.02.patch, 
> HIVE-23995.03.patch, HIVE-23995.04.patch, HIVE-23995.05.patch, 
> HIVE-23995.06.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Managed table location should not be set
> Migration code of replication should be removed
> add logging to all ack files
> set hive.repl.data.copy.lazy to true



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24014) Need to delete DumpDirectoryCleanerTask

2020-08-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-24014:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch [~^sharma] and review [~aasha]

> Need to delete DumpDirectoryCleanerTask
> ---
>
> Key: HIVE-24014
> URL: https://issues.apache.org/jira/browse/HIVE-24014
> Project: Hive
>  Issue Type: Bug
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24014.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> With the newer implementation, every dump operation cleans up the 
> dump directory previously consumed by the load operation. Hence, for a 
> policy, at most one dump directory will exist. Also, the dump directory base 
> location config is now a policy-level config, and hence this 
> DumpDirCleanerTask will not be effective.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23955) Classification of Error Codes in Replication

2020-08-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23955:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, thanks for the patch [~aasha] and review [~pkumarsinha]

> Classification of Error Codes in Replication
> 
>
> Key: HIVE-23955
> URL: https://issues.apache.org/jira/browse/HIVE-23955
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23955.01.patch, HIVE-23955.02.patch, 
> HIVE-23955.03.patch, HIVE-23955.04.patch, Retry Logic for Replication.pdf
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23960) Partition with no column statistics leads to unbalanced calls to openTransaction/commitTransaction error during get_partitions_by_names

2020-08-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23960:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~pkumarsinha] and review [~aasha]

> Partition with no column statistics leads to unbalanced calls to 
> openTransaction/commitTransaction error during get_partitions_by_names
> ---
>
> Key: HIVE-23960
> URL: https://issues.apache.org/jira/browse/HIVE-23960
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23960.01.patch, HIVE-23960.02.patch, 
> HIVE-23960.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Creating a partition with data and then adding another partition leads to 
> unbalanced calls to open/commit transaction during the 
> get_partitions_by_names call.
> The issue was discovered during a REPL DUMP operation, which uses this HMS 
> call to get the partition metadata. The error occurs when there is a 
> partition with no column statistics.
> To reproduce:
> {code:java}
> CREATE TABLE student_part_acid(name string, age int, gpa double) PARTITIONED 
> BY (ds string) STORED AS orc;
> LOAD DATA INPATH '/user/hive/partDir/student_part_acid/ds=20110924' INTO 
> TABLE student_part_acid partition(ds=20110924);
> ALTER TABLE student_part_acid ADD PARTITION (ds=20110925);
> {code}
> Now if we try to perform REPL DUMP, it fails with the error "Unbalanced 
> calls to open/commit transaction" on the HS2 side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23916) Fix Atlas client dependency version

2020-08-07 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23916:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

merged to master, Thanks for the patch [~pkumarsinha] and review [~aasha]

> Fix Atlas client dependency version
> ---
>
> Key: HIVE-23916
> URL: https://issues.apache.org/jira/browse/HIVE-23916
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23916.01.patch, HIVE-23916.02.patch, 
> HIVE-23916.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23986) flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump

2020-08-04 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-23986:
--

Assignee: Arko Sharma

> flaky TestStatsReplicationScenariosMigration.testMetadataOnlyDump
> -
>
> Key: HIVE-23986
> URL: https://issues.apache.org/jira/browse/HIVE-23986
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/143/testReport/junit/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23982) TestStatsReplicationScenariosMigrationNoAutogather is flaky

2020-08-04 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-23982:
--

Assignee: Arko Sharma

> TestStatsReplicationScenariosMigrationNoAutogather is flaky
> ---
>
> Key: HIVE-23982
> URL: https://issues.apache.org/jira/browse/HIVE-23982
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Arko Sharma
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/148/testReport/junit/org.apache.hadoop.hive.ql.parse/TestStatsReplicationScenariosMigrationNoAutogather/Testing___split_16___Archive___testRetryFailure/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23961) Enable external table replication by default

2020-08-04 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23961:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Enable external table replication by default
> 
>
> Key: HIVE-23961
> URL: https://issues.apache.org/jira/browse/HIVE-23961
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23961.01.patch, HIVE-23961.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23961) Enable external table replication by default

2020-08-04 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171253#comment-17171253
 ] 

Anishek Agarwal commented on HIVE-23961:


Merged to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Enable external table replication by default
> 
>
> Key: HIVE-23961
> URL: https://issues.apache.org/jira/browse/HIVE-23961
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23961.01.patch, HIVE-23961.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23835) Repl Dump should dump function binaries to staging directory

2020-07-27 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23835:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to master, thanks for the patch !

> Repl Dump should dump function binaries to staging directory
> 
>
> Key: HIVE-23835
> URL: https://issues.apache.org/jira/browse/HIVE-23835
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23835.01.patch, HIVE-23835.02.patch, 
> HIVE-23835.03.patch, HIVE-23835.04.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a Hive function's binaries are on the source HDFS, repl dump should 
> dump them to the staging location in order to remove the cross-cluster 
> visibility requirement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23863) UGI doAs privilege action to make calls to Ranger Service

2020-07-26 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23863:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch!

> UGI doAs privilege action  to make calls to Ranger Service
> --
>
> Key: HIVE-23863
> URL: https://issues.apache.org/jira/browse/HIVE-23863
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23863.01.patch, HIVE-23863.02.patch, 
> HIVE-23863.03.patch, UGI and Replication.pdf
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23069) Memory efficient iterator should be used during replication.

2020-07-21 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23069:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch [~pkumarsinha] and review [~aasha]. Committed to master.

> Memory efficient iterator should be used during replication.
> 
>
> Key: HIVE-23069
> URL: https://issues.apache.org/jira/browse/HIVE-23069
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23069.01.patch, HIVE-23069.02.patch, 
> HIVE-23069.03.patch, HIVE-23069.04.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Currently the iterator used while copying table data is memory based. In 
> case of a database with a very large number of tables/partitions, such an 
> iterator may cause the HS2 process to go OOM.
> This change also introduces a config option to run data copy tasks during 
> the repl load operation.
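
To illustrate the difference between a memory-based and a memory-efficient iterator, here is a rough sketch of an iterator that reads one entry per line from a list file and keeps only the current line in memory. This is an assumption about the general technique, not the code in the patch; the class FileBackedIterator and the helper demoCount are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Illustration only: streams entries from a file instead of holding them all in a list. */
final class FileBackedIterator implements Iterator<String>, AutoCloseable {
    private final BufferedReader reader;
    private String next; // the single buffered entry; null once the file is exhausted

    FileBackedIterator(Path listFile) throws IOException {
        this.reader = Files.newBufferedReader(listFile, StandardCharsets.UTF_8);
        this.next = reader.readLine();
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        try {
            next = reader.readLine(); // read ahead by exactly one line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return current;
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    /** Demo helper: writes the entries to a temp file, then counts them through the iterator. */
    static long demoCount(List<String> entries) {
        try {
            Path file = Files.createTempFile("copy-list", ".txt");
            try {
                Files.write(file, entries);
                long count = 0;
                try (FileBackedIterator it = new FileBackedIterator(file)) {
                    while (it.hasNext()) {
                        it.next();
                        count++;
                    }
                }
                return count;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Heap usage here is bounded by the longest line rather than by the number of tables or partitions, which is the property the description asks for.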



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23474) Deny Repl Dump if the database is a target of replication

2020-07-19 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23474:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch [~aasha] and review [~pkumarsinha], merged to master.

> Deny Repl Dump if the database is a target of replication
> -
>
> Key: HIVE-23474
> URL: https://issues.apache.org/jira/browse/HIVE-23474
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23474.01.patch, HIVE-23474.02.patch, 
> HIVE-23474.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23560) Optimize bootstrap dump to abort only write Transactions

2020-07-19 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23560:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch [~aasha] and review [~pkumarsinha], Merged to master.

> Optimize bootstrap dump to abort only write Transactions
> 
>
> Key: HIVE-23560
> URL: https://issues.apache.org/jira/browse/HIVE-23560
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23560.01.patch, HIVE-23560.02.patch, 
> HIVE-23560.03.patch, HIVE-23560.04.patch, Optimize bootstrap dump to avoid 
> aborting all transactions.pdf
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, before doing a bootstrap dump, we abort all open transactions 
> after waiting for a configured time. We are proposing to abort only write 
> transactions for the db under replication and leave the read and repl 
> created transactions as is.
> The attached doc discusses this in detail.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23787) Write all the events present in a task_queue in a single file.

2020-07-01 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23787:
---
Description: 
Events are not written to file when the queue becomes full, and the 
post_exec_hook / pre_exec_hook event is ignored. The default capacity is 64 in 
the hive.hook.proto.queue.capacity config for hs2.

Now, we will increase the queue capacity (let's say up to 256).
Also, as an optimisation, we need to take all the events present in a 
task_queue and write them to a single file.

  was:
DAS does not get the event when the queue becomes full, and the 
post_exec_hook / pre_exec_hook event is ignored. The default capacity is 64 in 
the hive.hook.proto.queue.capacity config for hs2.

Now, we will increase the queue capacity (let's say up to 256).
Also, as an optimisation, we need to take all the events present in a 
task_queue and write them to a single file.


> Write all the events present in a task_queue in a single file.
> --
>
> Key: HIVE-23787
> URL: https://issues.apache.org/jira/browse/HIVE-23787
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Amlesh Kumar
>Assignee: Amlesh Kumar
>Priority: Major
>
> Events are not written to file when the queue becomes full, and the 
> post_exec_hook / pre_exec_hook event is ignored. The default capacity is 64 
> in the hive.hook.proto.queue.capacity config for hs2.
> Now, we will increase the queue capacity (let's say up to 256).
> Also, as an optimisation, we need to take all the events present in a 
> task_queue and write them to a single file.
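
The behaviour described above can be sketched with a bounded queue: an offer is rejected once the queue is full, and a single drain collects everything queued so it can be written out together. This is only a sketch under assumptions; EventBatchWriter and its methods are hypothetical names, not the Hive proto hook code, and the actual file write is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Illustration only: a bounded event queue drained as one batch. */
final class EventBatchWriter {
    private final BlockingQueue<String> queue;

    /** capacity plays the role of hive.hook.proto.queue.capacity in this sketch. */
    EventBatchWriter(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false when the queue is full, which is the case where an event gets lost. */
    boolean offer(String event) {
        return queue.offer(event);
    }

    /** Removes every queued event in one call so the caller can write them to a single file. */
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }
}
```

A larger capacity makes rejected offers less likely, and draining in one call means one file per batch instead of one file per event.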



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23784) Fix Replication Metrics Sink to DB

2020-06-30 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23784:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, committed to master 

> Fix Replication Metrics Sink to DB
> --
>
> Key: HIVE-23784
> URL: https://issues.apache.org/jira/browse/HIVE-23784
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23784.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23611) Mandate fully qualified absolute path for external table base dir during REPL operation

2020-06-30 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23611:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master.

> Mandate fully qualified absolute path for external table base dir during REPL 
> operation
> ---
>
> Key: HIVE-23611
> URL: https://issues.apache.org/jira/browse/HIVE-23611
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23611.01.patch, HIVE-23611.02.patch, 
> HIVE-23611.03.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23755) Fix Ranger Url extra slash

2020-06-30 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23755:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, committed to master

> Fix Ranger Url extra slash
> --
>
> Key: HIVE-23755
> URL: https://issues.apache.org/jira/browse/HIVE-23755
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23755.01.patch, HIVE-23755.02.patch, 
> HIVE-23755.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-22 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23668:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, committed to master.

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch, 
> HIVE-23668.03.patch, HIVE-23668.04.patch, HIVE-23668.05.patch, 
> HIVE-23668.06.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details

2020-06-17 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23585:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, merged to master

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23696) DB Metadata and Progress column not taking the defined length

2020-06-17 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23696:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, merged to master

> DB Metadata and Progress column not taking the defined length
> -
>
> Key: HIVE-23696
> URL: https://issues.apache.org/jira/browse/HIVE-23696
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23696.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Caused by: org.datanucleus.exceptions.NucleusUserException: Attempt to store 
> value 
> "{"dbName":"testAcidTablesReplLoadBootstrapIncr_1592205875387","replicationType":"BOOTSTRAP","stagingDir":"hdfs://localhost:65158/tmp/org_apache_hadoop_hive_ql_parse_TestReplicationScenarios_245261428230295/hrepl0/dGVzdGFjaWR0YWJsZXNyZXBsbG9hZGJvb3RzdHJhcGluY3JfMTU5MjIwNTg3NTM4Nw==/0/hive","lastReplId":25}"
>  in column "RM_METADATA" that has maximum length of 255. Please correct your 
> data!





[jira] [Commented] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-15 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135733#comment-17135733
 ] 

Anishek Agarwal commented on HIVE-22015:


The constraint is created on the "from" table (primary) and not on the target 
table (foreign). For example, if there is a foreign key relationship from 
t1 -> t2, the relationship is stored as part of the t1 definition, not as part 
of t2; hence the event will be generated for the primary key table and DB.



> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive will pull all constraints 
> from tables involved in a query, which results in multiple DB reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc.). The effort 
> to cache this is small as it's just another table component.





[jira] [Assigned] (HIVE-23680) TestDbNotificationListener is unstable

2020-06-14 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-23680:
--

Assignee: Aasha Medhi

> TestDbNotificationListener is unstable
> --
>
> Key: HIVE-23680
> URL: https://issues.apache.org/jira/browse/HIVE-23680
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Aasha Medhi
>Priority: Major
>
> http://34.66.156.144:8080/job/hive-precommit/job/master/35/testReport/
> http://130.211.9.232/job/hive-flaky-check/24/





[jira] [Commented] (HIVE-23680) TestDbNotificationListener is unstable

2020-06-14 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135394#comment-17135394
 ] 

Anishek Agarwal commented on HIVE-23680:


[~aasha] can you help with this?

> TestDbNotificationListener is unstable
> --
>
> Key: HIVE-23680
> URL: https://issues.apache.org/jira/browse/HIVE-23680
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Aasha Medhi
>Priority: Major
>
> http://34.66.156.144:8080/job/hive-precommit/job/master/35/testReport/
> http://130.211.9.232/job/hive-flaky-check/24/





[jira] [Updated] (HIVE-23659) Add Retry for Ranger Replication

2020-06-10 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23659:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Add Retry for Ranger Replication
> 
>
> Key: HIVE-23659
> URL: https://issues.apache.org/jira/browse/HIVE-23659
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23659.01.patch, HIVE-23659.02.patch, 
> HIVE-23659.03.patch, HIVE-23659.04.patch, HIVE-23659.05.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23516) Store hive replication policy execution metrics in the relational DB

2020-06-09 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23516:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, Committed to master, Thanks for the patch [~aasha] and review [~pkumarsinha]

> Store hive replication policy execution metrics in the relational DB
> 
>
> Key: HIVE-23516
> URL: https://issues.apache.org/jira/browse/HIVE-23516
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23516.01.patch, HIVE-23516.02.patch, 
> HIVE-23516.03.patch, HIVE-23516.04.patch, HIVE-23516.05.patch, 
> HIVE-23516.06.patch, HIVE-23516.07.patch, HIVE-23516.08.patch, 
> HIVE-23516.09.patch, HIVE-23516.10.patch, HIVE-23516.11.patch, 
> HIVE-23516.12.patch, HIVE-23516.13.patch, HIVE-23516.14.patch, 
> HIVE-23516.15.patch, HIVE-23516.16.patch, HIVE-23516.17.patch, 
> HIVE-23516.18.patch, HIVE-23516.19.patch, HIVE-23516.20.patch, 
> HIVE-23516.21.patch, HIVE-23516.22.patch, HIVE-23516.23.patch, 
> HIVE-23516.24.patch, HIVE-23516.25.patch, HIVE-23516.26.patch, 
> HIVE-23516.27.patch, HIVE-23516.28.patch, HIVE-23516.29.patch, 
> HIVE-23516.30.patch, HIVE-23516.31.patch, HIVE-23516.32.patch, 
> HIVE-23516.33.patch, HIVE-23516.34.patch, Replication Metrics.pdf
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Details documented in the attached doc



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23605) 'Wrong FS' error during _external_tables_info creation when staging location is remote

2020-06-08 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128189#comment-17128189
 ] 

Anishek Agarwal commented on HIVE-23605:


+1, committed to master. Thanks for the patch [~pkumarsinha] and review 
[~aasha].

> 'Wrong FS' error during _external_tables_info creation when staging location 
> is remote
> --
>
> Key: HIVE-23605
> URL: https://issues.apache.org/jira/browse/HIVE-23605
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23605.01.patch, HIVE-23605.02.patch, 
> HIVE-23605.03.patch, HIVE-23605.04.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When the staging location is on the target cluster, Repl Dump fails to create 
> the _external_tables_info file.





[jira] [Updated] (HIVE-23514) Add Atlas metadata replication metrics

2020-06-01 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23514:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, Committed to master. Thanks for the patch [~pkumarsinha] and review [~aasha]

> Add Atlas metadata replication metrics
> --
>
> Key: HIVE-23514
> URL: https://issues.apache.org/jira/browse/HIVE-23514
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23514.01.patch, HIVE-23514.02.patch, 
> HIVE-23514.02.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23353) Atlas metadata replication scheduling

2020-05-29 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23353:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, committed to master. Thanks for the patch [~pkumarsinha] and review 
[~aasha]

> Atlas metadata replication scheduling
> -
>
> Key: HIVE-23353
> URL: https://issues.apache.org/jira/browse/HIVE-23353
> Project: Hive
>  Issue Type: Task
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23353.01.patch, HIVE-23353.02.patch, 
> HIVE-23353.03.patch, HIVE-23353.04.patch, HIVE-23353.05.patch, 
> HIVE-23353.06.patch, HIVE-23353.07.patch, HIVE-23353.08.patch, 
> HIVE-23353.08.patch, HIVE-23353.08.patch, HIVE-23353.08.patch, 
> HIVE-23353.09.patch, HIVE-23353.10.patch, HIVE-23353.10.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23519) Read Ranger Configs from Classpath

2020-05-27 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23519:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1, merged to master. Thanks for the patch [~aasha] and review [~pkumarsinha]

> Read Ranger Configs from Classpath
> --
>
> Key: HIVE-23519
> URL: https://issues.apache.org/jira/browse/HIVE-23519
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23519.01.patch, HIVE-23519.02.patch, 
> HIVE-23519.03.patch, HIVE-23519.04.patch, HIVE-23519.05.patch, 
> HIVE-23519.06.patch, HIVE-23519.08.patch, HIVE-23519.09.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-23433) Add Deny Policy on Target Database After Ranger Replication to avoid writes

2020-05-21 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23433:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch [~aasha] and review [~pkumarsinha]

> Add Deny Policy on Target Database After Ranger Replication to avoid writes
> ---
>
> Key: HIVE-23433
> URL: https://issues.apache.org/jira/browse/HIVE-23433
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23433.01.patch, HIVE-23433.02.patch, 
> HIVE-23433.03.patch, HIVE-23433.04.patch, HIVE-23433.05.patch, 
> HIVE-23433.06.patch, HIVE-23433.07.patch, HIVE-23433.08.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-23522) repl bootstrap load: optimize partition loads

2020-05-20 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal reassigned HIVE-23522:
--

Assignee: Aasha Medhi

> repl bootstrap load: optimize partition loads 
> --
>
> Key: HIVE-23522
> URL: https://issues.apache.org/jira/browse/HIVE-23522
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Anishek Agarwal
>Assignee: Aasha Medhi
>Priority: Major
>
> When "hive.repl.dump.metadata.only.for.external.table" = true is used in repl 
> dump, we only dump metadata for external tables. On the load side, partitioned 
> external tables are currently loaded one partition at a time, even though HMS 
> has an API for bulk partition update. We should use the bulk API in such 
> scenarios. This will significantly improve performance during the bootstrap 
> phase.





[jira] [Updated] (HIVE-23522) repl bootstrap load: optimize partition loads

2020-05-20 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23522:
---
Issue Type: Improvement  (was: Bug)

> repl bootstrap load: optimize partition loads 
> --
>
> Key: HIVE-23522
> URL: https://issues.apache.org/jira/browse/HIVE-23522
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Anishek Agarwal
>Priority: Major
>
> When "hive.repl.dump.metadata.only.for.external.table" = true is used in repl 
> dump, we only dump metadata for external tables. On the load side, partitioned 
> external tables are currently loaded one partition at a time, even though HMS 
> has an API for bulk partition update. We should use the bulk API in such 
> scenarios. This will significantly improve performance during the bootstrap 
> phase.





[jira] [Commented] (HIVE-23522) repl bootstrap load: optimize partition loads

2020-05-20 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112827#comment-17112827
 ] 

Anishek Agarwal commented on HIVE-23522:


related to HIVE-23520

cc [~aasha] 

> repl bootstrap load: optimize partition loads 
> --
>
> Key: HIVE-23522
> URL: https://issues.apache.org/jira/browse/HIVE-23522
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Anishek Agarwal
>Priority: Major
>
> When "hive.repl.dump.metadata.only.for.external.table" = true is used in repl 
> dump, we only dump metadata for external tables. On the load side, partitioned 
> external tables are currently loaded one partition at a time, even though HMS 
> has an API for bulk partition update. We should use the bulk API in such 
> scenarios. This will significantly improve performance during the bootstrap 
> phase.





[jira] [Commented] (HIVE-23521) REPL: Optimise partition loading during bootstrap

2020-05-20 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112807#comment-17112807
 ] 

Anishek Agarwal commented on HIVE-23521:


Assuming this is the bootstrap repl case you are talking about, Rajesh? We can 
certainly do something in batches there to improve performance.

> REPL: Optimise partition loading during bootstrap
> -
>
> Key: HIVE-23521
> URL: https://issues.apache.org/jira/browse/HIVE-23521
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>
> When bootstrapping with a large "REPL dump" with ~10K partitions, it executes 
> "addPartition" sequentially and takes a very long time, as it communicates 
> with HMS, registers the partition, etc. for every call.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L399]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L165]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java#L210]
> When bootstrap loading has to deal with DDL, it would be good to collate all 
> partitions in a single call to HMS. This would help in reducing overall 
> runtime.
>  
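
A minimal sketch of the "do something in batches" suggestion, assuming the partitions to register are already available as a list. The class PartitionBatchSketch and the method toBatches are hypothetical names for illustration; they are not part of LoadPartitions.java, and the HMS call that would receive each batch is deliberately not shown because this thread does not name it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the Hive repl load path.
final class PartitionBatchSketch {
    // Splits the input into consecutive batches of at most batchSize items,
    // preserving order, so each batch can be sent in one metastore call
    // instead of one call per partition.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With ~10K partitions and an assumed batch size of, say, 1000, this would turn roughly ten thousand round trips into about ten; the right batch size would need to be measured.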





[jira] [Commented] (HIVE-23520) REPL: repl dump could add support for immutable dataset

2020-05-20 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112172#comment-17112172
 ] 

Anishek Agarwal commented on HIVE-23520:


[~rajesh.balamohan] I am not sure about the use case; can you please explain in 
detail? It looks like you are skipping the copy of the data to the staging 
directory, but repl load will not work here since it currently only looks up 
data in the staging directory.



> REPL: repl dump could add support for immutable dataset
> ---
>
> Key: HIVE-23520
> URL: https://issues.apache.org/jira/browse/HIVE-23520
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-23520.1.patch
>
>
> Currently, "REPL DUMP" ends up copying the entire dataset along with partition 
> information, stats, etc. in its dump folder. However, there are cases (e.g. 
> large reference datasets) where we need a way to just retain metadata along 
> with partition information & stats.





[jira] [Updated] (HIVE-23432) Add Ranger Replication Metrics

2020-05-15 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23432:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

merged to master. Thanks for the patch [~aasha] and review [~pkumarsinha]

> Add Ranger Replication Metrics 
> ---
>
> Key: HIVE-23432
> URL: https://issues.apache.org/jira/browse/HIVE-23432
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23432.01.patch, HIVE-23432.02.patch, 
> HIVE-23432.03.patch, HIVE-23432.04.patch, HIVE-23432.05.patch, 
> HIVE-23432.06.patch, HIVE-23432.07.patch
>
>






[jira] [Updated] (HIVE-23351) Ranger Replication Scheduling

2020-05-11 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23351:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~aasha]

> Ranger Replication Scheduling
> -
>
> Key: HIVE-23351
> URL: https://issues.apache.org/jira/browse/HIVE-23351
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23351.01.patch, HIVE-23351.02.patch, 
> HIVE-23351.03.patch, HIVE-23351.04.patch, HIVE-23351.05.patch, 
> HIVE-23351.06.patch, HIVE-23351.07.patch, HIVE-23351.08.patch, 
> HIVE-23351.09.patch, HIVE-23351.10.patch, HIVE-23351.10.patch, 
> HIVE-23351.11.patch, HIVE-23351.12.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-23351) Ranger Replication Scheduling

2020-05-11 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105062#comment-17105062
 ] 

Anishek Agarwal commented on HIVE-23351:


+1

> Ranger Replication Scheduling
> -
>
> Key: HIVE-23351
> URL: https://issues.apache.org/jira/browse/HIVE-23351
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23351.01.patch, HIVE-23351.02.patch, 
> HIVE-23351.03.patch, HIVE-23351.04.patch, HIVE-23351.05.patch, 
> HIVE-23351.06.patch, HIVE-23351.07.patch, HIVE-23351.08.patch, 
> HIVE-23351.09.patch, HIVE-23351.10.patch, HIVE-23351.10.patch, 
> HIVE-23351.11.patch, HIVE-23351.12.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-23362) Repl dump returns dump location and repl id but doesn't write the ack in s3a FS

2020-05-10 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104063#comment-17104063
 ] 

Anishek Agarwal commented on HIVE-23362:


+1

> Repl dump returns dump location and repl id but doesn't write the ack in s3a 
> FS
> ---
>
> Key: HIVE-23362
> URL: https://issues.apache.org/jira/browse/HIVE-23362
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23362.01.patch, HIVE-23362.02.patch, 
> HIVE-23362.03.patch, HIVE-23362.04.patch, HIVE-23362.05.patch
>
>






[jira] [Updated] (HIVE-23362) Repl dump returns dump location and repl id but doesn't write the ack in s3a FS

2020-05-10 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23362:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, Thanks for the patch [~aasha]!

> Repl dump returns dump location and repl id but doesn't write the ack in s3a 
> FS
> ---
>
> Key: HIVE-23362
> URL: https://issues.apache.org/jira/browse/HIVE-23362
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23362.01.patch, HIVE-23362.02.patch, 
> HIVE-23362.03.patch, HIVE-23362.04.patch, HIVE-23362.05.patch
>
>






[jira] [Updated] (HIVE-23309) Lazy Initialization of Hadoop Shims

2020-05-06 Thread Anishek Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23309:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Merged to Master, [~aasha] thanks for the patch.

> Lazy Initialization of Hadoop Shims
> ---
>
> Key: HIVE-23309
> URL: https://issues.apache.org/jira/browse/HIVE-23309
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-23309.01.patch, HIVE-23309.02.patch, 
> HIVE-23309.03.patch, HIVE-23309.04.patch, HIVE-23309.05.patch, 
> HIVE-23309.06.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Initialize hadoop-shims only if CM is enabled




