Re: Review Request 65018: HIVE-18372 Create testing infra to test different HMS instances

2018-01-11 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65018/
---

(Updated Jan. 11, 2018, 12:54 p.m.)


Review request for hive, Alan Gates, Marta Kuczora, Adam Szita, and Vihang 
Karajgaonkar.


Changes
---

Updated the patch based on Vihang's comment


Bugs: HIVE-18372
https://issues.apache.org/jira/browse/HIVE-18372


Repository: hive-git


Description
---

Created:
- AbstractMetastore class - to provide an interface for different metastore 
implementations (start/stop/warehouse path methods)
-- Implementations for Embedded/Remote/Cluster metastores
- MiniHMS with builder - to create HMS instances for tests
- MetaStoreFactory - to create the parameter list for parameterized tests
- TestDatabases - tests for database-related metastore functions to showcase the 
infrastructure
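
As a rough illustration of the structure listed above, here is a minimal sketch of a common service contract plus a factory that yields one parameter row per metastore flavour. All class and method names in it are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names are taken from the patch.
final class MiniMetaStoreSketch {

    // Common lifecycle contract shared by every metastore flavour.
    abstract static class Service {
        private boolean running;

        abstract String name();

        void start() { running = true; }

        void stop() { running = false; }

        boolean isRunning() { return running; }

        // Each flavour gets its own (made-up) warehouse directory.
        String warehousePath() { return "/tmp/warehouse-" + name(); }
    }

    static final class Embedded extends Service {
        @Override String name() { return "embedded"; }
    }

    static final class Remote extends Service {
        @Override String name() { return "remote"; }
    }

    // One Object[] per flavour: the shape a parameterized test runner consumes.
    static List<Object[]> metaStores() {
        List<Object[]> params = new ArrayList<>();
        params.add(new Object[] {"embedded", new Embedded()});
        params.add(new Object[] {"remote", new Remote()});
        return params;
    }

    public static void main(String[] args) {
        for (Object[] param : metaStores()) {
            Service service = (Service) param[1];
            service.start();
            System.out.println(param[0] + " -> " + service.warehousePath());
            service.stop();
        }
    }
}
```

A test class written against the abstract type can then run unchanged against every row the factory returns.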


Diffs (updated)
---

  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreFactoryForTests.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/AbstractMetaStoreService.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/ClusterMetaStoreForTests.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/EmbeddedMetaStoreForTests.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/MiniHMS.java PRE-CREATION
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/RemoteMetaStoreForTests.java PRE-CREATION


Diff: https://reviews.apache.org/r/65018/diff/4/

Changes: https://reviews.apache.org/r/65018/diff/3-4/


Testing
---

Run the new tests


Thanks,

Peter Vary



Review Request 65075: HIVE-18426: Memory leak in RoutingAppender for every hive operation

2018-01-11 Thread kalyan kumar kalvagadda via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65075/
---

Review request for hive, Aihua Xu and Andrew Sherman.


Bugs: HIVE-18426
https://issues.apache.org/jira/browse/HIVE-18426


Repository: hive-git


Description
---

Each new operation creates a new entry in the ConcurrentMap in RoutingAppender. 
When the operation ends, the AppenderControl stored in the map is retrieved and 
stopped, but the entry in the ConcurrentMap is never cleaned up.
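
The leak pattern described above can be reproduced in a few lines. This is only a sketch with hypothetical classes; it is not the log4j2 RoutingAppender code and not the patch itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the leak; this is not the log4j2 RoutingAppender code.
final class RoutingLeakSketch {

    // Stand-in for an appender control that can be stopped.
    static final class Control {
        private boolean stopped;

        void stop() { stopped = true; }

        boolean isStopped() { return stopped; }
    }

    private final ConcurrentMap<String, Control> appenders = new ConcurrentHashMap<>();

    // Called when an operation starts logging: one entry per operation id.
    Control open(String operationId) {
        return appenders.computeIfAbsent(operationId, id -> new Control());
    }

    // Leaky close: the control is stopped but its map entry stays behind.
    void closeLeaky(String operationId) {
        Control control = appenders.get(operationId);
        if (control != null) {
            control.stop();
        }
    }

    // Fixed close: remove the entry, then stop whatever was removed.
    void closeAndRemove(String operationId) {
        Control control = appenders.remove(operationId);
        if (control != null) {
            control.stop();
        }
    }

    int size() { return appenders.size(); }

    public static void main(String[] args) {
        RoutingLeakSketch leaky = new RoutingLeakSketch();
        RoutingLeakSketch fixed = new RoutingLeakSketch();
        for (int i = 0; i < 1000; i++) {
            leaky.open("op-" + i);
            leaky.closeLeaky("op-" + i);
            fixed.open("op-" + i);
            fixed.closeAndRemove("op-" + i);
        }
        System.out.println("leaky entries: " + leaky.size()); // 1000
        System.out.println("fixed entries: " + fixed.size()); // 0
    }
}
```

With the leaky variant the map grows by one entry per operation for the lifetime of the process, which matches the behaviour reported in the description.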


Diffs
---

  common/src/java/org/apache/hadoop/hive/common/LogUtils.java 0a3e0c72011951b6b1543352308bd51233c847fb
  itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingLayout.java 8febe3e79ff892c54b696b6c6ef92f7026c46033


Diff: https://reviews.apache.org/r/65075/diff/1/


Testing
---

Made sure that the tests updated to verify this change are passing.


Thanks,

kalyan kumar kalvagadda



Re: Review Request 65018: HIVE-18372 Create testing infra to test different HMS instances

2018-01-11 Thread Peter Vary via Review Board


> On Jan. 10, 2018, 6:35 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
> > Lines 350 (patched)
> > 
> >
> > I think we should catch the exception and assert the exception type is 
> > InvalidOperationException so that we catch errors like if someone changes 
> > the thrown exception in the future.

Shall I separate the test case into two, with and without cascade? Creating the 
table/function/index requires a little extra setup, which I wanted to avoid 
duplicating across two tests. That is why I decided to keep the cascade tests in 
one test case: it uses try-catch to check the exception without the cascade 
option, and then proceeds to drop the database with the cascade option.

What do you think?
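
For reference, the single-test approach discussed here (catch the specific exception type, then continue with the cascade drop) could look roughly like the sketch below. The in-memory client and its exception class are hypothetical stand-ins, not the Hive metastore API, and the exception is unchecked only to keep the sketch short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a metastore client; not the Hive API.
final class CascadeDropSketch {

    // Unchecked here only to keep the sketch short.
    static final class InvalidOperationException extends RuntimeException {
        InvalidOperationException(String message) { super(message); }
    }

    private final Map<String, Set<String>> tablesByDatabase = new HashMap<>();

    void createDatabase(String db) { tablesByDatabase.put(db, new HashSet<>()); }

    void createTable(String db, String table) { tablesByDatabase.get(db).add(table); }

    boolean exists(String db) { return tablesByDatabase.containsKey(db); }

    // Refuses to drop a non-empty database unless cascade is requested.
    void dropDatabase(String db, boolean cascade) {
        Set<String> tables = tablesByDatabase.get(db);
        if (tables != null && !tables.isEmpty() && !cascade) {
            throw new InvalidOperationException("Database " + db + " is not empty");
        }
        tablesByDatabase.remove(db);
    }

    // One test case: set up once, check the failure, then check the cascade drop.
    static void testDropNonEmptyDatabase() {
        CascadeDropSketch client = new CascadeDropSketch();
        client.createDatabase("db1");
        client.createTable("db1", "t1");

        boolean failedAsExpected = false;
        try {
            client.dropDatabase("db1", false);
        } catch (InvalidOperationException expected) {
            // Only this type counts; any other exception propagates and fails the test.
            failedAsExpected = true;
        }
        if (!failedAsExpected) {
            throw new AssertionError("Drop without cascade should have failed");
        }
        if (!client.exists("db1")) {
            throw new AssertionError("Failed drop must leave the database in place");
        }

        client.dropDatabase("db1", true);
        if (client.exists("db1")) {
            throw new AssertionError("Cascade drop should remove the database");
        }
    }

    public static void main(String[] args) {
        testDropNonEmptyDatabase();
        System.out.println("cascade drop sketch passed");
    }
}
```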


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65018/#review195158
---


On Jan. 11, 2018, 12:54 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65018/
> ---
> 
> (Updated Jan. 11, 2018, 12:54 p.m.)
> 
> 
> Review request for hive, Alan Gates, Marta Kuczora, Adam Szita, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-18372
> https://issues.apache.org/jira/browse/HIVE-18372
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Created:
> - AbstractMetastore class - to provide an interface for different metastore 
> implementation (start/stop/warehouse path methods)
> -- Implementation for Embedded/Remote/Cluster metastores
> - MiniHMS with builder - to create hms instances for test
> - MetaStoreFactory - to create the parameter list for parametrized test
> - TestDatabases - test for database related metastore functions to showcase 
> the infrastructure
> 
> 
> Diffs
> ---
> 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreFactoryForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/AbstractMetaStoreService.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/ClusterMetaStoreForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/EmbeddedMetaStoreForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/MiniHMS.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/RemoteMetaStoreForTests.java PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/65018/diff/4/
> 
> 
> Testing
> ---
> 
> Run the new tests
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



[jira] [Created] (HIVE-18442) HoS: No FileSystem for scheme: nullscan

2018-01-11 Thread Rui Li (JIRA)
Rui Li created HIVE-18442:
---

         Summary: HoS: No FileSystem for scheme: nullscan
             Key: HIVE-18442
             URL: https://issues.apache.org/jira/browse/HIVE-18442
         Project: Hive
      Issue Type: Bug
      Components: Spark
        Reporter: Rui Li
        Assignee: Rui Li






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 65018: HIVE-18372 Create testing infra to test different HMS instances

2018-01-11 Thread Adam Szita via Review Board


> On Jan. 10, 2018, 6:35 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
> > Lines 350 (patched)
> > 
> >
> > I think we should catch the exception and assert the exception type is 
> > InvalidOperationException so that we catch errors like if someone changes 
> > the thrown exception in the future.
> 
> Peter Vary wrote:
> Shall I separate the test case into two? With/Without cascade? There is a 
> little extra stuff there creating the table/function/index in two tests which 
> I wanted to avoid with one test case. That is why I decided to keep the 
> cascade tests in one test case and used try-catch here to check the exception 
> without the cascade option, and then proceed with the test and drop the 
> database with the cascade option.
> 
> What do you think?

I'd vote for separation. One test case for cascade and one for without cascade. 
The latter should have an @Test(expected = ..) annotation IMHO
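
A sketch of the separated variant follows. It avoids a JUnit dependency so that it stays self-contained: the hypothetical `throwsExpected` helper plays the role that `@Test(expected = InvalidOperationException.class)` would play in JUnit 4, and the shared fixture method keeps the setup from being written twice. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class SplitCascadeTestsSketch {

    static final class InvalidOperationException extends RuntimeException {
        InvalidOperationException(String message) { super(message); }
    }

    // Hypothetical database: just a list of tables and a dropped flag.
    static final class Database {
        final List<String> tables = new ArrayList<>();
        boolean dropped;

        void drop(boolean cascade) {
            if (!tables.isEmpty() && !cascade) {
                throw new InvalidOperationException("Database is not empty");
            }
            tables.clear();
            dropped = true;
        }
    }

    // Shared fixture, so splitting the test does not duplicate the setup code.
    static Database nonEmptyDatabase() {
        Database db = new Database();
        db.tables.add("t1");
        return db;
    }

    // Stand-in for @Test(expected = X.class): true only if X (or a subclass) is thrown.
    static boolean throwsExpected(Class<? extends Throwable> expected, Runnable body) {
        try {
            body.run();
            return false;
        } catch (Throwable t) {
            return expected.isInstance(t);
        }
    }

    // Test 1: dropping without cascade must fail with the specific exception type.
    static boolean testDropWithoutCascade() {
        Database db = nonEmptyDatabase();
        return throwsExpected(InvalidOperationException.class, () -> db.drop(false));
    }

    // Test 2: dropping with cascade must succeed and remove the contents.
    static boolean testDropWithCascade() {
        Database db = nonEmptyDatabase();
        db.drop(true);
        return db.dropped && db.tables.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("without cascade: " + testDropWithoutCascade());
        System.out.println("with cascade: " + testDropWithCascade());
    }
}
```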


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65018/#review195158
---


On Jan. 11, 2018, 12:54 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65018/
> ---
> 
> (Updated Jan. 11, 2018, 12:54 p.m.)
> 
> 
> Review request for hive, Alan Gates, Marta Kuczora, Adam Szita, and Vihang 
> Karajgaonkar.
> 
> 
> Bugs: HIVE-18372
> https://issues.apache.org/jira/browse/HIVE-18372
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Created:
> - AbstractMetastore class - to provide an interface for different metastore 
> implementation (start/stop/warehouse path methods)
> -- Implementation for Embedded/Remote/Cluster metastores
> - MiniHMS with builder - to create hms instances for test
> - MetaStoreFactory - to create the parameter list for parametrized test
> - TestDatabases - test for database related metastore functions to showcase 
> the infrastructure
> 
> 
> Diffs
> ---
> 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreFactoryForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/AbstractMetaStoreService.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/ClusterMetaStoreForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/EmbeddedMetaStoreForTests.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/MiniHMS.java PRE-CREATION
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/minihms/RemoteMetaStoreForTests.java PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/65018/diff/4/
> 
> 
> Testing
> ---
> 
> Run the new tests
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



[jira] [Created] (HIVE-18443) Ensure git gc finished in ptest prep phase before copying repo

2018-01-11 Thread Adam Szita (JIRA)
Adam Szita created HIVE-18443:
---

         Summary: Ensure git gc finished in ptest prep phase before copying repo
             Key: HIVE-18443
             URL: https://issues.apache.org/jira/browse/HIVE-18443
         Project: Hive
      Issue Type: Sub-task
      Components: Testing Infrastructure
        Reporter: Adam Szita
        Assignee: Adam Szita


In ptest's prep phase script, we first check out the latest Hive code from git, 
and then make a copy of its contents (along with the .git folder) that will 
serve as Yetus' working directory.

In some cases we can see errors such as {{+ cp -R . ../yetus
cp: cannot stat './.git/gc.pid': No such file or directory}}, e.g. 
[here|https://issues.apache.org/jira/browse/HIVE-18372?focusedCommentId=16321507=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16321507]

This is caused by git running its gc feature in the background when our prep 
script has already started copying. In cases where gc finishes while cp is 
running, we get this error.
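
One way to avoid the race is to wait until the gc.pid file is gone before starting the copy. The real prep phase is a shell script, so the following is only an illustration of the polling idea; the class name, method name, timeout and poll interval are all assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the polling idea; the real prep phase is a shell script.
final class WaitForGitGcSketch {

    // Polls until <repo>/.git/gc.pid is gone; returns false on timeout or interrupt.
    static boolean waitForGc(Path repo, long timeoutMillis, long pollMillis) {
        Path gcPid = repo.resolve(".git").resolve("gc.pid");
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (Files.exists(gcPid)) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Path repo = Paths.get(args.length > 0 ? args[0] : ".");
        boolean safe = waitForGc(repo, 60_000L, 500L);
        System.out.println("safe to copy: " + safe);
    }
}
```

The copy step would then run only when the method returns true, and the prep phase would fail fast otherwise.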



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 63382: HIVE-17833: Publish split generation counters

2018-01-11 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63382/
---

(Updated Jan. 11, 2018, 10:17 p.m.)


Review request for hive and Sergey Shelukhin.


Changes
---

Rebased after tez version update


Bugs: HIVE-17833
https://issues.apache.org/jira/browse/HIVE-17833


Repository: hive-git


Description
---

HIVE-17833: Publish split generation counters


Diffs (updated)
---

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersTezSessionPoolManager.java 3b6eb71
  itests/src/test/resources/testconfiguration.properties ac81995
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 92741ee
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d68d646
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 6915cf1
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveInputCounters.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 5dd5e80
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 5ade1f3
  ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecTezSummaryPrinter.java 45bd6e0
  ql/src/test/queries/clientpositive/tez_input_counters.q PRE-CREATION
  ql/src/test/results/clientpositive/llap/tez_input_counters.q.out PRE-CREATION


Diff: https://reviews.apache.org/r/63382/diff/2/

Changes: https://reviews.apache.org/r/63382/diff/1-2/


Testing
---


Thanks,

Prasanth_J



Re: Review Request 65075: HIVE-18426: Memory leak in RoutingAppender for every hive operation

2018-01-11 Thread Andrew Sherman via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65075/#review195255
---




common/src/java/org/apache/hadoop/hive/common/LogUtils.java
Lines 259 (patched)


Does deleteAppender() call stop() internally? If so then can we delete the 
previous call to stop the subordinateAppender?



itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingLayout.java
Line 162 (original), 163 (patched)


This code was to check the case described in 
https://issues.apache.org/jira/browse/HIVE-17826?focusedCommentId=16208636=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16208636
I think this is no longer tested. Is that OK?


- Andrew Sherman


On Jan. 11, 2018, 2:11 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65075/
> ---
> 
> (Updated Jan. 11, 2018, 2:11 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Andrew Sherman, and Sergio Pena.
> 
> 
> Bugs: HIVE-18426
> https://issues.apache.org/jira/browse/HIVE-18426
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Each new operation creates new entry in the ConcurrentMap in RoutingAppender 
> but when the operation ends, AppenderControl stored in the map is retrieved 
> and stopped but the entry in ConcurrentMap is never cleaned up.
> 
> 
> Diffs
> ---
> 
>   common/src/java/org/apache/hadoop/hive/common/LogUtils.java 0a3e0c72011951b6b1543352308bd51233c847fb
>   itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingLayout.java 8febe3e79ff892c54b696b6c6ef92f7026c46033
> 
> 
> Diff: https://reviews.apache.org/r/65075/diff/1/
> 
> 
> Testing
> ---
> 
> Made sure that the new tests updated to verify this change are passing.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



Re: Review Request 63382: HIVE-17833: Publish split generation counters

2018-01-11 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63382/#review195265
---




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
Lines 224 (patched)


Makes sense. I will assume fileSplit.getPath() is a file path and not a 
directory path. I guess it does not matter from the trigger's perspective. 
In most cases, fileSplit.getPath() will be a file, except for ACID deltas. 

Will update the counter to assume getPath() returns a file.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java
Lines 306 (patched)


In the context of triggers, we are more interested in *any* vertex 
violating the constraint.

INPUT_DIRECTORIES_vertex_a = 50
INPUT_DIRECTORIES_vertex_b = 100

This will only send INPUT_DIRECTORIES = 100 to the trigger validator. 

We should probably special-case this to something like

TOTAL_INPUT_DIRECTORIES (aggregate of all vertex counters)
MAX_INPUT_DIRECTORIES (max of all vertex counters)

Any thoughts?
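
To make the difference between the two proposed counters concrete, here is a small sketch that computes both from the per-vertex values used in the example above. The class and method names are hypothetical and do not come from TezJobMonitor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the TezJobMonitor implementation.
final class VertexCounterAggregationSketch {

    // The per-vertex values from the example in the comment.
    static Map<String, Long> exampleCounters() {
        Map<String, Long> perVertex = new LinkedHashMap<>();
        perVertex.put("INPUT_DIRECTORIES_vertex_a", 50L);
        perVertex.put("INPUT_DIRECTORIES_vertex_b", 100L);
        return perVertex;
    }

    // TOTAL_*: sum over all vertices.
    static long total(Map<String, Long> perVertex) {
        long sum = 0L;
        for (long value : perVertex.values()) {
            sum += value;
        }
        return sum;
    }

    // MAX_*: largest single-vertex value (counters are assumed non-negative).
    static long max(Map<String, Long> perVertex) {
        long max = 0L;
        for (long value : perVertex.values()) {
            max = Math.max(max, value);
        }
        return max;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = exampleCounters();
        System.out.println("TOTAL_INPUT_DIRECTORIES = " + total(counters)); // 150
        System.out.println("MAX_INPUT_DIRECTORIES = " + max(counters));     // 100
    }
}
```

A trigger such as "input directories > 120" would fire on the total (150) but not on the max (100), which is why the two need distinct names.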


- Prasanth_J


On Jan. 11, 2018, 10:17 p.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63382/
> ---
> 
> (Updated Jan. 11, 2018, 10:17 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-17833
> https://issues.apache.org/jira/browse/HIVE-17833
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-17833: Publish split generation counters
> 
> 
> Diffs
> ---
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersTezSessionPoolManager.java 3b6eb71
>   itests/src/test/resources/testconfiguration.properties ac81995
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 92741ee
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d68d646
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 6915cf1
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveInputCounters.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 5dd5e80
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 5ade1f3
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecTezSummaryPrinter.java 45bd6e0
>   ql/src/test/queries/clientpositive/tez_input_counters.q PRE-CREATION
>   ql/src/test/results/clientpositive/llap/tez_input_counters.q.out PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/63382/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



Re: Review Request 63382: HIVE-17833: Publish split generation counters

2018-01-11 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63382/#review195264
---




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java
Lines 224 (patched)


hmm... does this iterate thru the entire input of the query? That can be 
very expensive on cloud FS or if there are millions of partitions. I don't 
think we should do that



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java
Lines 306 (patched)


it seems like counter name without vertex name would imply total, not max.
I.e. if I say splits > 50, or input-size > 1Gb I mean total splits > 
50/size > 1Gb, not for any one vertex.


- Sergey Shelukhin


On Jan. 11, 2018, 10:17 p.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63382/
> ---
> 
> (Updated Jan. 11, 2018, 10:17 p.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-17833
> https://issues.apache.org/jira/browse/HIVE-17833
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-17833: Publish split generation counters
> 
> 
> Diffs
> ---
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersTezSessionPoolManager.java 3b6eb71
>   itests/src/test/resources/testconfiguration.properties ac81995
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 92741ee
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d68d646
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 6915cf1
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveInputCounters.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 5dd5e80
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 5ade1f3
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecTezSummaryPrinter.java 45bd6e0
>   ql/src/test/queries/clientpositive/tez_input_counters.q PRE-CREATION
>   ql/src/test/results/clientpositive/llap/tez_input_counters.q.out PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/63382/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



Re: Review Request 65075: HIVE-18426: Memory leak in RoutingAppender for every hive operation

2018-01-11 Thread kalyan kumar kalvagadda via Review Board


> On Jan. 11, 2018, 9:51 p.m., Andrew Sherman wrote:
> > common/src/java/org/apache/hadoop/hive/common/LogUtils.java
> > Lines 259 (patched)
> > 
> >
> > Does deleteAppender() call stop() internally? If so then can we delete 
> > the previous call to stop the subordinateAppender?

You are right. deleteAppender() internally calls stop() on the appender, so we 
do not need to explicitly look it up and stop it. I will update this in my next 
patch.


> On Jan. 11, 2018, 9:51 p.m., Andrew Sherman wrote:
> > itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingLayout.java
> > Line 162 (original), 163 (patched)
> > 
> >
> > This code was to check the case described in 
> > https://issues.apache.org/jira/browse/HIVE-17826?focusedCommentId=16208636=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16208636
> > I think this is no longer tested. Is that OK?

I have a couple of observations, based on the limited knowledge that I have. 
Correct me if I'm wrong here.

1. HIVE-17128: The code changes done as part of this JIRA make sure that the 
log4j Appender is closed when the operation is closed, to avoid leaking file 
descriptors.
2. HIVE-17826: Added HushableRandomAccessFileAppender, which is a copy of 
RandomAccessFileAppender but has an explicit check in the append() method to see 
if the appender is closed.

I think the issue reported in HIVE-17826 is seen only because the 
RoutingAppender is not properly cleaned up when the operation is stopped. With 
the patch I submitted, we may not see the issue reported in HIVE-17826 even 
without the fix of adding HushableRandomAccessFileAppender.
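
The "explicit check in append()" idea from point 2 can be illustrated with a toy appender. This is a hypothetical sketch and not the HushableRandomAccessFileAppender or log4j2 code: events that arrive after stop() are dropped quietly instead of being written to a closed sink.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the log4j2 or Hive appender classes.
final class HushableAppenderSketch {

    private final List<String> sink = new ArrayList<>();
    private volatile boolean stopped;

    void stop() { stopped = true; }

    // Returns true if the event was written; events after stop() are dropped quietly.
    boolean append(String event) {
        if (stopped) {
            return false;
        }
        sink.add(event);
        return true;
    }

    int written() { return sink.size(); }

    public static void main(String[] args) {
        HushableAppenderSketch appender = new HushableAppenderSketch();
        appender.append("query started");
        appender.stop();
        appender.append("late event");
        System.out.println("events written: " + appender.written()); // 1
    }
}
```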


- kalyan kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65075/#review195255
---


On Jan. 11, 2018, 2:11 p.m., kalyan kumar kalvagadda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65075/
> ---
> 
> (Updated Jan. 11, 2018, 2:11 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Andrew Sherman, and Sergio Pena.
> 
> 
> Bugs: HIVE-18426
> https://issues.apache.org/jira/browse/HIVE-18426
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Each new operation creates new entry in the ConcurrentMap in RoutingAppender 
> but when the operation ends, AppenderControl stored in the map is retrieved 
> and stopped but the entry in ConcurrentMap is never cleaned up.
> 
> 
> Diffs
> ---
> 
>   common/src/java/org/apache/hadoop/hive/common/LogUtils.java 0a3e0c72011951b6b1543352308bd51233c847fb
>   itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingLayout.java 8febe3e79ff892c54b696b6c6ef92f7026c46033
> 
> 
> Diff: https://reviews.apache.org/r/65075/diff/1/
> 
> 
> Testing
> ---
> 
> Made sure that the new tests updated to verify this change are passing.
> 
> 
> Thanks,
> 
> kalyan kumar kalvagadda
> 
>



[jira] [Created] (HIVE-18444) when creating transactional table make sure location has no data

2018-01-11 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-18444:
---

         Summary: when creating transactional table make sure location has no data
             Key: HIVE-18444
             URL: https://issues.apache.org/jira/browse/HIVE-18444
         Project: Hive
      Issue Type: Bug
      Components: Transactions
Affects Versions: 1.0.0
        Reporter: Eugene Koifman


If a user creates a new transactional table but sets its location to some place 
that already has data, any number of things can break.

The data may not be in Acid format, or it may have been written by another 
cluster, in which case the txnids won't make sense in the current cluster. Once 
per-table writeIDs are in place, the writeIDs won't match if the data was 
written by another table.

This could actually work if the data at the existing location was not written 
by an Acid write, but it would be safer/cleaner to just prevent this (at least 
at first).
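
The proposed guard amounts to "refuse the location unless it is missing or empty". A minimal sketch of that check is below; it uses java.nio.file as a stand-in for the Hadoop FileSystem API that Hive would actually go through, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch; java.nio.file stands in for the Hadoop FileSystem API.
final class TransactionalLocationCheckSketch {

    // True when the location does not exist yet or is an empty directory.
    static boolean isSafeLocation(Path location) {
        if (!Files.exists(location)) {
            return true;
        }
        if (!Files.isDirectory(location)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(location)) {
            return !entries.iterator().hasNext();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path location = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(location + " safe for a new transactional table: "
            + isSafeLocation(location));
    }
}
```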



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 63382: HIVE-17833: Publish split generation counters

2018-01-11 Thread j . prasanth . j


> On Jan. 11, 2018, 10:47 p.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java
> > Lines 306 (patched)
> > 
> >
> > In the context of triggers, we are more interested in *any* vertex 
> > violating the constraint.
> > 
> > INPUT_DIRECTORIES_vertex_a = 50
> > INPUT_DIRECTORIES_vertex_b = 100
> > 
> > This will only send INPUT_DIRECTORIES = 100 to trigger validator. 
> > 
> > We should probably special case this to something like
> > 
> > TOTAL_INPUT_DIRECTORIES (aggregated of all vertex counters)
> > MAX_INPUT_DIRECTORIES (max of all vertex counters)
> > 
> > Any thoughts?

Added DAG_ and VERTEX_ prefixes to the counters; the total and max values, 
respectively, are returned under the counter name following the prefix.


- Prasanth_J


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63382/#review195265
---


On Jan. 12, 2018, 1:08 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63382/
> ---
> 
> (Updated Jan. 12, 2018, 1:08 a.m.)
> 
> 
> Review request for hive and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-17833
> https://issues.apache.org/jira/browse/HIVE-17833
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-17833: Publish split generation counters
> 
> 
> Diffs
> ---
> 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersTezSessionPoolManager.java 3b6eb71
>   itests/src/test/resources/testconfiguration.properties ac81995
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 92741ee
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d68d646
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 6915cf1
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveInputCounters.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 5dd5e80
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 5ade1f3
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecTezSummaryPrinter.java 45bd6e0
>   ql/src/java/org/apache/hadoop/hive/ql/wm/VertexCounterLimit.java 7d6482a
>   ql/src/test/org/apache/hadoop/hive/ql/wm/TestTrigger.java b686783
>   ql/src/test/queries/clientpositive/tez_input_counters.q PRE-CREATION
>   ql/src/test/results/clientpositive/llap/tez_input_counters.q.out PRE-CREATION
> 
> 
> Diff: https://reviews.apache.org/r/63382/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



Re: Review Request 63382: HIVE-17833: Publish split generation counters

2018-01-11 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63382/
---

(Updated Jan. 12, 2018, 1:08 a.m.)


Review request for hive and Sergey Shelukhin.


Changes
---

Addressed review comments.


Bugs: HIVE-17833
https://issues.apache.org/jira/browse/HIVE-17833


Repository: hive-git


Description
---

HIVE-17833: Publish split generation counters


Diffs (updated)
---

  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersTezSessionPoolManager.java 3b6eb71
  itests/src/test/resources/testconfiguration.properties ac81995
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 92741ee
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d68d646
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkTask.java 6915cf1
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveInputCounters.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 5dd5e80
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/TezJobMonitor.java 5ade1f3
  ql/src/java/org/apache/hadoop/hive/ql/hooks/PostExecTezSummaryPrinter.java 45bd6e0
  ql/src/java/org/apache/hadoop/hive/ql/wm/VertexCounterLimit.java 7d6482a
  ql/src/test/org/apache/hadoop/hive/ql/wm/TestTrigger.java b686783
  ql/src/test/queries/clientpositive/tez_input_counters.q PRE-CREATION
  ql/src/test/results/clientpositive/llap/tez_input_counters.q.out PRE-CREATION


Diff: https://reviews.apache.org/r/63382/diff/3/

Changes: https://reviews.apache.org/r/63382/diff/2-3/


Testing
---


Thanks,

Prasanth_J



Re: Branch for "Per Table Write ID" implementation

2018-01-11 Thread Sankar Hariappan
Thanks Thejas, Eugene and Gopal for the feedback!
Will go ahead and create the branch!

Best regards
Sankar








On 11/01/18, 11:03 AM, "Eugene Koifman"  wrote:

>+1
>
>
>On 1/10/18, 8:18 PM, "Thejas Nair"  wrote:
>
>+1
>Makes sense to split the changes into multiple smaller patches that are
>easier to review, and creating this branch would help with that.
>
>
>
>On Tue, Jan 9, 2018 at 10:55 PM, Sankar Hariappan <
>shariap...@hortonworks.com> wrote:
>
>> Hi all,
>>
>> The "Hive Replication" feature is advancing to support ACID tables (HIVE-18320<
>> https://issues.apache.org/jira/browse/HIVE-18320>).
>> "Per Table Write ID" is an important requirement to support replication
>> for ACID tables, especially for the use case of "Analytics workload
>> off-loading for scalability". Details are available in the design document
>> attached in the JIRA.
>>
>> The Per Table Write ID implementation involves several changes.
>>
>>   1.  Add metadata tables to allocate and manage write ID. Also, map it
>> against global transaction.
>>   2.  Handle snapshot isolation for ACID/MM table reads by using
>> ValidWriteIDList instead of ValidTxnList.
>>   3.  Modify ORC/Hive row readers to use ValidWriteIDList instead of
>> ValidTxnList to read valid delta/base directories.
>>   4.  Update ValidCompactorTxnList to use table Write Ids.
>>   5.  Upgrade from existing Hive versions by migrating the ACID/MM tables
>> to use Write ID instead of global transaction ID.
>>   6.  Correct the UT test scripts to use ValidWriteIDList instead of
>> ValidTxnList for snapshot isolation tests.
>>   7.  Rename the method/variable names of several classes to use WriteId
>> instead of TxnId.
>>
>> As part of HIVE-18192, I have implemented the first 3 changes in the list,
>> which make ACID read/write work with the Write ID change. But this feature
>> will be incomplete without the rest of the changes.
>>
>> Hence, I would like to create a branch (branch-per-table-writeid) from
>> master to commit this feature with multiple patches. This branch is
>> expected to be short-lived for 2 to 3 weeks.
>>
>> Request feedback from the community.
>>
>> Best regards
>> Sankar
>>
>>
>
>
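
For readers unfamiliar with the idea in item 1 of the proposal (allocating write IDs per table and mapping them against the global transaction), here is a minimal in-memory sketch. It is purely illustrative: the class, its fields and the starting value of 1 are assumptions, and the real design keeps this state in metastore tables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch; the real design keeps this state in metastore tables.
final class PerTableWriteIdSketch {

    // table name -> last write ID handed out for that table
    private final Map<String, Long> lastWriteId = new HashMap<>();
    // "txnId/table" -> write ID allocated to that transaction on that table
    private final Map<String, Long> allocated = new HashMap<>();

    // Allocates (or returns the already allocated) write ID for this txn on this table.
    synchronized long allocate(long txnId, String table) {
        String key = txnId + "/" + table;
        Long existing = allocated.get(key);
        if (existing != null) {
            return existing;
        }
        // Each table has its own dense sequence, starting at 1 in this sketch.
        long writeId = lastWriteId.merge(table, 1L, Long::sum);
        allocated.put(key, writeId);
        return writeId;
    }

    public static void main(String[] args) {
        PerTableWriteIdSketch ids = new PerTableWriteIdSketch();
        System.out.println("txn 100 on t1 -> " + ids.allocate(100L, "t1")); // 1
        System.out.println("txn 101 on t1 -> " + ids.allocate(101L, "t1")); // 2
        System.out.println("txn 101 on t2 -> " + ids.allocate(101L, "t2")); // 1
    }
}
```

Because each table has its own sequence, a replica can replay writes for one table without needing the source cluster's global transaction IDs to line up.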