Re: Review Request 67296: HIVE-18875 : Enable SMB Join by default in Tez

2018-06-04 Thread Deepak Jaiswal


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java
> > Lines 172 (patched)
> > 
> >
> > It seems you fixed the same problem twice now. Once by fixing the close 
> > logic, and a second time with this. Did the close logic by itself not 
> > suffice?
> > 
> > I prefer the fix in the join operator to be honest. For multiple 
> > reasons:
> > 
> > a) This is a lot of new code.
> > b) The code assumes a lot about surrounding operators, that can easily 
> > break when you add new code paths.
> > c) Fixing it in the group by operator seems wrong. What if other 
> > operators flush on close? PTF? other joins? This seems brittle.
> > 
> > Can you go back to the original exec fix? Was there an issue with it?

The fix in the join op is not sufficient. I have multiple scenarios in which it
still broke, even with a lot of additional fixes (see Patch 11). The reason is
that the join op was written with shuffle join in mind first, where it is the
top operator in the Reducer. If it pulls a row, it gets it immediately, whereas
a reduce-side SMB does not. A parent GBY Op holds the row, and it depends on
reading the next group before pushing the current group. This behavior differs
from regular shuffle. The numerous fixes just patch one case at a time, which
makes this code really ugly, bug-prone, and inefficient due to too much
branching.

In order to fix this properly in the Join Op, the correct fix would be to go
back to the drawing board and rewrite the entire thing.

a) The new code is very specific to SMB on the reduce side and only applies to
it. This makes sure it mimics the state machine assumed by shuffle join. In
fact, I think the fix in close() may not even be needed, but I need to verify
that.
b) The assumptions are very rigid. A lot of conditions have to be met in order
to set reduceSMB to true. However, I think we can do this at compile time, when
the SMB Op is created right after the DummyOp is created. If a new code path is
added, it has to make sure the existing ones don't break.
c) The rows are forwarded as they are read. Once a row is pushed out of GBY, it
keeps being pushed until it reaches the JoinOp. Good point about PTF; I think
we need to do a similar thing in PTF as well. There is no other known case for
reduce-side SMB.

As mentioned above, the fix before this is a giant pool of patches which can 
run clean on ptests but can break on very simple and small tests which I plan 
to add once this code is green lit.
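
To make the timing difference concrete, here is a minimal Java sketch of a
streaming group-by over sorted input. It is not Hive's GroupByOperator:
StreamingGroupBy, its callback, and the count aggregate are hypothetical names
chosen for illustration. It only shows why a group cannot be forwarded until
the first row of the next group has been read, or until close.

```java
import java.util.function.BiConsumer;

// Hypothetical stand-in for a streaming group-by; not Hive's GroupByOperator.
final class StreamingGroupBy {
    private final BiConsumer<String, Long> downstream; // e.g. the join operator
    private String currentKey;   // key of the group being aggregated, null before the first row
    private long currentCount;   // running aggregate (a simple count here)

    StreamingGroupBy(BiConsumer<String, Long> downstream) {
        this.downstream = downstream;
    }

    // Rows arrive sorted by key. The current group is forwarded only when
    // a row with a different key shows up.
    void process(String key) {
        if (currentKey != null && !currentKey.equals(key)) {
            downstream.accept(currentKey, currentCount);
            currentCount = 0;
        }
        currentKey = key;
        currentCount++;
    }

    // The last group has no successor, so it can only be forwarded on close.
    void close() {
        if (currentKey != null) {
            downstream.accept(currentKey, currentCount);
            currentKey = null;
            currentCount = 0;
        }
    }
}
```

Feeding it the keys a, a, b forwards nothing after the two "a" rows; the group
a=2 is forwarded only when "b" arrives, and b=1 only on close(). An operator
below it therefore cannot assume that a requested row is available
immediately.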


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java
> > Lines 313 (patched)
> > 
> >
> > introducing method calls in the inner loop can have negative perf 
> > implications, are you sure this won't hurt?

Thanks for pointing this out, it's a remnant of the earlier patch which needs to 
go away.


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
> > Lines 733 (patched)
> > 
> >
> > this adds another branch in the inner loop also. might have perf 
> > implications.

Sure, will find a way to avoid it.


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
> > Lines 906 (patched)
> > 
> >
> > see other comment. this is a lot of new code - and unnecessary if 
> > you've already fixed it in join.

Explained in 1st comment.


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java
> > Lines 614 (patched)
> > 
> >
> > I still don't think this code should be here. This seems to be doing 
> > the exact same thing as the "checkColEquality" below. If not can you tell 
> > me how this is different and why the calls below don't suffice? Otherwise 
> > let's remove this?

I will see if this code can be removed.


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
> > Lines 265 (patched)
> > 
> >
> > I don't think this comment should be here. In this class you should be 
> > explaining what the rules are, not which situations you ran into on the 
> > execution side. Can you please remove?

I will rewrite it.


> On June 5, 2018, 3:21 a.m., Gunther Hagleitner wrote:
> > 

Re: Review Request 67296: HIVE-18875 : Enable SMB Join by default in Tez

2018-06-04 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67296/#review204301
---




ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java
Lines 172 (patched)


It seems you fixed the same problem twice now. Once by fixing the close 
logic, and a second time with this. Did the close logic by itself not suffice?

I prefer the fix in the join operator to be honest. For multiple reasons:

a) This is a lot of new code.
b) The code assumes a lot about surrounding operators, that can easily 
break when you add new code paths.
c) Fixing it in the group by operator seems wrong. What if other operators 
flush on close? PTF? other joins? This seems brittle.

Can you go back to the original exec fix? Was there an issue with it?



ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java
Lines 313 (patched)


introducing method calls in the inner loop can have negative perf 
implications, are you sure this won't hurt?



ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
Lines 733 (patched)


this adds another branch in the inner loop also. might have perf 
implications.



ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
Lines 906 (patched)


see other comment. this is a lot of new code - and unnecessary if you've 
already fixed it in join.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java
Lines 614 (patched)


I still don't think this code should be here. This seems to be doing the 
exact same thing as the "checkColEquality" below. If not can you tell me how 
this is different and why the calls below don't suffice? Otherwise let's remove 
this?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
Lines 265 (patched)


I don't think this comment should be here. In this class you should be 
explaining what the rules are, not which situations you ran into on the 
execution side. Can you please remove?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
Lines 299 (patched)


This is a nit, feel free to ignore: I don't think it's necessarily a 
"parent gby" that creates the bucketing. Just drop after "than".


- Gunther Hagleitner


On June 4, 2018, 5:38 a.m., Deepak Jaiswal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67296/
> ---
> 
> (Updated June 4, 2018, 5:38 a.m.)
> 
> 
> Review request for hive, Gunther Hagleitner and Jason Dere.
> 
> 
> Bugs: HIVE-18875
> https://issues.apache.org/jira/browse/HIVE-18875
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fixed various issues with SMB, mostly on the Reducer side join.
> GBY Op now uses inputObjectInspector[0] all the time as it is the only OI it 
> has. The tag is irrelevant here. Was causing problem with SMB.
> Disabled SMB in the Hive on Spark tests, as the same config for Tez was 
> enabling it there.
> Some SMB specific tests were designed to first run without SMB and then with 
> SMB. With SMB enabled by default, it is explicitly turned off to make sure 
> the behavior is maintained.
> 
> Please go through the JIRA comments as they may clear up some questions.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3295d1dbc5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 
> aefaa0586e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4b766382ef 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
> 4019f132d3 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
>  9e5446566b 
>   ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q 7416eb0ec0 
>   ql/src/test/queries/clientpositive/skewjoinopt19.q 02cadda7f5 
>   ql/src/test/queries/clientpositive/skewjoinopt20.q 160e5b82d9 
>   ql/src/test/queries/clientpositive/smb_mapjoin_11.q 6ce49b83c2 
>   ql/src/test/queries/clientpositive/smb_mapjoin_12.q 753e4d3c9a 
>   ql/src/test/queries/clientpositive/smb_mapjoin_17.q d68f5f3139 
>   ql/src/test/queries/clientpositive/subquery_notin.q 64940277bb 
>   ql/src/test/results/clientpositive/llap/correlationoptimizer2.q.out 
> 0f839ead0e 
>   

[jira] [Created] (HIVE-19793) disable LLAP IO batch-to-row wrapper for ACID deletes/updates

2018-06-04 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19793:
---

 Summary: disable LLAP IO batch-to-row wrapper for ACID 
deletes/updates
 Key: HIVE-19793
 URL: https://issues.apache.org/jira/browse/HIVE-19793
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19792) Enable schema evolution tests for decimal 64

2018-06-04 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-19792:


 Summary: Enable schema evolution tests for decimal 64
 Key: HIVE-19792
 URL: https://issues.apache.org/jira/browse/HIVE-19792
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Prasanth Jayachandran


The following tests are disabled in HIVE-19629 because ORC's
ConvertTreeReaderFactory does not handle Decimal64ColumnVectors. This JIRA is
to re-enable those tests once ORC supports it.





[jira] [Created] (HIVE-19791) Modify TableDesc to contain the catalog

2018-06-04 Thread Alan Gates (JIRA)
Alan Gates created HIVE-19791:
-

 Summary: Modify TableDesc to contain the catalog
 Key: HIVE-19791
 URL: https://issues.apache.org/jira/browse/HIVE-19791
 Project: Hive
  Issue Type: Sub-task
  Components: Query Planning
Affects Versions: 3.0.0
Reporter: Alan Gates
Assignee: Alan Gates


TableDesc currently only contains a table's database and tablename.  It needs 
to also have the catalog name.





[jira] [Created] (HIVE-19790) Metastore upgrade: 3.1.0 upgrade script is slow and non-idempotent

2018-06-04 Thread Gopal V (JIRA)
Gopal V created HIVE-19790:
--

 Summary: Metastore upgrade: 3.1.0 upgrade script is slow and 
non-idempotent
 Key: HIVE-19790
 URL: https://issues.apache.org/jira/browse/HIVE-19790
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Gopal V


Because of the giant bit-vectors stored in MySQL, the update of PART_COL_STATS 
is very slow and is also not idempotent.

{code}
--
UPDATE `PART_COL_STATS`
  SET `CAT_NAME` = 'hive'
--

Query OK, 0 rows affected (4 min 1.57 sec)
Rows matched: 778025  Changed: 0  Warnings: 0
{code}

Adding a filter speeds it up because it will no longer overwrite rows that
already have a CAT_NAME set:

{code}
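mysql> explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` ='';
--
explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` =''
--

+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
| id | select_type | table          | type  | possible_keys | key           | key_len | ref   | rows | Extra                        |
+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
|  1 | SIMPLE      | PART_COL_STATS | range | PCS_STATS_IDX | PCS_STATS_IDX | 258     | const |    1 | Using where; Using temporary |
+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
1 row in set (0.00 sec)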
{code}

This would be much faster to re-run and would not accidentally overwrite any 
existing CAT_NAMEs.
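
For callers that apply such a backfill from Java rather than from the upgrade
script, the same filtered statement could be issued over JDBC roughly as
follows. This is only a sketch: CatNameBackfill and its method are hypothetical
names, and the filter on an empty CAT_NAME is taken from the explain above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper; the table and column names come from the report above.
final class CatNameBackfill {
    // Only rows whose CAT_NAME is still empty are touched, so a re-run matches nothing.
    static final String SQL =
        "UPDATE `PART_COL_STATS` SET `CAT_NAME` = ? WHERE `CAT_NAME` = ''";

    // Returns the row count reported by the driver for this statement.
    static int run(Connection conn, String catalog) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, catalog);
            return ps.executeUpdate();
        }
    }
}
```

Because the WHERE clause excludes rows that already carry a catalog name, a
second run has nothing left to match, which is the property the unfiltered
UPDATE lacks.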





[jira] [Created] (HIVE-19789) reenable orc_llap test

2018-06-04 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19789:
---

 Summary: reenable orc_llap test
 Key: HIVE-19789
 URL: https://issues.apache.org/jira/browse/HIVE-19789
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Matt McCline


The test was disabled in HIVE-11394, apparently by mistake (or due to some
issue with the patch there that was never addressed).
It needs to be re-enabled.





[jira] [Created] (HIVE-19788) Flaky test: TestHCatLoaderComplexSchema

2018-06-04 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19788:
---

 Summary: Flaky test: TestHCatLoaderComplexSchema
 Key: HIVE-19788
 URL: https://issues.apache.org/jira/browse/HIVE-19788
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Reporter: Sahil Takiar
Assignee: Sahil Takiar


{{TestHCatLoaderComplexSchema}} is still flaky because it's writing to {{/tmp/}}.
HIVE-19731 was meant to fix this, and it does fix the tmp dir for any Hive
queries, but these tests run a bunch of Pig queries too, and those queries
write to {{/tmp/}}. We need to pass in custom configs to the embedded
{{PigServer}} that is being created as part of these tests.





[jira] [Created] (HIVE-19787) Log message when spark-submit has completed

2018-06-04 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19787:
---

 Summary: Log message when spark-submit has completed
 Key: HIVE-19787
 URL: https://issues.apache.org/jira/browse/HIVE-19787
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Sahil Takiar


If {{spark-submit}} runs successfully, the "Driver" thread should log a message. 
Otherwise there is no way to know if {{spark-submit}} exited successfully. We 
should also rename the thread to something more informative than "Driver".

Without this, debugging timeout exceptions of the RemoteDriver -> HS2 
connection is difficult, because there is no way to know if {{spark-submit}} 
finished or not.
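
A minimal sketch of what this could look like, assuming the child process is
available as a java.lang.Process. SparkSubmitMonitor, the thread name, and the
message texts are hypothetical and not taken from the Hive code base.

```java
import java.util.function.Consumer;

// Hypothetical sketch; not the actual Hive Spark client code.
final class SparkSubmitMonitor {
    // Builds the message to log once the child process has exited.
    static String completionMessage(int exitCode) {
        return exitCode == 0
            ? "spark-submit completed successfully"
            : "spark-submit exited with code " + exitCode;
    }

    // Starts a descriptively named thread that waits for the process and logs its outcome.
    static Thread watch(Process sparkSubmit, Consumer<String> log) {
        Thread t = new Thread(() -> {
            try {
                log.accept(completionMessage(sparkSubmit.waitFor()));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                log.accept("Interrupted while waiting for spark-submit");
            }
        }, "spark-submit-monitor"); // hypothetical name, more informative than "Driver"
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

With a message logged on both the success and the failure path, a timeout on
the RemoteDriver -> HS2 connection can be correlated with whether
{{spark-submit}} had already exited.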





Re: Review Request 67263: HIVE-19602

2018-06-04 Thread Bharathkrishna Guruvayoor Murali via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67263/
---

(Updated June 4, 2018, 6:34 p.m.)


Review request for hive, Sahil Takiar and Vihang Karajgaonkar.


Changes
---

Correcting issues reported by checkstyle and findbugs checks.


Bugs: HIVE-19602
https://issues.apache.org/jira/browse/HIVE-19602


Repository: hive-git


Description
---

Refactor inplace progress code in Hive-on-spark progress monitor to use 
ProgressMonitor instance


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java 
7afd8864075aa0d9708274eea8839c662324c732 
  
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkProgressMonitor.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/67263/diff/4/

Changes: https://reviews.apache.org/r/67263/diff/3-4/


Testing
---


Thanks,

Bharathkrishna Guruvayoor Murali



Re: Review Request 67351: HIVE-19718 Adding partitions in bulk also fetches table for each partition

2018-06-04 Thread Vihang Karajgaonkar via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67351/#review204262
---




standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 2411-2416 (original)


Why do we need to remove these lines?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Line 2422 (original)


Why do we need to remove this line?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
Lines 2426 (patched)


Adding a Javadoc would be great, especially one mentioning the advantage of 
using this method and when it's better to use it.


- Vihang Karajgaonkar


On May 29, 2018, 10:53 a.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67351/
> ---
> 
> (Updated May 29, 2018, 10:53 a.m.)
> 
> 
> Review request for hive, Alexander Kolbasov and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-19718
> https://issues.apache.org/jira/browse/HIVE-19718
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Various optimizations for the addPartitions call:
> - Push down table object to convertToMPart
> - Push down partitionKeys to startAddPartition -> doesPartitionExist -> 
> getMPartition, so it does not have to query the table object every time 
> if we add multiple partitions for the same table
> - The original getMPartition used to query the table every time. Created a 
> new version of getMPartition, which can use the provided partitionKeys 
> instead of querying it again.
> 
> 
> Diffs
> -
> 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  3d6fda6 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  c1d25db 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  13ccdb1 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
>  ce7d286 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  b223920 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  f6899be 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  98a85cc 
> 
> 
> Diff: https://reviews.apache.org/r/67351/diff/1/
> 
> 
> Testing
> ---
> 
> Ran several performance tests with Sasha's performance tool. These 
> optimisations shave off ~10% of the runtime.
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



[jira] [Created] (HIVE-19786) RpcServer cancelTask log message is incorrect

2018-06-04 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19786:
---

 Summary: RpcServer cancelTask log message is incorrect
 Key: HIVE-19786
 URL: https://issues.apache.org/jira/browse/HIVE-19786
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Sahil Takiar


The log message inside the {{cancelTask}} of the {{RpcServer}} 
{{ChannelInitializer}} is incorrect. It states it's measuring the timeout for 
the "test" message to be sent (basically a "hello" message to test that the 
connection works). However, the {{cancelTask}} is actually used to time out the 
SASL negotiation between the client and the server.





[jira] [Created] (HIVE-19785) Race condition when timeout task is invoked during SASL negotation

2018-06-04 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19785:
---

 Summary: Race condition when timeout task is invoked during SASL 
negotation
 Key: HIVE-19785
 URL: https://issues.apache.org/jira/browse/HIVE-19785
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Sahil Takiar


There is a race condition that leads to some extraneous exception messages when 
the timeout task is invoked in {{RpcServer}}.

If a timeout is triggered by {{RpcServer#registerClient}} the method will 
remove the {{clientId}} from {{pendingClients}}. However, if the SASL 
negotiation is in progress when the timeout task is invoked, then 
{{SaslServerHandler#update}} will throw an {{IllegalArgumentException}} 
complaining that it can't find the {{clientId}} in the map of 
{{pendingClients}}.

The timeout still succeeds, but the logging is confusing and multiple 
exceptions make this difficult to debug.
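
The interleaving can be modelled in a few lines of Java. This is a simplified
model rather than the real {{RpcServer}} or {{SaslServerHandler}}:
PendingClients, its methods, and the exception message are hypothetical, and
lookupTolerant is only one possible way to avoid the noisy exception, not a
committed fix.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical model of the interaction; not the real RpcServer or SaslServerHandler.
final class PendingClients {
    private final ConcurrentMap<String, String> pending = new ConcurrentHashMap<>();

    void register(String clientId, String secret) {
        pending.put(clientId, secret);
    }

    // What the timeout task does: drop the client. Returns true if it was still pending.
    boolean timeOut(String clientId) {
        return pending.remove(clientId) != null;
    }

    // Lookup as described in the report: a missing id is treated as a caller error.
    String lookupStrict(String clientId) {
        String secret = pending.get(clientId);
        if (secret == null) {
            throw new IllegalArgumentException("Unrecognized client ID: " + clientId);
        }
        return secret;
    }

    // One possible alternative (an assumption): report "gone" without an exception,
    // so the handler can stop quietly after a timeout.
    Optional<String> lookupTolerant(String clientId) {
        return Optional.ofNullable(pending.get(clientId));
    }
}
```

If timeOut runs between register and the handler's lookup, lookupStrict throws
even though the timeout itself was handled correctly, which matches the extra
exceptions described above.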





Review Request 67437: HIVE-19649: Clean up inputs in JDBC PreparedStatement. Add unit tests.

2018-06-04 Thread Mykhailo Kysliuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67437/
---

Review request for hive, Ashutosh Chauhan, Eugene Koifman, and Zoltan Haindrich.


Repository: hive-git


Description
---

Initial commit.


Diffs
-

  jdbc/src/test/org/apache/hive/jdbc/TestHivePreparedStatement.java 2a68c91 


Diff: https://reviews.apache.org/r/67437/diff/1/


Testing
---


Thanks,

Mykhailo Kysliuk



[jira] [Created] (HIVE-19784) Regression test selection framework for ptest

2018-06-04 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-19784:
---

 Summary: Regression test selection framework for ptest
 Key: HIVE-19784
 URL: https://issues.apache.org/jira/browse/HIVE-19784
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Sahil Takiar


Regression test selection (RTS) is a methodology for decreasing the number of
tests that are run in regression test suites. The idea is that, for a given
change, only the tests relevant to that change are run, rather than all the
tests.

For example, right now Hive QA runs all the {{standalone-metastore}} tests for 
every patch. However, most of the time this isn't necessary. If a patch is only 
modifying files in {{ql}} or {{common}} there is no need to run 
{{standalone-metastore}} tests as there is no dependency from the 
{{standalone-metastore}} to any other Hive module (except for 
{{storage-api}}).

RTS is commonly used for CI systems. Google has published some interesting info 
on how they do this:
* http://google-engtools.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html
* https://drive.google.com/file/d/0Bx-FLr0Egz9zYXJfMEZ6NERTbkU/view
* [Bazel|https://bazel.build/] seems to provide some functionality to do this: 
http://code.hootsuite.com/faster-automated-tests-bazel/

There are a few other open-source projects that offer different ways of doing 
this: [Ekstazi|http://ekstazi.org/]

A short term solution would be to implement the following:
* Before each Hive QA, parse the Maven dependency graph
* Take the specified patch and check which Maven modules it modifies
* Run tests contained inside the modified modules and their dependent modules
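
The selection step in the list above can be sketched as a small graph
traversal. This is an illustration under assumptions: ModuleSelector is a
hypothetical name, and the dependency map is passed in directly instead of
being parsed from Maven.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of module-level test selection; in practice the graph would come from Maven.
final class ModuleSelector {
    // dependsOn maps each module to the modules it depends on directly.
    // Returns the changed modules plus every module that depends on them,
    // directly or transitively.
    static Set<String> modulesToTest(Map<String, Set<String>> dependsOn,
                                     Collection<String> changed) {
        Set<String> selected = new TreeSet<>(changed);
        Deque<String> work = new ArrayDeque<>(changed);
        while (!work.isEmpty()) {
            String current = work.pop();
            for (Map.Entry<String, Set<String>> e : dependsOn.entrySet()) {
                // e.getKey() must be retested if it depends on an already selected module.
                if (e.getValue().contains(current) && selected.add(e.getKey())) {
                    work.push(e.getKey());
                }
            }
        }
        return selected;
    }
}
```

With a graph in which {{ql}} depends on {{common}} and
{{standalone-metastore}}, and both of those depend on {{storage-api}}, a patch
touching only {{ql}} selects just {{ql}}, while a patch touching
{{storage-api}} selects all four modules.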





Review Request 67436: HIVE-19582: NPE during CREATE ROLE using SQL Standard Based Hive Authorization

2018-06-04 Thread Mykhailo Kysliuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67436/
---

Review request for hive, Ashutosh Chauhan, Eugene Koifman, and Thejas Nair.


Repository: hive-git


Description
---

Initial commit


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6c56212 
  ql/src/test/org/apache/hadoop/hive/ql/session/TestSessionState.java 0fa1c81 
  
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFLoggedInUser.java
 PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_logged_in_user.q c9a6370 
  ql/src/test/results/clientpositive/udf_logged_in_user.q.out da8b894 


Diff: https://reviews.apache.org/r/67436/diff/1/


Testing
---


Thanks,

Mykhailo Kysliuk



[jira] [Created] (HIVE-19783) Retrieve only locations in HiveMetaStore.dropPartitionsAndGetLocations

2018-06-04 Thread Peter Vary (JIRA)
Peter Vary created HIVE-19783:
-

 Summary: Retrieve only locations in 
HiveMetaStore.dropPartitionsAndGetLocations
 Key: HIVE-19783
 URL: https://issues.apache.org/jira/browse/HIVE-19783
 Project: Hive
  Issue Type: Improvement
Reporter: Peter Vary
Assignee: Peter Vary


Further optimize the dropTable command.
Currently {{HiveMetaStore.dropPartitionsAndGetLocations}} retrieves the whole 
partition object, but we only need the locations.

Create a RawStore method to retrieve only the locations.





[jira] [Created] (HIVE-19782) Flash out TestObjectStore.testDirectSQLDropParitionsCleanup

2018-06-04 Thread Peter Vary (JIRA)
Peter Vary created HIVE-19782:
-

 Summary: Flash out 
TestObjectStore.testDirectSQLDropParitionsCleanup
 Key: HIVE-19782
 URL: https://issues.apache.org/jira/browse/HIVE-19782
 Project: Hive
  Issue Type: Test
  Components: Standalone Metastore
Affects Versions: 4.0.0
Reporter: Peter Vary
Assignee: Peter Vary


{{TestObjectStore.testDirectSQLDropParitionsCleanup}} checks that the tables 
are empty after the drop. We should add some rows to every partition-related 
table, to verify that they are really cleaned up.





Re: Review Request 67399: HIVE-19503 Create a test that checks for dropPartitions with directSql

2018-06-04 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67399/
---

(Updated June 4, 2018, 9:51 a.m.)


Review request for hive, Alexander Kolbasov and Vihang Karajgaonkar.


Changes
---

The PreCommit test failed because the ObjectStore is not run in test mode, so 
the initializing queries are not called, thus the tables are not created.
Set the HIVE_IN_TEST flag before starting the ObjectStore.


Bugs: HIVE-19503
https://issues.apache.org/jira/browse/HIVE-19503


Repository: hive-git


Description
---

The patch contains 2 tests:

- Test which checks if the JDO cache is able to handle directSql partition drops
- Test which checks if the directSQL partition drop removes all connected data
  from the RDBMS tables.

To create these tests we have 2 helper methods:

- Method to create the partitioned table
- Method to check the number of rows in a given RDBMS table

Added a new ObjectStore.dropPartitionsInternal method, which is only visible for
testing, so we can make sure that the dropPartition is using directSql and does
not fall back to JDO.

Fixed a problem where some of the tables are not created automatically by the
tests, adding new init queries to the MetaStoreDirectSql.ensureDbInit method.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 5bb1985 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 b15d89d 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java
 9912213 


Diff: https://reviews.apache.org/r/67399/diff/3/

Changes: https://reviews.apache.org/r/67399/diff/2-3/


Testing
---

Run the new tests


Thanks,

Peter Vary



[jira] [Created] (HIVE-19781) Fix qtest isolation problems

2018-06-04 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-19781:
---

 Summary: Fix qtest isolation problems
 Key: HIVE-19781
 URL: https://issues.apache.org/jira/browse/HIVE-19781
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


With HIVE-19237 the following happens:

* running explainanlayze_2 locally produces different operator ids in the plan 
than on the ptest server
* I've narrowed this down to making a single change to 
"hive.fetch.task.conversion" in a different test

I feel that the isolation is broken somewhere: hiveconf changes should not
cause changes in different tests.


