[jira] [Created] (HIVE-23667) Incorrect output with option hive.auto.convert.join=false

2020-06-08 Thread gaozhan ding (Jira)
gaozhan ding created HIVE-23667:
---

 Summary: Incorrect output with option hive.auto.convert.join=false
 Key: HIVE-23667
 URL: https://issues.apache.org/jira/browse/HIVE-23667
 Project: Hive
  Issue Type: Bug
Reporter: gaozhan ding
 Fix For: 3.1.0


We use Hive 3.1.0 with the Tez engine, version 0.9.1.3.

I ran into a problem when executing a Hive SQL statement. The SQL is as follows:
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=1;
set hive.exec.max.dynamic.partitions=1;
set hive.fileformat.check=false;
set mapred.reduce.tasks=50;
set hive.auto.convert.join=true;
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  
dw_fact_inverter_detail.site=230_dim_site.id;{code}
It returns the following output:
{code:java}
+----------+
|   _c0    |
+----------+
| 4954736  |
+----------+
{code}
But when the hive.auto.convert.join option is set to false, the output is not
as expected.

The SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=1;
set hive.exec.max.dynamic.partitions=1;
set hive.fileformat.check=false;  
set mapred.reduce.tasks=50;
set hive.auto.convert.join=false; -- changed
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  
dw_fact_inverter_detail.site=230_dim_site.id;{code}
with output:
{code:java}
+------+
| _c0  |
+------+
| 0    |
+------+
{code}
In addition, both tables participating in the join are partitioned tables.

Notably, if the option mapred.reduce.tasks=50 is not set, both of the queries
above return the expected result.

We just upgraded Hive from 1.2 to 3.1.0, and we found that these problems only
occur on the old Hive tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23666) checkHashModeEfficiency is skipped when a groupby operator doesn't have a grouping set

2020-06-08 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-23666:
---

 Summary: checkHashModeEfficiency is skipped when a groupby 
operator doesn't have a grouping set
 Key: HIVE-23666
 URL: https://issues.apache.org/jira/browse/HIVE-23666
 Project: Hive
  Issue Type: Bug
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


checkHashModeEfficiency is skipped when a groupby operator doesn't have a 
grouping set





[jira] [Created] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-06-08 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-23665:
---

 Summary: Rewrite last_value to first_value to enable streaming 
results
 Key: HIVE-23665
 URL: https://issues.apache.org/jira/browse/HIVE-23665
 Project: Hive
  Issue Type: Bug
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan


Rewrite last_value to first_value to enable streaming results

last_value cannot be streamed because the intermediate results need to be
buffered to determine the window result until we get the last row in the window.
But if we can rewrite it to first_value, we can stream the results, although the
order of results will not be guaranteed (which is also not important).
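
The idea can be illustrated with a small, self-contained Java sketch. It is not
Hive code: the Row class and both method names are hypothetical, and it assumes
a frame that spans the whole partition and distinct ordering keys, neither of
which the issue text states. Under those assumptions, last_value with ascending
order and first_value with descending order give the same value for every row,
but only the second can emit a result as soon as each row arrives.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LastValueRewriteSketch {
    /** One row of a single window partition (hypothetical shape, not a Hive type). */
    static final class Row {
        final int orderKey;
        final String value;

        Row(int orderKey, String value) {
            this.orderKey = orderKey;
            this.value = value;
        }
    }

    /** last_value ... ORDER BY orderKey ASC: nothing can be emitted before the last row is read. */
    static List<String> lastValueBuffered(List<Row> partition) {
        List<Row> sorted = new ArrayList<>(partition);
        sorted.sort(Comparator.comparingInt((Row r) -> r.orderKey));
        List<String> out = new ArrayList<>();
        if (sorted.isEmpty()) {
            return out;
        }
        // The answer is only known once the whole partition has been buffered.
        String last = sorted.get(sorted.size() - 1).value;
        for (int i = 0; i < sorted.size(); i++) {
            out.add(last);
        }
        return out;
    }

    /** first_value ... ORDER BY orderKey DESC: the answer is known after the first row. */
    static List<String> firstValueStreaming(List<Row> partition) {
        List<Row> sorted = new ArrayList<>(partition);
        sorted.sort(Comparator.comparingInt((Row r) -> r.orderKey).reversed());
        List<String> out = new ArrayList<>();
        String first = null;
        for (Row r : sorted) {
            if (out.isEmpty()) {
                first = r.value; // remembered once, then reused for every later row
            }
            out.add(first); // each row's result can be emitted immediately
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> partition = new ArrayList<>();
        partition.add(new Row(2, "b"));
        partition.add(new Row(3, "c"));
        partition.add(new Row(1, "a"));
        System.out.println(lastValueBuffered(partition));
        System.out.println(firstValueStreaming(partition));
    }
}
```

Both calls produce the same list of values for this input. The rows come out in
the opposite order in the streaming variant, which matches the remark above that
the order of results is not guaranteed.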





[jira] [Created] (HIVE-23664) Handle SQL Mode For Views

2020-06-08 Thread David Mollitor (Jira)
David Mollitor created HIVE-23664:
-

 Summary: Handle SQL Mode For Views
 Key: HIVE-23664
 URL: https://issues.apache.org/jira/browse/HIVE-23664
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor


When a view is created, its SQL statement is stored in the HMS.

With the introduction of SQL mode [HIVE-19064], what happens when a view is 
created in one mode, but executed while the session is set to a different mode?





Re: Review Request 72528: ValidTxnManager doesn't consider txns opened and committed between snapshot generation and locking when evaluating ValidTxnListState

2020-06-08 Thread Jesús Camacho Rodríguez


> On May 20, 2020, 3:16 p.m., Peter Varga wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 686 (patched)
> > 
> >
> > I have concerns here, but I am not sure whether they are well founded.
> > I think this will break what the outside world thinks of snapshot
> > isolation. I might have a hypothetical client that inserts lots of data into
> > a source table and sometimes issues a merge statement from the source to the
> > target table. They have a requirement that the target table cannot have
> > partial data regarding some property. For example, they are inserting sales
> > data, and the target table cannot contain half the data of a day; it can
> > either have all or none. So what the client does is issue the inserts into
> > the source table synchronously, ordered by date, and when it gets to the
> > next day it issues a merge statement asynchronously and continues to insert
> > the data for the next day synchronously. It might think that it is safe
> > to do so: since the merge statement has a snapshot, it will not see the data
> > inserted afterwards. But with this change it will break.
> > It might not be the best example, since how would the client know when
> > the snapshot is actually captured? But I am not familiar enough with the
> > ecosystem: does anything use Hive by issuing the compile and run
> > separately? Because there you could be sure, before this change, that the
> > compilation order also meant snapshot order. To summarize, I don't know
> > what the outside world expects of snapshot isolation.
> 
> Denys Kuzmenko wrote:
> Insert into source and merge from source into target won't conflict with
> each other; they touch different tables. Maybe I am missing something here...
> 
> Peter Varga wrote:
> My example was not perfect. I don't mean that it will conflict with the
> insert into the source table. It can conflict with some other client's
> transaction. My main point is that after the conflict is noticed and you
> regenerate the snapshot, it will start to read results from transactions that
> were opened and committed after the original query was compiled, and I'm just
> trying to figure out what kind of problems it can cause, if any. In my
> example you start to read records inserted later, but what if somebody added
> a new partition since the compilation? Wouldn't that cause a problem?
> 
> Denys Kuzmenko wrote:
> There might be an issue, as we won't create any locks for the
> newly created partition, yet we'll start reading it.
> Instead of rollback & retry on the Hive side, we might consider just failing
> and letting the user retry.
> 
> Denys Kuzmenko wrote:
> However, that still leaves the question of what happens in Hive today when
> somebody adds a new partition (insert with dynamic partitioning) after the
> compilation (merge insert). I'll test this out.
> 
> Peter Varga wrote:
> You could construct an example like this:
> 1. Open and compile transaction 1, which merge inserts data from a
> partitioned source table that has a few partitions.
> 2. Open, run and commit transaction 2, which inserts data into an old and a
> new partition of the source table.
> 3. Open, run and commit transaction 3, which inserts data into the target
> table of the merge statement; that will retrigger a snapshot generation in
> transaction 1.
> 4. Run transaction 1. The snapshot will be regenerated, and I think it
> will read partial data from transaction 2, breaking the ACID properties.
> 
> But we should try the test without your patch with a slightly different
> setup.
> Switch the transaction order:
> 1. Compile transaction 1, which inserts data into an old and a new partition
> of the source table.
> 2. Compile transaction 2, which inserts data into the target table.
> 3. Compile transaction 3, which merge inserts data from the source table into
> the target table.
> 4. Run and commit transaction 1.
> 5. Run and commit transaction 2.
> 6. Run transaction 3. Since it contains 1 and 2 in its snapshot,
> isValidTxnListState will be triggered without your patch, and I think it is
> possible that we do a partial read of transaction 1 for the same reasons.

For the first scenario, afaik we should not read the list of partitions
incorrectly in item 4. In fact, if a new snapshot needs to be generated, we
recompile the query, and thus the list of partitions should be regenerated
too. If it is not, then we are keeping some state unintentionally (since the
semantic analyzer is created again in the recompilation, I am not sure where
that state would be coming from, though).
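
To make the visibility question in this thread concrete, here is a toy Java
model of snapshot reads. It is only a sketch under strong assumptions: a single
commit counter stands in for the snapshot, and the class and method names are
hypothetical. It does not use Hive's ValidTxnList or lock manager. It shows
that a reader holding the original snapshot sees none of a later transaction's
rows, while a regenerated snapshot sees all of them, including rows in a new
partition.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotSketch {
    /** A committed row together with the commit sequence number that made it visible. */
    private static final class Write {
        final long commitSeq;
        final String row;

        Write(long commitSeq, String row) {
            this.commitSeq = commitSeq;
            this.row = row;
        }
    }

    private final List<Write> committed = new ArrayList<>();
    private long clock = 0;

    /** Commits all rows of one transaction atomically under a single sequence number. */
    void commit(String... rows) {
        long seq = ++clock;
        for (String row : rows) {
            committed.add(new Write(seq, row));
        }
    }

    /** A snapshot is modelled as the highest commit sequence number visible to the reader. */
    long takeSnapshot() {
        return clock;
    }

    /** Returns every row committed at or before the given snapshot. */
    List<String> read(long snapshot) {
        List<String> visible = new ArrayList<>();
        for (Write w : committed) {
            if (w.commitSeq <= snapshot) {
                visible.add(w.row);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        SnapshotSketch source = new SnapshotSketch();
        source.commit("part=old/row1");                  // data present before compilation
        long compiled = source.takeSnapshot();           // transaction 1 compiles here
        source.commit("part=old/row2", "part=new/row3"); // transaction 2 commits afterwards
        long regenerated = source.takeSnapshot();        // transaction 1 regenerates its snapshot
        System.out.println(source.read(compiled));
        System.out.println(source.read(regenerated));
    }
}
```

In this model the partial read discussed above could only happen if the reader
combined the new snapshot with a partition list computed under the old one,
which is why recompiling together with the snapshot matters.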


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72528/#review220838

[jira] [Created] (HIVE-23663) Fix Travis YAML For Hive 2.x

2020-06-08 Thread David Mollitor (Jira)
David Mollitor created HIVE-23663:
-

 Summary: Fix Travis YAML For Hive 2.x
 Key: HIVE-23663
 URL: https://issues.apache.org/jira/browse/HIVE-23663
 Project: Hive
  Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor








[DISCUSS] Disabling failing tests

2020-06-08 Thread Zoltan Haindrich

Hey all!

Since we started running tests in the new environment, a new stream of builds
on the master branch has also started.
This lets us see how we are doing without any patches. Because we only
merge changes after a clean run, this should be free of test failures.

However, it might happen that some tests start polluting it. I would like to
propose disabling all tests which start failing there:
* open a jira mentioning the issue
* ignore the test and place a pointer to the opened jira
* submit the patch immediately, so that people with commit rights can
safeguard a green environment

http://34.66.156.144:8080/job/hive-precommit/job/master/

cheers,
Zoltan


[jira] [Created] (HIVE-23662) Run a smoketest during precommit

2020-06-08 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-23662:
---

 Summary: Run a smoketest during precommit
 Key: HIVE-23662
 URL: https://issues.apache.org/jira/browse/HIVE-23662
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich








[jira] [Created] (HIVE-23661) Run metastore integration tests during precommit

2020-06-08 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-23661:
---

 Summary: Run metastore integration tests during precommit
 Key: HIVE-23661
 URL: https://issues.apache.org/jira/browse/HIVE-23661
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich








[jira] [Created] (HIVE-23660) Provide a way to check test stability

2020-06-08 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-23660:
---

 Summary: Provide a way to check test stability
 Key: HIVE-23660
 URL: https://issues.apache.org/jira/browse/HIVE-23660
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich








Re: Review Request 72526: HIVE-23493

2020-06-08 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72526/
---

(Updated June 8, 2020, 12:52 p.m.)


Review request for hive and Jesús Camacho Rodríguez.


Bugs: HIVE-23493
https://issues.apache.org/jira/browse/HIVE-23493


Repository: hive-git


Description
---

Rewrite plan to join back tables with many projected columns joined multiple 
times


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1464d6aa7a
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinOptimization.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveCardinalityPreservingJoinRule.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFieldTrimmerRule.java 73ff1bccf2
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java 19ce3ea223
  ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 377e8280e5
  ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query11.q.out 0136ee4bb5
  ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query4.q.out 987a0f348e
  ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query74.q.out 289e5d2569
  ql/src/test/results/clientpositive/perf/tez/constraints/query11.q.out 7f9df5e8af
  ql/src/test/results/clientpositive/perf/tez/constraints/query4.q.out 585f4d6b9c
  ql/src/test/results/clientpositive/perf/tez/constraints/query74.q.out 39c76fc82c


Diff: https://reviews.apache.org/r/72526/diff/5/

Changes: https://reviews.apache.org/r/72526/diff/4-5/


Testing (updated)
---

mvn test -Dtest.output.overwrite -DskipSparkTests \
  -Dtest=TestMiniLlapLocalCliDriver -Dqfile=cardinality_preserving_join_opt.q \
  -pl itests/qtest -Pitests
mvn test -Dtest.output.overwrite -DskipSparkTests \
  -Dtest=TestTezPerfConstraintsCliDriver \
  -Dqfile=cbo_query4.q,cbo_query11.q,cbo_query74.q,query4.q,query11.q,query74.q \
  -pl itests/qtest -Pitests


Thanks,

Krisztian Kasa



[jira] [Created] (HIVE-23659) Add Retry for Ranger Replication

2020-06-08 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-23659:
--

 Summary: Add Retry for Ranger Replication
 Key: HIVE-23659
 URL: https://issues.apache.org/jira/browse/HIVE-23659
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi








Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup

2020-06-08 Thread Peter Varga via Review Board


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > LGTM, a few minor suggestions.
> > (Non-binding)

Thanks for the review.


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Line 1411 (original)
> > 
> >
> > Are these originals not needed, or collected elsewhere?

This one is bothering me. These lines were added when the snapshot approach was
introduced, but I do not see why. When we calculated the AcidState without the
snapshot, these files were not added to the originals list. It is stated
explicitly a few lines above that if we have a base, we consider every original
file obsolete.
TestTxnCommandsForMmTable#testInsertOverwriteForPartitionedMmTable breaks, for
example, if these files are added to the list. After an insert overwrite into
an MM table followed by a major compaction, the compaction will create a new
base dir instead of leaving the perfectly fine base dir generated by the insert
overwrite. I did not dig into the compaction to see why the original files are
triggering it, but I do not think these files are needed in the originals list.


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Lines 1438 (patched)
> > 
> >
> > might want to add a check + error message before the while loop in case 
> > someone added a file that has no delta or base dir in its path. you never 
> > know what people are capable of :)

This loop is only entered if we are inside a delta or a base directory at some
depth, but not directly in the folder. If someone adds a random dir, we will
just consider it an originalDir and not enter the loop.
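
For readers following along, one way to bound such an upward walk is sketched
below in plain Java. This is not the code from the patch: findAcidDir and
isBaseOrDelta are hypothetical helpers, java.nio.file.Path stands in for the
Hadoop Path class, and the "base_", "delta_" and "delete_delta_" prefixes are
the directory naming conventions of Hive ACID tables. The walk stops at the
table root, so a file under an arbitrary directory simply yields null and can
be treated as an original.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class AcidDirWalkSketch {
    /** True if the directory name follows the base/delta naming convention. */
    static boolean isBaseOrDelta(String name) {
        return name.startsWith("base_")
            || name.startsWith("delta_")
            || name.startsWith("delete_delta_");
    }

    /**
     * Walks up from the file towards the table root and returns the nearest
     * ancestor that looks like a base or delta directory, or null if there is none.
     */
    static Path findAcidDir(Path tableRoot, Path file) {
        Path dir = file.getParent();
        // Stop at the table root so the loop always terminates, even for stray files.
        while (dir != null && dir.startsWith(tableRoot) && !dir.equals(tableRoot)) {
            if (isBaseOrDelta(dir.getFileName().toString())) {
                return dir;
            }
            dir = dir.getParent();
        }
        return null;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/warehouse/t");
        // Nested inside a delta directory: the delta directory is found.
        System.out.println(findAcidDir(root, Paths.get("/warehouse/t/delta_0000001_0000001/sub/bucket_00000")));
        // Under an unrelated directory: no base or delta ancestor, so null.
        System.out.println(findAcidDir(root, Paths.get("/warehouse/t/random/000000_0")));
    }
}
```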


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Lines 1441 (patched)
> > 
> >
> > nit: typo

fixed


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Lines 1442 (patched)
> > 
> >
> > nit: typo

fixed


> On June 5, 2020, 2:04 p.m., Karen Coppage wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
> > Line 1937 (original), 1880 (patched)
> > 
> >
> > Does this need to be public?

fixed


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72532/#review220956
---


On June 8, 2020, 10:58 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72532/
> ---
> 
> (Updated June 8, 2020, 10:58 a.m.)
> 
> 
> Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Since HIVE-21225 there are two redundant implementations of
> AcidUtils.getAcidState.
> 
> The previous implementation (without the recursive listing) can be removed.
> 
> Performance can also be improved by removing unnecessary fileStatus
> calls.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 16c915959c
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 598220b0c4
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 4e5d5b003b
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 7913295380
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java d83a50f555
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java 5e11d8d2d8
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java 1bdec7df2d
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java e4440e9136
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 

Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup

2020-06-08 Thread Peter Varga via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72532/
---

(Updated June 8, 2020, 10:58 a.m.)


Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary.


Repository: hive-git


Description
---

Since HIVE-21225 there are two redundant implementations of
AcidUtils.getAcidState.

The previous implementation (without the recursive listing) can be removed.

Performance can also be improved by removing unnecessary fileStatus calls.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 16c915959c
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 598220b0c4
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 4e5d5b003b
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 7913295380
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java d83a50f555
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java 5e11d8d2d8
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java 1bdec7df2d
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java e4440e9136
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java f63c40a7b5
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java 3a3b267927


Diff: https://reviews.apache.org/r/72532/diff/3/

Changes: https://reviews.apache.org/r/72532/diff/2-3/


Testing
---


Thanks,

Peter Varga



[jira] [Created] (HIVE-23658) Fix FindBug issues in hive-kudu-handler

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23658:
-

 Summary: Fix FindBug issues in hive-kudu-handler
 Key: HIVE-23658
 URL: https://issues.apache.org/jira/browse/HIVE-23658
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23656) Fix FindBug issues in llap-server

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23656:
-

 Summary: Fix FindBug issues in llap-server
 Key: HIVE-23656
 URL: https://issues.apache.org/jira/browse/HIVE-23656
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23657) Fix FindBug issues in hive-shims

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23657:
-

 Summary: Fix FindBug issues in hive-shims
 Key: HIVE-23657
 URL: https://issues.apache.org/jira/browse/HIVE-23657
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23655) Fix FindBug issues in llap-tez

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23655:
-

 Summary: Fix FindBug issues in llap-tez
 Key: HIVE-23655
 URL: https://issues.apache.org/jira/browse/HIVE-23655
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23654) Fix FindBug issues in llap-ext-client

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23654:
-

 Summary: Fix FindBug issues in llap-ext-client
 Key: HIVE-23654
 URL: https://issues.apache.org/jira/browse/HIVE-23654
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23653) Fix FindBug issues in llap-client

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23653:
-

 Summary: Fix FindBug issues in llap-client
 Key: HIVE-23653
 URL: https://issues.apache.org/jira/browse/HIVE-23653
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23652) Fix FindBug issues in llap-common

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23652:
-

 Summary: Fix FindBug issues in llap-common
 Key: HIVE-23652
 URL: https://issues.apache.org/jira/browse/HIVE-23652
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23651) Fix FindBug issues in hive-service

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23651:
-

 Summary: Fix FindBug issues in hive-service
 Key: HIVE-23651
 URL: https://issues.apache.org/jira/browse/HIVE-23651
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23650) Fix FindBug issues in hive-streaming

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23650:
-

 Summary: Fix FindBug issues in hive-streaming
 Key: HIVE-23650
 URL: https://issues.apache.org/jira/browse/HIVE-23650
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23649) Fix FindBug issues in hive-service-rpc

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23649:
-

 Summary: Fix FindBug issues in hive-service-rpc
 Key: HIVE-23649
 URL: https://issues.apache.org/jira/browse/HIVE-23649
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23648) Fix FindBug issues in hive-serde

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23648:
-

 Summary: Fix FindBug issues in hive-serde
 Key: HIVE-23648
 URL: https://issues.apache.org/jira/browse/HIVE-23648
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23646) Fix FindBug issues in hive-ql

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23646:
-

 Summary: Fix FindBug issues in hive-ql
 Key: HIVE-23646
 URL: https://issues.apache.org/jira/browse/HIVE-23646
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23647) Fix FindBug issues in hive-parser

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23647:
-

 Summary: Fix FindBug issues in hive-parser
 Key: HIVE-23647
 URL: https://issues.apache.org/jira/browse/HIVE-23647
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23645) Fix FindBug issues in hive-metastore

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23645:
-

 Summary: Fix FindBug issues in hive-metastore
 Key: HIVE-23645
 URL: https://issues.apache.org/jira/browse/HIVE-23645
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23643) Fix FindBug issues in hive-hplsql

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23643:
-

 Summary: Fix FindBug issues in hive-hplsql
 Key: HIVE-23643
 URL: https://issues.apache.org/jira/browse/HIVE-23643
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23644) Fix FindBug issues in hive-jdbc

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23644:
-

 Summary: Fix FindBug issues in hive-jdbc
 Key: HIVE-23644
 URL: https://issues.apache.org/jira/browse/HIVE-23644
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23642) Fix FindBug issues in hive-jdbc-handler

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23642:
-

 Summary: Fix FindBug issues in hive-jdbc-handler
 Key: HIVE-23642
 URL: https://issues.apache.org/jira/browse/HIVE-23642
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23641) Fix FindBug issues in hive-hbase-handler

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23641:
-

 Summary: Fix FindBug issues in hive-hbase-handler
 Key: HIVE-23641
 URL: https://issues.apache.org/jira/browse/HIVE-23641
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23640) Fix FindBug issues in hive-druid-handler

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23640:
-

 Summary: Fix FindBug issues in hive-druid-handler
 Key: HIVE-23640
 URL: https://issues.apache.org/jira/browse/HIVE-23640
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23639) Fix FindBug issues in hive-contrib

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23639:
-

 Summary: Fix FindBug issues in hive-contrib
 Key: HIVE-23639
 URL: https://issues.apache.org/jira/browse/HIVE-23639
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23638) Fix FindBug issues in hive-common

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23638:
-

 Summary: Fix FindBug issues in hive-common
 Key: HIVE-23638
 URL: https://issues.apache.org/jira/browse/HIVE-23638
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23637) Fix FindBug issues in cli

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23637:
-

 Summary: Fix FindBug issues in cli
 Key: HIVE-23637
 URL: https://issues.apache.org/jira/browse/HIVE-23637
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23636) Fix FindBug issues in beeline

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23636:
-

 Summary: Fix FindBug issues in beeline
 Key: HIVE-23636
 URL: https://issues.apache.org/jira/browse/HIVE-23636
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23635) Fix FindBug issues in vector-code-gen

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23635:
-

 Summary: Fix FindBug issues in vector-code-gen
 Key: HIVE-23635
 URL: https://issues.apache.org/jira/browse/HIVE-23635
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23634) Fix FindBug issues in storage-api accumulo-handler

2020-06-08 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23634:
-

 Summary: Fix FindBug issues in storage-api accumulo-handler
 Key: HIVE-23634
 URL: https://issues.apache.org/jira/browse/HIVE-23634
 Project: Hive
  Issue Type: Sub-task
Reporter: Panagiotis Garefalakis








[jira] [Created] (HIVE-23633) Metastore some JDO query objects do not close properly

2020-06-08 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-23633:
--

 Summary: Metastore some JDO query objects do not close properly
 Key: HIVE-23633
 URL: https://issues.apache.org/jira/browse/HIVE-23633
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


After applying the patch from
[HIVE-10895|https://issues.apache.org/jira/browse/HIVE-10895], the metastore
still shows a memory leak on DB resources: many StatementImpl objects are left
unclosed.
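
As a general illustration of the pattern such a fix usually follows, the Java
sketch below closes a query object on every code path with try-with-resources
and copies the results before the close. FakeQuery is a hypothetical stand-in,
not the JDO API; real metastore code works with javax.jdo.Query, where
closeAll() releases the underlying results, and the actual fix may look
different.

```java
import java.util.ArrayList;
import java.util.List;

final class QueryCloseSketch {
    /** Counts stand-in queries that are currently open, to make leaks visible. */
    static int openQueries = 0;

    /** Hypothetical stand-in for a query object that holds a database statement. */
    static final class FakeQuery implements AutoCloseable {
        FakeQuery() {
            openQueries++;
        }

        List<String> execute(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated query failure");
            }
            List<String> rows = new ArrayList<>();
            rows.add("row");
            return rows;
        }

        @Override
        public void close() {
            openQueries--;
        }
    }

    /** Runs the query and always closes it, whether or not execute() throws. */
    static List<String> runAndClose(boolean fail) {
        try (FakeQuery query = new FakeQuery()) {
            // Copy the results so they stay usable after the query is closed.
            return new ArrayList<>(query.execute(fail));
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndClose(false));
        try {
            runAndClose(true);
        } catch (IllegalStateException expected) {
            System.out.println("query failed: " + expected.getMessage());
        }
        System.out.println("open queries: " + openQueries);
    }
}
```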





Re: Review Request 72528: ValidTxnManager doesn't consider txns opened and committed between snapshot generation and locking when evaluating ValidTxnListState

2020-06-08 Thread Peter Varga via Review Board


> On May 20, 2020, 3:16 p.m., Peter Varga wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Lines 686 (patched)
> > 
> >
> > I have concerns here, but I am not sure whether they are well founded.
> > I think this will break what the outside world thinks of snapshot
> > isolation. I might have a hypothetical client that inserts lots of data into
> > a source table and sometimes issues a merge statement from the source to the
> > target table. They have a requirement that the target table cannot have
> > partial data regarding some property. For example, they are inserting sales
> > data, and the target table cannot contain half the data of a day; it can
> > either have all or none. So what the client does is issue the inserts into
> > the source table synchronously, ordered by date, and when it gets to the
> > next day it issues a merge statement asynchronously and continues to insert
> > the data for the next day synchronously. It might think that it is safe
> > to do so: since the merge statement has a snapshot, it will not see the data
> > inserted afterwards. But with this change it will break.
> > It might not be the best example, since how would the client know when
> > the snapshot is actually captured? But I am not familiar enough with the
> > ecosystem: does anything use Hive by issuing the compile and run
> > separately? Because there you could be sure, before this change, that the
> > compilation order also meant snapshot order. To summarize, I don't know
> > what the outside world expects of snapshot isolation.
> 
> Denys Kuzmenko wrote:
> Insert into source and merge from source into target won't conflict with
> each other; they touch different tables. Maybe I'm missing something here...
> 
> Peter Varga wrote:
> My example was not perfect. I don't mean that it will conflict with the
> insert into the source table. It can conflict with some other client's
> transaction. My main point is that, after the conflict is noticed and you
> regenerate the snapshot, it will start to read results from transactions that
> were opened and committed after the original query was compiled, and I'm just
> trying to figure out what kind of problems this can cause, if any. In my
> example you start to read records inserted later, but what if somebody added
> a new partition since the compilation? Wouldn't that cause a problem?
> 
> Denys Kuzmenko wrote:
> There might be an issue, as we won't create any locks for the
> newly created partition, yet we'll start reading it.
> Instead of rollback & retry on the Hive side, we might consider just failing
> and letting the user retry.
> 
> Denys Kuzmenko wrote:
> However, that still leaves the question of what happens now in Hive when
> somebody adds a new partition (insert with dynamic partitioning) since the
> compilation (merge insert). I'll test this out.

You could construct an example like this:
1. Open and compile transaction 1, which merge-inserts data from a partitioned
source table that has a few partitions.
2. Open, run and commit transaction 2, which inserts data into an old and a new
partition of the source table.
3. Open, run and commit transaction 3, which inserts data into the target table
of the merge statement; that will retrigger snapshot generation in transaction 1.
4. Run transaction 1. The snapshot will be regenerated, and I think it will
read partial data from transaction 2, breaking the ACID properties.
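
To make the concern in steps 1-4 concrete, here is a toy Java model of one way a partial read could arise. It is only a sketch under assumptions that this thread does not confirm: the class and method names (SnapshotRegenSketch, Row, read, sourceAfterTxn2) are hypothetical and not Hive APIs, the snapshot is reduced to a single high watermark, and the set of partitions to scan is assumed to be fixed when transaction 1 is compiled. Whether Hive really behaves this way is exactly what needs to be tested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SnapshotRegenSketch {

    /** One written row: the writing transaction's id and the partition it landed in. */
    static final class Row {
        final long txnId;
        final String partition;

        Row(long txnId, String partition) {
            this.txnId = txnId;
            this.partition = partition;
        }
    }

    /**
     * Toy read path (hypothetical, not Hive code): scan only the partitions
     * chosen at compile time and keep rows whose writer is at or below the
     * snapshot's high watermark.
     */
    static List<String> read(List<Row> source, List<String> plannedPartitions, long highWatermark) {
        List<String> seen = new ArrayList<>();
        for (Row r : source) {
            if (plannedPartitions.contains(r.partition) && r.txnId <= highWatermark) {
                seen.add("txn" + r.txnId + "@" + r.partition);
            }
        }
        return seen;
    }

    /** Contents of the source table after transaction 2 has committed (step 2). */
    static List<Row> sourceAfterTxn2() {
        return Arrays.asList(
            new Row(0, "p_old"),   // data that existed before transaction 1 was compiled
            new Row(2, "p_old"),   // transaction 2: insert into an existing partition
            new Row(2, "p_new"));  // transaction 2: insert into a new partition
    }

    public static void main(String[] args) {
        // Assumption: transaction 1 only knows the partitions that existed at compile time.
        List<String> planned = Arrays.asList("p_old");

        // Snapshot taken at compile time: transaction 2 is not visible at all.
        System.out.println("compile-time snapshot: " + read(sourceAfterTxn2(), planned, 1));

        // Snapshot regenerated after transaction 3 committed: transaction 2 becomes
        // visible, but only its row in the planned partition is read.
        System.out.println("regenerated snapshot:  " + read(sourceAfterTxn2(), planned, 3));
    }
}
```

In this toy model the compile-time snapshot sees nothing from transaction 2, while the regenerated snapshot sees its row in p_old but not the one in p_new, which is the kind of partial read I am worried about.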

But we should also try the test without your patch, with a slightly different
setup. Switch the transaction order:
1. Compile transaction 1, which inserts data into an old and a new partition of
the source table.
2. Compile transaction 2, which inserts data into the target table.
3. Compile transaction 3, which merge-inserts data from the source table into
the target table.
4. Run and commit transaction 1.
5. Run and commit transaction 2.
6. Run transaction 3. Since it contains transactions 1 and 2 in its snapshot,
isValidTxnListState will be triggered without your patch, and I think it is
possible that we do a partial read of transaction 1 for the same reasons.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72528/#review220838
---


On May 19, 2020, 11:19 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72528/
> ---
> 
> (Updated May 19, 2020, 11:19 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez, Peter Varga, and Peter Vary.
> 
> 
> Bugs: HIVE-23503
> https://issues.apache.org/jira/browse/HIVE-23503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
>