[jira] [Created] (HIVE-20743) Enable udaf_context_ngrams.q and udaf_corr.q tests

2018-10-13 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-20743:
--

 Summary: Enable udaf_context_ngrams.q and udaf_corr.q tests
 Key: HIVE-20743
 URL: https://issues.apache.org/jira/browse/HIVE-20743
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Jesus Camacho Rodriguez


Two qfile tests for TestCliDriver, they may all relate to number precision 
issues:
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_context_ngrams] 
(batchId=79)

Error:
Client Execution succeeded but contained differences (error code = 1) after 
executing udaf_context_ngrams.q 
43c43
< [{"ngram":["travelling"],"estfrequency":1.0}]
---
> [{"ngram":["travelling"],"estfrequency":3.0}]

org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_corr] (batchId=84)

Client Execution succeeded but contained differences (error code = 1) after 
executing udaf_corr.q 
100c100
< 0.6633880657639324
---
> 0.6633880657639326





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68975: HIVE-20661: Dynamic partitions loading calls add partition for every partition 1-by-1

2018-10-13 Thread Laszlo Pinter via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68975/
---

(Updated Oct. 13, 2018, 9:55 a.m.)


Review request for hive and Peter Vary.


Repository: hive-git


Description
---

HIVE-20661: Dynamic partitions loading calls add partition for every partition 
1-by-1


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
0ab77e84c6222d35bcec9ce95ed02014911ef144 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
4de038913a5c9a2c199f71702b8f70ca84d0856b 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
a2b57fb646899c54b63be14a8cde9b8644a973aa 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 a2ec09f7157140fda751ab777649c698b3319a16 


Diff: https://reviews.apache.org/r/68975/diff/3/

Changes: https://reviews.apache.org/r/68975/diff/2-3/


Testing
---


Thanks,

Laszlo Pinter



[jira] [Created] (HIVE-20742) SparkSessionManagerImpl maintenance thread only cleans up session once

2018-10-13 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-20742:
--

 Summary: SparkSessionManagerImpl maintenance thread only cleans up 
session once
 Key: HIVE-20742
 URL: https://issues.apache.org/jira/browse/HIVE-20742
 Project: Hive
  Issue Type: Bug
Affects Versions: 4.0.0
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits


If there is a reconnect at the client session, the SparkSessionManagerImpl 
doesn't puts it back in the created sessions, so it will not time out the 
second time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 68975: HIVE-20661: Dynamic partitions loading calls add partition for every partition 1-by-1

2018-10-13 Thread Laszlo Pinter via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68975/
---

(Updated Oct. 13, 2018, 7:56 a.m.)


Review request for hive and Peter Vary.


Repository: hive-git


Description
---

HIVE-20661: Dynamic partitions loading calls add partition for every partition 
1-by-1


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
0ab77e84c6222d35bcec9ce95ed02014911ef144 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
4de038913a5c9a2c199f71702b8f70ca84d0856b 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
a2b57fb646899c54b63be14a8cde9b8644a973aa 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 a2ec09f7157140fda751ab777649c698b3319a16 


Diff: https://reviews.apache.org/r/68975/diff/2/

Changes: https://reviews.apache.org/r/68975/diff/1-2/


Testing
---


Thanks,

Laszlo Pinter



Re: Review Request 68975: HIVE-20661: Dynamic partitions loading calls add partition for every partition 1-by-1

2018-10-13 Thread Laszlo Pinter via Review Board


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1896 (patched)
> > 
> >
> > What does this method do?
> > Moving files and updating partition data if the partition is already 
> > exists?
> > Javadoc would be good anyway.
> > Follow-up idea: update partition data with alter_partitions (multiple 
> > updates with 1 HMS call)?

Good point, javadoc added.

I noticed that as well, just didn't want to make this change even more 
complicated. There are other points also wich could be be further optimized.  I 
made a note to myself to open a new jira after this is merged in.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1909 (original), 1939 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1939 (original), 1961 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1948 (original), 1970 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Line 1966 (original), 1988 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1970-1971 (original), 1992-1993 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 1978-1979 (original), 2000-2001 (patched)
> > 
> >
> > nit: maybe do not reformat these lines if they are not needed

done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2102-2103 (patched)
> > 
> >
> > Concerned about this.
> > When we call Hive.loadPartition we call 2 methods there:
> > - loadPartitionInternal - 1 snapshot
> > - addPartitionToMetastore - 1 snapshot
> > Are we sure that these calls are:
> > - Lightweight - so we can happily call them twice
> > - Return the same value in both ocassions even if some other transacion 
> > is finished during this time?

Yes, I see your point here. 
By checking the implementation of AcidUtils.getTableSnapshot, it is not 
guaranteed that the two calls will have the same result. 
I made a modification to use the same instance of snapshot until the loading of 
partition is done.


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2083-2084 (original), 2129-2130 (patched)
> > 
> >
> > We do not use batching for adding partitions?

What should be the size of the batch? I used BATCH_RETRIEVE_MAX configuration 
parameter for fetching data from metastore. Can I use this one as well, or 
there is another parameter?


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2492-2506 (patched)
> > 
> >
> > Do we need this? Can't we use equals method of maps to compare instead?

Java arrays do not override Object.equals(). Hence if you compare them using 
equals() (which is what the equals methods of all the collection classes do), 
you get "instance equality", instead of "value equality".


> On Oct. 11, 2018, 8:54 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
> > Lines 2511 (patched)
> > 
> >
> > Note: We still might fail with OOM if too many new partitions are 
> > there. We store in memory all of the new and old partition specifications. 
> > Was this the same before?

Yes, in the previous