[jira] [Created] (HIVE-20629) Hive onprim-onprim replication fails with events missing error if database is kept idle for more than an hour

2018-09-24 Thread mahesh kumar behera (JIRA)
mahesh kumar behera created HIVE-20629:
--

 Summary: Hive onprim-onprim replication fails with events missing 
error if database is kept idle for more than an hour 
 Key: HIVE-20629
 URL: https://issues.apache.org/jira/browse/HIVE-20629
 Project: Hive
  Issue Type: Bug
  Components: repl
Affects Versions: 4.0.0
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera
 Fix For: 4.0.0


Start a source cluster with two databases. Replicate the databases to the 
target after doing some operations. Keep taking incremental dumps of both 
databases and keep replicating them to the target cluster. Keep one of the 
databases idle for more than 24 hours. After 24 hours, the incremental dump of 
the idle database fails with an events-missing error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 68834: HIVE-20556

2018-09-24 Thread Jaume Marhuenda

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68834/
---

Review request for hive.


Repository: hive-git


Description
---

Expose an API to retrieve the TBL_ID from TBLS in the metastore tables


Diffs
-

  data/files/exported_table/_metadata 81fbf63a54 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestAuthorizationPreEventListener.java
 05c00094d6 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestMetastoreAuthorizationProvider.java
 767321332c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java f72e08c14f 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java ca4d36f30d 
  ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java 
ff411f62d5 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java
 78ac909f72 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
 22deffe1d3 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
 38fac465d7 
  
standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
 0192c6da31 
  standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
85a5c601e0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 ba82a9327c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d27224b235 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/model/MTable.java
 deeb97133d 
  standalone-metastore/metastore-server/src/main/resources/package.jdo 
2a5f016b1f 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 4937d9d861 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStorePartitionSpecs.java
 df83171648 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestTablesCreateDropAlterTruncate.java
 bf302ed491 


Diff: https://reviews.apache.org/r/68834/diff/1/


Testing
---


Thanks,

Jaume Marhuenda



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-24 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208968
---



Hi Denys,

Could you please think a little about separating the Manager/Factory and the 
tryAcquire mess?

Incomplete thoughts, but I had to run

Thanks, and sorry :(
Peter


ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java
Lines 130 (patched)


nit: I prefer creating static final variables at the beginning of the 
class, or at the first use. Do not create a new patch because of this, but if 
you have to do a new one, please move the declaration up to line ~51.



ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Line 1854 (original), 1849-1850 (patched)


This still makes me itch...
I think we should separate the Manager / Factory and the actual lock object.
I would prefer the following:
- CompileLockManager should create the lock object
- Use the lock object as Zoltan suggested (try-with-resources)
- If we decide to keep tryAcquire, can we do it as a wrapper around the 
tryLock method?
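
One possible reading of the three points above, as a minimal sketch and not the code in the patch: a manager hands out an AutoCloseable lock object, callers use it in try-with-resources, and tryAcquire is a thin wrapper around ReentrantLock.tryLock. The names CompileLockSketch, CompileLock, newLock and compileWithLock are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: the manager creates lock objects, callers own their lifecycle. */
final class CompileLockSketch {

  /** The lock object handed to callers; close() releases only what was acquired. */
  static final class CompileLock implements AutoCloseable {
    private final ReentrantLock underlying;
    private boolean acquired;

    CompileLock(ReentrantLock underlying) {
      this.underlying = underlying;
    }

    /** Thin wrapper around tryLock; a timeout of 0 means one non-blocking attempt. */
    boolean tryAcquire(long timeout, TimeUnit unit) {
      try {
        acquired = timeout > 0 ? underlying.tryLock(timeout, unit) : underlying.tryLock();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        acquired = false;
      }
      return acquired;
    }

    @Override
    public void close() {
      if (acquired) {
        underlying.unlock();
        acquired = false;
      }
    }
  }

  private final ReentrantLock globalLock = new ReentrantLock(true);

  /** Manager/factory role: creates the lock object but never acquires it itself. */
  CompileLock newLock() {
    return new CompileLock(globalLock);
  }

  boolean isLocked() {
    return globalLock.isLocked();
  }

  /** Returns true if the compile step ran, false if the lock could not be taken. */
  static boolean compileWithLock(CompileLockSketch manager, Runnable compile) {
    try (CompileLock lock = manager.newLock()) {
      if (!lock.tryAcquire(0, TimeUnit.SECONDS)) {
        return false;
      }
      compile.run();
      return true;
    } // close() runs here and unlocks only if tryAcquire succeeded
  }
}
```

With this split the caller can never forget the unlock, and the manager stays a pure factory.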


- Peter Vary


On Sept. 19, 2018, 9:37 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 19, 2018, 9:37 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/5/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68805: HIVE-20538

2018-09-24 Thread Jaume Marhuenda


> On Sept. 24, 2018, 6:54 p.m., Eugene Koifman wrote:
> > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
> > Lines 101 (patched)
> > 
> >
> > Nit: what is the advantage of using direct jdbc calls to modify the 
> > metastore DBMS?  Why not run "create table ...", "alter table ..." through 
> > Driver and "describe table" to see the value?

The problem I have with using the driver is that I'd have to add the dependency 
hive-exec.


- Jaume


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68805/#review208955
---


On Sept. 21, 2018, 10:51 p.m., Jaume Marhuenda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68805/
> ---
> 
> (Updated Sept. 21, 2018, 10:51 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20538: Allow to store a key value together with a transaction.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CommitTxnRequest.java
>  db47f9db8b 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  22deffe1d3 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  38fac465d7 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  0192c6da31 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  df6d56b679 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  54e7eda0da 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 85a5c601e0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d76049eda1 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
>  ce590d0f55 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  db4dd9ec42 
> 
> 
> Diff: https://reviews.apache.org/r/68805/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jaume Marhuenda
> 
>



Review Request 68828: HIVE-20601 : EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-24 Thread Bharathkrishna Guruvayoor Murali via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68828/
---

Review request for hive and Alexander Kolbasov.


Repository: hive-git


Description
---

It will be useful to have the environmentContext passed to 
DbNotificationListener in this case, to know if the alter happened due to a 
stat change.


Diffs
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 f52ff91a8f2e7710801dcadc4a83ce454992a66a 


Diff: https://reviews.apache.org/r/68828/diff/1/


Testing
---


Thanks,

Bharathkrishna Guruvayoor Murali



Review Request 68827: Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-24 Thread Bharathkrishna Guruvayoor Murali via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68827/
---

Review request for hive and Alexander Kolbasov.


Repository: hive-git


Description
---

Clients can add large-sized parameters in Table/Partition objects. So we need 
to enable adding regex patterns through HiveConf to match parameters to be 
filtered from table and partition objects before serialization in HMS 
notifications.
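
As a rough illustration of the filtering idea (not the code in this review request), the sketch below drops every parameter whose key fully matches one of the configured regex patterns. Matching on keys rather than values, and the names ParamFilterSketch and filter, are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical sketch of excluding parameters by regex before serialization. */
final class ParamFilterSketch {

  /** Returns a copy of params without the entries whose key fully matches any regex. */
  static Map<String, String> filter(Map<String, String> params, List<String> regexes) {
    List<Pattern> patterns =
        regexes.stream().map(Pattern::compile).collect(Collectors.toList());
    Map<String, String> kept = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : params.entrySet()) {
      boolean excluded = patterns.stream().anyMatch(p -> p.matcher(e.getKey()).matches());
      if (!excluded) {
        kept.put(e.getKey(), e.getValue());
      }
    }
    return kept;
  }
}
```

The input map is left untouched, so the stored Table/Partition object keeps its parameters and only the serialized copy is trimmed.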


Diffs
-

  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 30ea7f8129 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 c681a87a1c 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/json/JSONMessageFactory.java
 2668b05320 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/common/TestMetaStoreUtils.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/68827/diff/1/


Testing
---


Thanks,

Bharathkrishna Guruvayoor Murali



Re: Review Request 68805: HIVE-20538

2018-09-24 Thread Eugene Koifman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68805/#review208955
---




standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
Lines 2929 (patched)


This comment seems confusing to me.  Maybe give Kafka offset as a concrete 
example or point to some wiki where this API is documented.

for example,
"...for example to know if a transaction has
  already been committed"
which transaction is this talking about?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1095 (patched)


I think a MetaException would be better (or IllegalState/Argument).  
SQLException is generally produced by the DB and has sqlstate/sqlcode that 
various handlers try to examine.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1105 (patched)


MetaException.  Also, it should at least include info to help identify what 
exactly failed, i.e. txnid, tableid, param/value.  Without it, it's impossible 
to correlate this error with a batch id, etc. I'd also add a LOG.warn() so that 
it's visible in the log file.
It seems you have a requirement that the parameter exists.  Perhaps as part 
of the error code path, you can do another query to see if it does exist - I bet 
that would be a common error.
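
A minimal sketch of what such an error path could look like, under stated assumptions: MetaExceptionStandIn is a local stand-in for Hive's MetaException so the example compiles on its own, java.util.logging replaces the project's logger, and updatedRows is the row count of a hypothetical UPDATE.

```java
import java.util.logging.Logger;

/** Hypothetical sketch: fail with a descriptive, non-SQL exception and log a warning. */
final class CommitParamErrorSketch {
  private static final Logger LOG = Logger.getLogger(CommitParamErrorSketch.class.getName());

  /** Stand-in for Hive's MetaException, defined here only to keep the sketch self-contained. */
  static final class MetaExceptionStandIn extends Exception {
    MetaExceptionStandIn(String message) {
      super(message);
    }
  }

  /** Builds a message that carries everything needed to correlate the failure. */
  static String describe(long txnId, long tableId, String key, String value) {
    return "Cannot update table parameter while committing transaction: txnid=" + txnId
        + ", tableid=" + tableId + ", key=" + key + ", value=" + value;
  }

  /** Throws unless exactly one row was updated; updatedRows comes from the caller's UPDATE. */
  static void checkUpdated(int updatedRows, long txnId, long tableId, String key, String value)
      throws MetaExceptionStandIn {
    if (updatedRows != 1) {
      String msg = describe(txnId, tableId, key, value);
      LOG.warning(msg); // visible in the log file as well as in the client error
      throw new MetaExceptionStandIn(msg);
    }
  }
}
```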



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
Lines 2289 (patched)


Why not make (tableid, key, value) its own object?  Then this object in 
CommitTxnRequest can be optional but all 3 fields in it can be mandatory.  As 
it is, you are checking if they are set here and in TxnHandler.commit...



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
Lines 101 (patched)


Nit: what is the advantage of using direct jdbc calls to modify the 
metastore DBMS?  Why not run "create table ...", "alter table ..." through 
Driver and "describe table" to see the value?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
Lines 135 (patched)


should probably check that you got the right exception not just "any 
exception", i.e. check the message.
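
To make the point concrete, here is a small framework-free sketch (the test in the patch may well use JUnit instead): a helper that passes only when the action fails with the expected exception type and a message containing the expected fragment. ExpectedFailureSketch and failsWith are hypothetical names.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: verify the right exception, not just any exception. */
final class ExpectedFailureSketch {

  /** True only if the action throws an exception of the given type whose message contains fragment. */
  static boolean failsWith(Callable<?> action, Class<? extends Exception> type, String fragment) {
    try {
      action.call();
      return false; // the action did not fail at all
    } catch (Exception e) {
      return type.isInstance(e)
          && e.getMessage() != null
          && e.getMessage().contains(fragment);
    }
  }
}
```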


- Eugene Koifman


On Sept. 21, 2018, 3:51 p.m., Jaume Marhuenda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68805/
> ---
> 
> (Updated Sept. 21, 2018, 3:51 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20538: Allow to store a key value together with a transaction.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CommitTxnRequest.java
>  db47f9db8b 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  22deffe1d3 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  38fac465d7 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  0192c6da31 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  df6d56b679 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  54e7eda0da 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 85a5c601e0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d76049eda1 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
>  ce590d0f55 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  db4dd9ec42 
> 
> 
> Diff: https://reviews.apache.org/r/68805/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jaume Marhuenda
> 
>



Re: Review Request 68767: HIVE-20551: Create PreparedStatement query dynamically when IN clause is used

2018-09-24 Thread Andrew Sherman via Review Board


> On Sept. 20, 2018, 10:56 p.m., Andrew Sherman wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 316 (patched)
> > 
> >
> > 1) Does setObject() work OK on all the jdbc drivers that are supported? 
> > In the past I have seen cases where it was necessary to dispatch to the 
> > correct method like setString, setInt 
> > 2) Can the params ever be null? Do we need to call setNull instead of 
> > setObject()? Again we need to consider all the drivers.
> 
> Laszlo Pinter wrote:
> 1) The jdbc driver will do the type checking. A slight disadvantage is 
> the minor overhead, but this is negligible compared to the more 
> maintainable code you end up with.
> 2) You're correct, I have to make sure that the params[i] is not null or 
> use setNull instead.
> 
> Laszlo Pinter wrote:
> So I did a bit more debugging, and my previous comment that 
> params[i] can be null is not correct. The params can contain partitionIds, 
> storageDescriptorIds, columnDescriptorIds, serdeIds, depending on where 
> executeNoResult() is called from.  These fields are mandatory and cannot be 
> null. If any of these items is null, it means that the metastore db is 
> inconsistent and has been corrupted.
> 
> Andrew Sherman wrote:
> I wrote this once, but rb ate it, sorry if it duplicates.
> On 1) Did you test with all drivers?
> On 2) I suggest you add some checking to nail down that params are 
> non-null. How is the java testing of this class? Do we need negative test 
> cases?
> 
> Laszlo Pinter wrote:
> 1) Do you know which jdbc drivers are supported, or where I could check 
> them?
> 2) It doesn't make sense to have null values in queries like 
> ```sql
> SELECT column_name1 FROM table_name WHERE column_name2 IN (value1, value2 
> ...);
> ```
> so I filtered them out.

2) OK
1) 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-SupportedBackendDatabasesforMetastore
http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_hive_metastore_configure.html#topic_18_4_2
I don't know if there is a clever way to test with different DB/drivers, other 
people may know better.


- Andrew


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68767/#review208821
---


On Sept. 24, 2018, 12:16 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68767/
> ---
> 
> (Updated Sept. 24, 2018, 12:16 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Peter Vary, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20551: Create PreparedStatement query dynamically when IN clause is used
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  571c789eddfd2b1a27c65c48bdc6dccfafaaf676 
> 
> 
> Diff: https://reviews.apache.org/r/68767/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



[jira] [Created] (HIVE-20628) Parsing error when using a complex map data type under dynamic column masking

2018-09-24 Thread Darryl Dutton (JIRA)
Darryl Dutton created HIVE-20628:


 Summary: Parsing error when using a complex map data type under 
dynamic column masking
 Key: HIVE-20628
 URL: https://issues.apache.org/jira/browse/HIVE-20628
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2, Parser, Security
Affects Versions: 2.1.0
 Environment: The error can be simulated using HDP 2.6.4 sandbox
Reporter: Darryl Dutton


When trying to use the map complex data type as part of a dynamic column mask, 
Hive throws a parsing error because it expects a primitive type (see the trace 
pasted below). The use case is applying masking to elements within a map 
type by applying a custom Hive UDF (to apply the mask) using Ranger. We expect 
Hive to support complex data types for masking in addition to the primitive 
types. The exception occurs when Hive needs to evaluate the UDF or apply a 
standard mask (pass-through works as expected). You can recreate the problem by 
creating a simple table with a map data type column, then applying the masking 
to that column through a Ranger resource-based policy and a custom function 
(you can use the standard Hive UDF str_to_map('F4','') to simulate returning 
a map). 

CREATE  TABLE `mask_test`(
 `key` string, 
 `value` map)
STORED AS INPUTFORMAT 
 'org.apache.hadoop.mapred.TextInputFormat'

 

INSERT INTO TABLE mask_test
SELECT 'AAA' as key, 
map('F1','2022','F2','','F3','333') as value
FROM (select 1 ) as temp;

 

 

Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.parse.SemanticException:org.apache.hadoop.hive.ql.parse.ParseException:
 line 1:57 cannot recognize input near 'map' '<' 'string' in primitive type 
specification
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10370)
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10486)
 at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:219)
 at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
 at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218)
 at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
 ... 15 more
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.parse.ParseException:line 1:57 cannot recognize input 
near 'map' '<' 'string' in primitive type specification
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:214)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:171)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
 at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.rewriteASTWithMaskAndFilter(SemanticAnalyzer.java:10368)





Re: Review Request 68710: HIVE-20544: TOpenSessionReq logs password and username

2018-09-24 Thread Karen Coppage via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68710/
---

(Updated Sept. 24, 2018, 2:01 p.m.)


Review request for hive and Laszlo Pinter.


Changes
---

Fixed typo in last diff


Bugs: HIVE-20544
https://issues.apache.org/jira/browse/HIVE-20544


Repository: hive-git


Description
---

In TOpenSessionReq, if the client protocol is unset, both username and password 
are logged. Logging a password is a security risk. This patch would hide it 
with asterisks.


Diffs (updated)
-

  service-rpc/pom.xml d6a07a55bc 
  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq.java
 3195e704f3 


Diff: https://reviews.apache.org/r/68710/diff/5/

Changes: https://reviews.apache.org/r/68710/diff/4-5/


Testing
---


File Attachments (updated)


HIVE-20544.3.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/24/9f8ef0d8-22df-40cf-a311-56335d88516a__HIVE-20544.3.patch
HIVE-20544.3.patch
  
https://reviews.apache.org/media/uploaded/files/2018/09/24/afdfc085-cc06-4a47-81f8-499029719bd0__HIVE-20544.3.patch


Thanks,

Karen Coppage



Re: Review Request 68710: HIVE-20544: TOpenSessionReq logs password and username

2018-09-24 Thread Karen Coppage via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68710/
---

(Updated Sept. 24, 2018, 1:47 p.m.)


Review request for hive and Laszlo Pinter.


Changes
---

Whether the password is set or not, "password:-" is printed to logs.


Bugs: HIVE-20544
https://issues.apache.org/jira/browse/HIVE-20544


Repository: hive-git


Description
---

In TOpenSessionReq, if the client protocol is unset, both username and password 
are logged. Logging a password is a security risk. This patch would hide it 
with asterisks.


Diffs (updated)
-

  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq.java
 3195e704f3 


Diff: https://reviews.apache.org/r/68710/diff/4/

Changes: https://reviews.apache.org/r/68710/diff/3-4/


Testing
---


Thanks,

Karen Coppage



Re: Review Request 68710: HIVE-20544: TOpenSessionReq logs password and username

2018-09-24 Thread Karen Coppage via Review Board


> On Sept. 21, 2018, 3:41 p.m., Andrew Sherman wrote:
> > service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq.java
> > Line 546 (original), 546 (patched)
> > 
> >
> > why give a clue about password length? Maybe just always print  or 
> > something?

Thanks for taking a look, Andrew! Fair point. I would worry that just printing 
some asterisks could confuse someone ("Is my password really that short?"), so 
I'll replace the password mask with a simple "-" in the next patch.
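
A minimal sketch of the fixed-mask idea, not the generated Thrift code itself: the log string uses the same constant whether or not a password is set, so neither its value nor its length leaks. SessionRequestLogSketch and describe are hypothetical names, and the output format is only an approximation of a Thrift toString().

```java
/** Hypothetical sketch of logging a session request with a constant password mask. */
final class SessionRequestLogSketch {
  static final String MASK = "-";

  /** The password argument is deliberately never read, so set/unset/length are indistinguishable. */
  static String describe(String username, String password) {
    return "TOpenSessionReq(username:" + username + ", password:" + MASK + ")";
  }
}
```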


- Karen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68710/#review208862
---


On Sept. 21, 2018, 3:31 p.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68710/
> ---
> 
> (Updated Sept. 21, 2018, 3:31 p.m.)
> 
> 
> Review request for hive and Laszlo Pinter.
> 
> 
> Bugs: HIVE-20544
> https://issues.apache.org/jira/browse/HIVE-20544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> In TOpenSessionReq, if the client protocol is unset, both username and 
> password are logged. Logging a password is a security risk. This patch would 
> hide it with asterisks.
> 
> 
> Diffs
> -
> 
>   service-rpc/pom.xml d6a07a55bc 
>   
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq.java
>  3195e704f3 
> 
> 
> Diff: https://reviews.apache.org/r/68710/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 68767: HIVE-20551: Create PreparedStatement query dynamically when IN clause is used

2018-09-24 Thread Laszlo Pinter via Review Board


> On Sept. 20, 2018, 10:56 p.m., Andrew Sherman wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 316 (patched)
> > 
> >
> > 1) Does setObject() work OK on all the jdbc drivers that are supported? 
> > In the past I have seen cases where it was necessary to dispatch to the 
> > correct method like setString, setInt 
> > 2) Can the params ever be null? Do we need to call setNull instead of 
> > setObject()? Again we need to consider all the drivers.
> 
> Laszlo Pinter wrote:
> 1) The jdbc driver will do the type checking. A slight disadvantage is 
> the minor overhead, but this is negligible compared to the more 
> maintainable code you end up with.
> 2) You're correct, I have to make sure that the params[i] is not null or 
> use setNull instead.
> 
> Laszlo Pinter wrote:
> So I did a bit more debugging, and my previous comment that 
> params[i] can be null is not correct. The params can contain partitionIds, 
> storageDescriptorIds, columnDescriptorIds, serdeIds, depending on where 
> executeNoResult() is called from.  These fields are mandatory and cannot be 
> null. If any of these items is null, it means that the metastore db is 
> inconsistent and has been corrupted.
> 
> Andrew Sherman wrote:
> I wrote this once, but rb ate it, sorry if it duplicates.
> On 1) Did you test with all drivers?
> On 2) I suggest you add some checking to nail down that params are 
> non-null. How is the java testing of this class? Do we need negative test 
> cases?

1) Do you know which jdbc drivers are supported, or where I could check them?
2) It doesn't make sense to have null values in queries like 
```sql
SELECT column_name1 FROM table_name WHERE column_name2 IN (value1, value2 ...);
```
so I filtered them out.
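
For readers following along, a minimal sketch of the approach discussed here (not the actual MetaStoreDirectSql change): drop null values, generate one "?" placeholder per remaining value, and bind each value with PreparedStatement.setObject. InClauseSketch and its method names are hypothetical.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical sketch of building an IN clause dynamically for a PreparedStatement. */
final class InClauseSketch {

  /** Removes null entries, since a null inside IN (...) can never match a row. */
  static <T> List<T> nonNull(List<T> params) {
    return params.stream().filter(Objects::nonNull).collect(Collectors.toList());
  }

  /** Appends " IN (?, ?, ...)" with exactly paramCount placeholders to the query prefix. */
  static String buildQuery(String prefix, int paramCount) {
    if (paramCount <= 0) {
      throw new IllegalArgumentException("IN clause needs at least one value");
    }
    return prefix + " IN (" + String.join(", ", Collections.nCopies(paramCount, "?")) + ")";
  }

  /** Binds the values in order; JDBC parameter indexes start at 1. */
  static void bind(PreparedStatement statement, List<?> params) throws SQLException {
    for (int i = 0; i < params.size(); i++) {
      statement.setObject(i + 1, params.get(i));
    }
  }
}
```

Whether setObject behaves identically on every supported driver is exactly the open question in this thread, so the sketch does not settle it.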


- Laszlo


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68767/#review208821
---


On Sept. 24, 2018, 12:16 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68767/
> ---
> 
> (Updated Sept. 24, 2018, 12:16 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Peter Vary, and Vihang 
> Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20551: Create PreparedStatement query dynamically when IN clause is used
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  571c789eddfd2b1a27c65c48bdc6dccfafaaf676 
> 
> 
> Diff: https://reviews.apache.org/r/68767/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 68767: HIVE-20551: Create PreparedStatement query dynamically when IN clause is used

2018-09-24 Thread Laszlo Pinter via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68767/
---

(Updated Sept. 24, 2018, 12:16 p.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Vihang 
Karajgaonkar.


Repository: hive-git


Description
---

HIVE-20551: Create PreparedStatement query dynamically when IN clause is used


Diffs (updated)
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 571c789eddfd2b1a27c65c48bdc6dccfafaaf676 


Diff: https://reviews.apache.org/r/68767/diff/3/

Changes: https://reviews.apache.org/r/68767/diff/2-3/


Testing
---


Thanks,

Laszlo Pinter



[GitHub] hive pull request #435: HIVE-20627: Concurrent async queries intermittently ...

2018-09-24 Thread sankarh
GitHub user sankarh opened a pull request:

https://github.com/apache/hive/pull/435

HIVE-20627: Concurrent async queries intermittently fails with 
LockException and cause memory leak.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sankarh/hive HIVE-20627

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/435.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #435


commit 829ea3db8a4f6837c29fb76d7d89c002efbcb4f5
Author: Sankar Hariappan 
Date:   2018-09-24T10:47:16Z

HIVE-20627: Concurrent async queries intermittently fails with 
LockException and cause memory leak.




---


[jira] [Created] (HIVE-20627) Concurrent Async queries from same session intermittently fails with LockException.

2018-09-24 Thread Sankar Hariappan (JIRA)
Sankar Hariappan created HIVE-20627:
---

 Summary: Concurrent Async queries from same session intermittently 
fails with LockException.
 Key: HIVE-20627
 URL: https://issues.apache.org/jira/browse/HIVE-20627
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0, 3.2.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan


When multiple async queries are executed from the same session, the async 
query execution DAGs all share the same Hive object, which is set by the 
caller for all threads. When loading dynamic partitions, a MoveTask is created 
that re-creates the Hive object and closes the shared Hive object, which 
causes metastore connection issues for the other async execution threads that 
still access it. This is also seen if ReplDumpTask and ReplLoadTask are part of 
the DAG.

*Root cause:*
For async query execution from SQLOperation.runInternal, we set the 
thread-local Hive object for all the child threads to parentHive 
(parentSession.getSessionHive()):
{code}
@Override
public void run() {
  PrivilegedExceptionAction<Object> doAsAction = new PrivilegedExceptionAction<Object>() {
    @Override
    public Object run() throws HiveSQLException {
      Hive.set(parentHive); // Setting parentHive for all async operations.
      // TODO: can this result in cross-thread reuse of session state?
      SessionState.setCurrentSessionState(parentSessionState);
      PerfLogger.setPerfLogger(parentPerfLogger);
      LogUtils.registerLoggingContext(queryState.getConf());
      try {
        if (asyncPrepare) {
          prepare(queryState);
        }
        runQuery();
      } catch (HiveSQLException e) {
        // TODO: why do we invent our own error path op top of the one from Future.get?
        setOperationException(e);
        LOG.error("Error running hive query: ", e);
      } finally {
        LogUtils.unregisterLoggingContext();
      }
      return null;
    }
  };
{code}

Now, while async execution is in progress, if one of the threads re-creates the 
Hive object, it first closes the parentHive object, which affects the other 
threads using it: the conf object it refers to is cleaned up as well, and hence 
we get null for the VALID_TXNS_KEY value. 
{code}
private static Hive create(HiveConf c, boolean needsRefresh, Hive db, boolean doRegisterAllFns)
    throws HiveException {
  if (db != null) {
    LOG.debug("Creating new db. db = " + db + ", needsRefresh = " + needsRefresh +
        ", db.isCurrentUserOwner = " + db.isCurrentUserOwner());
    db.close();
  }
  closeCurrent();
  if (c == null) {
    c = createHiveConf();
  }
  c.set("fs.scheme.class", "dfs");
  Hive newdb = new Hive(c, doRegisterAllFns);
  hiveDB.set(newdb);
  return newdb;
}
{code}

*Fix:*
We shouldn't close the old Hive object if it is shared by multiple threads. 
We will use a flag to track this.
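
A minimal sketch of the flag idea, not the actual patch. SharedClient, 
ClientHolder, and the allowClose flag are hypothetical stand-ins for the Hive 
object and its thread-local holder; only the guard on close() is the point.

```java
final class SharedCloseSketch {
    /** Hypothetical stand-in for a closeable object that several threads may share. */
    static final class SharedClient {
        private final boolean allowClose;      // hypothetical flag: false while the object is shared
        private volatile boolean closed = false;

        SharedClient(boolean allowClose) {
            this.allowClose = allowClose;
        }

        boolean isClosed() {
            return closed;
        }

        /** Closes only when forced by the owner or when the object is not marked shared. */
        void close(boolean force) {
            if (force || allowClose) {
                closed = true;
            }
        }
    }

    /** Hypothetical thread-local holder, playing the role of the per-thread Hive object. */
    static final class ClientHolder {
        private static final ThreadLocal<SharedClient> CURRENT = new ThreadLocal<>();

        static void set(SharedClient client) {
            CURRENT.set(client);
        }

        /** Re-creates this thread's client without tearing down a shared one. */
        static SharedClient recreate() {
            SharedClient old = CURRENT.get();
            if (old != null) {
                old.close(false);              // no-op when the old object is shared
            }
            SharedClient fresh = new SharedClient(true);
            CURRENT.set(fresh);
            return fresh;
        }
    }

    public static void main(String[] args) {
        SharedClient parent = new SharedClient(false);   // shared by the session
        ClientHolder.set(parent);
        ClientHolder.recreate();
        System.out.println("parent closed after recreate: " + parent.isClosed());
        parent.close(true);                              // the owner closes it explicitly
        System.out.println("parent closed by owner: " + parent.isClosed());
    }
}
```

In this sketch the session that owns the shared object stays responsible for 
closing it, so a child thread that re-creates its own object cannot invalidate 
it for the others.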

*Memory leak issue:*
A memory leak occurs if one of the threads in Hive.loadDynamicPartitions 
throws an exception. rawStoreMap stores the rawStore objects that have to be 
cleaned up. It is populated only in the success flow; if a thread throws an 
exception, its rawStore is never added and hence is leaked. 
{code}
futures.add(pool.submit(new Callable<Void>() {
  @Override
  public Void call() throws Exception {
    try {
      // move file would require session details (needCopy() invokes SessionState.get)
      SessionState.setCurrentSessionState(parentSession);
      LOG.info("New loading path = " + partPath + " with partSpec " + fullPartSpec);

      // load the partition
      Partition newPartition = loadPartition(partPath, tbl, fullPartSpec, loadFileType,
          true, false, numLB > 0, false, isAcid, hasFollowingStatsTask, writeId, stmtId,
          isInsertOverwrite);
      partitionsMap.put(fullPartSpec, newPartition);

      if (inPlaceEligible) {
        synchronized (ps) {
          InPlaceUpdate.rePositionCursor(ps);
          partitionsLoaded.incrementAndGet();
          InPlaceUpdate.reprintLine(ps, "Loaded : " + partitionsLoaded.get() + "/"
              + partsToLoad + " partitions.");
        }
      }
      // Add embedded rawstore, so we can cleanup later to avoid memory leak
      if (getMSC().isLocalMetaStore()) {
        if (!rawStoreMap.containsKey(Thread.currentThread().getId())) {
          rawStoreMap.put(Thread.currentThread().getId(), HiveMetaStore.HMSHandler.getRawStore());
        }
      }
      return null;
    } catch (Exception t) {
    }
{code}
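
One possible way to close this gap, sketched under assumptions and not taken 
from the patch, is to register the per-thread resource in a finally block so 
that it is tracked on the failure path as well. The resources map below is a 
hypothetical stand-in for rawStoreMap, and the string values stand in for the 
rawStore objects.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FinallyCleanupSketch {
    // Hypothetical stand-in for rawStoreMap: worker thread name -> resource to clean up later.
    static final Map<String, String> resources = new ConcurrentHashMap<>();

    /** Runs one worker task (optionally failing) and returns how many resources are tracked. */
    static int trackedAfterRun(boolean fail) {
        resources.clear();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<Void> work = () -> {
            try {
                if (fail) {
                    throw new IllegalStateException("simulated load failure");
                }
                return null;
            } finally {
                // Registered on the success path and on the failure path alike.
                String worker = Thread.currentThread().getName();
                resources.putIfAbsent(worker, "resource-of-" + worker);
            }
        };
        Future<Void> future = pool.submit(work);
        try {
            future.get();
        } catch (ExecutionException e) {
            // Expected when fail is true: the task failed, but its resource was still recorded.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return resources.size();
    }

    public static void main(String[] args) {
        System.out.println("tracked after failure: " + trackedAfterRun(true));
        System.out.println("tracked after success: " + trackedAfterRun(false));
    }
}
```

Whatever later iterates over the map to release the resources then sees an 
entry for every worker thread, whether or not its task threw.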





Re: Writing Parquet Timestamp and reading from Hive table

2018-09-24 Thread Srinivas M
Hi Folks

Any suggestions or thoughts on the question / issue posted below ?

Regards
Srinivas

On 2018/09/19 10:47:38, Srinivas M  wrote:
> Hi
>
> We have a java application which writes parquet files. We are using the
> Parquet 1.9.0 API to write the Timestamp data. Since there are
> incompatibilities between the Parquet and Hive representation of the
> Timestamp data, we have tried to work around the same by writing the
> Parquet Timestamp data as 12 byte array by converting the Timestamp fields
> in the format Hive expects. However, while setting the field type in the
> Schema, since Avro Schema Types does not have an enumeration for the INT96
> type, we have set it to bytes under the assumption that hive would allow
> reading the data since we have written in the format Hive expects. However,
> when we are trying to read the data from the Hive table, we are running
> into the following exception.
>
> *Question : *
> *---*
> *1. Is there any way we can work around this issue by making hive read the
> data when the timestamp field is set as bytes*
> *2. Is there any way in which the data type can be set as INT96 in the
> parquet schema ?*
>
> Exception :
>
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be
> cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
>
> Schema of the file
> =
> file schema: parquet.filecc
>
> C1:  REQUIRED INT32 R:0 D:0
> C2:  REQUIRED BINARY O:UTF8 R:0 D:0
> C3:  REQUIRED BINARY O:UTF8 R:0 D:0
> *C4:  REQUIRED BINARY R:0 D:0  > Timestamp Column*
> *C5:  REQUIRED BINARY R:0 D:0  > Timestamp Column*
>
> ---
>
> hive> show create table HiveParquetTimestamp;
> OK
> CREATE EXTERNAL TABLE `HiveParquetTimestamp`(
>   `c1` int,
>   `c2` char(4),
>   `c3` varchar(8),
>   `c4` timestamp,
>   `c5` timestamp)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://cdhkrb123.fyre.com:8020/tmp/HiveParquetTimestamp'
>
> -- 
> Srinivas
> (*-*)
>
> ---
> You have to grow from the inside out. None can teach you, none can make you
> spiritual.
>   -Narendra Nath Dutta(Swamy Vivekananda)
>
> ---
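
For reference, the 12 byte layout mentioned in the quoted mail can be sketched 
as below. This assumes the layout commonly used for INT96 timestamps (eight 
bytes of nanoseconds within the day followed by four bytes of Julian day 
number, both little-endian) and treats the instant as UTC. The class and 
method names are hypothetical. It only illustrates the byte layout and does 
not address the schema type question, since the column would still be 
declared as BINARY.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

final class Int96TimestampSketch {
    static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;   // Julian day number of 1970-01-01
    static final long SECONDS_PER_DAY = 86_400L;
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Encodes an instant as 8 bytes nanos-of-day plus 4 bytes Julian day, little-endian. */
    static byte[] toInt96(Instant ts) {
        long epochSeconds = ts.getEpochSecond();
        // floorDiv/floorMod keep the split correct for instants before 1970.
        long daysSinceEpoch = Math.floorDiv(epochSeconds, SECONDS_PER_DAY);
        long secondsOfDay = Math.floorMod(epochSeconds, SECONDS_PER_DAY);
        long nanosOfDay = secondsOfDay * NANOS_PER_SECOND + ts.getNano();
        return ByteBuffer.allocate(12)
            .order(ByteOrder.LITTLE_ENDIAN)
            .putLong(nanosOfDay)
            .putInt((int) (daysSinceEpoch + JULIAN_DAY_OF_EPOCH))
            .array();
    }

    public static void main(String[] args) {
        byte[] encoded = toInt96(Instant.parse("2018-09-19T10:47:38Z"));
        StringBuilder hex = new StringBuilder();
        for (byte b : encoded) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex);
    }
}
```

Any time-zone adjustment that Hive applies when reading such values is outside 
the scope of this sketch.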


Re: Review Request 68772: HIVE-20593

2018-09-24 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68772/
---

(Updated Sept. 24, 2018, 6:56 a.m.)


Review request for hive and Eugene Koifman.


Changes
---

Implemented changes recommended.
Got green run on ptests.


Bugs: HIVE-20593
https://issues.apache.org/jira/browse/HIVE-20593


Repository: hive-git


Description
---

Load Data for partitioned ACID tables fails with bucketId out of range: -1

The tempTblObj is inherited from the target table. However, the only table 
property that needs to be inherited is the bucketing version. Properties such 
as transactional should be ignored.
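
As a rough illustration of the idea (not the code under review), the temp 
table's properties can be built by copying only the bucketing version from the 
target table's properties. The property keys "bucketing_version" and 
"transactional" and the helper name inheritedProps are assumptions made for 
this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class InheritBucketingVersionSketch {
    // Assumed property key; verify it against the Hive version in use.
    static final String BUCKETING_VERSION = "bucketing_version";

    /** Returns the properties a temp table would inherit: the bucketing version only. */
    static Map<String, String> inheritedProps(Map<String, String> targetProps) {
        Map<String, String> tempProps = new HashMap<>();
        String version = targetProps.get(BUCKETING_VERSION);
        if (version != null) {
            tempProps.put(BUCKETING_VERSION, version);
        }
        return tempProps;   // "transactional" and everything else is deliberately dropped
    }

    public static void main(String[] args) {
        Map<String, String> target = new HashMap<>();
        target.put(BUCKETING_VERSION, "2");
        target.put("transactional", "true");
        System.out.println(inheritedProps(target));
    }
}
```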


Diffs (updated)
-

  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_0_0
 PRE-CREATION 
  
data/files/load_data_job_acid/20180918230307-b382b8c7-271c-4025-be64-4a68f4db32e5_1_0
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 
8d33cf5b23 
  ql/src/test/queries/clientpositive/load_data_using_job.q b760d9bc7e 
  ql/src/test/results/clientpositive/llap/load_data_using_job.q.out 21fd9334ea 


Diff: https://reviews.apache.org/r/68772/diff/2/

Changes: https://reviews.apache.org/r/68772/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-20626) Log more details when druid metastore transaction fails in callback

2018-09-24 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20626:
---

 Summary: Log more details when druid metastore transaction fails 
in callback
 Key: HIVE-20626
 URL: https://issues.apache.org/jira/browse/HIVE-20626
 Project: Hive
  Issue Type: Task
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


The exception below does not give much detail about the actual cause of the 
error. 
We also need to log the callback exception when we get it. 
{code} 
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Transaction failed do to exception being thrown from within the callback. See cause for the original exception.)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:932) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:937) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4954) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:428) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2668) ~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
{code} 
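
A minimal sketch of what logging the callback exception could look like, 
assuming the wrapper exception carries the original failure as its cause. 
The class and method names are hypothetical, and java.util.logging is used 
only to keep the example self-contained.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.logging.Logger;

final class CauseChainSketch {
    private static final Logger LOG = Logger.getLogger(CauseChainSketch.class.getName());

    /** Renders the exception and every nested cause on one line, outermost first. */
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        // Identity-based set guards against a cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable cur = t; cur != null && seen.add(cur); cur = cur.getCause()) {
            if (sb.length() > 0) {
                sb.append(" <- caused by: ");
            }
            sb.append(cur.getClass().getName()).append(": ").append(cur.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Throwable callbackFailure = new IllegalStateException("simulated callback failure");
        Throwable wrapper = new RuntimeException("Transaction failed", callbackFailure);
        // Log the whole chain so the callback's own exception is visible, not only the wrapper.
        LOG.severe(describe(wrapper));
    }
}
```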


