[GitHub] hive pull request #355: HIVE-19584: Dictionary encoding for string types

2018-05-24 Thread pudidic
GitHub user pudidic opened a pull request:

https://github.com/apache/hive/pull/355

HIVE-19584: Dictionary encoding for string types

Apache Arrow supports dictionary encoding for some data types. So implement 
dictionary encoding for string types in Arrow SerDe.
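
For readers unfamiliar with the technique, the following is a minimal plain-Java sketch of dictionary encoding for strings. It does not use the Arrow or Hive SerDe APIs, and the class name `DictionaryEncodingSketch` and its fields are hypothetical; it only illustrates the idea of storing each distinct string once and replacing every row with an integer index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain-Java dictionary encoding, not the Arrow or Hive SerDe API. */
final class DictionaryEncodingSketch {
    /** Distinct values, in first-seen order. */
    final List<String> dictionary = new ArrayList<>();
    /** One dictionary index per input row. */
    final List<Integer> indices = new ArrayList<>();

    /** Replaces each string with the index of its first occurrence in the dictionary. */
    static DictionaryEncodingSketch encode(List<String> values) {
        DictionaryEncodingSketch out = new DictionaryEncodingSketch();
        Map<String, Integer> positions = new HashMap<>();
        for (String value : values) {
            Integer index = positions.get(value);
            if (index == null) {
                index = out.dictionary.size();
                positions.put(value, index);
                out.dictionary.add(value);
            }
            out.indices.add(index);
        }
        return out;
    }

    /** Rebuilds the original rows by looking every index up in the dictionary. */
    List<String> decode() {
        List<String> values = new ArrayList<>(indices.size());
        for (int index : indices) {
            values.add(dictionary.get(index));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("tez", "llap", "tez", "tez", "llap");
        DictionaryEncodingSketch encoded = encode(rows);
        System.out.println(encoded.dictionary); // prints "[tez, llap]"
        System.out.println(encoded.indices);    // prints "[0, 1, 0, 0, 1]"
    }
}
```

The benefit grows with repetition: a column with few distinct strings stores each string once plus one small integer per row.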

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pudidic/hive HIVE-19584

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/355.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #355


commit 9c043f310d0d684318093e6d9a074fce51cf6b4c
Author: Teddy Choi 
Date:   2018-05-24T07:36:31Z

HIVE-19584: Dictionary encoding for string types




---


[jira] [Created] (HIVE-19692) Hive need to support Multilines Comments

2018-05-24 Thread PRAFUL DASH (JIRA)
PRAFUL DASH created HIVE-19692:
--

 Summary: Hive need to support Multilines Comments
 Key: HIVE-19692
 URL: https://issues.apache.org/jira/browse/HIVE-19692
 Project: Hive
  Issue Type: Improvement
  Components: Clients
Affects Versions: 2.3.2
Reporter: PRAFUL DASH
Assignee: PRAFUL DASH
 Fix For: 0.10.1


I want to develop multiline comment support in Hive. Please provide the required 
classes, jars, and an understanding of the engine and execution plan.

 

 

Thanks,

PRAFUL

+91-9818868839



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 66800: HIVE-6980 Drop table by using direct sql

2018-05-24 Thread Vihang Karajgaonkar via Review Board


> On May 14, 2018, 7:07 p.m., Vihang Karajgaonkar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2545 (patched)
> > 
> >
> > Why not LOG.error?
> 
> Peter Vary wrote:
> My original thought was that we fall back to the JDO solution in this case, so 
> the problem is not fatal.
> I think this is why the other cases are using LOG.warn too.
> Shall I change here, and other places as well?
> 
> Thanks for the review!

I see, that sounds good to me then. Thanks


- Vihang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66800/#review203055
---


On May 11, 2018, 2:13 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66800/
> ---
> 
> (Updated May 11, 2018, 2:13 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Alan Gates, Marta Kuczora, Adam 
> Szita, and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-6980
> https://issues.apache.org/jira/browse/HIVE-6980
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> First version of the patch.
> 
> Splits getPartitionsViaSqlFilterInternal to:
> 
> getPartitionIdsViaSqlFilter - which returns the partition ids
> getPartitionsFromPartitionIds - which returns the partition data for the 
> partitions
> Creates dropPartitionsByPartitionIds which drops the partitions by directSQL 
> commands
> 
> Creates a dropPartitionsViaSqlFilter using getPartitionIdsViaSqlFilter and 
> dropPartitionsByPartitionIds.
> 
> Modifies the ObjectStore to drop partitions with directsql if possible.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
>  56fbfed 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  b0a805f 
> 
> 
> Diff: https://reviews.apache.org/r/66800/diff/4/
> 
> 
> Testing
> ---
> 
> Ran the TestDropPartition tests and also checked the database manually to 
> confirm that no objects were left in the database.
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



[jira] [Created] (HIVE-19693) Create hive API on Java 1.9 based

2018-05-24 Thread Murtaza Hatim Zaveri (JIRA)
Murtaza Hatim Zaveri created HIVE-19693:
---

 Summary: Create hive API on Java 1.9 based
 Key: HIVE-19693
 URL: https://issues.apache.org/jira/browse/HIVE-19693
 Project: Hive
  Issue Type: Improvement
  Components: API
Affects Versions: 2.3.2
Reporter: Murtaza Hatim Zaveri
Assignee: Murtaza Hatim Zaveri
 Fix For: 0.10.1








[jira] [Created] (HIVE-19695) Year Month Day extraction functions need to add an implicit cast for column that are String types

2018-05-24 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-19695:
-

 Summary: Year Month Day extraction functions need to add an 
implicit cast for column that are String types
 Key: HIVE-19695
 URL: https://issues.apache.org/jira/browse/HIVE-19695
 Project: Hive
  Issue Type: Bug
  Components: Druid integration, Query Planning
Affects Versions: 3.0.0
Reporter: slim bouguerra
Assignee: slim bouguerra
 Fix For: 3.1.0


To avoid surprising or wrong results, the Hive query plan should add an explicit 
cast over non-date/timestamp column types when a user tries to extract 
Year/Month/Hour, etc.
This is an example of misleading results.
{code}
create table test_base_table(`timecolumn` timestamp, `date_c` string, 
`timestamp_c` string,  `metric_c` double);
insert into test_base_table values ('2015-03-08 00:00:00', '2015-03-10', 
'2015-03-08 00:00:00', 5.0);
CREATE TABLE druid_test_table
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY")
AS select
cast(`timecolumn` as timestamp with local time zone) as `__time`, `date_c`, 
`timestamp_c`, `metric_c` FROM test_base_table;
select
year(date_c), month(date_c),day(date_c), hour(date_c),
year(timestamp_c), month(timestamp_c),day(timestamp_c), hour(timestamp_c)
from druid_test_table;
{code} 

will return the following wrong results:
{code}
PREHOOK: query: select
year(date_c), month(date_c),day(date_c), hour(date_c),
year(timestamp_c), month(timestamp_c),day(timestamp_c), hour(timestamp_c)
from druid_test_table
PREHOOK: type: QUERY
PREHOOK: Input: default@druid_test_table
 A masked pattern was here 
POSTHOOK: query: select
year(date_c), month(date_c),day(date_c), hour(date_c),
year(timestamp_c), month(timestamp_c),day(timestamp_c), hour(timestamp_c)
from druid_test_table
POSTHOOK: type: QUERY
POSTHOOK: Input: default@druid_test_table
 A masked pattern was here 
1969	12	31	16	1969	12	31	16
{code}
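
One plausible reading of these values (an assumption; the report does not state the cause) is that the string columns were evaluated as timestamp zero and the fields were then extracted in a UTC-8 time zone. The sketch below, using only `java.time`, shows that the Unix epoch viewed from `America/Los_Angeles` yields exactly year 1969, month 12, day 31, hour 16. The class name `EpochFallbackSketch` and the choice of zone are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Arrays;

/** Illustrative only: shows what epoch 0 looks like in a UTC-8 zone. */
final class EpochFallbackSketch {
    /** Returns {year, month, day, hour} of the given epoch millis in the given zone. */
    static int[] fields(long epochMillis, String zoneId) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId));
        return new int[] {t.getYear(), t.getMonthValue(), t.getDayOfMonth(), t.getHour()};
    }

    public static void main(String[] args) {
        // Assumed zone; any zone at UTC-8 on that date gives the same fields.
        System.out.println(Arrays.toString(fields(0L, "America/Los_Angeles")));
        // prints "[1969, 12, 31, 16]"
    }
}
```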





[jira] [Created] (HIVE-19696) Create schema scripts for Hive 3.1.0 and Hive 4.0.0

2018-05-24 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-19696:
--

 Summary: Create schema scripts for Hive 3.1.0 and Hive 4.0.0
 Key: HIVE-19696
 URL: https://issues.apache.org/jira/browse/HIVE-19696
 Project: Hive
  Issue Type: Task
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


Metastore schema init scripts are missing for the new release after the branch 
was cut.





[jira] [Created] (HIVE-19697) TestReOptimization#testStatCachingMetaStore is flaky

2018-05-24 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19697:
--

 Summary: TestReOptimization#testStatCachingMetaStore is flaky
 Key: HIVE-19697
 URL: https://issues.apache.org/jira/browse/HIVE-19697
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez


https://builds.apache.org/job/PreCommit-HIVE-Build/11180/testReport/junit/org.apache.hadoop.hive.ql.plan.mapping/TestReOptimization/testStatCachingMetaStore/





Hive Metastore Client Backwards Compatibility Issue

2018-05-24 Thread Bohdan Kazydub
Hello Hive,

I am using HiveMetaStoreClient of version 2.3.3 (
https://github.com/apache/hive/blob/rel/release-2.3.3/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java)
to query data (sql queries like 'show tables;' and 'select * from
customers;' etc.) from Hive 2.1 installed on my machine. I get the following
exception:

HiveMetaStoreClient - Failure while attempting to get hive table. Retries
once.
org.apache.thrift.TApplicationException: Invalid method name:
'get_table_req'

because ThriftHiveMetastore of version 2.1 does not have the method (as it
uses get_table method which got deprecated in 2.3).

The question is: is it possible for hive metastore client of version 2.3 to
interact with hive of version 2.1 without that method's invocation? If not,
what are my options?

P.S. There is a configuration property
https://github.com/apache/hive/blob/rel/release-2.3.3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L598
(added in the same commit as the 'get_table_req' method) which is true by
default. It does nothing in this case, although it looks as if it should (if I
understand it correctly, my queries should have failed fast, even before
the method invocation). Disabling it (setting it to false) produced the same
result as described above.


[jira] [Created] (HIVE-19699) Re-enable TestReOptimization

2018-05-24 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19699:
--

 Summary: Re-enable TestReOptimization
 Key: HIVE-19699
 URL: https://issues.apache.org/jira/browse/HIVE-19699
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Zoltan Haindrich


https://builds.apache.org/job/PreCommit-HIVE-Build/11180/testReport/junit/org.apache.hadoop.hive.ql.plan.mapping/TestReOptimization/testStatCachingMetaStore/





Review Request 67294: HIVE-18453: ACID: Add "CREATE TRANSACTIONAL TABLE" syntax to unify ACID ORC & Parquet support

2018-05-24 Thread Igor Kryvenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67294/
---

Review request for hive, Ashutosh Chauhan, Eugene Koifman, and Vineet Garg.


Bugs: HIVE-18453
https://issues.apache.org/jira/browse/HIVE-18453


Repository: hive-git


Description
---

The ACID table markers are currently done with TBLPROPERTIES which is 
inherently fragile.

The "create transactional table" offers a way to standardize the syntax and 
allows for future compatibility changes to support Parquet ACIDv2 tables along 
with ORC tables.

The ACIDv2 design is format independent, with the ability to add new vectorized 
input formats with no changes to the design.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 09a4368 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 8726974 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 55bd92d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7ff7e18 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 3e2784b 
  ql/src/test/queries/clientnegative/create_external_transactional.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/create_1.q f348e59 
  ql/src/test/queries/clientpositive/create_transactional.q PRE-CREATION 
  ql/src/test/results/clientnegative/create_external_transactional.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/create_transactional.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/67294/diff/1/


Testing
---


Thanks,

Igor Kryvenko



[jira] [Created] (HIVE-19698) TestAMReporter#testMultipleAM is flaky

2018-05-24 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19698:
--

 Summary: TestAMReporter#testMultipleAM is flaky
 Key: HIVE-19698
 URL: https://issues.apache.org/jira/browse/HIVE-19698
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez


https://builds.apache.org/job/PreCommit-HIVE-Build/11184/testReport/junit/org.apache.hadoop.hive.llap.daemon.impl.comparator/TestAMReporter/testMultipleAM/

It times out sporadically:
https://builds.apache.org/job/PreCommit-HIVE-Build/11184/testReport/junit/org.apache.hadoop.hive.llap.daemon.impl.comparator/TestAMReporter/testMultipleAM/history/




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19694) Create Materialized View statement should check for MV name conflicts before running MV's SQL statement.

2018-05-24 Thread Nita Dembla (JIRA)
Nita Dembla created HIVE-19694:
--

 Summary: Create Materialized View statement should check for MV 
name conflicts before running MV's SQL statement. 
 Key: HIVE-19694
 URL: https://issues.apache.org/jira/browse/HIVE-19694
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.0.0
Reporter: Nita Dembla
 Fix For: 3.0.1


If the CREATE MATERIALIZED VIEW statement refers to an MV name that already 
exists, the statement runs the SQL on the cluster and the Move task returns an 
error at the very end.

This unnecessarily uses up cluster resources and user time.

 
{code:java}
0: jdbc:hive2://localhost:10007/tpcds_bin_par> CREATE MATERIALIZED VIEW 
mv_store_sales_item_store
. . . . . . . . . . . . . . . . . . . . . . .> ENABLE REWRITE AS (
. . . . . . . . . . . . . . . . . . . . . . .>  select ss_item_sk,
. . . . . . . . . . . . . . . . . . . . . . .>  ss_store_sk,
. . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_quantity) as ss_quantity,
. . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_ext_wholesale_cost) as 
ss_ext_wholesale_cost,
. . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_paid) as ss_net_paid,
. . . . . . . . . . . . . . . . . . . . . . .>  sum(ss_net_profit) as 
ss_net_profit
. . . . . . . . . . . . . . . . . . . . . . .>  from store_sales
. . . . . . . . . . . . . . . . . . . . . . .>  group by ss_item_sk,ss_store_sk
. . . . . . . . . . . . . . . . . . . . . . .>  );
INFO  : Compiling 
command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
CREATE MATERIALIZED VIEW mv_store_sales_item_store
ENABLE REWRITE AS (
select ss_item_sk,
ss_store_sk,
sum(ss_quantity) as ss_quantity,
sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
sum(ss_net_paid) as ss_net_paid,
sum(ss_net_profit) as ss_net_profit
from store_sales
group by ss_item_sk,ss_store_sk
)
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:ss_item_sk, type:bigint, comment:null), 
FieldSchema(name:ss_store_sk, type:bigint, comment:null), 
FieldSchema(name:ss_quantity, type:bigint, comment:null), 
FieldSchema(name:ss_ext_wholesale_cost, type:double, comment:null), 
FieldSchema(name:ss_net_paid, type:double, comment:null), 
FieldSchema(name:ss_net_profit, type:double, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037); Time 
taken: 3.652 seconds
INFO  : Executing 
command(queryId=root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037): 
CREATE MATERIALIZED VIEW mv_store_sales_item_store
ENABLE REWRITE AS (
select ss_item_sk,
ss_store_sk,
sum(ss_quantity) as ss_quantity,
sum(ss_ext_wholesale_cost) as ss_ext_wholesale_cost,
sum(ss_net_paid) as ss_net_paid,
sum(ss_net_profit) as ss_net_profit
from store_sales
group by ss_item_sk,ss_store_sk
)
INFO  : Query ID = root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Subscribed to counters: [] for queryId: 
root_20180524034330_21fca7f6-ed5a-492c-88e9-913d4120b037
INFO  : Session is already open
INFO  : Dag name: CREATE MATERIALIZED V...tem_sk,ss_store_sk
) (Stage-1)
INFO  : Status: Running (Executing on YARN cluster with App id 
application_1525123931791_0151)

--
VERTICES      MODE   STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--
Map 1 ..      llap   SUCCEEDED   1682       1682        0        0       0       0
Reducer 2 ..  llap   SUCCEEDED   1009       1009        0        0       0       7
--
VERTICES: 02/02  [==>>] 100%  ELAPSED TIME: 1734.00 s
--
INFO  : Status: DAG finished successfully in 1731.89 seconds
INFO  :
INFO  : Query Execution Summary
INFO  : 

[jira] [Created] (HIVE-19703) GenericUDTFGetSplits never uses num splits argument

2018-05-24 Thread Eric Wohlstadter (JIRA)
Eric Wohlstadter created HIVE-19703:
---

 Summary: GenericUDTFGetSplits never uses num splits argument
 Key: HIVE-19703
 URL: https://issues.apache.org/jira/browse/HIVE-19703
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Eric Wohlstadter


The description for GenericUDTFGetSplits says
{code}
Returns an array of length int serialized splits for the referenced tables 
string.
{code}

but the argument to control the number of splits is DOA.





Review Request 67296: HIVE-18875 : Enable SMB Join by default in Tez

2018-05-24 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67296/
---

Review request for hive, Gunther Hagleitner and Jason Dere.


Bugs: HIVE-18875
https://issues.apache.org/jira/browse/HIVE-18875


Repository: hive-git


Description
---

Fixed various issues with SMB, mostly on the Reducer side join.
GBY Op now uses inputObjectInspector[0] all the time as it is the only OI it 
has. The tag is irrelevant here. It was causing a problem with SMB.
Disabled SMB in spark on hive tests as the same config for Tez was enabling it 
there.
Some SMB specific tests were designed to first run without SMB and then with 
SMB. With SMB enabled by default, it is explicitly turned off to make sure the 
behavior is maintained.

Please go through the JIRA comments as they may clear up some questions.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 931533a556 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 
aefaa0586e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4b766382ef 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
4019f132d3 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 9e5446566b 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q 7416eb0ec0 
  ql/src/test/queries/clientpositive/skewjoinopt19.q 02cadda7f5 
  ql/src/test/queries/clientpositive/skewjoinopt20.q 160e5b82d9 
  ql/src/test/queries/clientpositive/smb_mapjoin_11.q 6ce49b83c2 
  ql/src/test/queries/clientpositive/smb_mapjoin_12.q 753e4d3c9a 
  ql/src/test/queries/clientpositive/smb_mapjoin_17.q d68f5f3139 
  ql/src/test/queries/clientpositive/subquery_notin.q 64940277bb 
  ql/src/test/results/clientpositive/llap/correlationoptimizer2.q.out 
0f839ead0e 
  ql/src/test/results/clientpositive/llap/correlationoptimizer6.q.out 
499ef4b178 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out f4852d2f15 
  ql/src/test/results/clientpositive/llap/limit_pushdown.q.out 79311d0415 
  ql/src/test/results/clientpositive/llap/mergejoin.q.out 832ed487ec 
  ql/src/test/results/clientpositive/llap/mrr.q.out 737c73893f 
  ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 
09a120ae12 
  ql/src/test/results/clientpositive/llap/smb_cache.q.out 7c885d1ffa 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_14.q.out c334b9386b 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_15.q.out 21aac455f2 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_4.q.out 4b8728fbff 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_5.q.out a1313696f0 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_6.q.out f44a0dbc70 
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out c9956121f8 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out d72e8c349c 
  ql/src/test/results/clientpositive/llap/vectorized_bucketmapjoin1.q.out 
61c5051bb9 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out a79a8c466a 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_14.q.out 1fd4490ac4 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_15.q.out 6ca577fdbb 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out 629a6c428a 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out 7d0934010e 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_6.q.out 7445135159 
  ql/src/test/results/clientpositive/spark/subquery_notin.q.out ea473c3b40 


Diff: https://reviews.apache.org/r/67296/diff/1/


Testing
---


Thanks,

Deepak Jaiswal



DB install and upgrade scripts in the brave new world of multiple release lines

2018-05-24 Thread Alan Gates
The change to have branches running with master as 4 and branch-3 for 3.x
releases is complicating our DB install and upgrade scripts.  There's a
JIRA to track the changes, and there is some discussion on that JIRA of how
best to proceed, starting with the comment
https://issues.apache.org/jira/browse/HIVE-19323?focusedCommentId=16489833&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16489833
I'm posting this here as others may be interested in chiming in.

Alan.


[jira] [Created] (HIVE-19700) Workaround for JLine issue with UnsupportedTerminal

2018-05-24 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-19700:


 Summary: Workaround for JLine issue with UnsupportedTerminal
 Key: HIVE-19700
 URL: https://issues.apache.org/jira/browse/HIVE-19700
 Project: Hive
  Issue Type: Bug
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Fix For: 2.2.1


In JLine's ConsoleReader, readLine(prompt, mask) calls the following 
beforeReadLine() method.
{code}
try {
    // System.out.println("is terminal supported " + terminal.isSupported());
    if (!terminal.isSupported()) {
        beforeReadLine(prompt, mask);
    }
{code}

So, specifically, when using UnsupportedTerminal ({{-Djline.terminal}}) with 
{{prompt=null}} and {{mask!=null}}, a "null" string gets printed to the console 
before and after the query result. {{UnsupportedTerminal}} is required when 
running beeline as a background process; beeline hangs otherwise.

{code}
private void beforeReadLine(final String prompt, final Character mask) {
    if (mask != null && maskThread == null) {
        final String fullPrompt = "\r" + prompt
            + " "
            + " "
            + " "
            + "\r" + prompt;

        maskThread = new Thread()
        {
            public void run() {
                while (!interrupted()) {
                    try {
                        Writer out = getOutput();
                        out.write(fullPrompt);
{code}

So {{prompt}} is null and {{mask}} is not null in at least 2 scenarios in 
beeline:
* When beeline's silent=true, prompt is null:
  https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/BeeLine.java#L1264
* When running multiline queries:
  https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/Commands.java#L1093

When executing beeline in script mode (commands in a file), there should not be 
any masking while reading lines from the script file. That is, each entire line 
should be a beeline command or part of a multiline hive query.

So it should be safe to use a null mask instead of {{ConsoleReader.NULL_MASK}} 
when using UnsupportedTerminal as jline terminal.
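
The literal "null" in the console follows from ordinary Java string concatenation, which renders a null reference as the four characters `null`. The sketch below mirrors the concatenation quoted above in a standalone method; it is not JLine code, and the class name `NullPromptSketch`, the method `maskOutput`, and the `'*'` mask are hypothetical stand-ins. It also shows why a null mask avoids the problem: the branch that builds the prompt string is never entered.

```java
/** Illustrative only: mirrors the concatenation quoted above, not JLine itself. */
final class NullPromptSketch {
    /** Returns the text the mask thread would write, or null when masking is skipped. */
    static String maskOutput(String prompt, Character mask) {
        if (mask == null) {
            return null; // no mask: the prompt string is never built or written
        }
        // Concatenating a null reference produces the characters "null".
        return "\r" + prompt + " " + " " + " " + "\r" + prompt;
    }

    public static void main(String[] args) {
        // '*' is an arbitrary non-null mask chosen for the example.
        System.out.println(maskOutput(null, '*').contains("null")); // prints "true"
        System.out.println(maskOutput(null, null) == null);         // prints "true"
    }
}
```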





[jira] [Created] (HIVE-19702) Backport ALTER TABLE SET OWNER patches to branch-2

2018-05-24 Thread JIRA
Sergio Peña created HIVE-19702:
--

 Summary: Backport ALTER TABLE SET OWNER patches to branch-2
 Key: HIVE-19702
 URL: https://issues.apache.org/jira/browse/HIVE-19702
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 2.4.0
Reporter: Sergio Peña


This Jira will track and test the backports of all the patches to enable ALTER 
TABLE SET OWNER on the branch-2 branch.

The patches are:
 # HIVE-19371
 # HIVE-19372
 # HIVE-19374





[jira] [Created] (HIVE-19701) getDelegationTokenFromMetaStore doesn't need to be synchronized

2018-05-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19701:
---

 Summary: getDelegationTokenFromMetaStore doesn't need to be 
synchronized
 Key: HIVE-19701
 URL: https://issues.apache.org/jira/browse/HIVE-19701
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair


or so it seems





Review Request 67307: HIVE-19704 LLAP IO retries on branch-2 should be stoppable

2018-05-24 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67307/
---

Review request for hive and Prasanth_J.


Repository: hive-git


Description
---

see jira


Diffs
-

  llap-server/src/java/org/apache/hadoop/hive/llap/cache/BuddyAllocator.java 
302918aadfb54a024ced8ddbd46153cad1ab8baf 
  
llap-server/src/java/org/apache/hadoop/hive/llap/cache/LowLevelCacheMemoryManager.java
 e331f1bdfdce294bd6c50984308a654a4ad451a3 
  llap-server/src/java/org/apache/hadoop/hive/llap/cache/MemoryManager.java 
0f4d3c01d7151d1994b5c7e5c7580391af161bca 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
 0fd81397a2a546e0a00776eb6b123bed32031afb 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/SerDeEncodedDataReader.java
 a088e27c82d2a856764b0b48f13361a75a281053 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java
 601b622b4967b9192afdbf8e3cde1ef3492c875b 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestBuddyAllocator.java 
a6080e63fa90ce09a17cd245729174183a5e9769 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestLowLevelLrfuCachePolicy.java
 0cce624682cd5673d372e5c2098c7b0b9a343d79 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java
 3059382942a0f4f920c9e12a880e4cca4797 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReader.java 
7540e72b5392fe6a5d67410c68f26a5d62435541 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 
3ef03ea35a0ca48119efa3e5e4a2d63ee6217b6d 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/StoppableAllocator.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/67307/diff/1/


Testing
---


Thanks,

Sergey Shelukhin



[jira] [Created] (HIVE-19704) LLAP IO retries on branch-2 should be stoppable

2018-05-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19704:
---

 Summary: LLAP IO retries on branch-2 should be stoppable
 Key: HIVE-19704
 URL: https://issues.apache.org/jira/browse/HIVE-19704
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Created] (HIVE-19707) Enable TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout

2018-05-24 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19707:
--

 Summary: Enable 
TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout
 Key: HIVE-19707
 URL: https://issues.apache.org/jira/browse/HIVE-19707
 Project: Hive
  Issue Type: Test
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez








[jira] [Created] (HIVE-19706) Disable TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout

2018-05-24 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-19706:
--

 Summary: Disable 
TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout
 Key: HIVE-19706
 URL: https://issues.apache.org/jira/browse/HIVE-19706
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 3.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


https://builds.apache.org/job/PreCommit-HIVE-Build/11190/testReport/junit/org.apache.hive.jdbc/TestJdbcWithMiniHS2/testHttpRetryOnServerIdleTimeout/history/

It seems to time out sporadically. I will ignore that specific retry test.





Re: Review Request 67296: HIVE-18875 : Enable SMB Join by default in Tez

2018-05-24 Thread Deepak Jaiswal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67296/
---

(Updated May 25, 2018, 3:06 a.m.)


Review request for hive, Gunther Hagleitner and Jason Dere.


Changes
---

Updated result files.


Bugs: HIVE-18875
https://issues.apache.org/jira/browse/HIVE-18875


Repository: hive-git


Description
---

Fixed various issues with SMB, mostly on the Reducer side join.
GBY Op now uses inputObjectInspector[0] all the time as it is the only OI it 
has. The tag is irrelevant here. It was causing a problem with SMB.
Disabled SMB in spark on hive tests as the same config for Tez was enabling it 
there.
Some SMB specific tests were designed to first run without SMB and then with 
SMB. With SMB enabled by default, it is explicitly turned off to make sure the 
behavior is maintained.

Please go through the JIRA comments as they may clear up some questions.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 931533a556 
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 
aefaa0586e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 4b766382ef 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 
4019f132d3 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java
 9e5446566b 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q 7416eb0ec0 
  ql/src/test/queries/clientpositive/skewjoinopt19.q 02cadda7f5 
  ql/src/test/queries/clientpositive/skewjoinopt20.q 160e5b82d9 
  ql/src/test/queries/clientpositive/smb_mapjoin_11.q 6ce49b83c2 
  ql/src/test/queries/clientpositive/smb_mapjoin_12.q 753e4d3c9a 
  ql/src/test/queries/clientpositive/smb_mapjoin_17.q d68f5f3139 
  ql/src/test/queries/clientpositive/subquery_notin.q 64940277bb 
  ql/src/test/results/clientpositive/llap/bucketsortoptimize_insert_6.q.out 
7533cf8579 
  ql/src/test/results/clientpositive/llap/correlationoptimizer2.q.out 
0f839ead0e 
  ql/src/test/results/clientpositive/llap/correlationoptimizer6.q.out 
499ef4b178 
  ql/src/test/results/clientpositive/llap/explainuser_1.q.out f4852d2f15 
  ql/src/test/results/clientpositive/llap/limit_pushdown.q.out 79311d0415 
  ql/src/test/results/clientpositive/llap/mergejoin.q.out 832ed487ec 
  ql/src/test/results/clientpositive/llap/mrr.q.out 737c73893f 
  ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 
09a120ae12 
  ql/src/test/results/clientpositive/llap/quotedid_smb.q.out 9c271a7958 
  ql/src/test/results/clientpositive/llap/smb_cache.q.out 7c885d1ffa 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_14.q.out c334b9386b 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_15.q.out 21aac455f2 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_4.q.out 4b8728fbff 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_5.q.out a1313696f0 
  ql/src/test/results/clientpositive/llap/smb_mapjoin_6.q.out f44a0dbc70 
  ql/src/test/results/clientpositive/llap/subquery_in_having.q.out c9956121f8 
  ql/src/test/results/clientpositive/llap/subquery_notin.q.out d72e8c349c 
  ql/src/test/results/clientpositive/llap/subquery_views.q.out 2c8530933c 
  ql/src/test/results/clientpositive/llap/vector_groupby_grouping_sets4.q.out 
285c154f3b 
  ql/src/test/results/clientpositive/llap/vectorized_bucketmapjoin1.q.out 
61c5051bb9 
  ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out a79a8c466a 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_14.q.out 1fd4490ac4 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_15.q.out 6ca577fdbb 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out 629a6c428a 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out 7d0934010e 
  ql/src/test/results/clientpositive/spark/smb_mapjoin_6.q.out 7445135159 
  ql/src/test/results/clientpositive/spark/subquery_notin.q.out ea473c3b40 


Diff: https://reviews.apache.org/r/67296/diff/2/

Changes: https://reviews.apache.org/r/67296/diff/1-2/


Testing
---


Thanks,

Deepak Jaiswal



[jira] [Created] (HIVE-19709) How to convert Julian date to calendar date

2018-05-24 Thread Rajkumar Budha Singh (JIRA)
Rajkumar Budha Singh created HIVE-19709:
---

 Summary: How to convert Julian date to calendar date
 Key: HIVE-19709
 URL: https://issues.apache.org/jira/browse/HIVE-19709
 Project: Hive
  Issue Type: Task
Reporter: Rajkumar Budha Singh


Hi,

We have files with a date column that is stored in the Julian format.

How can we convert a Julian date to a calendar date in Hive or Impala?

For example, one date in the Hive table is 120001, and I don't know what the 
correct calendar date is.
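
The value 120001 does not say which Julian convention it follows, so the 
following is only a sketch under one assumption: that the column uses the 
six-digit CYYDDD layout (century flag, two-digit year, day of year) found in 
some ERP exports. Under that reading, 120001 would be day 1 of 2020. The 
function name cyyddd_to_date is hypothetical, and the same arithmetic would 
still have to be re-expressed with Hive or Impala date functions.

```python
from datetime import date, timedelta


def cyyddd_to_date(value: int) -> date:
    """Convert a CYYDDD integer to a calendar date.

    Assumed layout (not confirmed by the issue text):
      C   = centuries after 1900 (0 -> 19xx, 1 -> 20xx)
      YY  = two-digit year within that century
      DDD = day of the year, starting at 001
    """
    century, rest = divmod(value, 100000)
    yy, day_of_year = divmod(rest, 1000)
    year = 1900 + 100 * century + yy
    if day_of_year < 1:
        raise ValueError(f"day of year must be >= 1, got {day_of_year}")
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    # Reject day numbers that run past the end of the year (e.g. 366 in a non-leap year).
    if result.year != year:
        raise ValueError(f"day of year {day_of_year} is out of range for {year}")
    return result


print(cyyddd_to_date(120001))  # 2020-01-01 under the CYYDDD assumption
```

If the files instead use YYDDD or YYYYDDD, the split between the year part and 
the day-of-year part changes, but the "first day of the year plus DDD - 1 days" 
step stays the same.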





[jira] [Created] (HIVE-19705) remove the stop call from LLAP IO and switch to using thread interrupt

2018-05-24 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-19705:
---

 Summary: remove the stop call from LLAP IO and switch to using 
thread interrupt
 Key: HIVE-19705
 URL: https://issues.apache.org/jira/browse/HIVE-19705
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Custom handling leaves places where processing may get stuck. 
However, I'm not sure all the code is 100% interrupt safe w.r.t. refcounts.






[jira] [Created] (HIVE-19708) Repl copy retrying with cm path even if the failure is due to network issue

2018-05-24 Thread mahesh kumar behera (JIRA)
mahesh kumar behera created HIVE-19708:
--

 Summary: Repl copy retrying with cm path even if the failure is 
due to network issue
 Key: HIVE-19708
 URL: https://issues.apache.org/jira/browse/HIVE-19708
 Project: Hive
  Issue Type: Task
  Components: Hive, HiveServer2, repl
Affects Versions: 3.1.0
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera
 Fix For: 3.1.0


 * Add a database-level parameter that identifies whether the database is a 
source of replication. Beacon will set this.

 * Enable the CM root only for databases that are the source of a replication 
policy; for other databases, skip the CM root functionality.

 * Prevent a database drop if the parameter marking the database as a 
replication source is set.

 * As part of the upgrade to this version, Beacon should set the property on 
the databases of all existing policies in effect.

 * The parameter should be of the form repl.source.for : List<policy ids>
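
The rules in the list above can be sketched roughly as follows. This is not 
Hive code: only the parameter name repl.source.for comes from the issue text. 
The comma-separated serialization of the policy id list and every function 
name here are assumptions made for illustration.

```python
REPL_SOURCE_KEY = "repl.source.for"  # parameter name taken from the issue text


def policy_ids(db_params: dict) -> list:
    """Return the policy ids stored under repl.source.for.

    Assumption: the list is serialized as a comma-separated string.
    """
    raw = db_params.get(REPL_SOURCE_KEY, "")
    return [p.strip() for p in raw.split(",") if p.strip()]


def is_replication_source(db_params: dict) -> bool:
    """A database counts as a replication source if at least one policy id is set."""
    return bool(policy_ids(db_params))


def cm_root_enabled(db_params: dict) -> bool:
    """Use the CM root only for databases that are a replication source."""
    return is_replication_source(db_params)


def check_drop_allowed(db_name: str, db_params: dict) -> None:
    """Refuse to drop a database that is still marked as a replication source."""
    if is_replication_source(db_params):
        raise PermissionError(
            f"database {db_name} is a replication source for policies "
            f"{policy_ids(db_params)} and cannot be dropped"
        )


source_db = {REPL_SOURCE_KEY: "policy1, policy2"}
plain_db = {}
print(cm_root_enabled(source_db), cm_root_enabled(plain_db))
```

Keeping the check in one helper means the CM root decision and the drop guard 
cannot disagree about what counts as a replication source.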





Re: DB install and upgrade scripts in the brave new world of multiple release lines

2018-05-24 Thread Vihang Karajgaonkar
Historically speaking, once we cut a branch for a certain release line,
most of the development shifts to master and only critical bug fixes are
made in release-specific branches. This happened when branch-2 was cut,
too. Fortunately (or not), we have always had some minor releases on the
latest release line before the next major release was announced. This kept
the upgrade process linear. Assuming the same trend repeats, i.e. master
is the active development branch and only critical fixes go to branch-3
(with branch-2 being ignored?), we should keep the upgrade path of
3.max -> 4.0 open. Otherwise we will end up with odd 3.x releases that can
only be upgraded to 4.1, and so on.
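
To make the concern concrete, here is a small sketch that treats each upgrade
script as an edge between two schema versions and searches for a chain from
one version to another. The version numbers (3.2, 4.1) and the function name
upgrade_path are hypothetical and only illustrate the argument; they are not
a statement about actual or planned releases.

```python
from collections import deque


def upgrade_path(scripts, start, target):
    """Return the chain of versions from start to target, or None if no chain exists.

    scripts is a list of (from_version, to_version) pairs, one per upgrade script.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        version = queue.popleft()
        if version == target:
            # Walk the parent links back to the start to rebuild the chain.
            path = []
            while version is not None:
                path.append(version)
                version = parents[version]
            return path[::-1]
        for src, dst in scripts:
            if src == version and dst not in parents:
                parents[dst] = version
                queue.append(dst)
    return None


# Linear history: the last 3.x release upgrades straight to 4.0.
linear = [("3.0", "3.1"), ("3.1", "4.0")]

# Hypothetical case: 3.2 ships after 4.0 and only gets a script to 4.1.
late_minor = linear + [("3.1", "3.2"), ("3.2", "4.1"), ("4.0", "4.1")]

print(upgrade_path(linear, "3.0", "4.0"))
print(upgrade_path(late_minor, "3.2", "4.0"))
```

In the second case there is no chain from 3.2 to 4.0, which is the "can only
be upgraded to 4.1" situation described above.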

On Thu, May 24, 2018 at 2:33 PM, Alan Gates wrote:

> The change to have branches running with master as 4 and branch-3 for 3.x
> releases is complicating our DB install and upgrade scripts.  There's a
> JIRA to track the changes, and there is some discussion on that JIRA of
> how best to proceed, starting with the comment
> https://issues.apache.org/jira/browse/HIVE-19323?focusedCommentId=16489833&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16489833
> I'm posting this here as others may be interested in chiming in.
>
> Alan.
>