Review Request 48771: HIVE-13590: Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-06-15 Thread Chaoyu Tang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48771/
---

Review request for hive.


Bugs: HIVE-13590
https://issues.apache.org/jira/browse/HIVE-13590


Repository: hive-git


Description
---

Hive should not use Hadoop security (e.g. Kerberos) related APIs such as 
KerberosName to process users logged in via other SASL mechanisms such as 
LDAP.


Diffs
-

  
itests/hive-minikdc/src/test/java/org/apache/hive/minikdc/TestJdbcNonKrbSASLWithMiniKdc.java
 1c1beda 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java ab8806c 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
8bc3d94 
  
shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 8a4786c 

Diff: https://reviews.apache.org/r/48771/diff/


Testing
---

Manual test
PreCommit test


Thanks,

Chaoyu Tang



[jira] [Created] (HIVE-14029) Update Spark version to 1.6

2016-06-15 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-14029:
---

 Summary: Update Spark version to 1.6
 Key: HIVE-14029
 URL: https://issues.apache.org/jira/browse/HIVE-14029
 Project: Hive
  Issue Type: Bug
Reporter: Ferdinand Xu


There are quite a few new optimizations in Spark 2.0.0. We need to bump up Spark 
to 2.0.0 to benefit from those performance improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


RE: Assign JIRA to myself

2016-06-15 Thread Xu, Cheng A
Hi Peter,
You have not been added as a contributor yet. Please see additional information at 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-BecomingaContributor

From: Peter Vary [mailto:pv...@cloudera.com]
Sent: Thursday, June 16, 2016 12:15 AM
To: dev@hive.apache.org
Subject: Assign JIRA to myself

Hi everyone,

I am trying to assign a JIRA to myself, but I could not (see the attached screenshot).
Can anyone help me there?

Thanks in advance,
Peter

[cid:image002.png@01D1C7AE.96BBEBF0]


[jira] [Created] (HIVE-14028) stats is not updated

2016-06-15 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-14028:
--

 Summary: stats is not updated
 Key: HIVE-14028
 URL: https://issues.apache.org/jira/browse/HIVE-14028
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


{code}
DROP TABLE users;

CREATE TABLE users(key string, state string, country string, country_id int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = "info:state,info:country,info:country_id"
);

INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src;

desc formatted users;
{code}
The result is
{code}
 A masked pattern was here 
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
numFiles0
numRows 0
rawDataSize 0
storage_handler org.apache.hadoop.hive.hbase.HBaseStorageHandler
totalSize   0
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14026) data can not be retrieved

2016-06-15 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-14026:
--

 Summary: data can not be retrieved
 Key: HIVE-14026
 URL: https://issues.apache.org/jira/browse/HIVE-14026
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong


{code}
DROP TABLE users;

CREATE TABLE users(key string, state string, country string, country_id int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = "info:state,info:country,info:country_id"
);

INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src;

select * from users;
{code}
The result is only one row:
{code}
user1   IA  USA 0
{code}
It should be 500 rows.
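A plausible cause (an assumption on my part, not stated in the report): with the HBase storage handler, rows are stored by HBase row key, so if all 500 inserted rows end up with the same key 'user1', they collapse into a single stored row. A minimal Python sketch of that last-write-wins keying behavior:

```python
# Sketch: an HBase table keeps one row per row key, so repeated keys
# overwrite each other. This dict stands in for the keyed table.
table = {}

# Simulate the INSERT ... SELECT 'user1', 'IA', 'USA', 0 FROM src, which
# emits 500 identical rows all keyed by 'user1'.
for _ in range(500):
    table['user1'] = ('IA', 'USA', 0)  # same key -> last write wins

print(len(table))  # only one distinct row key survives
```

If this is indeed what happens, the reported single-row result is the storage layer's keying semantics, not a lost write in Hive.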



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14027) NULL values produced by left outer join do not behave as NULL

2016-06-15 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-14027:
---

 Summary: NULL values produced by left outer join do not behave as 
NULL
 Key: HIVE-14027
 URL: https://issues.apache.org/jira/browse/HIVE-14027
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.0.1, 1.2.1
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


Consider the following setup:
{code}
create table tbl (n bigint, t string); 

insert into tbl values (1, 'one'); 
insert into tbl values(2, 'two');

select a.n, a.t, isnull(b.n), isnull(b.t) from (select * from tbl where n = 1) 
a  left outer join  (select * from tbl where 1 = 2) b on a.n = b.n;

1    one    false    true
{code}

The query should return true for isnull(b.n).

I've tested by inserting a row with a null value for the bigint column into 
tbl, and isnull returns true in that case.
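The expected semantics can be illustrated with a small Python simulation (this models standard left-outer-join behavior, not Hive's implementation): the unmatched left row should pair with NULLs on the right, so isnull(b.n) should be true.

```python
# a = rows of tbl where n = 1; b = rows of tbl where 1 = 2 (always empty)
a = [(1, 'one')]
b = []

# Left outer join on n: left rows with no match get None (NULL) for b's
# columns. Each result tuple is (a.n, a.t, isnull(b.n), isnull(b.t)).
result = []
for an, at in a:
    matches = [(bn, bt) for bn, bt in b if bn == an]
    for bn, bt in (matches or [(None, None)]):
        result.append((an, at, bn is None, bt is None))

print(result)  # [(1, 'one', True, True)] -- isnull(b.n) should be True
```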



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14025) Insert overwrite does not work in HBase tables

2016-06-15 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-14025:
--

 Summary: Insert overwrite does not work in HBase tables
 Key: HIVE-14025
 URL: https://issues.apache.org/jira/browse/HIVE-14025
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14024) setAllColumns is called incorrectly after some changes

2016-06-15 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14024:
---

 Summary: setAllColumns is called incorrectly after some changes
 Key: HIVE-14024
 URL: https://issues.apache.org/jira/browse/HIVE-14024
 Project: Hive
  Issue Type: Bug
Reporter: Takahiko Saito
Assignee: Sergey Shelukhin


h/t [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2

2016-06-15 Thread Sergey Shelukhin
Should all the 2.1.1-fixed JIRAs be converted to 2.1.0?

On 16/6/15, 14:03, "Jesus Camacho Rodriguez"
 wrote:

>OK, vote for RC2 is cancelled.
>
>Matt, please push HIVE-13974 as soon as possible and I will restart the
>vote.
>
>Thanks,
>Jesús
>
>
>
>
>
>On 6/15/16, 9:47 PM, "Matthew McCline"  wrote:
>
>>
>>-1 for HIVE-13974 ORC Schema Evolution doesn't support add columns to
>>non-last STRUCT columns
>>
>>This bug will prevent people with ORC tables that have added columns to
>>inner STRUCT columns from reading their tables.
>>
>>
>>From: Jesus Camacho Rodriguez 
>>Sent: Wednesday, June 15, 2016 3:20 AM
>>To: dev@hive.apache.org
>>Subject: Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2
>>
>>Hive PMC members,
>>
>>Just a quick reminder that the vote for RC2 is still open and it needs
>>two additional votes to pass.
>>
>>Please test and cast your vote!
>>
>>Thanks,
>>Jesús
>>
>>
>>
>>On 6/10/16, 6:29 PM, "Alan Gates"  wrote:
>>
>>>+1, checked signatures, did a build and ran a few simple unit tests.
>>>
>>>Alan.
>>>
 On Jun 10, 2016, at 05:44, Jesus Camacho Rodriguez
 wrote:

 Apache Hive 2.1.0 Release Candidate 2 is available here:

 http://people.apache.org/~jcamacho/hive-2.1.0-rc2

 Maven artifacts are available here:

 https://repository.apache.org/content/repositories/orgapachehive-1055/

 Source tag for RC2 is at:
 https://github.com/apache/hive/releases/tag/release-2.1.0-rc2


 Voting will conclude in 72 hours.

 Hive PMC Members: Please test and vote.

 Thanks.


>>>
>>>
>>



Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2

2016-06-15 Thread Jesus Camacho Rodriguez
OK, vote for RC2 is cancelled.

Matt, please push HIVE-13974 as soon as possible and I will restart the vote.

Thanks,
Jesús





On 6/15/16, 9:47 PM, "Matthew McCline"  wrote:

>
>-1 for HIVE-13974 ORC Schema Evolution doesn't support add columns to non-last 
>STRUCT columns
>
>This bug will prevent people with ORC tables that have added columns to inner 
>STRUCT columns from reading their tables.
>
>
>From: Jesus Camacho Rodriguez 
>Sent: Wednesday, June 15, 2016 3:20 AM
>To: dev@hive.apache.org
>Subject: Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2
>
>Hive PMC members,
>
>Just a quick reminder that the vote for RC2 is still open and it needs two 
>additional votes to pass.
>
>Please test and cast your vote!
>
>Thanks,
>Jesús
>
>
>
>On 6/10/16, 6:29 PM, "Alan Gates"  wrote:
>
>>+1, checked signatures, did a build and ran a few simple unit tests.
>>
>>Alan.
>>
>>> On Jun 10, 2016, at 05:44, Jesus Camacho Rodriguez 
>>>  wrote:
>>>
>>> Apache Hive 2.1.0 Release Candidate 2 is available here:
>>>
>>> http://people.apache.org/~jcamacho/hive-2.1.0-rc2
>>>
>>> Maven artifacts are available here:
>>>
>>> https://repository.apache.org/content/repositories/orgapachehive-1055/
>>>
>>> Source tag for RC2 is at:
>>> https://github.com/apache/hive/releases/tag/release-2.1.0-rc2
>>>
>>>
>>> Voting will conclude in 72 hours.
>>>
>>> Hive PMC Members: Please test and vote.
>>>
>>> Thanks.
>>>
>>>
>>
>>
>


Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2

2016-06-15 Thread Matthew McCline

-1 for HIVE-13974 ORC Schema Evolution doesn't support add columns to non-last 
STRUCT columns

This bug will prevent people with ORC tables that have added columns to inner 
STRUCT columns from reading their tables.


From: Jesus Camacho Rodriguez 
Sent: Wednesday, June 15, 2016 3:20 AM
To: dev@hive.apache.org
Subject: Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2

Hive PMC members,

Just a quick reminder that the vote for RC2 is still open and it needs two 
additional votes to pass.

Please test and cast your vote!

Thanks,
Jesús



On 6/10/16, 6:29 PM, "Alan Gates"  wrote:

>+1, checked signatures, did a build and ran a few simple unit tests.
>
>Alan.
>
>> On Jun 10, 2016, at 05:44, Jesus Camacho Rodriguez 
>>  wrote:
>>
>> Apache Hive 2.1.0 Release Candidate 2 is available here:
>>
>> http://people.apache.org/~jcamacho/hive-2.1.0-rc2
>>
>> Maven artifacts are available here:
>>
>> https://repository.apache.org/content/repositories/orgapachehive-1055/
>>
>> Source tag for RC2 is at:
>> https://github.com/apache/hive/releases/tag/release-2.1.0-rc2
>>
>>
>> Voting will conclude in 72 hours.
>>
>> Hive PMC Members: Please test and vote.
>>
>> Thanks.
>>
>>
>
>


Re: Review Request 48233: HIVE-13884: Disallow queries fetching more than a configured number of partitions in PartitionPruner

2016-06-15 Thread Sergio Pena

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48233/
---

(Updated June 15, 2016, 8:45 p.m.)


Review request for hive, Mohit Sabharwal and Naveen Gangam.


Changes
---

Addressed Mohit's review comments.


Bugs: HIVE-13884
https://issues.apache.org/jira/browse/HIVE-13884


Repository: hive-git


Description
---

The patch verifies the # of partitions a table has before fetching any from the 
metastore. It reads that limit from 'hive.limit.query.max.table.partition'.

A limitation added here is that the variable must be set in hive-site.xml in 
order to work; it cannot be set through Beeline because HiveMetaStore.java does 
not read variables set through Beeline. I think it is better to keep it this 
way, to avoid users changing the value on the fly and crashing the metastore.

Another change is that EXPLAIN commands won't be executed either. EXPLAIN 
commands need to fetch partitions in order to create the operator tree. If we 
allow EXPLAIN to do that, then we may have the same OOM situations for large 
partitions.
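The pre-fetch check described above can be sketched as follows (hypothetical names throughout; the real change lives in HiveMetaStore.java and the RawStore implementations):

```python
# Sketch of a fail-fast partition-count guard, assuming a hypothetical
# metastore handle exposing count_partitions() and get_partitions().
class MetastorePartitionLimitError(Exception):
    pass

def get_partitions_checked(store, table, limit):
    """Fetch partitions only if the table is under the configured limit.

    `limit` mirrors hive.limit.query.max.table.partition; a value <= 0
    disables the check.
    """
    count = store.count_partitions(table)
    if limit > 0 and count > limit:
        # Refuse before materializing anything, avoiding the OOM path.
        raise MetastorePartitionLimitError(
            "Table %s has %d partitions, over the limit of %d"
            % (table, count, limit))
    return store.get_partitions(table)
```

The key design point is that only the count crosses the wire before the check, so an over-limit query is rejected without loading partition objects into memory.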


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
761dbb279fb196e2bf1e0e59824827a4504eb136 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
c0827ea9d47e569d9697649a7e16d196de3de14d 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
c135179b97354108f842a5ca2de0c6f0ef28b7fc 
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
da188d33d6194740ba9ecb37a6e533ecf1ec6906 
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
a6d3f5385b33b8a4e31ee20ca5cb8f58c97c8702 
  metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
31f0d7b89670b8a749bbe8a7ff2b4ff9f059a8e2 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 3152e77c3c7152ac4dbe7e779ce35f28044fe3c9 
  
metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 86a243609b23e2ca9bb8849f0da863a95e477d5c 

Diff: https://reviews.apache.org/r/48233/diff/


Testing
---

Waiting for HiveQA.


Thanks,

Sergio Pena



[jira] [Created] (HIVE-14023) LLAP: Make the Hive query id available in ContainerRunner

2016-06-15 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-14023:
-

 Summary: LLAP: Make the Hive query id available in ContainerRunner
 Key: HIVE-14023
 URL: https://issues.apache.org/jira/browse/HIVE-14023
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth


Needed to generate logs per query.

We can use the dag identifier for now, but that isn't very useful. (The queryId 
may not be too useful either if users cannot find it - but that's better than a 
dagIdentifier)

The queryId is available right now after the Processor starts, which is too 
late for log changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 48233: HIVE-13884: Disallow queries fetching more than a configured number of partitions in PartitionPruner

2016-06-15 Thread Sergio Pena


> On June 14, 2016, 1:29 a.m., Mohit Sabharwal wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, 
> > line 3179
> > 
> >
> > Since we are moving the functionality from driver to HMS, should we 
> > deprecate 
> > hive.limit.query.max.table.partition and introduce a new config called 
> > hive.metastore.retrieve.max.partitions ?
> > 
> > All metastore configs have "hive.metastore" prefix. 
> > 
> > Otherwise:
> > 1) The change is backward incompatible for existing users that
> > are setting this config at HS2 level and are now expected to set it
> > at HMS level to get the same functionality.
> > 2) Name would be confusing.
> > 
> > We could do the following:
> > 1) Mark hive.limit.query.max.table.partition as deprecated in HiveConf 
> > and suggest that users 
> > move to hive.metastore.retrieve.max.partitions
> > 2) Do not remove the functionality associated with 
> > hive.limit.query.max.table.partition in PartitionPruner.
> > It does do what the description promises - i.e. fail the query before 
> > execution stage if number of 
> > partitions associated with any scan operator exceed configured value.
> > 3) Add new config hive.metastore.retrieve.max.partitions to configure 
> > functionality in this patch.
> > 
> > Makes sense?

Thanks. It makes sense.
I will use "hive.metastore.limit.partition.request". I saw in the hive code 
that sometimes users can request up to a max # of partitions from the metastore 
without failing. So I do not want to cause confusion with the name.


- Sergio


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48233/#review137434
---


On June 13, 2016, 6:28 p.m., Sergio Pena wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48233/
> ---
> 
> (Updated June 13, 2016, 6:28 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Naveen Gangam.
> 
> 
> Bugs: HIVE-13884
> https://issues.apache.org/jira/browse/HIVE-13884
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The patch verifies the # of partitions a table has before fetching any from 
> the metastore. It reads that limit from 
> 'hive.limit.query.max.table.partition'.
> 
> A limitation added here is that the variable must be set in hive-site.xml in 
> order to work; it cannot be set through Beeline because HiveMetaStore.java 
> does not read variables set through Beeline. I think it is better to keep it 
> this way, to avoid users changing the value on the fly and crashing the 
> metastore.
> 
> Another change is that EXPLAIN commands won't be executed either. EXPLAIN 
> commands need to fetch partitions in order to create the operator tree. If we 
> allow EXPLAIN to do that, then we may have the same OOM situations for large 
> partitions.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> c0827ea9d47e569d9697649a7e16d196de3de14d 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 
> c135179b97354108f842a5ca2de0c6f0ef28b7fc 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
> f98de1326956b19b9d28fc9b1fcdede8d851180d 
>   metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
> a6d3f5385b33b8a4e31ee20ca5cb8f58c97c8702 
>   metastore/src/java/org/apache/hadoop/hive/metastore/hbase/HBaseStore.java 
> 31f0d7b89670b8a749bbe8a7ff2b4ff9f059a8e2 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  3152e77c3c7152ac4dbe7e779ce35f28044fe3c9 
>   
> metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86a243609b23e2ca9bb8849f0da863a95e477d5c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> cd3c86064df3e7febcc16e03aab6ce407e0dc8a0 
> 
> Diff: https://reviews.apache.org/r/48233/diff/
> 
> 
> Testing
> ---
> 
> Waiting for HiveQA.
> 
> 
> Thanks,
> 
> Sergio Pena
> 
>



Re: Review Request 48500: HIVE-13982

2016-06-15 Thread Jesús Camacho Rodríguez


> On June 15, 2016, 3:23 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java,
> >  line 247
> > 
> >
> > Can you add a comment explaining why we don't care about the SQL type?

This is taken from lines 146-147 in the original version of ASTConverter. We do 
not care about the type because we are just converting to AST (and thus, we 
just create the references to extract the column names for the GBy).


> On June 15, 2016, 3:23 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/vector_groupby_reduce.q.out, line 788
> > 
> >
> > Plan change expected?

This is expected.
* In the original query, we have: GBy1 x,y - GBy2 x,y - OBy x,y.
* Calcite was returning: GBy1 y,x - GBy2 y,x - OBy x,y. Note that we cannot do 
anything about this, as Calcite considers GBy1 columns in the order given by 
the underlying expression, and in this case, in the order the columns are 
present in the TableScan. RS dedup was kicking in for GBy1 y,x - GBy2 y,x.
* With this patch, when we translate to AST, we get: GBy1 y,x - GBy2 x,y - OBy 
x,y. Observe that the order of columns of the last GBy is transformed to 
respect the order of columns of the OBy. However, RS dedup does not kick in for 
GBy2 x,y - OBy x,y (because of the number of reducers).

If we want to solve the problem completely, it has just occurred to me that I 
could modify the patch and create a Calcite rule that takes care of this 
propagation through the whole operator tree top-down: the gains could be 
potentially big. Then we would not need to modify the AST converter. Please let 
me know what you think and I will proceed accordingly.


- Jesús


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48500/#review137744
---


On June 13, 2016, 12:06 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48500/
> ---
> 
> (Updated June 13, 2016, 12:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-13982
> https://issues.apache.org/jira/browse/HIVE-13982
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13982
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
>  353d8db41af10512c94c0700a9bb06a07d660190 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
>  1c3eb8155defa99a223ccf4ee4b072abb40a 
>   ql/src/test/queries/clientpositive/limit_pushdown2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/bucket_groupby.q.out 
> e198617c82b8ab4c3ad3d8b255975413fbdc382d 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/lineage3.q.out 
> 12ae13e388b3cb9c051cb419b75682fa4296d211 
>   ql/src/test/results/clientpositive/perf/query45.q.out 
> 04f9b02b019b6cf591dee48964a73fdb4a4b285f 
>   ql/src/test/results/clientpositive/spark/vectorization_14.q.out 
> cb3d9a4da84a379e00550ce7e31893b304d5e560 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out 
> 1871c7e443cf775b09badc4cbf4b86e23ad9e525 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out 
> 553066039881f225634c08d93a9054df5636e5d2 
>   ql/src/test/results/clientpositive/tez/vector_groupby_reduce.q.out 
> 7f00b064e5a91b45282823e2725e11ab7f508b01 
>   ql/src/test/results/clientpositive/tez/vectorization_14.q.out 
> 2a598332207f4540defa21a107642aa0502e1a58 
>   ql/src/test/results/clientpositive/vector_groupby_reduce.q.out 
> bc23b365b02b505d0f8e79cdacca3449bf46ead3 
>   ql/src/test/results/clientpositive/vectorization_14.q.out 
> 6d4f13a23de5c184cd100af07ac19f24ba9fac4a 
> 
> Diff: https://reviews.apache.org/r/48500/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>



[jira] [Created] (HIVE-14022) left semi join throws SemanticException if where clause contains columnname with table alias

2016-06-15 Thread Jagruti Varia (JIRA)
Jagruti Varia created HIVE-14022:


 Summary: left semi join throws SemanticException if where clause 
contains columnname with table alias
 Key: HIVE-14022
 URL: https://issues.apache.org/jira/browse/HIVE-14022
 Project: Hive
  Issue Type: Bug
Reporter: Jagruti Varia
Assignee: Jesus Camacho Rodriguez
 Fix For: 2.2.0


Left semi join throws following error if where clause contains column name with 
table alias
{noformat}
select * from src_emptybucket_partitioned_1 e1 left semi join 
src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
e3.year1=2016;
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: INFO ql.Driver: We are setting the hadoop caller 
context from  to hrt_qa_20160610223737_c3821398-d8df-44d8-9dd5-e66c9b7ed7c7
16/06/10 22:37:37 [main]: DEBUG parse.VariableSubstitution: Substitution is on: 
select * from src_emptybucket_partitioned_1 e1 left semi join 
src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
e3.year1=2016
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parsing command: select * 
from src_emptybucket_partitioned_1 e1 left semi join 
src_emptybucket_partitioned_3 e3 on e1.age =  e3.age where e1.year = 2015 and 
e3.year1=2016
16/06/10 22:37:37 [main]: INFO parse.ParseDriver: Parse Completed
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: DEBUG ql.Driver: Encoding valid txns info 
9223372036854775807:
16/06/10 22:37:37 [main]: INFO log.PerfLogger: 
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Starting Semantic Analysis
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed phase 1 of 
Semantic Analysis
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for source 
tables
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for subqueries
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Get metadata for 
destination tables
16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #194
16/06/10 22:37:37 [IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
ipc.Client: IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value #194
16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath took 
2ms
16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #195
16/06/10 22:37:37 [IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
ipc.Client: IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value #195
16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getEZForPath took 
1ms
16/06/10 22:37:37 [main]: DEBUG hdfs.DFSClient: 
/tmp/hive/hrt_qa/d2568b75-6399-46df-82b9-34ec445e8f64/hive_2016-06-10_22-37-37_392_2780828105665881901-1:
 masked=rwx--
16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #196
16/06/10 22:37:37 [IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
ipc.Client: IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value #196
16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: mkdirs took 2ms
16/06/10 22:37:37 [IPC Parameter Sending Thread #0]: DEBUG ipc.Client: IPC 
Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa sending #197
16/06/10 22:37:37 [IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa]: DEBUG 
ipc.Client: IPC Client (147022238) connection to 
jvaria-hive2-440-5.openstacklocal/172.22.126.47:8020 from hrt_qa got value #197
16/06/10 22:37:37 [main]: DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 
1ms
16/06/10 22:37:37 [main]: INFO ql.Context: New scratch dir is 
hdfs://jvaria-hive2-440-5.openstacklocal:8020/tmp/hive/hrt_qa/d2568b75-6399-46df-82b9-34ec445e8f64/hive_2016-06-10_22-37-37_392_2780828105665881901-1
16/06/10 22:37:37 [main]: INFO parse.CalcitePlanner: Completed getting MetaData 
in Semantic Analysis
16/06/10 22:37:37 [main]: INFO parse.BaseSemanticAnalyzer: Not invoking CBO 
because the statement has too few joins
16/06/10 22:37:37 [main]: DEBUG hive.log: DDL: struct 

[jira] [Created] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold

2016-06-15 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-14021:
--

 Summary: When converting to CNF, fail if the expression exceeds a 
threshold
 Key: HIVE-14021
 URL: https://issues.apache.org/jira/browse/HIVE-14021
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 2.1.0, 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


When converting to conjunctive normal form (CNF), fail if the expression 
exceeds a threshold. CNF can explode exponentially in the size of the input 
expression, but rarely does so in practice. Add a maxNodeCount parameter to 
RexUtil.toCnf and throw or return null if it is exceeded.
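The proposed guard can be sketched in Python (an illustration of the idea, not Calcite's RexUtil implementation; a real implementation would abort mid-conversion rather than count afterwards):

```python
# Sketch: convert an AND/OR tree of literals (negations already pushed
# down) to CNF as a list of clauses, bailing out if the result exceeds
# max_nodes. This mirrors the proposed maxNodeCount guard for toCnf.
def to_cnf(expr, max_nodes):
    """expr is ('lit', name), ('and', l, r) or ('or', l, r).

    Returns a list of clauses (each a list of literal names), or None
    if the CNF would contain more than max_nodes literals.
    """
    def cnf(e):
        op = e[0]
        if op == 'lit':
            return [[e[1]]]                  # a single unit clause
        if op == 'and':
            return cnf(e[1]) + cnf(e[2])     # union of clause sets
        # 'or': distribute OR over the clauses of both sides -- the
        # step that can blow up exponentially
        return [c1 + c2 for c1 in cnf(e[1]) for c2 in cnf(e[2])]

    clauses = cnf(expr)
    if sum(len(c) for c in clauses) > max_nodes:
        return None                          # threshold exceeded
    return clauses

# (a AND b) OR (c AND d) distributes into 4 clauses / 8 literals:
big = ('or', ('and', ('lit', 'a'), ('lit', 'b')),
             ('and', ('lit', 'c'), ('lit', 'd')))
print(to_cnf(big, 6))   # None: 8 literals exceed the threshold of 6
```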



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14020) Hive MS restart failed during EU with ORA-00922 error as part of DB schema upgrade

2016-06-15 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-14020:


 Summary: Hive MS restart failed during EU with ORA-00922 error as 
part of DB schema upgrade
 Key: HIVE-14020
 URL: https://issues.apache.org/jira/browse/HIVE-14020
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


The underlying failure seems to be visible from --verbose : 

{noformat}
Metastore connection URL:jdbc:oracle:thin:@//172.22.66.99:1521/XE
Metastore Connection Driver :oracle.jdbc.driver.OracleDriver
Metastore connection User:   hiveuser
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.oracle.sql
Connecting to jdbc:oracle:thin:@//172.22.66.99:1521/XE
Connected to: Oracle (version Oracle Database 11g Express Edition Release 
11.2.0.2.0 - 64bit Production)
Driver: Oracle JDBC driver (version 11.2.0.4.0)
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:oracle:thin:@//172.22.66.99:1521/XE> !autocommit on
Autocommit status: true
0: jdbc:oracle:thin:@//172.22.66.99:1521/XE> SELECT 'Upgrading MetaStore schema 
from 2.0.0 to 2.1.0' AS Status from dual
+-------------------------------------------------+
| STATUS                                          |
+-------------------------------------------------+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
+-------------------------------------------------+
1 row selected (0.072 seconds)
0: jdbc:oracle:thin:@//172.22.66.99:1521/XE> CREATE TABLE IF NOT EXISTS  
KEY_CONSTRAINTS ( CHILD_CD_ID NUMBER, CHILD_INTEGER_IDX NUMBER, CHILD_TBL_ID 
NUMBER, PARENT_CD_ID NUMBER NOT NULL, PARENT_INTEGER_IDX ^M NUMBER NOT NULL, 
PARENT_TBL_ID NUMBER NOT NULL, POSITION NUMBER NOT NULL, CONSTRAINT_NAME 
VARCHAR(400) NOT NULL, CONSTRAINT_TYPE NUMBER NOT NULL, UPDATE_RULE NUMBER, 
DELETE_RULE NUMBER, ENABLE_VALIDATE_REL ^MY NUMBER NOT NULL ) 
Error: ORA-00922: missing or invalid option (state=42000,code=922)

Closing: 0: jdbc:oracle:thin:@//172.22.66.99:1521/XE
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore 
state would be inconsistent !!
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:250)
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:218)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:500)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at 
org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:390)
at 
org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:347)
at 
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:245)
... 8 more
*** schemaTool failed ***
{noformat}


On the face of it, it looks like bad ^Ms got added from the actual script ( 
034-HIVE-13076.oracle.sql ) that's provided:

{noformat}
CREATE TABLE IF NOT EXISTS  KEY_CONSTRAINTS
(
  CHILD_CD_ID NUMBER,
  CHILD_INTEGER_IDX NUMBER,
  CHILD_TBL_ID NUMBER,
  PARENT_CD_ID NUMBER NOT NULL,
  PARENT_INTEGER_IDX NUMBER NOT NULL,
  PARENT_TBL_ID NUMBER NOT NULL,
  POSITION NUMBER NOT NULL,
  CONSTRAINT_NAME VARCHAR(400) NOT NULL,
  CONSTRAINT_TYPE NUMBER NOT NULL,
  UPDATE_RULE NUMBER,
  DELETE_RULE NUMBER,
  ENABLE_VALIDATE_RELY NUMBER NOT NULL
) ;
ALTER TABLE KEY_CONSTRAINTS ADD CONSTRAINT CONSTRAINTS_PK PRIMARY KEY 
(CONSTRAINT_NAME, POSITION);
CREATE INDEX CONSTRAINTS_PT_INDEX ON KEY_CONSTRAINTS(PARENT_TBL_ID);
{noformat}
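The stray ^M (carriage return) characters can be confirmed and stripped with a short script (a sketch, not part of the actual patch):

```python
# Sketch: detect and remove Windows-style CR characters that break the
# Oracle upgrade script (e.g. 034-HIVE-13076.oracle.sql).
def strip_carriage_returns(text):
    """Return (cleaned_text, number_of_CRs_removed)."""
    return text.replace('\r', ''), text.count('\r')

# One line from the failing statement, with the embedded ^M shown as \r:
line = 'PARENT_INTEGER_IDX \r NUMBER NOT NULL,\n'
cleaned, removed = strip_carriage_returns(line)
print(removed)   # one stray ^M found in this line
print(cleaned)   # the line Oracle can actually parse
```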




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14019) HiveServer2: Enable Kerberos with SSL for TCP transport

2016-06-15 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-14019:
---

 Summary: HiveServer2: Enable Kerberos with SSL for TCP transport
 Key: HIVE-14019
 URL: https://issues.apache.org/jira/browse/HIVE-14019
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, JDBC
Affects Versions: 2.0.1, 1.2.1
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


Currently, there is a limitation where an HS2 user needs to use the 
{{auth-conf}} SASL qop value to achieve encryption when Kerberos is used as the 
authentication mechanism and transport is TCP. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14018) Make IN clause row selectivity estimation customizable

2016-06-15 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-14018:
--

 Summary: Make IN clause row selectivity estimation customizable
 Key: HIVE-14018
 URL: https://issues.apache.org/jira/browse/HIVE-14018
 Project: Hive
  Issue Type: Improvement
  Components: Statistics
Affects Versions: 2.1.0, 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Priority: Minor


After HIVE-13287 went in, we calculate IN clause estimates natively (instead of 
just dividing incoming number of rows by 2). However, as the distribution of 
values of the columns is considered uniform, we might end up heavily 
underestimating/overestimating the resulting number of rows.

This issue is to add a factor that multiplies the IN clause estimation so we 
can alleviate this problem. The solution is not very elegant, but it is the 
best we can do until we have histograms to improve our estimate.
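The estimation being described can be sketched as follows (illustrative Python 
with invented names; Hive performs this inside its statistics annotation rules, 
not with this function):

```python
def in_clause_rows(num_rows, ndv, num_in_values, factor=1.0):
    """Estimate rows selected by `col IN (v1..vk)` assuming a uniform
    value distribution, scaled by a tunable factor (the knob this
    issue proposes for when uniformity is a bad assumption)."""
    selectivity = min(1.0, (num_in_values / float(ndv)) * factor)
    return int(round(num_rows * selectivity))

# 1M rows, 1000 distinct values, IN list of 5 values:
print(in_clause_rows(1_000_000, 1000, 5))        # uniform estimate: 5000
print(in_clause_rows(1_000_000, 1000, 5, 0.5))   # damped by factor 0.5: 2500
```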



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 48500: HIVE-13982

2016-06-15 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48500/#review137744
---




ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
 (line 247)


Can you add a comment that why we don't care for sql type?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
 (line 308)


Can you add a comment when this else block will be executed?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
 (line 352)


We get here if RS-*-RS and there is a single parent for 2nd RS. What if 
there is a PTF operator in between. Will merging be triggered in that case? If 
so, will it be valid?



ql/src/test/results/clientpositive/vector_groupby_reduce.q.out (line 788)


Plan change expected?


- Ashutosh Chauhan


On June 13, 2016, 12:06 p.m., Jesús Camacho Rodríguez wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48500/
> ---
> 
> (Updated June 13, 2016, 12:06 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-13982
> https://issues.apache.org/jira/browse/HIVE-13982
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-13982
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java
>  353d8db41af10512c94c0700a9bb06a07d660190 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java
>  1c3eb8155defa99a223ccf4ee4b072abb40a 
>   ql/src/test/queries/clientpositive/limit_pushdown2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/bucket_groupby.q.out 
> e198617c82b8ab4c3ad3d8b255975413fbdc382d 
>   ql/src/test/results/clientpositive/limit_pushdown2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/lineage3.q.out 
> 12ae13e388b3cb9c051cb419b75682fa4296d211 
>   ql/src/test/results/clientpositive/perf/query45.q.out 
> 04f9b02b019b6cf591dee48964a73fdb4a4b285f 
>   ql/src/test/results/clientpositive/spark/vectorization_14.q.out 
> cb3d9a4da84a379e00550ce7e31893b304d5e560 
>   ql/src/test/results/clientpositive/tez/explainuser_1.q.out 
> 1871c7e443cf775b09badc4cbf4b86e23ad9e525 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out 
> 553066039881f225634c08d93a9054df5636e5d2 
>   ql/src/test/results/clientpositive/tez/vector_groupby_reduce.q.out 
> 7f00b064e5a91b45282823e2725e11ab7f508b01 
>   ql/src/test/results/clientpositive/tez/vectorization_14.q.out 
> 2a598332207f4540defa21a107642aa0502e1a58 
>   ql/src/test/results/clientpositive/vector_groupby_reduce.q.out 
> bc23b365b02b505d0f8e79cdacca3449bf46ead3 
>   ql/src/test/results/clientpositive/vectorization_14.q.out 
> 6d4f13a23de5c184cd100af07ac19f24ba9fac4a 
> 
> Diff: https://reviews.apache.org/r/48500/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>



Re: [VOTE] Apache Hive 2.1.0 Release Candidate 2

2016-06-15 Thread Jesus Camacho Rodriguez
Hive PMC members,

Just a quick reminder that the vote for RC2 is still open and it needs two 
additional votes to pass.

Please test and cast your vote!

Thanks,
Jesús



On 6/10/16, 6:29 PM, "Alan Gates"  wrote:

>+1, checked signatures, did a build and ran a few simple unit tests.
>
>Alan.
>
>> On Jun 10, 2016, at 05:44, Jesus Camacho Rodriguez 
>>  wrote:
>> 
>> Apache Hive 2.1.0 Release Candidate 2 is available here:
>> 
>> http://people.apache.org/~jcamacho/hive-2.1.0-rc2
>> 
>> Maven artifacts are available here:
>> 
>> https://repository.apache.org/content/repositories/orgapachehive-1055/
>> 
>> Source tag for RC2 is at:
>> https://github.com/apache/hive/releases/tag/release-2.1.0-rc2
>> 
>> 
>> Voting will conclude in 72 hours.
>> 
>> Hive PMC Members: Please test and vote.
>> 
>> Thanks.
>> 
>> 
>
>


[jira] [Created] (HIVE-14017) Compaction failed when run on ACID table with extended schema

2016-06-15 Thread Hong Dai Thanh (JIRA)
Hong Dai Thanh created HIVE-14017:
-

 Summary: Compaction failed when run on ACID table with extended 
schema
 Key: HIVE-14017
 URL: https://issues.apache.org/jira/browse/HIVE-14017
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.1
 Environment: HDP 2.4.0/Hive 1.2.1 on RHEL 6
Reporter: Hong Dai Thanh


Create an ACID table, insert the data into the table, then extend the schema of 
the table by adding a column at the end, then add data to the table with the 
extended schema.

{code:borderStyle=solid}
drop table if exists test purge;

create table test (
  a int,
  b int
)
clustered by (a) into 10 buckets
stored as orc
tblproperties ('transactional' = 'true');

insert into test values (1, 1), (2, 2), (3, 3);
insert into test values (4, 4), (5, 5), (6, 6);


alter table test add columns (c int);

insert into test values (10, 10, 10), (11, 11, 11), (12, 12, 12);
{code}

We then run compaction on the table:

{code}alter table test compact 'major';{code}

However, the compaction job fails with the following exception:

{code}
2016-06-15 09:54:52,517 INFO [IPC Server handler 5 on 25906] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt 
attempt_1465960802609_0030_m_08_0 is : 0.0
2016-06-15 09:54:52,525 FATAL [IPC Server handler 4 on 25906] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1465960802609_0030_m_08_0 - exited : java.io.IOException: subtype 9 
exceeds the included array size 9 fileTypes [kind: STRUCT
subtypes: 1
subtypes: 2
subtypes: 3
subtypes: 4
subtypes: 5
subtypes: 6
fieldNames: "operation"
fieldNames: "originalTransaction"
fieldNames: "bucket"
fieldNames: "rowId"
fieldNames: "currentTransaction"
fieldNames: "row"
, kind: INT
, kind: LONG
, kind: INT
, kind: LONG
, kind: LONG
, kind: STRUCT
subtypes: 7
subtypes: 8
subtypes: 9
fieldNames: "_col0"
fieldNames: "_col1"
fieldNames: "_col2"
, kind: INT
, kind: INT
, kind: INT
] schemaTypes [kind: STRUCT
subtypes: 1
subtypes: 2
subtypes: 3
subtypes: 4
subtypes: 5
subtypes: 6
fieldNames: "operation"
fieldNames: "originalTransaction"
fieldNames: "bucket"
fieldNames: "rowId"
fieldNames: "currentTransaction"
fieldNames: "row"
, kind: INT
, kind: LONG
, kind: INT
, kind: LONG
, kind: LONG
, kind: STRUCT
subtypes: 7
subtypes: 8
subtypes: 9
fieldNames: "_col0"
fieldNames: "_col1"
fieldNames: "_col2"
, kind: INT
, kind: INT
, kind: INT
] innerStructSubtype -1
at 
org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:2066)
at 
org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2492)
at 
org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:2072)
at 
org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2492)
at 
org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:219)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:179)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:476)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRawReader(OrcInputFormat.java:1463)
at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:573)
at 
org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorMap.map(CompactorMR.java:552)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
{code}
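The failure reduces to a bounds check: the included-columns array is sized from 
the schema recorded in the pre-ALTER delta files, while column ids come from 
the wider current table schema. A minimal sketch of that mismatch (purely 
illustrative, not the actual ORC reader code):

```python
# Sketch of the bounds check that fails during compaction: old delta
# files carry the ACID struct with only two user columns, but the
# reader schema built from the current table definition adds a third.
file_columns = ["operation", "originalTransaction", "bucket", "rowId",
                "currentTransaction", "row", "_col0", "_col1"]  # pre-ALTER
included = [True] * len(file_columns)   # sized from the file schema: 8

reader_column_id = 8                    # "_col2", added by ALTER TABLE
msg = None
try:
    included[reader_column_id]          # out of range, like the real error
except IndexError:
    msg = ("subtype %d exceeds the included array size %d"
           % (reader_column_id, len(included)))
print(msg)  # subtype 8 exceeds the included array size 8
```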



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 40867: HIVE-11527 - bypass HiveServer2 thrift interface for query results

2016-06-15 Thread Takanobu Asanuma


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHA.java,
> >  line 157
> > 
> >
> > Can you add a test with a join query as well? The join query should 
> > write the results in a new intermediate file on hdfs and it will be good to 
> > test that.

I added it in the latest patch.


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java, line 552
> > 
> >
> > Nit: Usually an iterator implements 
> > https://docs.oracle.com/javase/7/docs/api/java/util/Iterator.html.

This class wraps SequenceFile.Reader, and I think implementing an iterator is 
not suitable for this class (implementing hasNext() is not easy). I just 
changed the class name to avoid confusion.
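For reference, the usual way to retrofit hasNext() onto a next()-style reader 
is a one-record lookahead buffer; a sketch of the pattern (my own illustration, 
not the patch's code):

```python
class LookaheadIterator:
    """Wraps a reader that only exposes read_next() -> record or None,
    adding hasNext()/next() semantics via a one-record lookahead."""
    def __init__(self, read_next):
        self._read_next = read_next
        self._buffered = None
        self._done = False

    def has_next(self):
        # Pull one record ahead so emptiness can be answered now.
        if self._buffered is None and not self._done:
            self._buffered = self._read_next()
            if self._buffered is None:
                self._done = True
        return self._buffered is not None

    def next(self):
        if not self.has_next():
            raise StopIteration
        rec, self._buffered = self._buffered, None
        return rec

records = iter([1, 2, 3])
it = LookaheadIterator(lambda: next(records, None))
out = []
while it.has_next():
    out.append(it.next())
print(out)  # [1, 2, 3]
```

The cost is that has_next() may perform I/O, which is exactly why it is awkward 
on a raw SequenceFile.Reader.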


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java, line 573
> > 
> >
> > Nit: getXXX methods are usually used to return something other than 
> > void.

I agree the name was misleading. I changed it in the latest patch.


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java, line 590
> > 
> >
> > This should be a private method.

I fixed it.


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1978
> > 
> >
> > Shouldn't the read happen independent of the file format? For example, 
> > the default format used to be TextFile until very recently and the user can 
> > as well choose to configure it that way.

I thought handling other formats would make the code complex. If TextFile or 
other formats should be handled as well, I would like to create a follow-up jira.


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java, line 536
> > 
> >
> > This hardcodes the serde to LazySimpleSerde. There is work in 
> > https://issues.apache.org/jira/browse/HIVE-12049 where we write using a 
> > different serde in the final tasks. However, I'll create a follow-up jira 
> > for making this more generic.

I got it. Thank you.


> On June 3, 2016, 5:50 p.m., Vaibhav Gumashta wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1976
> > 
> >
> > We can create a follow-up jira to handle this.

Shall I create the jira?


- Takanobu


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40867/#review136073
---


On June 15, 2016, 6:50 a.m., Takanobu Asanuma wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40867/
> ---
> 
> (Updated June 15, 2016, 6:50 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This is a WIP patch for HIVE-11527
> 
> * I added a new configuration whose name is 
> hive.server2.webhdfs.bypass.enabled. The default is false. When this value is 
> true, clients use the bypass.
> 
> * I have not yet considered security aspects such as Kerberos and SSL.
> 
> * I have not implemented Statement#setFetchSize for bypass yet.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 761dbb2 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHA.java 
> 84644d1 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
> 0c313a2 
>   itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 
> 637e51a 
>   jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java 92fdbca 
>   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java a242501 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 2263192 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
>   service-rpc/if/TCLIService.thrift 5a9a785 
>   service-rpc/src/gen/thrift/gen-cpp/TCLIService_types.h d23b3cd 
>   service-rpc/src/gen/thrift/gen-cpp/TCLIService_types.cpp 0f53cb2 
>   
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TColumnDesc.java
>  31472c8 
>   
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TExecuteStatementResp.java

Re: Review Request 40867: HIVE-11527 - bypass HiveServer2 thrift interface for query results

2016-06-15 Thread Takanobu Asanuma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40867/
---

(Updated June 15, 2016, 6:50 a.m.)


Review request for hive.


Changes
---

I updated the patch based on Vaibhav's advice.


Repository: hive-git


Description
---

This is a WIP patch for HIVE-11527

* I added a new configuration whose name is 
hive.server2.webhdfs.bypass.enabled. The default is false. When this value is 
true, clients use the bypass.

* I have not yet considered security aspects such as Kerberos and SSL.

* I have not implemented Statement#setFetchSize for bypass yet.
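The client-side decision the new flag controls might look like this (purely 
illustrative sketch; the function names, the result path, and the conf-lookup 
shape are mine, not the patch's):

```python
# Hypothetical sketch of the result-fetch path toggled by
# hive.server2.webhdfs.bypass.enabled: when true, the client reads
# the query's result file directly (e.g. over WebHDFS) instead of
# paging rows through the HiveServer2 Thrift API.
def fetch_results(conf, thrift_fetch, hdfs_read, result_path):
    if conf.get("hive.server2.webhdfs.bypass.enabled", "false") == "true":
        return hdfs_read(result_path)   # bypass path
    return thrift_fetch()               # default Thrift path

rows = fetch_results({"hive.server2.webhdfs.bypass.enabled": "true"},
                     thrift_fetch=lambda: ["via-thrift"],
                     hdfs_read=lambda p: ["via-hdfs:" + p],
                     result_path="/tmp/results/000000_0")
print(rows)  # ['via-hdfs:/tmp/results/000000_0']
```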


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 761dbb2 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHA.java 
84644d1 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java 
0c313a2 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 
637e51a 
  jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java 92fdbca 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java a242501 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 2263192 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java dff1815 
  service-rpc/if/TCLIService.thrift 5a9a785 
  service-rpc/src/gen/thrift/gen-cpp/TCLIService_types.h d23b3cd 
  service-rpc/src/gen/thrift/gen-cpp/TCLIService_types.cpp 0f53cb2 
  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TColumnDesc.java
 31472c8 
  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TExecuteStatementResp.java
 7101fa5 
  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TGetTablesReq.java
 1aa3f94 
  
service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TProtocolVersion.java
 14d50ed 
  service-rpc/src/gen/thrift/gen-php/Types.php a6a257f 
  service-rpc/src/gen/thrift/gen-py/TCLIService/ttypes.py fcd330f 
  service-rpc/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 71148a0 
  service/src/java/org/apache/hive/service/cli/CLIService.java ed52b4a 
  service/src/java/org/apache/hive/service/cli/ColumnDescriptor.java bfd7135 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java d48b92c 
  service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
2f18231 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
3bf40eb 
  service/src/java/org/apache/hive/service/cli/session/HiveSession.java 78ff388 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
7341635 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 
8bc3d94 

Diff: https://reviews.apache.org/r/40867/diff/


Testing
---

I have tested few simple queries and they worked well. But I think there are 
some problems for some queries. I'm going to test more queries and fix bugs. 
I'm also going to add unit tests.


Thanks,

Takanobu Asanuma