Review Request 46365: Pass view's ColumnAccessInfo to HiveAuthorizer

2016-04-18 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46365/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

HIVE-13541


Diffs
-----

  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/authorization/plugin/TestHiveAuthorizerCheckInvocation.java acf2663 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 65744ac 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java 7638ba0 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelFieldTrimmer.java 03002cc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnAccessAnalyzer.java dcc8daf 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 96df189 

Diff: https://reviews.apache.org/r/46365/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Created] (HIVE-13544) LLAP: Add tests for metrics

2016-04-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-13544:


 Summary: LLAP: Add tests for metrics
 Key: HIVE-13544
 URL: https://issues.apache.org/jira/browse/HIVE-13544
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Add unit tests for all llap metrics.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13543) HiveServer2: Add unit tests for configurable LDAP user key name, group membership key name and group class name

2016-04-18 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-13543:
---

 Summary: HiveServer2: Add unit tests for configurable LDAP user key name, group membership key name and group class name
 Key: HIVE-13543
 URL: https://issues.apache.org/jira/browse/HIVE-13543
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 2.1.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


HIVE-13295 made the above-mentioned properties configurable. We should add unit 
tests for these.





[jira] [Created] (HIVE-13542) Missing stats for tables in TPCDS performance regression suite

2016-04-18 Thread Hari Sankar Sivarama Subramaniyan (JIRA)
Hari Sankar Sivarama Subramaniyan created HIVE-13542:


 Summary: Missing stats for tables in TPCDS performance regression 
suite
 Key: HIVE-13542
 URL: https://issues.apache.org/jira/browse/HIVE-13542
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan


These are the tables whose stats are missing in 
data/files/tpcds-perf/metastore_export/csv/TAB_COL_STATS.txt:

* catalog_returns
* catalog_sales
* inventory
* store_returns
* store_sales
* web_returns
* web_sales

Thanks to [~jcamachorodriguez] for discovering this issue.





[jira] [Created] (HIVE-13541) Pass view's ColumnAccessInfo to HiveAuthorizer

2016-04-18 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-13541:
--

 Summary: Pass view's ColumnAccessInfo to HiveAuthorizer
 Key: HIVE-13541
 URL: https://issues.apache.org/jira/browse/HIVE-13541
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


Right now, only the table's ColumnAccessInfo is passed to HiveAuthorizer.





[jira] [Created] (HIVE-13540) Casts to numeric types don't seem to work in hplsql

2016-04-18 Thread Carter Shanklin (JIRA)
Carter Shanklin created HIVE-13540:
--

 Summary: Casts to numeric types don't seem to work in hplsql
 Key: HIVE-13540
 URL: https://issues.apache.org/jira/browse/HIVE-13540
 Project: Hive
  Issue Type: Bug
  Components: hpl/sql
Reporter: Carter Shanklin
Assignee: Dmitry Tolpeko


Maybe I'm doing this wrong, but it seems to be broken.

Casts to string types seem to work fine, but not numbers.

This code:
{code}
temp_int = CAST('1' AS int);
print temp_int
temp_float   = CAST('1.2' AS float);
print temp_float
temp_double  = CAST('1.2' AS double);
print temp_double
temp_decimal = CAST('1.2' AS decimal(10, 4));
print temp_decimal
temp_string = CAST('1.2' AS string);
print temp_string
{code}

Produces this output:
{code}
[vagrant@hdp250 hplsql]$ hplsql -f temp2.hplsql
which: no hbase in 
(/usr/lib64/qt-3.3/bin:/usr/lib/jvm/java/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/puppetlabs/bin:/usr/local/share/jmeter/bin:/home/vagrant/bin)
WARNING: Use "yarn jar" to launch YARN applications.
null
null
null
null
1.2
{code}

The software I'm using is not anything released but is pretty close to the 
trunk, 2 weeks old at most.
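For comparison, the same casts typed directly into Hive (e.g. via beeline, bypassing hplsql) should return the numeric values rather than nulls; a hedged sanity check:

```sql
-- Run directly in Hive (not through hplsql) to confirm the casts
-- themselves work, isolating the null results to hplsql's handling.
SELECT CAST('1' AS int),
       CAST('1.2' AS float),
       CAST('1.2' AS double),
       CAST('1.2' AS decimal(10, 4));
```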





Cannot connect to Precommit Jenkins job

2016-04-18 Thread Wei Zheng
Which one is down? AWS or our job?
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/

Thanks,
Wei


[GitHub] hive pull request: HIVE-11417. Move the ReaderImpl and the RecordR...

2016-04-18 Thread omalley
GitHub user omalley opened a pull request:

https://github.com/apache/hive/pull/72

HIVE-11417. Move the ReaderImpl and the RecordReaderImpl to the ORC module.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omalley/hive hive-11417

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #72


commit 4132063b010e25d82707ef9312146f5437f1e6b5
Author: Owen O'Malley 
Date:   2016-04-15T21:16:35Z

Revert "HIVE-13522 : regexp_extract.q hangs on master (Ashutosh Chauhan via 
Thejas Nair)"

This reverts commit d567773ff4afe3a23a026e2f4e381c0fe897195b.

commit 043881ec757f19d587368ed9b1bee19c477b850a
Author: Owen O'Malley 
Date:   2016-04-16T01:07:40Z

HIVE-12159. Fix up for llap.

commit 7f3c7d2de0a392d1ecb39fd0f7f5c000cd7f6094
Author: Owen O'Malley 
Date:   2016-04-18T16:53:12Z

HIVE-12159: Create vectorized readers for the complex types (Owen O'Malley, 
reviewed by Matt McCline)

This reverts commit d559b34755010b5ed3ecc31fa423d01788e5e875.

commit 1c21992804d25b0bd762d3e95480bfe4b8cac931
Author: Owen O'Malley 
Date:   2016-03-26T02:39:12Z

HIVE-11417. Move the ReaderImpl and RowReaderImpl to the ORC module,
by making shims for the row by row reader.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (HIVE-13539) HiveHFileOutputFormat searching the wrong directory for HFiles?

2016-04-18 Thread Tim Robertson (JIRA)
Tim Robertson created HIVE-13539:


 Summary: HiveHFileOutputFormat searching the wrong directory for 
HFiles?
 Key: HIVE-13539
 URL: https://issues.apache.org/jira/browse/HIVE-13539
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 1.1.0
 Environment: Built into CDH 5.4.7
Reporter: Tim Robertson
Assignee: Sushanth Sowmyan
Priority: Blocker


When creating HFiles for a bulkload in HBase I believe it is looking in the 
wrong directory to find the HFiles, resulting in the following exception:

{code}
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: 
java.io.IOException: Multiple family directories found in 
hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:295)
at 
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: Multiple family directories found in 
hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:188)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
... 7 more
Caused by: java.io.IOException: Multiple family directories found in 
hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary
at 
org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:158)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185)
... 11 more
{code}

The issue is that it looks for the HFiles in 
{{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary}}
 when I believe it should be looking in the task attempt subfolder, such as 
{{hdfs://c1n1.gbif.org:8020/user/hive/warehouse/tim.db/coords_hbase/_temporary/2/_temporary/attempt_1461004169450_0002_r_00_1000}}.
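The suggested fix can be sketched as follows. This is an illustration of the directory layout described above, not HiveHFileOutputFormat's actual code: the Map stands in for Hadoop's FileSystem.listStatus, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class FamilyDirScan {
    // Hedged sketch: each reduce task attempt writes its column-family
    // directories under its own attempt_* subdirectory of
    // .../_temporary/2/_temporary, so a scan for family directories must
    // descend one level further than the failing code apparently does.
    static List<String> familyDirs(String tmpDir, Map<String, List<String>> listing) {
        List<String> out = new ArrayList<>();
        for (String child : listing.getOrDefault(tmpDir, Collections.emptyList())) {
            if (child.startsWith("attempt_")) {
                String attemptDir = tmpDir + "/" + child;
                for (String family : listing.getOrDefault(attemptDir, Collections.emptyList())) {
                    out.add(attemptDir + "/" + family);
                }
            }
        }
        return out;
    }
}
```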

This can be reproduced in any HBase load such as:

{code:sql}
CREATE TABLE coords_hbase(id INT, x DOUBLE, y DOUBLE)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,o:x,o:y',
  'hbase.table.default.storage.type' = 'binary');

SET hfile.family.path=/tmp/coords_hfiles/o; 
SET hive.hbase.generatehfiles=true;

INSERT OVERWRITE TABLE coords_hbase 
SELECT id, decimalLongitude, decimalLatitude
FROM source
CLUSTER BY id; 
{code}

Any advice is greatly appreciated.





[jira] [Created] (HIVE-13538) LLAPTaskScheduler should consider local tasks before non-local tasks at the same priority level

2016-04-18 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-13538:
-

 Summary: LLAPTaskScheduler should consider local tasks before 
non-local tasks at the same priority level
 Key: HIVE-13538
 URL: https://issues.apache.org/jira/browse/HIVE-13538
 Project: Hive
  Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Siddharth Seth


To get better locality, try scheduling tasks with locality information before 
tasks without locality information when they are at the same priority level.





Re: [VOTE] Bylaws change to allow some commits without review

2016-04-18 Thread Thejas Nair
+1


From: Wei Zheng 
Sent: Monday, April 18, 2016 10:51 AM
To: u...@hive.apache.org
Subject: Re: [VOTE] Bylaws change to allow some commits without review

+1

Thanks,
Wei

From: Siddharth Seth
Reply-To: "u...@hive.apache.org"
Date: Monday, April 18, 2016 at 10:29
To: "u...@hive.apache.org"
Subject: Re: [VOTE] Bylaws change to allow some commits without review

+1

On Wed, Apr 13, 2016 at 3:58 PM, Lars Francke wrote:
Hi everyone,

we had a discussion on the dev@ list about allowing some forms of contributions 
to be committed without a review.

The exact sentence I propose to add is: "Minor issues (e.g. typos, code style 
issues, JavaDoc changes. At committer's discretion) can be committed after 
soliciting feedback/review on the mailing list and not receiving feedback 
within 2 days."

The proposed bylaws can also be seen here 


This vote requires a 2/3 majority of all Active PMC members so I'd love to get 
as many votes as possible. The vote will run for at least six days.

Thanks,
Lars



[jira] [Created] (HIVE-13537) Update slf4j version to 1.7.10

2016-04-18 Thread Siddharth Seth (JIRA)
Siddharth Seth created HIVE-13537:
-

 Summary: Update slf4j version to 1.7.10
 Key: HIVE-13537
 URL: https://issues.apache.org/jira/browse/HIVE-13537
 Project: Hive
  Issue Type: Task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
Priority: Minor


Both Hadoop and Tez are on 1.7.10, and have been for a while. We should update 
Hive to use this version as well. This should get rid of some of the noise in 
the logs related to multiple versions.
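A dependency bump like this is usually centralized in the root pom's dependencyManagement; a hedged sketch (the property name and layout are assumptions, not taken from Hive's actual pom.xml):

```xml
<!-- root pom.xml: pin one slf4j version for every module.
     The property name is illustrative. -->
<properties>
  <slf4j.version>1.7.10</slf4j.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>${slf4j.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```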





[jira] [Created] (HIVE-13536) LLAP: Add metrics for task scheduler

2016-04-18 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-13536:


 Summary: LLAP: Add metrics for task scheduler
 Key: HIVE-13536
 URL: https://issues.apache.org/jira/browse/HIVE-13536
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 2.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


Currently there are no metrics for the task scheduler. It would be useful to 
provide some.





[VOTE] Vote going on in user@ about Bylaws change

2016-04-18 Thread Lars Francke
Hi everyone,

sorry for bumping this but I'd love some more votes in the current thread
on the user@ mailing list ("[VOTE] Bylaws change to allow some commits
without review").

I need as many votes from PMC members as possible but I'd also love
contributions from others.

Thank you!


Re: Reviews & commits (RTC/CTR), contributions, bylaws

2016-04-18 Thread Lars Francke
Hi everyone,

could a few more PMC members please head over to the user@ mailing list and
vote?

Thank you!

On Thu, Apr 14, 2016 at 9:14 AM, Lars Francke 
wrote:

> Okay I have started a VOTE thread on the user@ mailing list (as per the
> bylaws). I would appreciate it if you could head over there and vote :)
>
> Thank you!
>
> On Thu, Apr 14, 2016 at 12:41 AM, Lars Francke 
> wrote:
>
>> Thanks for the +1 Alan.
>>
>> I agree that we're leaving potential contributions on the floor. Doing
>> more reviews is definitely a very good step in the right direction. Thank
>> you! I see this Bylaws change as another (small) step in the right
>> direction. I'm sure we can come up with more ideas.
>>
>> I'll start a VOTE thread on the user@ mailing list.
>>
>> On Tue, Apr 12, 2016 at 5:32 PM, Alan Gates  wrote:
>>
>>> I’m +1 on this change of allowing simple cleanup changes without
>>> requiring a full review.
>>>
>>> But jumping to this fix obscures a bigger problem we have as a
>>> community.  This fix only works for committers, not for non-committers who
>>> may also contribute such patches.  And it doesn’t solve the situation for
>>> non-trivial patches.  We’re leaving potential contributions on the floor
>>> and keeping people out of our community.  We need to solve this.
>>>
>>> One thing I’ve been doing over the last few months is set up a filter in
>>> JIRA for components that I know well (metastore, acid, etc.) and then put a
>>> recurring task in my task tracker app to review a patch every day.
>>> Realistically I manage 2-3 reviews a week, but that’s 1-2 more than I was
>>> doing before.  I encourage my fellow committers to find something that
>>> works for them.  We need to improve the health of our community.
>>>
>>> Alan.
>>>
>>> > On Apr 12, 2016, at 07:56, Lars Francke wrote:
>>> >
>>> > Thanks Thejas for the suggestion & others for jumping in. That seems
>>> fine
>>> > for me. 2 days also seems good. Holidays are different in almost every
>>> > country so I wouldn't exclude those.
>>> >
>>> > I have followed the procedure used for the last Bylaws change and
>>> > created a new Wiki page here:
>>> > https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Project+Bylaws+-+April+2016
>>> >
>>> > It includes this paragraph: "Minor issues (e.g. typos, code style
>>> issues,
>>> > JavaDoc changes. At committer's discretion) can be committed after
>>> > soliciting feedback/review on the mailing list and not receiving
>>> feedback
>>> > within 2 days."
>>> > I'm not a native speaker so feedback is welcome.
>>> >
>>> > I also fixed three typos in the Bylaws (and marked them as changed):
>>> > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=62691925=3=2
>>> >
>>> > Once the discussion settles down I'll open a vote thread on the user@
>>> > mailing list which requires a 2/3 majority of all active PMC members. I
>>> > couldn't find a definition of "active" though.
>>> >
>>> > On Mon, Apr 11, 2016 at 10:26 PM, Thejas Nair 
>>> wrote:
>>> >
>>> >> I agree we have a problem here. At least patches as small as this
>>> >> shouldn't take too long to get reviewed.
>>> >>
>>> >> Knox seems to consider a very large set of patches as being under CTR
>>> >> process.
>>> >> I think hive is very large and mature project that I would lean
>>> >> towards RTC process for most issues. I think we can make an exception
>>> >> for very minor patches such as fixing typos and and checkstyle issues.
>>> >> Maybe the process can be to solicit reviews for such minor patches by
>>> >> sending an email to dev@ list and if no response is seen in 2 days,
>>> go
>>> >> ahead and commit it ?
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Apr 11, 2016 at 6:38 AM, Lars Francke wrote:
>>> >>> Hi,
>>> >>>
>>> >>> I've been a long-time contributor to Hive (5 or so years) and have
>>> been
>>> >>> voted in as a committer and I'm very grateful for that. I also
>>> understand
>>> >>> that my situation is different than most or lots of committers as
>>> I'm not
>>> >>> working for one of the big companies (Facebook, Cloudera, Hortonworks
>>> >> etc.)
>>> >>> where you can just ask someone sitting next to you to do a review.
>>> >>>
>>> >>> I'd really like to contribute more than I do currently but the
>>> process of
>>> >>> getting patches in is painful for me (and other 'outside'
>>> contributors)
>>> >> as
>>> >>> it is hard to get reviews & things committed. The nature of most of
>>> my
>>> >>> patches is very minor[1] (fixing typos, checkstyle issues etc.) and I
>>> >>> understand that these are not the most interesting patches to review
>>> and
>>> >>> are easy to miss. I don't blame anyone for this situation as I
>>> totally
>>> >>> understand it and have been on the other side of this for other
>>> projects.
>>> >>>

[jira] [Created] (HIVE-13535) UPDATE ... SET only working when properties are set globally.

2016-04-18 Thread Bibin Joseph (JIRA)
Bibin Joseph created HIVE-13535:
---

 Summary: UPDATE ... SET only working when properties are set 
globally.
 Key: HIVE-13535
 URL: https://issues.apache.org/jira/browse/HIVE-13535
 Project: Hive
  Issue Type: Bug
  Components: Hive
 Environment: Operating System : SUSE Linux Enterprise Server 11 
(x86_64)
Architecture : x86_64
CPU op-mode(s) : 32-bit, 64-bit
Byte Order : Little Endian
Hadoop version : Hadoop 2.7.1.2.3.2.0-2950
Hive version : Hive 1.2.1.2.3.2.0-2950

Reporter: Bibin Joseph
Priority: Minor


h3.Making a hive table transactional
*Steps followed*
1. Entered hive shell.
2. Choose database.
3. Set hive transaction properties.
{quote}
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
{quote}
4. Created bucketed table with 'transactional'='true'
{quote}
create table test(id int, name string) clustered by (id) into 2 buckets
stored as orc TBLPROPERTIES ('transactional'='true');
{quote}
5. Inserted values.
{quote}
insert into table test values(1,'Name');
{quote}
6. Fired update query.
{quote}
update test set name='New_Name' where id=1;
{quote}

This produced the following error:
{quote}
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using 
transaction manager that does not support these operations.
{quote}

*It works without error when the transaction properties are set before 
choosing the database.*
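Setting the same properties globally means putting them in hive-site.xml rather than in the session; a hedged sketch using the property names from step 3 above:

```xml
<!-- hive-site.xml: the session properties from step 3, applied globally -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```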





[jira] [Created] (HIVE-13534) Exception when trying to access TIMESTAMP columns in a parquet file using a hive external table

2016-04-18 Thread Bobeff (JIRA)
Bobeff created HIVE-13534:
-

 Summary: Exception when trying to access TIMESTAMP columns in a 
parquet file using a hive external table 
 Key: HIVE-13534
 URL: https://issues.apache.org/jira/browse/HIVE-13534
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Hive, Import/Export, JDBC
Reporter: Bobeff
Assignee: Sushanth Sowmyan
Priority: Critical


The data was imported from a Netezza data source using a sqoop import command 
like this: 

The SQL DDL creation script of the imported table looks like this: 

CREATE TABLE "ADMIN"."MIS_AUX_ITR" ( 
"DDEBVAL" DATE, 
"DFINVAL" DATE, 
"NAUX" VARCHAR(6), 
"CDMNITR" VARCHAR(3), 
"CDERIMG" VARCHAR(1), 
"DDERIMG" DATE 
); 

The sqoop import job is the following: 

sqoop job 
--create import-name 
-- import 
--connect jdbc:netezza://server:port/database 
--username user 
--password pwd 
--table MIS_AUX_ITR 
--as-parquetfile 
--target-dir hdfs:///prod/ZA/dee/MIS_AUX_ITR 
-m 1 

After the import, the parquet file schema is the following: 

> yarn jar /tmp/parquet-tools-1.6.0.jar schema 
> /prod/ZA/dee/MIS_AUX_ITR/2cf3e971-4c2c-408f-bd86-5d3cf3bd4fa5.parquet 

message MIS_AUX_ITR { 
optional int64 DDEBVAL; 
optional int64 DFINVAL; 
optional binary NAUX (UTF8); 
optional binary CDMNITR (UTF8); 
optional binary CDERIMG (UTF8); 
optional int64 DDERIMG; 
} 

In order to access the data stored in the parquet file, we created the 
external table below: 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR 
( 
`DDEBVAL`   DATE, 
`DFINVAL`   DATE, 
`NAUX`  VARCHAR(6), 
`CDMNITR`   VARCHAR(3), 
`CDERIMG`   VARCHAR(1), 
`DDERIMG`   DATE 
) 
COMMENT 'Table DEE MIS_AUX_ITR' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR'; 


But when we try to list data from the external table above, we get the 
following exception: 

hive> CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD 
> ( 
> `DDEBVAL`DATE, 
> `DFINVAL`DATE, 
> `NAUX`VARCHAR(6), 
> `CDMNITR`VARCHAR(3), 
> `CDERIMG`VARCHAR(1), 
> `DDERIMG`DATE 
> ) 
> COMMENT 'Table DEE MIS_AUX_ITR_V_PROD' 
> STORED AS PARQUET 
> LOCATION 
> '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 
OK 
Time taken: 0.196 seconds 
hive> select * from za_dee.MIS_AUX_ITR_V_PPROD limit 100; 
OK 
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". 
SLF4J: Defaulting to no-operation (NOP) logger implementation 
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details. 
Failed with exception 
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast 
to org.apache.hadoop.hive.serde2.io.DateWritable 
Time taken: 0.529 seconds 
hive> 


We also tried with the following external table 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_BI 
( 
`DDEBVAL`   BIGINT, 
`DFINVAL`   BIGINT, 
`NAUX`  VARCHAR(6), 
`CDMNITR`   VARCHAR(3), 
`CDERIMG`   VARCHAR(1), 
`DDERIMG`   BIGINT 
) 
COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_BI' 
STORED AS PARQUET 
LOCATION '/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 

Then the “Date” columns are shown as raw numeric values, as below: 
hive> select DDEBVAL from za_dee.MIS_AUX_ITR_V_PPROD_BI limit 5; 
OK 
108077040 
108077040 
108077040 
108077040 
108077040 
Time taken: 0.081 seconds, Fetched: 5 row(s) 
hive> 

However “Date” values can be listed by casting as Timestamp 
hive> select cast(DDEBVAL as Timestamp) from za_dee.MIS_AUX_ITR_V_PPROD_BI 
limit 5; 
OK 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
2004-04-01 00:00:00 
Time taken: 0.087 seconds, Fetched: 5 row(s) 
hive> 
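The cast workaround above can be packaged once as a view so consumers do not have to repeat it; a hedged sketch (the view name is made up, the columns come from the DDL above):

```sql
-- Hypothetical view over the BIGINT external table, applying the
-- cast-to-timestamp workaround once for all readers.
CREATE VIEW za_dee.mis_aux_itr_ts AS
SELECT CAST(DDEBVAL AS timestamp) AS DDEBVAL,
       CAST(DFINVAL AS timestamp) AS DFINVAL,
       NAUX,
       CDMNITR,
       CDERIMG,
       CAST(DDERIMG AS timestamp) AS DDERIMG
FROM za_dee.MIS_AUX_ITR_V_PPROD_BI;
```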

We have also tested with an external table using the TIMESTAMP type, as shown below: 
CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR 
( 
`DDEBVAL`   TIMESTAMP, 
`DFINVAL`   TIMESTAMP, 
`NAUX`  VARCHAR(6), 
`CDMNITR`   VARCHAR(3), 
`CDERIMG`   VARCHAR(1), 
`DDERIMG`   TIMESTAMP 
) 
COMMENT 'Table DEE MIS_AUX_ITR' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR'; 

But we got the same behavior: an exception when trying to access data from an 
Oracle DB.

I tried this 

CREATE EXTERNAL TABLE za_dee.MIS_AUX_ITR_V_PPROD_TS 
( 
`DDEBVAL`   TIMESTAMP, 
`DFINVAL`   TIMESTAMP, 
`NAUX`  VARCHAR(6), 
`CDMNITR`   VARCHAR(3), 
`CDERIMG`   VARCHAR(1), 
`DDERIMG`   TIMESTAMP 
) 
COMMENT 'Table DEE MIS_AUX_ITR_V_PROD_TS' 
STORED AS PARQUET 
LOCATION 
'/prod/ZA/dee/MIS_AUX_ITR_V_PPROD'; 

and then I created and launched the sqoop job below: 

sqoop job --create import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD -- 
import 
--connect jdbc:netezza:/:/db 
--username  
--password  
--table MIS_AUX_ITR_V 
--as-parquetfile 
--hive-import 
--hive-overwrite 
--hive-database za_dee 
--hive-table MIS_AUX_ITR_V_PPROD_TS 
-m 1 

sqoop job --exec import-za_dee-MIS_AUX_ITR_V-full-default-import-PPROD 

the resulting error is the following: 

16/04/11 

[jira] [Created] (HIVE-13533) Remove AST dump

2016-04-18 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-13533:
--

 Summary: Remove AST dump
 Key: HIVE-13533
 URL: https://issues.apache.org/jira/browse/HIVE-13533
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For very large queries, dumping the AST can lead to OOM errors. Currently there 
are two places where we dump the AST:

- CalcitePlanner if we are running in DEBUG mode (line 300).
- ExplainTask if we use extended explain (line 179).

I guess the original reason to add the dump was to check whether the AST 
conversion from CBO was working properly, but I think we are past that stage 
now.

We will remove the logic to dump the AST in explain extended. For debug mode in 
CalcitePlanner, we will lower the level to LOG.TRACE.
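The CalcitePlanner side of the change can be sketched as a simple level guard. This is an illustration, not the actual patch: java.util.logging stands in for Hive's logger (with FINEST approximating TRACE), and the method and supplier are made up.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

public class AstDumpGuard {
    private static final Logger LOG = Logger.getLogger(AstDumpGuard.class.getName());

    // Build and log the (potentially huge) AST dump only when trace-level
    // logging is enabled, so ordinary DEBUG runs never materialize the
    // string and cannot OOM on it.
    static String logAstIfTrace(boolean traceEnabled, Supplier<String> astDump) {
        if (!traceEnabled) {
            return null;               // dump string is never built
        }
        String dump = astDump.get();
        LOG.log(Level.FINEST, dump);
        return dump;
    }
}
```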


