[jira] [Updated] (HIVE-20666) HiveServer2 Interactive LLAP reconnect to already running Yarn app
[ https://issues.apache.org/jira/browse/HIVE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-20666:
---
Description:
Improve HiveServer2 Interactive LLAP to (re)connect to an already running hive llap yarn app.

Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
{code:java}
WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
2018-10-01 16:26:55,625 - Stopping LLAP
2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't (re)connecting to and making use of the running llap app.

It's also possible that the containers were simply slow to allocate, as the cluster was operating at 100% capacity, and therefore weren't fully initialized when the app failed, but the error feedback doesn't give enough detail about the state of the llap0 app.

was:
Improve HiveServer2 Interactive LLAP to reconnect to an already running hive llap yarn app.

Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
{code:java}
WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
2018-10-01 16:26:55,625 - Stopping LLAP
2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't reconnecting to and making use of the running llap app.

> HiveServer2 Interactive LLAP reconnect to already running Yarn app
> --
>
> Key: HIVE-20666
> URL: https://issues.apache.org/jira/browse/HIVE-20666
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, llap
> Affects Versions: 1.2.1
> Reporter: Hari Sekhon
> Priority: Major
>
> Improve HiveServer2 Interactive LLAP to (re)connect to an already running hive llap yarn app.
> Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
> {code:java}
> WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
> 2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
> 2018-10-01 16:26:55,625 - Stopping LLAP
> 2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
> Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't (re)connecting to and making use of the running llap app.
> It's also possible that the containers were simply slow to allocate, as the cluster was operating at 100% capacity, and therefore weren't fully initialized when the app failed, but the error feedback doesn't give enough detail about the state of the llap0 app.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
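A minimal sketch of the requested (re)connect step, assuming the launcher holds a Hadoop Configuration for the cluster: before deploying a fresh llap0 app, ask the ResourceManager whether an app with that name is already RUNNING and reuse it. The LlapAppFinder class and findRunningApp method are hypothetical names for illustration; the YarnClient calls are real Hadoop API.
{code:java}
import java.util.EnumSet;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class LlapAppFinder {

    /** Returns the report of a live YARN app with the given name, or null. */
    public static ApplicationReport findRunningApp(Configuration conf, String appName)
            throws Exception {
        YarnClient yarn = YarnClient.createYarnClient();
        yarn.init(conf);
        yarn.start();
        try {
            // Ask the RM only for RUNNING apps, then match on the app name.
            List<ApplicationReport> apps =
                    yarn.getApplications(EnumSet.of(YarnApplicationState.RUNNING));
            for (ApplicationReport app : apps) {
                if (appName.equals(app.getName())) {
                    return app; // candidate for reuse instead of a fresh launch
                }
            }
            return null;
        } finally {
            yarn.stop();
        }
    }
}
{code}
If such a report comes back non-null, the launcher could attach to (or at least report the state of) the existing app rather than timing out and issuing a slider stop.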
[jira] [Updated] (HIVE-20666) HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app
[ https://issues.apache.org/jira/browse/HIVE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-20666:
---
Summary: HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app (was: HiveServer2 Interactive LLAP reconnect to already running Yarn app)

> HiveServer2 Interactive LLAP (re)connect to already running Yarn llap0 app
> --
>
> Key: HIVE-20666
> URL: https://issues.apache.org/jira/browse/HIVE-20666
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, llap
> Affects Versions: 1.2.1
> Reporter: Hari Sekhon
> Priority: Major
>
> Improve HiveServer2 Interactive LLAP to (re)connect to an already running hive llap yarn app.
> Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
> {code:java}
> WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
> 2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
> 2018-10-01 16:26:55,625 - Stopping LLAP
> 2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
> Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't (re)connecting to and making use of the running llap app.
> It's also possible that the containers were simply slow to allocate, as the cluster was operating at 100% capacity, and therefore weren't fully initialized when the app failed, but the error feedback doesn't give enough detail about the state of the llap0 app.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-20666) HiveServer2 Interactive LLAP reconnect to already running Yarn app
[ https://issues.apache.org/jira/browse/HIVE-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-20666:
---
Description:
Improve HiveServer2 Interactive LLAP to reconnect to an already running hive llap yarn app.

Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
{code:java}
WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
2018-10-01 16:26:55,625 - Stopping LLAP
2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't reconnecting to and making use of the running llap app.

was:
Improve HiveServer2 Interactive LLAP to reconnect to an already running hive llap yarn app.

Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
{code:java}
WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
2018-10-01 16:26:55,625 - Stopping LLAP
2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}

> HiveServer2 Interactive LLAP reconnect to already running Yarn app
> --
>
> Key: HIVE-20666
> URL: https://issues.apache.org/jira/browse/HIVE-20666
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, llap
> Affects Versions: 1.2.1
> Reporter: Hari Sekhon
> Priority: Major
>
> Improve HiveServer2 Interactive LLAP to reconnect to an already running hive llap yarn app.
> Currently HiveServer2 Interactive startup may fail with the following error if it cannot get enough containers on the queue:
> {code:java}
> WARN cli.LlapStatusServiceDriver: Watch timeout 200s exhausted before desired state RUNNING is attained.
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' in 'RUNNING_PARTIAL' state. Live Instances : '3'. Desired Instances : '4' after 212.498996019 secs.
> 2018-10-01 16:26:55,624 - App state is RUNNING_PARTIAL. Live Instances : '3', Desired Instance : '4'
> 2018-10-01 16:26:55,624 - LLAP app 'llap0' deployment unsuccessful.
> 2018-10-01 16:26:55,625 - Stopping LLAP
> 2018-10-01 16:26:55,625 - call[['slider', 'stop', u'llap0']] {'logoutput': True, 'user': 'hive', 'stderr': -1}{code}
> Meanwhile I could see 5 containers from a previous hive llap invocation on the yarn scheduler page, and this is the only HiveServer2 Interactive instance, so it appears it wasn't reconnecting to and making use of the running llap app.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-7782) tez default engine not overridden by hive.execution.engine=mr in hive cli session
[ https://issues.apache.org/jira/browse/HIVE-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-7782:
--
Description:
I've deployed hive.execution.engine=tez as the default on my secondary HDP cluster, and I find that hive cli interactive sessions where I do
{code:java}
set hive.execution.engine=mr
{code}
still execute on Tez, as shown in the Resource Manager applications view.

Now this may make sense since it's already connected to a Tez session by that point, but it's also misleading because the job progress output in the cli changes to look like MapReduce rather than Tez, and the query time increases from 8 secs to 15-16 secs, still less than the 25-30+ secs I usually see with MR.

The Resource Manager shows both of these jobs as TEZ application type regardless of setting hive.execution.engine=mr. Is this a bug in the way Hive is submitting the job (Tez vs MR) or a bug in the way the RM is reporting it?
{code:java}
hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_c03824c7-0e76-4507-b619-6a22cb0fbc4c
Total jobs = 1
Launching Job 1 out of 1

Status: Running (application id: application_1408444369445_0031)

Map 1: -/- Reducer 2: 0/1
Map 1: 0/1 Reducer 2: 0/1
Map 1: 0/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 1/1
Status: Finished successfully
OK
823
Time taken: 8.492 seconds, Fetched: 1 row(s)
hive> set hive.execution.engine=mr;
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_b620d990-b405-479c-be5b-d9616527cefe
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1408444369445_0032, Tracking URL = http://:8088/proxy/application_1408444369445_0032/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1408444369445_0032
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-08-19 16:48:35,242 Stage-1 map = 0%, reduce = 0%
2014-08-19 16:48:40,539 Stage-1 map = 100%, reduce = 0%
2014-08-19 16:48:44,676 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1408444369445_0032
MapReduce Jobs Launched:
Job 0: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
823
Time taken: 16.579 seconds, Fetched: 1 row(s)
{code}
If I exit the hive shell and restart it instead using
{code:java}
--hiveconf hive.execution.engine=mr{code}
to set the engine before the session is established, then it does a proper MapReduce job according to the RM, and it also takes the expected longer 25 secs instead of the 8 secs in Tez or the 15 secs when trying to do MR inside the Tez session.

was:
I've deployed hive.execution.engine=tez as the default on my secondary HDP cluster, and I find that hive cli interactive sessions where I do
{code}
set hive.execution.engine=mr
{code}
still execute on Tez, as shown in the Resource Manager applications view.

Now this may make sense since it's already connected to a Tez session by that point, but it's also misleading because the job progress output in the cli changes to look like MapReduce rather than Tez, and the query time increases from 8 secs to 15-16 secs, still less than the 25-30+ secs I usually see with MR.

The Resource Manager shows both of these jobs as TEZ application type regardless of setting hive.execution.engine=mr. Is this a bug in the way Hive is submitting the job (Tez vs MR) or a bug in the way the RM is reporting it?
{code}
hive

Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_c03824c7-0e76-4507-b619-6a22cb0fbc4c
Total jobs = 1
Launching Job 1 out of 1

Status: Running (application id: application_1408444369445_0031)

Map 1: -/- Reducer 2: 0/1
Map 1: 0/1 Reducer 2: 0/1
Map 1: 0/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 1/1
Status: Finished successfully
OK
823
Time taken: 8.492 seconds, Fetched: 1 row(s)
hive> set hive.execution.engine=mr;
hive> select count(*) from sample_07;
Query ID = hari_20140819164848_b620d990-b405-479c-be5b-d9616527cefe
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1408444369445_0032, Tracking URL = http://lonsl1101827-data.uk.net.intra:8088/proxy/application_1408444369445_0032/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1408444369445_0032
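The same ordering constraint should apply to HiveServer2 sessions, since the execution engine needs to be chosen before the session (and any Tez AM) is set up. A hedged sketch over JDBC, assuming a HiveServer2 at the hypothetical host hs2-host and using the documented URL form in which hive conf pairs after '?' are applied when the session is opened, analogous to --hiveconf on the CLI:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class EngineAtConnectTime {
    public static void main(String[] args) throws Exception {
        // Conf pairs after '?' in a hive2 JDBC URL are applied at session
        // establishment, before any query runs, so mr should take effect here
        // (requires the hive-jdbc driver on the classpath).
        String url = "jdbc:hive2://hs2-host:10000/default?hive.execution.engine=mr";
        try (Connection conn = DriverManager.getConnection(url, "hari", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select count(*) from sample_07")) {
            while (rs.next()) {
                System.out.println(rs.getLong(1));
            }
        }
    }
}
{code}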
[jira] [Commented] (HIVE-12349) NPE in ORC SARG for IS NULL queries on Timestamp and Date columns
[ https://issues.apache.org/jira/browse/HIVE-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037708#comment-15037708 ]

Hari Sekhon commented on HIVE-12349:

Any ideas on the timeline for the Hive 1.3 release including this fix? This problem is currently affecting my users, so we're eager to get a new release of Hive. Hortonworks HDP 2.4 is due to be released some time this month or next based on the bi-annual release schedule; it would be good to get Hive 1.3 included in that release.

> NPE in ORC SARG for IS NULL queries on Timestamp and Date columns
> -
>
> Key: HIVE-12349
> URL: https://issues.apache.org/jira/browse/HIVE-12349
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.2.1, 2.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> IS NULL queries can trigger an NPE for timestamp and date columns. All column values per row group or stripe must be NULL to trigger this case. Following is the exception stack trace:
> {code}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.io.orc.ColumnStatisticsImpl$TimestampStatisticsImpl.getMinimum(ColumnStatisticsImpl.java:795)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.getMin(RecordReaderImpl.java:2343)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicate(RecordReaderImpl.java:2366)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:2564)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2627)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:3060)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:3102)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:288)
> at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:534)
> at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:183)
> at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
> at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1141)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1039)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 26 more
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
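To make the failure mode concrete: when every value in a row group is NULL, ORC records no minimum for it, so the statistics getter returns null and an unguarded min/max comparison during SARG evaluation dereferences it. A simplified standalone model of the needed guard, using hypothetical types rather than the actual HIVE-12349 patch:
{code:java}
/** Truth values a SARG evaluator can return for a row group. */
enum TruthValue { YES, NO, YES_NO }

/** Minimal stand-in for per-row-group column statistics. */
final class RowGroupStats {
    final Object minimum;   // null when the row group contains only NULLs
    final boolean hasNull;
    RowGroupStats(Object minimum, boolean hasNull) {
        this.minimum = minimum;
        this.hasNull = hasNull;
    }
}

final class IsNullEvaluator {
    /** Evaluate "col IS NULL" against row-group stats without dereferencing null. */
    static TruthValue evaluate(RowGroupStats stats) {
        if (stats.minimum == null) {
            // No min/max recorded: every value in the group is NULL,
            // so IS NULL is definitely true and the group can be selected.
            return TruthValue.YES;
        }
        return stats.hasNull ? TruthValue.YES_NO : TruthValue.NO;
    }
}
{code}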
[jira] [Commented] (HIVE-12241) SQLStdAuth grant on .* not recognized
[ https://issues.apache.org/jira/browse/HIVE-12241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987372#comment-14987372 ]

Hari Sekhon commented on HIVE-12241:

Agreed, Ranger is nice, but not everybody has Ranger: I have it on the other cluster but haven't deployed it to this one yet, and other distributions don't even have Ranger yet.

From Hive's perspective this should be solvable using native SQL grants. The whole point of SQLStdAuth was to be similar to grants in an RDBMS, and since those allow a grant on .*, Hive should support this syntax too.

> SQLStdAuth grant on .* not recognized
> -
>
> Key: HIVE-12241
> URL: https://issues.apache.org/jira/browse/HIVE-12241
> Project: Hive
> Issue Type: Bug
> Components: Parser, SQLStandardAuthorization
> Affects Versions: 0.14.0
> Environment: HDP 2.2
> Reporter: Hari Sekhon
>
> Using SQLStdAuthorizer Hive doesn't recognize doing a grant on all tables like I've done before in RDBMS. If having a lot of tables this becomes very inconvenient to grant on a table-by-table basis and granting on database succeeds but still doesn't allow user to query tables in that database:
> {code}
> > grant all on myDB.* to user hari;
> Error: Error while compiling statement: FAILED: ParseException line 1:15 mismatched input '.' expecting TO near 'myDB' in grant privileges (state=42000,code=4)
> > grant all on myDB.`*` to user hari;
> Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Table not found myDB.* (state=42S02,code=10001)
> > grant all on `myDB.*` to user hari;
> Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Table not found myDB.* (state=42S02,code=10001)
> > grant all on all to user hari;
> Error: Error while compiling statement: FAILED: SemanticException [Error 10001]: Table not found myDB.all (state=42S02,code=10001)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
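Until a wildcard form is parsed, the grant has to be expanded client-side. A hedged sketch of that workaround over JDBC, assuming a HiveServer2 at the hypothetical host hs2-host and an admin login; the database, table quoting, and principal follow the examples above:
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class GrantAllTablesInDb {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:hive2://hs2-host:10000/myDB"; // hypothetical host
        try (Connection conn = DriverManager.getConnection(url, "admin", "")) {
            // Collect the table names first, then grant one by one, since
            // "grant all on myDB.*" is rejected by the parser (this issue).
            List<String> tables = new ArrayList<>();
            try (Statement s = conn.createStatement();
                 ResultSet rs = s.executeQuery("show tables")) {
                while (rs.next()) {
                    tables.add(rs.getString(1));
                }
            }
            try (Statement s = conn.createStatement()) {
                for (String t : tables) {
                    s.execute("grant all on table myDB.`" + t + "` to user hari");
                }
            }
        }
    }
}
{code}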
[jira] [Updated] (HIVE-12242) SQLStdAuth add ability to map Roles to existing Groups
[ https://issues.apache.org/jira/browse/HIVE-12242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12242: --- Summary: SQLStdAuth add ability to map Roles to existing Groups (was: SQLStdAuth map Roles to existing Groups) > SQLStdAuth add ability to map Roles to existing Groups > -- > > Key: HIVE-12242 > URL: https://issues.apache.org/jira/browse/HIVE-12242 > Project: Hive > Issue Type: New Feature > Components: SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request to be able to map Hive roles to groups so that LDAP groups > can be reused rather than having to recreate all the information in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
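One plausible shape for this feature, sketched under the assumption that the cluster's Hadoop group mapping is already LDAP-backed: treat the user's existing groups as role names rather than duplicating the membership inside Hive. GroupBackedRoles is a hypothetical illustration; the UserGroupInformation calls are real Hadoop API.
{code:java}
import org.apache.hadoop.security.UserGroupInformation;

public class GroupBackedRoles {

    /**
     * Derive role names from the user's existing Hadoop group mapping
     * (resolved through the configured provider, e.g. LdapGroupsMapping),
     * instead of re-declaring the membership inside Hive.
     */
    public static String[] rolesFor(String user) {
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser(user);
        return ugi.getGroupNames(); // each group name doubles as a role name
    }
}
{code}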
[jira] [Updated] (HIVE-12242) SQLStdAuth map Roles to existing Groups
[ https://issues.apache.org/jira/browse/HIVE-12242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12242: --- Summary: SQLStdAuth map Roles to existing Groups (was: SQLStdAuth map roles to groups (ldap)) > SQLStdAuth map Roles to existing Groups > --- > > Key: HIVE-12242 > URL: https://issues.apache.org/jira/browse/HIVE-12242 > Project: Hive > Issue Type: New Feature > Components: SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request to be able to map Hive roles to groups so that LDAP groups > can be reused rather than having to recreate all the information in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Component/s: SQLStandardAuthorization > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Components: Authorization, Hive, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs (despite "should use beeline", an NPE is still a bug): > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
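The "Caused by" frame points into SQLStdHiveAccessController.isUserAdmin, which suggests some piece of session state (such as the current role list) is never initialized on the Hive CLI path. A simplified standalone model of that pattern with a null guard, using hypothetical types rather than the actual controller:
{code:java}
import java.util.Collections;
import java.util.List;

/** Minimal stand-in for the access controller's admin check. */
class RoleChecker {
    private List<String> currentRoles; // may stay null when set up via the CLI

    boolean isUserAdmin() {
        // Iterating currentRoles directly is the crash pattern; normalizing
        // an uninitialized role list to empty avoids the NPE.
        List<String> roles =
                (currentRoles == null) ? Collections.<String>emptyList() : currentRoles;
        for (String role : roles) {
            if ("admin".equalsIgnoreCase(role)) {
                return true;
            }
        }
        return false;
    }
}
{code}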
[jira] [Updated] (HIVE-6674) "show grant on all" throws NPE
[ https://issues.apache.org/jira/browse/HIVE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-6674: -- Affects Version/s: (was: 0.14.0) > "show grant on all" throws NPE > -- > > Key: HIVE-6674 > URL: https://issues.apache.org/jira/browse/HIVE-6674 > Project: Hive > Issue Type: Bug >Reporter: Thejas M Nair >Assignee: Navis > Fix For: 0.13.0 > > Attachments: HIVE-6674.1.patch.txt, HIVE-6674.2.patch.txt > > > "show grant on all" is supposed to show all the grants in the system on all > the objects. But it fails with NPE with both SQL standard auth, and legacy > auth. > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.ObjectStore.listPrincipalAllDBGrant(ObjectStore.java:4206) > at > org.apache.hadoop.hive.metastore.ObjectStore.listPrincipalDBGrantsAll(ObjectStore.java:4169) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) > at com.sun.proxy.$Proxy6.listPrincipalDBGrantsAll(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.list_db_privileges(HiveMetaStore.java:4295) > ... 36 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12241) SQLStdAuth grant on .* not recognized
[ https://issues.apache.org/jira/browse/HIVE-12241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12241: --- Component/s: SQLStandardAuthorization > SQLStdAuth grant on .* not recognized > - > > Key: HIVE-12241 > URL: https://issues.apache.org/jira/browse/HIVE-12241 > Project: Hive > Issue Type: Bug > Components: Parser, SQLStandardAuthorization >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Using SQLStdAuthorizer Hive doesn't recognize doing a grant on all tables > like I've done before in RDBMS. If having a lot of tables this becomes very > inconvenient to grant on a table-by-table basis and granting on database > succeeds but still doesn't allow user to query tables in that database: > {code} > > grant all on myDB.* to user hari; > Error: Error while compiling statement: FAILED: ParseException line 1:15 > mismatched input '.' expecting TO near 'myDB' in grant privileges > (state=42000,code=4) > > grant all on myDB.`*` to user hari; > Error: Error while compiling statement: FAILED: SemanticException [Error > 10001]: Table not found myDB.* (state=42S02,code=10001) > > grant all on `myDB.*` to user hari; > Error: Error while compiling statement: FAILED: SemanticException [Error > 10001]: Table not found myDB.* (state=42S02,code=10001) > > grant all on all to user hari; > Error: Error while compiling statement: FAILED: SemanticException [Error > 10001]: Table not found myDB.all (state=42S02,code=10001) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6674) "show grant on all" throws NPE
[ https://issues.apache.org/jira/browse/HIVE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-6674: -- Affects Version/s: 0.14.0 > "show grant on all" throws NPE > -- > > Key: HIVE-6674 > URL: https://issues.apache.org/jira/browse/HIVE-6674 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Thejas M Nair >Assignee: Navis > Fix For: 0.13.0 > > Attachments: HIVE-6674.1.patch.txt, HIVE-6674.2.patch.txt > > > "show grant on all" is supposed to show all the grants in the system on all > the objects. But it fails with NPE with both SQL standard auth, and legacy > auth. > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.ObjectStore.listPrincipalAllDBGrant(ObjectStore.java:4206) > at > org.apache.hadoop.hive.metastore.ObjectStore.listPrincipalDBGrantsAll(ObjectStore.java:4169) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) > at com.sun.proxy.$Proxy6.listPrincipalDBGrantsAll(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.list_db_privileges(HiveMetaStore.java:4295) > ... 36 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12240) HiveServer2 put username in job name when doAs=false (for viewing in RM otherwise all jobs show as 'hive')
[ https://issues.apache.org/jira/browse/HIVE-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-12240:
---
Description: When using HiveServer2 with SQLStdAuthorizer and doAs=false, please put the username in the HIVE- job submission so that we can see which user submitted which job in the RM queues, as all the jobs just show as the 'hive' user (this is running on Tez, so the query isn't shown in the name column, making it difficult to differentiate queries/jobs). (was: When using HiveServer2 with SQLStdAuthorizer and doAs=false, please put the username in the HIVE- job submission so that we can see which user submitted which job in the RM queues, as all the jobs just show as the 'hive' user.)

> HiveServer2 put username in job name when doAs=false (for viewing in RM otherwise all jobs show as 'hive')
> --
>
> Key: HIVE-12240
> URL: https://issues.apache.org/jira/browse/HIVE-12240
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Affects Versions: 0.14.0
> Environment: HDP 2.2
> Reporter: Hari Sekhon
>
> When using HiveServer2 with SQLStdAuthorizer and doAs=false, please put the username in the HIVE- job submission so that we can see which user submitted which job in the RM queues, as all the jobs just show as the 'hive' user (this is running on Tez, so the query isn't shown in the name column, making it difficult to differentiate queries/jobs).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
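A rough sketch of the requested behaviour on the MapReduce path: compose the submitted job's name from the query id plus the connected end user, so the RM view stays attributable even though execution runs as 'hive'. The helper and its parameters are hypothetical; JobConf.setJobName is the real API being leaned on:
{code:java}
import org.apache.hadoop.mapred.JobConf;

public class AttributableJobName {

    /**
     * Hypothetical helper: name the job after both the query and the
     * connected end user, e.g. "HIVE-abc123 (hari)", so the RM queue view
     * can tell users apart when doAs=false runs everything as 'hive'.
     */
    public static void nameJob(JobConf jobConf, String queryId, String endUser) {
        jobConf.setJobName("HIVE-" + queryId + " (" + endUser + ")");
    }
}
{code}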
[jira] [Updated] (HIVE-12240) HiveServer2 put username in job when doAs=false (for viewing in RM otherwise all jobs show as 'hive')
[ https://issues.apache.org/jira/browse/HIVE-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12240: --- Summary: HiveServer2 put username in job when doAs=false (for viewing in RM otherwise all jobs show as 'hive') (was: HiveServer2 put username in job when doAs=false for viewing in RM) > HiveServer2 put username in job when doAs=false (for viewing in RM otherwise > all jobs show as 'hive') > - > > Key: HIVE-12240 > URL: https://issues.apache.org/jira/browse/HIVE-12240 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > When using HiveServer2 with SQLStdAuthorizer and doAs=false please put the > username in the HIVE- job submission so that we can see which user > submitted which job in the RM queues as all the jobs just show as 'hive' user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12240) HiveServer2 put username in job name when doAs=false (for viewing in RM otherwise all jobs show as 'hive')
[ https://issues.apache.org/jira/browse/HIVE-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12240: --- Summary: HiveServer2 put username in job name when doAs=false (for viewing in RM otherwise all jobs show as 'hive') (was: HiveServer2 put username in job when doAs=false (for viewing in RM otherwise all jobs show as 'hive')) > HiveServer2 put username in job name when doAs=false (for viewing in RM > otherwise all jobs show as 'hive') > -- > > Key: HIVE-12240 > URL: https://issues.apache.org/jira/browse/HIVE-12240 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > When using HiveServer2 with SQLStdAuthorizer and doAs=false please put the > username in the HIVE- job submission so that we can see which user > submitted which job in the RM queues as all the jobs just show as 'hive' user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12240) HiveServer2 put username in job when doAs=false for viewing in RM
[ https://issues.apache.org/jira/browse/HIVE-12240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12240: --- Summary: HiveServer2 put username in job when doAs=false for viewing in RM (was: Hive when not doAs put username in job name for viewing in RM) > HiveServer2 put username in job when doAs=false for viewing in RM > - > > Key: HIVE-12240 > URL: https://issues.apache.org/jira/browse/HIVE-12240 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > When using HiveServer2 with SQLStdAuthorizer and doAs=false please put the > username in the HIVE- job submission so that we can see which user > submitted which job in the RM queues as all the jobs just show as 'hive' user. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-12233:
---
Description:
When trying to view the grants in the Hive CLI, the following NullPointerException bug occurs (despite "should use beeline", an NPE is still a bug):
{code}
2015-10-22 16:36:31,535 INFO [main]: ql.Driver (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode
2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) - org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: Error showing privileges: null
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141)
at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96)
at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387)
... 23 more
{code}

was:
When trying to view the grants in the Hive CLI, the following NullPointerException bug occurs:
{code}
2015-10-22 16:36:31,535 INFO [main]: ql.Driver (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode
2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) - org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: Error showing privileges: null
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141)
at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96)
at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Priority: Major (was: Critical) > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Components: Authorization, Hive >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs (despite "should use beeline", an NPE is still a bug): > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Affects Version/s: 0.14.0 > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Components: Authorization, Hive >Affects Versions: 0.14.0 > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Priority: Critical > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs: > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Component/s: HiveServer2 Hive Authorization > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Components: Authorization, Hive > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Priority: Critical > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs: > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Component/s: (was: HiveServer2) > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Components: Authorization, Hive > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Priority: Critical > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs: > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12233) NullPointerException StdSQLAuthorizer showing grants via Hive CLI
[ https://issues.apache.org/jira/browse/HIVE-12233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12233: --- Summary: NullPointerException StdSQLAuthorizer showing grants via Hive CLI (was: NullPointerException when trying to get grants with StdSQLAuthorizer via Hive CLI) > NullPointerException StdSQLAuthorizer showing grants via Hive CLI > - > > Key: HIVE-12233 > URL: https://issues.apache.org/jira/browse/HIVE-12233 > Project: Hive > Issue Type: Bug > Environment: HDP 2.2 + Kerberos >Reporter: Hari Sekhon >Priority: Critical > > When trying to see the grants in Hive CLI the following NullPointerException > bug occurs: > {code} > 2015-10-22 16:36:31,535 INFO [main]: ql.Driver > (Driver.java:launchTask(1604)) - Starting task [Stage-0:DDL] in serial mode > 2015-10-22 16:36:31,536 ERROR [main]: exec.DDLTask (DDLTask.java:failed(511)) > - > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException: > Error showing privileges: null > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPluginException(SQLAuthorizationUtils.java:419) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:445) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.showPrivileges(SQLStdHiveAccessControllerWrapper.java:141) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.showPrivileges(HiveAuthorizerImpl.java:96) > at org.apache.hadoop.hive.ql.exec.DDLTask.showGrants(DDLTask.java:649) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:465) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.isUserAdmin(SQLStdHiveAccessController.java:561) > at > org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.showPrivileges(SQLStdHiveAccessController.java:387) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12176) NullPointerException in VectorUDAFMinString
[ https://issues.apache.org/jira/browse/HIVE-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated HIVE-12176:
---
Description:
Hive gets the following NullPointerException when trying to do a group by aggregate.

This occurs whether the engine is Tez or MR, but I'm told this works on our other cluster, which runs HDP 2.2.

I'm attaching the full outputs from the hive session, but here is the crux of it:
{code}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.assign(VectorUDAFMinString.java:78)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.checkValue(VectorUDAFMinString.java:65)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString.aggregateInput(VectorUDAFMinString.java:279)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:157)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:335)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:880)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:138)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:117)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more
{code}

was:
Hive gets the following NullPointerException when trying to do a group by aggregate.

This occurs whether the engine is Tez or MR.

I'm attaching the full outputs from the hive session, but here is the crux of it:
{code}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
... 8 more
Caused by: java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.assign(VectorUDAFMinString.java:78)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.checkValue(VectorUDAFMinString.java:65)
[jira] [Updated] (HIVE-12176) NullPointerException in VectorUDAFMinString
[ https://issues.apache.org/jira/browse/HIVE-12176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-12176: --- Attachment: hive-nullpointexception-tez.txt hive-nullpointexception-mr.txt > NullPointerException in VectorUDAFMinString > --- > > Key: HIVE-12176 > URL: https://issues.apache.org/jira/browse/HIVE-12176 > Project: Hive > Issue Type: Bug > Components: SQL, UDF, Vectorization >Affects Versions: 1.2.1 > Environment: HDP 2.3 + Kerberos >Reporter: Hari Sekhon >Assignee: Jitendra Nath Pandey >Priority: Critical > Attachments: hive-nullpointexception-mr.txt, > hive-nullpointexception-tez.txt > > > Hive gets the following NullPointerException when trying to do a group by > aggregate. > This occurs whether the engine is Tez or MR. > I'm attaching the full outputs from the hive session, but here is the crux of > it: > {code} > Error: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing row > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52) > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) > ... 8 more > Caused by: java.lang.NullPointerException > at java.lang.System.arraycopy(Native Method) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.assign(VectorUDAFMinString.java:78) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString$Aggregation.checkValue(VectorUDAFMinString.java:65) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFMinString.aggregateInput(VectorUDAFMinString.java:279) > at > org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:157) > at > org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:335) > at > org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:880) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:138) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:117) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) > ... 
9 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
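The innermost frames show System.arraycopy() failing inside VectorUDAFMinString$Aggregation.assign(), so a null byte buffer reaching the copy is the likely trigger. A standalone sketch of that failure mode and a guard, assuming a null source buffer; the real fix may instead repair the null-handling upstream, and these names are illustrative rather than the actual generated code:

{code:java}
// Standalone illustration, not the actual Hive patch.
public class MinStringAggregationSketch {
    private byte[] minValue;   // current minimum; null until first assignment
    private int minLength;

    void assign(byte[] bytes, int start, int length) {
        if (bytes == null) {
            return;            // System.arraycopy(null, ...) throws NPE
        }
        if (minValue == null || minValue.length < length) {
            minValue = new byte[length];   // (re)allocate the destination lazily
        }
        System.arraycopy(bytes, start, minValue, 0, length);
        minLength = length;
    }

    public static void main(String[] args) {
        MinStringAggregationSketch agg = new MinStringAggregationSketch();
        agg.assign(null, 0, 0);   // guarded: no NPE for a null (missing) value
        agg.assign("abc".getBytes(), 0, 3);
        System.out.println(new String(agg.minValue, 0, agg.minLength)); // abc
    }
}
{code}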
[jira] [Commented] (HIVE-9544) Error dropping fully qualified partitioned table - Internal error processing get_partition_names
[ https://issues.apache.org/jira/browse/HIVE-9544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714424#comment-14714424 ] Hari Sekhon commented on HIVE-9544: --- [~rnpridgeon] No it was on Hortonworks so no Sentry. > Error dropping fully qualified partitioned table - Internal error processing > get_partition_names > > > Key: HIVE-9544 > URL: https://issues.apache.org/jira/browse/HIVE-9544 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon >Priority: Minor > > When attempting to drop a partitioned table using a fully qualified name I > get this error: > {code} > hive -e 'drop table myDB.my_table_name;' > Logging initialized using configuration in > file:/etc/hive/conf/hive-log4j.properties > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. > org.apache.thrift.TApplicationException: Internal error processing > get_partition_names > {code} > It succeeds if I instead do: > {code}hive -e 'use myDB; drop table my_table_name;'{code} > Regards, > Hari Sekhon > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11558) Hive generates Parquet files with broken footers, causes NullPointerException in Spark / Drill / Parquet tools
[ https://issues.apache.org/jira/browse/HIVE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-11558: --- Description: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56) at scala.collection.parallel.mutable.ParArray$Map.tryLeaf(ParArray.scala:650) at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165) at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514) at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} What's interesting is that the table works fine in Hive when selecting out of it, even when doing select * on the whole table and letting it run to the end (it's a sample data set); it's only other tools that it causes problems for. All fields are string except for the first one, which is timestamp, but this is not that known issue since if I create another parquet table with 3 fields including the timestamp and two string fields using CTAS those hive generated parquet files work fine in the other tools. The only thing I can see which appears to cause this is that the other fields have lots of NULLs in them, as those json fields may or may not be present. I've converted this exact same json data set to parquet using Apache Drill and also using Apache Spark SQL and both of those tools create parquet files from this data set as a straight conversion that are fine when accessed via Parquet tools or Drill or Spark or Hive (using an external Hive table definition layered over the generated parquet files). This implies that it's Hive's generation of Parquet that is broken since both Drill and Spark can convert the dataset from JSON to Parquet without any issues on reading the files back in any of the other mentioned tools. 
was: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at
[jira] [Updated] (HIVE-11558) Hive generates Parquet files with broken footers, causes NullPointerException in Spark / Drill / Parquet tools
[ https://issues.apache.org/jira/browse/HIVE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-11558: --- Description: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56) at scala.collection.parallel.mutable.ParArray$Map.tryLeaf(ParArray.scala:650) at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165) at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514) at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} What's interesting is that the table works fine in Hive when selecting out of it, even when doing select * on the whole table and letting it run to the end (it's a sample data set); it's only other tools that it causes problems for. All fields are string except for the first one, which is timestamp, but this is not that known issue since if I create another parquet table with 3 fields including the timestamp and two string fields using CTAS those hive generated parquet files work fine in the other tools. The only thing I can see which appears to cause this is that the other fields have lots of NULLs in them, as those json fields may or may not be present. I've converted this exact same json data set to parquet using Apache Drill and also using Apache Spark SQL and both of those tools create parquet files from this data set as a straight conversion that are fine when accessed via Parquet tools or Drill or Spark or Hive (using an external Hive table definition layered over the generated parquet files). This implies that it's Hive's generation of Parquet that is broken since both Drill and Spark can convert the dataset from JSON to Parquet without any issues on reading the files back in any of the other tools. 
was: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at scala.collecti
[jira] [Updated] (HIVE-11558) Hive generates Parquet files with broken footers, causes NullPointerException in Spark / Drill / Parquet tools
[ https://issues.apache.org/jira/browse/HIVE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-11558: --- Description: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:54) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:53) at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:56) at scala.collection.parallel.mutable.ParArray$Map.tryLeaf(ParArray.scala:650) at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:165) at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:514) at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) {code} What's interesting is that the table works fine in Hive when selecting out of it, even when doing select * on the whole table and letting it run to the end (it's a sample data set); it's only other tools that it causes problems for. All fields are string except for the first one, which is timestamp, but this is not that known issue since if I create another table with 3 fields including the timestamp and two string fields it works fine in other tools. The only thing I can see which appears to cause this is that the other fields have lots of NULLs in them, as those json fields may or may not be present. I've converted this exact same json data set to parquet using Apache Drill and also using Apache Spark SQL and both of those tools create parquet files from this data set as a straight conversion that are fine when accessed via Parquet tools or Drill or Spark or Hive (using an external Hive table definition layered over the generated parquet files). This implies that it's Hive's generation of Parquet that is broken since both Drill and Spark can convert the dataset from JSON to Parquet without any issues on reading the files back in any of the other tools. 
was: When creating a Parquet table in Hive from a table in another format (in this case JSON) using CTAS, the generated parquet files are created with broken footers and cause NullPointerExceptions in both Parquet tools and Spark when reading the files directly. Here is the error from parquet tools: {code}Could not read footer: java.lang.NullPointerException{code} Here is the error from Spark reading the parquet file back: {code}java.lang.NullPointerException at parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:249) at parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:543) at parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:520) at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:426) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:298) at org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$refresh$6.apply(newParquet.scala:297) at scala.collection.parallel.mutable.ParArray$Map.leaf(ParArray.scala:658) at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks
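To reproduce the footer failure outside Hive, a small driver can call the same parquet.hadoop API that appears in the Spark stack traces above. This is a hedged sketch: the readFooter(Configuration, Path) overload is assumed from that era of parquet-mr (before the org.apache.parquet rename). On an affected Hive-generated file it should fail the same way inside fromParquetStatistics(); on a healthy file it prints the schema:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import parquet.hadoop.ParquetFileReader;
import parquet.hadoop.metadata.ParquetMetadata;

// Minimal footer check for a single parquet file.
// Usage: FooterCheck <hdfs-or-local-path-to-parquet-file>
public class FooterCheck {
    public static void main(String[] args) throws Exception {
        ParquetMetadata footer =
                ParquetFileReader.readFooter(new Configuration(), new Path(args[0]));
        // On a file with broken footer statistics this line is never reached.
        System.out.println(footer.getFileMetaData().getSchema());
    }
}
{code}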
[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559313#comment-14559313 ] Hari Sekhon commented on HIVE-10304: If just recommending that users use Beeline instead of the Hive CLI, that is fine, but if the Hive 1 CLI were ever removed it would cause major headaches for users such as myself who have lots of scripts and programs that make calls to the Hive CLI, and rewriting things that have worked fine for years is not cool. In fact it's the opposite of cool. > Add deprecation message to HiveCLI > -- > > Key: HIVE-10304 > URL: https://issues.apache.org/jira/browse/HIVE-10304 > Project: Hive > Issue Type: Sub-task > Components: CLI >Affects Versions: 1.1.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Labels: TODOC1.2 > Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch > > > As Beeline is now the recommended command line tool for Hive, we should add a > message to HiveCLI to indicate that it is deprecated and redirect users to > Beeline. > This is not suggesting to remove HiveCLI for now, but is just a helpful > pointer for users to know that the direction to focus attention on is Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10570) HiveServer2 shut downs due to temporary ZooKeeper unavailability, causes permanent outage instead of temporary
[ https://issues.apache.org/jira/browse/HIVE-10570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14528150#comment-14528150 ] Hari Sekhon commented on HIVE-10570: [~thejas] it was working fine for a long time, then it went down when it could no longer contact ZooKeeper. > HiveServer2 shut downs due to temporary ZooKeeper unavailability, causes > permanent outage instead of temporary > -- > > Key: HIVE-10570 > URL: https://issues.apache.org/jira/browse/HIVE-10570 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon >Priority: Critical > > HiveServer2 should not shut down when there is temporary ZooKeeper > unavailability (eg. temporary network outage). This prevents retry and > recovery later as HiveServer2 is no longer running and therefore cannot retry > - HiveServer2 stays offline indefinitely until operator intervention to > restart it, even for minor temporary problems. > I believe this behaviour is due to recent ZooKeeper dependency addition for > HiveServer2 HA. > {code}2015-05-01 11:35:05,367 WARN zookeeper.ClientCnxn > (ClientCnxn.java:run(1102)) - Session 0x14d004cb02c001e for server null, > unexpected error, closing socket > connection and attempting reconnect > java.net.SocketException: Network is unreachable > at sun.nio.ch.Net.connect0(Native Method) > at sun.nio.ch.Net.connect(Net.java:465) > at sun.nio.ch.Net.connect(Net.java:457) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670) > at > org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277) > at > org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287) > at > org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:967) > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1003) > 2015-05-01 11:35:05,629 INFO client.ZooKeeperSaslClient > (ZooKeeperSaslClient.java:run(285)) - Client will use GSSAPI as SASL > mechanism. > 2015-05-01 11:35:05,630 INFO zookeeper.ClientCnxn > (ClientCnxn.java:logStartConnect(975)) - Opening socket connection to server > /:2181. 
Will attempt to SASL-authenticate using Login > Context section 'HiveZooKeeperClient' > 2015-05-01 11:35:05,630 ERROR zookeeper.ClientCnxnSocketNIO > (ClientCnxnSocketNIO.java:connect(289)) - Unable to open socket to > /:2181 > 2015-05-01 11:35:05,630 ERROR zookeeper.ClientCnxnSocketNIO > (ClientCnxnSocketNIO.java:connect(289)) - Unable to open socket to > /:2181 > 2015-05-01 11:35:05,630 WARN zookeeper.ClientCnxn > (ClientCnxn.java:run(1102)) - Session 0x14d004cb02c001e for server null, > unexpected error, closing socket > connection and attempting reconnect > java.net.SocketException: Network is unreachable > at sun.nio.ch.Net.connect0(Native Method) > at sun.nio.ch.Net.connect(Net.java:465) > at sun.nio.ch.Net.connect(Net.java:457) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670) > at > org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277) > at > org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287) > at > org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:967) > at > org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1003) > 2015-05-01 11:35:05,943 INFO server.HiveServer2 (HiveServer2.java:stop(299)) > - Shutting down HiveServer2 > 2015-05-01 11:35:05,944 INFO thrift.ThriftCLIService > (ThriftCLIService.java:stop(137)) - Thrift server has stopped > 2015-05-01 11:35:05,944 INFO service.AbstractService > (AbstractService.java:stop(125)) - Service:ThriftBinaryCLIService is stopped. > 2015-05-01 11:35:05,944 INFO service.AbstractService > (AbstractService.java:stop(125)) - Service:OperationManager is stopped. > 2015-05-01 11:35:05,944 INFO service.AbstractService > (AbstractService.java:stop(125)) - Service:SessionManager is stopped. > 2015-05-01 11:35:05,946 INFO server.HiveServer2 > (HiveStringUtils.java:run(679)) - SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down HiveServer2 at / > /{code} > Hari Sekhon > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
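The behaviour being asked for amounts to: on ZooKeeper session loss, retry registration with capped backoff while the server keeps running, instead of shutting down. A hedged standalone sketch of that loop; reRegister() is a hypothetical stand-in for recreating the ephemeral znode, not a real HiveServer2 method:

{code:java}
// Illustrative sketch, not HiveServer2 code.
public class ZkRetrySketch {
    private volatile boolean running = true;
    private int attempts = 0;

    // Hypothetical stand-in for re-creating the server's ephemeral znode.
    void reRegister() throws Exception {
        if (++attempts < 3) {
            throw new Exception("Network is unreachable");  // ZK still down
        }
    }

    void onSessionLost() throws InterruptedException {
        long backoffMs = 1000;
        while (running) {
            try {
                reRegister();
                System.out.println("re-registered after " + attempts + " attempts");
                return;                                      // recovered; keep serving
            } catch (Exception e) {
                System.err.println("ZK unavailable, retrying in " + backoffMs + "ms");
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, 60000);  // cap the backoff
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        new ZkRetrySketch().onSessionLost();
    }
}
{code}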
[jira] [Resolved] (HIVE-10399) from_unixtime_millis() Hive UDF
[ https://issues.apache.org/jira/browse/HIVE-10399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon resolved HIVE-10399. Resolution: Not A Problem Yes that does exactly what I need, and is an easier workaround than the one I had implemented. Thanks! I'm going to close this ticket as the only other reason to extend from_unixtime() support for millis is for custom formatting, which seems to have been added in 1.2 anyway with date_format. > from_unixtime_millis() Hive UDF > --- > > Key: HIVE-10399 > URL: https://issues.apache.org/jira/browse/HIVE-10399 > Project: Hive > Issue Type: New Feature > Components: UDF > Environment: HDP 2.2 >Reporter: Hari Sekhon >Priority: Minor > > Feature request for a > {code}from_unixtime_millis(){code} > Hive UDF - from_unixtime() accepts only secs since epoch, and right now the > solution is to create a custom UDF, but supporting millisecond precision > dates natively seems like quite a standard thing for Hive to do. > Hari Sekhon > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
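For reference, the custom-UDF workaround mentioned in the ticket can be as small as the sketch below, using the old-style org.apache.hadoop.hive.ql.exec.UDF base class; the class name and output format are illustrative, not anything shipped with Hive. It would be registered with ADD JAR followed by CREATE TEMPORARY FUNCTION from_unixtime_millis AS '<fully.qualified.ClassName>':

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Sketch of a from_unixtime_millis() UDF: milliseconds since epoch in,
// formatted timestamp string out. Creating a SimpleDateFormat per call
// keeps this thread-safe at a small allocation cost.
public class FromUnixtimeMillis extends UDF {
    public Text evaluate(LongWritable millis) {
        if (millis == null) {
            return null;   // propagate SQL NULL
        }
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        return new Text(fmt.format(new Date(millis.get())));
    }
}
{code}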
[jira] [Commented] (HIVE-633) ADD FILE command does not accept quoted filenames
[ https://issues.apache.org/jira/browse/HIVE-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14499883#comment-14499883 ] Hari Sekhon commented on HIVE-633: -- This has been bugging me for such a long time with add jar 'blah.jar', add file "blah.py", source "file.sql" etc. Seems like it would be a minor improvement to have the parser do a string.replaceAll() or a similar method to strip single and double quotes from these file tokens? > ADD FILE command does not accept quoted filenames > - > > Key: HIVE-633 > URL: https://issues.apache.org/jira/browse/HIVE-633 > Project: Hive > Issue Type: Bug >Affects Versions: 0.3.0 > Environment: Ubuntu Linux (intrepid) >Reporter: Saurabh Nanda >Priority: Minor > > The following command says file does not exist. Removing the quotes around > the filename makes it work. > hive> add files '/tmp/testing.jar'; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
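A blanket string.replaceAll() would also mangle quotes that legitimately appear inside a path, so a slightly safer variant of the suggestion strips only a matching pair of surrounding quotes. A standalone sketch, not the actual Hive CLI parser:

{code:java}
public class QuoteStrip {
    // Strip one pair of matching surrounding quotes, if present.
    static String stripQuotes(String token) {
        if (token != null && token.length() >= 2) {
            char first = token.charAt(0);
            char last = token.charAt(token.length() - 1);
            if ((first == '\'' || first == '"') && first == last) {
                return token.substring(1, token.length() - 1);
            }
        }
        return token;   // unquoted tokens pass through unchanged
    }

    public static void main(String[] args) {
        System.out.println(stripQuotes("'/tmp/testing.jar'"));   // /tmp/testing.jar
        System.out.println(stripQuotes("\"blah.py\""));          // blah.py
        System.out.println(stripQuotes("/tmp/plain.jar"));       // unchanged
    }
}
{code}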
[jira] [Commented] (HIVE-9584) Hive CLI hangs while waiting for a container
[ https://issues.apache.org/jira/browse/HIVE-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358899#comment-14358899 ] Hari Sekhon commented on HIVE-9584: --- I've noticed this problem today too. I suggest container start be done in another thread in the background and the CLI be presented to the user for metadata operations without requiring a Tez container. > Hive CLI hangs while waiting for a container > > > Key: HIVE-9584 > URL: https://issues.apache.org/jira/browse/HIVE-9584 > Project: Hive > Issue Type: Bug >Reporter: Rich Haase > > The Hive CLI, with Tez set as the execution engine, hangs if a container > cannot immediately be allocated for the Tez application master. From a > user perspective this behavior is broken. > Users should be able to start a CLI and execute hive metadata commands > without needing a Tez application master. Since users are accustomed to > queries with Hive on MapReduce taking a long time, but access to the CLI > being near instantaneous, the correct behavior should be to wait for a query > to be run before starting the Tez application master. > This behavior is avoided with the Beeline CLI since it connects through > HiveServer2, however many users are accustomed to using the Hive CLI and > should not be penalized for their choice until the Hive CLI is completely > deprecated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
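The suggestion in the comment above maps naturally onto starting the session in a background thread and only blocking on its Future when the first real query needs it. A hedged standalone sketch; startTezSession() is a stand-in for the real session setup, not a Hive or Tez API:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class LazyTezSession {
    // Hypothetical stand-in for allocating the Tez application master.
    private static String startTezSession() throws Exception {
        Thread.sleep(2000);               // simulate slow container allocation
        return "tez-session-1";
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> session = pool.submit(LazyTezSession::startTezSession);

        // The prompt is available immediately; metadata-only commands
        // (SHOW TABLES, DESCRIBE, ...) never need to touch the session.
        System.out.println("prompt ready, session warming in background");

        // First real query: block (with a timeout) until the session is up.
        String s = session.get(60, TimeUnit.SECONDS);
        System.out.println("running query on " + s);
        pool.shutdown();
    }
}
{code}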
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Description: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide "refresh metadata" operation, something Impala added later, and now calls "INVALIDATE METADATA" in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive command instead. (Fyi I was in the original trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon was: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide "refresh metadata" operation, something Impala added later, and now calls "INVALIDATE METADATA" in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive command instead. (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore, Query Planning, Query Processor >Affects Versions: llap > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. 
> (Fyi I was in the original trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335437#comment-14335437 ] Hari Sekhon commented on HIVE-9768: --- Ok, I agree with your point, Alan, about not requiring users to invalidate the cache themselves, but metadata caching itself seems like it would be beneficial to reduce the query latency. We're really waiting on Hive LLAP out here so I'm eager to see the speed-up and low latency improvements, since there is no doubt going to be the usual round of benchmarks and debates. It would be nice if LLAP closed the gap with Impala to the point where it's irrelevant so we can shut down SQL-on-Hadoop fragmentation. Perhaps LLAP could cache the metadata and Hive could add hooks to invalidate the metadata cluster-wide automatically whenever a DDL statement is executed in Hive? This seems like the best of both worlds and addresses your concerns? > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore, Query Planning, Query Processor >Affects Versions: llap > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. > (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
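A hedged sketch of the hook idea in that comment: every DDL statement bumps a shared version, and each node's metadata cache refetches lazily once its cached version is stale. In a real cluster the version would live in ZooKeeper or the metastore rather than a static field, and none of these names are actual LLAP code:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetadataCacheSketch {
    // Stand-in for a cluster-wide counter (ZooKeeper/metastore in reality).
    // The increment is not atomic; fine for a single-threaded sketch.
    static volatile long globalVersion = 0;

    static void onDdl() { globalVersion++; }             // hook fired by every DDL

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, Long> cachedAt = new ConcurrentHashMap<>();

    String getTableMeta(String table) {
        Long seen = cachedAt.get(table);
        if (seen == null || seen != globalVersion) {
            cache.put(table, fetchFromMetastore(table)); // stale: refetch lazily
            cachedAt.put(table, globalVersion);
        }
        return cache.get(table);
    }

    private String fetchFromMetastore(String table) {
        System.out.println("metastore fetch: " + table); // hypothetical RPC
        return "schema-of-" + table;
    }

    public static void main(String[] args) {
        MetadataCacheSketch node = new MetadataCacheSketch();
        node.getTableMeta("t");   // cold: fetches
        node.getTableMeta("t");   // warm: served from cache
        onDdl();                  // e.g. an ALTER TABLE elsewhere in the cluster
        node.getTableMeta("t");   // stale: fetches again
    }
}
{code}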
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Affects Version/s: 1.0.0 > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore, Query Planning, Query Processor >Affects Versions: 0.14.0, llap, 1.0.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. > (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Component/s: (was: Database/Schema) > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore, Query Planning, Query Processor >Affects Versions: 0.14.0, llap > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. > (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Component/s: Query Processor Query Planning Metastore HCatalog > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: HCatalog, Metastore, Query Planning, Query Processor >Affects Versions: 0.14.0, llap > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. > (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9768) Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata refresh/invalidate command
[ https://issues.apache.org/jira/browse/HIVE-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HIVE-9768: -- Description: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future HIVE-9452) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. Also, any temporary outage of the metadata layer would affect the speed of the LLAP layer, so pre-loading and caching the metadata adds resilience against this. This pre-loaded metadata also requires a cluster-wide "refresh metadata" operation, something Impala added later, and now calls "INVALIDATE METADATA" in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive command instead. (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon was: Feature request for Hive LLAP to preload table metadata across all running nodes to reduce query latency (this is what Impala does). The design decision behind this in Impala was to avoid the latency overhead of fetching the metadata at query time, since that's an extra database query (or possibly HBase query in future) that must first be completely fulfilled before the Hive LLAP query even starts to run, which would slow down the response to the user if not pre-loaded. This pre-loaded metadata also requires a cluster-wide "refresh metadata" operation, something Impala added later, and now calls "INVALIDATE METADATA" in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive command instead. (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) Regards, Hari Sekhon ex-Cloudera http://www.linkedin.com/in/harisekhon > Hive LLAP Metadata pre-load for low latency, + cluster-wide metadata > refresh/invalidate command > --- > > Key: HIVE-9768 > URL: https://issues.apache.org/jira/browse/HIVE-9768 > Project: Hive > Issue Type: New Feature > Components: Database/Schema >Affects Versions: 0.14.0, llap > Environment: HDP 2.2 >Reporter: Hari Sekhon > > Feature request for Hive LLAP to preload table metadata across all running > nodes to reduce query latency (this is what Impala does). > The design decision behind this in Impala was to avoid the latency overhead > of fetching the metadata at query time, since that's an extra database query > (or possibly HBase query in future HIVE-9452) that must first be completely > fulfilled before the Hive LLAP query even starts to run, which would slow > down the response to the user if not pre-loaded. Also, any temporary outage > of the metadata layer would affect the speed of the LLAP layer, so pre-loading and > caching the metadata adds resilience against this. > This pre-loaded metadata also requires a cluster-wide "refresh metadata" > operation, something Impala added later, and now calls "INVALIDATE METADATA" > in its SQL dialect. I propose using a more intuitive "REFRESH METADATA" Hive > command instead. > (Fyi I was in the first trio of Impala SMEs at Cloudera in early 2013) > Regards, > Hari Sekhon > ex-Cloudera > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)