Re: ORC ACID table returning Array Index Out of Bounds

2018-02-15 Thread Aviral Agarwal
The Hive version is 1.2.1000.2.6.1.0-0129 (HDP 2.6.1.0).

For now I have mitigated the problem by recreating the table. So, I don't
have the relevant ORC files right now.

Also, I am curious: how would "*hive.acid.key.index*" help in debugging
this problem?

I was going through the source code, and it seems the following method is
the problem:

/**
 * Find the key range for bucket files.
 * @param reader the reader
 * @param options the options for reading with
 * @throws IOException
 */
private void discoverKeyBounds(Reader reader,
                               Reader.Options options) throws IOException {
  RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
  long offset = options.getOffset();
  long maxOffset = options.getMaxOffset();
  int firstStripe = 0;
  int stripeCount = 0;
  boolean isTail = true;
  List<StripeInformation> stripes = reader.getStripes();
  for (StripeInformation stripe : stripes) {
    if (offset > stripe.getOffset()) {
      firstStripe += 1;
    } else if (maxOffset > stripe.getOffset()) {
      stripeCount += 1;
    } else {
      isTail = false;
      break;
    }
  }
  if (firstStripe != 0) {
    minKey = keyIndex[firstStripe - 1];
  }
  if (!isTail) {
    maxKey = keyIndex[firstStripe + stripeCount - 1];
  }
}
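
From the stack trace, the ArrayIndexOutOfBoundsException: 8 comes from the
keyIndex lookups at the end of this method: keyIndex is parsed from the
"*hive.acid.key.index*" footer property and should hold one entry per stripe,
so if the index contains fewer entries than the reader reports stripes, the
lookup keyIndex[firstStripe + stripeCount - 1] overruns the array. A minimal
sketch of a defensive guard (my guess at a mitigation, not the actual fix;
a null maxKey just disables key-range pruning):

if (!isTail) {
  int lastStripe = firstStripe + stripeCount - 1;
  if (lastStripe < keyIndex.length) {
    maxKey = keyIndex[lastStripe];
  } else {
    // hive.acid.key.index had fewer entries than there are stripes;
    // leave maxKey null so the merger does not prune by key range.
    maxKey = null;
  }
}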

If this is still an open issue, I would like to submit a patch for it.
Let me know how I can debug this further.

Thanks,
Aviral Agarwal

On Feb 15, 2018 23:10, "Eugene Koifman" <ekoif...@hortonworks.com> wrote:

> What version of Hive is this?
>
>
>
> Can you isolate this to a specific partition?
>
>
>
> The table/partition you are reading should have a directory called base_x/
> with several bucket_N files (if you see more than one base_x, take the one
> with the highest x).
>
>
>
> Each bucket_N should have a “*hive.acid.key.index*” property in the user
> metadata section of the ORC footer.
>
> Could you share the value of this property?
>
>
>
> You can use orcfiledump (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-ORCFileDumpUtility)
> for this, but it requires https://issues.apache.org/jira/browse/ORC-223.
>
>
>
> Thanks,
>
> Eugene
>
>
>
>
>
> *From: *Aviral Agarwal <aviral12...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Thursday, February 15, 2018 at 2:08 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *ORC ACID table returning Array Index Out of Bounds
>
>
>
> Hi guys,
>
>
>
> I am running into the following error when querying an ACID table:
>
>
>
> Caused by: java.lang.RuntimeException: java.io.IOException: 
> java.lang.ArrayIndexOutOfBoundsException: 8
>
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
>
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
>
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
>
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
>
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
>
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
>
> at 
> org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
>
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
>
> at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
>
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
>
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
>
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
>
> ... 14 more
>
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 8
>
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:253)
>
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)

ORC ACID table returning Array Index Out of Bounds

2018-02-15 Thread Aviral Agarwal
Hi guys,

I am running into the following error when querying an ACID table:

Caused by: java.lang.RuntimeException: java.io.IOException:
java.lang.ArrayIndexOutOfBoundsException: 8
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 8
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:253)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
... 25 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.discoverKeyBounds(OrcRawRecordMerger.java:378)
at 
org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:447)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1436)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1323)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
... 26 more



Any help would be appreciated.


Regards,

Aviral Agarwal


ORC Schema evolution - Drop columns

2017-09-04 Thread Aviral Agarwal
Hi,

Is there any JIRA tracking the feature of dropping columns from ORC
transactional tables?

Regards,
Aviral Agarwal


Re: ORC Transaction Table - Spark

2017-08-23 Thread Aviral Agarwal
So there is no way right now for Spark to read Hive 2.x data?
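
For reference, the failing directory name mixes the two layouts: Hive 1.x
expects delta_<minTxn>_<maxTxn>, while Hive 2.x appends a statement id,
delta_<minTxn>_<maxTxn>_<stmtId>. A small self-contained sketch of why a
1.x-style parse throws exactly the error above (a hypothetical illustration,
not the actual OrcInputFormat/AcidUtils code):

public class DeltaNameParse {
  public static void main(String[] args) {
    String dir = "delta_0645253_0645253_0001";      // Hive 2.x style name
    String rest = dir.substring("delta_".length()); // "0645253_0645253_0001"
    int us = rest.indexOf('_');
    long minTxn = Long.parseLong(rest.substring(0, us));  // 645253
    long maxTxn = Long.parseLong(rest.substring(us + 1)); // throws
    // java.lang.NumberFormatException: For input string: "0645253_0001"
  }
}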

On Thu, Aug 24, 2017 at 12:17 AM, Eugene Koifman <ekoif...@hortonworks.com>
wrote:

> This looks like you have some data written by Hive 2.x and Hive 1.x code
> trying to read it.
>
> That is not supported.
>
>
>
> *From: *Aviral Agarwal <aviral12...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Wednesday, August 23, 2017 at 12:24 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *Re: ORC Transaction Table - Spark
>
>
>
> Hi,
>
> Yes, it is caused by the wrong naming convention of the delta directory:
>
> /apps/hive/warehouse/foo.db/bar/year=2017/month=5/delta_0645253_0645253_0001
>
> How do I solve this?
>
> Thanks!
> Aviral Agarwal
>
>
>
> On Tue, Aug 22, 2017 at 11:50 PM, Eugene Koifman <ekoif...@hortonworks.com>
> wrote:
>
> Could you do a recursive “ls” in the table or partition that you are trying
> to read?
>
> Most likely you have files that don’t follow the expected naming convention.
>
>
>
> Eugene
>
>
>
>
>
> *From: *Aviral Agarwal <aviral12...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Tuesday, August 22, 2017 at 5:39 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *ORC Transaction Table - Spark
>
>
>
> Hi,
>
>
>
> I am trying to read a Hive ORC transaction table through Spark, but I am
> getting the following error:
>
>
> Caused by: java.lang.RuntimeException: serious problem
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
> .
> Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "0645253_0001"
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
> ... 118 more
>
>
> Any help would be appreciated.
>
> Thanks and Regards,
> Aviral Agarwal
>
>
>


ORC Transaction Table - Spark

2017-08-22 Thread Aviral Agarwal
Hi,

I am trying to read a Hive ORC transaction table through Spark, but I am
getting the following error:

Caused by: java.lang.RuntimeException: serious problem
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
.
Caused by: java.util.concurrent.ExecutionException:
java.lang.NumberFormatException: For input string: "0645253_0001"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
... 118 more

Any help would be appreciated.

Thanks and Regards,
Aviral Agarwal


Re: Hive Merge Fails with bucketid = -1

2017-07-17 Thread Aviral Agarwal
I found the solution to this. It was because the partition I was merging
into had data from an external source, so those rows had no valid bucket id.
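
For anyone who hits the same thing: as far as I can tell, the ACID reader
derives the bucket id from the file name, so files copied into the partition
by an external process (which don't match the bucket_NNNNN pattern) end up
with bucketid = -1, and that later becomes the failing array index in
FileSinkOperator. A small self-contained sketch of the idea (a hypothetical
illustration with made-up names, not the actual Hive source):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketIdFromFileName {
  private static final Pattern BUCKET = Pattern.compile("bucket_(\\d+)");

  // Returns the bucket id encoded in an ACID bucket file name,
  // or -1 when the name does not match the expected pattern.
  static int bucketId(String fileName) {
    Matcher m = BUCKET.matcher(fileName);
    return m.matches() ? Integer.parseInt(m.group(1)) : -1;
  }

  public static void main(String[] args) {
    System.out.println(bucketId("bucket_00000")); // 0
    System.out.println(bucketId("000000_0"));     // -1: externally written file
  }
}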

Regards,
Aviral Agarwal

On Mon, Jul 17, 2017 at 4:13 PM, Aviral Agarwal <aviral12...@gmail.com>
wrote:

> Hi,
>
> I am getting the following error when using Hive Merge:
>
> Vertex failed, vertexName=Reducer 3, vertexId=vertex_1499670090756_4766_1_04, 
> diagnostics=[Task failed, taskId=task_1499670090756_4766_1_04_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"_col0":3800,"_col1":18712160,"_col2":"RAMPURANI
>  , RAMPUR KHEM KARAN","_col3":"Vaishali","_col4":"PO BIRNALAKHAN 
> SEN","_col5":"844125","_col6":"Bihar","_col7":"Office","_col8":1,"_col9":1,"_col10":"2016-12-05
>  19:29:59.148","_col11":"2017-07-16 
> 22:46:11.369","_col12":4,"_col13":2017,"_col14":6}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row (tag=0) 
> {"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"//Masked
>  Values"}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:266)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row (tag=0) 
> {"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"//Masked
>  Values"}}
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
>   ... 16 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:790)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
>   ... 17 more
>
>
>
> Any help would be appreciated.
>
>
> Thanks,
>
> Aviral Agarwal
> Data Engineer
> PhonePe
>
>
>


Hive Merge Fails with bucketid = -1

2017-07-17 Thread Aviral Agarwal
Hi,

I am getting the following error when using Hive Merge:

Vertex failed, vertexName=Reducer 3,
vertexId=vertex_1499670090756_4766_1_04, diagnostics=[Task failed,
taskId=task_1499670090756_4766_1_04_00, diagnostics=[TaskAttempt 0
failed, info=[Error: Failure while running
task:java.lang.RuntimeException: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error
while processing row (tag=0)
{"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"_col0":3800,"_col1":18712160,"_col2":"RAMPURANI
, RAMPUR KHEM KARAN","_col3":"Vaishali","_col4":"PO BIRNALAKHAN
SEN","_col5":"844125","_col6":"Bihar","_col7":"Office","_col8":1,"_col9":1,"_col10":"2016-12-05
19:29:59.148","_col11":"2017-07-16
22:46:11.369","_col12":4,"_col13":2017,"_col14":6}}
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error
while processing row (tag=0)
{"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"//Masked
Values"}}
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:266)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive
Runtime Error while processing row (tag=0)
{"key":{"reducesinkkey0":{"transactionid":0,"bucketid":-1,"rowid":5}},"value":{"//Masked
Values"}}
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:790)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
... 17 more



Any help would be appreciated.


Thanks,

Aviral Agarwal
Data Engineer
PhonePe


Re: Re:Re: Re:Re:Re:Re: How to access linux kerberosed hive from windows eclipse workspace?

2016-07-05 Thread Aviral Agarwal
at sun.security.krb5.internal.TGSRep.init(Unknown Source)
>> >   at sun.security.krb5.internal.TGSRep.<init>(Unknown Source)
>> >   ... 25 more
>> >java.sql.SQLException: Could not open client transport with JDBC Uri:
>> jdbc:hive2://hm:1/default;principal=hive/h...@hadoop.com: GSS initiate
>> failed
>> >   at
>> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:231)
>> >   at
>> org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:176)
>> >   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
>> >   at java.sql.DriverManager.getConnection(Unknown Source)
>> >   at java.sql.DriverManager.getConnection(Unknown Source)
>> >   at
>> org.apache.hadoop.hive.ql.security.authorization.plugin.KerberosTest.main(KerberosTest.java:50)
>> >Caused by: org.apache.thrift.transport.TTransportException: GSS initiate
>> failed
>> >   at
>> org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
>> >   at
>> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
>> >   at
>> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>> >   at
>> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>> >   at
>> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>> >   at java.security.AccessController.doPrivileged(Native Method)
>> >   at javax.security.auth.Subject.doAs(Unknown Source)
>> >   at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>> >   at
>> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>> >   at
>> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:204)
>> >   ... 5 more
>> >
>> >It is as if the Kerberos configuration is incorrect.
>> >
>> >
>> >At 2016-07-04 21:26:53, "Vivek Shrivastava" <vivshrivast...@gmail.com>
>> wrote:
>> >
>> >
>> >The renewal lifetime at the client krb5.conf level does not make any
>> difference. The renewal time period is defined at the KDC in kdc.conf; the
>> client cannot override it. The renewal is also a property set at the
>> principal level; both settings (renewal_lifetime, +renewal) dictate if a
>> ticket can be renewed. I don't think your problem has anything to do with that.
>> >
>> >
>> >It seems something basic is missing in your environment. I would probably
>> run the same piece of code in the Unix environment and ensure that there is
>> no error. Enabling Kerberos debug logging as suggested in the previous
>> post will also help you compare the sequence of execution.
>> >
>> >
>> >On Mon, Jul 4, 2016 at 7:52 AM, Aviral Agarwal <aviral12...@gmail.com>
>> wrote:
>> >
>> >
>> >Hi,
>> >Could you enable Kerberos logging with
>> >
>> >-Dsun.security.krb5.debug=true
>> >
>> >
>> >and paste the output?
>> >
>> >
>> >
>> >
>> >On Mon, Jul 4, 2016 at 3:47 PM, Maria <linanmengxia...@126.com> wrote:
>> >
>> >The question "kinit: Ticket expired while renewing credentials" has
>> been solved. I can successfully execute "kinit -R",
>> >
>> >but the error “java.lang.RuntimeException:
>> org.apache.thrift.transport.TTransportException: Peer indicated failure:
>> GSS initiate failed”
>> >
>> >is still there..
>> >
>> >
>> >
>> >
>> >
>> >At 2016-07-04 14:39:04, "Maria" <linanmengxia...@126.com> wrote:
>> >
>> >>I saw a mail named "HCatalog Security". His or her problem was similar
>> to mine, and the reply was:
>> >
>> >>"This issue goes away after doing a kinit -R".
>> >
>> >>
>> >
>> >>So I did the same operation, but it failed:
>> >
>> >>kinit: Ticket expired while renewing credentials
>> >
>> >>
>> >
>> >>But in my /etc/krb5.conf, I have configured this item:
>> >
>> >>renew_lifetime=7d
>> >
>> >>
>> >
>> >>So, can anybody give me some suggestions, please? Thank you.
>> >
>> >>
>> >

Re: Re:Re:Re:Re: How to access linux kerberosed hive from windows eclipse workspace?

2016-07-04 Thread Aviral Agarwal
Hi,
Could you enable Kerberos logging with

-Dsun.security.krb5.debug=true

and paste the output?
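
If it is easier than editing the Eclipse launch configuration, the same flag
can also be set programmatically before the connection is opened. A minimal
sketch, assuming the standard Hive JDBC driver and a placeholder JDBC URL:

public class KrbDebugExample {
  public static void main(String[] args) throws Exception {
    // Equivalent to passing -Dsun.security.krb5.debug=true as a JVM
    // argument; it must be set before the first Kerberos operation.
    System.setProperty("sun.security.krb5.debug", "true");
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    java.sql.Connection conn = java.sql.DriverManager.getConnection(
        "jdbc:hive2://host:10000/default;principal=hive/_HOST@EXAMPLE.COM");
  }
}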

On Mon, Jul 4, 2016 at 3:47 PM, Maria  wrote:

> The question "kinit: Ticket expired while renewing credentials" has
> been solved. I can successfully execute "kinit -R",
> but the error “java.lang.RuntimeException:
> org.apache.thrift.transport.TTransportException: Peer indicated failure:
> GSS initiate failed”
> is still there..
>
> At 2016-07-04 14:39:04, "Maria"  wrote:
> >I saw a mail named "HCatalog Security". His or her problem was similar to
> mine, and the reply was:
> >"This issue goes away after doing a kinit -R".
> >
> >So I did the same operation, but it failed:
> >kinit: Ticket expired while renewing credentials
> >
> >But in my /etc/krb5.conf, I have configured this item:
> >renew_lifetime=7d
> >
> >So, can anybody give me some suggestions, please? Thank you.
> >
> >At 2016-07-04 11:32:30, "Maria"  wrote:
> >>
> >>
> >>And I can successfully access hiveserver2 from beeline.
> >>
> >>
> >>I was so confused by this error "Peer indicated failure: GSS initiate
> failed".
> >>
> >> Can anybody please help me? Any reply will be much appreciated.
> >>
> >>At 2016-07-04 11:26:53, "Maria"  wrote:
> >>>Yup, my hiveserver2 log errors are:
> >>>
> >>>ERROR [Hiveserver2-Handler-Pool:
> Thread-48]:server.TThreadPoolServer(TThreadPoolServer.java:run(296)) -
> error occurred during processing of message.
> >>>java.lang.RuntimeException:
> org.apache.thrift.transport.TTransportException: Peer indicated failure:
> GSS initiate failed
> >>>at
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
> >>>at
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
> >>>at
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
> >>>at java.security.AccessController.doPrivileged(Native Method)
> >>>at javax.security.auth.Subject.doAs(Subject.java:356)
> >>>at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1608)
> >>>at
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
> >>>at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
> >>>at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>at java.lang.Thread.run(Thread.java:745)
> >>>Caused by: org.apache.thrift.transport.TTransportException: Peer
> indicated failure: GSS initiate failed
> >>>at
> org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
> >>>at
> org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
> >>>at
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
> >>>at
> org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> >>>at
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> >>> ... 10 more
> >>>
> >>>So the Windows Hive JDBC client can communicate with the
> hiveserver2, can't it?
> >>>
> >>>I checked everything I could:
> >>>(1) On the hiveserver2 node, I execute the command "klist"; the results are:
> >>>Ticket cache: FILE:/tmp/krb5cc_0
> >>>Default principal: hive/h...@hadoop.com
> >>>
> >>>Valid starting       Expires              Service principal
> >>>07/04/16 10:28:14    07/05/16 10:28:14    krbtgt/hadoop@hadoop.com
> >>>        renew until 07/04/16 10:28:14
> >>>(2) In the Windows DOS cmd, I execute the command "klist"; the results are:
> >>>Ticket cache: API: 1
> >>>Default principal: hive/h...@hadoop.com
> >>>
> >>>Valid starting       Expires              Service principal
> >>>07/04/16 10:24:32    07/05/16 10:24:32    krbtgt/hadoop@hadoop.com
> >>>        renew until 07/04/16 10:24:32
> >>>
> >>> Is there anything else I have to add or set for hiveserver2?
> >>>
> >>>Thanks in advance.
> >>>
> >>>
> >>>Maria.
> >>>
> >>>At 2016-07-03 04:39:31, "Vivek Shrivastava" 
> wrote:
> >>>
> >>>
> >>>Please look at the hiveserver2 log; it will have better error
> information. You can paste the error from the logs if you need help.
> >>>
> >>>
> >>>Regards,
> >>>
> >>>
> >>>Vivek
> >>>
> >>>
> >>>On Sat, Jul 2, 2016 at 5:52 AM, Maria  wrote:
> >>>
> >>>
> >>>
> >>>Hi, all:
> >>>
> >>> Recently, I attempted to access a Kerberized Hadoop cluster by
> launching Java applications from Windows workstations. And I 

Re: Optimized Hive query

2016-06-15 Thread Aviral Agarwal
I am OK with digging down to the ASTBuilder class. Can you guys point me to
the right class?

Meanwhile, "explain (rewrite | logical | extended)" are all unable to
flatten even a basic query of the form:

select * from ( select * from ( select c from d) alias_1 ) alias_2

into

select c from d
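
In case it is useful to others, the raw AST that such a rewrite would operate
on can be dumped straight from the parser. A minimal sketch, assuming
hive-exec is on the classpath (ParseDriver/ASTNode as in Hive 1.x/2.x):

import org.apache.hadoop.hive.ql.parse.ASTNode;
import org.apache.hadoop.hive.ql.parse.ParseDriver;

public class AstDump {
  public static void main(String[] args) throws Exception {
    ParseDriver pd = new ParseDriver();
    ASTNode ast = pd.parse(
        "select * from ( select * from ( select c from d) alias_1 ) alias_2");
    // The nested TOK_SUBQUERY nodes in this dump are what a flattening
    // rewrite would need to collapse.
    System.out.println(ast.dump());
  }
}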

Thanks,
Aviral Agarwal

On Wed, Jun 15, 2016 at 6:24 AM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:

>
> > So I was hoping to use the internal Hive CBO to change the AST
> >generated for the query.
>
> Hive does have an "explain rewrite" but that prints out the query before
> CBO runs.
>
> For CBO, you need to dig all the way down to the ASTBuilder class and work
> upwards from there.
>
> Perhaps add it as an "explain optimized" (there exists "explain logical",
> "explain extended" and 2 versions of regular "explain").
>
> Cheers,
> Gopal
>
>
>


Re: Optimized Hive query

2016-06-14 Thread Aviral Agarwal
Hi,
Thanks for the replies.
I already knew that the optimizer does that.
My use case is a bit different, though:
I want to display the flattened query back to the user.
So I was hoping to use the internal Hive CBO to change the AST
generated for the query.

Thanks,
Aviral

On Tue, Jun 14, 2016 at 12:42 PM, Gopal Vijayaraghavan 
wrote:

>
> > You can see that you get identical execution plans for the nested query
> >and the flattened one.
>
> It wasn't always that way, though. Back when I started with Hive, before
> Stinger, it didn't have the identity project remover.
>
> To know if your version has this fix, try looking at
>
> hive> set hive.optimize.remove.identity.project;
>
>
> Cheers,
> Gopal
>
>
>
>
>


Re: Optimized Hive query

2016-06-14 Thread Aviral Agarwal
Yes, I want to flatten the query.
Also, the INSERT code is correct.

Thanks,
Aviral Agarwal


On Tue, Jun 14, 2016 at 3:46 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> You want to flatten the query, I understand.
>
> create temporary table tmp as select c from d;
>
> INSERT INTO TABLE a
> SELECT c from tmp where
> condition
>
> Is the INSERT code correct?
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 13 June 2016 at 17:55, Aviral Agarwal <aviral12...@gmail.com> wrote:
>
>> Hi,
> >> I would like to know if there is a way to convert nested Hive sub-queries
> >> into optimized queries.
>>
>> For example :
>> INSERT INTO TABLE a.b SELECT * FROM ( SELECT c FROM d)
>>
>> into
>>
>> INSERT INTO TABLE a.b SELECT c FROM D
>>
> >> This is a simple example, but the solution should apply if there were
> >> deeper nesting levels present.
>>
>> Thanks,
>> Aviral Agarwal
>>
>>
>


Optimized Hive query

2016-06-13 Thread Aviral Agarwal
Hi,
I would like to know if there is a way to convert nested Hive sub-queries
into optimized queries.

For example:
INSERT INTO TABLE a.b SELECT * FROM ( SELECT c FROM d)

into

INSERT INTO TABLE a.b SELECT c FROM D

This is a simple example, but the solution should apply if there were deeper
nesting levels present.

Thanks,
Aviral Agarwal