Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12824/
---

Review request for hive.


Bugs: HIVE-4911
https://issues.apache.org/jira/browse/HIVE-4911


Repository: hive-git


Description
---

The QoP for hive server 2 should be configurable to enable encryption. A new 
configuration, "hive.server2.thrift.rpc.protection", should be exposed. This 
would give greater control when configuring the hive server 2 service.
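
For illustration only (not part of this patch): a protection setting like this
typically ends up as a SASL QOP property on the Thrift SASL transport. The sketch
below maps the hadoop.rpc.protection vocabulary (authentication/integrity/privacy)
to the standard javax.security.sasl QOP values; the class and method names are
made up for the example.

{code:java}
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

public class QopExample {
    // Translate a configured protection level into the SASL property map that a
    // Thrift SASL transport (client or server) would be created with.
    static Map<String, String> saslPropsFor(String protection) {
        Map<String, String> props = new HashMap<String, String>();
        String qop;
        if ("authentication".equals(protection)) {
            qop = "auth";            // authentication only
        } else if ("integrity".equals(protection)) {
            qop = "auth-int";        // authentication plus integrity checking
        } else if ("privacy".equals(protection)) {
            qop = "auth-conf";       // authentication, integrity and encryption
        } else {
            throw new IllegalArgumentException("Unknown protection level: " + protection);
        }
        props.put(Sasl.QOP, qop);                 // "javax.security.sasl.qop"
        props.put(Sasl.SERVER_AUTH, "true");      // require mutual authentication
        return props;
    }

    public static void main(String[] args) {
        // Prints a map containing javax.security.sasl.qop=auth-conf
        System.out.println(saslPropsFor("privacy"));
    }
}
{code}

Only auth-conf actually turns on wire encryption; auth-int adds integrity checking
on top of authentication.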


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
11c31216495d0c4e454f2627af5c93a9f270b1fe 
  data/conf/hive-site.xml 4e6ff16135833da1a4df12a12a6fe59ad4f870ba 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 
00f43511b478c687b7811fc8ad66af2b507a3626 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 
1809e1b26ceee5de14a354a0e499aa8c0ab793bf 
  service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 
379dafb8377aed55e74f0ae18407996bb9e1216f 
  service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION 
  
shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 777226f8da0af2235d4294cd6a676fa8192c89e4 
  
shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 
9b0ec0a75563b41339e6fc747556440fdf83e31e 

Diff: https://reviews.apache.org/r/12824/diff/


Testing
---


Thanks,

Arup Malakar



Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-07-24 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12824/
---

(Updated July 24, 2013, 4:43 p.m.)


Review request for hive.


Changes
---

Thank you, Thejas, for the review. I have incorporated most of the suggestions except 
the HIVE_AUTH_TYPE comment. Let me know what you think would be the best 
approach given that HIVE-4232 is not committed.


Bugs: HIVE-4911
https://issues.apache.org/jira/browse/HIVE-4911


Repository: hive-git


Description
---

The QoP for hive server 2 should be configurable to enable encryption. A new 
configuration, "hive.server2.thrift.rpc.protection", should be exposed. This 
would give greater control when configuring the hive server 2 service.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
11c31216495d0c4e454f2627af5c93a9f270b1fe 
  conf/hive-default.xml.template 603b475802152a4bd5ab92a4c7146b56f6be020d 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 
00f43511b478c687b7811fc8ad66af2b507a3626 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
72eac989394a388e52d3845b02bb38ebeaad 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
cef50f40ccb047a8135f704b2997968a2cf477b8 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
88151a1d48b12cf3a8346ae94b6d1a182a331992 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 
1809e1b26ceee5de14a354a0e499aa8c0ab793bf 
  service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 
379dafb8377aed55e74f0ae18407996bb9e1216f 
  service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION 
  
shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 777226f8da0af2235d4294cd6a676fa8192c89e4 
  
shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 
9b0ec0a75563b41339e6fc747556440fdf83e31e 

Diff: https://reviews.apache.org/r/12824/diff/


Testing
---


Thanks,

Arup Malakar



Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-07-24 Thread Arup Malakar


> On July 23, 2013, 9:54 p.m., Thejas Nair wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 728
> > <https://reviews.apache.org/r/12824/diff/1/?file=324967#file324967line728>
> >
> > should we just call this 
> > hive.server2.thrift.sasl.qop? That seems more self-describing.
> >

I derived the name from "hadoop.rpc.protection", but I totally agree that 
hadoop.rpc.protection itself is a bit of a misnomer. Using sasl.qop is 
self-describing, and people can easily relate it to auth, auth-int, auth-conf, 
etc.
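
The SaslQOP.java file in the diff is listed as PRE-CREATION, so its contents are
not visible here. Purely as an illustration of the mapping being discussed, an enum
like the following could translate the user-facing strings into the SASL QOP values
(a sketch, not the actual file in the patch):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: maps the user-facing strings to the SASL QOP wire values.
public enum SaslQOP {
    AUTH("auth"),           // authentication only
    AUTH_INT("auth-int"),   // authentication plus integrity protection
    AUTH_CONF("auth-conf"); // authentication, integrity and confidentiality (encryption)

    public final String saslQop;

    private static final Map<String, SaslQOP> STR_TO_ENUM = new HashMap<String, SaslQOP>();
    static {
        for (SaslQOP q : values()) {
            STR_TO_ENUM.put(q.saslQop, q);
        }
    }

    SaslQOP(String saslQop) {
        this.saslQop = saslQop;
    }

    // Parse a configured value such as "auth-conf" into the enum constant.
    public static SaslQOP fromString(String s) {
        SaslQOP q = STR_TO_ENUM.get(s.toLowerCase());
        if (q == null) {
            throw new IllegalArgumentException(
                "Unknown SASL QOP value: " + s + " (expected auth, auth-int or auth-conf)");
        }
        return q;
    }

    @Override
    public String toString() {
        return saslQop;
    }
}
{code}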


- Arup


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12824/#review23722
-------


On July 24, 2013, 4:43 p.m., Arup Malakar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12824/
> ---
> 
> (Updated July 24, 2013, 4:43 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-4911
> https://issues.apache.org/jira/browse/HIVE-4911
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration, "hive.server2.thrift.rpc.protection", should be exposed. This 
> would give greater control when configuring the hive server 2 service.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 11c31216495d0c4e454f2627af5c93a9f270b1fe 
>   conf/hive-default.xml.template 603b475802152a4bd5ab92a4c7146b56f6be020d 
>   jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 
> 00f43511b478c687b7811fc8ad66af2b507a3626 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
> 72eac989394a388e52d3845b02bb38ebeaad 
>   
> metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
> cef50f40ccb047a8135f704b2997968a2cf477b8 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 88151a1d48b12cf3a8346ae94b6d1a182a331992 
>   service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 
> 1809e1b26ceee5de14a354a0e499aa8c0ab793bf 
>   service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 
> 379dafb8377aed55e74f0ae18407996bb9e1216f 
>   service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION 
>   
> shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
>  777226f8da0af2235d4294cd6a676fa8192c89e4 
>   
> shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
>  9b0ec0a75563b41339e6fc747556440fdf83e31e 
> 
> Diff: https://reviews.apache.org/r/12824/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arup Malakar
> 
>



Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-07-24 Thread Arup Malakar


> On July 23, 2013, 9:48 p.m., Thejas Nair wrote:
> > jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java, line 142
> > <https://reviews.apache.org/r/12824/diff/1/?file=324969#file324969line142>
> >
> > the HIVE_AUTH_TYPE env variable is called "auth".
> > Should we use something more descriptive like "sasl.qop" as the 
> > variable that sets the QOP level.
> >
> 
> Arup Malakar wrote:
> I totally agree that a different key name should be used for the qop 
> setting, as the current HIVE_AUTH_TYPE configuration key is overloaded. 
> The original idea was to clean up the configuration keys, which is being taken 
> care of in https://issues.apache.org/jira/browse/HIVE-4232. Once the auth 
> params are taken care of, I had planned to introduce a new parameter called 
> qop that would be used to configure the QoP alone. But since HIVE-4232 is 
> not yet committed, I ended up using HIVE_AUTH_TYPE. I can rebase if 
> HIVE-4232 goes in.

I totally agree that a different key name should be used for the qop setting, 
as the current HIVE_AUTH_TYPE configuration key is overloaded. The original idea 
was to clean up the configuration keys, which is being taken care of in 
https://issues.apache.org/jira/browse/HIVE-4232. Once the auth params are taken 
care of, I had planned to introduce a new parameter called qop that would be 
used to configure the QoP alone. But since HIVE-4232 is not yet committed, I 
ended up using HIVE_AUTH_TYPE. I can rebase if HIVE-4232 goes in.
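
For context, assuming the sasl.qop name discussed above were adopted as a JDBC URL
session setting (the exact parameter name and mechanism were still open at this
point), a client connection could look roughly like this; host, port, database and
principal are placeholders:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;

public class QopJdbcExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Hypothetical URL: the Kerberos principal plus sasl.qop=auth-conf would
        // request an encrypted Thrift transport for the session.
        String url = "jdbc:hive2://hs2.example.com:10000/default;"
            + "principal=hive/_HOST@EXAMPLE.COM;sasl.qop=auth-conf";
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected with QOP auth-conf");
        }
    }
}
{code}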



> On July 23, 2013, 9:48 p.m., Thejas Nair wrote:
> > shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java,
> >  line 111
> > <https://reviews.apache.org/r/12824/diff/1/?file=324973#file324973line111>
> >
> > This function is called from hive metastore client. Using 
> > SaslRpcServer.SASL_PROPS here means that setting hadoop.rpc.protection will 
> > determine the QOP level, if we make a call to SaslRpcServer.init(conf) from 
> > anywhere in the code. But that function is not being called.
> > 
> > I think it makes sense to use hadoop.rpc.protection for the metastore QOP, 
> > since the metastore is usually not exposed 'outside' the cluster, unlike hive 
> > server2. It is often viewed as something 'inside the cluster'.
> > 
> > Should we change this function to take in a configuration object and 
> > use that to call SaslRpcServer.init(conf) ?

The current createClientTransport method (without this patch) uses 
SaslRpcServer.SASL_PROPS too, but it doesn't call SaslRpcServer.init(conf), so I 
assumed SaslRpcServer.init(conf) was being called before reaching this method. 
But looking at https://issues.apache.org/jira/browse/HIVE-4232 I realized that 
this is indeed a bug in the current code.

Rather than doing init() in createTransportFactory() and 
createClientTransport(), I removed the default method that uses 
SaslRpcServer.SASL_PROPS. Both these methods now only take a Map of SASL 
properties. For both the metastore client and server, the code gets the SASL 
properties from MetaStoreUtils.getMetaStoreSaslProperties(conf) and passes them to 
the methods in HadoopThriftAuthBridge20S. 
Reasons:
1. We could remove the redundant
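
As a rough illustration of the flow described above (building the SASL properties
from configuration and handing them to the bridge methods), the helper below
sketches what a getMetaStoreSaslProperties-style method might do. The
hadoop.rpc.protection mapping and the exact signatures are assumptions for the
sketch, not the actual patch:

{code:java}
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;
import org.apache.hadoop.conf.Configuration;

public class MetastoreSaslPropsSketch {
    // Read hadoop.rpc.protection and turn it into the SASL property map that is
    // then handed to the auth-bridge transport factory / client transport methods.
    static Map<String, String> metaStoreSaslProperties(Configuration conf) {
        String protection = conf.get("hadoop.rpc.protection", "authentication");
        Map<String, String> props = new HashMap<String, String>();
        if ("privacy".equals(protection)) {
            props.put(Sasl.QOP, "auth-conf");   // encryption + integrity
        } else if ("integrity".equals(protection)) {
            props.put(Sasl.QOP, "auth-int");    // integrity only
        } else {
            props.put(Sasl.QOP, "auth");        // authentication only
        }
        props.put(Sasl.SERVER_AUTH, "true");
        return props;
    }
}
{code}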

Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-08-02 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12824/
---

(Updated Aug. 2, 2013, 10:51 p.m.)


Review request for hive.


Changes
---

1. Incorporated sasl.qop renaming of param
2. Moved getHadoopSaslProperties to HadoopThriftAuthBridge


Bugs: HIVE-4911
https://issues.apache.org/jira/browse/HIVE-4911


Repository: hive-git


Description
---

The QoP for hive server 2 should be configurable to enable encryption. A new 
configuration, "hive.server2.thrift.rpc.protection", should be exposed. This 
would give greater control when configuring the hive server 2 service.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
11c31216495d0c4e454f2627af5c93a9f270b1fe 
  conf/hive-default.xml.template 603b475802152a4bd5ab92a4c7146b56f6be020d 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 
00f43511b478c687b7811fc8ad66af2b507a3626 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
72eac989394a388e52d3845b02bb38ebeaad 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
cef50f40ccb047a8135f704b2997968a2cf477b8 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
88151a1d48b12cf3a8346ae94b6d1a182a331992 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 
1809e1b26ceee5de14a354a0e499aa8c0ab793bf 
  service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 
379dafb8377aed55e74f0ae18407996bb9e1216f 
  service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION 
  
shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 777226f8da0af2235d4294cd6a676fa8192c89e4 
  
shims/src/common-secure/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 172e03115372dc2c742469cbc5f0fefd1053163d 
  
shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 
9b0ec0a75563b41339e6fc747556440fdf83e31e 

Diff: https://reviews.apache.org/r/12824/diff/


Testing
---


Thanks,

Arup Malakar



Re: Review Request 12824: [HIVE-4911] Enable QOP configuration for Hive Server 2 thrift transport

2013-08-05 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12824/
---

(Updated Aug. 5, 2013, 6:54 p.m.)


Review request for hive.


Changes
---

Rebased.


Bugs: HIVE-4911
https://issues.apache.org/jira/browse/HIVE-4911


Repository: hive-git


Description
---

The QoP for hive server 2 should be configurable to enable encryption. A new 
configuration, "hive.server2.thrift.rpc.protection", should be exposed. This 
would give greater control when configuring the hive server 2 service.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
555343ebffb9dcd5e58d5b99ce9ca52904f68ecf 
  conf/hive-default.xml.template f01e715e4de95b4011210143f7d3add2d8a4d432 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java 
00f43511b478c687b7811fc8ad66af2b507a3626 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
cde58c25991641573453217da71a7ac1acf6adfd 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 
cef50f40ccb047a8135f704b2997968a2cf477b8 
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
88151a1d48b12cf3a8346ae94b6d1a182a331992 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 
1809e1b26ceee5de14a354a0e499aa8c0ab793bf 
  service/src/java/org/apache/hive/service/auth/KerberosSaslHelper.java 
379dafb8377aed55e74f0ae18407996bb9e1216f 
  service/src/java/org/apache/hive/service/auth/SaslQOP.java PRE-CREATION 
  
shims/src/common-secure/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1df6993cb9aac1bb195667b3123faee27d657c0a 
  
shims/src/common-secure/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 3e850ec3991cbb2d4343969ba8fe9df4a7d137b5 
  
shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 
ab7f5c0eb5345e68e3f223c9dfed8414de946661 

Diff: https://reviews.apache.org/r/12824/diff/


Testing
---


Thanks,

Arup Malakar



Review Request: Enforce minimum ant version required in build script

2013-05-09 Thread Arup Malakar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11031/
---

Review request for hive.


Description
---

Enforce minimum ant version required in build script


This addresses bug HIVE-4530.
https://issues.apache.org/jira/browse/HIVE-4530


Diffs
-

  build.xml f1a03df157e889e732f948f83f4c1dc0812146ef 

Diff: https://reviews.apache.org/r/11031/diff/


Testing
---


Thanks,

Arup Malakar



HiveServer 2 encryption performance (takes 2.3x more time)

2013-08-19 Thread Arup Malakar
Hi,

With HIVE-4911 [1], hive server 2 now supports encryption for the thrift
transport. The quality of protection (QoP) can be set
in hive-site.xml to any of auth, auth-int and auth-conf. Of these,
auth-conf enables both encryption and an integrity check.
In my testing I have observed that with auth-conf the amount of time taken
to transfer data is 2.3 times the time it takes
without encryption. In my test I have a table of size 1GB, and I did
"select *" on the table using the JDBC driver once with
encryption and once without encryption.

No encryption: ~9 minutes
Encryption:  ~20 minutes
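
For reference, a minimal sketch of the kind of measurement described above (URL,
principal and table name are placeholders; the run is repeated with and without
sasl.qop=auth-conf in the URL):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ScanTiming {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Placeholder connection string; drop the sasl.qop part for the unencrypted run.
        String url = "jdbc:hive2://hs2.example.com:10000/default;"
            + "principal=hive/_HOST@EXAMPLE.COM;sasl.qop=auth-conf";
        long start = System.currentTimeMillis();
        long rows = 0;
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM test_table_1gb")) {
            while (rs.next()) {
                rows++;   // drain the full result set so all data crosses the wire
            }
        }
        System.out.println(rows + " rows in " + (System.currentTimeMillis() - start) + " ms");
    }
}
{code}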

I was wondering if anyone has experience with SASL encryption and whether it is
possible to tune any JVM/SASL settings to bring down this time.
I am also interested in understanding whether it is advisable to use a different
crypto provider than the default one that ships with the JDK.
If this much overhead is to be expected with encryption, I would
like to know that too. I am using a patched version of hive-10 with Hive
Server 2
on hadoop 23/JDK 1.7.

1. https://issues.apache.org/jira/browse/HIVE-4911

Thanks,
Arup Malakar


[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2014-10-01 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155957#comment-14155957
 ] 

Arup Malakar commented on HIVE-7787:


I tried building hive from trunk and running it, but I am seeing the same 
error:

{code}
Caused by: java.lang.NoSuchFieldError: DECIMAL
at 
org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:168)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:35)
at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:152)
at 
parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
at 
parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
at 
parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:71)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
... 16 more
{code}


> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Priority: Minor
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(Data

[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2014-10-02 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157146#comment-14157146
 ] 

Arup Malakar commented on HIVE-7787:


The exception in the above comment was due to the fact that the hadoop cluster 
I was running on had an older version of parquet on it. 
I did the following and got rid of the error: {{SET 
mapreduce.job.user.classpath.first=true}}

But I hit another issue:
{code}
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:300)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:247)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:371)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:652)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:168)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:409)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:286)
... 11 more
Caused by: java.lang.IllegalStateException: Field count must be either 1 or 2: 3
at 
org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.<init>(ArrayWritableGroupConverter.java:38)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:47)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:64)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.<init>(DataWritableGroupConverter.java:40)
at 
org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.<init>(DataWritableRecordConverter.java:35)
at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:152)
at 
parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
at 
parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
at 
parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:92)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:71)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
{code}

It appears that ArrayWritableGroupConverter allows either 1 or 2 elements in 
the structure for an array. Is there a reason for that?

Is the following schema not supported?
{code}
enum MyEnumType {
EnumOne,
EnumTwo,
EnumThree
}
struct MyStruct {
1: optional MyEnumType myEnumType;
2: optional string field2;
3: optional string field3;
}

struct outerStruct {
1: optional list<MyStruct> myStructs
}
{code}

I can file another JIRA for this issue.

> Reading 

[jira] [Updated] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2014-10-03 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-7787:
---
Attachment: HIVE-7787.trunk.1.patch

Looks like {{ArrayWritableGroupConverter}} enforces that the struct should have 
either 1 or 2 elements. I am not sure of the rationale behind this, since a struct 
may have more than two elements. I did a quick patch to omit the check and 
handle any number of fields. I have tested it and it seems to be working for me 
for the schema in the description. Given there were explicit checks for the 
field count to be either 1 or 2, I am not sure if this is the right approach. 
Please take a look.

> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2014-10-03 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-7787:
---
Fix Version/s: 0.14.0
 Assignee: Arup Malakar
   Status: Patch Available  (was: Open)

> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.13.1, 0.13.0, 0.12.0, 0.12.1, 0.14.0
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2014-12-09 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240062#comment-14240062
 ] 

Arup Malakar commented on HIVE-7787:


[~rdblue] I haven't tried trunk yet; I am using the patch I submitted here. But 
let's close this issue. I will reopen it if I happen to see the issue after trying 
hive trunk/0.15. 

> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320356#comment-14320356
 ] 

Arup Malakar commented on HIVE-7787:


I tried release 1.0 and still have the same problem, so I am going to reopen the 
JIRA. I will resubmit the patch when I get time.

> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-7787) Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError

2015-02-13 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar reopened HIVE-7787:


> Reading Parquet file with enum in Thrift Encoding throws NoSuchFieldError
> -
>
> Key: HIVE-7787
> URL: https://issues.apache.org/jira/browse/HIVE-7787
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema, Thrift API
>Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1
> Environment: Hive 0.12 CDH 5.1.0, Hadoop 2.3.0 CDH 5.1.0
>Reporter: Raymond Lau
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-7787.trunk.1.patch
>
>
> When reading Parquet file, where the original Thrift schema contains a struct 
> with an enum, this causes the following error (full stack trace blow): 
> {code}
>  java.lang.NoSuchFieldError: DECIMAL.
> {code} 
> Example Thrift Schema:
> {code}
> enum MyEnumType {
> EnumOne,
> EnumTwo,
> EnumThree
> }
> struct MyStruct {
> 1: optional MyEnumType myEnumType;
> 2: optional string field2;
> 3: optional string field3;
> }
> struct outerStruct {
> 1: optional list myStructs
> }
> {code}
> Hive Table:
> {code}
> CREATE EXTERNAL TABLE mytable (
>   mystructs array>
> )
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
> INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
> OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
> ; 
> {code}
> Error Stack trace:
> {code}
> Java stack trace for Hive 0.12:
> Caused by: java.lang.NoSuchFieldError: DECIMAL
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter.getNewConverter(ETypeConverter.java:146)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:31)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.ArrayWritableGroupConverter.(ArrayWritableGroupConverter.java:45)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:34)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:47)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.HiveGroupConverter.getConverterFromDescription(HiveGroupConverter.java:36)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:64)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableGroupConverter.(DataWritableGroupConverter.java:40)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.convert.DataWritableRecordConverter.(DataWritableRecordConverter.java:32)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.prepareForRead(DataWritableReadSupport.java:128)
>   at 
> parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:142)
>   at 
> parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:118)
>   at 
> parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:107)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:92)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.(ParquetRecordReaderWrapper.java:66)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:65)
>   ... 16 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-9960) Hive not backward compatible while adding optional new field to struct in parquet files

2015-03-13 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-9960:
--

 Summary: Hive not backward compatible while adding optional new 
field to struct in parquet files
 Key: HIVE-9960
 URL: https://issues.apache.org/jira/browse/HIVE-9960
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Arup Malakar


I recently added an optional field to a struct. When I tried to query old data 
with the new hive table, which has the new field as a column, it throws an error. Any 
clue how I can make it backward compatible so that I am still able to query old 
data with the new table definition?
 
I am using the hive-0.14.0 release with the HIVE-8909 patch applied.

Details:

New optional field in a struct
{code}
struct Event {
1: optional Type type;
2: optional map values;
3: optional i32 num = -1; // <--- New field
}
{code}

Main thrift definition
{code}
 10: optional list<Event> events;
{code}

Corresponding hive table definition
{code}
  events array< struct , num: int>>)
{code}

Try to read something from the old data, using the new table definition
{{select events from table1 limit 1;}}

Failed with exception:
{code}
java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 2

Error thrown:

15/03/12 17:23:43 [main]: ERROR CliDriver: Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 2
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:152)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1621)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:267)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOper

[jira] [Commented] (HIVE-4471) Build fails with hcatalog checkstyle error

2013-05-20 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13662472#comment-13662472
 ] 

Arup Malakar commented on HIVE-4471:


[~ashutoshc]/[~traviscrawford] I see that the checkstyle entry has many 
directories/file types to exclude. Instead of excluding them, could we 
change it to include only **/*.java? 

> Build fails with hcatalog checkstyle error
> --
>
> Key: HIVE-4471
> URL: https://issues.apache.org/jira/browse/HIVE-4471
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: 0.12.0
>
> Attachments: HIVE-4471.1.patch, HIVE-4471.2.patch
>
>
> This is the output:
> checkstyle:
>  [echo] hcatalog
> [checkstyle] Running Checkstyle 5.5 on 412 files
> [checkstyle] 
> /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/src/test/.gitignore:1:
>  Missing a header - not enough lines in file.
> BUILD FAILED
> /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build.xml:296: 
> The following error occurred while executing this line:
> /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build.xml:298: 
> The following error occurred while executing this line:
> /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/build.xml:109:
>  The following error occurred while executing this line:
> /home/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/hcatalog/build-support/ant/checkstyle.xml:32:
>  Got 1 errors and 0 warnings.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4835) Methods in Metrics class could avoid throwing IOException

2013-07-09 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-4835:
--

 Summary: Methods in Metrics class could avoid throwing IOException
 Key: HIVE-4835
 URL: https://issues.apache.org/jira/browse/HIVE-4835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Arup Malakar
Priority: Minor


I see that most of the methods in the Metrics class throw exceptions:

{code:java}
public void resetMetrics() throws IOException {
public void open() throws IOException {
public void close() throws IOException {
public void reopen() throws IOException {
public static void init() throws Exception {
public static Long incrementCounter(String name) throws IOException{
public static Long incrementCounter(String name, long increment) throws IOException{
public static void set(String name, Object value) throws IOException{
public static Object get(String name) throws IOException{
public static void initializeScope(String name) throws IOException {
public static MetricsScope startScope(String name) throws IOException{
public static MetricsScope getScope(String name) throws IOException {
public static void endScope(String name) throws IOException{
{code}

I believe Metrics should be best effort, and the Metrics system should just log 
error messages in case it is unable to capture the metrics. Throwing exceptions 
makes the caller code unnecessarily lengthy. Also, the caller would never want 
to stop execution because of a failure to capture metrics, so it ends up just 
logging the exception. 

The kind of code we see is like:
{code:java}
  // Snippet from HiveMetaStore.java
  try {
Metrics.startScope(function);
  } catch (IOException e) {
LOG.debug("Exception when starting metrics scope"
+ e.getClass().getName() + " " + e.getMessage());
MetaStoreUtils.printStackTrace(e);
  }
{code} 

which could have been:
{code:java}
Metrics.startScope(function);
{code}

Thoughts?
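
One possible direction, sketched here purely for illustration (not from the JIRA):
a thin best-effort wrapper that logs and swallows the IOException so call sites
stay one line. The Metrics class, its package, and the method signatures are
assumed from the snippet above:

{code:java}
import java.io.IOException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hive.common.metrics.Metrics;

// Hypothetical best-effort wrapper: failures to record metrics are logged at
// debug level and otherwise ignored, so callers need no try/catch blocks.
public final class BestEffortMetrics {
    private static final Log LOG = LogFactory.getLog(BestEffortMetrics.class);

    private BestEffortMetrics() {}

    public static void startScope(String function) {
        try {
            Metrics.startScope(function);
        } catch (IOException e) {
            LOG.debug("Unable to start metrics scope " + function, e);
        }
    }

    public static void endScope(String function) {
        try {
            Metrics.endScope(function);
        } catch (IOException e) {
            LOG.debug("Unable to end metrics scope " + function, e);
        }
    }
}
{code}

With such a wrapper, the HiveMetaStore snippet above would shrink to a single
BestEffortMetrics.startScope(function) call.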

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4835) Methods in Metrics class could avoid throwing IOException

2013-07-09 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703975#comment-13703975
 ] 

Arup Malakar commented on HIVE-4835:


[~thiruvel] Makes sense, that way the caller could handle failure if it prefers 
to, but is not forced to.

> Methods in Metrics class could avoid throwing IOException
> -
>
> Key: HIVE-4835
> URL: https://issues.apache.org/jira/browse/HIVE-4835
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>    Reporter: Arup Malakar
>Priority: Minor
>
> I see that most of the methods in the Metrics class throws exception:
> {code:java}
> public void resetMetrics() throws IOException {
> public void open() throws IOException {
> public void close() throws IOException {
> public void reopen() throws IOException {
> public static void init() throws Exception {
> public static Long incrementCounter(String name) throws IOException{
> public static Long incrementCounter(String name, long increment) throws 
> IOException{
> public static void set(String name, Object value) throws IOException{
> public static Object get(String name) throws IOException{
> public static void initializeScope(String name) throws IOException {
> public static MetricsScope startScope(String name) throws IOException{
> public static MetricsScope getScope(String name) throws IOException {
> public static void endScope(String name) throws IOException{
> {code}
> I believe Metrics should be best effort and the Metrics system should just 
> log error messages in case it is unable to capture the Metrics. Throwing 
> exception makes the caller code unnecessarily lengthy. Also the caller would 
> never want to stop execution because of failure to capture metrics, so it 
> ends up just logging the exception. 
> The kind of code we see is like:
> {code:java}
>   // Snippet from HiveMetaStore.java
>   try {
> Metrics.startScope(function);
>   } catch (IOException e) {
> LOG.debug("Exception when starting metrics scope"
> + e.getClass().getName() + " " + e.getMessage());
> MetaStoreUtils.printStackTrace(e);
>   }
> {code} 
> which could have been:
> {code:java}
> Metrics.startScope(function);
> {code}
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Issue Type: New Feature  (was: Bug)

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>        Reporter: Arup Malakar
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-4911:
--

 Summary: Enable QOP configuration for Hive Server 2 thrift 
transport
 Key: HIVE-4911
 URL: https://issues.apache.org/jira/browse/HIVE-4911
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar


The QoP for hive server 2 should be configurable to enable encryption. A new 
configuration should be exposed "hive.server2.thrift.rpc.protection". This 
would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar reassigned HIVE-4911:
--

Assignee: Arup Malakar

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>        Reporter: Arup Malakar
>    Assignee: Arup Malakar
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Status: Patch Available  (was: Open)

Review: https://reviews.apache.org/r/12824/

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>        Reporter: Arup Malakar
>    Assignee: Arup Malakar
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Attachment: HIVE-4911-trunk-0.patch

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>        Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: HIVE-4911-trunk-0.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-22 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715798#comment-13715798
 ] 

Arup Malakar commented on HIVE-4911:


[~brocknoland], HIVE-4225 proposes a way to configure QoP for the Hive Server 2 
thrift service. But it uses the  {{SaslRpcServer.SaslRpcServer}} object to 
determine what QoP to use. {{SaslRpcServer.SaslRpcServer}}  reads this 
configuration from the parameter {{hadoop.rpc.protection}}, as can be seen in: 
https://svn.apache.org/repos/asf/hadoop/common/branches/HADOOP-6685/src/java/org/apache/hadoop/security/SaslRpcServer.java

{code:java}
  public static void init(Configuration conf) {
QualityOfProtection saslQOP = QualityOfProtection.AUTHENTICATION;
String rpcProtection = conf.get("hadoop.rpc.protection",
QualityOfProtection.AUTHENTICATION.name().toLowerCase());
if (QualityOfProtection.INTEGRITY.name().toLowerCase()
.equals(rpcProtection)) {
  saslQOP = QualityOfProtection.INTEGRITY;
} else if (QualityOfProtection.PRIVACY.name().toLowerCase().equals(
rpcProtection)) {
  saslQOP = QualityOfProtection.PRIVACY;
}

SASL_PROPS.put(Sasl.QOP, saslQOP.getSaslQop());
SASL_PROPS.put(Sasl.SERVER_AUTH, "true");
  }
{code}

I believe the {{hadoop.rpc.protection}} configuration shouldn't dictate what QoP 
hive server 2 uses. The QoP of Hive Server 2 should instead be exposed via 
a new Hive Server 2 specific setting, so that either can change independently of 
the other.
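
To illustrate the separation (sketch only, not the committed patch): HiveServer2 would read its own setting and build the SASL properties itself, so hadoop.rpc.protection and the HS2 QoP can be configured independently.

{code:java}
import java.util.HashMap;
import java.util.Map;

import javax.security.sasl.Sasl;

import org.apache.hadoop.conf.Configuration;

public class Hs2SaslProps {
  // Build SASL properties from the HiveServer2-specific setting instead of
  // hadoop.rpc.protection. Valid values: auth, auth-int, auth-conf.
  public static Map<String, String> fromConf(Configuration conf) {
    String qop = conf.get("hive.server2.thrift.rpc.protection", "auth");
    Map<String, String> props = new HashMap<String, String>();
    props.put(Sasl.QOP, qop);
    props.put(Sasl.SERVER_AUTH, "true");
    return props;
  }
}
{code}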


> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>      Issue Type: New Feature
>    Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: HIVE-4911-trunk-0.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-07-24 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Attachment: HIVE-4911-trunk-1.patch

[~brocknoland] The reason I implemented fromString() instead of using valueOf() 
is that I wanted to use the strings auth, auth-int and auth-conf in the 
configuration file "hive-site.xml", since these strings for QoP are well 
understood and appear in various online documentation, e.g. 
http://docs.oracle.com/javase/jndi/tutorial/ldap/security/sasl.html.
But I also wanted to follow enum naming conventions and use the enum names AUTH, 
AUTH_INT and AUTH_CONF. Since the two sets differ, I couldn't use valueOf() 
and ended up implementing the fromString() method.
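
Roughly the shape of the idea (illustrative; the actual SaslQOP.java in the patch may differ):

{code:java}
public enum SaslQOP {
  AUTH("auth"),           // authentication only
  AUTH_INT("auth-int"),   // authentication + integrity
  AUTH_CONF("auth-conf"); // authentication + integrity + confidentiality

  private final String saslQop;

  SaslQOP(String saslQop) {
    this.saslQop = saslQop;
  }

  // Map the conventional SASL strings from hive-site.xml to the enum constants.
  public static SaslQOP fromString(String s) {
    for (SaslQOP qop : values()) {
      if (qop.saslQop.equalsIgnoreCase(s)) {
        return qop;
      }
    }
    throw new IllegalArgumentException("Unknown SASL QOP: " + s);
  }

  @Override
  public String toString() {
    return saslQop;
  }
}
{code}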


[~thejas] Thanks for quickly reviewing it. I have incorporated all the comments 
except one. I still need to test the hadoop-at-higher-security-level warning log 
message, though; I will give it a try when I get the chance.

I think it may be a good idea to expose another setting for the metastore (MS) as 
well, rather than piggybacking on hadoop.rpc.protection. That would give finer 
control over the deployment. The recent changes are geared towards making that 
easier, but as Thejas has mentioned, that change could be discussed in a separate JIRA.

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-02 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Attachment: HIVE-4911-trunk-2.patch

New changes:

1. Incorporated sasl.qop renaming of param
2. Moved getHadoopSaslProperties to HadoopThriftAuthBridge

I can't get it to compile with the following arguments, though. The classes I 
changed compile fine, but SessionState.java complains:

{code}
compile:
 [echo] Project: ql
[javac] Compiling 898 source files to 
/Users/malakar/code/oss/hive/build/ql/classes
[javac] 
/Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35:
 package org.apache.commons.io does not exist
[javac] import org.apache.commons.io.FileUtils;
[javac] ^
[javac] 
/Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743:
 cannot find symbol
[javac] symbol  : variable FileUtils
[javac] location: class org.apache.hadoop.hive.ql.session.SessionState
[javac] FileUtils.deleteDirectory(resourceDir);
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors
{code}


> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>    Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-02 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728225#comment-13728225
 ] 

Arup Malakar commented on HIVE-4911:


For the above comment, I meant it errors out when compiled with hadoop 20. I 
used the following command:

{code}ant clean package  -Dhadoop.mr.rev=20{code}

It compiles fine with hadoop 23.

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>    Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-02 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728346#comment-13728346
 ] 

Arup Malakar commented on HIVE-4911:


Thanks [~thejas] for confirming that the build is broken for 20. I was wondering 
whether something was wrong in my environment. I will update the patch so that it 
applies cleanly on trunk.

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>    Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, 
> HIVE-4911-trunk-1.patch, HIVE-4911-trunk-2.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails

2013-08-05 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729784#comment-13729784
 ] 

Arup Malakar commented on HIVE-4343:


This patch causes a compilation failure when compiled against hadoop 20.
I tried _ant clean package  -Dhadoop.mr.rev=20_:

{code}
 [echo] Project: ql
[javac] Compiling 904 source files to 
/Users/malakar/code/oss/hive/build/ql/classes
[javac] 
/Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/exec/SecureCmdDoAs.java:51:
 cannot find symbol
[javac] symbol  : variable HADOOP_TOKEN_FILE_LOCATION
[javac] location: class org.apache.hadoop.security.UserGroupInformation
[javac] env.put(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION,
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error
{code}


> HS2 with kerberos- local task for map join fails
> 
>
> Key: HIVE-4343
> URL: https://issues.apache.org/jira/browse/HIVE-4343
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.12.0
>
> Attachments: HIVE-4343.1.patch, HIVE-4343.2.patch, HIVE-4343.3.patch
>
>
> With hive server2 configured with kerberos security, when a (map) join query 
> is run, it results in failure with "GSSException: No valid credentials 
> provided "

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4343) HS2 with kerberos- local task for map join fails

2013-08-05 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729805#comment-13729805
 ] 

Arup Malakar commented on HIVE-4343:


[~thejas] Thanks for the update. I saw the patch in HIVE-4991.

> HS2 with kerberos- local task for map join fails
> 
>
> Key: HIVE-4343
> URL: https://issues.apache.org/jira/browse/HIVE-4343
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Fix For: 0.12.0
>
> Attachments: HIVE-4343.1.patch, HIVE-4343.2.patch, HIVE-4343.3.patch
>
>
> With hive server2 configured with kerberos security, when a (map) join query 
> is run, it results in failure with "GSSException: No valid credentials 
> provided "

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4991) hive build with 0.20 is broken

2013-08-05 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729808#comment-13729808
 ] 

Arup Malakar commented on HIVE-4991:


I see the following error complaining about an import in HiveSessionImpl.java. 
HiveSessionImpl doesn't actually use that import, and removing it fixed the 
problem for me.

{code}
 [echo] Project: service
[javac] Compiling 144 source files to 
/Users/malakar/code/oss/hive/build/service/classes
[javac] 
/Users/malakar/code/oss/hive/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java:29:
 package org.apache.commons.io does not exist
[javac] import org.apache.commons.io.FileUtils;
[javac] ^
[javac] Note: 
/Users/malakar/code/oss/hive/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
 uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 1 error
{code}

> hive build with 0.20 is broken
> --
>
> Key: HIVE-4991
> URL: https://issues.apache.org/jira/browse/HIVE-4991
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Thejas M Nair
>Assignee: Edward Capriolo
>Priority: Blocker
>  Labels: newbie
> Attachments: HIVE-4991.2.patch.txt, HIVE-4991.patch.txt
>
>
> As reported in HIVE-4911 
> ant clean package -Dhadoop.mr.rev=20
> Fails with - 
> {code}
> compile:
>  [echo] Project: ql
> [javac] Compiling 898 source files to 
> /Users/malakar/code/oss/hive/build/ql/classes
> [javac] 
> /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:35:
>  package org.apache.commons.io does not exist
> [javac] import org.apache.commons.io.FileUtils;
> [javac] ^
> [javac] 
> /Users/malakar/code/oss/hive/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java:743:
>  cannot find symbol
> [javac] symbol  : variable FileUtils
> [javac] location: class org.apache.hadoop.hive.ql.session.SessionState
> [javac] FileUtils.deleteDirectory(resourceDir);
> [javac] ^
> [javac] Note: Some input files use or override a deprecated API.
> [javac] Note: Recompile with -Xlint:deprecation for details.
> [javac] Note: Some input files use unchecked or unsafe operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
> [javac] 2 errors
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-05 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4911:
---

Attachment: 20-build-temp-change-1.patch
HIVE-4911-trunk-3.patch

I used 20-build-temp-change-1.patch to compile against 20. 

[~thejas] Let me know if you have any comments.

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>        Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: 20-build-temp-change-1.patch, 
> 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-07 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732709#comment-13732709
 ] 

Arup Malakar commented on HIVE-4911:


[~ashutoshc] That is correct. The 20-build* patches are temporary patches I used to 
build against 20 until HIVE-4991 is committed.

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>    Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Attachments: 20-build-temp-change-1.patch, 
> 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-08 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733931#comment-13733931
 ] 

Arup Malakar commented on HIVE-4911:


Thanks [~ashutoshc].

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>    Reporter: Arup Malakar
>    Assignee: Arup Malakar
> Fix For: 0.12.0
>
> Attachments: 20-build-temp-change-1.patch, 
> 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2012-10-25 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-3620:
--

 Summary: Drop table using hive CLI throws error when the total 
number of partition in the table is around 50K.
 Key: HIVE-3620
 URL: https://issues.apache.org/jira/browse/HIVE-3620
 Project: Hive
  Issue Type: Bug
Reporter: Arup Malakar


{code}
hive> drop table load_test_table_2_0;
FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: 
java.net.SocketTimeoutException: Read timedout
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
{code}

The DB used is Oracle, and hive had only one table:

{code}
select COUNT(*) from PARTITIONS;
54839
{code}

I can try and play around with the parameter hive.metastore.client.socket.timeout 
if that is what is being used. But it is 200 seconds as of now, and 200 seconds 
for a drop table call seems high already.
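
For reference, a minimal illustrative sketch of raising that timeout on the client side (value in seconds, per the numbers above; setting it in hive-site.xml works the same way):

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class LongTimeoutClient {
  public static HiveMetaStoreClient create() throws Exception {
    HiveConf conf = new HiveConf();
    // Raise the client-side socket timeout (seconds) above the current 200s.
    conf.set("hive.metastore.client.socket.timeout", "600");
    return new HiveMetaStoreClient(conf);
  }
}
{code}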

Thanks,
Arup

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-14 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Release Note: Make deleteDir work with federated namenode
  Status: Patch Available  (was: Open)

Submitting the patch for trunk; if the change is acceptable, I will submit a patch 
for branch-0.9 as well.
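
For reference, a simplified sketch of the viewfs-safe deletion this issue calls for (illustrative only, not the exact patch; the class and method names here are made up):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class ViewfsSafeDelete {
  // moveToAppropriateTrash() resolves the right trash location under viewfs;
  // it is only available in hadoop 0.23+, hence the shimming discussed here.
  public static boolean deleteDir(FileSystem fs, Path path, Configuration conf)
      throws Exception {
    if (Trash.moveToAppropriateTrash(fs, path, conf)) {
      return true;
    }
    return fs.delete(path, true);
  }
}
{code}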

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
> Attachments: HIVE-3648-trunk-0.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-14 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Attachment: HIVE-3648-trunk-0.patch

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
> Attachments: HIVE-3648-trunk-0.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-11-15 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498262#comment-13498262
 ] 

Arup Malakar commented on HIVE-3645:



Looking at PIG-2791, it looks like the following needs to be done:

1. Use getDefaultBlockSize(Path) and getDefaultReplication(Path) instead of 
getDefaultBlockSize() and getDefaultReplication(), as the variants without a Path 
argument won't work with a federated namenode. These methods need to be shimmed 
(see the sketch below).

2. Bump the hadoop dependency to 2.0.0-alpha, as 
getDefaultBlockSize(Path)/getDefaultReplication(Path) are not available in 
0.23.1.
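
A rough sketch of what the shimmed calls in point 1 would look like (illustrative only; class and method names here are hypothetical, the real change goes through the hive shims):

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsDefaultsShim {
  // Path-aware variants (hadoop 0.23.3+/2.x): viewfs needs the path to resolve
  // the underlying namespace before it can report block size / replication.
  public static long defaultBlockSize(FileSystem fs, Path path) {
    return fs.getDefaultBlockSize(path);
  }

  public static short defaultReplication(FileSystem fs, Path path) {
    return fs.getDefaultReplication(path);
  }
}
{code}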


> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-15 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498328#comment-13498328
 ] 

Arup Malakar commented on HIVE-3648:


Review for trunk: https://reviews.facebook.net/D6759

[arc diff  origin/trunk  --jira HIVE-3648 throws an error.]

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
> Attachments: HIVE-3648-trunk-0.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-11-16 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3645:
---

Release Note: HIVE-3645 shimmed getDefaultBlockSize/getDefaultReplication 
to make RCFiles work with federated namenode
  Status: Patch Available  (was: Open)

Review for trunk: https://reviews.facebook.net/D6765

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
> Attachments: HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-11-16 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3645:
---

Attachment: HIVE_3645_trunk_0.patch

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
> Attachments: HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3017) hive-exec jar, contains classes from other modules(hive-serde, hive-shims, hive-common etc) duplicating those classes in two jars

2012-11-16 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3017:
---

Summary: hive-exec jar, contains classes from other modules(hive-serde, 
hive-shims, hive-common etc) duplicating those classes in two jars  (was: 
HIVE-2646 added Serde classes to hive-exec jar, duplicating hive-serde)

> hive-exec jar, contains classes from other modules(hive-serde, hive-shims, 
> hive-common etc) duplicating those classes in two jars
> -
>
> Key: HIVE-3017
> URL: https://issues.apache.org/jira/browse/HIVE-3017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>
> HIVE-2646 added the jars from hive-serde to the hive-exec class:
> {noformat}
> ...
>  0 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/
>   1971 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/ListTypeInfo.class
>   2396 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/MapTypeInfo.class
>   2788 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/PrimitiveTypeInfo.class
>   4408 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/StructTypeInfo.class
>900 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfo.class
>   6576 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.class
>   1231 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$1.class
>   1239 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser$Token.class
>   7145 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser.class
>  14482 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.class
>   2594 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/UnionTypeInfo.class
>144 Wed May 09 20:56:30 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/package-info.class
> ...{noformat}
> Was this intentional? If so, the serde jar should be deprecated. If not, the 
> serde classes should be removed since this creates two sources of truth for 
> them and can cause other problems (see HCATALOG-407).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3017) hive-exec jar, contains classes from other modules(hive-serde, hive-shims, hive-common etc) duplicating those classes in two jars

2012-11-16 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499300#comment-13499300
 ] 

Arup Malakar commented on HIVE-3017:


I figured that hive-exec also contains all the classes from 
hive-shims-0.10.0-SNAPSHOT.jar, hive-common-0.10.0-SNAPSHOT.jar, etc.
I haven't checked what else it duplicates. Having the same class present in two 
jars leads to confusion and is more error prone: if both jars are on the classpath 
and they don't contain the same version of a class, we wouldn't know which class 
gets used.

(Edited the title)
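
As an aside, a quick way to check which copy actually wins at runtime (plain Java, nothing Hive-specific):

{code:java}
public class WhichJar {
  // Quick diagnostic: print which jar a duplicated class is actually loaded from
  // when both hive-exec and hive-serde are on the classpath.
  public static void main(String[] args) {
    System.out.println(org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.class
        .getProtectionDomain().getCodeSource().getLocation());
  }
}
{code}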

> hive-exec jar, contains classes from other modules(hive-serde, hive-shims, 
> hive-common etc) duplicating those classes in two jars
> -
>
> Key: HIVE-3017
> URL: https://issues.apache.org/jira/browse/HIVE-3017
> Project: Hive
>  Issue Type: Bug
>Reporter: Jakob Homan
>
> HIVE-2646 added the jars from hive-serde to the hive-exec class:
> {noformat}
> ...
>  0 Wed May 09 20:56:30 PDT 2012 org/apache/hadoop/hive/serde2/typeinfo/
>   1971 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/ListTypeInfo.class
>   2396 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/MapTypeInfo.class
>   2788 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/PrimitiveTypeInfo.class
>   4408 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/StructTypeInfo.class
>900 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfo.class
>   6576 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.class
>   1231 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$1.class
>   1239 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser$Token.class
>   7145 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils$TypeInfoParser.class
>  14482 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/TypeInfoUtils.class
>   2594 Wed May 09 20:56:28 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/UnionTypeInfo.class
>144 Wed May 09 20:56:30 PDT 2012 
> org/apache/hadoop/hive/serde2/typeinfo/package-info.class
> ...{noformat}
> Was this intentional? If so, the serde jar should be deprecated. If not, the 
> serde classes should be removed since this creates two sources of truth for 
> them and can cause other problems (see HCATALOG-407).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-19 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Attachment: HIVE_3648_branch_0.patch
HIVE_3648_trunk_1.patch

Patch available for branch. Added one missing abstract method in 
HadoopShimsSecure class.

Updated trunk review: https://reviews.facebook.net/D6759
Branch review: https://reviews.facebook.net/D6801

Thanks,
Arup

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
> Attachments: HIVE_3648_branch_0.patch, HIVE-3648-trunk-0.patch, 
> HIVE_3648_trunk_1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-27 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Attachment: HIVE-3648-trunk-1.patch

Thanks Ashutosh for looking into the patch. I have updated the patch to reflect 
the last commit.

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
>    Assignee: Arup Malakar
> Attachments: HIVE_3648_branch_0.patch, HIVE-3648-trunk-0.patch, 
> HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-11-27 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504846#comment-13504846
 ] 

Arup Malakar commented on HIVE-3645:


Thanks Ashutosh for looking into the patch. If the branch patch looks fine, can 
you please commit this to the 0.9 branch as well?

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
>Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3645_branch_0.patch, HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-27 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505061#comment-13505061
 ] 

Arup Malakar commented on HIVE-3648:


Ashutosh, thanks for committing to trunk. Can you commit it to branch-0.9 as 
well? I will provide the rebased patch once HIVE-3645 is committed to the branch.

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
>Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3648_branch_0.patch, HIVE-3648-trunk-0.patch, 
> HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3754) Trunk hadoop 23 build fails

2012-11-28 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506174#comment-13506174
 ] 

Arup Malakar commented on HIVE-3754:


Hi Gang, the APIs getDefaultBlockSize(Path) and getDefaultReplication(Path) in 
FileSystem are not available in 0.23.1; they are available in 0.23.3 and 
2.0.0-alpha, though. That is why the build works fine with the default 
configuration, but won't compile when you use Hadoop 0.23.1. Is this a concern?

> Trunk hadoop 23 build fails
> ---
>
> Key: HIVE-3754
> URL: https://issues.apache.org/jira/browse/HIVE-3754
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Gang Tim Liu
>
> check out the latest code from trunk
> {code}
> svn info 
> {code}
> {quote}
> Path: .
> URL: http://svn.apache.org/repos/asf/hive/trunk
> Repository Root: http://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1415005
> Node Kind: directory
> Schedule: normal
> Last Changed Author: namit
> Last Changed Rev: 1414608
> Last Changed Date: 2012-11-28 01:36:27 -0800 (Wed, 28 Nov 2012)
> {quote}
> {code}
> ant clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 
> -Dhadoop.mr.rev=23
> {code}
> {quote}
> ivy-retrieve-hadoop-shim:
>  [echo] Project: shims
> [javac] Compiling 2 source files to 
> /Users/gang/hive-trunk-11-28/build/shims/classes
> [javac] 
> /Users/gang/hive-trunk-11-28/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:122:
>  getDefaultBlockSize() in org.apache.hadoop.fs.FileSystem cannot be applied 
> to (org.apache.hadoop.fs.Path)
> [javac] return fs.getDefaultBlockSize(path);
> [javac]  ^
> [javac] 
> /Users/gang/hive-trunk-11-28/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:127:
>  getDefaultReplication() in org.apache.hadoop.fs.FileSystem cannot be applied 
> to (org.apache.hadoop.fs.Path)
> [javac] return fs.getDefaultReplication(path);
> [javac]  ^
> [javac] 2 errors
> BUILD FAILED
> /Users/gang/hive-trunk-11-28/build.xml:302: The following error occurred 
> while executing this line:
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3754) Trunk hadoop 23 build fails

2012-11-29 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506643#comment-13506643
 ] 

Arup Malakar commented on HIVE-3754:


Gang, I am assuming it is now compiling for you with the commands Ashutosh 
provided.

> Trunk hadoop 23 build fails
> ---
>
> Key: HIVE-3754
> URL: https://issues.apache.org/jira/browse/HIVE-3754
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Gang Tim Liu
>
> check out the latest code from trunk
> {code}
> svn info 
> {code}
> {quote}
> Path: .
> URL: http://svn.apache.org/repos/asf/hive/trunk
> Repository Root: http://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1415005
> Node Kind: directory
> Schedule: normal
> Last Changed Author: namit
> Last Changed Rev: 1414608
> Last Changed Date: 2012-11-28 01:36:27 -0800 (Wed, 28 Nov 2012)
> {quote}
> {code}
> ant clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 
> -Dhadoop.mr.rev=23
> {code}
> {quote}
> ivy-retrieve-hadoop-shim:
>  [echo] Project: shims
> [javac] Compiling 2 source files to 
> /Users/gang/hive-trunk-11-28/build/shims/classes
> [javac] 
> /Users/gang/hive-trunk-11-28/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:122:
>  getDefaultBlockSize() in org.apache.hadoop.fs.FileSystem cannot be applied 
> to (org.apache.hadoop.fs.Path)
> [javac] return fs.getDefaultBlockSize(path);
> [javac]  ^
> [javac] 
> /Users/gang/hive-trunk-11-28/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java:127:
>  getDefaultReplication() in org.apache.hadoop.fs.FileSystem cannot be applied 
> to (org.apache.hadoop.fs.Path)
> [javac] return fs.getDefaultReplication(path);
> [javac]  ^
> [javac] 2 errors
> BUILD FAILED
> /Users/gang/hive-trunk-11-28/build.xml:302: The following error occurred 
> while executing this line:
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-12-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Attachment: HIVE_3648_branch_1.patch

Rebased branch patch.

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
>    Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3648_branch_0.patch, HIVE_3648_branch_1.patch, 
> HIVE-3648-trunk-0.patch, HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-12-04 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510143#comment-13510143
 ] 

Arup Malakar commented on HIVE-3645:


Thank you Ashutosh.

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
>Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3645_branch_0.patch, HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-12-05 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510799#comment-13510799
 ] 

Arup Malakar commented on HIVE-3648:


Thanks Ashutosh.

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
>Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3648_branch_0.patch, HIVE_3648_branch_1.patch, 
> HIVE-3648-trunk-0.patch, HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-11 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529499#comment-13529499
 ] 

Arup Malakar commented on HIVE-3789:


Chris, I am seeing the errors too and am investigating. Meanwhile, if anyone 
has a clue about the following exception that I am seeing in the test failures, please let me know:

{code}
[junit] Running org.apache.hadoop.hive.ql.parse.TestContribParse
[junit] Cleaning up TestContribParse
[junit] Exception: MetaException(message:Got exception: 
java.lang.IllegalArgumentException Wrong FS: 
pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json,
 expected: file:///)
[junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 4.433 sec
[junit] org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:Got exception: java.lang.IllegalArgumentException Wrong 
FS: 
pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json,
 expected: file:///)
[junit] at 
org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:813)
[junit] at 
org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:789)
[junit] at 
org.apache.hadoop.hive.ql.QTestUtil.cleanUp(QTestUtil.java:421)
[junit] at 
org.apache.hadoop.hive.ql.QTestUtil.shutdown(QTestUtil.java:278)
[junit] at 
org.apache.hadoop.hive.ql.parse.TestContribParse.tearDown(TestContribParse.java:59)
[junit] at junit.framework.TestCase.runBare(TestCase.java:140)
[junit] at junit.framework.TestResult$1.protect(TestResult.java:110)
[junit] at junit.framework.TestResult.runProtected(TestResult.java:128)
[junit] at junit.framework.TestResult.run(TestResult.java:113)
[junit] at junit.framework.TestCase.run(TestCase.java:124)
[junit] at junit.framework.TestSuite.runTest(TestSuite.java:243)
[junit] at junit.framework.TestSuite.run(TestSuite.java:238)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:520)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1060)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:911)
[junit] Caused by: MetaException(message:Got exception: 
java.lang.IllegalArgumentException Wrong FS: 
pfile:/Users/malakar/code/oss/hive_09/hive/build/contrib/test/data/warehouse/src_json,
 expected: file:///)
[junit] at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:785)
[junit] at 
org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:61)
[junit] at 
org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:200)
[junit] at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:929)
[junit] at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:944)
[junit] at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:553)
[junit] at 
org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:807)
[junit] ... 14 more
[junit] Test org.apache.hadoop.hive.ql.parse.TestContribParse FAILED
  [for] /Users/malakar/code/oss/hive_09/hive/contrib/build.xml: The 
following error occurred while executing this line:
  [for] /Users/malakar/code/oss/hive_09/hive/build.xml:321: The following 
error occurred while executing this line:
  [for] /Users/malakar/code/oss/hive_09/hive/build-common.xml:448: Tests 
failed!
{code}

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-12 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530545#comment-13530545
 ] 

Arup Malakar commented on HIVE-3789:


The exception gets eaten up and doesn't show up in the console. Here is the exact 
stack trace, which I caught and logged before rethrowing.

{code}
[junit] Something wrong happened while moving to trash Wrong FS: 
pfile:/Users/malakar/code/oss/hive_09/hive/build/metastore/test/data/warehouse/testtablefilter.db/table1,
 expected: file:///[junit] 
org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:581)
[junit] org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:680)
[junit] 
org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:139)
[junit] 
org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:139)
[junit] 
org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:139)
[junit] org.apache.hadoop.fs.Trash.moveToAppropriateTrash(Trash.java:70)
[junit] 
org.apache.hadoop.hive.shims.Hadoop23Shims.moveToAppropriateTrash(Hadoop23Shims.java:133)
[junit] 
org.apache.hadoop.hive.metastore.HiveMetaStoreFsImpl.deleteDir(HiveMetaStoreFsImpl.java:45)
[junit] 
org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:201)
[junit] 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:929)
[junit] 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:944)
[junit] 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table.getResult(ThriftHiveMetastore.java:4955)
[junit] 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_table.getResult(ThriftHiveMetastore.java:4943)
[junit] org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
[junit] org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
[junit] 
org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
[junit] 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
[junit] 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] java.lang.Thread.run(Thread.java:680)
{code}
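
For reference, the temporary log-and-rethrow used to surface the trace above could look like this minimal sketch (illustrative only; the logger, method name, and wrapped call are assumptions, not the actual Hive code):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class LogAndRethrow {
  private static final Log LOG = LogFactory.getLog(LogAndRethrow.class);

  static void deleteDir() throws Exception {
    try {
      moveToTrash(); // stand-in for the real call whose exception was swallowed
    } catch (Exception e) {
      // Log the full stack trace before rethrowing, so it is not lost when a
      // caller converts the exception into a generic MetaException message.
      LOG.error("Something wrong happened while moving to trash", e);
      throw e;
    }
  }

  private static void moveToTrash() throws Exception {
    // placeholder for the Trash/viewfs call under investigation
  }
}
{code}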

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-12-17 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534224#comment-13534224
 ] 

Arup Malakar commented on HIVE-3645:


From: 
https://issues.apache.org/jira/browse/HIVE-3754?focusedCommentId=13506596&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506596
You can use either of
{code}
ant clean package -Dhadoop.version=0.23.3 -Dhadoop-0.23.version=0.23.3 
-Dhadoop.mr.rev=23
ant clean package -Dhadoop.version=2.0.0-alpha 
-Dhadoop-0.23.version=2.0.0-alpha -Dhadoop.mr.rev=23
{code}

See HIVE-3754 for more details.

I also see that the default hadoop 23 version is 0.23.3 for branch-0.9 as well, so 
this should have worked even without the arguments:
{code}
hadoop-0.23.version=0.23.3
{code}

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
>Assignee: Arup Malakar
> Fix For: 0.11
>
> Attachments: HIVE_3645_branch_0.patch, HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Fix Version/s: 0.9.0
   0.10.0
 Assignee: Arup Malakar
Affects Version/s: 0.10.0
 Release Note: [HIVE-3789] Added a resolvePath method in ProxyFileSystem, 
so that the underlying filesystem's resolvePath is not called. Fixed checkPath as 
well, since it was ignoring the scheme and authority of the path being passed.
   Status: Patch Available  (was: Open)

Trash.moveToAppropriateTrash calls resolvePath, whose implementation is in the 
actual FileSystem behind ProxyFileSystem. resolvePath checks whether the path being 
moved belongs to that filesystem or not. This check fails because it sees the 
proxy scheme ("pfile") in the path instead of its own scheme ("file"). 
Overriding resolvePath to call checkPath in ProxyFileSystem fixed the 
problem.

Also, the old implementation of checkPath was incorrect, as it threw away the 
scheme/authority being passed before calling super. It should check whether they 
match the proxy scheme/authority.

The underlying problem is that ProxyFileSystem contains the FileSystem as a class 
member and does not extend it. Because of this, if a method in 
FileSystem calls another method on itself, the method in FileSystem gets called, not 
the overridden method in ProxyFileSystem. In this case resolvePath internally 
calls checkPath(), but the checkPath() of RawFileSystem gets called instead of 
the overridden checkPath() in ProxyFileSystem. 
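
To make the dispatch issue concrete, here is a minimal plain-Java sketch of the situation described above. BaseFs, FilterFs, and ProxyFs are simplified stand-ins (assumptions for illustration, not the real Hadoop FileSystem, FilterFileSystem, or Hive ProxyFileSystem code):

{code}
// Simplified stand-ins; not the real Hadoop/Hive classes.
class BaseFs {
    void checkPath(String path) {
        if (!path.startsWith("file:")) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: file:///");
        }
    }

    String resolvePath(String path) {
        checkPath(path); // dispatches on the object the call runs on
        return path;
    }
}

// Models FilterFileSystem: extends BaseFs but forwards work to a wrapped instance.
class FilterFs extends BaseFs {
    protected final BaseFs wrapped;

    FilterFs(BaseFs wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    String resolvePath(String path) {
        // The wrapped fs calls *its own* checkPath here, so a checkPath override
        // in a subclass of FilterFs is bypassed -- the "Wrong FS: pfile:..." failure.
        return wrapped.resolvePath(path);
    }
}

// Models ProxyFileSystem with the fix: override resolvePath so the proxy's own
// checkPath runs instead of delegating straight to the wrapped filesystem.
class ProxyFs extends FilterFs {
    private final String scheme; // e.g. "pfile", the scheme the tests use

    ProxyFs(BaseFs wrapped, String scheme) {
        super(wrapped);
        this.scheme = scheme;
    }

    @Override
    void checkPath(String path) {
        // Accept the proxy scheme, then validate against the wrapped filesystem.
        wrapped.checkPath(path.replaceFirst("^" + scheme + ":", "file:"));
    }

    @Override
    String resolvePath(String path) {
        checkPath(path);
        return path;
    }
}
{code}

With the override in place, resolving a "pfile:/..." path goes through the proxy's checkPath and succeeds, which is the behaviour the release note describes.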

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>        Reporter: Chris Drome
>Assignee: Arup Malakar
> Fix For: 0.10.0, 0.9.0
>
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Attachment: HIVE-3789.trunk.1.patch
HIVE-3789.branch-0.9_1.patch

Trunk review: https://reviews.facebook.net/D7467

Branch-0.9 review: https://reviews.facebook.net/D7473

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>Assignee: Arup Malakar
> Fix For: 0.9.0, 0.10.0
>
> Attachments: HIVE-3789.branch-0.9_1.patch, HIVE-3789.trunk.1.patch
>
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3648:
---

Labels: namenode_federation  (was: )

> HiveMetaStoreFsImpl is not compatible with hadoop viewfs
> 
>
> Key: HIVE-3648
> URL: https://issues.apache.org/jira/browse/HIVE-3648
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.9.0, 0.10.0
>Reporter: Kihwal Lee
>    Assignee: Arup Malakar
>  Labels: namenode_federation
> Fix For: 0.11
>
> Attachments: HIVE_3648_branch_0.patch, HIVE_3648_branch_1.patch, 
> HIVE-3648-trunk-0.patch, HIVE_3648_trunk_1.patch, HIVE-3648-trunk-1.patch
>
>
> HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may 
> not work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
> instead.  Please note that this method is not available in hadoop versions 
> earlier than 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3645) RCFileWriter does not implement the right function to support Federation

2012-12-18 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3645:
---

Labels: namenode_federation  (was: )

> RCFileWriter does not implement the right function to support Federation
> 
>
> Key: HIVE-3645
> URL: https://issues.apache.org/jira/browse/HIVE-3645
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.3 federation, Hive 0.9 and Pig 0.10
>Reporter: Viraj Bhat
>Assignee: Arup Malakar
>  Labels: namenode_federation
> Fix For: 0.11
>
> Attachments: HIVE_3645_branch_0.patch, HIVE_3645_trunk_0.patch
>
>
> Create a table using Hive DDL
> {code}
> CREATE TABLE tmp_hcat_federated_numbers_part_1 (
>   id   int,  
>   intnum   int,
>   floatnum float
> )partitioned by (
>   part1string,
>   part2string
> )
> STORED AS rcfile
> LOCATION 'viewfs:///database/tmp_hcat_federated_numbers_part_1';
> {code}
> Populate it using Pig:
> {code}
> A = load 'default.numbers_pig' using org.apache.hcatalog.pig.HCatLoader();
> B = filter A by id <=  500;
> C = foreach B generate (int)id, (int)intnum, (float)floatnum;
> store C into
> 'default.tmp_hcat_federated_numbers_part_1'
> using org.apache.hcatalog.pig.HCatStorer
>('part1=pig, part2=hcat_pig_insert',
> 'id: int,intnum: int,floatnum: float');
> {code}
> Generates the following error when running on a Federated Cluster:
> {quote}
> 2012-10-29 20:40:25,011 [main] ERROR
> org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate
> exception from backed error: AttemptID:attempt_1348522594824_0846_m_00_3
> Info:Error: org.apache.hadoop.fs.viewfs.NotInMountpointException:
> getDefaultReplication on empty path is invalid
> at
> org.apache.hadoop.fs.viewfs.ViewFileSystem.getDefaultReplication(ViewFileSystem.java:479)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:723)
> at org.apache.hadoop.hive.ql.io.RCFile$Writer.(RCFile.java:705)
> at
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat.getRecordWriter(RCFileOutputFormat.java:86)
> at
> org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:100)
> at
> org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:228)
> at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
> at
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.(MapTask.java:587)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:706)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2013-01-07 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546285#comment-13546285
 ] 

Arup Malakar commented on HIVE-3789:


Hi Ashutosh, you are right. My concern was that checkPath() should look for 
the pfile:// scheme in the path that is passed.

For the test cases to pass, adding resolvePath() is sufficient. I will submit a 
patch without the modification in checkPath().

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>Assignee: Arup Malakar
> Attachments: HIVE-3789.branch-0.9_1.patch, HIVE-3789.trunk.1.patch
>
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2013-01-07 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Attachment: HIVE-3789.branch-0.9_2.patch
HIVE-3789.trunk.2.patch

Patch with reverted checkPath()

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>Assignee: Arup Malakar
> Attachments: HIVE-3789.branch-0.9_1.patch, 
> HIVE-3789.branch-0.9_2.patch, HIVE-3789.trunk.1.patch, HIVE-3789.trunk.2.patch
>
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3789) Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9

2013-01-09 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3789:
---

Attachment: HIVE-3789.branch-0.10_1.patch

Hi Ashutosh, can you also commit the changes for branch-0.10?

> Patch HIVE-3648 causing the majority of unit tests to fail on branch 0.9
> 
>
> Key: HIVE-3789
> URL: https://issues.apache.org/jira/browse/HIVE-3789
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Tests
>Affects Versions: 0.9.0, 0.10.0
> Environment: Hadoop 0.23.5, JDK 1.6.0_31
>Reporter: Chris Drome
>Assignee: Arup Malakar
> Fix For: 0.11.0
>
> Attachments: HIVE-3789.branch-0.10_1.patch, 
> HIVE-3789.branch-0.9_1.patch, HIVE-3789.branch-0.9_2.patch, 
> HIVE-3789.trunk.1.patch, HIVE-3789.trunk.2.patch
>
>
> Rolling back to before this patch shows that the unit tests are passing, 
> after the patch, the majority of the unit tests are failing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH

2013-02-01 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-3978:
--

 Summary: HIVE_AUX_JARS_PATH should have : instead of , as 
separator since it gets appended to HADOOP_CLASSPATH
 Key: HIVE-3978
 URL: https://issues.apache.org/jira/browse/HIVE-3978
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
hcatalog-0.5
hadoop 0.23
hbase 0.94
Reporter: Arup Malakar
Assignee: Arup Malakar


The following code gets executed only in the case of Cygwin:
HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`

But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should 
be replaced by ':' in all cases.
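
As a plain-Java illustration of the same substitution the sed line performs (the jar paths below are made-up examples, not real Hive settings):

{code}
public class AuxJarsSeparatorDemo {
    public static void main(String[] args) {
        // Made-up example value for HIVE_AUX_JARS_PATH.
        String hiveAuxJarsPath = "/opt/hive/aux/hcatalog-core.jar,/opt/hive/aux/hbase-handler.jar";
        // HADOOP_CLASSPATH separates entries with ':' (on Unix), not ',',
        // so the commas must be replaced before the value is appended to it.
        String classpathFragment = hiveAuxJarsPath.replace(',', ':');
        System.out.println(classpathFragment);
        // prints /opt/hive/aux/hcatalog-core.jar:/opt/hive/aux/hbase-handler.jar
    }
}
{code}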

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3978:
---

Attachment: HIVE-3978_branch_0.10_0.patch
HIVE-3978_trunk_0.patch

> HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets 
> appended to HADOOP_CLASSPATH
> -
>
> Key: HIVE-3978
> URL: https://issues.apache.org/jira/browse/HIVE-3978
> Project: Hive
>  Issue Type: Bug
> Environment: hive-0.10
> hcatalog-0.5
> hadoop 0.23
> hbase 0.94
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch
>
>
> The following code gets executed only in case of cygwin.
> HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
> But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should 
> get replaced by : for all cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3978:
---

Fix Version/s: 0.10.0
   0.11.0
 Release Note: Use ':' in HIVE_AUX_JARS_PATH instead of ','
   Status: Patch Available  (was: Open)

Review: https://reviews.facebook.net/D8373

> HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets 
> appended to HADOOP_CLASSPATH
> -
>
> Key: HIVE-3978
> URL: https://issues.apache.org/jira/browse/HIVE-3978
> Project: Hive
>  Issue Type: Bug
> Environment: hive-0.10
> hcatalog-0.5
> hadoop 0.23
> hbase 0.94
>        Reporter: Arup Malakar
>Assignee: Arup Malakar
> Fix For: 0.11.0, 0.10.0
>
> Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch
>
>
> The following code gets executed only in case of cygwin.
> HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
> But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should 
> get replaced by : for all cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3983) Select on table with hbase storage handler fails with an SASL error

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3983:
---

Summary: Select on table with hbase storage handler fails with an SASL 
error  (was: Select on table with hbase storage handler fails with an SASL)

> Select on table with hbase storage handler fails with an SASL error
> ---
>
> Key: HIVE-3983
> URL: https://issues.apache.org/jira/browse/HIVE-3983
> Project: Hive
>  Issue Type: Bug
> Environment: hive-0.10
> hbase-0.94.5.5
> hadoop-0.23.3.1
> hcatalog-0.5
>Reporter: Arup Malakar
>
> The table is created using the following query:
> {code}
> CREATE TABLE hbase_table_1(key int, value string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
> TBLPROPERTIES ("hbase.table.name" = "xyz"); 
> {code}
> Doing a select on the table launches a map-reduce job. But the job fails with 
> the following error:
> {code}
> 2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] 
> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
> attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: 
> java.lang.RuntimeException: SASL authentication failed. The most likely cause 
> is missing or invalid credentials. Consider 'kinit'.
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
>   at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:160)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
> Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
> likely cause is missing or invalid credentials. Consider 'kinit'.
>   at 
> org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
>   at org.apache.hadoop.hbase.security.User.call(User.java:590)
>   at org.apache.hadoop.hbase.security.User.access$700(User.java:51)
>   at 
> org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444)
>   at 
> org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203)
>   at 
> org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
>   at 
> org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104)
>   at $Proxy12.getProtocolVersion(Unknown Source)
>   at 
> org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectio

[jira] [Created] (HIVE-3983) Select on table with hbase storage handler fails with an SASL

2013-02-04 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-3983:
--

 Summary: Select on table with hbase storage handler fails with an 
SASL
 Key: HIVE-3983
 URL: https://issues.apache.org/jira/browse/HIVE-3983
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
hbase-0.94.5.5
hadoop-0.23.3.1
hcatalog-0.5
Reporter: Arup Malakar


The table is created using the following query:

{code}
CREATE TABLE hbase_table_1(key int, value string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz"); 
{code}

Doing a select on the table launches a map-reduce job. But the job fails with 
the following error:

{code}
2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: 
java.lang.RuntimeException: SASL authentication failed. The most likely cause 
is missing or invalid credentials. Consider 'kinit'.
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:160)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
likely cause is missing or invalid credentials. Consider 'kinit'.
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
at org.apache.hadoop.hbase.security.User.call(User.java:590)
at org.apache.hadoop.hbase.security.User.access$700(User.java:51)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444)
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203)
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291)
at 
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
at 
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104)
at $Proxy12.getProtocolVersion(Unknown Source)
at 
org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1291)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:882)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:984)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnect

[jira] [Commented] (HIVE-2038) Metastore listener

2013-03-05 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593844#comment-13593844
 ] 

Arup Malakar commented on HIVE-2038:


[~ashutoshc], is the MetaStoreListener implementation supposed to be thread-safe? 
I am seeing issues related to that in HCatalog. The javadoc of the listener 
interface doesn't mention anything.

> Metastore listener
> --
>
> Key: HIVE-2038
> URL: https://issues.apache.org/jira/browse/HIVE-2038
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 0.8.0
>
> Attachments: hive_2038_3.patch, hive_2038_4.patch, hive-2038.patch, 
> metastore_listener.patch, metastore_listener.patch, metastore_listener.patch
>
>
> Provide to way to observe changes happening on Metastore

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.

2013-04-09 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627148#comment-13627148
 ] 

Arup Malakar commented on HIVE-3620:


Error log I see in the server is:
{code}
2013-04-09 19:47:41,955 ERROR thrift.ProcessFunction (ProcessFunction.java:process(41)) - Internal error processing get_database
java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.(AbstractStringBuilder.java:45)
at java.lang.StringBuilder.(StringBuilder.java:80)
at oracle.net.ns.Packet.(Packet.java:513)
at oracle.net.ns.Packet.(Packet.java:142)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:279)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1042)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:301)
at oracle.jdbc.driver.PhysicalConnection.(PhysicalConnection.java:531)
at oracle.jdbc.driver.T4CConnection.(T4CConnection.java:221)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:503)
at yjava.database.jdbc.oracle.KeyDbDriverWrapper.connect(KeyDbDriverWrapper.java:81)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
at org.datanucleus.store.rdbms.ConnectionProviderPriorityList.getConnection(ConnectionProviderPriorityList.java:57)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:363)
at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getXAResource(ConnectionFactoryImpl.java:322)
at org.datanucleus.store.connection.ConnectionManagerImpl.enlistResource(ConnectionManagerImpl.java:388)
at org.datanucleus.store.connection.ConnectionManagerImpl.allocateConnection(ConnectionManagerImpl.java:253)
at org.datanucleus.store.connection.AbstractConnectionFactory.getConnection(AbstractConnectionFactory.java:60)
at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:338)
at org.datanucleus.store.AbstractStoreManager.getConnection(AbstractStoreManager.java:307)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:582)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1692)
at org.datanucleus.store.query.Query.executeWithArray(Query.java:1527)
at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
at org.apache.hadoop.hive.metastore.ObjectStore.getMDatabase(ObjectStore.java:405)
at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:424)
{code}

Show tables takes time too:
{code}
hive> show tables;
OK
load_test_table_2_0
test
Time taken: 285.705 seconds

Log in server:

2013-04-09 19:53:52,783 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 5: get_database: default
2013-04-09 19:54:09,143 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:newRawStore(391)) - 5: Opening raw store with implemenation 
class:org.apache.hadoop.hive.metastore.ObjectStore
2013-04-09 19:57:44,812 INFO  metastore.ObjectStore 
(ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2013-04-09 19:57:44,816 INFO  metastore.ObjectStore 
(ObjectStore.java:setConf(205)) - Initialized ObjectStore
2013-04-09 19:57:51,700 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 6: get_database: default
2013-04-09 19:57:51,706 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:newRawStore(391)) - 6: Opening raw store with implemenation 
class:org.apache.hadoop.hive.metastore.ObjectStore
2013-04-09 19:57:51,712 INFO  metastore.ObjectStore 
(ObjectStore.java:initialize(222)) - ObjectStore, initialize called
2013-04-09 19:57:51,714 INFO  metastore.ObjectStore 
(ObjectStore.java:setConf(205)) - Initialized ObjectStore
2013-04-09 19:57:52,048 INFO  metastore.HiveMetaStore 
(HiveMetaStore.java:logInfo(434)) - 6: get_tables: db=default pat=.*
2013-04-09 19:57:52,262 ERROR DataNucleus.Transaction 
(Log4JLogger.java:error(115)) - Operation rollback failed on resource: 
org.datanucleus.store.rdbms.ConnectionFactoryImpl$EmulatedXAResource@18d3a2f, 
error code UNKNOWN and transaction: [DataNucleus Transaction, ID=Xid=�, 
enlisted 
re

[jira] [Created] (HIVE-4530) Enforce minimum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-4530:
--

 Summary: Enforce minimum ant version required in build script 
 Key: HIVE-4530
 URL: https://issues.apache.org/jira/browse/HIVE-4530
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Arup Malakar
Assignee: Arup Malakar
Priority: Minor


I observed that hive doesn't build with older versions of ant (I tried with 
1.6.5). It would be a good idea to have the check in our build.xml. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4530) Enforce minimum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4530:
---

Attachment: HIVE-4530-trunk-0.patch

I have set the minimum ant version to 1.8.0 for now, but if it is known to 
work with older versions I can adjust the minimum version required.

Review: https://reviews.apache.org/r/11031/

> Enforce minimum ant version required in build script 
> 
>
> Key: HIVE-4530
> URL: https://issues.apache.org/jira/browse/HIVE-4530
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>    Reporter: Arup Malakar
>Assignee: Arup Malakar
>Priority: Minor
> Attachments: HIVE-4530-trunk-0.patch
>
>
> I observed that hive doesn't build with older versions of ant (I tried with 
> 1.6.5). It would be a good idea to have the check in our build.xml. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4530) Enforce minimum ant version required in build script

2013-05-09 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-4530:
---

Fix Version/s: 0.12.0
   Status: Patch Available  (was: Open)

> Enforce minimum ant version required in build script 
> 
>
> Key: HIVE-4530
> URL: https://issues.apache.org/jira/browse/HIVE-4530
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>    Reporter: Arup Malakar
>Assignee: Arup Malakar
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4530-trunk-0.patch
>
>
> I observed that hive doesn't build with older versions of ant (I tried with 
> 1.6.5). It would be a good idea to have the check in our build.xml. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4911) Enable QOP configuration for Hive Server 2 thrift transport

2013-08-20 Thread Arup Malakar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13745443#comment-13745443
 ] 

Arup Malakar commented on HIVE-4911:


I thought I would add the performance numbers I have seen here for reference.
In my testing I observed that with auth-conf the time taken
to transfer data is {color:red}~2.3 times{color} the time it takes without 
encryption. In my test I had a table of size *1GB*, and I did
"select *" on the table using the JDBC driver, once with encryption and once 
without.

Time taken:

* No encryption: *~9 minutes*
* Encryption: *~20 minutes*

I was wondering if anyone has experience with SASL encryption and whether it is 
possible to tune any JVM/SASL settings to bring down this time. I am also 
interested in understanding whether it is advisable to use a different crypto 
provider than the default one that ships with the JDK. If this much overhead is 
to be expected with encryption I would like to know that too. I am 
using a patched version of _hive-10_ with _Hive Server 2_ on _hadoop 23/jdk 
1.7/RHEL 5_.

PS: This comment is a repost of a mail I sent out to the hive-dev mailing list.
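
For context, a hedged sketch of how the 1GB "select *" timing could be driven through the HiveServer2 JDBC driver is below. The host, Kerberos principal, table name, and the sasl.qop connection-string key are assumptions for illustration (the exact parameter name depends on the patch under review), not values taken from it:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QopScanTimer {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Encrypted run; drop ";sasl.qop=auth-conf" (or use auth) for the baseline run.
        String url = "jdbc:hive2://hs2host:10000/default;"
                + "principal=hive/_HOST@EXAMPLE.COM;sasl.qop=auth-conf";
        long start = System.currentTimeMillis();
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select * from big_table")) {
            long rows = 0;
            while (rs.next()) {
                rows++; // drain the result set so transfer time is actually measured
            }
            long elapsed = System.currentTimeMillis() - start;
            System.out.println(rows + " rows fetched in " + elapsed + " ms");
        }
    }
}
{code}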

> Enable QOP configuration for Hive Server 2 thrift transport
> ---
>
> Key: HIVE-4911
> URL: https://issues.apache.org/jira/browse/HIVE-4911
> Project: Hive
>  Issue Type: New Feature
>Reporter: Arup Malakar
>Assignee: Arup Malakar
> Fix For: 0.12.0
>
> Attachments: 20-build-temp-change-1.patch, 
> 20-build-temp-change.patch, HIVE-4911-trunk-0.patch, HIVE-4911-trunk-1.patch, 
> HIVE-4911-trunk-2.patch, HIVE-4911-trunk-3.patch
>
>
> The QoP for hive server 2 should be configurable to enable encryption. A new 
> configuration should be exposed "hive.server2.thrift.rpc.protection". This 
> would give greater control configuring hive server 2 service.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira