[jira] [Commented] (HIVE-6212) Using Presto-0.56 for SQL queries, but the HiveServer console prints java.lang.OutOfMemoryError: Java heap space

2014-04-13 Thread Damien Carol (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967814#comment-13967814 ]

Damien Carol commented on HIVE-6212:


*IT'S NOT A BUG*.

The user's report says the Presto CLI was invoked like this (wrong):
{code}
./presto --server localhost:9083 --catalog hive --schema default
{code}

But the Presto server listens on port 8080 by default. This is set by the key
*http-server.http.port* in the file *config.properties*, whose default is
*http-server.http.port*=8080.

This is the correct command for this user's configuration (OK):
{code}
./presto --server localhost:8080 --catalog hive --schema default
{code}
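
For context, here is a minimal, self-contained sketch of why this misconfiguration surfaces as an OutOfMemoryError rather than a clean connection error. It is illustrative only (not Hive or Presto code, and the class name is made up): TBinaryProtocol reads the first four bytes of an incoming message as a 32-bit size, and the opening bytes of an HTTP request line decode to a huge value that readStringBody() then tries to allocate, matching the readMessageBegin/readStringBody frames in the stack trace quoted below.
{code}
// Illustrative sketch: decode the first bytes of an HTTP request line the way
// TBinaryProtocol's non-strict read path decodes a message size.
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ThriftPortMixupSketch {
  public static void main(String[] args) {
    // First bytes on the wire when an HTTP client POSTs to the Thrift port:
    byte[] firstFour = "POST".getBytes(StandardCharsets.US_ASCII);
    int bogusSize = ByteBuffer.wrap(firstFour).getInt();
    // Prints 1347375956: the server then attempts a ~1.3 GB buffer
    // allocation, which appears as the "Java heap space" error.
    System.out.println(bogusSize);
  }
}
{code}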


 Using Presto-0.56 for SQL queries, but the HiveServer console prints
 java.lang.OutOfMemoryError: Java heap space
 

 Key: HIVE-6212
 URL: https://issues.apache.org/jira/browse/HIVE-6212
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
 Environment: HADOOP ENVIRONMENT IS CDH5+CDH5-HIVE-0.11+PRESTO-0.56
Reporter: apachehadoop
 Fix For: 0.11.0


 Hi friends:
 I can't open the page https://groups.google.com/forum/#!forum/presto-users at
 the moment, so I'm posting my question here.
 I started hiveserver and presto-server on one machine with the commands below:
 {code}
 hive --service hiveserver -p 9083
 ./launcher run
 {code}
 When I use the Presto CLI command
 {code}
 ./presto --server localhost:9083 --catalog hive --schema default
 {code}
 the console shows the presto:default prompt. When I then enter a command such
 as show tables, the console prints Error running command:
 java.nio.channels.ClosedChannelException, and the hiveserver console prints:
 {code}
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
 SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
 Exception in thread pool-1-thread-1 java.lang.OutOfMemoryError: Java heap space
   at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
   at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 My configuration files are below.
 node.properties:
 {code}
 node.environment=production
 node.id=cc4a1bbf-5b98-4935-9fde-2cf1c98e8774
 node.data-dir=/home/hadoop/cloudera-5.0.0/presto-0.56/presto/data
 {code}
 config.properties:
 {code}
 coordinator=true
 datasources=jmx
 http-server.http.port=8080
 presto-metastore.db.type=h2
 presto-metastore.db.filename=/home/hadoop/cloudera-5.0.0/presto-0.56/presto/db/MetaStore
 task.max-memory=1GB
 discovery-server.enabled=true
 discovery.uri=http://slave4:8080
 {code}
 jvm.config:
 {code}
 -server
 -Xmx16G
 -XX:+UseConcMarkSweepGC
 -XX:+ExplicitGCInvokesConcurrent
 -XX:+CMSClassUnloadingEnabled
 -XX:+AggressiveOpts
 -XX:+HeapDumpOnOutOfMemoryError
 -XX:OnOutOfMemoryError=kill -9 %p
 -XX:PermSize=150M
 -XX:MaxPermSize=150M
 -XX:ReservedCodeCacheSize=150M
 -Xbootclasspath/p:/home/hadoop/cloudera-5.0.0/presto-0.56/presto-server-0.56/lib/floatingdecimal-0.1.jar
 {code}
 log.properties:
 {code}
 com.facebook.presto=DEBUG
 {code}
 catalog/hive.properties:
 {code}
 connector.name=hive-cdh4
 hive.metastore.uri=thrift://slave4:9083
 {code}
 The Hadoop environment is CDH5 + CDH5-Hive-0.11 + Presto-0.56.
 Finally, I increased the Java heap size for the Hive metastore, but it still
 gives me the same error. Please help me check whether this is a bug in CDH5;
 I'm out of ideas. Thanks.

 Additional information:
 I have tested presto-server 0.55, 0.56, and 0.57 on CDH4 with hive-0.10 or
 hive-0.11, and it still shows the errors above. On the coordinator machine,
 the etc directory contains the configuration files below.
 coordinator config.properties:
 {code}
 coordinator=true
 datasources=jmx
 http-server.http.port=8080
 presto-metastore.db.type=h2
 presto-metastore.db.filename=/home/hadoop/cloudera-5.0.0/presto-0.55/presto/db/MetaStore
 task.max-memory=1GB
 discovery-server.enabled=true
 discovery.uri=http://name:8080
 {code}
 jvm.config:
 {code}
 -server
 -Xmx4G
 -XX:+UseConcMarkSweepGC
 -XX:+ExplicitGCInvokesConcurrent
 -XX:+CMSClassUnloadingEnabled
 -XX:+AggressiveOpts
 -XX:+HeapDumpOnOutOfMemoryError
 -XX:OnOutOfMemoryError=kill -9 %p
 -XX:PermSize=150M
 -XX:MaxPermSize=150M
 -XX:ReservedCodeCacheSize=150M
 -Xbootclasspath/p:/home/hadoop/cloudera-5.0.0/presto-0.55/presto-server-0.55/lib/floatingdecimal-0.1.jar
 {code}

[jira] [Resolved] (HIVE-6212) Using Presto-0.56 for SQL queries, but the HiveServer console prints java.lang.OutOfMemoryError: Java heap space

2014-04-13 Thread Edward Capriolo (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo resolved HIVE-6212.
---

Resolution: Won't Fix

Contact the Presto developers.


[jira] [Commented] (HIVE-6212) Using Presto-0.56 for SQL queries, but the HiveServer console prints java.lang.OutOfMemoryError: Java heap space

2014-04-13 Thread Edward Capriolo (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967874#comment-13967874 ]

Edward Capriolo commented on HIVE-6212:
---

We don't support Presto.


[jira] [Updated] (HIVE-6059) Add union type support in LazyBinarySerDe

2014-04-13 Thread Johndee Burks (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johndee Burks updated HIVE-6059:


Attachment: repro.tar

[~houckman] The file repro.tar contains all the reproduction information. The DDL
needs to be changed to point at your schema file path; run the DML afterwards and
the problem should reproduce.

 Add union type support in LazyBinarySerDe
 -

 Key: HIVE-6059
 URL: https://issues.apache.org/jira/browse/HIVE-6059
 Project: Hive
  Issue Type: New Feature
  Components: File Formats
Affects Versions: 0.12.0
Reporter: Chaoyu Tang
 Attachments: Hive Issue 0.jpeg, Hive Issue 1.jpeg, Hive Issue 2.jpeg, 
 Hive Issue 3.jpeg, repro.tar


 We need support for the union type in LazyBinarySerDe; it is required for any
 join query that has union types among its select values, because the
 reduce-side values of a join are serialized/deserialized with LazyBinarySerDe.
 Without it we see errors like:
 {code}
 Caused by: java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:106)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:156)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.getStandardObjectInspectors(JoinUtil.java:98)
 at 
 org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:261)
 at 
 org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:61)
 at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
 at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
 {code}
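 A minimal sketch of how one might exercise this path directly (assuming the
 standard columns/columns.types SerDe properties; this is not code from the
 ticket, and the class name is made up):
 {code}
 // Sketch: stand up LazyBinarySerDe over a single uniontype column, the same
 // shape a join's reduce values take when a select value is a union type.
 import java.util.Properties;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe;

 public class LazyBinaryUnionSketch {
   public static void main(String[] args) throws Exception {
     Properties props = new Properties();
     props.setProperty("columns", "u");
     props.setProperty("columns.types", "uniontype<int,string>");
     LazyBinarySerDe serde = new LazyBinarySerDe();
     // On releases without union support, initialization or the first
     // serialize/deserialize round trip fails rather than succeeding.
     serde.initialize(new Configuration(), props);
     System.out.println(serde.getObjectInspector().getTypeName());
   }
 }
 {code}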





49 config params without descriptions

2014-04-13 Thread Lefty Leverenz
Here's a list of 49 configuration parameters in RC0 (and trunk) that don't
have descriptions in hive-default.xml.template:


*Release 1 or 2 *

hive.exec.submitviachild

hive.metastore.metadb.dir

hive.jar.path

hive.aux.jars.path

hive.table.name

hive.partition.name

hive.alias


*Release 3 *

hive.cli.errors.ignore


*Release 4 *

hive.added.files.path

hive.added.jars.path


*Release 5 *

hive.intermediate.compression.codec

hive.intermediate.compression.type

hive.added.archives.path


*Release 6 *

hive.metastore.archive.intermediate.archived

hive.metastore.archive.intermediate.extracted

hive.mapred.partitioner

hive.exec.script.trust

hive.hadoop.supports.splittable.combineinputformat


*Release 7 *

hive.lockmgr.zookeeper.default.partition.name

hive.metastore.fs.handler.class

hive.query.result.fileformat

hive.hashtable.initialCapacity

hive.hashtable.loadfactor

hive.debug.localtask

hive.lock.manager

hive.outerjoin.supports.filters

hive.semantic.analyzer.hook


*Release 8 *

hive.exec.job.debug.timeout

hive.exec.tasklog.debug.timeout

hive.merge.rcfile.block.level

hive.merge.input.format.block.level

hive.merge.current.job.has.dynamic.partitions

hive.stats.collect.rawdatasize


*Release 8.1 *

hive.optimize.metadataonly


*Release 9 *


*Release 10 *


*Release 11 *

hive.exec.rcfile.use.sync.cache

hive.stats.key.prefix --- *internal*


*Release 12 *

hive.scratch.dir.permission

datanucleus.fixedDatastore

datanucleus.rdbms.useLegacyNativeValueStrategy

hive.optimize.sampling.orderby --- *internal?*

hive.optimize.sampling.orderby.number

hive.optimize.sampling.orderby.percent

hive.server2.authentication.ldap.Domain

hive.server2.session.hook

hive.typecheck.on.insert


*Release 13 *

hive.metastore.expression.proxy

hive.txn.manager

hive.stageid.rearrange

hive.explain.dependency.append.tasktype



What's the best way to deal with these?

   1. Ignore them (or identify those that can be ignored).
   2. Add some descriptions in Hive 0.13.0 RC1.
   3. Deal with them after HIVE-6037
   (https://issues.apache.org/jira/browse/HIVE-6037) gets committed.
      - Try to cover all of them by Hive 0.14.0:
         - Put the list in a JIRA and create a common HiveConf.java patch,
           which can be appended until release 0.14.0 is ready.
         - Accumulate descriptions in JIRA comments, then create a patch
           from the comments.
      - Deal with them as soon as possible:
         - Put the list in an umbrella JIRA and use sub-task JIRAs to add
           descriptions individually or in small groups.
   4. Deal with them in the wiki, then patch HiveConf.java before
   release 0.14.0.
   5. [Your idea goes here.]


-- Lefty


Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-13 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20096/#review40245
---



ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73193

private static final variable names should be ALL_CAPS_WITH_UNDERSCORES 
(see variables on preceding lines).




ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73194

Formatting and whitespace cleanup should generally be reserved for patches 
specifically devoted to that task. While I sympathize with the urge to clean 
things up, it makes backporting and merging patches a lot harder. If your IDE is 
doing this automatically, you need to disable that behavior.



ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73198

I think it would be good to explain the motivation for this change in the 
comment.



ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73199

I think this would be a bit cleaner if lines 173 and 174 were left 
unchanged and line 181 was updated to iterate over tableDesc.getProperties().



ql/src/test/queries/clientpositive/avro_partitioned.q
https://reviews.apache.org/r/20096/#comment73195

Good attention to detail!



ql/src/test/queries/clientpositive/avro_partitioned.q
https://reviews.apache.org/r/20096/#comment73196

May want to add ... even if it has an old schema relative to the current 
table level schema.



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73200

We should avoid defining this string token in two locations. I think it 
makes sense to refer to the one in PartitionDesc.



serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73201

I think it's a little confusing that useTablePropertiesIfAvailable mutates 
the contents of the properties object, which is then read on the next line. I 
think this code will be easier to understand if useTablePropertiesIfAvailable 
is eliminated and the code is moved into an if/else if/else block in 
determineSchemaOrThrowException().
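
A rough sketch of the if/else-if/else shape being suggested (assuming the
avro.schema.literal/avro.schema.url keys and the "table." prefix idea from this
patch; readSchemaFromUrl() is a stand-in, not the real AvroSerdeUtils helper):

{code}
// Sketch of determineSchemaOrThrowException() with the table-preference logic
// inlined instead of a separate useTablePropertiesIfAvailable() mutator.
import java.util.Properties;
import org.apache.avro.Schema;

public class SchemaResolutionSketch {
  public static Schema determineSchemaOrThrowException(Properties props) {
    // Prefer the table-level copies (stored under a "table." prefix) so an
    // old partition is still read against the current table schema.
    String literal = props.getProperty("table.avro.schema.literal",
        props.getProperty("avro.schema.literal"));
    String url = props.getProperty("table.avro.schema.url",
        props.getProperty("avro.schema.url"));
    if (literal != null) {
      return new Schema.Parser().parse(literal);
    } else if (url != null) {
      return readSchemaFromUrl(url);
    } else {
      throw new IllegalArgumentException(
          "Neither avro.schema.literal nor avro.schema.url specified");
    }
  }

  // Stand-in for the SerDe's real filesystem/HTTP schema loading.
  private static Schema readSchemaFromUrl(String url) {
    throw new UnsupportedOperationException("illustrative stub");
  }
}
{code}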



serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73202

A comment explaining what you're testing would be nice.


- Carl Steinbach


On April 7, 2014, 7:18 p.m., Anthony Hsu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/20096/
 ---
 
 (Updated April 7, 2014, 7:18 p.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 The problem occurs when you store the avro.schema.(literal|url) in the 
 SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the 
 table's schema, and then try reading from the old partition.
 
 I fixed this problem by passing the table properties to the partition with a 
 table. prefix, and changing the Avro SerDe to always use the table 
 properties when available.
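 
 A minimal sketch of the prefixing idea (names illustrative, not the exact patch):
 {code}
 // Copy table-level properties into the partition's properties under a
 // "table." prefix so the SerDe can prefer them when reading old partitions.
 import java.util.Properties;
 
 public class TablePropertyPrefixer {
   public static void copyWithPrefix(Properties tableProps, Properties partProps) {
     for (String key : tableProps.stringPropertyNames()) {
       // e.g. "avro.schema.literal" becomes "table.avro.schema.literal"
       partProps.setProperty("table." + key, tableProps.getProperty(key));
     }
   }
 }
 {code}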
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c 
   ql/src/test/queries/clientpositive/avro_partitioned.q 068a13c 
   ql/src/test/results/clientpositive/avro_partitioned.q.out 352ec0d 
   serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 
 9d58d13 
   serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 
 67d5570 
 
 Diff: https://reviews.apache.org/r/20096/diff/
 
 
 Testing
 ---
 
 Added test cases
 
 
 Thanks,
 
 Anthony Hsu
 




[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema

2014-04-13 Thread Carl Steinbach (JIRA)

 [ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-6835:
-

Status: Open  (was: Patch Available)

[~erwaman]: Please see my comments on reviewboard. Thanks.

 Reading of partitioned Avro data fails if partition schema does not match 
 table schema
 --

 Key: HIVE-6835
 URL: https://issues.apache.org/jira/browse/HIVE-6835
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
 Attachments: HIVE-6835.1.patch


 To reproduce:
 {code}
 create table testarray (a array<string>);
 load data local inpath '/home/ahsu/test/array.txt' into table testarray;
 # create partitioned Avro table with one array column
 create table avroarray partitioned by (y string) row format serde
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with serdeproperties
 ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}')
 STORED as INPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';
 insert into table avroarray partition(y=1) select * from testarray;
 # add an int column with a default value of 0
 alter table avroarray set serde
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' with
 serdeproperties('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"intfield","type":"int","default":0},{"name":"a","type":{"type":"array","items":"string"}}]}');
 # fails with ClassCastException
 select * from avroarray;
 {code}
 The select * fails with:
 {code}
 Failed with exception java.io.IOException:java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 {code}


