[jira] [Commented] (HIVE-6212) Using Presto-0.56 for sql query,but HiveServer the console print java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967814#comment-13967814 ]

Damien Carol commented on HIVE-6212:
------------------------------------

*IT'S NOT A BUG*. The report shows that the Presto CLI was invoked like this (wrong):
{code}
./presto --server localhost:9083 --catalog hive --schema default
{code}
However, the Presto server listens on port 8080 by default. The port is set by the key *http-server.http.port* in the file *config.properties* (by default, *http-server.http.port*=8080). This is the correct command for this user's configuration (OK):
{code}
./presto --server localhost:8080 --catalog hive --schema default
{code}

Using Presto-0.56 for sql query, but the HiveServer console prints java.lang.OutOfMemoryError: Java heap space
--------------------------------------------------------------------------------------------------------------
Key: HIVE-6212
URL: https://issues.apache.org/jira/browse/HIVE-6212
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.11.0
Environment: HADOOP ENVIRONMENT IS CDH5 + CDH5-HIVE-0.11 + PRESTO-0.56
Reporter: apachehadoop
Fix For: 0.11.0

Hi friends:
I can't open the page https://groups.google.com/forum/#!forum/presto-users right now, so I am posting my question here. I started hiveserver and presto-server on one machine with the commands below:
{code}
hive --service hiveserver -p 9083
./launcher run
{code}
When I connect with the Presto CLI command ./presto --server localhost:9083 --catalog hive --schema default, the console shows the prompt presto:default. When I then enter show tables, the console prints "Error running command: java.nio.channels.ClosedChannelException", and the hiveserver console prints the following:
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "pool-1-thread-1" java.lang.OutOfMemoryError: Java heap space
	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:662)

My configuration files are below:

node.properties:
node.environment=production
node.id=cc4a1bbf-5b98-4935-9fde-2cf1c98e8774
node.data-dir=/home/hadoop/cloudera-5.0.0/presto-0.56/presto/data

config.properties:
coordinator=true
datasources=jmx
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/home/hadoop/cloudera-5.0.0/presto-0.56/presto/db/MetaStore
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://slave4:8080

jvm.config:
-server
-Xmx16G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:PermSize=150M
-XX:MaxPermSize=150M
-XX:ReservedCodeCacheSize=150M
-Xbootclasspath/p:/home/hadoop/cloudera-5.0.0/presto-0.56/presto-server-0.56/lib/floatingdecimal-0.1.jar

log.properties:
com.facebook.presto=DEBUG

catalog/hive.properties:
connector.name=hive-cdh4
hive.metastore.uri=thrift://slave4:9083

HADOOP ENVIRONMENT IS CDH5 + CDH5-HIVE-0.11 + PRESTO-0.56

Finally, I increased the Java heap size for the Hive metastore, but it still gives the same error. Please help me check whether this is a bug in CDH5. Thanks.

Additional information:
I have tested presto-server-0.55, 0.56, and 0.57 on CDH4 with hive-0.10 and hive-0.11, but they all show the same errors as above. On the coordinator machine, the etc directory and configuration files are as below:

coordinator config.properties:
coordinator=true
datasources=jmx
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/home/hadoop/cloudera-5.0.0/presto-0.55/presto/db/MetaStore
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://name:8080

jvm.config:
-server
-Xmx4G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:PermSize=150M
-XX:MaxPermSize=150M
-XX:ReservedCodeCacheSize=150M
-Xbootclasspath/p:/home/hadoop/cloudera-5.0.0/presto-0.55/presto-server-0.55/lib/floatingdecimal-0.1.jar
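The root cause Damien Carol identifies below (pointing the HTTP-speaking Presto CLI at the Thrift port 9083) would also account for the exact OutOfMemoryError stack above. A minimal sketch in Python, assuming Thrift's unframed TBinaryProtocol behavior of reading a leading big-endian i32 from the wire and, when it is non-negative, passing it to readStringBody as a string length to allocate:

```python
import struct

# The Presto CLI speaks HTTP, but port 9083 is a Thrift port. The Thrift
# server's TBinaryProtocol.readMessageBegin reads the first 4 bytes of the
# incoming HTTP request ("POST ...") as a big-endian i32. Because the value
# is non-negative, it is treated as a string length and handed to
# readStringBody, which tries to allocate a buffer of that size.
misread_length = struct.unpack(">i", b"POST")[0]
print(misread_length)  # 1347375956 bytes, roughly 1.25 GiB

# An allocation of ~1.25 GiB for a single "method name" string is what
# blows the heap and produces the OutOfMemoryError in the stack trace.
print(misread_length / 2**30)
```

This is a sketch of the mismatch mechanism, not Hive or Thrift source code; the takeaway is that the OOM is a symptom of sending HTTP bytes to a Thrift listener, not a heap-sizing problem.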
[jira] [Resolved] (HIVE-6212) Using Presto-0.56 for sql query,but HiveServer the console print java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo resolved HIVE-6212.
-----------------------------------
Resolution: Won't Fix

Contact the Presto developers.
[jira] [Commented] (HIVE-6212) Using Presto-0.56 for sql query,but HiveServer the console print java.lang.OutOfMemoryError: Java heap space
[ https://issues.apache.org/jira/browse/HIVE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967874#comment-13967874 ]

Edward Capriolo commented on HIVE-6212:
---------------------------------------

We don't support Presto.
[jira] [Updated] (HIVE-6059) Add union type support in LazyBinarySerDe
[ https://issues.apache.org/jira/browse/HIVE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johndee Burks updated HIVE-6059:
--------------------------------
Attachment: repro.tar

[~houckman] The file repro.tar has all the reproduction information. The DDL needs to be changed for the schema file path. Run the DML afterwards and the problem should reproduce.

Add union type support in LazyBinarySerDe
-----------------------------------------
Key: HIVE-6059
URL: https://issues.apache.org/jira/browse/HIVE-6059
Project: Hive
Issue Type: New Feature
Components: File Formats
Affects Versions: 0.12.0
Reporter: Chaoyu Tang
Attachments: Hive Issue 0.jpeg, Hive Issue 1.jpeg, Hive Issue 2.jpeg, Hive Issue 3.jpeg, repro.tar

We need support for the union type in LazyBinarySerDe. It is required by any join query with a union type among its select values, because the reduce-side values in a join are serialized and deserialized with LazyBinarySerDe. Without it, we see errors like:
{code}
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:106)
	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardObjectInspector(ObjectInspectorUtils.java:156)
	at org.apache.hadoop.hive.ql.exec.JoinUtil.getStandardObjectInspectors(JoinUtil.java:98)
	at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:261)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:61)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360)
	at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150)
{code}

--
This message was sent by Atlassian JIRA (v6.2#6252)
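For illustration, a join of this shape exercises the code path described above. This is a hedged sketch with hypothetical table and column names, using Hive's UNIONTYPE DDL syntax:

```sql
-- Hypothetical repro shape: a join whose select list carries a union-typed
-- column. The reduce-side values holding u are round-tripped through
-- LazyBinarySerDe, which previously lacked uniontype support and failed
-- with the NullPointerException shown above.
CREATE TABLE t1 (id INT, u UNIONTYPE<INT, STRING>);
CREATE TABLE t2 (id INT, name STRING);

SELECT t1.u, t2.name
FROM t1 JOIN t2 ON (t1.id = t2.id);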
49 config params without descriptions
Here's a list of 49 configuration parameters in RC0 (and trunk) that don't have descriptions in hive-default.xml.template:

*Release 1 or 2*
hive.exec.submitviachild
hive.metastore.metadb.dir
hive.jar.path
hive.aux.jars.path
hive.table.name
hive.partition.name
hive.alias

*Release 3*
hive.cli.errors.ignore

*Release 4*
hive.added.files.path
hive.added.jars.path

*Release 5*
hive.intermediate.compression.codec
hive.intermediate.compression.type
hive.added.archives.path

*Release 6*
hive.metastore.archive.intermediate.archived
hive.metastore.archive.intermediate.extracted
hive.mapred.partitioner
hive.exec.script.trust
hive.hadoop.supports.splittable.combineinputformat

*Release 7*
hive.lockmgr.zookeeper.default.partition.name
hive.metastore.fs.handler.class
hive.query.result.fileformat
hive.hashtable.initialCapacity
hive.hashtable.loadfactor
hive.debug.localtask
hive.lock.manager
hive.outerjoin.supports.filters
hive.semantic.analyzer.hook

*Release 8*
hive.exec.job.debug.timeout
hive.exec.tasklog.debug.timeout
hive.merge.rcfile.block.level
hive.merge.input.format.block.level
hive.merge.current.job.has.dynamic.partitions
hive.stats.collect.rawdatasize

*Release 8.1*
hive.optimize.metadataonly

*Release 9*

*Release 10*

*Release 11*
hive.exec.rcfile.use.sync.cache
hive.stats.key.prefix *internal*

*Release 12*
hive.scratch.dir.permission
datanucleus.fixedDatastore
datanucleus.rdbms.useLegacyNativeValueStrategy
hive.optimize.sampling.orderby *internal?*
hive.optimize.sampling.orderby.number
hive.optimize.sampling.orderby.percent
hive.server2.authentication.ldap.Domain
hive.server2.session.hook
hive.typecheck.on.insert

*Release 13*
hive.metastore.expression.proxy
hive.txn.manager
hive.stageid.rearrange
hive.explain.dependency.append.tasktype

What's the best way to deal with these?

1. Ignore them (or identify those that can be ignored).
2. Add some descriptions in Hive 0.13.0 RC1.
3. Deal with them after HIVE-6037 (https://issues.apache.org/jira/browse/HIVE-6037) gets committed.
   - Try to cover all of them by Hive 0.14.0:
     - Put the list in a JIRA and create a common HiveConf.java patch, which can be appended until release 0.14.0 is ready.
     - Accumulate descriptions in JIRA comments, then create a patch from the comments.
   - Deal with them as soon as possible:
     - Put the list in an umbrella JIRA and use sub-task JIRAs to add descriptions individually or in small groups.
4. Deal with them in the wiki, then patch HiveConf.java before release 0.14.0.
5. [Your idea goes here.]

-- Lefty
Re: Review Request 20096: HIVE-6835: Reading of partitioned Avro data fails if partition schema does not match table schema
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20096/#review40245
---

ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73193
private static final variable names should be ALL_CAPS_WITH_UNDERSCORES (see variables on preceding lines).

ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73194
Formatting and whitespace cleanup should generally be reserved for patches specifically devoted to that task. While I sympathize with the urge to clean things up, it makes backporting and merging patches a lot harder. If your IDE is doing this automatically, you need to disable that behavior.

ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73198
I think it would be good to explain the motivation for this change in the comment.

ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
https://reviews.apache.org/r/20096/#comment73199
I think this would be a bit cleaner if lines 173 and 174 were left unchanged and line 181 was updated to iterate over tableDesc.getProperties().

ql/src/test/queries/clientpositive/avro_partitioned.q
https://reviews.apache.org/r/20096/#comment73195
Good attention to detail!

ql/src/test/queries/clientpositive/avro_partitioned.q
https://reviews.apache.org/r/20096/#comment73196
May want to add "... even if it has an old schema relative to the current table-level schema."

serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73200
We should avoid defining this string token in two locations. I think it makes sense to refer to the one in PartitionDesc.
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73201
I think it's a little confusing that useTablePropertiesIfAvailable mutates the contents of the properties object, which is then read on the next line. I think this code will be easier to understand if useTablePropertiesIfAvailable is eliminated and the code is moved into an if/else-if/else block in determineSchemaOrThrowException().

serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java
https://reviews.apache.org/r/20096/#comment73202
A comment explaining what you're testing would be nice.

- Carl Steinbach

On April 7, 2014, 7:18 p.m., Anthony Hsu wrote:
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20096/
---
(Updated April 7, 2014, 7:18 p.m.)

Review request for hive.
Repository: hive-git

Description
---
The problem occurs when you store the avro.schema.(literal|url) in the SERDEPROPERTIES instead of the TBLPROPERTIES, add a partition, change the table's schema, and then try reading from the old partition. I fixed this problem by passing the table properties to the partition with a "table." prefix, and changing the Avro SerDe to always use the table properties when available.

Diffs
---
ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 43cef5c
ql/src/test/queries/clientpositive/avro_partitioned.q 068a13c
ql/src/test/results/clientpositive/avro_partitioned.q.out 352ec0d
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 9d58d13
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerdeUtils.java 67d5570

Diff: https://reviews.apache.org/r/20096/diff/

Testing
---
Added test cases

Thanks,
Anthony Hsu
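The fix described in this review, propagating table-level properties to each partition under a "table." prefix and having the Avro SerDe prefer them, can be sketched as follows. This is a simplified Python illustration with hypothetical function names, not the actual PartitionDesc/AvroSerdeUtils code:

```python
# Hypothetical prefix mirroring the patch's idea; the real constant lives
# in PartitionDesc in the actual change.
TABLE_PREFIX = "table."

def add_table_properties(partition_props, table_props):
    # Copy every table-level property into the partition's properties,
    # namespaced with the "table." prefix so they cannot collide with
    # partition-level keys written when the partition was created.
    for key, value in table_props.items():
        partition_props[TABLE_PREFIX + key] = value
    return partition_props

def effective_avro_schema(props):
    # Prefer the table-level schema when available, so a partition written
    # under an older schema is still read using the current table schema.
    for key in ("avro.schema.literal", "avro.schema.url"):
        if TABLE_PREFIX + key in props:
            return props[TABLE_PREFIX + key]
        if key in props:
            return props[key]
    return None
```

With this shape, a partition that still carries an old avro.schema.literal resolves to the table's current schema whenever the prefixed copy is present.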
[jira] [Updated] (HIVE-6835) Reading of partitioned Avro data fails if partition schema does not match table schema
[ https://issues.apache.org/jira/browse/HIVE-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-6835:
---------------------------------
Status: Open (was: Patch Available)

[~erwaman]: Please see my comments on Review Board. Thanks.

Reading of partitioned Avro data fails if partition schema does not match table schema
--------------------------------------------------------------------------------------
Key: HIVE-6835
URL: https://issues.apache.org/jira/browse/HIVE-6835
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Anthony Hsu
Assignee: Anthony Hsu
Attachments: HIVE-6835.1.patch

To reproduce:
{code}
create table testarray (a array<string>);

load data local inpath '/home/ahsu/test/array.txt' into table testarray;

-- create partitioned Avro table with one array column
create table avroarray partitioned by (y string)
row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"a","type":{"type":"array","items":"string"}}]}')
stored as
inputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

insert into table avroarray partition(y=1) select * from testarray;

-- add an int column with a default value of 0
alter table avroarray set serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
with serdeproperties ('avro.schema.literal'='{"namespace":"test","name":"avroarray","type":"record","fields":[{"name":"intfield","type":"int","default":0},{"name":"a","type":{"type":"array","items":"string"}}]}');

-- fails with ClassCastException
select * from avroarray;
{code}
The select * fails with:
{code}
Failed with exception java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
{code}