[jira] [Commented] (HIVE-8485) HMS on Oracle incompatibility
[ https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237100#comment-14237100 ]

Andy Jefferson commented on HIVE-8485:

DataNucleus has no problem with nulls or empty strings, in that it issues SQL to a datastore and the store replies. DN also provides the ability for the user to decide how nulls are stored in Oracle via the property datanucleus.rdbms.persistEmptyStringAsNull. I can't see how whatever your problem is (table missing?!) relates to DataNucleus software.

HMS on Oracle incompatibility

Key: HIVE-8485
URL: https://issues.apache.org/jira/browse/HIVE-8485
Project: Hive
Issue Type: Bug
Components: Metastore
Environment: Oracle as metastore DB
Reporter: Ryan Pridgeon

Oracle does not distinguish between empty strings and NULL, which proves problematic for DataNucleus. If a user creates a table with some property stored as an empty string, the table will no longer be accessible, e.g.
TBLPROPERTIES ('serialization.null.format'='')
If they try to select, describe, drop, etc., the client prints the following exception:
ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found table name
The workaround for this was to go into the Hive metastore on the Oracle database and replace NULL with some other string. Users could then drop the tables or alter their data to use the new null format they just set.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
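The behaviour described in HIVE-8485 can be illustrated with a small toy model. This is only a sketch under stated assumptions: `storeLikeOracle` and `roundTrip` are hypothetical names, not DataNucleus or Hive APIs, and the model simply treats an empty string as NULL on write, which is how the report describes Oracle. It does not exercise datanucleus.rdbms.persistEmptyStringAsNull itself.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a store that, like Oracle in the report, cannot tell '' from NULL. */
public class EmptyStringAsNullDemo {

    /** Hypothetical stand-in for a string column: an empty string is stored as NULL. */
    public static String storeLikeOracle(String value) {
        return (value == null || value.isEmpty()) ? null : value;
    }

    /** Writes table properties through the toy store and returns what is read back. */
    public static Map<String, String> roundTrip(Map<String, String> props) {
        Map<String, String> readBack = new HashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            readBack.put(e.getKey(), storeLikeOracle(e.getValue()));
        }
        return readBack;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("serialization.null.format", "");
        Map<String, String> readBack = roundTrip(props);
        // The key survives the round trip, but its value is now null rather than "".
        System.out.println("key present: " + readBack.containsKey("serialization.null.format"));
        System.out.println("value is null: " + (readBack.get("serialization.null.format") == null));
    }
}
```

Any code that later assumes the property value is a non-null string would trip over the value read back here, which is consistent with the workaround in the report of replacing NULL with some other string.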
[jira] [Commented] (HIVE-7368) datanucleus sometimes returns an empty result instead of an error or data
[ https://issues.apache.org/jira/browse/HIVE-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194386#comment-14194386 ]

Andy Jefferson commented on HIVE-7368:

"DN creates a bunch of deleteme* tables ..."

It creates a _single_ DELETEME* table _when_ the RDBMS doesn't provide another mechanism for checking the catalog/schema in use, _per PMF_ (and it isn't 'testing connectivity'). The only way there will be a bunch is if the application is using multiple PMFs (why would it need multiple?), and even then the cost of a create/drop of a DELETEME table is insignificant compared to the overall PMF startup cost.

datanucleus sometimes returns an empty result instead of an error or data

Key: HIVE-7368
URL: https://issues.apache.org/jira/browse/HIVE-7368
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.12.0
Reporter: Sushanth Sowmyan

I investigated a scenario wherein a user needed to use a large number of concurrent hive clients doing simple DDL tasks, while not using a standalone metastore server. Say, for example, each of them doing
drop table if exists tmp_blah_${i};
This would consistently fail stating that it could not create a db, which is a funny error to have when trying to drop a table if it exists. On digging in, it turned out that the error was a mistaken report, coming instead from an attempt by the embedded metastore to create a default db when it did not exist. The funny thing is that the default db did exist, and the getDatabase call would return empty, rather than returning an error or returning a result. We could disable hive.metastore.checkForDefaultDb and the number of these reports would fall drastically, but that only moved the problem, and we'd get phantom reports from time to time of various other databases that existed being reported as non-existent.

On digging further, parallelism seemed to be an important factor in whether or not hive was able to perform getDatabases without error. With about 20 simultaneous processes, there seemed to be no errors whatsoever. At about 40 simultaneous processes, at least 1 would consistently fail. At about 200, about 15-20 would consistently fail, in addition to taking a long time to run.

I wrote a sample JDBC ping (actually a get_database mimic) utility to see whether the issue was with connecting from that machine to the database server, and this had no errors whatsoever up to 400 simultaneous processes. The mysql server in question was configured to serve up to 650 connections, and it seemed to be serving responses quickly and did not seem overloaded.

We also disabled connection pooling in case that was exacerbating a connection availability issue with that many concurrent processes, each with an embedded metastore. That, especially in conjunction with disabling schema checking and specifying datanucleus.connectionPool.testSQL=SELECT 1, did a fair amount for performance in these scenarios, but the errors (or rather, the null-result successes when there shouldn't have been one) continued.

On checking through hive again, if we modified hive to have datanucleus simply return a connection, with which we did a direct SQL get database, there would not be any error. If we tried to use JDO on datanucleus to construct a db object, we would get an empty result, so the issue seems to crop up in the JDO mapping.

One of the biggest issues with this investigation, for me, was the difficulty of reproducing it. When trying to reproduce in a lab, we were unable to create a similar enough environment that caused the issue. Even in the client's environment, moving from RHEL5 to RHEL6 made the issue go away.

Thus, we still have work to do on determining the underlying issue. I'm logging this issue to collect information on similar issues we discover so we can work towards nailing down the issue and then fixing it (in DN if need be).
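The HIVE-7368 investigation hinges on telling "lookup returned empty" apart from "lookup failed" as the number of concurrent clients grows. A minimal probe along those lines might look like the sketch below. `ConcurrentLookupProbe` and `countEmptyResults` are hypothetical names, and the lookup is a placeholder; the reporter's actual JDBC ping utility is not shown in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentLookupProbe {

    /**
     * Runs `clients` lookups on a pool of `clients` threads and returns how many
     * came back empty. `clients` must be greater than zero.
     */
    public static int countEmptyResults(int clients, Callable<Optional<String>> lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            List<Future<Optional<String>>> futures = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                futures.add(pool.submit(lookup));
            }
            int empty = 0;
            for (Future<Optional<String>> f : futures) {
                // A lookup that throws surfaces below as an exception, so real errors
                // stay distinguishable from "no such database" results.
                if (!f.get().isPresent()) {
                    empty++;
                }
            }
            return empty;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("lookup failed rather than returning empty", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder lookup; a real probe would query the metastore or the database here.
        Callable<Optional<String>> lookup = () -> Optional.of("default");
        System.out.println("empty results: " + countEmptyResults(40, lookup));
    }
}
```

Running the same probe at the client counts mentioned in the report (about 20, 40 and 200) would show whether empty results scale with parallelism, assuming the placeholder is replaced by a real lookup.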
[jira] [Commented] (HIVE-6336) Issue is hive 12 datanucleus incompatability with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
[ https://issues.apache.org/jira/browse/HIVE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889488#comment-13889488 ]

Andy Jefferson commented on HIVE-6336:

@Nigel Savage, the most recent releases are as follows: datanucleus-core-3.2.12, datanucleus-api-jdo-3.2.8, datanucleus-rdbms-3.2.11. Note that HIVE-5218 requires datanucleus-rdbms-3.2.7 or later, and HIVE-6136 requires datanucleus-rdbms-3.2.11 too.

Issue is hive 12 datanucleus incompatability with org.apache.hadoop.hive.contrib.serde2.RegexSerDe

Key: HIVE-6336
URL: https://issues.apache.org/jira/browse/HIVE-6336
Project: Hive
Issue Type: Wish
Components: HiveServer2
Affects Versions: 0.12.0
Environment: Hadoop 2.2 local derby Metastore embedded
Reporter: Nigel Savage
Priority: Blocker
Labels: HADOOP

There is an incompatibility between Hive 0.12 and DataNucleus that seems to involve org.apache.hadoop.hive.contrib.serde2.RegexSerDe.

The main question: *IF Hive 0.12.0 and DataNucleus are compatible, then what is the version of DataNucleus I should be using with Hive 0.12 and Hadoop 2.2?*

The error I'm getting blocks me from properly running Hive queries invoked from the test phase of a Maven project.

*To reproduce*

I have Hadoop and Hive running as a pseudo cluster in local mode, with Derby as the metastore. I have the following environment variables:
{noformat}
HADOOP_HOME=/home/ubu/hadoop
JAVA_HOME=/usr/lib/jvm/java-7-oracle
{noformat}
I have the RegexSerDe declared in the hive-site.xml:
{noformat}
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/ubu/hadoop/lib/hive-contrib-0.12.0.jar</value>
  <description>This JAR file available to all users for all jobs</description>
</property>
{noformat}
If I run with
{noformat}
<datanucleus.version>3.0.2</datanucleus.version>
{noformat}
I get only the following exception:
'java.lang.ClassNotFoundException...org.datanucleus.store.types.backed.Ma'

HOWEVER, if I run with
{noformat}
<datanucleus.version>3.2.0-release</datanucleus.version>
{noformat}
I get only the following exception:
java.lang.ClassNotFoundException: org/apache/hadoop/hive/contrib/serde2/RegexSerDe

EXPLANATION

With DataNucleus *3.0.2*, the RegexSerDe class is picked up at run time but the DataNucleus Map class is not available; I checked the datanucleus-core 3.0.2 jar and the class is missing. After upgrading to 3.2.0-release, the first DataNucleus release above 3.0.2 that includes the Map class, the Map class is found, but for some unknown reason Hive no longer finds the RegexSerDe class and throws the ClassNotFoundException for it.

I started from the same DataNucleus dependencies found in this Hive pom:
http://maven-repository.com/artifact/org.apache.hive/hive-metastore/0.12.0/pom
Below are the dependencies from my latest attempt to get a functioning pom:
{noformat}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<!-- misc -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.1</version>
</dependency>
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.derby</groupId>
  <artifactId>derby</artifactId>
  <version>${derby.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-core</artifactId>
  <version>${datanucleus.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-rdbms</artifactId>
  <version>${datanucleus-rdbms.version}</version>
</dependency>
<dependency>
  <groupId>javax.jdo</groupId>
  <artifactId>jdo-api</artifactId>
  <version>3.0.1</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-api-jdo</artifactId>
  <version>${datanucleus.jdo.version}</version>
  <exclusions>
    <exclusion>
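For the two ClassNotFoundException cases in HIVE-6336, a quick way to see which classes a given test classpath can actually resolve is a small loader check. This is only a diagnostic sketch: `ClasspathCheck` and `isLoadable` are hypothetical names, and the check covers only the classpath of the JVM it runs in. It is an assumption that this matches what Hive sees, since jars listed in hive.aux.jars.path may be loaded separately at run time.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's loader, without initialising it. */
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass fully qualified class names as arguments, for example
        // org.apache.hadoop.hive.contrib.serde2.RegexSerDe
        for (String name : args) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it once per DataNucleus version, with the same dependency set as the failing Maven test phase, would show whether the missing class is absent from the classpath or present but not visible to the code that loads it.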
[jira] [Commented] (HIVE-6336) Issue is hive 12 datanucleus incompatability with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
[ https://issues.apache.org/jira/browse/HIVE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888599#comment-13888599 ]

Andy Jefferson commented on HIVE-6336:

For some reason your HIVE pom has datanucleus-api-jdo with exclusions for jdo2-api, junit, log4j ... yet these are clearly seen to be either non-existent (jdo2-api) or test scope only (junit, log4j) for the datanucleus-api-jdo project itself; such exclusions have no logical reason to be (IMHO). See http://central.maven.org/maven2/org/datanucleus/datanucleus-api-jdo/3.2.1/datanucleus-api-jdo-3.2.1.pom

Issue is hive 12 datanucleus incompatability with org.apache.hadoop.hive.contrib.serde2.RegexSerDe

Key: HIVE-6336
URL: https://issues.apache.org/jira/browse/HIVE-6336
Project: Hive
Issue Type: Wish
Components: HiveServer2
Affects Versions: 0.12.0
Environment: Hadoop 2.2 local derby Metastore embedded
Reporter: Nigel Savage
Priority: Blocker
Labels: HADOOP
[jira] [Commented] (HIVE-6136) Hive metastore configured with DB2 LUW doesn't work
[ https://issues.apache.org/jira/browse/HIVE-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883768#comment-13883768 ]

Andy Jefferson commented on HIVE-6136:

FWIW datanucleus-core 3.2.12, datanucleus-rdbms 3.2.11 are released

Hive metastore configured with DB2 LUW doesn't work

Key: HIVE-6136
URL: https://issues.apache.org/jira/browse/HIVE-6136
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.12.0
Reporter: Thomas Friedrich
Attachments: hive.log

Hive 0.12 with datanucleus 3.2.1 generates invalid SQL syntax if the metastore is configured with DB2. To reproduce the issue, simply create a table and drop it using Hive CLI:
create table test(i1 int);
drop table test;
Drop will fail and this is the stacktrace:
{noformat}
com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, SQLERRMC=SUBQ.A0.CREATE_TIME, DRIVER=4.16.53
    at com.ibm.db2.jcc.am.fd.a(fd.java:739)
    at com.ibm.db2.jcc.am.fd.a(fd.java:60)
    at com.ibm.db2.jcc.am.fd.a(fd.java:127)
    at com.ibm.db2.jcc.am.to.c(to.java:2771)
    at com.ibm.db2.jcc.am.to.d(to.java:2759)
    at com.ibm.db2.jcc.am.to.a(to.java:2192)
    at com.ibm.db2.jcc.am.uo.a(uo.java:7827)
    at com.ibm.db2.jcc.t4.ab.h(ab.java:141)
    at com.ibm.db2.jcc.t4.ab.b(ab.java:41)
    at com.ibm.db2.jcc.t4.o.a(o.java:32)
    at com.ibm.db2.jcc.t4.tb.i(tb.java:145)
    at com.ibm.db2.jcc.am.to.kb(to.java:2161)
    at com.ibm.db2.jcc.am.uo.wc(uo.java:3657)
    at com.ibm.db2.jcc.am.uo.b(uo.java:4454)
    at com.ibm.db2.jcc.am.uo.jc(uo.java:760)
    at com.ibm.db2.jcc.am.uo.executeQuery(uo.java:725)
    at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:172)
    at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:381)
    at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:504)
    at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:637)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1786)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
    at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:266)
    at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1698)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1428)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1402)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:124)
    at com.sun.proxy.$Proxy7.getPartitions(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.dropPartitionsAndGetLocations(HiveMetaStore.java:1286)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:1189)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1328)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.
    at com.sun.proxy.$Proxy8.drop_table_with_environment_context(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreCl
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.j
    at java.lang.reflect.Method.invoke(Method.java:611)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaSt
    at com.sun.proxy.$Proxy9.dropTable(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:869)
[jira] [Commented] (HIVE-4456) Datanucleus throws NPE after passing a config from test file (.q) to hive metastore
[ https://issues.apache.org/jira/browse/HIVE-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818793#comment-13818793 ]

Andy Jefferson commented on HIVE-4456:

Using ancient version of DataNucleus; suggest whoever saw this use something recent ...

Datanucleus throws NPE after passing a config from test file (.q) to hive metastore

Key: HIVE-4456
URL: https://issues.apache.org/jira/browse/HIVE-4456
Project: Hive
Issue Type: Bug
Components: Configuration, Metastore
Reporter: Gang Tim Liu
Priority: Critical
Attachments: err.txt

Create a configuration file with the following:
set hive.metastore.ds.retry.interval=2000;
create table analyze_srcpart like srcpart;
Run:
ant test -Dtestcase=TestCliDriver -Dqfile=file
An NPE is thrown. See attached files.

Anything special for hive.metastore.ds.retry.interval? It is a config listed under HiveConf.metaVars. Then, HiveConf.get(HiveConf c) will recreate a new conf while detecting a difference.
[jira] [Commented] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid in
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818791#comment-13818791 ]

Andy Jefferson commented on HIVE-5457:

This is with an old version of DataNucleus, yet Hive upgraded a while ago (before this was raised). Using current DataNucleus there are zero reported concurrency problems, particularly if using either an auto-start mechanism or persistence.xml with datanucleus.persistenceUnitLoadClasses=true.

Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping

Key: HIVE-5457
URL: https://issues.apache.org/jira/browse/HIVE-5457
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.10.0
Reporter: Lenni Kuff
Priority: Critical

Concurrent calls to getTable() result in:
MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping

This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I update my app to serialize all calls to getTable() this problem is resolved.

Stack Trace:
{code}
Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping.
    at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307)
    at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407)
    at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257)
    at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46)
    at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440)
    at org.datanucleus.sco.backed.List.size(List.java:557)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017)
    at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872)
    at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
    at $Proxy6.getTable(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349)
{code}
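The HIVE-5457 reporter's workaround, serializing all calls to getTable(), can be sketched as a thin wrapper around the call. `SerializedCalls` is a hypothetical helper, and the lambda in `main` is only a stand-in for a metastore client call; neither is part of Hive or DataNucleus. It trades throughput for safety and does not address the underlying DataNucleus version issue raised in the comment.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

public class SerializedCalls<K, V> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Function<K, V> delegate;

    public SerializedCalls(Function<K, V> delegate) {
        this.delegate = delegate;
    }

    /** Invokes the delegate while holding the lock, so calls through this wrapper never overlap. */
    public V call(K key) {
        lock.lock();
        try {
            return delegate.apply(key);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a metastore client's getTable(name).
        SerializedCalls<String, String> getTable =
                new SerializedCalls<>(name -> "table:" + name);
        System.out.println(getTable.call("t1"));
    }
}
```

All callers have to share the same wrapper instance for the serialization to hold; two wrappers around the same client would still allow overlapping calls.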
[jira] [Commented] (HIVE-5218) datanucleus does not work with MS SQLServer in Hive metastore
[ https://issues.apache.org/jira/browse/HIVE-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803019#comment-13803019 ]

Andy Jefferson commented on HIVE-5218:

FYI 3.2.7 of datanucleus-rdbms is released

datanucleus does not work with MS SQLServer in Hive metastore

Key: HIVE-5218
URL: https://issues.apache.org/jira/browse/HIVE-5218
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.12.0
Reporter: shanyu zhao
Attachments: 0001-HIVE-5218-datanucleus-does-not-work-with-SQLServer-i.patch, HIVE-5218.patch

HIVE-3632 upgraded datanucleus version to 3.2.x, however, this version of datanucleus doesn't work with SQLServer as the metastore. The problem is that datanucleus tries to use fully qualified object name to find a table in the database but couldn't find it. If I downgrade the version to HIVE-2084, SQLServer works fine. It could be a bug in datanucleus.

This is the detailed exception I'm getting when using datanucleus 3.2.x with SQL Server:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDOException: Exception thrown calling table.exists() for a2ee36af45e9f46c19e995bfd2d9b5fd1hivemetastore..SEQUENCE_TABLE
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
    at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:732)
    …
    at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
    at $Proxy0.createTable(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1071)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1104)
    …
    at $Proxy11.create_table_with_environment_context(Unknown Source)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6417)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:6401)
NestedThrowablesStackTrace:
com.microsoft.sqlserver.jdbc.SQLServerException: There is already an object named 'SEQUENCE_TABLE' in the database.
    at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
    at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4615)
    at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
    at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649)
    at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:300)
    at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
    at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711)
    at org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425)
    at org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488)
    at org.datanucleus.store.rdbms.valuegenerator.TableGenerator.repositoryExists(TableGenerator.java:242)
    at org.datanucleus.store.rdbms.valuegenerator.AbstractRDBMSGenerator.obtainGenerationBlock(AbstractRDBMSGenerator.java:86)
    at org.datanucleus.store.valuegenerator.AbstractGenerator.obtainGenerationBlock(AbstractGenerator.java:197)
    at org.datanucleus.store.valuegenerator.AbstractGenerator.next(AbstractGenerator.java:105)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.getStrategyValueForGenerator(RDBMSStoreManager.java:2019)
    at org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1385)
    at org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3727)
[jira] [Commented] (HIVE-3256) Update asm version in Hive
[ https://issues.apache.org/jira/browse/HIVE-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722245#comment-13722245 ]

Andy Jefferson commented on HIVE-3256:

https://issues.apache.org/jira/browse/HIVE-3632 upgraded Hive to use DN 3.2.x. This comes with its own repackaged ASM internally, so you don't need any ASM for DataNucleus any longer. Consequently any DN-utilising system can use whichever version of ASM it requires.

Update asm version in Hive

Key: HIVE-3256
URL: https://issues.apache.org/jira/browse/HIVE-3256
Project: Hive
Issue Type: Bug
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo

Hive trunk is currently using asm version 3.1; Hadoop trunk is on 3.2. Any objections to bumping the Hive version to 3.2 to be in line with Hadoop?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3994) Hive metastore is not working on PostgreSQL 9.2 (most likely on anything 9.0+)
[ https://issues.apache.org/jira/browse/HIVE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626362#comment-13626362 ]

Andy Jefferson commented on HIVE-3994:

Obviously DataNucleus 3.x supports this new PostgreSQL syntax, but sadly you keep on using an ancient version.

Hive metastore is not working on PostgreSQL 9.2 (most likely on anything 9.0+)

Key: HIVE-3994
URL: https://issues.apache.org/jira/browse/HIVE-3994
Project: Hive
Issue Type: Improvement
Reporter: Jarek Jarcec Cecho

I'm getting the following exception when running the metastore on PostgreSQL 9.2:
{code}
Caused by: javax.jdo.JDODataStoreException: Error executing JDOQL query SELECT THIS.TBL_NAME AS NUCORDER0 FROM TBLS THIS LEFT OUTER JOIN DBS THIS_DATABASE_NAME ON THIS.DB_ID = THIS_DATABASE_NAME.DB_ID WHERE THIS_DATABASE_NAME.NAME = ? AND (LOWER(THIS.TBL_NAME) LIKE ? ESCAPE '\\' ) ORDER BY NUCORDER0 : ERROR: invalid escape string
  Hint: Escape string must be empty or one character..
NestedThrowables:
org.postgresql.util.PSQLException: ERROR: invalid escape string
  Hint: Escape string must be empty or one character.
    at org.datanucleus.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:313)
    at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:252)
    at org.apache.hadoop.hive.metastore.ObjectStore.getTables(ObjectStore.java:759)
    ... 28 more
Caused by: org.postgresql.util.PSQLException: ERROR: invalid escape string
  Hint: Escape string must be empty or one character.
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2096)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1829)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:510)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:386)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:271)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
    at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:457)
    at org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:123)
    at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
    at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1499)
    at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:243)
    ... 29 more
{code}
I've googled a bit about that and found a lot of similar issues in different projects, so I'm assuming that this might be some backward compatibility issue on the PostgreSQL side.
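One plausible reading of the "invalid escape string" error in HIVE-3994, offered here as an assumption rather than something the issue confirms, is that the literal in ESCAPE '\\' denotes one character when backslashes act as escapes inside string literals, but two characters when they are treated literally (PostgreSQL's standard_conforming_strings setting, which defaults to on from 9.1). The sketch below only counts characters under the two interpretations; `denotedLength` is a hypothetical helper and deliberately ignores other literal syntax such as doubled quotes.

```java
public class LikeEscapeLength {

    /**
     * Length of the string denoted by the body of a SQL string literal.
     * With standardConforming=false a backslash escapes the next character;
     * with standardConforming=true backslashes are ordinary characters.
     */
    public static int denotedLength(String literalBody, boolean standardConforming) {
        if (standardConforming) {
            return literalBody.length();
        }
        int length = 0;
        for (int i = 0; i < literalBody.length(); i++) {
            if (literalBody.charAt(i) == '\\' && i + 1 < literalBody.length()) {
                i++; // skip the escaped character; the pair denotes one character
            }
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        String body = "\\\\"; // the two backslashes between the quotes in ESCAPE '\\'
        System.out.println("backslash-escape mode: " + denotedLength(body, false));
        System.out.println("standard-conforming mode: " + denotedLength(body, true));
    }
}
```

Under this reading, the same generated SQL is valid in the first mode and rejected in the second, because the ESCAPE clause accepts at most one character.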
[jira] [Commented] (HIVE-3521) Concurrent metastore calls provoke Datanucleus IllegalStateException: Table object has not been been initialised
[ https://issues.apache.org/jira/browse/HIVE-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575197#comment-13575197 ]

Andy Jefferson commented on HIVE-3521:

I've never seen it, and that private company running that proprietary benchmark (for their proprietary product) didn't bother configuring DataNucleus (surprising?). Use of the DN persistence property datanucleus.persistenceUnitLoadClasses at startup together with a persistence-unit, or alternatively use of an auto-start mechanism, as well as locking of your PMF/EMF until it was initialised, would clearly mean that the internal schema was initialised. You seem to be executing a query while it is still discovering classes that map to your schema; not a good way of running.

Concurrent metastore calls provoke Datanucleus IllegalStateException: Table object has not been been initialised

Key: HIVE-3521
URL: https://issues.apache.org/jira/browse/HIVE-3521
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Carl Steinbach
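The "locking of your PMF/EMF until it was initialised" advice in the HIVE-3521 comment can be sketched with a generic initialise-once holder. `InitOnce` is a hypothetical class, and the supplier in `main` is a placeholder for wherever the application would create its PMF and load its persistent classes; no DataNucleus API is used or implied here.

```java
import java.util.function.Supplier;

public class InitOnce<T> {
    private final Supplier<T> initialiser;
    private volatile T instance;

    /** The initialiser is assumed to return a non-null, fully initialised object. */
    public InitOnce(Supplier<T> initialiser) {
        this.initialiser = initialiser;
    }

    /** Returns the shared instance; concurrent callers block until the single initialisation has finished. */
    public T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = initialiser.get();
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // Placeholder: a real initialiser would build the PMF and trigger class loading.
        InitOnce<String> pmf = new InitOnce<>(() -> "initialised-pmf");
        System.out.println(pmf.get());
    }
}
```

With a holder like this, no query can run against a half-initialised factory, because every caller either performs the initialisation or waits for it to complete.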
[jira] [Commented] (HIVE-3632) datanucleus breaks when using JDK7
[ https://issues.apache.org/jira/browse/HIVE-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498101#comment-13498101 ]

Andy Jefferson commented on HIVE-3632:

You refer to a DataNucleus JIRA issue that was marked fixed in June 2012. How does that imply that "they don't plan to actively support JDK7+ bytecode any time soon"? DataNucleus 3.1.x supports JDK1.7+ and has for some time. There are 0 reported problems using DataNucleus v3.1 with JDK1.7. You don't define "not successful"

datanucleus breaks when using JDK7

Key: HIVE-3632
URL: https://issues.apache.org/jira/browse/HIVE-3632
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.10.0, 0.9.1
Reporter: Chris Drome
Priority: Critical

I found serious problems with datanucleus code when using JDK7, resulting in some sort of exception being thrown when datanucleus code is entered.

I tried source=1.7, target=1.7 with JDK7 as well as source=1.6, target=1.6 with JDK7, and there was no visible difference: the same unit tests failed.

I tried upgrading datanucleus to 3.0.1, as per HIVE-2084.patch, which did not fix the failing tests.

I tried upgrading datanucleus to 3.1-release, as per the advice of http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-86, which suggests that using ASMv4 will allow datanucleus to work with JDK7. I was not successful with this either.

I tried upgrading datanucleus to 3.1.2. I was not successful with this either.

Regarding datanucleus support for JDK7+, there is the following JIRA http://www.datanucleus.org/servlet/jira/browse/NUCENHANCER-81 which suggests that they don't plan to actively support JDK7+ bytecode any time soon.

I also tested the JVM parameters found on http://veerasundar.com/blog/2012/01/java-lang-verifyerror-expecting-a-stackmap-frame-at-branch-target-jdk-7/ with no success either.

This will become a more serious problem as people move to newer JVMs. If there are others who have solved this issue, please post how this was done. Otherwise, it is a topic that I would like to raise for discussion.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463840#comment-13463840 ]

Andy Jefferson commented on HIVE-2084:

allowNulls can either be set in metadata (XML, annotation) for the container field, or it can default to what the particular java.util type does. So, for example, HashMap/HashSet/LinkedHashSet/any-type-of-list default to allowNulls=true, and all others default to false currently. Only you know what type is your map

Upgrade datanucleus from 2.0.3 to 3.0.1

Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2084.D2397.1.patch, HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D5685.1.patch, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching. Fetching the same set of partition objects a second time takes about 1/4 of the time it took the first time, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test case mentioned in HIVE-1853 and HIVE-1862.
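The comment above says that allowNulls can default to what the particular java.util type does. As a small illustration of the JDK side of that statement only, the sketch below probes whether a given Map implementation accepts a null value. It does not touch DataNucleus, says nothing about how DataNucleus derives its own defaults, and the NullValueProbe class and the key name are hypothetical.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

final class NullValueProbe {
    /** Returns true if the given map implementation accepts a null value for a key. */
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("some.key", null); // hypothetical key, used for illustration only
            return map.containsKey("some.key");
        } catch (NullPointerException e) {
            // Implementations such as Hashtable reject null values outright.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap accepts null values:   " + acceptsNullValue(new HashMap<>()));
        System.out.println("Hashtable accepts null values: " + acceptsNullValue(new Hashtable<>()));
    }
}
```

HashMap stores the null value, while Hashtable throws NullPointerException on put, which is why the concrete type of the map field matters when no explicit allowNulls setting is given in metadata.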
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461784#comment-13461784 ]

Andy Jefferson commented on HIVE-2084:

@Carl, http://datanucleus.svn.sourceforge.net/viewvc/datanucleus/test/accessplatform/trunk/test.jdo.datastore/src/test/org/datanucleus/tests/types/SCOMapTests.java?revision=15499&content-type=text%2Fplain look for checkPutNullValues. It has always passed for me since we started supporting null map values (2 or 3 years ago at a guess), most recently with datanucleus-core-3.1.2, datanucleus-api-jdo-3.1.2, datanucleus-rdbms-3.1.1. Since your problem description provides no information of the circumstances (such as log entries, persistence code etc) then cannot speculate as to what you have

Upgrade datanucleus from 2.0.3 to 3.0.1

Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2084.D2397.1.patch, HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching. Fetching the same set of partition objects a second time takes about 1/4 of the time it took the first time, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test case mentioned in HIVE-1853 and HIVE-1862.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449528#comment-13449528 ]

Andy Jefferson commented on HIVE-2084:

Obviously DataNucleus has testcases that persist Maps with null values, and they work (since all tests pass with every release), so this is clearly down to your map and how you're doing things.

Upgrade datanucleus from 2.0.3 to 3.0.1

Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2084.D2397.1.patch, HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching. Fetching the same set of partition objects a second time takes about 1/4 of the time it took the first time, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test case mentioned in HIVE-1853 and HIVE-1862.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414882#comment-13414882 ]

Andy Jefferson commented on HIVE-2084:

No idea what your "table not found" problem is, but if this is at startup, logic would suggest passing all classes to DataNucleus via an auto-start mechanism, or using a persistence.xml. If using persistence.xml then make sure you also have persistence property datanucleus.PersistenceUnitLoadClasses set to true. That way, all classes are known about once started, and the store knows about these classes too (hence knows of their tables).

There are no outstanding issues reported around tables not being found, so if you have something then you need to generate a reproducible testcase and report it.

DataNucleus 2.x versions haven't been supported for some time. DataNucleus 3.0 is now not being updated (except for commercial requests), since DataNucleus 3.1 will be out in 2 weeks.

Upgrade datanucleus from 2.0.3 to 3.0.1

Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
Labels: datanucleus
Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D2397.1.patch, HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching. Fetching the same set of partition objects a second time takes about 1/4 of the time it took the first time, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test case mentioned in HIVE-1853 and HIVE-1862.
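The comment above suggests combining a persistence.xml with the persistence property datanucleus.PersistenceUnitLoadClasses set to true. A minimal sketch of how such properties might be assembled in Java follows. The property datanucleus.PersistenceUnitLoadClasses is taken from the comment and javax.jdo.option.PersistenceUnitName is the standard JDO property for naming a persistence-unit; the MetastoreJdoProps class and the unit name "hive-metastore" are hypothetical, and the sketch deliberately stops short of creating a PMF so that it runs without the JDO API on the classpath.

```java
import java.util.Properties;

final class MetastoreJdoProps {
    /** Builds the startup properties for the named persistence-unit. */
    static Properties build(String persistenceUnitName) {
        Properties props = new Properties();
        // Standard JDO property naming the persistence-unit declared in persistence.xml.
        props.setProperty("javax.jdo.option.PersistenceUnitName", persistenceUnitName);
        // Property named in the comment: load all classes of the unit at startup.
        props.setProperty("datanucleus.PersistenceUnitLoadClasses", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("hive-metastore"); // hypothetical unit name
        props.forEach((key, value) -> System.out.println(key + "=" + value));
        // With the JDO API available, these properties would typically be passed to
        // javax.jdo.JDOHelper.getPersistenceManagerFactory(props).
    }
}
```

Whether this is appropriate for the Hive metastore depends on how Hive itself creates its PersistenceManagerFactory, which this sketch does not attempt to reproduce.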
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 2.2.3
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114647#comment-13114647 ]

Andy Jefferson commented on HIVE-2084:

Regarding upgrading to DN 3.0, follow this link: http://www.datanucleus.org/products/accessplatform_3_0/migration.html

3.0.2 will be out in a week, but there are no API changes involved from 3.0.1 to 3.0.2

Upgrade datanucleus from 2.0.3 to 2.2.3

Key: HIVE-2084
URL: https://issues.apache.org/jira/browse/HIVE-2084
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Ning Zhang
Assignee: Ning Zhang
Labels: datanucleus
Attachments: HIVE-2084.patch

It seems that datanucleus 2.2.3 does a better job of caching. Fetching the same set of partition objects a second time takes about 1/4 of the time it took the first time, whereas with 2.0.3 the second execution took almost as long as the first. We should retest the test case mentioned in HIVE-1853 and HIVE-1862.
[jira] [Commented] (HIVE-2015) Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages
[ https://issues.apache.org/jira/browse/HIVE-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011155#comment-13011155 ]

Andy Jefferson commented on HIVE-2015:

Or just use a recent DataNucleus (3.0Mx) which, by default, omits checks on OSGi dependencies.

PS. If you have such issues with third-party software, I'd expect people to go to that third-party project and register an issue there (to be able to turn something off, etc.), rather than rely on that project's developers just happening across issues like this in a web trawl.

Eliminate bogus Datanucleus.Plugin Bundle ERROR log messages

Key: HIVE-2015
URL: https://issues.apache.org/jira/browse/HIVE-2015
Project: Hive
Issue Type: Bug
Components: Diagnosability, Metastore
Reporter: Carl Steinbach

Every time I start up the Hive CLI with logging enabled I'm treated to the following ERROR log messages courtesy of DataNucleus:

{code}
DEBUG metastore.ObjectStore: datanucleus.plugin.pluginRegistryBundleCheck = LOG
ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.resources but it cannot be resolved.
ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.core.runtime but it cannot be resolved.
ERROR DataNucleus.Plugin: Bundle org.eclipse.jdt.core requires org.eclipse.text but it cannot be resolved.
{code}

Here's where this comes from:
* The bin/hive scripts cause Hive to inherit Hadoop's classpath.
* Hadoop's classpath includes $HADOOP_HOME/lib/core-3.1.1.jar, an Eclipse library.
* core-3.1.1.jar includes a plugin.xml file defining an OSGi plugin.
* At startup, DataNucleus scans the classpath looking for OSGi plugins, and will attempt to initialize any that it finds, including the Eclipse OSGi plugins located in core-3.1.1.jar.
* Initialization of the OSGi plugin in core-3.1.1.jar fails because of unresolved dependencies.
* We see an ERROR message telling us that DataNucleus failed to initialize a plugin that we don't care about in the first place.

I can think of two options for solving this problem:
# Rewrite the scripts in $HIVE_HOME/bin so that they don't inherit ALL of Hadoop's CLASSPATH.
# Replace DataNucleus's NonManagedPluginRegistry with our own implementation that does nothing.