Perhaps HIVE-4914 is relevant?
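For what it's worth, the gaps between the metastore log timestamps quoted below can be worked out with a quick script. This is only a rough sketch: the timestamps are copied from the quoted log lines, while the event labels and the `gaps` helper are my own shorthand, not anything taken from Hive or HCatalog.

```python
from datetime import datetime

# Timestamps copied from the quoted metastore log lines.
# The labels are informal shorthand (an assumption), not log field names.
EVENTS = [
    ("MPartition metadata initialised", "2013-08-27 17:10:46,275"),
    ("listMPartitionsByFilter done", "2013-08-27 17:14:23,661"),
    ("parts after pruning logged", "2013-08-27 17:22:32,410"),
    ("transaction committed", "2013-08-27 17:30:14,631"),
]


def parse(ts):
    """Parse a log4j-style timestamp with comma-separated milliseconds."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def gaps(events):
    """Return (label, seconds since the previous event) for each event after the first."""
    times = [(label, parse(ts)) for label, ts in events]
    return [
        (label, (t - prev).total_seconds())
        for (_, prev), (label, t) in zip(times, times[1:])
    ]


for label, secs in gaps(EVENTS):
    print(f"{label}: {secs:.3f} s")
```

The last gap comes out at about 462 seconds, which matches the "Transaction committed in 462221 ms" line, and each phase takes minutes, whereas Pig reports the read timeout after only a few seconds.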
On Wed, Aug 28, 2013 at 3:11 AM, Michał Czerwiński wrote:

> It is also worth mentioning that I have tried running the 0.4.0-cdh4.3.0-SNAPSHOT
> jars (from
> https://repository.cloudera.com/content/groups/public/org/apache/hcatalog/hcatalog-core/)
> and hit exactly the same issue. That could indicate that the problem is
> related to the hive-metastore component itself and the way it interacts
> with the metastore. Thoughts?
>
>
> On 27 August 2013 18:41, Michał Czerwiński wrote:
>
>> In Pig I am running a query like this:
>>
>> sdp1 = load 'db1.table1' using org.apache.hcatalog.pig.HCatLoader;
>> sdp = FILTER sdp1 BY key1=='value1' AND key2=='value2';
>> ll = LIMIT sdp 100;
>> dump ll;
>>
>> hcatalog then spends a few minutes talking to MySQL, asking for metadata,
>> while after a few seconds Pig fails with:
>> org.apache.thrift.transport.TTransportException:
>> java.net.SocketTimeoutException: Read timed out
>>
>> Number of partitions I have:
>> hive -e 'use db1; show partitions table1' |wc -l
>> Time taken: 1.467 seconds
>> 37748
>>
>> When I run the same query in a different environment where I have only
>> ~1000 partitions, it all works fine.
>>
>> Also, the problem does not exist on cdh3 and hcatalog-0.4.0.
>>
>> In hcatalog's logs I can see the following
>> (note the timestamps; I ran the query at 17:10:45,216):
>>
>> 2013-08-27 17:10:46,275 INFO DataNucleus.MetaData
>> (Log4JLogger.java:info(77)) - Listener found initialisation for persistable
>> class org.apache.hadoop.hive.metastore.model.MPartition
>>
>> 2013-08-27 17:14:23,661 DEBUG metastore.ObjectStore
>> (ObjectStore.java:listMPartitionsByFilter(1832)) - Done retrieving all
>> objects for listMPartitionsByFilter
>>
>> 2013-08-27 17:22:32,410 INFO metastore.ObjectStore
>> (ObjectStore.java:getPartitionsByFilter(1699)) - # parts after pruning =
>> 37748
>>
>> After that, hcatalog continues to:
>> 2013-08-27 17:30:14,631 DEBUG DataNucleus.Transaction
>> (Log4JLogger.java:debug(58)) - Transaction committed in 462221 ms
>>
>> Please note that I have datanucleus set to DEBUG, which slows things
>> down significantly. Without that, it still takes around 7 minutes for
>> hcatalog to settle.
>>
>> Also, the datanucleus settings from hcatalog's logs:
>>
>> datanucleus.autoStartMechanismMode = checked
>> javax.jdo.option.Multithreaded = true
>> datanucleus.identifierFactory = datanucleus
>> datanucleus.transactionIsolation = read
>> datanucleus.validateTables = false
>> javax.jdo.option.ConnectionURL = jdbc:mysql://XXX
>> javax.jdo.option.DetachAllOnCommit = true
>> javax.jdo.option.NonTransactionalRead = true
>> datanucleus.validateConstraints = false
>> javax.jdo.option.ConnectionDriverName = com.mysql.jdbc.Driver
>> javax.jdo.option.ConnectionUserName = hive
>> datanucleus.validateColumns = false
>> datanucleus.cache.level2 = false
>> datanucleus.plugin.pluginRegistryBundleCheck = LOG
>> datanucleus.cache.level2.type = none
>> javax.jdo.PersistenceManagerFactoryClass = org.datanucleus.jdo.JDOPersistenceManagerFactory
>> datanucleus.autoCreateSchema = true
>> datanucleus.storeManagerType = rdbms
>> datanucleus.connectionPoolingType = DBCP
>>
>> This runs on CDH4 4.3.0
>> hcatalog version:
>> 0.5.0+9-1.cdh4.3.0.p0.12~precise-cdh4.3.0
>>
>> Ideas?
>>
