Hi,

Disclaimer: I am not very familiar with WebHCat yet.

From the logs, I see that:
- The first 3 seconds are spent starting a new session, and possibly a driver.
  - This can be reduced if a session already exists and HiveServer2 is running (but I do not know whether WebHCat can use HS2 or reuse sessions).
  - This delay can be avoided entirely if you use any of the 3 solutions suggested in my last mail.
- The next 3 seconds are spent initializing the metastore. This can be reduced by starting a standalone metastore and configuring WebHCat to access it, so that each request does not have to initialize its own ObjectStore, as it does in your log.
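To show where those numbers come from, here is a rough sketch that computes the gaps between a few timestamps copied from the hcat.py log quoted below. The event labels and the function name phase_durations are only mine, for illustration. The log has one-second resolution, so the results are approximate.

```python
from datetime import datetime

# Timestamps copied from the quoted hcat.py log (format: yy/mm/dd HH:MM:SS).
# The labels are informal shorthand for the log messages, not Hive terminology.
EVENTS = [
    ("HiveConf found hive-site.xml", "18/04/21 16:38:13"),
    ("SessionState created its directories", "18/04/21 16:38:16"),
    ("HiveMetaStore opening raw store", "18/04/21 16:38:17"),
    ("ObjectStore initialized", "18/04/21 16:38:20"),
]


def phase_durations(events):
    """Return (from_label, to_label, seconds) for each consecutive pair of events."""
    parsed = [
        (label, datetime.strptime(stamp, "%y/%m/%d %H:%M:%S"))
        for label, stamp in events
    ]
    return [
        (start_label, end_label, (end_time - start_time).total_seconds())
        for (start_label, start_time), (end_label, end_time) in zip(parsed, parsed[1:])
    ]


if __name__ == "__main__":
    for start_label, end_label, seconds in phase_durations(EVENTS):
        print(f"{seconds:3.0f}s  {start_label} -> {end_label}")
```

With these four events, the gaps are roughly 3 seconds up to session creation and roughly 3 seconds between opening the raw store and finishing ObjectStore initialization, which are the two delays described above.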
Hope this helps,
Peter

> On Apr 23, 2018, at 9:27 AM, 侯宗田 <zongtian...@icloud.com> wrote:
>
> Thank you very much for your reply. I am wondering whether I am using
> WebHCat correctly. I don't think it is normal to create all those
> directories and objects just to get a table description, and to take
> 8 seconds doing it. WebHCat should not be this slow. Or is it because I
> forgot to start some server that could respond immediately?
>
>> On Apr 23, 2018, at 3:06 PM, Peter Vary <pv...@cloudera.com> wrote:
>>
>> Hi,
>>
>> Alexander Kolbasov has a project which might interest you (keeping in
>> mind that this is not production ready, more like a proof of concept):
>> https://github.com/akolb1/gometastore/blob/master/hmstool/doc/hmstool.md
>>
>> You can also use the HMS Thrift API directly to access the metastore,
>> or, if you can and want to write Java code, you can use the
>> HiveMetaStoreClient class.
>>
>> I am not sure about the performance gains compared to HCat, but
>> currently there are no faster interfaces for HMS that I know of.
>>
>> Regards,
>> Peter
>>
>>
>> 侯宗田 <zongtian...@icloud.com> wrote (on Mon, Apr 23, 2018, 2:40):
>>
>>> Can anyone give me some suggestions? I have been stuck on this problem
>>> for several days. Need help!!
>>>
>>>> On Apr 22, 2018, at 9:38 PM, 侯宗田 <zongtian...@icloud.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am writing an application which needs the metastore information
>>>> about Hive tables. I have used WebHCat to get the information about
>>>> tables and process it, but a simple request takes over eight seconds
>>>> to respond on localhost. Why is this so slow, and how can I fix it?
>>>> Or is there another way I can extract the metadata in C?
>>>>
>>>> $ time curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/haha?user.name=ctdean'
>>>> {"columns":
>>>> [{"name":"id","type":"int"}],
>>>> "database":"default",
>>>> "table":"haha"}
>>>>
>>>> real 0m8.400s
>>>> user 0m0.053s
>>>> sys 0m0.019s
>>>>
>>>> It seems to run hcat.py, which creates a bunch of things and then
>>>> cleans them up, and this takes a very long time. Does anyone have any
>>>> ideas about it? Any suggestions would be much appreciated!
>>>>
>>>> $ hcat.py -e "use default; desc haha; "
>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>> SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>> 18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file file:/usr/local/hive/conf/hive-site.xml
>>>> 18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>> 18/04/21 16:38:16 INFO session.SessionState: Created local directory: /tmp/hive/java/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668/_tmp_space.db
>>>> 18/04/21 16:38:16 INFO ql.Driver: Compiling command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): use default
>>>> 18/04/21 16:38:17 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>> 18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>>>> 18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added admin role in metastore
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added public role in metastore
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_all_functions
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_all_functions
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>> 18/04/21 16:38:20 INFO ql.Driver: Semantic Analysis Completed
>>>> 18/04/21 16:38:20 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed compiling command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62); Time taken: 3.936 seconds
>>>> 18/04/21 16:38:20 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
>>>> 18/04/21 16:38:20 INFO ql.Driver: Executing command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): use default
>>>> 18/04/21 16:38:20 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=05096382-f9b6-4dae-aee2-dfa6750c0668, clientType=HIVECLI]
>>>> 18/04/21 16:38:20 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
>>>> 18/04/21 16:38:20 INFO hive.metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Cleaning up thread local RawStore...
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=Cleaning up thread local RawStore...
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
>>>> 18/04/21 16:38:20 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed executing command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62); Time taken: 0.202 seconds
>>>> OK