> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > > Lines 4273 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007723#file2007723line4273> > > > > Better name : hive.runtime.stats.batch.size > > Also lets use the default value of 100_000 > > This can be in MetastoreConf since it affects MS client and server.
renamed...I've also noticed it :) I wouldn't want to set it a default since it may only be usefull if loading of the runtime stats becomes problematic. > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > > Line 4276 (original) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007723#file2007723line4280> > > > > I see that you have moved it to Metastoreconf. But I think it belongs > > to hiveconf. Since this cache is in HS2 process not in metastore process. okay; I may add it, but I don't see any case in which someone wants to set the 2 parameters to different values - so I wanted to make it easier for the user to have just 1 setting; > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > > Lines 68 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line68> > > > > Better name: StatsLoader. > > Also instead runnable inside thread, using Future will be easier to > > read as well as efficient. I've renamed it to RuntimeStatsLoader; about using future; I'm not sure - but anyway this update will happen only once during the hs2 lifetime; so performance might not be that crucial > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > > Lines 75 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line75> > > > > Since our cache is of limited size. We should also pass in limit. No > > point in retrieving more than 100_000 items if we are not gonna load them. this batch size is here if there are problems with loading the data; the default -1 means: no limit, load all with a single request I don't think that will cause any problems; but in case it does ; the user may set it to a smaller value to initiate multi request loading > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > > Lines 78 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line78> > > > > This create time is not used after its loaded in cache. Whats the > > reason for doing this max() ? > > This line can be removed. createtime serves as the offset for the next batched request; or it would get back the same resultset moved createTime to be a filed of the loader; I've added it as a field in the main class because I've thinked about that in the future if more than 1 hs2 is working - it might be needed to load new stat infos thru the metastore...but that might not be needed - at least right now... > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > > Lines 108 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line108> > > > > Better name: StatsPersister. this is a non-static internal class; so it's full name is: MetastoreStatsConnector$Submitter because I've used the persist keyword in naming the String2Object converter class; I've renamed this to RuntimeStatsSubmitter > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > > Lines 130 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line130> > > > > Should size be part of thrift api? Server can do size() and use that on > > other side. Seems redundant info in api. I'm not exposing the transferred object at the thrift level; this information is opaque at thrift and also in the metastore. so its weight should be communicated by the writer. > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > > Lines 3253 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007765#file2007765line3253> > > > > Caller passes in batch size, but we use that as maxCount. We need both. > > We want to fetch only maxCount entries but in batches of batch_size. I don't see why would the server need to know that it servers a batched client or not - I think offset + limit should be enough... > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java > > Lines 11615 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007767#file2007767line11617> > > > > Retrieving all stats entry can be expensive. We shall retrieve only > > entries which are older than retention threshold. yes, but in that case it may miss the entry limit.. > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java > > Lines 11629 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007767#file2007767line11631> > > > > Simple time based deletion might be better. Since that will make sure > > we never delete too many entries. If we check for 2 diff conditions and > > delete when of that is met, its possible that we too many entries too > > quickly. okay; say we set the timeframe to 1 year; and the max count to 10; and every hour we add a new query which adds 10 entries...that would add a lot of entries which would be never even used; because the cache size is so small... I think entry limit is needed; currently this cleanup routine might seem a bit of resouce wasting - but its the most straight forward implementation of the entry limited one - this can be improved later by storing the entry aggregate as a column > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java > > Lines 586 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007771#file2007771line586> > > > > This config belongs in Hiveconf. I'm not sure ... I don't think it would be good to store elements which will be unused by hs2 > On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote: > > standalone-metastore/src/main/thrift/hive_metastore.thrift > > Lines 1528 (patched) > > <https://reviews.apache.org/r/66663/diff/3/?file=2007784#file2007784line1528> > > > > its better to also have weight in request, although server will ignore > > it for now. we may later try to use that. e.g., we may want to retrieve > > only entries higher than threshold weight. I don't see in what scenario would that be usefull...I think adding it when it's really needed will be better. - Zoltan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66663/#review201652 ----------------------------------------------------------- On April 20, 2018, 1:32 p.m., Zoltan Haindrich wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66663/ > ----------------------------------------------------------- > > (Updated April 20, 2018, 1:32 p.m.) > > > Review request for hive and Ashutosh Chauhan. > > > Bugs: HIVE-19171 > https://issues.apache.org/jira/browse/HIVE-19171 > > > Repository: hive-git > > > Description > ------- > > * > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 536c7b427f > > itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java > 801de7aca2 > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java > 88022be9b9 > metastore/scripts/upgrade/derby/056-HIVE-19171.derby.sql PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpSignature.java > e87bbceb7a > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpTreeSignature.java > c3dc848a32 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpTreeSignatureFactory.java > 3df5ee946e > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/RuntimeStatsMap.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/RuntimeStatsPersister.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/SignatureUtils.java > 4f3e3384a9 > ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java e15a49f838 > ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java > a61a47e390 > ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java e7ca7f617c > ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java 54b705db6e > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/CachingStatsSource.java > c51527621f > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/EmptyStatsSource.java > 19df13a843 > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java > a37280407d > > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/SimpleRuntimeStatsSource.java > 3d6c257026 > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/StatsSources.java > a4e33c3804 > ql/src/java/org/apache/hadoop/hive/ql/reexec/ReOptimizePlugin.java > 409cc7312c > ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStats.java 52e18a8030 > > ql/src/test/org/apache/hadoop/hive/ql/optimizer/signature/TestRuntimeStatsPersistence.java > PRE-CREATION > ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java > 81269702de > ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestReOptimization.java > b7263005ed > service/src/java/org/apache/hive/service/server/HiveServer2.java 16423578d5 > standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h > 802d8e3fb2 > standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp > dfa13a0614 > > standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp > c0a39f80e0 > standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h > 2c95007daa > standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp > 99024279c5 > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetRuntimeStatsRequest.java > PRE-CREATION > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/RuntimeStat.java > PRE-CREATION > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java > a354f27cad > > standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php > 9c949429c5 > standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php c4969d567f > > standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote > 079c7fc322 > > standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py > d241414bc3 > standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py > 9bf9843314 > standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb > 3dbe4d8068 > standalone-metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb > 58ebd29523 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java > 9c88cf9e06 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java > feae991bb3 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java > 27f8775a10 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java > a7acdcbc23 > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java > f6c46ee7bd > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RuntimeStatsCleanerTask.java > PRE-CREATION > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java > ebdcbc237e > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java > 88b07cbdbf > > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MRuntimeStat.java > PRE-CREATION > standalone-metastore/src/main/resources/package.jdo 9ddf598d36 > standalone-metastore/src/main/sql/derby/hive-schema-3.0.0.derby.sql > 240a282f17 > standalone-metastore/src/main/sql/derby/upgrade-2.3.0-to-3.0.0.derby.sql > 334ddfadd9 > standalone-metastore/src/main/sql/mssql/hive-schema-3.0.0.mssql.sql > e15fc2e31c > standalone-metastore/src/main/sql/mssql/upgrade-2.3.0-to-3.0.0.mssql.sql > 53bdcc4f59 > standalone-metastore/src/main/sql/mysql/hive-schema-3.0.0.mysql.sql > f9efd56833 > standalone-metastore/src/main/sql/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql > 1c540379d7 > standalone-metastore/src/main/sql/oracle/hive-schema-3.0.0.oracle.sql > a87e4464d5 > standalone-metastore/src/main/sql/oracle/upgrade-2.3.0-to-3.0.0.oracle.sql > 13b0d96bb4 > standalone-metastore/src/main/sql/postgres/hive-schema-3.0.0.postgres.sql > 07dd83d747 > > standalone-metastore/src/main/sql/postgres/upgrade-2.3.0-to-3.0.0.postgres.sql > ddb5e5009b > standalone-metastore/src/main/thrift/hive_metastore.thrift 5bba3294fa > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java > 304f567533 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java > 85c67270d1 > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java > cb51763ccd > > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestRuntimeStats.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/66663/diff/3/ > > > Testing > ------- > > > Thanks, > > Zoltan Haindrich > >