> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 4273 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007723#file2007723line4273>
> >
> >     Better name : hive.runtime.stats.batch.size
> >     Also lets use the default value of 100_000
> >     This can be in MetastoreConf since it affects MS client and server.

renamed...I've also noticed it :)
I wouldn't want to set it a default since it may only be usefull if loading of 
the runtime stats becomes problematic.


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Line 4276 (original)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007723#file2007723line4280>
> >
> >     I see that you have moved it to Metastoreconf. But I think it belongs 
> > to hiveconf. Since this cache is in HS2 process not in metastore process.

okay; I may add it, but I don't see any case in which someone wants to set the 
2 parameters to different values - so I wanted to make it easier for the user 
to have just 1 setting;


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
> > Lines 68 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line68>
> >
> >     Better name: StatsLoader.
> >     Also instead runnable inside thread, using Future will be easier to 
> > read as well as efficient.

I've renamed it to RuntimeStatsLoader; about using future; I'm not sure - but 
anyway this update will happen only once during the hs2 lifetime; so 
performance might not be that crucial


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
> > Lines 75 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line75>
> >
> >     Since our cache is of limited size. We should also pass in limit. No 
> > point in retrieving more than 100_000 items if we are not gonna load them.

this batch size is here if there are problems with loading the data; the 
default -1 means: no limit, load all with a single request
I don't think that will cause any problems; but in case it does ; the user may 
set it to a smaller value to initiate multi request loading


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
> > Lines 78 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line78>
> >
> >     This create time is not used after its loaded in cache. Whats the 
> > reason for doing this max() ?
> >     This line can be removed.

createtime serves as the offset for the next batched request; or it would get 
back the same resultset
moved createTime to be a filed of the loader;
I've added it as a field in the main class because I've thinked about that in 
the future if more than 1 hs2 is working - it might be needed to load new stat 
infos thru the metastore...but that might not be needed - at least right now...


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
> > Lines 108 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line108>
> >
> >     Better name: StatsPersister.

this is a non-static internal class; so it's full name is:
MetastoreStatsConnector$Submitter
because I've used the persist keyword in naming the String2Object converter 
class; I've renamed this to  RuntimeStatsSubmitter


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
> > Lines 130 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007739#file2007739line130>
> >
> >     Should size be part of thrift api? Server can do size() and use that on 
> > other side. Seems redundant info in api.

I'm not exposing the transferred object at the thrift level; this information 
is opaque at thrift and also in the metastore.
so its weight should be communicated by the writer.


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> > Lines 3253 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007765#file2007765line3253>
> >
> >     Caller passes in batch size, but we use that as maxCount. We need both. 
> > We want to fetch only maxCount entries but in batches of batch_size.

I don't see why would the server need to know that it servers a batched client 
or not - I think offset + limit should be enough...


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 11615 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007767#file2007767line11617>
> >
> >     Retrieving all stats entry can be expensive. We shall retrieve only 
> > entries which are older than retention threshold.

yes, but in that case it may miss the entry limit..


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> > Lines 11629 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007767#file2007767line11631>
> >
> >     Simple time based deletion might be better. Since that will make sure 
> > we never delete too many entries. If we check for 2 diff conditions and 
> > delete when of that is met, its possible that we too many entries too 
> > quickly.

okay; say we set the timeframe to 1 year; and the max count to 10; and every 
hour we add a new query which adds 10 entries...that would add a lot of entries 
which would be never even used; because the cache size is so small...

I think entry limit is needed; currently this cleanup routine might seem a bit 
of resouce wasting - but its the most straight forward implementation of the 
entry limited one - this can be improved later by storing the entry aggregate 
as a column


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
> > Lines 586 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007771#file2007771line586>
> >
> >     This config belongs in Hiveconf.

I'm not sure ... I don't think it would be good to store elements which will be 
unused by hs2


> On April 21, 2018, 5:59 a.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/thrift/hive_metastore.thrift
> > Lines 1528 (patched)
> > <https://reviews.apache.org/r/66663/diff/3/?file=2007784#file2007784line1528>
> >
> >     its better to also have weight in request, although server will ignore 
> > it for now. we may later try to use that. e.g., we may want to retrieve 
> > only entries higher than threshold weight.

I don't see in what scenario would that be usefull...I think adding it when 
it's really needed will be better.


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66663/#review201652
-----------------------------------------------------------


On April 20, 2018, 1:32 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66663/
> -----------------------------------------------------------
> 
> (Updated April 20, 2018, 1:32 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-19171
>     https://issues.apache.org/jira/browse/HIVE-19171
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> *
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 536c7b427f 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  801de7aca2 
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 
> 88022be9b9 
>   metastore/scripts/upgrade/derby/056-HIVE-19171.derby.sql PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpSignature.java 
> e87bbceb7a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpTreeSignature.java
>  c3dc848a32 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/OpTreeSignatureFactory.java
>  3df5ee946e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/RuntimeStatsMap.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/RuntimeStatsPersister.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/signature/SignatureUtils.java 
> 4f3e3384a9 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java e15a49f838 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 
> a61a47e390 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java e7ca7f617c 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java 54b705db6e 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/CachingStatsSource.java 
> c51527621f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/EmptyStatsSource.java 
> 19df13a843 
>   
> ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/MetastoreStatsConnector.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java 
> a37280407d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/SimpleRuntimeStatsSource.java
>  3d6c257026 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/StatsSources.java 
> a4e33c3804 
>   ql/src/java/org/apache/hadoop/hive/ql/reexec/ReOptimizePlugin.java 
> 409cc7312c 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/OperatorStats.java 52e18a8030 
>   
> ql/src/test/org/apache/hadoop/hive/ql/optimizer/signature/TestRuntimeStatsPersistence.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestCounterMapping.java 
> 81269702de 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/mapping/TestReOptimization.java 
> b7263005ed 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 16423578d5 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 
> 802d8e3fb2 
>   standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 
> dfa13a0614 
>   
> standalone-metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
>  c0a39f80e0 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h 
> 2c95007daa 
>   standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp 
> 99024279c5 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/GetRuntimeStatsRequest.java
>  PRE-CREATION 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/RuntimeStat.java
>  PRE-CREATION 
>   
> standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
>  a354f27cad 
>   
> standalone-metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php 
> 9c949429c5 
>   standalone-metastore/src/gen/thrift/gen-php/metastore/Types.php c4969d567f 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
>  079c7fc322 
>   
> standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
>  d241414bc3 
>   standalone-metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py 
> 9bf9843314 
>   standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb 
> 3dbe4d8068 
>   standalone-metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 
> 58ebd29523 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  9c88cf9e06 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  feae991bb3 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  27f8775a10 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  a7acdcbc23 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
>  f6c46ee7bd 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RuntimeStatsCleanerTask.java
>  PRE-CREATION 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  ebdcbc237e 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  88b07cbdbf 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/model/MRuntimeStat.java
>  PRE-CREATION 
>   standalone-metastore/src/main/resources/package.jdo 9ddf598d36 
>   standalone-metastore/src/main/sql/derby/hive-schema-3.0.0.derby.sql 
> 240a282f17 
>   standalone-metastore/src/main/sql/derby/upgrade-2.3.0-to-3.0.0.derby.sql 
> 334ddfadd9 
>   standalone-metastore/src/main/sql/mssql/hive-schema-3.0.0.mssql.sql 
> e15fc2e31c 
>   standalone-metastore/src/main/sql/mssql/upgrade-2.3.0-to-3.0.0.mssql.sql 
> 53bdcc4f59 
>   standalone-metastore/src/main/sql/mysql/hive-schema-3.0.0.mysql.sql 
> f9efd56833 
>   standalone-metastore/src/main/sql/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql 
> 1c540379d7 
>   standalone-metastore/src/main/sql/oracle/hive-schema-3.0.0.oracle.sql 
> a87e4464d5 
>   standalone-metastore/src/main/sql/oracle/upgrade-2.3.0-to-3.0.0.oracle.sql 
> 13b0d96bb4 
>   standalone-metastore/src/main/sql/postgres/hive-schema-3.0.0.postgres.sql 
> 07dd83d747 
>   
> standalone-metastore/src/main/sql/postgres/upgrade-2.3.0-to-3.0.0.postgres.sql
>  ddb5e5009b 
>   standalone-metastore/src/main/thrift/hive_metastore.thrift 5bba3294fa 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  304f567533 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  85c67270d1 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java
>  cb51763ccd 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestRuntimeStats.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66663/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>

Reply via email to