Github user karanmehta93 commented on the issue:

    https://github.com/apache/phoenix/pull/419
  
    _First of all, apologies for the long PR._ (Most of it is refactoring, but 
it is still hard to review.)
    
    **Here's the high-level idea**
    1. Seven classes inherited from `StatsCollectorIT`, each testing stats 
collection for a different combination of table properties, so there was a lot 
of redundancy in the test suite. Also, all the tests were running with 
namespaces enabled all the time, because that setting is applied once per JVM 
and cannot be reverted without restarting the server. We were controlling the 
parameterized property on each new `PhoenixConnection`, which is disallowed 
according to the documentation.
    The code is now refactored to have only 3 classes:
    `NamespaceMappedStatsCollectorIT` --> namespaces enabled, collect stats via 
snapshots as well as SQL statements
    `NonTxStatsCollectorIT` --> mutable/immutable tables, column-encoded/non 
column-encoded
    `TxStatsCollectorIT` --> mutable/immutable tables, column-encoded/non 
column-encoded, TEPHRA/OMID
    
    2. `StatsCollectorIT` is renamed to `BaseStatsCollectorIT`, and its tests 
have been improved to cover additional scenarios. More tests are on the way.
    
    3. Server side changes:
    `DefaultStatisticsCollector` is now an abstract class, and 
`RegionServerStatisticsCollector` and `MapperStatisticsCollector` are its 
subclasses. The former is triggered for SQL statements and the latter is used 
for the MapReduce job added in this Jira. Most of the common code is moved to 
the base class.
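
    The split described in point 3 can be pictured roughly as follows. This is 
only a minimal sketch and not the code in the PR: the `*Sketch` class names, 
the `collect`/`trigger` methods, and the list of row keys are hypothetical 
stand-ins chosen to show shared logic in an abstract base with two thin 
subclasses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract base; not the real Phoenix class.
abstract class DefaultStatisticsCollectorSketch {
    private final List<String> seenRowKeys = new ArrayList<>();

    // Common code lives in the base class and is shared by both subclasses.
    final void collect(String rowKey) {
        seenRowKeys.add(rowKey);
    }

    final List<String> seenRowKeys() {
        return seenRowKeys;
    }

    // Each subclass only describes what triggers it.
    abstract String trigger();
}

// Hypothetical: the variant driven by SQL statements.
final class RegionServerCollectorSketch extends DefaultStatisticsCollectorSketch {
    @Override
    String trigger() {
        return "SQL statement";
    }
}

// Hypothetical: the variant driven by the MapReduce job.
final class MapperCollectorSketch extends DefaultStatisticsCollectorSketch {
    @Override
    String trigger() {
        return "MapReduce job";
    }
}

class CollectorHierarchyDemo {
    public static void main(String[] args) {
        DefaultStatisticsCollectorSketch collector = new MapperCollectorSketch();
        collector.collect("row-1");
        System.out.println(collector.trigger() + " saw " + collector.seenRowKeys());
    }
}
```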
    
    4. The snapshot scanner has been improved to collect statistics if the scan 
is configured accordingly. A `NoOpStatisticsCollector` is instantiated instead 
if it's a regular Phoenix MR job on snapshots.
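
    As a rough illustration of point 4, the choice between a real collector and 
a no-op one could look like the sketch below. Everything here is assumed for 
illustration: a plain `Map` stands in for the job configuration, the key 
`sketch.snapshot.collect.stats` is made up, and the `*Sketch` types are not the 
classes in the PR.

```java
import java.util.Map;

// Hypothetical minimal collector contract used only in this sketch.
interface SnapshotStatsSketch {
    void onRow(byte[] rowKey);
    long rowsSeen();
}

// Does nothing: used when the scan is a regular MR job on a snapshot.
final class NoOpStatsSketch implements SnapshotStatsSketch {
    @Override
    public void onRow(byte[] rowKey) {
        // intentionally empty
    }

    @Override
    public long rowsSeen() {
        return 0;
    }
}

// Counts rows: a toy replacement for actual statistics collection.
final class CountingStatsSketch implements SnapshotStatsSketch {
    private long rows;

    @Override
    public void onRow(byte[] rowKey) {
        rows++;
    }

    @Override
    public long rowsSeen() {
        return rows;
    }
}

final class SnapshotCollectorChooser {
    // Made-up configuration key; the real key is defined by the PR.
    static final String COLLECT_STATS_KEY = "sketch.snapshot.collect.stats";

    // Returns a collecting instance only when the flag is explicitly "true".
    static SnapshotStatsSketch forScan(Map<String, String> conf) {
        boolean collect =
                Boolean.parseBoolean(conf.getOrDefault(COLLECT_STATS_KEY, "false"));
        return collect ? new CountingStatsSketch() : new NoOpStatsSketch();
    }
}
```

    Defaulting to the no-op instance keeps the scanner's row loop free of null 
checks, since it can always call the collector.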
    
    5. The configuration changes are in the `PhoenixConfigurationUtil` class.
    
    Finally, `UpdateStatisticsTool` is the tool to launch the MR job.
    
    This is v1, posted for some initial feedback. Please comment wherever 
anything is unclear.
    
    **Coming up:** 
    More tests covering other scenarios.
    Perf testing for sample tables and the results.
    Better/useful log lines
    General code cleanup for nits

