[ 
https://issues.apache.org/jira/browse/HAWQ-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051825#comment-15051825
 ] 

ASF GitHub Bot commented on HAWQ-191:
-------------------------------------

Github user hornn commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/174#discussion_r47301589
  
    --- Diff: pxf/build.gradle ---
    @@ -218,7 +218,7 @@ project('pxf-service') {
     
     project('pxf-hdfs') {
         dependencies {
    -        compile(project(':pxf-service')) //Yikes, HdfsAnalyzer is directly accessing the bridge
    +        compile(project(':pxf-service'))
    --- End diff --
    
    Can't we replace this dependency with one on pxf-api (as pxf-hbase does)?
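
    A minimal sketch of what that might look like, assuming ':pxf-api' is the
    project path of the API module and that nothing left in pxf-hdfs references
    pxf-service classes once HdfsAnalyzer is removed (neither is confirmed by
    this diff):

    ```groovy
    project('pxf-hdfs') {
        dependencies {
            // Assumption: ':pxf-api' is the API module's project path,
            // mirroring what pxf-hbase is said to depend on.
            // This replaces compile(project(':pxf-service')).
            compile(project(':pxf-api'))
        }
    }
    ```

    If pxf-hdfs still compiles with only this dependency, the plugin no longer
    reaches into the service layer.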


> Remove Analyzer from PXF
> ------------------------
>
>                 Key: HAWQ-191
>                 URL: https://issues.apache.org/jira/browse/HAWQ-191
>             Project: Apache HAWQ
>          Issue Type: Improvement
>          Components: PXF
>            Reporter: Noa Horn
>            Assignee: Shivram Mani
>
> The Analyzer plugin was used to gather statistics when running ANALYZE.
> The API provides one function, getEstimatedStats(), which returns the
> estimated number of tuples, the number of blocks, and the block size.
> There is a single implementation of it: HdfsAnalyzer.
> After the introduction of advanced stats (HAWQ-44), the Analyzer is no longer
> used by HAWQ. Instead, a new function in the Fragmenter API
> (getFragmentsStats) gathers initial statistics for the data source, and
> follow-up queries gather sample tuples from that data source.
> The advantage of the new approach is that Fragmenter.getFragmentsStats
> uses only the Fragmenter to gather stats. The Analyzer, on the other hand,
> instantiated both the Fragmenter and the Accessor of the table in order to
> estimate the number of tuples. In the HdfsAnalyzer implementation, this made
> the pxf-hdfs jar depend on pxf-service (which takes care of instantiating
> the plugins), contrary to the isolation we want to achieve between
> core functionality (pxf-service) and the plugins (pxf-hdfs, pxf-hive,
> pxf-hbase, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
