[ https://issues.apache.org/jira/browse/HAWQ-191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049136#comment-15049136 ]
ASF GitHub Bot commented on HAWQ-191:
-------------------------------------
GitHub user shivzone opened a pull request:
https://github.com/apache/incubator-hawq/pull/174
HAWQ-191. Remove Analyzer plugin from PXF
TBD:
1. Handle calls to the /analyzer API gracefully by returning a warning message
2. Add an automation test
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-hawq HAWQ-191
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-hawq/pull/174.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #174
----
commit 7b5532c8a587cc217fcebf0d29e19197188a1f51
Author: Shivram Mani
Date: 2015-12-09T18:25:24Z
HAWQ-191. Remove Analyzer plugin from PXF
----
> Remove Analyzer from PXF
> ------------------------
>
> Key: HAWQ-191
> URL: https://issues.apache.org/jira/browse/HAWQ-191
> Project: Apache HAWQ
> Issue Type: Improvement
> Components: PXF
> Reporter: Noa Horn
> Assignee: Shivram Mani
>
> The Analyzer plugin was used to gather statistics when running ANALYZE.
> Its API provides a single function, getEstimatedStats(), which returns the
> estimated number of tuples, the number of blocks, and the block size.
> There is only one implementation of it: HdfsAnalyzer.
> Since the introduction of advanced stats (HAWQ-44), the Analyzer is no longer
> used by HAWQ. Instead, a new function in the Fragmenter API
> (getFragmentsStats) gathers the initial statistics for the data source,
> and subsequent queries gather sample tuples from that data source.
> The advantage of the new approach is that Fragmenter.getFragmentsStats
> uses only the Fragmenter to gather stats. The Analyzer, on the other hand,
> instantiated both the Fragmenter and the Accessor of the table in order to
> estimate the number of tuples. In the HdfsAnalyzer implementation, this made
> the pxf-hdfs jar depend on pxf-service (which takes care of instantiating
> the plugins), contrary to the isolation we want to achieve between
> core functionality (pxf-service) and the plugins (pxf-hdfs, pxf-hive,
> pxf-hbase, etc.)
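
The description above says the Fragmenter-based approach needs only the Fragmenter to gather statistics. A minimal Java sketch of that idea follows. The class, interface, and field names here are simplified, hypothetical stand-ins and are not the actual PXF API signatures. The point is only that the statistics come from fragment metadata alone, with no Accessor and no plugin-instantiation code involved.

```java
import java.util.Arrays;
import java.util.List;

final class FragmentStatsSketch {

    /** Hypothetical, simplified holder for fragment-level statistics. */
    static final class FragmentsStats {
        final int fragmentCount;
        final long firstFragmentBytes;
        final long totalBytes;

        FragmentsStats(int fragmentCount, long firstFragmentBytes, long totalBytes) {
            this.fragmentCount = fragmentCount;
            this.firstFragmentBytes = firstFragmentBytes;
            this.totalBytes = totalBytes;
        }
    }

    /** Hypothetical, simplified fragmenter: it only knows fragment sizes. */
    interface Fragmenter {
        List<Long> fragmentSizesInBytes();
    }

    /**
     * Derives statistics from fragment metadata only. Nothing here reads
     * tuples, so no Accessor (and no service-side plugin factory) is needed.
     */
    static FragmentsStats getFragmentsStats(Fragmenter fragmenter) {
        List<Long> sizes = fragmenter.fragmentSizesInBytes();
        long total = 0L;
        for (long size : sizes) {
            total += size;
        }
        long first = sizes.isEmpty() ? 0L : sizes.get(0);
        return new FragmentsStats(sizes.size(), first, total);
    }

    public static void main(String[] args) {
        // Assumed example data: three fragments with made-up sizes.
        Fragmenter fragmenter = () -> Arrays.asList(128L, 128L, 64L);
        FragmentsStats stats = getFragmentsStats(fragmenter);
        System.out.println(stats.fragmentCount + " fragments, "
                + stats.totalBytes + " bytes in total");
    }
}
```

In this sketch the tuple count is deliberately absent: per the description, the sample tuples are gathered by later queries rather than by the statistics call itself.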