[ https://issues.apache.org/jira/browse/MINDEXER-185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tamas Cservenak reassigned MINDEXER-185:
----------------------------------------
Assignee: Tamas Cservenak
> Document filter doesn't seem to do anything
> -------------------------------------------
>
> Key: MINDEXER-185
> URL: https://issues.apache.org/jira/browse/MINDEXER-185
> Project: Maven Indexer
> Issue Type: Bug
> Affects Versions: 7.0.1
> Reporter: Michael Bien
> Assignee: Tamas Cservenak
> Priority: Major
> Fix For: 7.0.2
>
>
> Hello devs!
>
> I tried to filter the index during extraction using a DocumentFilter, and it
> didn't appear to do anything.
> As a test, I simply set {{indexUpdateRequest.setDocumentFilter(doc -> false);}}
> before calling {{DefaultIndexUpdater.fetchAndUpdateIndex}}, and the extracted
> index had the same size (5.6 GB) as without the filter.
>
> The filter is actually called, and it adds a few minutes to the
> extraction time:
> https://github.com/apache/maven-indexer/blob/1cd122b1487150613005c8f9aced9aec20fded3e/indexer-core/src/main/java/org/apache/maven/index/updater/DefaultIndexUpdater.java#L238-L241
>
> I am not sure why the implementation is filtering the index *after*
> extraction. Wouldn't it be easier and also more efficient to do it in
> IndexDataReader?
> e.g.
> https://github.com/apache/maven-indexer/blob/1cd122b1487150613005c8f9aced9aec20fded3e/indexer-core/src/main/java/org/apache/maven/index/updater/IndexDataReader.java#L269
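
For illustration only, the suggestion above (apply the filter while records are read, instead of pruning the index after extraction) could look roughly like the sketch below. It does not use the Maven Indexer API: the class `FilterWhileReading`, the method `readFiltered`, and the map-based record layout are hypothetical stand-ins, and `java.util.function.Predicate` takes the place of the DocumentFilter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch, not Maven Indexer code: records are plain maps and
// the filter is a Predicate standing in for a document filter.
public class FilterWhileReading {

    // Reads records one at a time and keeps only those the filter accepts,
    // so rejected records are never added to the resulting index.
    public static List<Map<String, String>> readFiltered(
            Iterator<Map<String, String>> source,
            Predicate<Map<String, String>> filter) {
        List<Map<String, String>> index = new ArrayList<>();
        while (source.hasNext()) {
            Map<String, String> record = source.next();
            // A null filter means "accept everything".
            if (filter == null || filter.test(record)) {
                index.add(record);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = List.of(
                Map.of("groupId", "org.apache.maven", "artifactId", "maven-core"),
                Map.of("groupId", "com.example", "artifactId", "demo"));

        // The reject-all filter from the report: nothing should be stored.
        System.out.println(readFiltered(records.iterator(), doc -> false).size());

        // A selective filter: only the org.apache record is stored.
        System.out.println(readFiltered(records.iterator(),
                doc -> doc.get("groupId").startsWith("org.apache")).size());
    }
}
```

With a reader shaped like this, a reject-all filter would yield an empty index, which is the behaviour the test in the report expected to see.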