jira-importer commented on issue #563:
URL: https://github.com/apache/maven-indexer/issues/563#issuecomment-2965151620

   **[Michael Bien](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=mbien)** commented
   
   I was reading up on Lucene yesterday and had entirely forgotten that a key 
property of its data structure is that segments are immutable! This means 
deleting a doc won't delete anything: all it does is set a flag, and the data 
is only physically removed later, when segments are merged.
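
   The flag-then-merge behaviour can be modelled in a few lines. This is a toy 
sketch, not Lucene's implementation: `ToySegment` and all of its methods are 
hypothetical names chosen for illustration (the `maxDoc`/`numDocs` naming 
merely mirrors the counters Lucene's `IndexReader` exposes).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

/** Toy model of an immutable segment; NOT Lucene's actual implementation. */
final class ToySegment {
    // Written once when the segment is created, never modified afterwards.
    private final List<String> docs;
    // The only mutable part: one "deleted" flag per document.
    private final BitSet deleted = new BitSet();

    ToySegment(List<String> docs) {
        this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
    }

    /** Marks a document as deleted; the stored data stays where it is. */
    void delete(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IndexOutOfBoundsException("no such doc: " + docId);
        }
        deleted.set(docId);
    }

    /** Number of documents physically stored, including deleted ones. */
    int maxDoc() {
        return docs.size();
    }

    /** Number of documents that are still live. */
    int numDocs() {
        return docs.size() - deleted.cardinality();
    }

    /** Merging writes a new segment that contains only the live documents. */
    ToySegment merge() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        return new ToySegment(live);
    }
}
```

   In this model, calling `delete` lowers `numDocs()` but leaves `maxDoc()` 
(and therefore the stored size) unchanged; only the segment returned by 
`merge()` is actually smaller.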
   
   Feel free to close this issue. However, I believe it is worth investigating 
whether the filter could run in the reader itself while it is building the 
index; that should hopefully have an actual effect on the resulting index size.
   
    
   
   Edit: maybe the filter could even be used to throw away data within the 
docs, which I believe would be possible at that stage. E.g., I suspect that a 
major contributor to the index size is that some artifacts put their 
documentation into the description field (but this is only a guess at this 
point, not verified).
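
   As a rough sketch of what filtering at build time could look like: records 
are rejected or trimmed before they are ever written, so nothing has to be 
deleted afterwards. Everything here is an assumption made for illustration: 
`BuildTimeFilter`, the map-based record, and the `description` key are 
hypothetical and do not correspond to maven-indexer's actual reader API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical build-time filter; none of these names come from maven-indexer. */
final class BuildTimeFilter {
    private final Predicate<Map<String, String>> keep;
    private final int maxDescriptionLength;

    BuildTimeFilter(Predicate<Map<String, String>> keep, int maxDescriptionLength) {
        if (maxDescriptionLength < 0) {
            throw new IllegalArgumentException("maxDescriptionLength must be >= 0");
        }
        this.keep = keep;
        this.maxDescriptionLength = maxDescriptionLength;
    }

    /** Returns the records that should be written, with long descriptions cut. */
    List<Map<String, String>> apply(List<Map<String, String>> records) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> record : records) {
            if (!keep.test(record)) {
                continue; // rejected records never reach the index
            }
            // Copy so the caller's record is left untouched.
            Map<String, String> copy = new LinkedHashMap<>(record);
            String description = copy.get("description");
            if (description != null && description.length() > maxDescriptionLength) {
                copy.put("description", description.substring(0, maxDescriptionLength));
            }
            out.add(copy);
        }
        return out;
    }
}
```

   Whether trimming the description field would shrink the index noticeably 
is exactly the unverified guess above; the sketch only shows where such a 
step would sit.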
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

