[ 
https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920805#action_12920805
 ] 

Jack Krupansky commented on CONNECTORS-118:
-------------------------------------------

At least for file system crawls we can depend on modification date to decide 
whether to re-crawl an archive file, can't we?

I wouldn't rate crawling of archive files over the web efficiently too high a 
priority.


> Crawled archive files should be expanded into their constituent files
> ---------------------------------------------------------------------
>
>                 Key: CONNECTORS-118
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-118
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework crawler agent
>            Reporter: Jack Krupansky
>
> Archive files such as zip, mbox, tar, etc. should be expanded into their 
> constituent files during crawling of repositories so that any output 
> connector would output the flattened archive.
> This could be an option, defaulted to ON, since someone may want to implement 
> a "copy" connector that maintains crawled files as-is.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to