[ 
https://issues.apache.org/jira/browse/CONNECTORS-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481863#comment-17481863
 ] 

DK commented on CONNECTORS-1695:
--------------------------------

In that case, what is the significance of 'interestingMimeType'? As per the 
defect related to application/xml, that MIME type was missing from that 
variable and was added.

My understanding was that the web connector would treat a sitemap as a special 
case, pull the individual URLs from it, and submit the HTML of those pages to 
Solr for indexing.

If that is not the case, can we say that ManifoldCF does not support sitemap 
indexing? What would it take to add that support? I am willing to help.
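
To make the expected behavior concrete, here is a minimal sketch of the 
sitemap handling described above. It is an illustration only, written in 
Python rather than taken from ManifoldCF: the names local_name and 
classify_sitemap are hypothetical, and it assumes a sitemap is recognized by 
its root element (sitemapindex or urlset), with or without the sitemaps.org 
namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix, e.g. '{ns}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def classify_sitemap(xml_text):
    """Return (kind, locs) for a sitemap document.

    kind is 'sitemapindex', 'urlset', or None when the root element is
    neither; locs is the list of non-empty <loc> values found.
    Hypothetical helper, not part of ManifoldCF.
    """
    root = ET.fromstring(xml_text)
    kind = local_name(root.tag)
    if kind not in ("sitemapindex", "urlset"):
        return None, []
    locs = []
    for element in root.iter():
        if local_name(element.tag) == "loc":
            value = (element.text or "").strip()
            if value:
                locs.append(value)
    return kind, locs
```

Under this assumption, a crawler would queue the entries of a sitemapindex 
as further sitemaps to fetch, and the entries of a urlset as documents to 
fetch and index, instead of indexing the XML itself.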

> Sitemap xml not detected in version 2.17 webconnector
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1695
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1695
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 2.17
>            Reporter: DK
>            Priority: Major
>
> I am trying to index a sitemap XML, and the web connector indexes the whole 
> XML into Solr. Please fix this in version 2.17.
> If any special configuration is needed, please describe it here and add it 
> to the documentation to make it clear.
>  
> Sitemap.xml:
> <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
> <sitemap>
> <loc>https://<url>/sitemap_1.xml</loc>
> <lastmod>2022-01-21T16:04:45Z</lastmod>
> </sitemap>
> </sitemapindex>
>  
> sitemap_1.xml:
> <urlset>
> <url>
> <loc>https://<docurl></loc>
> <lastmod>2018-10-31T11:25:27Z</lastmod>
> </url>
> </urlset>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
