Karl Wright created CONNECTORS-917:
--------------------------------------

             Summary: SharePoint connector would benefit from site discovery
                 Key: CONNECTORS-917
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-917
             Project: ManifoldCF
          Issue Type: Improvement
          Components: SharePoint connector
    Affects Versions: ManifoldCF 1.7
            Reporter: Karl Wright
            Assignee: Karl Wright
             Fix For: ManifoldCF 1.7


The current SharePoint connector can crawl only a single SharePoint site.  But 
SharePoint can support multiple sites, and in some cases there are hundreds of 
them.  Setting up a connection and jobs for each one would be a difficult task.

The SharePoint admin site allows you to discover the sites that exist.  Using 
this feature as part of the crawl would allow for a much more automated way of 
handling large SharePoint installations.

Some notes:

   - It is not yet clear how "one site" vs. "many sites" should coexist in one
     connector
     - The form of the document identifier must change
     - Each document identifier must include the site path first
     - Since the subsite path can be just "/", the format also needs to be
       resilient against that
     - Something like: <site_path>//<current_subsite_doc_list_item_etc_path>.
       But "//" will collide with old-style identifiers.
     - If an old-style document identifier must always start with a "/", then
       we can simply start a new-style identifier with (say) a "+" to signal
       its form
     - It is not yet clear whether there is a new form that would let us tell
       if a doc identifier is old-form or not
   - The native authority also currently needs to know which site it is working
     with
     - Site discovery must therefore also be run in the authority, and tokens
       for each discovered site must be returned
     - Native tokens must therefore be qualified with a site ID
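
To make the notes above concrete, here is a minimal sketch of one way the 
identifier and token forms could work.  It is written in Python for brevity and 
is not the connector's code.  The "+" prefix is the one floated above; the 
function names, the rule that a site path never contains "//", and the 
"<site_id>:<token>" form are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical marker for new-style identifiers. This relies on the assumption
# from the notes that every old-style identifier starts with "/".
NEW_STYLE_PREFIX = "+"
# Separator between the site path and the path within the site.
SEPARATOR = "//"


def encode_doc_id(site_path: str, inner_path: str) -> str:
    """Build a new-style identifier: +<site_path>//<inner_path>."""
    if not inner_path.startswith("/"):
        raise ValueError("inner path is assumed to start with '/'")
    # Normalize the site path so that a bare "/" becomes the empty string;
    # this keeps the first "//" in the result unambiguous.
    site = site_path.rstrip("/")
    if SEPARATOR in site:
        raise ValueError("site path is assumed not to contain '//'")
    # Drop the inner path's leading "/" because the separator replaces it.
    return NEW_STYLE_PREFIX + site + SEPARATOR + inner_path[1:]


def is_new_style(doc_id: str) -> bool:
    """Old-style identifiers start with "/", new-style ones with "+"."""
    return doc_id.startswith(NEW_STYLE_PREFIX)


def decode_doc_id(doc_id: str) -> Tuple[Optional[str], str]:
    """Return (site_path, inner_path); site_path is None for old-style ids."""
    if not is_new_style(doc_id):
        return None, doc_id
    body = doc_id[len(NEW_STYLE_PREFIX):]
    idx = body.find(SEPARATOR)
    if idx < 0:
        raise ValueError("new-style identifier has no '//' separator")
    # An empty site part stands for the root site "/".
    site = body[:idx] or "/"
    return site, "/" + body[idx + len(SEPARATOR):]


def qualify_token(site_id: str, token: str) -> str:
    """Qualify a native access token with the site it came from."""
    return site_id + ":" + token
```

Because the site path is normalized before encoding, the first "//" after the 
"+" always marks the end of the site path, even when the site or subsite path 
is just "/" and even when the remainder itself contains "//".  Old-style 
identifiers pass through decode_doc_id unchanged, so both forms could coexist 
during a transition.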





--
This message was sent by Atlassian JIRA
(v6.2#6252)
