Thanks Karl. I will most certainly be reading the document you linked to in great detail. It looks like stuff I need to know.
That said, we have a given technology that we have developed and that we will be using. It creates a separate index for each user. The technology has vastly greater utility than just for SharePoint, and it has been in development for about six years. (In fact, this SharePoint thing is a recent add-on request.)

So my question is: notwithstanding that this is not the "typical" way ManifoldCF works, can we use it in the way that I am describing? Is it malleable enough to work, or is it designed to do something so different from what we need that it would be useless?

I guess the key question is really: can we tell ManifoldCF to limit results to those visible to a specific user, and would there be any performance or other unexpected downsides to doing that?

Hank

On Thu, Mar 19, 2015 at 1:53 PM, Karl Wright <[email protected]> wrote:

> Hi Hank,
>
> "Our project involves a database that has a private secure user space for
> each user. Our database is built on Lucene and indexes every object in the
> database. Each user presumably has some number of SharePoint sites that
> they have access to. We want to index each sharepoint object (file or
> sharepoint page) as we find it, for each user. The user then ends up with
> an index of just the objects that they have permissions for. But to do
> that we need to, for each user crawl all of the sharepoint sites that they
> have access to. Permissions to each sharepoint site are managed by
> Kerberos."
>
> This is not the typical ManifoldCF model. In the typical case, there is
> ONE lucene search engine (not N), and any searches that take place apply
> security restrictions internally based on the user's security information,
> as obtained from the ManifoldCF authority service, which is in turn
> querying SharePoint.
>
> You can read more about the standard authorization setup here:
>
> https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs/MCFiA%20CH%2004.pdf
>
> Karl
>
> On Thu, Mar 19, 2015 at 1:44 PM, hank williams <[email protected]> wrote:
>
>> I am embarking on an effort for which ManifoldCF may be an appropriate
>> tool. I am a total noob, having just discovered this project and have a few
>> questions that I am hoping someone can answer so that I can begin to gain
>> some confidence about the way things work. Basically I am trying to make
>> sure I understand, at a top level, how ManifoldCF works.
>>
>> Our project involves a database that has a private secure user space for
>> each user. Our database is built on Lucene and indexes every object in the
>> database. Each user presumably has some number of SharePoint sites that
>> they have access to. We want to index each sharepoint object (file or
>> sharepoint page) as we find it, for each user. The user then ends up with
>> an index of just the objects that they have permissions for. But to do
>> that we need to, for each user crawl all of the sharepoint sites that they
>> have access to. Permissions to each sharepoint site are managed by
>> Kerberos.
>>
>> So the questions are:
>>
>> a. Can I, with ManifoldCF take a list of sharepoint sites and a list of
>> users and relevant Kerberos appropriate authentication tokens or keys (just
>> learning about Kerberos), and get back a list of indexable objects/URIs
>> (HTML, .docx, pptx, etc.)?
>>
>> b. Is this the right way to think about it?
>>
>> c. If so, is there any example code or documentation that would explain
>> how I do this?
>>
>> d. Does ManifoldCF provide any information to help indicate whether the
>> given object has changed, or is that something we need to figure out by
>> manually comparing the old and new documents in our code?
