Seems to be an SP-internal thing: http://msdn.microsoft.com/en-us/library/aa661294.ASPX

Mark

On Mon, Nov 18, 2013 at 5:39 PM, Karl Wright <[email protected]> wrote:

> Hi Mark,
>
> Is "Cache Profiles" a list in your SharePoint? If not, what is it?
>
> Karl
>
> On Mon, Nov 18, 2013 at 8:37 PM, Mark Libucha <[email protected]> wrote:
>
>> Hi Karl,
>>
>> It's not the first problem you mentioned. I don't have a site specified
>> in my SP connection. But it could well be the misconfigured IIS issue...
>>
>> Here's what I get with your modified log message:
>>
>> ERROR 2013-11-18 20:35:47,440 (Worker thread '7') - Exception tossed:
>> Expected path to start with /Lists/, saw: '/Cache Profiles/1_.000'
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Expected path
>> to start with /Lists/, saw: '/Cache Profiles/1_.000'
>>
>> Thanks,
>>
>> Mark
>>
>> On Mon, Nov 18, 2013 at 5:29 PM, Karl Wright <[email protected]> wrote:
>>
>>> Hi Mark,
>>>
>>> The exception is very helpful.
>>>
>>> I've seen this before. I know of two ways it can happen.
>>>
>>> First way: your Repository Connection is not actually pointing at the
>>> SharePoint root, but rather a subsite of the root. That usually messes
>>> things up pretty well - and it's not easy to detect in the connector
>>> properly either. You must point at the actual root, not a subsite, and use
>>> the criteria to limit what you include.
>>>
>>> Second way: your SharePoint instance has a malconfigured IIS, which is
>>> mapping paths in ways that are unexpected.
>>>
>>> There may be other ways that this can happen; SharePoint has a myriad
>>> different configuration options and it is possible your instance has one
>>> that is not something we've ever seen before.
>>> If you think that is what is
>>> happening, change this line:
>>>
>>> throw new ManifoldCFException("Expected path to start with /Lists/");
>>>
>>> to:
>>>
>>> throw new ManifoldCFException("Expected path to start with /Lists/, saw: '"+relPath+"'");
>>>
>>> Karl
>>>
>>> On Mon, Nov 18, 2013 at 8:20 PM, Mark Libucha <[email protected]> wrote:
>>>
>>>> Screen shot attached. Using 4.1, SharePoint 2010.
>>>>
>>>> Throws this exception:
>>>>
>>>> ERROR 2013-11-18 20:12:58,058 (Worker thread '13') - Exception tossed:
>>>> Expected path to start with /Lists/
>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Expected
>>>> path to start with /Lists/
>>>> at org.apache.manifoldcf.crawler.connectors.sharepoint.SharePointRepository$ListItemStream.addFile(SharePointRepository.java:2255)
>>>>
>>>> I added a debug log message to the SharePoint crawler so the line
>>>> number may be off by 1 or 2...
>>>>
>>>> Thanks,
>>>>
>>>> Mark
>>>>
>>>> On Mon, Nov 18, 2013 at 4:59 PM, Karl Wright <[email protected]> wrote:
>>>>
>>>>> Hi Mark,
>>>>>
>>>>> First, what version of ManifoldCF are you using? 1.3 has some bugs
>>>>> where lists are concerned.
>>>>>
>>>>> Second, I've recently and repeatedly run exactly this crawl against a
>>>>> site that one of our ManifoldCF users set up in Amazon, so I know it works
>>>>> properly. So now the question is to determine exactly what you are doing
>>>>> that is not correct.
>>>>>
>>>>> If you want to crawl just lists, you will nevertheless need to enter
>>>>> both a Site match and a List match. Otherwise you will get nothing,
>>>>> because no sites can be crawled.
>>>>>
>>>>> To enter ANY of the rules I specified above, type a "*" in the type-in
>>>>> box, then select "Add Text". Then, select one of "File", "Site", "List", or
>>>>> "Library" from the pulldown, and then click the "Add new Rule" button.
>>>>> The Metadata tab works similarly.
>>>>>
>>>>> If you want me to verify you have done this correctly, please include
>>>>> a screen shot of the job's View page.
>>>>>
>>>>> If this still isn't helping you, please include a screen shot of the
>>>>> Simple History report after you have run a crawl.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>> On Mon, Nov 18, 2013 at 7:49 PM, Mark Libucha <[email protected]> wrote:
>>>>>
>>>>>> I've seen this issue come up before, but I'd like to hear more about
>>>>>> it (Karl), if there is more to say about it...
>>>>>>
>>>>>> Why isn't there an option to crawl an entire SharePoint site? I mean
>>>>>> it's awesome that the UI gives us the option of drilling down dynamically
>>>>>> and specifying exactly which parts we want crawled, but isn't the default
>>>>>> case for most users to just crawl the whole thing?
>>>>>>
>>>>>> So, why exactly is this not an option, and would adding that
>>>>>> functionality (I would be volunteering to try this) be feasible?
>>>>>>
>>>>>> On a more specific level, Karl wrote this in an earlier thread:
>>>>>>
>>>>>> <quote>
>>>>>> For SharePoint, if you want to crawl everything beneath your root
>>>>>> site, the simplest way is to define 4 rules:
>>>>>> (1) SITE rule "/*"
>>>>>> (2) LIST rule "/*"
>>>>>> (3) LIBRARY rule "/*"
>>>>>> (4) FILE rule "/*"
>>>>>> </quote>
>>>>>>
>>>>>> I haven't been able to get this to work. It only seems to get files.
>>>>>>
>>>>>> Limiting the scope to just Lists, when I use "/*" and specify List, I
>>>>>> get nothing crawled. Also tried "/Lists/*". Still nothing.
>>>>>>
>>>>>> Maybe I'm not specifying the Metadata correctly? Could you expand on
>>>>>> this, Karl? What exactly needs to be specified to crawl all Lists? If I
>>>>>> can get that to work I can probably figure out the rest of it.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mark
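
For anyone reading this thread later: the diagnostic change suggested above (putting the offending path into the exception message) can be sketched in isolation roughly as follows. This is only an illustration of the pattern, not the connector's actual code. The class name ListPathCheck, the method requireListsPrefix, and the use of IllegalArgumentException in place of ManifoldCFException are all assumptions made so the example compiles on its own.

```java
// Hypothetical stand-alone sketch; NOT the real SharePointRepository code.
public class ListPathCheck {

    // Returns the part of relPath after "/Lists/". If the prefix is missing,
    // throws with the offending path in the message so the log shows what
    // was actually seen.
    public static String requireListsPrefix(String relPath) {
        final String prefix = "/Lists/";
        if (!relPath.startsWith(prefix)) {
            // IllegalArgumentException stands in for ManifoldCFException here.
            throw new IllegalArgumentException(
                "Expected path to start with /Lists/, saw: '" + relPath + "'");
        }
        return relPath.substring(prefix.length());
    }

    public static void main(String[] args) {
        // A path under /Lists/ passes the check.
        System.out.println(requireListsPrefix("/Lists/Tasks/1_.000"));
        try {
            // The path reported in this thread fails the check.
            requireListsPrefix("/Cache Profiles/1_.000");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Including the path in the message is what made it possible to see that the item lived under "/Cache Profiles/" rather than "/Lists/".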
