This code is now complete in both trunk and the dev_1x branch.

Karl

On Tue, Oct 7, 2014 at 11:18 AM, Karl Wright <[email protected]> wrote:

> Hi Jitu,
>
> I would suggest that we do not try for multiple date ranges, but just an
> "earliest document date" filtering parameter. Adding this functionality to
> the Document Filter transformation connector would be what I'd do. If
> necessary, we can also add an IOutputActivities method which will allow a
> connector to decide whether a document needs to be fetched or not based on
> its date stamp; this would help prevent unnecessary work opening older
> documents.
>
> Oddly enough, I think that the work involved would largely be in coming up
> with a reasonable date selection UI.
>
> If this sounds like it is what you want, please go ahead and create a
> ticket describing this functionality.
>
> Karl
>
>
> On Tue, Oct 7, 2014 at 10:35 AM, Jitu <[email protected]> wrote:
>
>> Hi Karl,
>> Thanks for the support. what you said is absolutely what we
>> are looking for too. Crawling is absolutely fine but we should not process
>> the documents until the criteria is met. here the criteria is file modified
>> during last 2 months or 3 months or date range.
>>
>> It is something similar to getDocumentVersions which checks if that
>> document version is updated and process the file only if the version is
>> updated. so crawl the documents but don't process them unless the criteria
>> matches. is there a way to achieve it.
>>
>> Thanks,
>> Jitu
>>
>> On Tue, Oct 7, 2014 at 7:50 PM, Karl Wright <[email protected]> wrote:
>>
>>> Hi Jitu,
>>>
>>> I know of no way to crawl only those documents that were created after a
>>> specified date. SharePoint crawling involves walking a tree, not querying
>>> SharePoint for a list of documents that fulfills a specific criteria.
>>>
>>> What this means is that we will need to crawl the entire tree
>>> *regardless* of what documents we decide to index. We can filter the
>>> discovered documents by looking at their creation date, and exclude those
>>> last modified prior to 2011-01-01 from being indexed. That would cut down
>>> on the work that your index needs to do, and the work of actually fetching
>>> the content itself. But we would still need to crawl all documents.
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 7, 2014 at 10:11 AM, Jitu <[email protected]> wrote:
>>>
>>>> Hi Karl,
>>>>
>>>> Here is the requirement:
>>>>
>>>> One of our customers would like to selectively publish the documents
>>>> from his SharePoint which is over grown in size in due course. Since
>>>> filtering based on folder names is not an easy task, he likes us to crawl
>>>> all the documents created in sharepoint between 2 dates.
>>>>
>>>> All documents created/modified between 2011-01-01 till 2013-12-31 are
>>>> needed to crawl and if that is possible to do, then the additional filters
>>>> get added to the date range. Ex: get only the Docx and Doc files created
>>>> between 2011-01-01 to 2013-12-31 etc…
>>>>
>>>> similarly all documents created/modified in last 2 months etc...
>>>>
>>>> Thanks,
>>>>
>>>> Jitu
>>>>
>>>> On Mon, Oct 6, 2014 at 5:04 PM, Karl Wright <[email protected]> wrote:
>>>>
>>>>> Hi Jitu,
>>>>>
>>>>> Did you ever figure out what the customer requirement really was here?
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>>
>>>>> On Fri, Oct 3, 2014 at 6:09 PM, Karl Wright <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Jitu,
>>>>>>
>>>>>> SharePoint does not provide a way to crawl documents by date range,
>>>>>> so all documents will need to be crawled regardless of any date range
>>>>>> requirement, and then filtered.
>>>>>>
>>>>>> So at this point it is important to ask the client if their
>>>>>> requirement's purpose is to save crawling load on the server, because if
>>>>>> it is, you won't get much savings. But if the client wants this feature for
>>>>>> other reasons, we can support it with some work.
>>>>>>
>>>>>> Please open a ticket if you find that the client has a legitimate
>>>>>> reason for this requirement.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> Sent from my Windows Phone
>>>>>> ------------------------------
>>>>>> From: Jitu
>>>>>> Sent: 10/3/2014 4:22 PM
>>>>>> To: [email protected]
>>>>>> Subject: regarding crawl parameters
>>>>>>
>>>>>> Hi Karl,
>>>>>>
>>>>>> Thanks for your continuous support. we have a requirement from our
>>>>>> client to crawl files which are created/modified in last one month or 2
>>>>>> months from share point server and that parameter should be configurable
>>>>>> in gui. we are using manifoldcf 1.7 version. Is there a way to achieve this.
>>>>>> Please help.
>>>>>>
>>>>>> Thanks,
>>>>>> Jitu
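
For readers of this thread, the approach discussed above (walk the whole tree, then skip any document older than an "earliest document date") can be sketched roughly as follows. This is only an illustration in plain Java: `EarliestDateFilter` and its methods are hypothetical names, not part of the ManifoldCF API or of the Document Filter connector, and keeping documents whose modification date is unknown is an assumption made here.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper for illustration only; not ManifoldCF code.
final class EarliestDateFilter {
    private final Instant cutoff;

    EarliestDateFilter(Instant cutoff) {
        this.cutoff = cutoff;
    }

    // Absolute cutoff: start of the given day in UTC, e.g. 2011-01-01.
    static EarliestDateFilter since(LocalDate date) {
        return new EarliestDateFilter(date.atStartOfDay(ZoneOffset.UTC).toInstant());
    }

    // Relative cutoff: "modified in the last N months", measured from 'now'.
    static EarliestDateFilter lastMonths(int months, Instant now) {
        return new EarliestDateFilter(
            now.atOffset(ZoneOffset.UTC).minusMonths(months).toInstant());
    }

    // True if the document should be fetched and indexed.
    // Assumption: a document with no known date is kept rather than dropped.
    boolean accepts(Instant lastModified) {
        return lastModified == null || !lastModified.isBefore(cutoff);
    }
}
```

Every discovered document would still be visited during the crawl; a check like `accepts(...)` only decides whether its content is fetched and sent on to the index.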
