Hi Will,

Any reason why you are not using ManifoldCF's built-in file system output connector?

http://manifoldcf.apache.org/release/release-1.6/en_US/end-user-documentation.html#filesystemoutputconnector

Regarding the aspx files, maybe your Path Rules are not capturing all of them?
Can you give some example URLs that are missing from the index, along with your rules?

Ahmet

On Monday, May 12, 2014 9:09 AM, Will Parkinson <[email protected]> 
wrote:



Hello

I have a crawl of SharePoint 2010 running using ManifoldCF, which is storing 
data to the filesystem using a custom output connector I built.

After looking at the files stored on the filesystem, I noticed that the 
AllItems.aspx and default.aspx files for some areas are being stored, whereas 
others are being ignored and not downloaded.  I have my debug verbosity set to 
the maximum level and I can't see that these files are ever encountered for 
download, but they clearly exist on the SharePoint site.

Does anybody know why some AllItems.aspx and default.aspx files would be 
downloaded and others with the same name ignored?

Cheers,

Will 
