You can inspect the CrawlDB with the readdb tool to check whether the URL is in there.
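
For example, something along these lines should work. This is only a sketch: it assumes a Nutch 1.x install directory, a CrawlDB at crawl/crawldb, and a made-up URL for the spreadsheet, so substitute your own values.

```shell
# Assumed locations; adjust both to your own setup.
CRAWLDB=crawl/crawldb                           # assumed CrawlDB path
URL="http://www.example.com/files/report.xls"   # hypothetical URL of the xls file

# Only run the lookups when called from a Nutch install directory.
if [ -x bin/nutch ]; then
  # Overall counts per status (db_unfetched, db_fetched, ...).
  bin/nutch readdb "$CRAWLDB" -stats
  # Look up the record for that single URL, if the CrawlDB has one.
  bin/nutch readdb "$CRAWLDB" -url "$URL"
fi
```

If the URL has no entry at all, the crawler never discovered it, which would fit a file that nothing on the site links to.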
 
-----Original message-----
> From:Tolga <[email protected]>
> Sent: Wed 23-May-2012 14:21
> To: [email protected]
> Subject: Re: Apparently far from last question :)
> 
> My colleague has just made me realize something. Is it possible that 
> this xls file wasn't crawled because there isn't a link to it within the 
> website?
> 
> Regards,
> 
> On 5/23/12 2:05 PM, Lewis John Mcgibbney wrote:
> > There is absolutely no requirement to add this configuration to this file.
> > If you look at the XML file in question, one of the first XML
> > configuration blocks says
> >
> > <!--  by default if the mimeType is set to *, or
> >          if it can't be determined, use parse-tika -->
> >     <mimeType name="*">
> >     <plugin id="parse-tika" />
> >     </mimeType>
> >
> > Just remove your unnecessary config and Tika will do the work for you :0)
> >
> > Lewis
> >
> > On Wed, May 23, 2012 at 11:44 AM, Tolga <[email protected]> wrote:
> >> Hi,
> >>
> >> I put the lines
> >>
> >> <mimeType name="application/x-excel">
> >> <plugin id="parse-tika" />
> >> <plugin id="feed" />
> >> </mimeType>
> >>
> >> in parse-plugins.xml, but I still can't crawl xls files. Why is that?
> >>
> >> Regards,
> >
> >
> 
