Hi Ameya,

The file system connector does not retrieve any metadata for a document at
all.  So I'm not sure what metadata you are talking about.

Karl



On Thu, Jul 31, 2014 at 2:44 PM, Ameya Aware <[email protected]> wrote:

> The thing is, I am not looking for the data or content of any of the
> files. I am just interested in each file's metadata.
>
> So I thought it should be possible to skip reading the files and just get
> each file's metadata and pass it to Solr.
>
> This should save lots of time.
>
> Is it possible to do this?
>
> Thanks,
> Ameya
>
>
>
> On Thu, Jul 31, 2014 at 2:13 PM, Karl Wright <[email protected]> wrote:
>
>> Hi Ameya,
>>
>> (1) Please look at the Simple History report.  Note what kinds of
>> documents are being fetched, what kinds are being indexed, and how long it
>> is taking.  I have noted from your previous posts that you seem to be
>> indexing a lot of very large EXE files.  This is useless and you should be
>> excluding them.
>>
>> (2) Please look in the manifoldcf.log file for evidence that fetches
>> and/or Solr indexing requests are being retried due to errors.  It doesn't
>> take many documents being chronically retried before forward progress drops
>> to near zero.
>>
>> (3) If you look into (1) & (2) and everything seems fine, it may be a
>> misalignment between availability of several kinds of resources that is the
>> problem.  Please get a thread dump of the agents process while it is
>> crawling, using jstack.  Post that thread dump and we can tell you what to
>> look at next.
>>
>> Karl
>>
>>
>>
>> On Thu, Jul 31, 2014 at 2:07 PM, Ameya Aware <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>> I am using the file system connector to index my entire C drive, with
>>> Solr as the output connector.
>>>
>>> The initial 100,000 documents were crawled and indexed successfully in a
>>> couple of hours, but after that indexing slowed down badly (to around
>>> 15-20 documents per minute).
>>>
>>>
>>> I am not able to figure out whether the issue is with MCF or Solr.
>>>
>>>
>>> Can you advise me on how to proceed with this?
>>>
>>>
>>> Thanks,
>>> Ameya
>>>
>>
>>
>
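
To illustrate the idea raised in the thread (collecting only a file's
attributes, never opening its contents), here is a minimal sketch using the
standard java.nio.file API. It is not ManifoldCF code and says nothing about
how the file system connector behaves; the class name FileMetadataSketch, the
describe helper, and the field names are all hypothetical and chosen only for
illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of ManifoldCF or Solr.
class FileMetadataSketch {

    // Reads basic attributes from the file system without reading file contents.
    static Map<String, String> describe(Path file) throws IOException {
        BasicFileAttributes attrs =
                Files.readAttributes(file, BasicFileAttributes.class);

        // Field names below are assumptions, not a Solr or ManifoldCF schema.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("path", file.toAbsolutePath().toString());
        fields.put("size", Long.toString(attrs.size()));
        fields.put("created", attrs.creationTime().toString());
        fields.put("modified", attrs.lastModifiedTime().toString());
        return fields;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no path is given.
        Path file = Paths.get(args.length > 0 ? args[0] : ".");
        describe(file).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Passing the resulting fields on to an index is a separate step that this
sketch does not attempt.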
