Hi Karl,
Thanks for your quick reply. I'm using MCF 1.7.1, and below is the Solr log
entry for one specific document pushed by the file system connector:
349529202 [qtp1191043673-15] INFO
org.apache.solr.update.processor.LogUpdateProcessor – [core1] webapp=/solr
path=/update/extract
params={literal.deny_token_document=MyGroup:DEAD_AUTHORITY&literal.size=373&literal.stream_name=Indexing_test1.doc&literal.createdOn=Fri+Sep+26+04:23:31+BRT+2014&literal.id=file://///server1/shared/Kanik/Indexing_test1.doc&resource.name=Indexing_test1.doc&literal.allow_token_document=MyGroup:S-1-5-21-220523388-1085031214-725345543-1306383&literal.allow_token_document=MyGroup:S-1-5-32-544&literal.xisourcetype=FileServer&wt=xml&version=2.2&literal.xifilename=Indexing_test1.doc&literal.attributes=32&literal.Content-Type=application/rtf&literal.lastModified=Tue+Nov+18+12:04:48+BRT+2014&literal.shareName=shared}
{add=[file://///server1/shared/Kanik/Indexing_test1.doc
(1485122413007994880)]} 0 15
I see the URL Mapping tab in a job that was created on a Windows share repository.
By the way, how should the feature request be created?
Regards
Kambiz
________________________________
From: Karl Wright <[email protected]>
To: "[email protected]" <[email protected]>; Kambiz Niktabar
<[email protected]>
Sent: Tuesday, November 18, 2014 3:29 PM
Subject: Re: Date normalization & URL mapping
Hi Kambiz,
What version of MCF are you using? In 1.7, the file system connector sets the
RepositoryDocument's modifiedDate field, which the Solr output connector
formats as ISO 8601:
    if ( modifiedDateAttributeName != null )
    {
      Date date = document.getModifiedDate();
      if ( date != null )
      {
        outputDoc.addField( modifiedDateAttributeName,
          DateParser.formatISO8601Date( date ) );
      }
    }
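To illustrate the target format, here is a minimal standalone sketch that turns a java.util.Date into the ISO 8601 UTC form Solr date fields accept. It uses the JDK's java.time classes rather than MCF's DateParser, and the class name, method name, and the UTC-3 offset in the sample are assumptions made for the example only.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper, not part of ManifoldCF.
class SolrDateSketch {
    // Formats a Date as an ISO 8601 instant in UTC, e.g. 2014-11-18T15:04:48Z.
    static String toSolrDate(Date date) {
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    public static void main(String[] args) {
        // Assumption: the local timestamp 12:04:48 was recorded at UTC-3.
        Date modified = Date.from(
            ZonedDateTime.of(2014, 11, 18, 12, 4, 48, 0, ZoneOffset.ofHours(-3)).toInstant());
        System.out.println(toSolrDate(modified));
    }
}
```

A value in this shape can be indexed into a Solr date field, whereas the default Date.toString() form (Tue Nov 18 12:04:48 BRT 2014) cannot.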
As for mapping URLs in the file system connector, it does not at this time have
that kind of feature. This is something you will need to request if you need
it. Where are you seeing a URL mapping tab in either the Solr or the file system
connector?
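
For reference, the rewrite being asked about (\\server1\shared\test\file.doc -> H:\test\file.doc) amounts to a prefix substitution. The sketch below is plain Java for illustration only; it is not an MCF API or connector feature, and the class and method names are hypothetical.

```java
// Hypothetical helper, not part of ManifoldCF.
class UncPathMapSketch {
    // Replaces a leading UNC share prefix with a drive prefix, ignoring case.
    // Paths that do not start with the share prefix are returned unchanged.
    static String mapPath(String path, String uncPrefix, String drivePrefix) {
        int n = uncPrefix.length();
        boolean prefixMatches = path.regionMatches(true, 0, uncPrefix, 0, n);
        // Require a path separator (or end of string) right after the prefix,
        // so that \\server1\shared does not match \\server1\shared2.
        boolean atBoundary = path.length() == n
            || (path.length() > n && path.charAt(n) == '\\');
        if (prefixMatches && atBoundary) {
            return drivePrefix + path.substring(n);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(
            mapPath("\\\\server1\\shared\\test\\file.doc", "\\\\server1\\shared", "H:"));
    }
}
```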
Karl
On Tue, Nov 18, 2014 at 9:01 AM, Kambiz Niktabar <[email protected]> wrote:
>Hello,
>
>
>Actually I have two questions this time:
> 1. Is there any way to handle date normalization as part of document
> processing in ManifoldCF? I tried running the file system connector and want to
> map the last-modified field to a field in Solr, but in the Solr log it
> looks like this: lastModified=Fri+Sep+26+05:51:08+BRT+2014, which is not
> accepted by Solr. How can I normalize the date and time to a format that
> Solr accepts?
> 2. How can I use the URL mapping tab to replace part of the URL with
> something else, like below:
>e.g. \\server1\shared\test\file.doc -> H:\test\file.doc
>
>
>Regards
>Kambiz Niktabar