Re: Tika content extractor transformation connection

Karl Wright Tue, 03 Mar 2015 03:35:43 -0800

Hi Madalina,

If you are using MCF 1.7 or greater, you can specify multiple output
connections for a job, and different transformations for each output
connection.  So you should be able to do anything you like, provided the
transformations you are attempting are supported as transformation
connectors.

For extraction to Solr, you can either extract the documents within MCF
from binary to text, and index those through the update handler, OR you can
send the documents intact to Solr via the update/extract handler.  If you
want to make a separate copy of the text somewhere, then you would probably
want to do the extraction once, and output the result both to Solr's update
handler and to the file system.
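
To make the difference between the two Solr paths concrete, here is a minimal
sketch of what each request looks like on the Solr side. It only builds the
requests and sends nothing. MCF's Solr output connector does this for you; the
core URL, the "content" field name, and the helper function names below are
assumptions for illustration, not MCF or Solr configuration.

```python
import json
from urllib.parse import urlencode

# Assumed Solr core URL; adjust to your own installation.
SOLR_CORE_URL = "http://localhost:8983/solr/collection1"


def build_update_request(doc_id, text):
    """Path 1: text was already extracted inside MCF (e.g. by a Tika
    transformation), so plain fields go to the /update handler."""
    url = SOLR_CORE_URL + "/update?" + urlencode({"commit": "true"})
    headers = {"Content-Type": "application/json"}
    # "content" is a hypothetical field name; it depends on your schema.
    body = json.dumps([{"id": doc_id, "content": text}]).encode("utf-8")
    return url, headers, body


def build_extract_request(doc_id, raw_bytes, mime_type):
    """Path 2: the binary document is sent intact to /update/extract and
    Solr performs the extraction itself."""
    params = {"literal.id": doc_id, "commit": "true"}
    url = SOLR_CORE_URL + "/update/extract?" + urlencode(params)
    headers = {"Content-Type": mime_type}
    return url, headers, raw_bytes


if __name__ == "__main__":
    print(build_update_request("doc1", "hello")[0])
    print(build_extract_request("doc1", b"\xd0\xcf", "application/msword")[0])
```

In the first case Solr only ever sees text, which is why a single extraction
inside MCF can feed both Solr and a second output.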

Please note that the file system output connector does not do anything with
metadata, so that would be lost.

Karl

On Tue, Mar 3, 2015 at 5:14 AM, Madalina Rogoz <[email protected]> wrote:

> Can a Tika transformation connection be used to actually move documents
> from SharePoint to Solr/FileShare or can that only be achieved with a File
> System Output Connection?
>
> What I am trying to figure out is if ManifoldCF can handle a migration
> from SharePoint 2010 to Solr. I am thinking to crawl SP with ManifoldCF,
> but I also need the actual Office Documents to be available outside of
> SharePoint after the crawl.
> So I either need to use a Tika transformation connection that saves the
> document as an attachment/binary string to Solr or use an additional File
> System Output connection where I only save the Office Documents and then
> figure out how to update the Solr metadata so that the document links point
> to the file share instead.
>
> Thoughts? Any idea is appreciated.
> Thank you!
>
