Hmm, but if I specify a stream type in the output connector, I expect it
to have some effect before the document is sent. Consider this use case:
I have a repository connector that extracts documents of every MIME type,
and four different output connectors, one for each MIME type. In that case
I want to specify the stream type for each connector, and each one should
receive only that type, without wasting time sending a useless mp4 to the
text/plain connector.
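
To make the use case concrete, here is a rough sketch of the kind of check
I have in mind. It is only an illustration: StreamTypeFilter and its
methods are hypothetical names of mine, not part of ManifoldCF, and I am
assuming the document's MIME type is already known as a plain string at
the point where the check runs.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a ManifoldCF class): decides whether a document's
// MIME type matches the type(s) an output connection is configured for.
class StreamTypeFilter {
  private final Set<String> allowed = new HashSet<>();

  StreamTypeFilter(String... allowedTypes) {
    for (String t : allowedTypes) {
      allowed.add(normalize(t));
    }
  }

  // Strip parameters such as "; charset=UTF-8" and lower-case the rest.
  static String normalize(String mimeType) {
    int semi = mimeType.indexOf(';');
    String base = semi >= 0 ? mimeType.substring(0, semi) : mimeType;
    return base.trim().toLowerCase(Locale.ROOT);
  }

  // True only when the type is known and explicitly allowed.
  boolean accepts(String mimeType) {
    return mimeType != null && allowed.contains(normalize(mimeType));
  }

  public static void main(String[] args) {
    StreamTypeFilter textOnly = new StreamTypeFilter("text/plain");
    System.out.println(textOnly.accepts("text/plain; charset=UTF-8")); // true
    System.out.println(textOnly.accepts("video/mp4"));                 // false
  }
}
```

With one such filter per output connection, the video/mp4 document would
be rejected by the text/plain connection before any content is streamed.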

Do you agree?
On 16 Dec 2013 18:02, "Karl Wright" <[email protected]> wrote:

> "So I was expecting that if we express the stream.type, we check this type
> before sending a Request to Solr."
>
> Actually, the desired mime types selected by the output connection are
> queried by the repository connection, so that document filtering can take
> place before the document is even fetched.  See
> IOutputConnector.checkMimeTypeIndexable .
>
> Karl
>
>
>
> On Mon, Dec 16, 2013 at 12:11 PM, Alessandro Benedetti <
> [email protected]> wrote:
>
> > Hi guys,
> > I was investigating the use of the stream.type parameter that we can
> > pass to a Solr Connector as an argument.
> >
> > From the wiki: "Tika will automatically attempt to determine the input
> > document type (word, pdf, etc.) and extract the content appropriately. If
> > you want, you can explicitly specify a MIME type for Tika with the
> > stream.type parameter".
> >
> > So I was expecting that if we express the stream.type, we check this type
> > before sending a Request to Solr.
> > That way we avoid sending requests for types that are not the wanted
> > one.
> >
> > But in org.apache.manifoldcf.agents.output.solr.HttpPoster, when we add
> > the content to the ContentStreamUpdateRequest, we don't check the type at
> > all:
> >
> > contentStreamUpdateRequest.addContentStream(new
> > RepositoryDocumentStream(is,length,contentType,contentName));
> >
> > So, if we pass the parameter stream.type=text/plain and we have a
> > content item that is video/mp4, we expect not to send it (it may be 1 GB
> > long and can cause problems).
> >
> > What do you think? Should we add a check on the type before sending the
> > content?
> > Am I missing something?
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>
