On 05.08.2011 18:16, Gora Mohanty wrote:
Hi,
I'm not too familiar with Nutch these days, but my guess is that a Solr
analyser is getting applied. To keep a field exactly as is, use the string
fieldtype in Solr's schema.xml rather than the text fieldtype.
Regards,
Gora
Hi Gora,
thank you for your answer. The field was already a String. The splitting
was done by the Nutch index-more plugin and passed to the type field, which
is multivalued.
But it is good to know for future use that the string type is not
processed by Solr.
Thank you.
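
For anyone finding this thread later, here is a rough Python sketch of the
splitting described above. It is only an illustration of the observed
behaviour, not the actual index-more code; the function name
split_content_type is made up for this example.

```python
from typing import List


def split_content_type(content_type: str) -> List[str]:
    """Return the full content-type followed by its primary type and subtype.

    Hypothetical helper: it only mimics the three values seen in the
    multivalued type field, it is not taken from Nutch.
    """
    values = [content_type]
    # partition() splits on the first "/" only; sep is "" if there is none
    primary, sep, sub = content_type.partition("/")
    if sep and primary and sub:
        values.extend([primary, sub])
    return values


print(split_content_type("Application/pdf"))
# prints ['Application/pdf', 'Application', 'pdf']
```

Since the type field is multivalued, each of these three values ends up as
its own entry in the index, even though the string type itself is not
analysed.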
On 05-Aug-2011 6:35 PM, "Marek Bachmann"<[email protected]> wrote:
Hello people,
I was just wondering how to prevent the content-type string from being
split into multiple values.
For example: If a document has the content-type "Application/pdf", it is
broken into three pieces, "Application/pdf", "Application" and "pdf", in the
Solr field type.
I am not sure if this is done by Nutch, or if it is an indexing issue in
Solr.
I'm sure someone knows the answer to that.
Thank you.