On 07.08.2011 15:35, Markus Jelsma wrote:
<property>
  <name>moreIndexingFilter.indexMimeTypeParts</name>
  <value>true</value>
  <description>Determines whether the index-more plugin will split the mime-type
  in sub parts, this requires the type field to be multi valued. Set to true for backward
  compatibility. False will not split the mime-type.
  </description>
</property>
Thank you very much Markus,
I have copied this to my nutch-site.xml. It works very well now.
But I didn't have this option in my nutch-default.xml. Is there a standard
way to find out which options I can pass to a plugin?
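
A minimal sketch of what the override in nutch-site.xml might look like, assuming the goal is to keep the content type as a single value. The property name is taken from the snippet quoted above; setting the value to false follows the property's own description ("False will not split the mime-type") and is an assumption about the setup, not a verified configuration for every Nutch version.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Overrides the value from nutch-default.xml.
       Assumption: false keeps the mime-type as one unsplit value. -->
  <property>
    <name>moreIndexingFilter.indexMimeTypeParts</name>
    <value>false</value>
  </property>
</configuration>
```

Properties set in nutch-site.xml take precedence over those in nutch-default.xml, so the default file can stay untouched.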
Hello people,
I was just wondering how to prevent the content-type string from being
split into multiple values.
For example: if a document has the content-type "Application/pdf", it is
broken into three pieces, "Application/pdf", "Application", and "pdf", in
the Solr type field.
I am not sure whether this is done by Nutch or whether it is an indexing issue in Solr.
I am sure someone knows the answer to that.
Thank you.