One solution to this problem is to change the order of field operations
(http://wiki.apache.org/solr/ExtractingRequestHandler#Order_of_field_operations)
so that fmap.*= processing is done first and the fields from literal.*= are
added afterwards. Why would anyone want to rename a field they have just
explicitly named anyway?
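
To make the difference concrete, here is a rough toy model in Python of the two orderings. It is not Solr code: the function names are made up for illustration, and it only mimics the renaming step, assuming fields are simple (name, value) pairs.

```python
def current_order(tika_fields, literals, fmap):
    """Toy model of today's behaviour: literals are added first,
    then fmap renames every field, including the literals."""
    fields = list(tika_fields) + list(literals)
    return [(fmap.get(name, name), value) for name, value in fields]


def proposed_order(tika_fields, literals, fmap):
    """Toy model of the proposal: fmap renames only the Tika fields,
    and the literals are added afterwards, untouched."""
    renamed = [(fmap.get(name, name), value) for name, value in tika_fields]
    return renamed + list(literals)


# Hypothetical request: literal.title=CMS-Title&fmap.title=tika_title
tika = [("title", "PDF-Title")]
literals = [("title", "CMS-Title")]
fmap = {"title": "tika_title"}

print(current_order(tika, literals, fmap))
print(proposed_order(tika, literals, fmap))
```

With the proposed order, the CMS title stays in "title" and only the Tika title ends up in "tika_title".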

Another solution that would work for me is an option to prefix ALL Tika-generated
fields, e.g. tprefix=tika_. But I also need the ExtractingRequestHandler to
output to fields which do not exist in schema.xml. This is because later in the
UpdateChain I do field selection and renaming in another UpdateProcessor, so the
field names coming from the ExtractingRequestHandler are only temporary and will
not be sent to Solr. Thus, an option to skip the schema check would be useful,
perhaps in the form of a whitelist for uprefix, e.g.
&uprefix.whitelist=fielda,other-non-existing-field, causing uprefix not to
rename those fields.
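
A rough sketch of what I mean by the whitelist, again as a toy Python model rather than Solr code. Both uprefix.whitelist and the function name apply_uprefix are hypothetical; only uprefix itself exists today.

```python
def apply_uprefix(field_names, schema_fields, uprefix, whitelist=()):
    """Toy model: prefix every field that is unknown to the schema,
    unless it is explicitly whitelisted (hypothetical uprefix.whitelist)."""
    result = []
    for name in field_names:
        if name in schema_fields or name in whitelist:
            result.append(name)            # known or whitelisted: keep as-is
        else:
            result.append(uprefix + name)  # unknown: rename with the prefix
    return result


# "fielda" is not in the schema, but the whitelist lets it pass through
# so that a later UpdateProcessor can pick it up under its original name.
names = apply_uprefix(
    ["title", "fielda", "author"],
    schema_fields={"title"},
    uprefix="ignored_",
    whitelist=["fielda"],
)
print(names)
```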

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 9. juni 2011, at 11.26, Jan Høydahl wrote:

> Hi,
> 
> I post a PDF from a CMS client, which has metadata about the document. One of 
> those metadata is the title. I trust the title of the CMS more than the title 
> extracted from the PDF, but I cannot find a way to both send 
> &literal.title=CMS-Title as well as changing the name of the title field 
> generated by Tika/SolrCell. If I do fmap.title=tika_title then my 
> literal.title also changes name. Any ideas?
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
> 
