If you can do it yourself and use Tika directly, I’d definitely do that and 
not use the mapper attachment plugin.
That gives you more control over exactly what you want to do than the plugin 
does.
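
A minimal sketch of that approach, assuming the text has already been extracted 
with Tika outside Elasticsearch. The field names (kind, url, content, 
attached_to), the index name and the helper functions are all hypothetical, not 
anything the plugin or Elasticsearch defines. Each file becomes its own 
document, so it can show up in results next to pages, and it lists every page 
it is attached to:

```python
import hashlib
import json


def build_file_document(file_url, extracted_text, page_urls):
    """Build a standalone search document for one file (hypothetical schema)."""
    # Derive a stable ID from the file URL so re-indexing the same file
    # overwrites the existing document instead of creating a duplicate.
    doc_id = hashlib.sha1(file_url.encode("utf-8")).hexdigest()
    body = {
        "kind": "file",                          # lets a query tell files from pages
        "url": file_url,
        "content": extracted_text,               # text produced by Tika beforehand
        "attached_to": sorted(set(page_urls)),   # one file, many pages
    }
    return doc_id, body


def to_bulk_lines(index_name, docs):
    """Serialise (doc_id, body) pairs as action/source line pairs for _bulk."""
    lines = []
    for doc_id, body in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(body))
    # The bulk endpoint expects newline-delimited JSON with a trailing newline.
    return "\n".join(lines) + "\n"
```

The resulting string can be sent to the _bulk endpoint with any HTTP client. 
Depending on the Elasticsearch version, the action line may also need a _type, 
which this sketch leaves out.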

My 2 cents

-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr 
<https://twitter.com/elasticsearchfr> | @scrutmydocs 
<https://twitter.com/scrutmydocs>



> On 10 Dec 2014, at 12:42, Peter Bowyer <pe...@mapledesign.co.uk> wrote:
> 
> Hi list,
> 
> I'm indexing a website which has a lot of files on it.
> 
> I found the attachment plugin which handles all file types we have, but our 
> files are not "attached" (associated) with a particular web page -- in many 
> cases the same file is attached to multiple pages. So we want files to show 
> in the search results alongside other items.
> 
> I can extract data from the file myself using Apache Tika and index it as 
> with any other document in the system; but given Tika runs inside the 
> attachment plugin, is there any way to use the built-in system?
> 
> Thanks,
> Peter
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to elasticsearch+unsubscr...@googlegroups.com 
> <mailto:elasticsearch+unsubscr...@googlegroups.com>.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/elasticsearch/fa853486-ef71-48f2-a711-998b792cfb35%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/elasticsearch/fa853486-ef71-48f2-a711-998b792cfb35%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.

