So that's the expected behavior. The mapper attachment plugin only indexes the extracted content; it never modifies the _source document, so _source still holds the Base64 string you sent.
If you want to see the extracted text, you need to store the field and explicitly ask for it at query time using the fields option. Have a look here: https://github.com/elasticsearch/elasticsearch-mapper-attachments#highlighting-attachments

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>

> On 20 Nov 2014, at 20:14, Raymond Giorgi <[email protected]> wrote:
>
> Also, this is the first line of what's posted along the river
>
> { "index": {"_index":"resumes","_type":"resume","_id":"2158912"}}
>
> Things can get truncated when they're as big as a Base64 encoded file :)
>
> On Wednesday, November 19, 2014 6:01:29 PM UTC-5, Raymond Giorgi wrote:
> Hey all,
>
> I'm hoping someone can help me out with something I'm having an issue with.
>
> The short: I'm trying to extract plaintext from the attachment-mapper.
>
> The long: I'm posting the contents of a file Base64 encoded to RabbitMQ which
> is feeding an ElasticSearch river plugin. Querying against the field works
> fine, but it only seems to store the Base64 encoding of the file instead of
> the plaintext. I'd like to extract the contents as plaintext and have that be
> returnable (i.e. query for the text of a docx). I'm feeding it from a PHP
> front end, so there are places in the app where I'd like to rely on
> Elasticsearch's built in Tika processor.
>
> Thanks!
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8A848658-E1A7-4192-B66C-104D664C7A66%40pilato.fr.
For more options, visit https://groups.google.com/d/optout.
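
For anyone finding this thread later, here is a rough sketch of the advice in the reply: store the extracted text in the mapping, then request it with the fields option at query time. It only builds the JSON bodies and sends nothing. The index and type names (resumes, resume) come from the thread; the attachment field name "file" and the sub-field name "content" are assumptions, not something the thread states, and the sub-field name varies between plugin releases, so check the README linked above for your version.

```python
import json

INDEX = "resumes"    # from the thread
DOC_TYPE = "resume"  # from the thread
FIELD = "file"       # hypothetical name of the attachment field


def attachment_mapping(field=FIELD, content_subfield="content"):
    """Build a mapping body that stores the text extracted by the plugin.

    content_subfield is an assumption: the sub-field that holds the
    extracted text is named differently across plugin releases.
    """
    return {
        DOC_TYPE: {
            "properties": {
                field: {
                    "type": "attachment",
                    "fields": {
                        # "store": True keeps the extracted text retrievable;
                        # _source itself still contains only the Base64 input.
                        content_subfield: {"type": "string", "store": True},
                    },
                }
            }
        }
    }


def search_body(text, field=FIELD, content_subfield="content"):
    """Build a search body that matches on the extracted text and asks
    for the stored field explicitly via the fields option."""
    path = f"{field}.{content_subfield}"
    return {
        "fields": [path],
        "query": {"match": {path: text}},
    }


def stored_text(hit, path):
    """Pull the stored text out of one search hit.

    Assumes stored fields come back as lists under the hit's "fields" key.
    """
    return " ".join(hit.get("fields", {}).get(path, []))


if __name__ == "__main__":
    print(json.dumps(attachment_mapping(), indent=2))
    print(json.dumps(search_body("java developer"), indent=2))
```

The first body would go to the mapping endpoint of the resumes index before any documents are indexed, and the second to its _search endpoint. Documents indexed before the mapping was in place would most likely need to be reindexed for the stored field to exist.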
