You can index the attachment content in the same document. That's perfectly fine.
I would only recommend extracting the document content and metadata in a separate 
process.
Then, in that process, generate the JSON document with all the fields you need.

This is basically what I did in FSRiver. I removed the mapper attachment because it 
was not flexible enough for my use case.
Imagine that you send a big PDF file (100 MB) which contains mostly pictures and 
only 10 KB of text. Instead of sending the full 100 MB document encoded in Base64, you 
can extract the text and send only that over the wire. (Your network bandwidth will 
say thank you :-) )

https://github.com/dadoonet/fsriver/blob/master/src/main/java/fr/pilato/elasticsearch/river/fs/river/FsRiver.java#L687
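
To make the bandwidth point concrete, here is a minimal Python sketch (not the FSRiver code, which is Java). It builds two JSON documents for the same file: one carrying the whole file in Base64, the other carrying only the extracted text. The field names (`name`, `file`, `filesize`, `content`) and the fake PDF bytes are assumptions for illustration; in a real pipeline the text would come from an extractor such as Tika.

```python
import base64
import json


def attachment_payload(raw_bytes, name):
    """Document carrying the whole file, Base64-encoded (hypothetical field names)."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({"name": name, "file": encoded})


def text_payload(text, name, size):
    """Document built after extraction: only the text plus a little metadata."""
    return json.dumps({"name": name, "filesize": size, "content": text})


# Stand-in for a real PDF: 1 MB of "picture" bytes plus a short text.
# The extraction step itself is assumed to have happened elsewhere.
extracted_text = "Only this text needs to be searchable. " * 20
fake_pdf = bytes(1_000_000) + extracted_text.encode("utf-8")

full = attachment_payload(fake_pdf, "report.pdf")
slim = text_payload(extracted_text, "report.pdf", len(fake_pdf))

print(len(full), "bytes with the Base64 attachment")
print(len(slim), "bytes with the extracted text only")
```

Base64 alone adds roughly a third to the file size, so the first document is always larger than the file itself, while the second one grows only with the amount of text.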

I don't remember which programming language you are using. Can you use Tika 
from it?


-- 
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr


On 21 February 2014 at 16:31:51, Patrick Aljord ([email protected]) wrote:

Salut David :) Thanks for the quick reply.

So, the best way to do this would be to index the attachment in another 
document, from another process? Or could it be in the same document as an 
attachment, but always handled in a different process? Also, is there another 
way to index files than the mapper attachment?

On Friday, February 21, 2014 4:12:09 PM UTC+1, David Pilato wrote:
Salut Patrick! :-)


You cannot change the mapping of an existing field.
You need to either add a new field, create a new type (with the new mapping), or 
create a new index.

In addition to this, if you have existing documents, you'll probably need to 
reindex them.
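
As a rough sketch of the "new index, then reindex" route, the Python below only builds the request bodies; it does not talk to a cluster. It assumes the 2014-era API with mapping types. The index name `docs_v2`, type name `doc`, and field name `file` are made up for the example, and the existing documents are assumed to have been read back from the old index beforehand (for instance with a scan/scroll search).

```python
import json


def new_index_body(doc_type, field):
    """Body for creating a fresh index whose `field` is mapped as an attachment."""
    return {"mappings": {doc_type: {"properties": {field: {"type": "attachment"}}}}}


def bulk_body(docs, index, doc_type):
    """Newline-delimited bulk body that re-sends existing docs to the new index."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))   # action line
        lines.append(json.dumps(source))   # document source line
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


# Hypothetical documents read back from the old index as (id, source) pairs.
docs = [("1", {"title": "a"}), ("2", {"title": "b"})]

print(json.dumps(new_index_body("doc", "file")))
print(bulk_body(docs, "docs_v2", "doc"), end="")
```

The first body would go to the create-index call for the new index, the second to the `_bulk` endpoint.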

Note that although the mapper attachment is a nice way to get started with 
Elasticsearch, I prefer to do text extraction in a separate process rather than 
at index time.


On 21 February 2014 at 16:09:00, Patrick Aljord ([email protected]) wrote:

Hey all,

I'm trying to map a field with type attachment. This works on new indices 
but not on existing ones.
Is there a way to do this on existing indices? Here is the gist of it:

https://gist.github.com/patcito/281143ee4f440171c875

Thanks in advance,

Pat
--
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/52be5155-1d34-47e4-9484-284c976c49c2%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.