I'm using Go, so I can index the attachment in a separate goroutine.
FSRiver looks cool: with it I just need to upload files to /some/path and
FSRiver will index them automatically every 15 minutes or so, doing the
hard work of separating text from binary using Tika. Is that correct? I
think I may just use that.
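
For reference, here is a minimal Go sketch of the approach David describes
below, assuming the text has already been extracted elsewhere (for example
by Tika). The JSON document carries only the extracted text plus a little
metadata, and it is built and sent from a goroutine. FileDoc, buildDoc,
indexAsync and the send callback are hypothetical names for illustration;
the actual HTTP call to Elasticsearch is left as a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// FileDoc is a hypothetical document shape: extracted text plus some
// metadata, instead of the Base64-encoded original file.
type FileDoc struct {
	Filename string `json:"filename"`
	Size     int64  `json:"size"`
	Content  string `json:"content"`
}

// buildDoc marshals already-extracted text into the JSON body that would be
// sent to Elasticsearch. Text extraction itself is assumed to happen elsewhere.
func buildDoc(filename string, size int64, text string) ([]byte, error) {
	return json.Marshal(FileDoc{Filename: filename, Size: size, Content: text})
}

// indexAsync builds the document in a separate goroutine and passes the body
// to send, a stand-in for the real request to Elasticsearch.
func indexAsync(wg *sync.WaitGroup, filename string, size int64, text string, send func([]byte) error) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		body, err := buildDoc(filename, size, text)
		if err != nil {
			fmt.Println("build failed:", err)
			return
		}
		if err := send(body); err != nil {
			fmt.Println("index failed:", err)
		}
	}()
}

func main() {
	var wg sync.WaitGroup
	// Placeholder sender: prints the body instead of POSTing it.
	send := func(body []byte) error {
		fmt.Println(string(body))
		return nil
	}
	// Size is the original file size; only the short text goes over the wire.
	indexAsync(&wg, "report.pdf", 100<<20, "text extracted from the PDF", send)
	wg.Wait()
}
```

Only the small JSON body is sent, however large the original file was,
which is the bandwidth saving David mentions.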


On Fri, Feb 21, 2014 at 4:39 PM, David Pilato <[email protected]> wrote:

> You can index attachment content in the same document. That's really fine.
> I would only recommend to extract document content and metadata in another
> process.
> Then, in that process, generate the JSON document with all needed fields.
>
> This is basically what I did in FSRiver. I removed the mapper attachment
> as it was not flexible enough for my use case.
> Imagine that you send a big PDF file (100 MB) which contains mostly
> pictures and 10 KB of text. Instead of sending the full 100 MB document
> encoded in Base64, you can extract the text and send only that over the wire.
> (your network bandwidth will say thank you :-) )
>
>
> https://github.com/dadoonet/fsriver/blob/master/src/main/java/fr/pilato/elasticsearch/river/fs/river/FsRiver.java#L687
>
> I don't remember which programming language you are using. Can you use
> Tika from it?
>
>
> --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> On 21 February 2014 at 16:31:51, Patrick Aljord ([email protected]) wrote:
>
> Salut David :) Thanks for the quick reply.
>
> So, the best way to do this would be to index the attachment in another
> document, in another process? Or could it be in the same document as an
> attachment, as long as it is done in a different process? Also, is there
> another way to index files besides the mapper attachment?
>
> On Friday, February 21, 2014 4:12:09 PM UTC+1, David Pilato wrote:
>>
>>  Salut Patrick! :-)
>>
>>
>>  You cannot update an existing field with a new specification for this
>> field.
>>  You need to either add a new field, create a new type (with the new
>> mapping) or create a new index.
>>
>>  In addition to this, if you have existing documents, you'll probably
>> need to reindex them.
>>
>>  Note that although the mapper attachment is a nice way to get started
>> with elasticsearch, I'd prefer to do text extraction in another process
>> rather than at index time.
>>
>>      --
>> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
>>  @dadoonet <https://twitter.com/dadoonet> | 
>> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>>
>>
>> On 21 February 2014 at 16:09:00, Patrick Aljord ([email protected]) wrote:
>>
>>  Hey all,
>>
>> I'm trying to map a field to type attachment. This works on new
>> indices but not on existing ones.
>> Is there a way to do this on existing indices? Here is the gist of it:
>>
>>  https://gist.github.com/patcito/281143ee4f440171c875
>>
>> Thanks in advance,
>>
>> Pat
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/52be5155-1d34-47e4-9484-284c976c49c2%40googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
