Hm, do you know how I would go about attaching FSRiver docs to my regular Elasticsearch index? From my app's point of view, some docs have attachments. Is there a trick to link FSRiver-indexed files to Elasticsearch docs in a different index, so that I can query docs that have FSRiver-indexed attachments? (I hope this makes sense.)
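
To make the question concrete, here is a minimal Go sketch of the kind of linking I have in mind, done as a two-step lookup on the application side: search the FSRiver index for matching text, then look up my own docs by the filenames that came back. It only builds the two query bodies and does not talk to a cluster. The field names `content`, `filename`, and `attachment_filenames` are assumptions for illustration, not something I have checked against FSRiver's actual mapping.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fileHit holds the one field this sketch needs from an FSRiver search hit.
// The JSON key "filename" is an assumption, not a confirmed FSRiver field.
type fileHit struct {
	Filename string `json:"filename"`
}

// contentQuery builds a search body for the FSRiver index: a full-text
// match on the extracted text (assumed to live in a "content" field).
func contentQuery(text string) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"query": map[string]interface{}{
			"match": map[string]interface{}{"content": text},
		},
	})
}

// uniqueFilenames collects the distinct, non-empty filenames from the hits,
// keeping the order in which they first appear.
func uniqueFilenames(hits []fileHit) []string {
	seen := make(map[string]bool)
	names := make([]string, 0, len(hits))
	for _, h := range hits {
		if h.Filename != "" && !seen[h.Filename] {
			seen[h.Filename] = true
			names = append(names, h.Filename)
		}
	}
	return names
}

// attachmentQuery builds a search body for the application index: docs whose
// hypothetical "attachment_filenames" field contains any of the given names.
// This assumes that field is stored as exact, unanalyzed values.
func attachmentQuery(names []string) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"query": map[string]interface{}{
			"terms": map[string]interface{}{"attachment_filenames": names},
		},
	})
}

func main() {
	// Step 1: body to send to the FSRiver index.
	q1, err := contentQuery("invoice")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(q1))

	// Pretend these hits came back from step 1.
	hits := []fileHit{
		{Filename: "report.pdf"},
		{Filename: "notes.txt"},
		{Filename: "report.pdf"},
	}

	// Step 2: body to send to the application index.
	q2, err := attachmentQuery(uniqueFilenames(hits))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(q2))
}
```

The cost of doing it this way is two round trips and a join in the app, so I would be glad to hear if there is something built in that does the same thing.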

On Fri, Feb 21, 2014 at 4:58 PM, David Pilato <[email protected]> wrote:

> Yes. That's it.
>
> --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr>
>
> On 21 February 2014 at 16:57:29, Patrick Aljord ([email protected]) wrote:
>
> I'm using golang, so I can index the attachment in a different goroutine.
> FSRiver looks cool: with it I just need to upload files to /some/path, and
> FSRiver will index them automatically every 15 minutes or so and do the
> hard work of separating text from binary using Tika. Is that correct? I
> think I may just use that.
>
> On Fri, Feb 21, 2014 at 4:39 PM, David Pilato <[email protected]> wrote:
>
>> You can index attachment content in the same document. That's really fine.
>> I would only recommend extracting document content and metadata in
>> another process. Then, in that process, generate the JSON document with
>> all needed fields.
>>
>> This is basically what I did in FSRiver. I removed the mapper attachment
>> as it was not flexible enough for my use case.
>> Imagine that you send a big PDF file (100 MB) which contains mostly
>> pictures and 10 KB of text. Instead of sending the full 100 MB document
>> encoded in Base64, you can extract the text and only send the text over
>> the wire. (Your network bandwidth will say thank you :-) )
>>
>> https://github.com/dadoonet/fsriver/blob/master/src/main/java/fr/pilato/elasticsearch/river/fs/river/FsRiver.java#L687
>>
>> I don't remember which programming language you are using. Can you use
>> Tika from it?
>>
>> On 21 February 2014 at 16:31:51, Patrick Aljord ([email protected]) wrote:
>>
>> Salut David :) Thanks for the quick reply.
>>
>> So, the best way to do this would be to index the attachment in another
>> document, in another process? Or could it be in the same document as an
>> attachment, but always in a different process? Also, is there another way
>> to index files than the mapper attachment?
>>
>> On Friday, February 21, 2014 4:12:09 PM UTC+1, David Pilato wrote:
>>>
>>> Salut Patrick! :-)
>>>
>>> You cannot update an existing field with a new specification for this
>>> field. You need to either add a new field, create a new type (with the
>>> new mapping), or create a new index.
>>>
>>> In addition to this, if you have existing documents, you'll probably
>>> need to reindex them.
>>>
>>> Note that although the mapper attachment is cool to start with
>>> Elasticsearch, I'd prefer to do text extraction in another process
>>> rather than at index time.
>>>
>>> On 21 February 2014 at 16:09:00, Patrick Aljord ([email protected]) wrote:
>>>
>>> Hey all,
>>>
>>> I'm trying to map a field to have type attachment. This works on new
>>> indices but not on existing ones.
>>> Is there a way to do this on existing indices? Here is the gist of it:
>>>
>>> https://gist.github.com/patcito/281143ee4f440171c875
>>>
>>> Thanks in advance,
>>>
>>> Pat

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAK9TW4pffMyF6WU1TZU2p%2BQMX-X1JVsPba3iWa_rVfXR7_xQTQ%40mail.gmail.com.
For more options, visit https://groups.google.com/groups/opt_out.
