"[email protected]" <[email protected]> writes: > 4. You have to write a program that traverses your folders, picks up each > document, and extracts fields from the document to get them indexed.
Or you might use es-nozzle [1], which traverses your folders and indexes
documents into Elasticsearch. It uses Tika to extract content from various
file formats and incrementally synchronizes the folder's content with the
Elasticsearch index, i.e. it updates your index with new documents and
deletes documents from Elasticsearch once they have been removed from the
folder.

Please visit http://brainbot.com/es-nozzle/doc/ for detailed documentation.
The code lives on GitHub: https://github.com/brainbot-com/es-nozzle

Please let me know about any problems you run into if you give it a try.
I'm the author of es-nozzle.

Another option might be fsriver: https://github.com/dadoonet/fsriver

-- 
Cheers
Ralf
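
To illustrate the incremental synchronization idea mentioned above (index what is new, delete what has disappeared from the folder), here is a minimal sketch in Python. It is not es-nozzle's actual implementation and makes no Tika or Elasticsearch calls; the names `scan_folder`, `plan_sync`, and `demo` are hypothetical, and comparing modification times is only one assumed way of detecting changes.

```python
import os
import tempfile


def scan_folder(root):
    """Map each file's path (relative to root) to its modification time."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found[os.path.relpath(full, root)] = os.stat(full).st_mtime
    return found


def plan_sync(on_disk, indexed):
    """Compare the folder state with what the index is assumed to hold.

    Both arguments map relative path -> modification time.
    Returns (to_index, to_delete) as sorted lists of paths.
    """
    # New files, or files whose modification time differs, need (re)indexing.
    to_index = sorted(
        path for path, mtime in on_disk.items()
        if path not in indexed or indexed[path] != mtime
    )
    # Documents known to the index but missing on disk should be deleted.
    to_delete = sorted(path for path in indexed if path not in on_disk)
    return to_index, to_delete


def demo():
    """Build a throwaway folder and plan a sync against a made-up index state."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(root, name), "w") as handle:
                handle.write("hello")
        on_disk = scan_folder(root)
        # Hypothetical index state: a.txt is up to date, gone.txt no longer exists.
        indexed = {"a.txt": on_disk["a.txt"], "gone.txt": 0.0}
        return plan_sync(on_disk, indexed)


if __name__ == "__main__":
    print(demo())
```

A real tool would then extract content for each path in `to_index` and send the corresponding index and delete requests to the search server; those steps are left out here because they depend on the extraction library and client in use.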
