Matt,

I'm trying out NiFi 1.0, and I'm trying to get documents using FetchElasticsearch(Http), so maybe that's the problem I'm having. I wasn't aware that InvokeHTTP was needed, and I didn't notice anything in the docs that mentions it. Basically, what I'm trying to do is get all the syslogs in a specific index using NiFi and then store them on HDFS.
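
For reference, the interim approach described in the quoted message (search through the Elasticsearch REST API, then pull the document identifiers out of the response) might look roughly like the sketch below. It is plain Python outside NiFi, not a processor config. The host, the index name, and the sample response are made up for illustration, and the helper names `build_search_url` and `extract_document_ids` are hypothetical.

```python
import json
from urllib.parse import quote


def build_search_url(base_url, index, size=100):
    """Build a search URL for one index; with no query given, the search matches all documents."""
    return "{}/{}/_search?size={}".format(base_url.rstrip("/"), quote(index, safe=""), size)


def extract_document_ids(response_body):
    """Return the _id of every hit in an Elasticsearch search response (JSON text)."""
    parsed = json.loads(response_body)
    return [hit["_id"] for hit in parsed.get("hits", {}).get("hits", [])]


# Hypothetical, trimmed search response; a real one carries more fields per hit.
SAMPLE_RESPONSE = json.dumps({
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "syslog", "_type": "logs", "_id": "AVf1", "_source": {"message": "a"}},
            {"_index": "syslog", "_type": "logs", "_id": "AVf2", "_source": {"message": "b"}},
        ],
    }
})

if __name__ == "__main__":
    # Assumed host and index name; substitute your own.
    print(build_search_url("http://localhost:9200", "syslog"))
    print(extract_document_ids(SAMPLE_RESPONSE))
```

Each value returned by `extract_document_ids` is a document's `_id`, which is what the quoted message says belongs in the Document Identifier property of FetchElasticsearch, rather than an "id" field inside the document itself.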

On Tue, Oct 25, 2016 at 6:34 PM, Matt Burgess <[email protected]> wrote:

> Johny,
>
> What version of NiFi are you using? Also are you trying to get
> documents from ES using FetchElasticsearch(Http) or put docs to it
> using PutElasticsearch(Http)? For Fetching, the Document Identifier
> is the _id of the document you want to retrieve. If you're looking to
> do a search on documents from a given index, type, etc. then (before
> NiFi 1.1.0 comes out) you'd have to use InvokeHTTP to interact with
> the Elasticsearch REST API, then parse the response to get the
> document identifiers for each of the results and put that into
> FetchElasticsearch. NiFi 1.1.0 will have QueryElasticsearchHttp and
> ScrollElasticsearchHttp [1], which are made for getting results from
> searches vs direct "gets" (via FetchES). Out of curiosity, what REST
> endpoint are you using with curl?
>
> If you are trying to put docs into ES, then the field is named
> Document Identifier Attribute, and that refers to the name of a
> FlowFile attribute whose value is the identifier you want to use for
> the document (whose body is the content of the FlowFile).
> PutElasticsearchHttp supports leaving that field blank when adding to
> an index (the ID will be auto-generated), but it is an open issue [2]
> to support auto-generation in PutElasticsearch.
>
> Does this answer your question? If not please let me know and I can
> provide more info.
>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-2417
> [2] https://issues.apache.org/jira/browse/NIFI-1576
>
> On Tue, Oct 25, 2016 at 2:36 PM, johny casanova
> <[email protected]> wrote:
> >
> > Hello,
> >
> > Do you guys have an example config of how this processor should look?
> > I have a regular Elasticsearch install that is only receiving syslogs.
> > I'm trying to figure out how to find or what to put for document
> > identifier. I did a curl in Elasticsearch and saw a field "id", but it
> > does not look like that works.
