If I configure an InvokeHttp processor to query an Elasticsearch node, I
should get one JSON object (the whole response) written to a single
flowfile.  If I use the QueryElasticsearchHttp processor and the query
returns two documents from the index, I should get two JSON objects, each
written to its own flowfile.

However, the InvokeHttp processor is writing two flowfiles.  They have
separate UUIDs, but the contents are the same.  Yes, the processor is
scheduled to run every 900 seconds.

The QueryElasticsearchHttp processor is writing 4 flowfiles.  It, too, is
scheduled to run every 900 seconds.

Elasticsearch is returning:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "etltodoc",
        "_type": "document_record",
        "_id": "2045680246129",
        "_score": 0.2876821,
        "_source": {
          "myguid": "2045680246129",
          "filename": "sample1.pdf",
          "exception": "",
          "original_filename": "\\\\f1\\DocsRepo\\CF\\sample1.pdf",
          "conceptCode": "C2159782",
          "timestamp": "2019-03-12T12:43:21.166531",
          "status": "delivered"
        }
      },
      {
        "_index": "etltodock",
        "_type": "document_record",
        "_id": "2045680246128",
        "_score": 0.2876821,
        "_source": {
          "myguid": "2045680246128",
          "filename": "sample2.pdf",
          "exception": "",
          "original_filename": "\\\\f1\\DocsRepo\\CF\\sample2.pdf",
          "conceptCode": "C2159782",
          "timestamp": "2019-03-12T12:43:21.165467",
          "status": "delivered"
        }
      }
    ]
  }
}
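
To make the expectation concrete, here is a minimal Python sketch of the
counts I would expect from that response.  It is not NiFi code: the helper
name split_hits is made up for illustration, the response is abbreviated to
the fields that matter for counting, and I am assuming one flowfile per
entry in hits.hits (whether the processor emits the full hit or only
_source does not change the count).

```python
import json

# Abbreviated copy of the response above; only the fields needed for counting.
response_body = """
{
  "hits": {
    "total": 2,
    "hits": [
      {"_id": "2045680246129", "_source": {"filename": "sample1.pdf"}},
      {"_id": "2045680246128", "_source": {"filename": "sample2.pdf"}}
    ]
  }
}
"""


def split_hits(body):
    """Return one JSON string per hit (hypothetical helper, not a NiFi API)."""
    parsed = json.loads(body)
    return [json.dumps(hit) for hit in parsed["hits"]["hits"]]


# InvokeHttp: the whole response body should land in a single flowfile.
whole_response = [response_body]

# QueryElasticsearchHttp: assumed to emit one flowfile per hit.
per_hit = split_hits(response_body)

print(len(whole_response), "flowfile expected from InvokeHttp")
print(len(per_hit), "flowfiles expected from QueryElasticsearchHttp")
```

In both cases I am seeing exactly double these counts.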

I'm hoping I just have something misconfigured, but I have tried changing
just about every setting.  On the QueryElasticsearchHttp processor, if I set
the limit to one, I get two flowfiles instead of four, which is still double
what I expect.

Any help will be much appreciated.

Martin
