Hi Ali,

We have recently encountered some data sources that generate data in a nested
format. For example, AWS CloudTrail generates data in the following JSON
format:

{
  "Records": [
    {
      "eventVersion": "2.0",
      "userIdentity": {
        "type": "IAMUser",
        "principalId": "EX_PRINCIPAL_ID",
        "arn": "arn:aws:iam::123456789012:user/Alice",
        "accessKeyId": "EXAMPLE_KEY_ID",
        "accountId": "123456789012",
        "userName": "Alice"
      },
      "eventTime": "2014-03-07T21:22:54Z",
      "eventSource": "ec2.amazonaws.com",
      "eventName": "StartInstances",
      "awsRegion": "us-east-2",
      "sourceIPAddress": "205.251.233.176",
      "userAgent": "ec2-api-tools 1.6.12.2",
      "requestParameters": {
        "instancesSet": {
          "items": [
            {
              "instanceId": "i-ebeaf9e2"
            }
          ]
        }
      },
      "responseElements": {
        "instancesSet": {
          "items": [
            {
              "instanceId": "i-ebeaf9e2",
              "currentState": {
                "code": 0,
                "name": "pending"
              },
              "previousState": {
                "code": 80,
                "name": "stopped"
              }
            }
          ]
        }
      }
    }
  ]
}

We are able to flatten this into a flat JSON file. However, nested objects are
supported by the data backends in Metron (ES, ORC, etc.), so I was wondering
whether the current version of Metron can index nested documents, or whether
we have to flatten them?

We parse the same CloudTrail data. First, we have Apache NiFi running, which extracts the individual events from the Records array. Second, make sure that you set mapStrategy to UNFOLD in your JSON parser configuration.
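
To make the two steps concrete, here is a rough Python sketch of what they amount to: pulling each event out of the top-level Records array, then unfolding nested maps into dot-separated keys. This is only an illustration, not the NiFi flow or Metron's parser code. The helper names split_records and unfold are made up for this example, the sample is a trimmed copy of the record above, and the choices to join keys with a period and to leave lists untouched are assumptions that should be checked against the parser documentation for your Metron version.

```python
import json


def split_records(document):
    """Return the individual events held in the top-level "Records" array."""
    return list(document.get("Records", []))


def unfold(event, prefix=""):
    """Flatten nested dicts into dot-separated keys.

    Lists are kept as-is (an assumption made for this sketch).
    """
    flat = {}
    for key, value in event.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(unfold(value, name))
        else:
            flat[name] = value
    return flat


# Trimmed copy of the CloudTrail sample from the question.
sample = json.loads("""
{
  "Records": [
    {
      "eventVersion": "2.0",
      "userIdentity": {"type": "IAMUser", "userName": "Alice"},
      "eventName": "StartInstances",
      "requestParameters": {
        "instancesSet": {"items": [{"instanceId": "i-ebeaf9e2"}]}
      }
    }
  ]
}
""")

for event in split_records(sample):
    print(json.dumps(unfold(event), indent=2))
```

In this sketch, nested fields such as userIdentity.userName become top-level keys, while the items list stays a list of objects. How Metron itself treats lists under UNFOLD is something to verify against your own data.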
