I didn't know that command ... I've edited some confidential values out of the result, but here it is

$ bq --project_id={{PROJECT_ID}} --format=prettyjson show -j 9e790299-dc77-46f4-8978-476f284fe5b5
{
  "configuration": {
    "jobType": "LOAD",
    "load": {
      "createDisposition": "CREATE_IF_NEEDED",
      "destinationTable": {
        "datasetId": "Consents",
        "projectId": "{{PROJECT_ID}}",
        "tableId": "{{TABLE_ID}}"
      },
      "ignoreUnknownValues": false,
      "maxBadRecords": 0,
      "schema": {
        "fields": [
          {
            "fields": [
              {
                "mode": "NULLABLE",
                "name": "id",
                "type": "STRING"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "businessUnit",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "identity",
                "type": "RECORD"
              },
              {
                "mode": "NULLABLE",
                "name": "finality",
                "type": "STRING"
              },
              {
                "mode": "NULLABLE",
                "name": "consentDate",
                "type": "TIMESTAMP"
              },
              {
                "mode": "NULLABLE",
                "name": "expiryDate",
                "type": "TIMESTAMP"
              },
              {
                "mode": "NULLABLE",
                "name": "expired",
                "type": "BOOLEAN"
              },
              {
                "mode": "NULLABLE",
                "name": "createdBy",
                "type": "STRING"
              },
              {
                "mode": "NULLABLE",
                "name": "createdDate",
                "type": "TIMESTAMP"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "application",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "sender",
                "type": "RECORD"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "relatedEvent",
                "type": "RECORD"
              }
            ],
            "mode": "NULLABLE",
            "name": "ContractualConsent",
            "type": "RECORD"
          }
        ]
      },
      "sourceFormat": "NEWLINE_DELIMITED_JSON",
      "writeDisposition": "WRITE_APPEND"
    }
  },
  "etag": "RqYxd6o2jzl6YiTARI5nxg==",
  "id": "{{PROJECT_ID}}:EU.9e790299-dc77-46f4-8978-476f284fe5b5",
  "jobReference": {
    "jobId": "9e790299-dc77-46f4-8978-476f284fe5b5",
    "location": "EU",
    "projectId": "{{PROJECT_ID}}"
  },
  "kind": "bigquery#job",
  "selfLink": "https://bigquery.googleapis.com/bigquery/v2/projects/{{PROJECT_ID}}/jobs/9e790299-dc77-46f4-8978-476f284fe5b5?location=EU";,
  "statistics": {
    "creationTime": "1569491661818",
    "endTime": "1569491662935",
    "startTime": "1569491662366"
  },
  "status": {
    "errorResult": {
      "message": "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.",
      "reason": "invalid"
    },
    "errors": [
      {
        "message": "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.",
        "reason": "invalid"
      },
      {
        "message": "Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0",
        "reason": "invalid"
      },
      {
        "message": "Error while reading data, error message: JSON parsing error in row starting at position 0: Start of array encountered without start of object.",
        "reason": "invalid"
      }
    ],
    "state": "DONE"
  },
  "user_email": "rabbitmq-inges...@psh-analytics-automation.iam.gserviceaccount.com"
}

The error message is interesting, especially the last one in the errors[] collection: "Start of array encountered without start of object."

If I look in data provenance at the data I'm about to send to BigQuery, I get


[{"ContractualConsent":{"id":"5d847c5c92913700017692fc","identity":{"id":"511096128","type":"customer","businessUnit":"lmit"},"finality":"commercial_relationship","consentDate":"2019-06-04T15:39:32Z","expiryDate":"2024-06-04T15:39:32Z","expired":false,"createdBy":"DynamoCRM_DC","createdDate":"2019-09-20T07:14:36.576Z","sender":{"id":"511096128","application":"DYNAMO-CRM","type":"CUSTOMER"},"relatedEvent":{"id":"a72c44f1-de86-e911-a827-000d3a2aa91d","type":"customer_request"}}},{"ContractualConsent":{"id":"5d847c5c5fa9420001ebf04e","identity":{"id":"509582521","type":"customer","businessUnit":"lmit"},"finality":"commercial_relationship","consentDate":"2019-06-07T08:09:32Z","expiryDate":"2024-06-07T08:09:32Z","expired":false,"createdBy":"DynamoCRM_DC","createdDate":"2019-09-20T07:14:36.708Z","sender":{"id":"509582521","application":"DYNAMO-CRM","type":"CUSTOMER"},"relatedEvent":{"id":"6c335392-fb88-e911-a827-000d3a2aa91d","type":"customer_request"}}}]


Which is indeed an array instead of an object.
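
For what it's worth, here is a minimal sketch (file names are made up) of turning that array into what the NEWLINE_DELIMITED_JSON source format actually expects, i.e. one standalone object per line:

import json

# Re-emit the array as newline-delimited JSON: one object per line,
# no enclosing brackets and no commas between rows.
with open("consents.json") as src, open("consents.ndjson", "w") as dst:
    for record in json.load(src):
        dst.write(json.dumps(record) + "\n")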

And maybe that's because my JsonRecordSetWriter has "Array" selected for its "Output Grouping" property ...


Well, strangely, even after having changed the configuration of my JsonRecordSetWriter, the values are still JSON arrays ... (maybe flow files that were already queued were serialized with the old writer settings; see the quick check below)
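
A quick way to verify what a flow file actually contains (a standalone script, nothing NiFi-specific, the exported content path is passed as an argument): every line must parse as its own JSON object for the load job to accept it.

import json
import sys

ok = True
with open(sys.argv[1]) as f:
    for n, line in enumerate(f, 1):
        try:
            # Each line must be a complete JSON object on its own.
            if not isinstance(json.loads(line), dict):
                print(f"line {n}: parses, but is not an object")
                ok = False
        except json.JSONDecodeError as err:
            print(f"line {n}: {err}")
            ok = False
print("valid NDJSON" if ok else "not valid NDJSON")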

Anyway, I guess I'm on the right path ... (thanks a lot Pierre)


On 26/09/2019 at 13:18, Pierre Villard wrote:
What if you run the below command in Cloud Shell:
bq --format=prettyjson show -j <job id>

In your case (with your last email):
bq --format=prettyjson show -j 9e790299-dc77-46f4-8978-476f284fe5b5

Does it give you more details?

On Thu, Sep 26, 2019 at 12:13, Nicolas Delsaux <nicolas.dels...@gmx.fr> wrote:

    Sorry for the late reply.

    As of today, the issue is still present.

    NiFi Web UI just shows the message "Error while reading data,
    error message: JSON table encountered too many errors, giving up.
    Rows: 1; errors: 1. Please look into the errors[] collection for
    more details."

    But the log is clearer:


    --------------------------------------------------

    Standard FlowFile Attributes

    Key: 'entryDate'

    Value: 'Thu Sep 26 09:53:49 UTC 2019'

    Key: 'lineageStartDate'

    Value: 'Thu Sep 26 09:53:49 UTC 2019'

    Key: 'fileSize'

    Value: '999'

    FlowFile Attribute Map Content

    Key: 'avro.schema'

    Value: '{"type":"record","name":"nifiRecord","namespace":"org.apache.nifi","fields":[{"name":"ExplicitConsent","type":["null",{"type":"record","name":"ExplicitConsentType","fields":[{"name":"id","type":["null","string"]},{"name":"identity","type":["null",{"type":"record","name":"identityType","fields":[{"name":"id","type":["null","string"]},{"name":"type","type":["null","string"]},{"name":"businessUnit","type":["null","string"]}]}]},{"name":"finality","type":["null","string"]},{"name":"expired","type":["null","boolean"]},{"name":"createdBy","type":["null","string"]},{"name":"createdDate","type":["null","string"]},{"name":"sender","type":["null",{"type":"record","name":"senderType","fields":[{"name":"id","type":["null","string"]},{"name":"application","type":["null","string"]},{"name":"type","type":["null","string"]}]}]},{"name":"state","type":["null","string"]}]}]}]}'


    Key: 'bq.error.message'

    Value: 'Error while reading data, error message: JSON table
    encountered too many errors, giving up. Rows: 1; errors: 1. Please
    look into the errors[] collection for more details.'

    Key: 'bq.error.reason'

    Value: 'invalid'

    Key: 'bq.job.link'

    Value: 'https://www.googleapis.com/bigquery/v2/projects/{{PROJECT_ID}}/jobs/9e790299-dc77-46f4-8978-476f284fe5b5?location=EU'


    Key: 'bq.job.stat.creation_time'

    Value: '1569491661818'

    Key: 'bq.job.stat.end_time'

    Value: '1569491662935'

    Key: 'bq.job.stat.start_time'

    Value: '1569491662366'

    Key: 'filename'

    Value: 'e6d604d7-b517-4a87-a398-e4a5df342ce6'

    Key: 'kafka.key'

    Value: '--'

    Key: 'kafka.partition'

    Value: '0'

    Key: 'kafka.topic'

    Value: 'dc.consent-life-cycle.kpi-from-dev-nifi-json'

    Key: 'merge.bin.age'

    Value: '1'

    Key: 'merge.count'

    Value: '3'

    Key: 'mime.type'

    Value: 'application/json'

    Key: 'path'

    Value: './'

    Key: 'record.count'

    Value: '3'

    Key: 'uuid'

    Value: 'e6d604d7-b517-4a87-a398-e4a5df342ce6'
    2019-09-26 10:09:39,633 INFO [Timer-Driven Process Thread-4]
    o.a.n.processors.standard.LogAttribute
    LogAttribute[id=ce9c171f-0c8f-3cab-e0f2-16156faf15b8] logging for
    flow file
    
StandardFlowFileRecord[uuid=e6d604d7-b517-4a87-a398-e4a5df342ce6,claim=StandardContentClaim
    [resourceClaim=StandardResourceClaim[id=1569490848560-6,
    container=default, section=6], offset=569098,
    length=999],offset=0,name=e6d604d7-b517-4a87-a398-e4a5df342ce6,size=999]



    I don't exactly understand why I would have to set up
    authentication, because I've set the service.json content in the
    GCP Credentials Provider I use for my PutBigQueryBatch processor ...
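
    One guess, sketched below (the key file path is an assumption): the
    "missing authentication credential" page may simply be what the raw
    bq.job.link REST URL returns when opened in a browser without an
    OAuth token, while the processor itself authenticates fine. Fetching
    the job with the same service-account key should show the real job
    status:

    import google.auth.transport.requests
    from google.oauth2 import service_account

    # "service.json" is assumed to be the same key file whose content
    # was pasted into the GCP Credentials Provider.
    creds = service_account.Credentials.from_service_account_file(
        "service.json",
        scopes=["https://www.googleapis.com/auth/bigquery"],
    )
    session = google.auth.transport.requests.AuthorizedSession(creds)
    resp = session.get(
        "https://www.googleapis.com/bigquery/v2/projects/{{PROJECT_ID}}"
        "/jobs/9e790299-dc77-46f4-8978-476f284fe5b5?location=EU"
    )
    print(resp.json().get("status"))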

    Is there anything I'm missing? Or a simple way to make sure
    everything works as expected?

    Thanks


    On 24/09/2019 at 16:12, Pierre Villard wrote:
    Hey Nicolas,

    Did you manage to solve your issue? Happy to help on this one.

    Thanks,
    Pierre

    On Fri, Sep 20, 2019 at 16:42, Nicolas Delsaux
    <nicolas.dels...@gmx.fr> wrote:

        Hello

        I'm using PutBigQueryBatch and having weird auth issues.

        I have set the GCP Credentials Controller Service to use
        Service Account JSON which I have copied from the value given
        in Google Cloud Console.

        But when I run my flow, I get the error message "Error while
        reading data, error message: JSON table encountered too many
        errors, giving up. Rows: 1; errors: 1. Please look into the
        errors[] collection for more details."


        What is stranger is that when I log all properties, there is
        a bq.job.link whose page shows "Request is missing
        required authentication credential. Expected OAuth 2 access
        token, login cookie or other valid authentication credential.
        See
        https://developers.google.com/identity/sign-in/web/devconsole-project."
        ...

        But NiFi can access the BigQuery workspace and dataset (I've
        checked that by deleting the table schema I had already
        written).

        So, is there something I'm doing wrong?

        Thanks!
