I didn't know that command ... I've edited some confidential values in
the result, but here it is:
$ bq --project_id={{PROJECT_ID}} --format=prettyjson show -j 9e790299-dc77-46f4-8978-476f284fe5b5
{
  "configuration": {
    "jobType": "LOAD",
    "load": {
      "createDisposition": "CREATE_IF_NEEDED",
      "destinationTable": {
        "datasetId": "Consents",
        "projectId": "{{PROJECT_ID}}",
        "tableId": "{{TABLE_ID}}"
      },
      "ignoreUnknownValues": false,
      "maxBadRecords": 0,
      "schema": {
        "fields": [
          {
            "fields": [
              {
                "mode": "NULLABLE",
                "name": "id",
                "type": "STRING"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "businessUnit",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "identity",
                "type": "RECORD"
              },
              {
                "mode": "NULLABLE",
                "name": "finality",
                "type": "STRING"
              },
              {
                "mode": "NULLABLE",
                "name": "consentDate",
                "type": "TIMESTAMP"
              },
              {
                "mode": "NULLABLE",
                "name": "expiryDate",
                "type": "TIMESTAMP"
              },
              {
                "mode": "NULLABLE",
                "name": "expired",
                "type": "BOOLEAN"
              },
              {
                "mode": "NULLABLE",
                "name": "createdBy",
                "type": "STRING"
              },
              {
                "mode": "NULLABLE",
                "name": "createdDate",
                "type": "TIMESTAMP"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "application",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "sender",
                "type": "RECORD"
              },
              {
                "fields": [
                  {
                    "mode": "NULLABLE",
                    "name": "id",
                    "type": "STRING"
                  },
                  {
                    "mode": "NULLABLE",
                    "name": "type",
                    "type": "STRING"
                  }
                ],
                "mode": "NULLABLE",
                "name": "relatedEvent",
                "type": "RECORD"
              }
            ],
            "mode": "NULLABLE",
            "name": "ContractualConsent",
            "type": "RECORD"
          }
        ]
      },
      "sourceFormat": "NEWLINE_DELIMITED_JSON",
      "writeDisposition": "WRITE_APPEND"
    }
  },
  "etag": "RqYxd6o2jzl6YiTARI5nxg==",
  "id": "{{PROJECT_ID}}:EU.9e790299-dc77-46f4-8978-476f284fe5b5",
  "jobReference": {
    "jobId": "9e790299-dc77-46f4-8978-476f284fe5b5",
    "location": "EU",
    "projectId": "{{PROJECT_ID}}"
  },
  "kind": "bigquery#job",
  "selfLink": "https://bigquery.googleapis.com/bigquery/v2/projects/{{PROJECT_ID}}/jobs/9e790299-dc77-46f4-8978-476f284fe5b5?location=EU",
  "statistics": {
    "creationTime": "1569491661818",
    "endTime": "1569491662935",
    "startTime": "1569491662366"
  },
  "status": {
    "errorResult": {
      "message": "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.",
      "reason": "invalid"
    },
    "errors": [
      {
        "message": "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.",
        "reason": "invalid"
      },
      {
        "message": "Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0",
        "reason": "invalid"
      },
      {
        "message": "Error while reading data, error message: JSON parsing error in row starting at position 0: Start of array encountered without start of object.",
        "reason": "invalid"
      }
    ],
    "state": "DONE"
  },
  "user_email": "rabbitmq-inges...@psh-analytics-automation.iam.gserviceaccount.com"
}
The error message is interesting.
If I look in data provenance at the data I'm about to send to BigQuery, I get:
[{"ContractualConsent":{"id":"5d847c5c92913700017692fc","identity":{"id":"511096128","type":"customer","businessUnit":"lmit"},"finality":"commercial_relationship","consentDate":"2019-06-04T15:39:32Z","expiryDate":"2024-06-04T15:39:32Z","expired":false,"createdBy":"DynamoCRM_DC","createdDate":"2019-09-20T07:14:36.576Z","sender":{"id":"511096128","application":"DYNAMO-CRM","type":"CUSTOMER"},"relatedEvent":{"id":"a72c44f1-de86-e911-a827-000d3a2aa91d","type":"customer_request"}}},{"ContractualConsent":{"id":"5d847c5c5fa9420001ebf04e","identity":{"id":"509582521","type":"customer","businessUnit":"lmit"},"finality":"commercial_relationship","consentDate":"2019-06-07T08:09:32Z","expiryDate":"2024-06-07T08:09:32Z","expired":false,"createdBy":"DynamoCRM_DC","createdDate":"2019-09-20T07:14:36.708Z","sender":{"id":"509582521","application":"DYNAMO-CRM","type":"CUSTOMER"},"relatedEvent":{"id":"6c335392-fb88-e911-a827-000d3a2aa91d","type":"customer_request"}}}]
Which is indeed an array, instead of an object.
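That matches the last error in the job output: the load job uses "sourceFormat": "NEWLINE_DELIMITED_JSON", and as far as I understand, that format expects one JSON object per line with no enclosing array, i.e. something like (abbreviated):
{"ContractualConsent":{"id":"5d847c5c92913700017692fc", ...}}
{"ContractualConsent":{"id":"5d847c5c5fa9420001ebf04e", ...}}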
And maybe that is because my JsonRecordSetWriter has "Array" selected as its "Output grouping" value ...
Well, strangely, even after changing the configuration of my JsonRecordSetWriter, the values are still JSON arrays ...
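In the meantime, as a sanity check outside NiFi (an untested sketch; it assumes jq is available in Cloud Shell, and consents.json is a hypothetical file containing the array above), something like this should rewrite the array as one object per line and load it:
$ jq -c '.[]' consents.json > consents.ndjson
$ bq --project_id={{PROJECT_ID}} load --source_format=NEWLINE_DELIMITED_JSON Consents.{{TABLE_ID}} consents.ndjson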
Anyway, I guess I'm on the right path ... (thanks a lot Pierre)
On 26/09/2019 at 13:18, Pierre Villard wrote:
What if you run the below command in Cloud Shell:
bq --format=prettyjson show -j <job id>
In your case (with your last email):
bq --format=prettyjson show -j 9e790299-dc77-46f4-8978-476f284fe5b5
Does it give you more details?
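(If the job ran outside the US, you might also need to pass the location explicitly; I believe the flag is --location, e.g.:
bq --location=EU --format=prettyjson show -j <job id>
but I'm not sure it's required in your setup.)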
On Thu, Sep 26, 2019 at 12:13, Nicolas Delsaux <nicolas.dels...@gmx.fr> wrote:
Sorry for the late reply.
As of today, the issue is still present.
NiFi Web UI just shows the message "Error while reading data,
error message: JSON table encountered too many errors, giving up.
Rows: 1; errors: 1. Please look into the errors[] collection for
more details."
But the log is clearer:
--------------------------------------------------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Thu Sep 26 09:53:49 UTC 2019'
Key: 'lineageStartDate'
Value: 'Thu Sep 26 09:53:49 UTC 2019'
Key: 'fileSize'
Value: '999'
FlowFile Attribute Map Content
Key: 'avro.schema'
Value: '{"type":"record","name":"nifiRecord","namespace":"org.apache.nifi","fields":[{"name":"ExplicitConsent","type":["null",{"type":"record","name":"ExplicitConsentType","fields":[{"name":"id","type":["null","string"]},{"name":"identity","type":["null",{"type":"record","name":"identityType","fields":[{"name":"id","type":["null","string"]},{"name":"type","type":["null","string"]},{"name":"businessUnit","type":["null","string"]}]}]},{"name":"finality","type":["null","string"]},{"name":"expired","type":["null","boolean"]},{"name":"createdBy","type":["null","string"]},{"name":"createdDate","type":["null","string"]},{"name":"sender","type":["null",{"type":"record","name":"senderType","fields":[{"name":"id","type":["null","string"]},{"name":"application","type":["null","string"]},{"name":"type","type":["null","string"]}]}]},{"name":"state","type":["null","string"]}]}]}]}'
Key: 'bq.error.message'
Value: 'Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.'
Key: 'bq.error.reason'
Value: 'invalid'
Key: 'bq.job.link'
Value: 'https://www.googleapis.com/bigquery/v2/projects/{{PROJECT_ID}}/jobs/9e790299-dc77-46f4-8978-476f284fe5b5?location=EU'
Key: 'bq.job.stat.creation_time'
Value: '1569491661818'
Key: 'bq.job.stat.end_time'
Value: '1569491662935'
Key: 'bq.job.stat.start_time'
Value: '1569491662366'
Key: 'filename'
Value: 'e6d604d7-b517-4a87-a398-e4a5df342ce6'
Key: 'kafka.key'
Value: '--'
Key: 'kafka.partition'
Value: '0'
Key: 'kafka.topic'
Value: 'dc.consent-life-cycle.kpi-from-dev-nifi-json'
Key: 'merge.bin.age'
Value: '1'
Key: 'merge.count'
Value: '3'
Key: 'mime.type'
Value: 'application/json'
Key: 'path'
Value: './'
Key: 'record.count'
Value: '3'
Key: 'uuid'
Value: 'e6d604d7-b517-4a87-a398-e4a5df342ce6'
2019-09-26 10:09:39,633 INFO [Timer-Driven Process Thread-4] o.a.n.processors.standard.LogAttribute LogAttribute[id=ce9c171f-0c8f-3cab-e0f2-16156faf15b8] logging for flow file StandardFlowFileRecord[uuid=e6d604d7-b517-4a87-a398-e4a5df342ce6,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1569490848560-6, container=default, section=6], offset=569098, length=999],offset=0,name=e6d604d7-b517-4a87-a398-e4a5df342ce6,size=999]
I don't exactly understand why I would have to set up authentication, because I've set the service.json content into the GCP Credentials Provider I use for my PutBigQueryBatch processor ...
Is there anything I'm missing? Or a simple way to make sure everything works as expected?
Thanks
On 24/09/2019 at 16:12, Pierre Villard wrote:
Hey Nicolas,
Did you manage to solve your issue? Happy to help on this one.
Thanks,
Pierre
On Fri, Sep 20, 2019 at 16:42, Nicolas Delsaux <nicolas.dels...@gmx.fr> wrote:
Hello
I'm using PutBigQueryBatch and having weird auth issues.
I have set the GCP Credentials Controller Service to use Service Account JSON, which I copied from the value given in the Google Cloud Console.
But when I run my flow, I get the error message "Error while
reading data, error message: JSON table encountered too many
errors, giving up. Rows: 1; errors: 1. Please look into the
errors[] collection for more details."
What is stranger is that when I log all attributes, there is a bq.job.link whose page indicates "Request is missing required authentication credential. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project."
...
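(Thinking about it, that bq.job.link seems to be the raw BigQuery REST endpoint, so maybe the auth error just comes from opening it in a browser without an OAuth token? An untested guess, but something like this in Cloud Shell might fetch the real job status; "<bq.job.link URL>" stands for the actual link from the attributes:
$ curl -H "Authorization: Bearer $(gcloud auth print-access-token)" "<bq.job.link URL>"
)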
But NiFi can access the BigQuery workspace and dataset (I've checked that by deleting the table schema that I had already written).
So, is there something I'm doing wrong?
Thanks!