Peter Turcsanyi created NIFI-14291:
--------------------------------------
Summary: Refactor recursive folder listing in ListGoogleDrive
Key: NIFI-14291
URL: https://issues.apache.org/jira/browse/NIFI-14291
Project: Apache NiFi
Issue Type: Bug
Reporter: Peter Turcsanyi
Assignee: Peter Turcsanyi
Recursive folder listing in ListGoogleDrive is currently implemented in two
phases:
# traverse the folder structure from the base folder and collect all the
subfolder ids recursively (files are skipped in this step)
# execute a single composite query covering all the collected folder ids
(like "folder_id_1 in parents or folder_id_2 in parents or ...")
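To illustrate why the second phase scales poorly, here is a minimal sketch of how such a composite query string could be assembled. The class and method names are hypothetical and are not taken from the ListGoogleDrive source; only the "'<id>' in parents" clause follows the Drive search syntax. The query gains one clause per subfolder, so its size grows linearly with the number of folders under the base folder.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CompositeQuerySketch {

    // Hypothetical helper: builds one "'<id>' in parents" clause per folder id
    // and joins the clauses with "or", as in the two-phase approach.
    public static String buildCompositeQuery(List<String> folderIds) {
        return folderIds.stream()
                .map(id -> "'" + id + "' in parents")
                .collect(Collectors.joining(" or "));
    }

    public static void main(String[] args) {
        // Placeholder ids; real Drive folder ids are much longer.
        List<String> folderIds = List.of("folder_id_1", "folder_id_2", "folder_id_3");
        System.out.println(buildCompositeQuery(folderIds));
    }
}
```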
The composite query may lead to the following errors:
* the query becomes too large or complex and the request fails with
"400 Bad Request" ("The query is too complex.", see the log below)
[https://stackoverflow.com/questions/29738020/google-drive-api-limit-on-search-query-parameter]
* in rare cases, the Drive service silently omits some files from the result,
for reasons that are unclear
[https://stackoverflow.com/questions/60131503/is-it-possible-to-query-for-multiple-folders-parents-using-googles-drive-api]
Refactor the recursive listing to traverse and query the folders one by one
(like other List processors do).
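A minimal sketch of the proposed per-folder traversal follows, under stated assumptions: Entry and FolderClient are hypothetical stand-ins for the Drive file model and for one "'<id>' in parents" query per folder, and none of these names come from the NiFi code base. Each folder is queried on its own, so no single query grows with the size of the folder tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PerFolderListingSketch {

    // Hypothetical minimal view of a Drive entry (file or folder).
    public record Entry(String id, String name, boolean folder) {}

    // Hypothetical abstraction: one query per folder, returning its direct children.
    public interface FolderClient {
        List<Entry> listChildren(String folderId);
    }

    // Traverses the tree folder by folder and collects the files found on the way.
    public static List<Entry> listFiles(FolderClient client, String baseFolderId, boolean recursive) {
        List<Entry> files = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(baseFolderId);

        while (!pending.isEmpty()) {
            String folderId = pending.pop();
            if (!visited.add(folderId)) {
                continue; // already queried, do not list the same folder twice
            }
            for (Entry child : client.listChildren(folderId)) {
                if (child.folder()) {
                    if (recursive) {
                        pending.push(child.id()); // query this subfolder later
                    }
                } else {
                    files.add(child);
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the remote folder structure.
        Map<String, List<Entry>> tree = Map.of(
                "base", List.of(new Entry("f1", "a.txt", false), new Entry("d1", "sub", true)),
                "d1", List.of(new Entry("f2", "b.txt", false)));
        FolderClient client = id -> tree.getOrDefault(id, List.of());
        listFiles(client, "base", true).forEach(e -> System.out.println(e.name()));
    }
}
```

The trade-off in this sketch is one request per folder instead of one request overall, in exchange for queries of constant size.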
{code:java}
2025-02-24 11:44:57,436 ERROR [Timer-Driven Process Thread-5] o.a.n.p.gcp.drive.ListGoogleDrive ListGoogleDrive[id=c96787d9-bdbc-3511-db04-6529ae078f7f] Failed to perform listing on remote host due to 400 Bad Request
POST https://www.googleapis.com/drive/v3/files
{
  "code": 400,
  "errors": [
    {
      "domain": "global",
      "location": "q",
      "locationType": "parameter",
      "message": "The query is too complex.",
      "reason": "invalid"
    }
  ],
  "message": "The query is too complex."
}
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
POST https://www.googleapis.com/drive/v3/files
{
  "code": 400,
  "errors": [
    {
      "domain": "global",
      "location": "q",
      "locationType": "parameter",
      "message": "The query is too complex.",
      "reason": "invalid"
    }
  ],
  "message": "The query is too complex."
}
    at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:479)
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:565)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:506)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:616)
    at org.apache.nifi.processors.gcp.drive.ListGoogleDrive.performListing(ListGoogleDrive.java:278)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.listByNoTracking(AbstractListProcessor.java:460)
    at org.apache.nifi.processor.util.list.AbstractListProcessor.onTrigger(AbstractListProcessor.java:426)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1272)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244)
    at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
    at org.apache.nifi.engine.FlowEngine.lambda$wrap$1(FlowEngine.java:105)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)