[ 
https://issues.apache.org/jira/browse/NIFI-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929838#comment-17929838
 ] 

ASF subversion and git services commented on NIFI-14291:
--------------------------------------------------------

Commit 5b9c0e4b3079e7bc7485e63836cd1b987a5a03cb in nifi's branch 
refs/heads/main from Peter Turcsanyi
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=5b9c0e4b30 ]

NIFI-14291 Refactored recursive listing in ListGoogleDrive to query folders one 
by one

This closes #9742.

Signed-off-by: Tamas Palfy <[email protected]>


> Refactor recursive folder listing in ListGoogleDrive
> ----------------------------------------------------
>
>                 Key: NIFI-14291
>                 URL: https://issues.apache.org/jira/browse/NIFI-14291
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Peter Turcsanyi
>            Assignee: Peter Turcsanyi
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recursive folder listing in ListGoogleDrive is currently implemented in two 
> phases:
>  # traverse the folder structure from the base folder and collect all the 
> subfolder ids recursively (files are skipped in this step)
>  # execute a single composite query with all the folder ids (like 
> "folder_id_1 in parents or folder_id_2 in parents or ...")
> The composite query may lead to the following errors:
>  * the query is too long or too complex and is rejected by the Drive service 
> [https://stackoverflow.com/questions/29738020/google-drive-api-limit-on-search-query-parameter]
>  * in rare cases, some files are silently omitted from the results returned 
> by the Drive service 
> [https://stackoverflow.com/questions/60131503/is-it-possible-to-query-for-multiple-folders-parents-using-googles-drive-api]
> Refactor the recursive listing to traverse and query the folders one by one 
> (like other List processors do). 
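
The folder-by-folder approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `FolderByFolderListing`, `Entry` and `FolderQuery` are hypothetical names, and `FolderQuery` stands in for a single-folder Drive query so that the sketch runs without any Google client library. It assumes Java 16 or later (records).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of folder-by-folder listing; not the ListGoogleDrive code. */
class FolderByFolderListing {

    /** Hypothetical stand-in for one Drive entry (file or folder). */
    record Entry(String id, String name, boolean folder) {}

    /** Hypothetical stand-in for one small query that returns the direct children of a folder. */
    interface FolderQuery {
        List<Entry> listChildren(String folderId);
    }

    /** Visits folders one at a time, so every query names exactly one parent folder. */
    static List<Entry> listRecursively(FolderQuery query, String baseFolderId) {
        List<Entry> files = new ArrayList<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(baseFolderId);

        while (!pending.isEmpty()) {
            String folderId = pending.pop();
            for (Entry child : query.listChildren(folderId)) {
                if (child.folder()) {
                    pending.push(child.id()); // queue the subfolder for its own query
                } else {
                    files.add(child);         // collect files in the same pass
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // In-memory folder tree used in place of the remote service.
        Map<String, List<Entry>> tree = Map.of(
                "base", List.of(new Entry("f1", "a.txt", false), new Entry("sub", "sub", true)),
                "sub", List.of(new Entry("f2", "b.txt", false)));

        FolderQuery inMemory = id -> tree.getOrDefault(id, List.of());
        listRecursively(inMemory, "base").forEach(e -> System.out.println(e.name()));
    }
}
```

The trade-off is one request per folder instead of one large request, in exchange for queries whose size does not depend on the number of subfolders.
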
> {code:java}
> 2025-02-24 11:44:57,436 ERROR [Timer-Driven Process Thread-5] o.a.n.p.gcp.drive.ListGoogleDrive ListGoogleDrive[id=c96787d9-bdbc-3511-db04-6529ae078f7f] Failed to perform listing on remote host due to 400 Bad Request
> POST https://www.googleapis.com/drive/v3/files
> {
>   "code": 400,
>   "errors": [
>     {
>       "domain": "global",
>       "location": "q",
>       "locationType": "parameter",
>       "message": "The query is too complex.",
>       "reason": "invalid"
>     }
>   ],
>   "message": "The query is too complex."
> }
> com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
> POST https://www.googleapis.com/drive/v3/files
> {
>   "code": 400,
>   "errors": [
>     {
>       "domain": "global",
>       "location": "q",
>       "locationType": "parameter",
>       "message": "The query is too complex.",
>       "reason": "invalid"
>     }
>   ],
>   "message": "The query is too complex."
> }
>       at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
>       at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
>       at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
>       at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:479)
>       at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
>       at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:565)
>       at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:506)
>       at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:616)
>       at org.apache.nifi.processors.gcp.drive.ListGoogleDrive.performListing(ListGoogleDrive.java:278)
>       at org.apache.nifi.processor.util.list.AbstractListProcessor.listByNoTracking(AbstractListProcessor.java:460)
>       at org.apache.nifi.processor.util.list.AbstractListProcessor.onTrigger(AbstractListProcessor.java:426)
>       at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>       at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1272)
>       at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:244)
>       at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:59)
>       at org.apache.nifi.engine.FlowEngine.lambda$wrap$1(FlowEngine.java:105)
>       at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
>       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
>       at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>       at java.base/java.lang.Thread.run(Thread.java:1583)
> {code}
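
To make the "query is too complex" failure in the log above more concrete, the sketch below compares the shape of a composite query with that of a single-folder query. `QueryShapes` and its methods are hypothetical helpers written for this illustration, the folder ids are made up, and the exact size at which the Drive service rejects a query is not stated in the issue, so none is assumed here. The only Drive-specific detail used is the `'<id>' in parents` search clause.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical helper comparing the two query shapes; not the ListGoogleDrive code. */
class QueryShapes {

    /** Query for the direct children of one folder. */
    static String singleFolderQuery(String folderId) {
        return "'" + folderId + "' in parents";
    }

    /** One clause per folder joined with "or", so the query grows linearly with the folder count. */
    static String compositeQuery(List<String> folderIds) {
        return folderIds.stream()
                .map(QueryShapes::singleFolderQuery)
                .collect(Collectors.joining(" or "));
    }

    public static void main(String[] args) {
        // Made-up ids standing in for the subfolders collected in the first phase.
        List<String> ids = IntStream.rangeClosed(1, 500)
                .mapToObj(i -> "folder_id_" + i)
                .collect(Collectors.toList());

        System.out.println("composite query length:     " + compositeQuery(ids).length());
        System.out.println("single-folder query length: " + singleFolderQuery(ids.get(0)).length());
    }
}
```

With the single-folder form, the query size stays constant no matter how deep or wide the folder tree is.
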



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
