[ https://issues.apache.org/jira/browse/BEAM-7577?focusedWorklogId=280363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-280363 ]
ASF GitHub Bot logged work on BEAM-7577:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jul/19 12:30
            Start Date: 22/Jul/19 12:30
    Worklog Time Spent: 10m
      Work Description: EDjur commented on issue #8950: [BEAM-7577] Allow ValueProviders in Datastore Query filters
URL: https://github.com/apache/beam/pull/8950#issuecomment-508107035

   I've noticed that a small change might be needed in `datastoreio.py` or alternatively in `query_splitter.py` in order to use this together with ReadFromDatastore.

   Specifically, the `validate_split` function in `query_splitter.py` is causing issues when using value providers as a filter:
   ```
   for filter in query.filters:
     if filter[1] in ['<', '<=', '>', '>=']:
       raise SplitNotPossibleError('Query cannot have any inequality filters.')
   ```
   Since this function is run before the query is converted to a client_query by calling the `_to_client_query` method, filter here will be of type ValueProvider, which does not support indexing, therefore raising a TypeError.

   I'm thinking that we should perhaps evaluate the values of our ValueProvider-filter before calculating the split. But this means we cannot evaluate in `_to_client_query`, which I thought was a neat solution that wasn't particularly hacky.

   For context, the flow is essentially the `expand` method in ReadFromDatastore that calls the SplitQuery before Read, and Read is what causes the `_to_client_query` method to be called.

   Question is basically where the best place is to evaluate these filters. @udim What's your take on this?

   Edit: Will explore this again after fixing the other issue first.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 280363)
    Time Spent: 4h 20m  (was: 4h 10m)

> Allow the use of ValueProviders in
> datastore.v1new.datastoreio.ReadFromDatastore query
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-7577
>                 URL: https://issues.apache.org/jira/browse/BEAM-7577
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-python-gcp
>   Affects Versions: 2.13.0
>            Reporter: EDjur
>            Assignee: EDjur
>            Priority: Minor
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The current implementation of ReadFromDatastore does not support specifying
> the query parameter at runtime. This could potentially be fixed through the
> usage of a ValueProvider to specify and build the Datastore query.
> Allowing specifying the query at runtime makes it easier to use dynamic
> queries in Dataflow templates. Currently, there is no way to have a Dataflow
> template that includes a dynamic query (such as filtering by a timestamp or
> similar).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
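
For reference, the indexing failure described in the worklog comment can be reproduced with a minimal, self-contained sketch. Nothing here imports Beam: `FakeValueProvider`, `validate_filters`, and `resolve_filters` are hypothetical stand-ins introduced only for illustration, and `SplitNotPossibleError` is redefined locally. The sketch assumes each filter is either a `(property, operator, value)` tuple or a provider wrapping such a tuple. It shows one possible ordering (resolve, then validate); it does not settle where in `ReadFromDatastore` that resolution should happen, which is the open question in the comment.

```python
class SplitNotPossibleError(Exception):
    """Local stand-in for the error named in the quoted snippet."""


class FakeValueProvider:
    """Hypothetical stand-in for a ValueProvider; only get() is modeled."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


INEQUALITY_OPERATORS = ('<', '<=', '>', '>=')


def validate_filters(filters):
    """Mirror of the inequality check quoted in the comment."""
    for query_filter in filters:
        # Indexing fails with TypeError if query_filter is a provider object.
        if query_filter[1] in INEQUALITY_OPERATORS:
            raise SplitNotPossibleError(
                'Query cannot have any inequality filters.')


def resolve_filters(filters):
    """Replace each provider-wrapped filter with its concrete tuple."""
    return [
        f.get() if isinstance(f, FakeValueProvider) else f
        for f in filters
    ]


if __name__ == '__main__':
    filters = [('kind', '=', 'Task'),
               FakeValueProvider(('created', '=', 1563798600))]
    try:
        validate_filters(filters)
    except TypeError as exc:
        print('unresolved filters fail:', exc)
    # After resolution the same check runs without error.
    validate_filters(resolve_filters(filters))
    print('resolved filters pass validation')
```

Note that this sketch calls `get()` eagerly, so it only applies at a point where the provider's value is actually available.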