[
https://issues.apache.org/jira/browse/BEAM-7577?focusedWorklogId=282291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-282291
]
ASF GitHub Bot logged work on BEAM-7577:
----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Jul/19 21:18
Start Date: 24/Jul/19 21:18
Worklog Time Spent: 10m
Work Description: udim commented on issue #8950: [BEAM-7577] Allow
ValueProviders in Datastore Query filters
URL: https://github.com/apache/beam/pull/8950#issuecomment-514803909
> I've noticed that a small change might be needed in `datastoreio.py` or
> alternatively in `query_splitter.py` in order to use this together with
> ReadFromDatastore. Specifically, the `validate_split` function in
> `query_splitter.py` is causing issues when using value providers as a filter:
>
> ```
> for filter in query.filters:
>   if filter[1] in ['<', '<=', '>', '>=']:
>     raise SplitNotPossibleError('Query cannot have any inequality filters.')
> ```
>
> Since this function is run before the query is converted to a client_query
> by calling the `_to_client_query` method, filter here will be of type
> ValueProvider, which does not support indexing, therefore raising a TypeError.
>
> I'm thinking that we should perhaps evaluate the values of our
> ValueProvider-filter before calculating the split. But this means we cannot
> evaluate in `_to_client_query`, which I thought was a neat solution that wasn't
> particularly hacky.
>
> For context, the flow is essentially the `expand` method in
> ReadFromDatastore that calls the SplitQuery before Read, and Read is what
> causes the `_to_client_query` method to be called.
>
> The question is basically where best to evaluate these filters.
>
> @udim What's your take on this?
>
> Edit: Will explore this again after fixing the other issue first.
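A minimal reproduction of the failure described in the quoted comment, as a sketch only. `FakeValueProvider` and the local `SplitNotPossibleError` are stand-ins written for this example, not the real Beam classes, and `validate_split` here takes the filter list directly instead of a query object:
```python
class SplitNotPossibleError(Exception):
    """Stand-in for the error raised in query_splitter.py."""


class FakeValueProvider:
    """Hypothetical stand-in for a ValueProvider: has get(), no indexing."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


def validate_split(filters):
    # Same loop as in the quoted snippet: each filter is indexed like a tuple.
    for filter in filters:
        if filter[1] in ['<', '<=', '>', '>=']:
            raise SplitNotPossibleError(
                'Query cannot have any inequality filters.')


# Plain tuples support indexing, so validation passes.
validate_split([('status', '=', 'done')])

# A filter wrapped in a provider is not subscriptable, so filter[1] fails
# before the inequality check is ever reached.
try:
    validate_split([FakeValueProvider(('status', '=', 'done'))])
    outcome = 'no error'
except TypeError:
    outcome = 'TypeError'
```
The wrapped filter fails on `filter[1]` itself, which is why the order of split validation and filter evaluation matters.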
I would put
```
self.filters = self._set_runtime_filters(filters)
```
in `Query.__init__`. I believe that solves both issues.
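A rough sketch of what that could look like. The body of `_set_runtime_filters` is not shown in this thread, so the version below is an assumption: it unwraps any filter whose provider reports its value as accessible and leaves everything else untouched. `FakeValueProvider` is a stand-in for this example, and this `Query` is a toy class, not Beam's:
```python
class FakeValueProvider:
    """Hypothetical stand-in for a ValueProvider whose value is available."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        return True

    def get(self):
        return self._value


class Query:
    """Toy query holding only filters, to show where the call would go."""

    def __init__(self, filters=()):
        # Normalize once at construction, so later code (for example split
        # validation) only sees plain (property, operator, value) tuples.
        self.filters = self._set_runtime_filters(filters)

    @staticmethod
    def _set_runtime_filters(filters):
        # Assumed behavior: unwrap accessible providers, keep plain tuples.
        resolved = []
        for item in filters:
            if isinstance(item, FakeValueProvider) and item.is_accessible():
                item = item.get()
            resolved.append(item)
        return tuple(resolved)


query = Query(filters=[FakeValueProvider(('age', '>', 3)),
                       ('kind', '=', 'x')])
operator = query.filters[0][1]  # indexing now works on the resolved tuple
```
In this sketch a provider whose value is not yet accessible at construction time would stay wrapped, so how that case is handled depends on the real implementation.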
Issue Time Tracking
-------------------
Worklog Id: (was: 282291)
Time Spent: 5h 10m (was: 5h)
> Allow the use of ValueProviders in
> datastore.v1new.datastoreio.ReadFromDatastore query
> --------------------------------------------------------------------------------------
>
> Key: BEAM-7577
> URL: https://issues.apache.org/jira/browse/BEAM-7577
> Project: Beam
> Issue Type: New Feature
> Components: io-python-gcp
> Affects Versions: 2.13.0
> Reporter: EDjur
> Assignee: EDjur
> Priority: Minor
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> The current implementation of ReadFromDatastore does not support specifying
> the query parameter at runtime. This could potentially be fixed by using a
> ValueProvider to specify and build the Datastore query.
> Specifying the query at runtime would make it easier to use dynamic queries
> in Dataflow templates. Currently, there is no way to have a Dataflow template
> that includes a dynamic query (such as filtering by a timestamp or similar).