Hi Vincent!
Do you think you could add some code snippets or pseudocode showing what this
would look like? Feel free to do it by email, gist, Google Doc, etc.
Best,
-P.

On Thu, Oct 3, 2019 at 4:16 PM Vincent Marquez <vincent.marq...@gmail.com>
wrote:

> Currently the CassandraIO connector allows a user to specify a table, and
> the CassandraSource object generates a list of queries based on the table's
> token ranges and groups those queries by token range.
>
> I often need to run many generated queries (sometimes a million or more)
> against a subset of a table.  In that case it is easier and much more
> performant to supply a collection of queries along with their tokens,
> which are used to both partition and group the queries, than to let
> CassandraIO naively scan the entire table or apply a simple filter.
>
> I propose that, in addition to the current method of supplying a table and
> filter, the user also be allowed to pass in a collection of queries and
> tokens.  The way CassandraSource currently breaks up the table could then
> be rebuilt on top of the proposed implementation to reduce code
> duplication.  If this sounds like an acceptable alternative way of
> using the CassandraIO connector, I don't mind giving it a shot with a pull
> request.
>
> If there is a better way of doing this, I'm eager to hear and learn.
> Thanks for reading!
>
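
To make the request concrete, here is a rough sketch of the two modes described
in the quoted message: generating one query per token range for a whole table,
and bucketing user-supplied (query, token) pairs by token range. It is written
in Python only for brevity and is not the CassandraIO code. All function names
are hypothetical, the token bounds assume a signed 64-bit token space, and the
ring is split evenly rather than by the cluster's actual token ownership.

```python
from bisect import bisect_left
from collections import defaultdict

# Assumption: tokens are signed 64-bit integers.
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_ring(num_splits):
    """Split the token space into num_splits contiguous (start, end] ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def table_scan_queries(keyspace, table, partition_key, ranges):
    """Whole-table mode (simplified): one CQL query string per token range."""
    return [
        f"SELECT * FROM {keyspace}.{table} "
        f"WHERE token({partition_key}) > {start} "
        f"AND token({partition_key}) <= {end}"
        for start, end in ranges
    ]


def group_queries_by_range(queries_with_tokens, ranges):
    """Proposed mode: bucket user-supplied (query, token) pairs by token range."""
    upper_bounds = [end for _, end in ranges]
    groups = defaultdict(list)
    for query, token in queries_with_tokens:
        # Index of the first range whose upper bound is >= token.
        idx = bisect_left(upper_bounds, token)
        groups[ranges[idx]].append(query)
    return dict(groups)
```

Under this sketch, each key of the dictionary returned by
`group_queries_by_range` would correspond to one unit of work, so only ranges
that actually contain supplied queries produce any reads.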
