[ https://issues.apache.org/jira/browse/SPARK-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-11261:
------------------------------------
Assignee: Apache Spark
> Provide a more flexible alternative to Jdbc RDD
> -----------------------------------------------
>
> Key: SPARK-11261
> URL: https://issues.apache.org/jira/browse/SPARK-11261
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Reporter: Richard Marscher
> Assignee: Apache Spark
>
> The existing JdbcRDD covers only a limited number of use cases because it
> requires the query to be expressed with lower- and upper-bound predicates,
> like: "select title, author from books where ? <= id and id <= ?"
> However, many use cases cannot be expressed this way, or become much less
> efficient when forced into it.
> For example, we have a MySQL table partitioned on a partition key. We don't
> have range values to look up; instead we want all entries matching a
> predicate, with Spark running one query per RDD partition against each
> logical partition of our MySQL table. For example: "select * from devices
> where partition_id = ? and app_id = 'abcd'".
> Another use case is looking up a distinct set of identifiers that don't
> fall within an ordering: "select * from users where user_id in
> (?,?,?,?,?,?,?)". The number of identifiers may be quite large and/or dynamic.
> Solution:
> Instead of addressing each use case differently with new RDD types, provide
> an alternate, general RDD that gives the user direct control over how the
> query is partitioned in Spark and how the placeholders are filled in.
> The user should be able to control which placeholder values are available on
> each partition of the RDD and also how they are inserted into the
> PreparedStatement. Ideally it can support dynamic placeholder values like
> inserting a set of values for an IN clause or similar.
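
The idea in the quoted description can be illustrated with a minimal sketch.
This is not Spark code and not the API proposed in the issue. It assumes
Python's sqlite3 module as a stand-in for MySQL, and every name in it
(DB_URI, connect, compute_partition, partition_key_query, in_clause_query)
is hypothetical. Each "partition" is simply the list of placeholder values
assigned to it, and a user-supplied function decides how those values become
SQL text and bound parameters.

```python
import sqlite3
from typing import Callable, List, Sequence, Tuple

# Assumption: a shared in-memory SQLite database stands in for MySQL.
DB_URI = "file:spark11261_demo?mode=memory&cache=shared"

# A query builder maps one partition's values to (sql, bound parameters).
QueryBuilder = Callable[[Sequence[int]], Tuple[str, List[int]]]


def connect() -> sqlite3.Connection:
    """Open a new connection, as each partition would do on its own."""
    return sqlite3.connect(DB_URI, uri=True)


def partition_key_query(values: Sequence[int]) -> Tuple[str, List[int]]:
    """One query per logical table partition: a single placeholder value."""
    (partition_id,) = values
    sql = ("select device_id from devices "
           "where partition_id = ? and app_id = 'abcd' order by device_id")
    return sql, [partition_id]


def in_clause_query(values: Sequence[int]) -> Tuple[str, List[int]]:
    """Dynamic IN clause: one '?' per identifier held by this partition."""
    marks = ",".join("?" for _ in values)
    sql = ("select user_id, name from users "
           f"where user_id in ({marks}) order by user_id")
    return sql, list(values)


def compute_partition(values: Sequence[int], build_query: QueryBuilder) -> list:
    """Run exactly one query for one partition and return its rows."""
    if not values:  # an empty IN list is not valid SQL in many databases
        return []
    sql, params = build_query(values)
    conn = connect()
    try:
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()


# Demo data. The 'keeper' connection keeps the in-memory database alive.
keeper = connect()
keeper.executescript("""
drop table if exists devices;
drop table if exists users;
create table devices (device_id integer, partition_id integer, app_id text);
insert into devices values
  (1, 0, 'abcd'), (2, 0, 'zzzz'), (3, 1, 'abcd'), (4, 1, 'abcd');
create table users (user_id integer, name text);
insert into users values (7, 'ann'), (19, 'bob'), (23, 'cy'), (42, 'dee');
""")
keeper.commit()

# The caller decides which placeholder values live on which partition.
device_rows = [compute_partition([p], partition_key_query) for p in (0, 1)]
user_rows = [compute_partition(ids, in_clause_query) for ids in ([42, 7], [19])]
```

In a real RDD the per-partition value lists would travel with the partition
objects and something like compute_partition would run on the executors. The
point of the sketch is only that the partitioning and the placeholder
binding are both supplied by the caller rather than derived from a numeric
range.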