[ https://issues.apache.org/jira/browse/SPARK-8655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Nitschinger updated SPARK-8655:
---------------------------------------
Description:
I'm working on a custom data source, porting it from 1.3 to 1.4.
On 1.3 I could easily extend the SparkSQL imports and get access to the
SQLContext, which meant I could use custom options right away. One of those
options is a Filter that I pass down to my Relation for tighter schema
inference against a schemaless database.
So I would have something like:
n1ql(filter: Filter = null, userSchema: StructType = null, bucketName: String = null)
Since I want to move my API behind the DataFrameReader, the SQLContext is no
longer directly available; I can only get it through the RelationProvider,
which I've implemented and which works nicely.
The only problem I have now is that while I can pass in custom options, they
are all String-typed. So I have no way to pass down my optional Filter anymore
(since parameters is a Map[String, String]).
Would it be possible to extend the options so that more than just Strings can
be passed in? Right now I probably need to work around that by documenting how
people can pass in a string which I turn into a Filter, but that's somewhat
hacky.
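For illustration, the string-based workaround could look roughly like the
sketch below. Everything in it is an assumption made for the example: the
Filter case classes are local stand-ins that only mirror the shape of the
filters in org.apache.spark.sql.sources (so the sketch runs without Spark),
and the "schemaFilter" option key and the tiny "<attribute> <op> <value>"
syntax are hypothetical, not part of Spark or of any connector.

```scala
// Hypothetical stand-ins mirroring the shape of filters in
// org.apache.spark.sql.sources; defined locally so this runs without Spark.
sealed trait Filter
final case class EqualTo(attribute: String, value: Any) extends Filter
final case class GreaterThan(attribute: String, value: Any) extends Filter
final case class LessThan(attribute: String, value: Any) extends Filter

object FilterOption {
  // Assumed option key; not defined by Spark or by any data source.
  val Key = "schemaFilter"

  // Matches "<attribute> <op> <value>" where op is exactly one of =, >, <.
  // The value may not start with an operator character, so ">=" is rejected.
  private val Pattern = """\s*(\w+)\s*(=|>|<)\s*([^=<>\s].*?)\s*""".r

  // Single-quoted values become Strings (quotes stripped); bare integers
  // become Long; anything else is kept as the raw String.
  private def parseValue(raw: String): Any = {
    val quoted = raw.length >= 2 && raw.startsWith("'") && raw.endsWith("'")
    if (quoted) raw.substring(1, raw.length - 1)
    else scala.util.Try(raw.toLong).getOrElse(raw)
  }

  // Returns None when the text does not fit the tiny syntax above.
  def parse(text: String): Option[Filter] = text match {
    case Pattern(attr, "=", v) => Some(EqualTo(attr, parseValue(v)))
    case Pattern(attr, ">", v) => Some(GreaterThan(attr, parseValue(v)))
    case Pattern(attr, "<", v) => Some(LessThan(attr, parseValue(v)))
    case _                     => None
  }

  // What a provider could do with the Map[String, String] it receives.
  def fromParameters(parameters: Map[String, String]): Option[Filter] =
    parameters.get(Key).flatMap(parse)
}
```

A caller would then set the option as a plain string (for example
"type = 'airline'") and the provider would rebuild the Filter from its
parameters map. This also shows why the approach feels hacky: the string
syntax has to be invented, documented and validated separately, and mistakes
only surface at runtime.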
Note that built-in implementations like JSON or JDBC don't have this problem:
since they can access the (private) SQLContext directly, they don't need to go
through the decoupling of the RelationProvider and can accept any custom
arguments they want on their methods.
was:
I'm working on a custom data source, porting it from 1.3 to 1.4.
On 1.3 I could easily extend the SparkSQL imports and get access to the
SQLContext, which meant I could use custom options right away. One of those
options is a Filter that I pass down to my Relation for tighter schema
inference against a schemaless database.
So I would have something like:
n1ql(filter: Filter = null, userSchema: StructType = null, bucketName: String = null)
Since I want to move my API behind the DataFrameReader, the SQLContext is no
longer directly available; I can only get it through the RelationProvider,
which I've implemented and which works nicely.
The only problem I have now is that while I can pass in custom options, they
are all String-typed. So I have no way to pass down my optional Filter anymore
(since parameters is a Map[String, String]).
Would it be possible to extend the options so that more than just Strings can
be passed in? Right now I probably need to work around that by documenting how
people can pass in a string which I turn into a Filter, but that's somewhat
hacky.
> DataFrameReader#option supports more than String as value
> ---------------------------------------------------------
>
> Key: SPARK-8655
> URL: https://issues.apache.org/jira/browse/SPARK-8655
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.4.0
> Reporter: Michael Nitschinger