And in case anyone is wondering: the need for this may go away with
DataSourceV2, depending on how the function pushdown discussions land. We
want to add functions that work only with the Cassandra data source (ttl
and writetime). I've done the work to add the custom expressions and
analysis rules, but I want to make sure they are reachable from the SQL
interface.
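
A rough sketch of the shape of that work (everything here is illustrative:
the names are made up, it leans on the unstable sessionState internals, and
the exact FunctionRegistry signature varies across Spark versions). A marker
expression stands in for Cassandra's ttl(), and our injected analysis rules
would rewrite it later:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.FunctionIdentifier
import org.apache.spark.sql.catalyst.expressions.{Expression, Unevaluable, UnaryExpression}
import org.apache.spark.sql.types.{DataType, IntegerType}

// Hypothetical stand-in for Cassandra's ttl(); Unevaluable because an
// analysis rule is expected to replace it before execution.
case class CassandraTtl(child: Expression)
  extends UnaryExpression with Unevaluable {
  override def dataType: DataType = IntegerType
}

// Register the builder into a session's function registry after init.
def registerCassandraFunctions(spark: SparkSession): Unit = {
  spark.sessionState.functionRegistry.registerFunction(
    FunctionIdentifier("ttl"),
    (args: Seq[Expression]) => CassandraTtl(args.head))
}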

On Thu, Sep 27, 2018 at 1:35 PM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> It would be a @dev internal API, I think.
>
> If we wanted to go extremely general with a post-session-init hook, it
> could be added to SparkSessionExtensions:
>
> def postSessionInit(session: SparkSession): Unit
>
> which would allow you to do just about anything after sessionState has
> finished initializing.
>
> Or, if we specifically wanted to allow just functions:
>
> def injectFunction(name: String, function: Seq[Expression] => Expression): Unit = {
>   sparkSession.registerFunction(name, function) // or add to a buffer
>   // that is registered later
> }
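>
> From the caller's side that might look something like this (purely
> illustrative, since the hook doesn't exist yet; CassandraTtl is a
> hypothetical marker expression for Cassandra's ttl):
>
> class MyExtensions extends (SparkSessionExtensions => Unit) {
>   override def apply(ext: SparkSessionExtensions): Unit = {
>     ext.injectFunction("ttl",
>       (args: Seq[Expression]) => CassandraTtl(args.head))
>   }
> }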
>
>
>
> On Thu, Sep 27, 2018 at 1:16 PM Reynold Xin <r...@databricks.com> wrote:
>
>> Thoughts on how the API would look?
>>
>> On Thu, Sep 27, 2018 at 11:13 AM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> While that's easy for some users, we basically want to load some
>>> functions by default into all session catalogues, regardless of who made
>>> them. We already do this for certain rules and strategies using
>>> SparkSessionExtensions: every app that runs through our submit scripts
>>> gets a config parameter added, and it's transparent to the user. I think
>>> we'll probably have to fork some code (at least for the CliDriver), and
>>> the Thriftserver has a bunch of code that doesn't run under
>>> startWithContext, so we may have an issue there as well.
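>>>
>>> For context, the rules-and-strategies version is just a config parameter
>>> our submit scripts tack on, something like this (the extension class name
>>> is ours and purely illustrative):
>>>
>>> spark-submit \
>>>   --conf spark.sql.extensions=com.example.OurSparkExtensions \
>>>   our-app.jar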
>>>
>>>
>>>
>>> On Wed, Sep 26, 2018, 6:21 PM Mark Hamstra <m...@clearstorydata.com>
>>> wrote:
>>>
>>>> You're talking about users starting the Thriftserver or SqlShell from
>>>> the command line, right? It's much easier if you start a Thriftserver
>>>> programmatically, so that you can register functions when initializing a
>>>> SparkContext and then call HiveThriftServer2.startWithContext with that
>>>> context.
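>>>>
>>>> A minimal sketch of that path (assuming some registerCassandraFunctions
>>>> helper that does the actual registration):
>>>>
>>>> import org.apache.spark.sql.SparkSession
>>>> import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
>>>>
>>>> val spark = SparkSession.builder()
>>>>   .appName("thriftserver-with-custom-functions")
>>>>   .enableHiveSupport()
>>>>   .getOrCreate()
>>>>
>>>> registerCassandraFunctions(spark) // register before clients connect
>>>> HiveThriftServer2.startWithContext(spark.sqlContext)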
>>>>
>>>> On Wed, Sep 26, 2018 at 3:30 PM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> I've been looking recently at possible avenues for loading new
>>>>> functions into the Thriftserver and SqlShell at launch time. I
>>>>> basically want to preload a set of functions in addition to those
>>>>> already present in the Spark code. I'm not sure there is currently a
>>>>> way to do this, and I was wondering if anyone had any ideas.
>>>>>
>>>>> I would basically want to make it so that any user launching either
>>>>> of these tools automatically has access to some custom functions. In
>>>>> the SparkShell I can do this by adding lines to the init section, but
>>>>> it would be nicer if we could pass in a parameter pointing to a class
>>>>> with a list of additional functions to add to all new session states.
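>>>>>
>>>>> For the shell, that init-section workaround is roughly the following
>>>>> (illustrative; a plain UDF is shown for brevity, though our real
>>>>> functions are Catalyst expressions rather than UDFs):
>>>>>
>>>>> // functions.scala, loaded at startup via: spark-shell -i functions.scala
>>>>> spark.udf.register("plusOne", (x: Int) => x + 1)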
>>>>>
>>>>> Essentially, I want an interface like the Spark Session Extensions,
>>>>> but one that runs after session init has completed rather than during
>>>>> it.
>>>>>
>>>>> Thanks for your time, and I would be glad to hear any opinions or
>>>>> ideas on this,
>>>>>
>>>> --
>> --
>> excuse the brevity and lower case due to wrist injury
>>
>
