[
https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Herman van Hovell updated SPARK-18127:
--------------------------------------
Description:
As a Spark user I want to be able to customize my Spark session. I currently
want to be able to do the following things:
# I want to be able to add custom analyzer rules. This allows me to implement
my own logical constructs; an example of this could be a recursive operator.
# I want to be able to add my own analysis checks. This allows me to catch
problems with Spark plans early on. An example of this would be data
source-specific checks.
# I want to be able to add my own optimizations. This allows me to optimize
plans in different ways, for instance when I use a very different cluster
(for example a one-node X1 instance). This supersedes the current
{{spark.experimental}} methods.
# I want to be able to add my own planning strategies. This supersedes the
current {{spark.experimental}} methods. This allows me to plan my own physical
plan; an example of this would be planning my own heavily integrated data
source (CarbonData for example).
# I want to be able to use my own customized SQL constructs. An example of this
would be supporting my own dialect, or being able to add constructs to the
current SQL language. I should not have to implement a complete parser, and
should be able to delegate to an underlying parser.
# I want to be able to track modifications and calls to the external catalog. I
want this API to be stable. This allows me to synchronize with other systems.
This API should modify the SparkSession when the session gets started, and it
should NOT change the session in flight.
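The extension points above, together with the build-time-only requirement, can be sketched as a small builder pattern: user rules are collected before any session exists and are snapshotted into the session when it is built, so the session never changes in flight. The sketch below is a minimal, Spark-free model of that idea; {{Plan}}, {{Extensions}}, and {{Session}} are hypothetical stand-ins, not Spark's real classes.

```scala
// Minimal, Spark-free sketch of the proposed extension points. Plan,
// Extensions, and Session are hypothetical stand-ins for Spark's
// LogicalPlan, the extension builder, and SparkSession (assumptions).
case class Plan(description: String)

// Collects user-injected rules before any session exists.
class Extensions {
  var analyzerRules: List[Plan => Plan] = Nil   // custom analyzer rules (#1)
  var checkRules: List[Plan => Unit] = Nil      // custom analysis checks (#2)
  var optimizerRules: List[Plan => Plan] = Nil  // custom optimizations (#3)

  def injectAnalyzerRule(rule: Plan => Plan): Unit =
    analyzerRules = analyzerRules :+ rule
  def injectCheckRule(rule: Plan => Unit): Unit =
    checkRules = checkRules :+ rule
  def injectOptimizerRule(rule: Plan => Plan): Unit =
    optimizerRules = optimizerRules :+ rule
}

// The session snapshots the injected rules at construction time; rules
// injected afterwards are ignored, so the session never changes in flight.
class Session(extensions: Extensions) {
  private val analyzerRules = extensions.analyzerRules
  private val checkRules = extensions.checkRules
  private val optimizerRules = extensions.optimizerRules

  def analyze(plan: Plan): Plan = {
    val resolved = analyzerRules.foldLeft(plan)((p, rule) => rule(p))
    checkRules.foreach(check => check(resolved)) // catch bad plans early
    resolved
  }

  def optimize(plan: Plan): Plan =
    optimizerRules.foldLeft(plan)((p, rule) => rule(p))
}

// Usage: inject rules, build the session, run a plan through it.
val ext = new Extensions
ext.injectAnalyzerRule(p => Plan(p.description + " [resolved]"))
ext.injectCheckRule(p => require(p.description.nonEmpty, "empty plan"))
ext.injectOptimizerRule(p => Plan(p.description + " [optimized]"))

val session = new Session(ext)
val result = session.optimize(session.analyze(Plan("scan")))
println(result.description) // prints "scan [resolved] [optimized]"
```

The planner-strategy, parser-delegation, and catalog-tracking points (#4 to #6) would follow the same shape: additional lists on the builder, frozen into the session at start-up, with the injected parser receiving the underlying parser to delegate to.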
> Add hooks and extension points to Spark
> ---------------------------------------
>
> Key: SPARK-18127
> URL: https://issues.apache.org/jira/browse/SPARK-18127
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Srinath
> Assignee: Herman van Hovell
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]