[
https://issues.apache.org/jira/browse/SPARK-18127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Herman van Hovell updated SPARK-18127:
--------------------------------------
Description:
As a Spark user I want to be able to customize my Spark session. I currently
want to be able to do the following things:
# I want to be able to add custom analyzer rules. This allows me to implement
my own logical constructs; an example of this could be a recursive operator.
# I want to be able to add my own analysis checks. This allows me to catch
problems with Spark plans early on. An example of this would be data
source-specific checks.
# I want to be able to add my own optimizations. This allows me to optimize
plans in different ways, for instance when I use a very different cluster
(for example a one-node X1 instance). This supersedes the current
{{spark.experimental}} methods.
# I want to be able to add my own planning strategies. This supersedes the
current {{spark.experimental}} methods. This allows me to plan my own physical
plan; an example of this would be planning my own heavily integrated data
source (CarbonData for example).
# I want to be able to use my own customized SQL constructs. An example of this
would be supporting my own dialect, or being able to add constructs to the
current SQL language. I should not have to implement a complete parser, and
should be able to delegate to an underlying parser.
# I want to be able to track modifications and calls to the external catalog. I
want this API to be stable. This allows me to synchronize with other systems.
This API should modify the SparkSession when the session gets started, and it
should NOT change the session in flight.
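The extension points above, together with the build-time-only requirement, can be sketched as a small builder pattern: user rules are collected before any session exists and are snapshotted into the session when it is built, so the session never changes in flight. The sketch below is a minimal, Spark-free model of that idea; {{Plan}}, {{Extensions}}, and {{Session}} are hypothetical stand-ins, not Spark's real classes.

```scala
// Minimal, Spark-free sketch of the proposed extension points. Plan,
// Extensions, and Session are hypothetical stand-ins for Spark's
// LogicalPlan, the extension builder, and SparkSession (assumptions).
case class Plan(description: String)

// Collects user-injected rules before any session exists.
class Extensions {
  var analyzerRules: List[Plan => Plan] = Nil   // custom analyzer rules (#1)
  var checkRules: List[Plan => Unit] = Nil      // custom analysis checks (#2)
  var optimizerRules: List[Plan => Plan] = Nil  // custom optimizations (#3)

  def injectAnalyzerRule(rule: Plan => Plan): Unit =
    analyzerRules = analyzerRules :+ rule
  def injectCheckRule(rule: Plan => Unit): Unit =
    checkRules = checkRules :+ rule
  def injectOptimizerRule(rule: Plan => Plan): Unit =
    optimizerRules = optimizerRules :+ rule
}

// The session snapshots the injected rules at construction time; rules
// injected afterwards are ignored, so the session never changes in flight.
class Session(extensions: Extensions) {
  private val analyzerRules = extensions.analyzerRules
  private val checkRules = extensions.checkRules
  private val optimizerRules = extensions.optimizerRules

  def analyze(plan: Plan): Plan = {
    val resolved = analyzerRules.foldLeft(plan)((p, rule) => rule(p))
    checkRules.foreach(check => check(resolved)) // catch bad plans early
    resolved
  }

  def optimize(plan: Plan): Plan =
    optimizerRules.foldLeft(plan)((p, rule) => rule(p))
}

// Usage: inject rules, build the session, run a plan through it.
val ext = new Extensions
ext.injectAnalyzerRule(p => Plan(p.description + " [resolved]"))
ext.injectCheckRule(p => require(p.description.nonEmpty, "empty plan"))
ext.injectOptimizerRule(p => Plan(p.description + " [optimized]"))

val session = new Session(ext)
val result = session.optimize(session.analyze(Plan("scan")))
println(result.description) // prints "scan [resolved] [optimized]"
```

The planner-strategy, parser-delegation, and catalog-tracking points (#4 to #6) would follow the same shape: additional lists on the builder, frozen into the session at start-up, with the injected parser receiving the underlying parser to delegate to.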
> Add hooks and extension points to Spark
> ---------------------------------------
>
> Key: SPARK-18127
> URL: https://issues.apache.org/jira/browse/SPARK-18127
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Reporter: Srinath
> Assignee: Herman van Hovell
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]