FrozenGene commented on a change in pull request #5:
URL: https://github.com/apache/tvm-rfcs/pull/5#discussion_r648064710
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
Review comment:
`and variety hardware platforms`
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
+* The three systems above have isolated sets of APIs with several layers of
their own abstraction, which are not only hard to learn, but also
engineering-intensive to customize.
+
+**Benefit of Meta Schedule**
+
+* Succinct syntax, consistent APIs to TensorIR schedule with no other layer of
abstraction.
+* Provides unified APIs for implementing manual schedule, AutoTVM and
AutoScheduler (Ansor).
+* Extensibility to all the schedule primitives, including tensorization and
loop partitioning. Almost no extra effort is needed to use a new primitive in
auto-tuning.
+* The automation infrastructure is customizable across every layer.
Review comment:
I don't understand what is the meaning of `customizable across every
layer`. Could you explain a bit more?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
+* The three systems above have isolated sets of APIs with several layers of
their own abstraction, which are not only hard to learn, but also
engineering-intensive to customize.
+
+**Benefit of Meta Schedule**
+
+* Succinct syntax, consistent APIs to TensorIR schedule with no other layer of
abstraction.
+* Provides unified APIs for implementing manual schedule, AutoTVM and
AutoScheduler (Ansor).
+* Extensibility to all the schedule primitives, including tensorization and
loop partitioning. Almost no extra effort is needed to use a new primitive in
auto-tuning.
+* The automation infrastructure is customizable across every layer.
+
+## 3. Guide-level explanation
+
+In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe and auto-generate the design space.
+
+### 3.1. Manual Schedule
+
+Meta schedule APIs are almost the same as TE or TensorIR scheduling. Here is
an example of a manual schedule for matrix multiplication:
+
+```python
+# Designate a set of tile sizes
+i_tiles = [16, 8, 8, 8]
+j_tiles = [16, 8, 8, 8]
+k_tiles = [256, 8]
+
+# Tile the loops according to the tile sizes
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+
+# Organize the loops into “SSRSRS” 6-level tiles
Review comment:
I think we should add one simple note to explain what is `S` (what is
`spatial`) and `R` (what is `reduce`).
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
Review comment:
to new schedule primitives (tensorize, loop partition, software
pipelining `and so on`).
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
+* The three systems above have isolated sets of APIs with several layers of
their own abstraction, which are not only hard to learn, but also
engineering-intensive to customize.
+
+**Benefit of Meta Schedule**
+
+* Succinct syntax, consistent APIs to TensorIR schedule with no other layer of
abstraction.
+* Provides unified APIs for implementing manual schedule, AutoTVM and
AutoScheduler (Ansor).
+* Extensibility to all the schedule primitives, including tensorization and
loop partitioning. Almost no extra effort is needed to use a new primitive in
auto-tuning.
+* The automation infrastructure is customizable across every layer.
+
+## 3. Guide-level explanation
+
+In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe and auto-generate the design space.
+
+### 3.1. Manual Schedule
+
+Meta schedule APIs are almost the same as TE or TensorIR scheduling. Here is
an example of a manual schedule for matrix multiplication:
+
+```python
+# Designate a set of tile sizes
+i_tiles = [16, 8, 8, 8]
+j_tiles = [16, 8, 8, 8]
+k_tiles = [256, 8]
+
+# Tile the loops according to the tile sizes
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+
+# Organize the loops into “SSRSRS” 6-level tiles
+sch.reorder(
+ i_0, j_0, # S
+ i_1, j_1, # S
+ k_0, # R
+ i_2, j_2, # S
+ k_1, # R
+ i_3, j_3, # S
+)
+```
+
+In this example, the developers may tweak the tile sizes and measure the
performance of the generated kernels to explore the opportunities of potential
optimization.
+
+Generally speaking, while writing a schedule, there are often some parameters
that are hard to determine ahead of time, for example, tile sizes, unroll
steps, or which tensor intrinsics to use. Developers may manually enumerate
possible combinations of these unknown factors, and then pick the best schedule
according to measurement results on their device.
+
+### 3.2. AutoTVM-style Design Space Description
+
+Meta schedule extends the schedule DSL with sampling instructions. When
included in a schedule, these instructions parametrize the schedule from a
single deterministic point to a space supported by random variables (tile size,
etc.), making it possible for developers to describe the design space with meta
schedule APIs.
+
+We can extend the matmul example above to cover all possible tilings using
these sampling instructions:
+
+```python
+# Sample tile sizes
+i_tiles = sch.sample_perfect_tile(i, n=4)
+j_tiles = sch.sample_perfect_tile(j, n=4)
+k_tiles = sch.sample_perfect_tile(k, n=2)
+# Tile the loops according to the random variables
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+# Organize the loops into “SSRSRS” 6-level tiles
+sch.reorder(
+ i_0, j_0, # S
+ i_1, j_1, # S
+ k_0, # R
+ i_2, j_2, # S
+ k_1, # R
+ i_3, j_3, # S
+)
+```
+
+### 3.3. Composite Schedule
+
+Each schedule primitive handles only a very basic operation to transform the
IR, for example, `split` only splits a loop into two. In the real world, the
over-fine granularity of those primitives usually leads to repetitive and
verbose scheduling code, as
[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
by developers in our community.
+
+To counter this challenge, we allow users to register “composite schedules”
that analyze the IR, and apply a set of schedule primitives correspondingly.
For instance, a composite schedule may inspect a TensorIR block and decide
whether we should call `compute_inline` on it. The composite schedule may use
sampling instructions to fill in undecided choices.
+
+Our system also ships with some built-in composite schedules, including:
+
+* Multi-level tiling
+* Inline pure spatial blocks
+* Parallelize & vectorize & unroll
+* Auto tensorize
+* …
+
+### 3.4. AutoScheduler-style Design Space Generation
+
+AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage. Meta schedule treats a search rule as a composite
schedule, and applies each composite schedule to each block of TensorIR to
generate the design space.
+
+### 3.5. Unifying manual schedule / AutoTVM / Ansor
+
+In this section, we show that the design space induced by TE manual schedule,
AutoTVM and Ansor are all subsets of meta schedule, and meta schedule further
allows mixing those three styles to search jointly.
+
+**Manual schedule**. The TE schedule is a special case of a meta schedule
program, where there is no randomness introduced by sampling instructions. It
is a single point in terms of design space.
+
+**AutoTVM (Template-based tuning)**. Writing one or more schedule functions in
meta schedule, potentially with sampling instructions, is a natural
representation of AutoTVM’s schedule templates (knobs). The PPL generates one
or more traces as the design space to explore.
+
+**AutoScheduler (Ansor, Template-free tuning)**. As mentioned in the previous
section, application of composite schedule rules generates the design space,
which is equivalent to Ansor’s sketch generation.
+
+**Mixing styles in design space definition**. By taking union of the spaces
induced by the three special cases, our system allows developers to combine
generic rules that Ansor provides and operator-specific scheduling.
+
+## 4. Reference-level explanation
+
+In this section, we introduce the underlying techniques for the automation
system to extract and explore the design space.
+
+### 4.1. Execution trace as the design space
+
+**Trace**. To represent the design space defined by the meta schedule DSL, the
underlying system records all the instructions users applied to the schedule
class, including sampling and schedule primitives. We call this list of
instructions a trace.
+
+Executing the example above results in the following trace:
+
+```
+Instruction 0. Sample-Perfect-Tile
+Instruction 1. Sample-Perfect-Tile
+Instruction 2. Sample-Perfect-Tile
+Instruction 3. Split
+Instruction 4. Split
+Instruction 5. Split
+Instruction 6. Reorder
+```
+
+The trace is not directly user-facing, but a data structure inside the
user-facing `Schedule` class that records the execution. The automation system
extracts the trace and finds out the design space according to the sampling
instructions.
+
+**Union of traces**. Often a single trace is unable to represent the entire
space. Therefore, more precisely, our system works on a list of traces as the
union of potential design space.
+
+**Fork a trace**. When two different decisions in the scheduling process are
equally important to generate high-performance schedules, we allow forking the
trace into two, and the design space is the union of the forked traces.
Review comment:
Do you mean we will go to two directions at the same time and generate
the design space united two directions?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
+* The three systems above have isolated sets of APIs with several layers of
their own abstraction, which are not only hard to learn, but also
engineering-intensive to customize.
+
+**Benefit of Meta Schedule**
+
+* Succinct syntax, consistent APIs to TensorIR schedule with no other layer of
abstraction.
+* Provides unified APIs for implementing manual schedule, AutoTVM and
AutoScheduler (Ansor).
+* Extensibility to all the schedule primitives, including tensorization and
loop partitioning. Almost no extra effort is needed to use a new primitive in
auto-tuning.
+* The automation infrastructure is customizable across every layer.
+
+## 3. Guide-level explanation
+
+In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe and auto-generate the design space.
+
+### 3.1. Manual Schedule
+
+Meta schedule APIs are almost the same as TE or TensorIR scheduling. Here is
an example of a manual schedule for matrix multiplication:
+
+```python
+# Designate a set of tile sizes
+i_tiles = [16, 8, 8, 8]
+j_tiles = [16, 8, 8, 8]
+k_tiles = [256, 8]
+
+# Tile the loops according to the tile sizes
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+
+# Organize the loops into “SSRSRS” 6-level tiles
+sch.reorder(
+ i_0, j_0, # S
+ i_1, j_1, # S
+ k_0, # R
+ i_2, j_2, # S
+ k_1, # R
+ i_3, j_3, # S
+)
+```
+
+In this example, the developers may tweak the tile sizes and measure the
performance of the generated kernels to explore the opportunities of potential
optimization.
+
+Generally speaking, while writing a schedule, there are often some parameters
that are hard to determine ahead of time, for example, tile sizes, unroll
steps, or which tensor intrinsics to use. Developers may manually enumerate
possible combinations of these unknown factors, and then pick the best schedule
according to measurement results on their device.
+
+### 3.2. AutoTVM-style Design Space Description
+
+Meta schedule extends the schedule DSL with sampling instructions. When
included in a schedule, these instructions parametrize the schedule from a
single deterministic point to a space supported by random variables (tile size,
etc.), making it possible for developers to describe the design space with meta
schedule APIs.
+
+We can extend the matmul example above to cover all possible tilings using
these sampling instructions:
+
+```python
+# Sample tile sizes
+i_tiles = sch.sample_perfect_tile(i, n=4)
+j_tiles = sch.sample_perfect_tile(j, n=4)
+k_tiles = sch.sample_perfect_tile(k, n=2)
+# Tile the loops according to the random variables
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+# Organize the loops into “SSRSRS” 6-level tiles
+sch.reorder(
+ i_0, j_0, # S
+ i_1, j_1, # S
+ k_0, # R
+ i_2, j_2, # S
+ k_1, # R
+ i_3, j_3, # S
+)
+```
+
+### 3.3. Composite Schedule
+
+Each schedule primitive handles only a very basic operation to transform the
IR, for example, `split` only splits a loop into two. In the real world, the
over-fine granularity of those primitives usually leads to repetitive and
verbose scheduling code, as
[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
by developers in our community.
+
+To counter this challenge, we allow users to register “composite schedules”
that analyze the IR, and apply a set of schedule primitives correspondingly.
For instance, a composite schedule may inspect a TensorIR block and decide
whether we should call `compute_inline` on it. The composite schedule may use
sampling instructions to fill in undecided choices.
+
+Our system also ships with some built-in composite schedules, including:
+
+* Multi-level tiling
+* Inline pure spatial blocks
+* Parallelize & vectorize & unroll
+* Auto tensorize
+* …
+
+### 3.4. AutoScheduler-style Design Space Generation
+
+AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage. Meta schedule treats a search rule as a composite
schedule, and applies each composite schedule to each block of TensorIR to
generate the design space.
+
+### 3.5. Unifying manual schedule / AutoTVM / Ansor
+
+In this section, we show that the design space induced by TE manual schedule,
AutoTVM and Ansor are all subsets of meta schedule, and meta schedule further
allows mixing those three styles to search jointly.
+
+**Manual schedule**. The TE schedule is a special case of a meta schedule
program, where there is no randomness introduced by sampling instructions. It
is a single point in terms of design space.
+
+**AutoTVM (Template-based tuning)**. Writing one or more schedule functions in
meta schedule, potentially with sampling instructions, is a natural
representation of AutoTVM’s schedule templates (knobs). The PPL generates one
or more traces as the design space to explore.
Review comment:
What is the meaning of `PPL`?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -0,0 +1,297 @@
+* Feature Name: Meta Schedule (AutoTensorIR)
+* Start Date: 2021-05-28
+* RFC PR: TBD (apache/tvm-rfcs#0000)
+* GitHub Issue: TBD (apache/tvm-rfcs#0000)
+
+## 1. Summary
+
+This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the approaches of AutoTVM and Auto Scheduler (Ansor). Meta
schedule provides a pragmatic way to define the space of automatic tuning,
extensibility in terms of all possible TIR schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
+
+Meta Schedule is our 3rd generation automatic scheduling system.
+
+## 2. Motivation
+
+**Scheduling and Design Space**
+
+In TVM TensorIR, optimization of a TensorIR program is done via a sequence of
transformations. For example, we reorder loops for better locality and we
tensorize for specific hardware intrinsics. The process of invoking such a set
of pre-defined transformations is called “**scheduling**”, and each
transformation is called a “**schedule primitive**”. These primitives form a
domain-specific language (DSL) describing the transformation of TensorIR
programs. **Design space** is the set of all possible schedulings with respect
to a TensorIR program.
+
+**Problems with the Current Scheduling System**
+
+* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives, i.e. explore points in the design space with humans in the
loop. This can be a tedious and error-prone approach, hence the creation of
AutoTVM and AutoScheduler (Ansor).
+* **AutoTVM**: The automation system requires users to define “schedule
templates” as the design space for each operator. Therefore, it is inextensible
to hundreds of operators.
+* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space, according to a set of predefined “search rules”. However, it
is non-trivial to extend AutoScheduler to new schedule primitives (tensorize,
loop partition, software pipelining).
+* The three systems above have isolated sets of APIs with several layers of
their own abstraction, which are not only hard to learn, but also
engineering-intensive to customize.
+
+**Benefit of Meta Schedule**
+
+* Succinct syntax, consistent APIs to TensorIR schedule with no other layer of
abstraction.
+* Provides unified APIs for implementing manual schedule, AutoTVM and
AutoScheduler (Ansor).
+* Extensibility to all the schedule primitives, including tensorization and
loop partitioning. Almost no extra effort is needed to use a new primitive in
auto-tuning.
+* The automation infrastructure is customizable across every layer.
+
+## 3. Guide-level explanation
+
+In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe and auto-generate the design space.
+
+### 3.1. Manual Schedule
+
+Meta schedule APIs are almost the same as TE or TensorIR scheduling. Here is
an example of a manual schedule for matrix multiplication:
+
+```python
+# Designate a set of tile sizes
+i_tiles = [16, 8, 8, 8]
+j_tiles = [16, 8, 8, 8]
+k_tiles = [256, 8]
+
+# Tile the loops according to the tile sizes
+i_0, i_1, i_2, i_3 = sch.split(loop=i, factors=i_tiles)
+j_0, j_1, j_2, j_3 = sch.split(loop=j, factors=j_tiles)
+k_0, k_1 = sch.split(loop=k, factors=k_tiles)
+
+# Organize the loops into “SSRSRS” 6-level tiles
+sch.reorder(
+ i_0, j_0, # S
+ i_1, j_1, # S
+ k_0, # R
+ i_2, j_2, # S
+ k_1, # R
+ i_3, j_3, # S
+)
+```
+
+In this example, the developers may tweak the tile sizes and measure the
performance of the generated kernels to explore the opportunities of potential
optimization.
+
+Generally speaking, while writing a schedule, there are often some parameters
that are hard to determine ahead of time, for example, tile sizes, unroll
steps, or which tensor intrinsics to use. Developers may manually enumerate
possible combinations of these unknown factors, and then pick the best schedule
according to measurement results on their device.
+
+### 3.2. AutoTVM-style Design Space Description
+
+Meta schedule extends the schedule DSL with sampling instructions. When
included in a schedule, these instructions parametrize the schedule from a
single deterministic point to a space supported by random variables (tile size,
etc.), making it possible for developers to describe the design space with meta
schedule APIs.
+
+We can extend the matmul example above to cover all possible tilings using
these sampling instructions:
+
+```python
+# Sample tile sizes
+i_tiles = sch.sample_perfect_tile(i, n=4)
Review comment:
What is the meaning of `perfect` here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]