areusch commented on a change in pull request #5:
URL: https://github.com/apache/tvm-rfcs/pull/5#discussion_r669002230
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -22,36 +22,48 @@
## 1. Summary
-This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the
+This proposal introduces Meta Schedule: a scheduling DSL on TIR that unifies
the
approaches of AutoTVM and Auto Scheduler (Ansor). Meta schedule provides a
pragmatic way to define
the space of automatic tuning, extensibility in terms of all possible TIR
schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
-Meta Schedule is our 3rd generation automatic scheduling system.
+Meta Schedule is the 3rd generation automatic scheduling system.
## 2. Motivation
**Scheduling and design space.** In TVM TensorIR, optimization of a TensorIR
program is done via a
-sequence of transformations. For example, we reorder loops for better locality
and we tensorize for
+sequence of transformations. For example, reordering loops for better locality
and tensorizing for
specific hardware intrinsics. The process of invoking such a set of
pre-defined transformations is
called "**scheduling**", and each transformation is called a "**schedule
primitive**". These
primitives form a domain-specific language (DSL) describing the transformation
of TensorIR programs.
**Design space** is the set of all possible schedulings with respect to a
TensorIR program.
-**Problems with the current scheduling system.** Currently we have 3 sets of
scheduling APIs:
+### Problems with the current scheduling system
+
+Currently there are have 3 sets of scheduling APIs in TVM:
Review comment:
nit: Currently there are 3 sets of
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -64,8 +76,17 @@ primitives form a domain-specific language (DSL) describing
the transformation o
## 3. Guide-level explanation
-In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe
-and auto-generate the design space.
+Meta Schedule DSL is a flexible way to define or auto-generate the design
space.
Review comment:
might say: Meta Schedule DSL is a language that provides TVM backend
integrators a flexible way to define or auto-generate the operator design space.
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -22,36 +22,48 @@
## 1. Summary
-This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the
+This proposal introduces Meta Schedule: a scheduling DSL on TIR that unifies
the
approaches of AutoTVM and Auto Scheduler (Ansor). Meta schedule provides a
pragmatic way to define
the space of automatic tuning, extensibility in terms of all possible TIR
schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
-Meta Schedule is our 3rd generation automatic scheduling system.
+Meta Schedule is the 3rd generation automatic scheduling system.
## 2. Motivation
**Scheduling and design space.** In TVM TensorIR, optimization of a TensorIR
program is done via a
-sequence of transformations. For example, we reorder loops for better locality
and we tensorize for
+sequence of transformations. For example, reordering loops for better locality
and tensorizing for
specific hardware intrinsics. The process of invoking such a set of
pre-defined transformations is
called "**scheduling**", and each transformation is called a "**schedule
primitive**". These
primitives form a domain-specific language (DSL) describing the transformation
of TensorIR programs.
**Design space** is the set of all possible schedulings with respect to a
TensorIR program.
-**Problems with the current scheduling system.** Currently we have 3 sets of
scheduling APIs:
+### Problems with the current scheduling system
+
+Currently there are have 3 sets of scheduling APIs in TVM:
* **Manual schedule**: Developers optimize their programs by manually invoking
schedule primitives,
i.e. explore points in the design space with humans in the loop. This can be
a tedious and
error-prone approach, hence the creation of AutoTVM and AutoScheduler
(Ansor).
-* **AutoTVM**: The automation system requires users to define "schedule
templates" as the design
- space for each operator. Therefore, it is inextensible to hundreds of
operators and variety
+* **AutoTVM**: The automation system requires users to define the design space
through
+ per-operator "schedule templates." Therefore, programmer time is a
bottleneck in scaling
+ to hundreds of operators across many hardware platforms.
hardware platforms.
* **AutoScheduler (Ansor)**: It automatically generates schedule templates as
the design space,
according to a set of predefined "search rules". However, it is non-trivial
to extend
AutoScheduler to new schedule primitives (tensorize, loop partition,
software pipelining, etc).
* The three systems above have isolated sets of APIs with several layers of
their own abstraction,
which are not only hard to learn, but also engineering-intensive to
customize.
-**Benefits of Meta Schedule.** Meta schedule provides:
+### Benefits of Meta Schedule
+
+The existing three scheduling systems are mutually incompatible with each
other in terms of API
+design and divergence: besides manual TE scheduling, AutoTVM requires users to
learn a new set of
+APIs, and AutoScheduler brings in another set of C++-based search rules. It
adds the users' mental
+overhead to understand and extend the existing systems. Further, the inability
to switch between
+template-based and template-free auto-tuning could lead to inferior
customizability and hence worse
Review comment:
or even: and hence make it needlessly difficult to achieve optimal
performance.
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -64,8 +76,17 @@ primitives form a domain-specific language (DSL) describing
the transformation o
## 3. Guide-level explanation
-In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe
-and auto-generate the design space.
+Meta Schedule DSL is a flexible way to define or auto-generate the design
space.
+
+This section introduces its syntax of meta schedule DSL and usage in terms of
describing and
+auto-generating the design space, more specifically, its APIs for:
+1) Manually constructing a schedule using existing schedule primitives
(Section 3.1);
+2) Defining composite schedule to simplify the ap sequence of schedule
primitives (Section 3.2);
+3) Describing a design space of possible schedules,
+a.k.a. AutoTVM-style schedule templates (Section 3.3);
+4) Automatically generating the design space, a.k.a. Ansor-style search rules
(Section 3.4);
Review comment:
I would stick to one name here--so suggest replacing Ansor with
AutoScheduler even though Ansor originated the idea. You can explain elsewhere
that AutoScheduler was derived from Ansor.
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -64,8 +76,17 @@ primitives form a domain-specific language (DSL) describing
the transformation o
## 3. Guide-level explanation
-In this section, we describe the syntax of meta schedule DSL, and how it could
be used to describe
-and auto-generate the design space.
+Meta Schedule DSL is a flexible way to define or auto-generate the design
space.
+
+This section introduces its syntax of meta schedule DSL and usage in terms of
describing and
+auto-generating the design space, more specifically, its APIs for:
Review comment:
```suggestion
This section introduces the syntax of Meta Schedule DSL by describing the 5
common usage patterns envisioned by this RFC. These patterns are:
```
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -22,36 +22,48 @@
## 1. Summary
-This proposal introduces Meta Schedule: a probabilistic scheduling DSL on TIR
that unifies the
+This proposal introduces Meta Schedule: a scheduling DSL on TIR that unifies
the
approaches of AutoTVM and Auto Scheduler (Ansor). Meta schedule provides a
pragmatic way to define
the space of automatic tuning, extensibility in terms of all possible TIR
schedule primitives like
tensorization and loop partitioning, and customizability on every layer of the
automation system.
-Meta Schedule is our 3rd generation automatic scheduling system.
+Meta Schedule is the 3rd generation automatic scheduling system.
## 2. Motivation
**Scheduling and design space.** In TVM TensorIR, optimization of a TensorIR
program is done via a
-sequence of transformations. For example, we reorder loops for better locality
and we tensorize for
+sequence of transformations. For example, reordering loops for better locality
and tensorizing for
specific hardware intrinsics. The process of invoking such a set of
pre-defined transformations is
called "**scheduling**", and each transformation is called a "**schedule
primitive**". These
Review comment:
I would introduce this first as a general concept, because a key detail
here is that all 3 scheduling APIs aim to achieve the same thing: translating a
Relay subgraph into a TIR subgraph or PrimFunc. It would also help to note that
TVM's approach to workload-specific optimization is to represent such
optimizations in TensorIR.
It's not generally true that TensorIR is always optimized by a set of TIR
transformations. Only with AutoScheduler is this true, correct? With AutoTVM,
TensorIR is merely templated. That's a key benefit of the AutoScheduler APIs
that you could introduce later.
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -107,29 +128,64 @@ best schedule according to measurement results on their
device.
As introduced in the previous section, in TensorIR, each schedule primitive
handles only a very
basic transformation of the IR. For example, `split` only splits a loop into
two new loops. In the
real world, the over-fine granularity of those primitives usually leads to
repetitive and verbose
-scheduling code, as
-[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
-by developers in our community.
+scheduling code. Take the code snippet in the previous section as an example,
a sequence of `split`s
+are invoked, followed by a `reorder`, and all these together are called
"SSRSRS" tiling.
Review comment:
yeah i like this better, tweaking it a bit:
```suggestion
scheduling code. Take the code snippet in the previous section as an
example: a sequence of `split`s
are invoked, followed by a `reorder`. Taken together these 4 primitives are
colloquially known as "SSRSRS" tiling.
```
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -107,29 +128,64 @@ best schedule according to measurement results on their
device.
As introduced in the previous section, in TensorIR, each schedule primitive
handles only a very
basic transformation of the IR. For example, `split` only splits a loop into
two new loops. In the
real world, the over-fine granularity of those primitives usually leads to
repetitive and verbose
-scheduling code, as
-[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
-by developers in our community.
+scheduling code. Take the code snippet in the previous section as an example,
a sequence of `split`s
+are invoked, followed by a `reorder`, and all these together are called
"SSRSRS" tiling.
+
+To make it more convenient and modular, users are allowed to register
**composite schedules** that apply
+a sequence of schedule primitives according to certain analysis of the IR. The
word *composite* here
+is used against the word *primitive*, which means it is a transformation
*composed* of those
+*primitives*.
+
+For example, suppose there is a composite schedule called
`Inline-All-Elementwise-Operations`, which
Review comment:
could you explain the parameters of `Inline-All-Elementwise-Operations`,
since you use it below in code? Also, can you list at a high level the schedule
primitives that are composed here? I think that's the unclear bit here and the
rest is good.
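To make the question concrete, here is one possible reading of what `Inline-All-Elementwise-Operations` composes: it walks the blocks and applies the `compute_inline` primitive to every elementwise block except the output block. A minimal runnable mock in plain Python; the `Block`/`Schedule` classes here are illustrative stand-ins, not TVM's actual `tir.Schedule` API:

```python
# Toy mock of a composite schedule: composed of repeated `compute_inline`.
class Block:
    def __init__(self, name, expr, is_elementwise, consumer=None):
        self.name = name
        self.expr = expr
        self.is_elementwise = is_elementwise
        self.consumer = consumer  # name of the block reading this block's output

class Schedule:
    def __init__(self, blocks):
        self.blocks = {b.name: b for b in blocks}

    def compute_inline(self, name):
        """Primitive: fold a block's expression into its consumer."""
        block = self.blocks.pop(name)
        consumer = self.blocks[block.consumer]
        consumer.expr = consumer.expr.replace(block.name, f"({block.expr})")

class InlineAllElementwiseOperations:
    """Composite schedule: apply `compute_inline` to every elementwise
    block except the designated output block."""
    def apply(self, sch, output_block):
        for name in list(sch.blocks):
            if name != output_block and sch.blocks[name].is_elementwise:
                sch.compute_inline(name)

# Mirrors example_func in the diff: B = A+1, C = B+1, D = C+1.
sch = Schedule([
    Block("B", "A + 1", True, consumer="C"),
    Block("C", "B + 1", True, consumer="D"),
    Block("D", "C + 1", True),
])
InlineAllElementwiseOperations().apply(sch, output_block="D")
print(sch.blocks["D"].expr)  # ((A + 1) + 1) + 1
```

So the parameters would plausibly be the schedule plus the block at which inlining stops, matching the `apply(sch, sch.get_block("D"))` call in the diff.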
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -107,29 +128,64 @@ best schedule according to measurement results on their
device.
As introduced in the previous section, in TensorIR, each schedule primitive
handles only a very
basic transformation of the IR. For example, `split` only splits a loop into
two new loops. In the
real world, the over-fine granularity of those primitives usually leads to
repetitive and verbose
-scheduling code, as
-[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
-by developers in our community.
+scheduling code. Take the code snippet in the previous section as an example,
a sequence of `split`s
+are invoked, followed by a `reorder`, and all these together are called
"SSRSRS" tiling.
+
+To make it more convenient and modular, users are allowed to register
**composite schedules** that apply
+a sequence of schedule primitives according to certain analysis of the IR. The
word *composite* here
+is used against the word *primitive*, which means it is a transformation
*composed* of those
Review comment:
follow-up from
https://github.com/apache/tvm-rfcs/pull/5#discussion_r666310349 (GH won't allow
me to continue the thread in review)
in the previous sentence I think it's clear you're using *composite* and
*primitive* together, so i might phrase this more like:
"The word **composite** here means the schedule transformation is *composed*
of those **primitives**"
(use different emphases for "composed" than you use for "composite" and
"primitives" since the latter two are definitions and the former is emphasis)
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -147,17 +203,20 @@ sch.reorder(
### 3.4. AutoScheduler-style Design Space Generation
-AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage.
-SearchRule analyzes TE and eagerly trigger schedule primitives accordingly in
its internally
-maintained mini IR.
+To generate design space, AutoScheduler (Ansor) applies a set of rules to each
TE stage.
Review comment:
could you explain what a TE stage is?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -147,17 +203,20 @@ sch.reorder(
### 3.4. AutoScheduler-style Design Space Generation
-AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage.
-SearchRule analyzes TE and eagerly trigger schedule primitives accordingly in
its internally
-maintained mini IR.
+To generate design space, AutoScheduler (Ansor) applies a set of rules to each
TE stage.
+The rules analyze the TE operations and apply an internal DSL to manipulating
its internal IR,
+which is in the end mapped to TE schedule primitives. This process is called
*sketch generation*.
-As introduced in Section 3.2, composite schedule rules are equivalent to
AutoScheduler's SearchRule
-in TensorIR scheduling. To further generate a design space for scheduling,
sampling instructions are
-used in composite schedule rules. Similarly, the sketch generation phase in
AutoScheduler is
-equivalent to applying composite schedule rules to each block in TensorIR.
+Composite schedule rules work in a similar way scheduling TensorIR, as
introduced in Section 3.2.
+It analyzes the TensorIR and apply schedule primitives directly to TensorIR
accordingly.
+When applying such rules to each TensorIR block in certain order (e.g.
Post-DFS Order),
+it generates a sequence of schedule primitives.
+If the sampling instructions are present in this sequence,
Review comment:
nit: delete "the"
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -147,17 +203,20 @@ sch.reorder(
### 3.4. AutoScheduler-style Design Space Generation
-AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage.
-SearchRule analyzes TE and eagerly trigger schedule primitives accordingly in
its internally
-maintained mini IR.
+To generate design space, AutoScheduler (Ansor) applies a set of rules to each
TE stage.
+The rules analyze the TE operations and apply an internal DSL to manipulating
its internal IR,
+which is in the end mapped to TE schedule primitives. This process is called
*sketch generation*.
-As introduced in Section 3.2, composite schedule rules are equivalent to
AutoScheduler's SearchRule
-in TensorIR scheduling. To further generate a design space for scheduling,
sampling instructions are
-used in composite schedule rules. Similarly, the sketch generation phase in
AutoScheduler is
-equivalent to applying composite schedule rules to each block in TensorIR.
+Composite schedule rules work in a similar way scheduling TensorIR, as
introduced in Section 3.2.
+It analyzes the TensorIR and apply schedule primitives directly to TensorIR
accordingly.
+When applying such rules to each TensorIR block in certain order (e.g.
Post-DFS Order),
Review comment:
When would it be applied in any other order? If never, could you just
state: "A composite schedule rule inspects a given TensorIR fragment and
applies a sequence of schedule primitives to transform the TensorIR. Composite
schedule rules are always applied in post-DFS order"
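To sketch what "applied in post-DFS order" means mechanically, here is a small runnable illustration in plain Python (the `Block` tree and `demo_rule` are hypothetical, not TVM structures): children are visited before their parent, and each rule application emits primitives into a trace.

```python
# Illustrative sketch: apply a composite schedule rule to every block
# of a TensorIR-like block tree in post-DFS order.
class Block:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def post_dfs(block):
    """Yield blocks in post-DFS order: children before their parent."""
    for child in block.children:
        yield from post_dfs(child)
    yield block

def apply_rule(root, rule):
    """Apply `rule` to each block; concatenate the primitives it emits."""
    trace = []
    for block in post_dfs(root):
        trace.extend(rule(block))
    return trace

# A hypothetical rule: tile leaf blocks, fuse the others.
def demo_rule(block):
    if not block.children:
        return [f"tile({block.name})"]
    return [f"fuse({block.name})"]

root = Block("root", [Block("B"), Block("C", [Block("D")])])
print(apply_rule(root, demo_rule))  # ['tile(B)', 'tile(D)', 'fuse(C)', 'fuse(root)']
```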
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -147,17 +203,20 @@ sch.reorder(
### 3.4. AutoScheduler-style Design Space Generation
-AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage.
-SearchRule analyzes TE and eagerly trigger schedule primitives accordingly in
its internally
-maintained mini IR.
+To generate design space, AutoScheduler (Ansor) applies a set of rules to each
TE stage.
+The rules analyze the TE operations and apply an internal DSL to manipulating
its internal IR,
+which is in the end mapped to TE schedule primitives. This process is called
*sketch generation*.
-As introduced in Section 3.2, composite schedule rules are equivalent to
AutoScheduler's SearchRule
-in TensorIR scheduling. To further generate a design space for scheduling,
sampling instructions are
-used in composite schedule rules. Similarly, the sketch generation phase in
AutoScheduler is
-equivalent to applying composite schedule rules to each block in TensorIR.
+Composite schedule rules work in a similar way scheduling TensorIR, as
introduced in Section 3.2.
+It analyzes the TensorIR and apply schedule primitives directly to TensorIR
accordingly.
+When applying such rules to each TensorIR block in certain order (e.g.
Post-DFS Order),
+it generates a sequence of schedule primitives.
+If the sampling instructions are present in this sequence,
+the support of the probability space form a design space of possible
schedulings.
Review comment:
AutoScheduler further explores the design space defined by those
sampling instructions.
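To illustrate how "the support of the probability space forms a design space": if each sampling instruction in a trace carries a finite candidate set, the cross product of those sets is the design space the search explores. A rough runnable sketch (the trace encoding and `sample_categorical` name here are assumptions, not Meta Schedule's actual format):

```python
# Sketch: enumerate the design space spanned by sampling instructions.
from itertools import product

trace = [
    ("sample_categorical", "tile_i", [1, 2, 4]),  # sampling instructions
    ("sample_categorical", "tile_j", [1, 2, 4]),
    ("split", "loop_i", "tile_i"),                # deterministic primitives
    ("split", "loop_j", "tile_j"),
]

def design_space(trace):
    """Enumerate every assignment of decisions to sampling instructions."""
    samples = [inst for inst in trace if inst[0].startswith("sample_")]
    names = [inst[1] for inst in samples]
    for choice in product(*(inst[2] for inst in samples)):
        yield dict(zip(names, choice))

points = list(design_space(trace))
print(len(points))  # 3 * 3 = 9 points
```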
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -266,55 +322,83 @@ sch.reorder(l14, l18, l15, l19, l22, l16, l20, l23, l17,
l21)
### 4.2. Exploring the Design Space
Meta Schedule provides several built-in exploration strategies to exhaustively
or efficiently search
-for efficient schedules.
-
-**Random search by replaying schedule functions.** With a user-provided
schedule function
-as a black-box design space generator, our system could repetitively invoke
such an opaque function
-without doing any extra analysis. The function could be written in C++ or
Python, or any language
-that implements packed function FFI. If sampling instructions are present in
the function, then each
-invocation results in a different IRModule after being scheduled because the
random decisions are
-possibly changed across different runs. Effectively, it is equivalent to
-random exploration without trace, allowing the flexibility for users to define
arbitrary functions
+for efficient schedules. Those strategies are mostly supported by re-execute
either a function or a
+trace with a builtin interpreter in meta schedule, and this process is called
**replay**.
+
+#### Random search by replaying schedule functions
+
+With a user-provided schedule function
+as a black-box design space generator, meta schedule could repetitively
invokes such an opaque TVM
Review comment:
remove "could" here--state exactly what happens
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -266,55 +322,83 @@ sch.reorder(l14, l18, l15, l19, l22, l16, l20, l23, l17,
l21)
### 4.2. Exploring the Design Space
Meta Schedule provides several built-in exploration strategies to exhaustively
or efficiently search
-for efficient schedules.
-
-**Random search by replaying schedule functions.** With a user-provided
schedule function
-as a black-box design space generator, our system could repetitively invoke
such an opaque function
-without doing any extra analysis. The function could be written in C++ or
Python, or any language
-that implements packed function FFI. If sampling instructions are present in
the function, then each
-invocation results in a different IRModule after being scheduled because the
random decisions are
-possibly changed across different runs. Effectively, it is equivalent to
-random exploration without trace, allowing the flexibility for users to define
arbitrary functions
+for efficient schedules. Those strategies are mostly supported by re-execute
either a function or a
+trace with a builtin interpreter in meta schedule, and this process is called
**replay**.
+
+#### Random search by replaying schedule functions
+
+With a user-provided schedule function
+as a black-box design space generator, meta schedule could repetitively
invokes such an opaque TVM
+packed function without doing any extra analysis.
+If sampling instructions are present in the trace, then scheduling is
non-deterministic
+(random decisions may not be repeated across runs)
Review comment:
follow-up from
https://github.com/apache/tvm-rfcs/pull/5#discussion_r668395363
this seems bad if you want to reproduce. is there a way to do that by
supplying the trace, rather than manually passing in the decisions as a list?
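One way the reproducibility concern raised here could be addressed, sketched in plain Python (the `run_schedule` function and decision dict are hypothetical, not the proposed API): record every random decision alongside the trace, then re-run with the recorded decisions to rebuild the identical schedule.

```python
# Sketch: deterministic replay by supplying recorded decisions.
import random

def run_schedule(decisions=None, rng=None):
    """Apply a toy schedule; sample a tile size unless a decision is given."""
    trace = []
    if decisions is not None:
        tile = decisions["tile"]        # replay the recorded decision
    else:
        tile = rng.choice([1, 2, 4, 8])  # fresh random decision
    trace.append(("sample_tile", tile))
    trace.append(("split", "i", tile))
    return trace, {"tile": tile}

rng = random.Random(0)
trace1, decisions = run_schedule(rng=rng)      # non-deterministic run
trace2, _ = run_schedule(decisions=decisions)  # replay with decisions
print(trace1 == trace2)  # True: replay reproduces the schedule exactly
```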
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -147,17 +203,20 @@ sch.reorder(
### 3.4. AutoScheduler-style Design Space Generation
-AutoScheduler (Ansor) generates schedule templates by applying their
SearchRules to each stage.
-SearchRule analyzes TE and eagerly trigger schedule primitives accordingly in
its internally
-maintained mini IR.
+To generate design space, AutoScheduler (Ansor) applies a set of rules to each
TE stage.
+The rules analyze the TE operations and apply an internal DSL to manipulating
its internal IR,
Review comment:
is the DSL also internal or just the IR?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -107,29 +128,64 @@ best schedule according to measurement results on their
device.
As introduced in the previous section, in TensorIR, each schedule primitive
handles only a very
basic transformation of the IR. For example, `split` only splits a loop into
two new loops. In the
real world, the over-fine granularity of those primitives usually leads to
repetitive and verbose
-scheduling code, as
-[mentioned](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/43?u=junrushao1994)
-by developers in our community.
+scheduling code. Take the code snippet in the previous section as an example,
a sequence of `split`s
+are invoked, followed by a `reorder`, and all these together are called
"SSRSRS" tiling.
+
+To make it more convenient and modular, users are allowed to register
**composite schedules** that apply
+a sequence of schedule primitives according to certain analysis of the IR. The
word *composite* here
+is used against the word *primitive*, which means it is a transformation
*composed* of those
+*primitives*.
+
+For example, suppose there is a composite schedule called
`Inline-All-Elementwise-Operations`, which
+inlines all the elementwise computation into their consumers. Applying it to
the following TensorIR:
+
+```python
[email protected]
+def example_func(...):
+ for i, j in ...:
+ with tir.Block("B") ...:
+ B[i, j] = A[i, j] + 1
+ for i, j in ...:
+ with tir.Block("C") ...:
+ C[i, j] = B[i, j] + 1
+ for i, j in ...:
+ with tir.Block("D") ...:
+ D[i, j] = C[i, j] + 1
+
+sch = tir.Schedule(example_func)
+InlineAllElementwiseOperations().apply(sch, sch.get_block("D"))
+print(tvm.script.asscript(sch.mod))
+```
+
+The result after applying the composite schedule is:
-To make it more convenient and modular, we allow users to register "composite
schedules" that apply
-a sequence of schedule primitives according to certain analysis of the IR. For
instance, a composite
-schedule may inspect a TensorIR block and decide whether we should call
`compute_inline` on it.
+```python
[email protected]
+def example_func(...):
+ for i, j in ...:
+ with tir.Block("D") ...:
+ D[i, j] = A[i, j] + 1 + 1 + 1
+```
### 3.3. AutoTVM-style Design Space Description
-Meta schedule extends the schedule DSL with sampling instructions. When
included in a schedule,
-these instructions parametrize the schedule from a single deterministic point
to a space supported
-by random variables (tile size, etc.), making it possible for developers to
describe the design
-space with meta schedule APIs.
+Meta schedule extends the schedule DSL with a set of new schedule primitives
with randomness,
+called **sampling instructions**. These primitives do not transform the
TensorIR,
+but instead will generate random decisions from specific distributions in each
run,
Review comment:
i started with a clarifying suggestion but wound up with a question. i
think in general the technique is to describe a finite space and then select an
element using statistical distributions, but would be great to clarify.
```suggestion
Meta schedule extends the schedule DSL with a set of new schedule primitives
called **sampling instructions**. These primitives do not transform the
TensorIR,
but instead introduce random statistical variables which can be referenced
later in scheduling
to parameterize the schedule. Incorporating **sampling instructions** into an
operator's schedule
allows the backend integrator to succinctly describe a design space in terms
of <explain how statistical distributions can be used to parameterize an
integral quantity like tile size here>.
```
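On the question in the suggestion above, one concrete reading of "describe a finite space, then select an element" is a tile-size sampler: first enumerate every factor pair of a loop extent (the finite space), then draw one pair from it. A runnable toy version (the name `sample_perfect_tile` and its behavior are assumptions for illustration):

```python
# Sketch: a sampling instruction over the factor pairs of a loop extent.
import random

def sample_perfect_tile(extent, rng):
    """Enumerate all (outer, inner) factorizations of `extent`, pick one."""
    space = [(f, extent // f) for f in range(1, extent + 1) if extent % f == 0]
    decision = rng.choice(space)  # the sampled point, recorded in the trace
    return space, decision

rng = random.Random(42)
space, (outer, inner) = sample_perfect_tile(12, rng)
print(space)  # [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)]
assert outer * inner == 12
```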
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -211,7 +267,7 @@ explore the design space. The figure below briefly
illustrates the workflow of t
**Trace**. To represent the design space defined by the meta schedule DSL, the
underlying system
records all the instructions users applied to the schedule class, including
sampling and schedule
-primitives. We call this list of instructions a trace.
+primitives. This list of instructions a trace is called a trace.
Review comment:
clarify
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -266,55 +322,83 @@ sch.reorder(l14, l18, l15, l19, l22, l16, l20, l23, l17,
l21)
### 4.2. Exploring the Design Space
Meta Schedule provides several built-in exploration strategies to exhaustively
or efficiently search
-for efficient schedules.
-
-**Random search by replaying schedule functions.** With a user-provided
schedule function
-as a black-box design space generator, our system could repetitively invoke
such an opaque function
-without doing any extra analysis. The function could be written in C++ or
Python, or any language
-that implements packed function FFI. If sampling instructions are present in
the function, then each
-invocation results in a different IRModule after being scheduled because the
random decisions are
-possibly changed across different runs. Effectively, it is equivalent to
-random exploration without trace, allowing the flexibility for users to define
arbitrary functions
+for efficient schedules. Those strategies are mostly supported by re-execute
either a function or a
+trace with a builtin interpreter in meta schedule, and this process is called
**replay**.
+
+#### Random search by replaying schedule functions
+
+With a user-provided schedule function
+as a black-box design space generator, meta schedule could repetitively
invokes such an opaque TVM
+packed function without doing any extra analysis.
+If sampling instructions are present in the trace, then scheduling is
non-deterministic
+(random decisions may not be repeated across runs)
+Effectively, it is equivalent to random exploration without trace,
+allowing the flexibility for users to define arbitrary functions
that trace may not well support (e.g. control flow divergence based on the
value of intermediate
random variables), but it forbids future opportunity of any trace-based
analysis.
-**Random search by replaying traces.** Traces are obtained from a design space
generator, and
-replayed with a builtin interpreter in our system. If sampling instructions
are present on the
-traces, then their random decisions are mutated during each replay, i.e. jumps
to a new point in the
+#### Random search by replaying traces
+
+A builtin interpreter directly replays the traces obtained
+from manual schedule, template-based or template-free design space generation.
+If sampling instructions are present on the traces,
+then their random decisions are mutated during each replay, i.e. jumps to a
new point in the
design space. Therefore, repetitive replay of those traces are equivalent to
exploration of the
-design space. Our system could potentially benefit from trace-based analysis,
including rejecting
-obviously invalid schedules (e.g. using too much CUDA resources), doing
dead-code elimination to
-simplify a trace, extracting trace-based features used in the cost model, etc.
+design space. meta schedule could potentially benefit from trace-based
analysis, making the search more
+efficient, including rejecting obviously invalid schedules (e.g. using too
much CUDA resources),
+doing dead-code elimination to simplify a trace, extracting trace-based
features used in the cost
+model, etc.
-**Cost-model-guided evolutionary search**. A more efficient exploration
strategy. We define two sets
-of rules:
+#### Cost-model-guided evolutionary search
+
+A more efficient exploration strategy, introduced in the Ansor.
Review comment:
which section is "the Ansor"?
##########
File path: rfcs/0001-meta-schedule-autotensorir.md
##########
@@ -266,55 +322,83 @@ sch.reorder(l14, l18, l15, l19, l22, l16, l20, l23, l17,
l21)
### 4.2. Exploring the Design Space
Meta Schedule provides several built-in exploration strategies to exhaustively
or efficiently search
-for efficient schedules.
-
-**Random search by replaying schedule functions.** With a user-provided
schedule function
-as a black-box design space generator, our system could repetitively invoke
such an opaque function
-without doing any extra analysis. The function could be written in C++ or
Python, or any language
-that implements packed function FFI. If sampling instructions are present in
the function, then each
-invocation results in a different IRModule after being scheduled because the
random decisions are
-possibly changed across different runs. Effectively, it is equivalent to
-random exploration without trace, allowing the flexibility for users to define
arbitrary functions
+for efficient schedules. Those strategies are mostly supported by re-execute
either a function or a
+trace with a builtin interpreter in meta schedule, and this process is called
**replay**.
+
+#### Random search by replaying schedule functions
+
+With a user-provided schedule function
+as a black-box design space generator, meta schedule could repetitively
invokes such an opaque TVM
+packed function without doing any extra analysis.
+If sampling instructions are present in the trace, then scheduling is
non-deterministic
+(random decisions may not be repeated across runs)
+Effectively, it is equivalent to random exploration without trace,
+allowing the flexibility for users to define arbitrary functions
that trace may not well support (e.g. control flow divergence based on the
value of intermediate
random variables), but it forbids future opportunity of any trace-based
analysis.
-**Random search by replaying traces.** Traces are obtained from a design space
generator, and
-replayed with a builtin interpreter in our system. If sampling instructions
are present on the
-traces, then their random decisions are mutated during each replay, i.e. jumps
to a new point in the
+#### Random search by replaying traces
+
+A builtin interpreter directly replays the traces obtained
+from manual schedule, template-based or template-free design space generation.
+If sampling instructions are present on the traces,
+then their random decisions are mutated during each replay, i.e. jumps to a
new point in the
design space. Therefore, repetitive replay of those traces are equivalent to
exploration of the
-design space. Our system could potentially benefit from trace-based analysis,
including rejecting
-obviously invalid schedules (e.g. using too much CUDA resources), doing
dead-code elimination to
-simplify a trace, extracting trace-based features used in the cost model, etc.
+design space. meta schedule could potentially benefit from trace-based
analysis, making the search more
Review comment:
rather than saying "trace-based analysis," maybe a better way to phrase
it is "The Meta Schedule search rate could be improved by allowing traces to be
analyzed before they are run. For example, trace analysis could reject
obviously-invalid schedules (e.g. using too many CUDA resources), ... before
they are run."
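A toy version of the pre-run filtering suggested here, in plain Python (the trace encoding and the thread-count heuristic are assumptions for illustration): estimate a resource cost from the trace alone and drop candidates over budget, so only plausible schedules are ever built and measured.

```python
# Sketch: reject obviously-invalid traces before building or running them.
def estimated_threads(trace):
    """Multiply the extents of all thread-bound splits in the trace."""
    threads = 1
    for name, *args in trace:
        if name == "bind_thread":
            threads *= args[0]
    return threads

MAX_THREADS = 1024  # assumed per-block CUDA thread limit, for illustration

candidates = [
    [("split", 32), ("bind_thread", 32), ("bind_thread", 16)],  # 512 threads
    [("split", 64), ("bind_thread", 64), ("bind_thread", 32)],  # 2048 threads
]
valid = [t for t in candidates if estimated_threads(t) <= MAX_THREADS]
print(len(valid))  # 1: the 2048-thread trace is rejected before it runs
```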
--
This is an automated message from the Apache Git Service.