merrymercy commented on a change in pull request #5962:
URL: https://github.com/apache/incubator-tvm/pull/5962#discussion_r449730879



##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.

Review comment:
       ```suggestion
           The ComputeDAG for the compute declaration.
   ```
   Do not overuse `target` because `target` already has its meaning (i.e. the 
hardware target)

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).

Review comment:
       ```suggestion
       We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
       subgraph) to a ComputeDAG. It keeps the input/output tensors of the 
compute declaration,
       a list of all operations in the DAG as well as static analysis results 
for the DAG (e.g. the total float operation count, consumer/producer relations 
of each operation 
       stage, whether an operation stage should be tiled/compute inlined ...). 
These analyses can
       help the search policy to make decisions during search process.
   
       ComputeDAG is also responsible for the interaction between Ansor 
`LoopState` and TVM schedule
       (e.g. applying the `LoopState` transform steps to TVM schedule, 
providing `LoopState` with extra
       information got from TVM schedule ...).
   ```

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.

Review comment:
       ```suggestion
        Apply the history transform steps of a State to get a TVM schedule.
   ```

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.

Review comment:
       ```suggestion
           The workload key for the compute declaration.
   ```
   Do not overuse `target` because `target` already has its meaning (i.e. the 
hardware target)

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but to extend a new structure on it is 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""

Review comment:
       ```suggestion
   The definition of the "state" in search.
   
   Each LoopState corresponds to a schedule for its ComputeDAG.
   A LoopState consists of: 1. a current loop structure; 2. a list of 
transformation steps used to
   construct the loop structure.
   The loop structure keeps a preview of how the schedule will finally look 
like after lowering the
   current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
   During the schedule search process, the loop structure can provide search 
policy with necessary
   information on how to manipulate the current state.
   The transform history is a sequence of `TransformStep` which will finally be 
mapped to TVM schedule
   primitives. The steps can also be used for the serialization of a state.
   
   The LoopState can be seen as a lightweight loop structure IR specifically 
for schedule search.
   We don't use the existing TVM IR but to extend a new structure on it is 
because:
   1. We want fast incremental change to the loop structures. The search policy 
needs to get the immediate
   loop structures update rather than after TVM lowering;
   2. We want serializable transform history for replay, backtracking, and 
mutation;
   3. We may create some macro schedule primitives that represent the 
combination of several
   TVM schedule primitives.
   
   When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
   Since we share a lot of common objects during search, the transformation is 
implemented in
   copy on write style. All objects are immutable, which is similar to TVM IR.
   """
   ```

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but to extend a new structure on it is 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.
+
+    Parameters
+    ----------
+    state_object : StateObject
+        The target StateObject, corresponding to C++ internal State object.
+    dag : ComputeDAG
+        The original target ComputeDAG of this State.
+
+    Notes
+    -----
+    This is a wrapper class of StateObject to deal with copy-on-write property
+    """
+    def __init__(self, state_object, dag):
+        self.state_object = state_object
+        self.compute_dag = dag
+
+        self.stages_cache = None  # A list to cache all stages
+        self.stage_id_map = {}    # A dict maps operation to stage id
+        self._update_stage_id_map()
+
+    @property
+    def stages(self):
+        """
+        Returns
+        -------
+        stages : List[Stage]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return self.stages_cache
+
+    @property
+    def stage_ops(self):
+        """
+        Returns
+        -------
+        ops: List[Operation]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return [stage.op for stage in self.stages_cache]
+
+    def reorder(self, stage, order):
+        """ Schedule primitive corresponds to te.reorder.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be reordered, can be a Stage order index, 
Stage operation or stage

Review comment:
       ```suggestion
               The Stage to be reordered, which can be a Stage order index, 
Stage operation or stage
   ```
   Do not overuse `target`. It already has its own meaning in `compute_at` 
(i.e., the target of `compute_at`)

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.

Review comment:
       ```suggestion
       the init state of input ComputeDAG.
   ```
   Do not overuse `target` because `target` already has its meaning (i.e. the 
hardware target)

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but to extend a new structure on it is 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.
+
+    Parameters
+    ----------
+    state_object : StateObject
+        The target StateObject, corresponding to C++ internal State object.
+    dag : ComputeDAG
+        The original target ComputeDAG of this State.
+
+    Notes
+    -----
+    This is a wrapper class of StateObject to deal with copy-on-write property
+    """
+    def __init__(self, state_object, dag):
+        self.state_object = state_object
+        self.compute_dag = dag
+
+        self.stages_cache = None  # A list to cache all stages
+        self.stage_id_map = {}    # A dict maps operation to stage id
+        self._update_stage_id_map()
+
+    @property
+    def stages(self):
+        """
+        Returns
+        -------
+        stages : List[Stage]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return self.stages_cache
+
+    @property
+    def stage_ops(self):
+        """
+        Returns
+        -------
+        ops: List[Operation]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return [stage.op for stage in self.stages_cache]
+
+    def reorder(self, stage, order):
+        """ Schedule primitive corresponds to te.reorder.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be reordered, can be a Stage order index, 
Stage operation or stage
+            output tensor.
+        order : List[Iterator]
+            Iterators in the expected order
+        """
+        stage_id = self._resolve_stage_id(stage)
+
+        self.state_object = _ffi_api.StateReorder(self.state_object, stage_id, 
order)
+        self._clear_cache()
+
+    def split(self, stage, iterator, lengths, inner_to_outer=True):
+        """ Schedule primitive corresponds to te.split.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be split, can be a Stage order index, Stage 
operation or stage

Review comment:
       ```suggestion
                The Stage to be split, which can be a Stage order index, Stage 
operation or stage
   ```

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.
+    early_stopping: int = -1
+      Stops early the tuning if no improvement get after n measurements.
+    num_measures_per_round: int = 64
+      The number of programs to be measured at each search round.
+      The whole schedule search process is designed to have several rounds to 
try a total
+      `num_measure_trials` schedules.
+      We have: `num_search_rounds` = `num_measure_trials` // 
`num_measures_per_round`
+    verbose: int = 1
+      Verbosity level. 0 for silent, 1 to output information during schedule 
search.
+    builder: Union[ProgramBuilder, str] = 'local'
+      ProgramBuilder which builds the program.
+    runner: Union[ProgramRunner, str] = 'local'
+      ProgramRunner which runs the program and measures time costs.
+    measure_callbacks: Optional[List[MeasureCallback]]
+      Callback functions called after each measure.

Review comment:
       ```suggestion
         Callback functions called after each measurement.
   ```

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.
+    early_stopping: int = -1
+      Stops early the tuning if no improvement get after n measurements.

Review comment:
       ```suggestion
         Stop the tuning early if no improvement is obtained after n measurements.
   ```

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.
+    early_stopping: int = -1
+      Stops early the tuning if no improvement get after n measurements.
+    num_measures_per_round: int = 64
+      The number of programs to be measured at each search round.
+      The whole schedule search process is designed to have several rounds to 
try a total
+      `num_measure_trials` schedules.

Review comment:
       ```suggestion
         The number of schedules to be measured at each search round.
         The whole schedule search process is designed to try a total number of
         `num_measure_trials` schedules in several rounds.
   ```
   
   Be consistent with programs, schedules, and states.

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.

Review comment:
       ```suggestion
         The number of measurement trials.
         The search policy measures `num_measure_trials` schedules in total and 
returns the best one among them.
         With `num_measure_trials` == 0, the policy will do the schedule search 
but won't involve measurement.
         This can be used to get a runnable schedule quickly without 
auto-tuning.
   ```
   
   Keep consistent with `states`, `schedules` 

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.
+    early_stopping: int = -1
+      Stops early the tuning if no improvement get after n measurements.
+    num_measures_per_round: int = 64
+      The number of programs to be measured at each search round.
+      The whole schedule search process is designed to have several rounds to 
try a total
+      `num_measure_trials` schedules.
+      We have: `num_search_rounds` = `num_measure_trials` // 
`num_measures_per_round`
+    verbose: int = 1
+      Verbosity level. 0 for silent, 1 to output information during schedule 
search.
+    builder: Union[ProgramBuilder, str] = 'local'
+      ProgramBuilder which builds the program.
+    runner: Union[ProgramRunner, str] = 'local'
+      ProgramRunner which runs the program and measures time costs.
+    measure_callbacks: Optional[List[MeasureCallback]]
+      Callback functions called after each measure.
+      Candidates:
+        - ansor.LogToFile
+    pre_search_callbacks: Optional[List[SearchCallback]]
+      Callback functions called before the search process.
+      Candidates:
+        - ansor.PreloadMeasuredStates
+        - ansor.PreloadCustomSketchRule
+        TODO(jcf94): Add these implementation in later PRs.
+    """
+    def __init__(self, num_measure_trials=0, early_stopping=-1, 
num_measures_per_round=64,
+                 verbose=1, builder='local', runner='local', 
measure_callbacks=None,
+                 pre_search_callbacks=None):
+        if isinstance(builder, str):
+            if builder == 'local':
+                builder = LocalBuilder()
+            else:
+                raise ValueError("Invalid builder: " + builder)
+
+        if isinstance(runner, str):
+            if runner == 'local':
+                runner = LocalRunner()
+            else:
+                raise ValueError("Invalid runner: " + runner)
+
+        measure_callbacks = [] if measure_callbacks is None else 
measure_callbacks
+        pre_search_callbacks = [] if pre_search_callbacks is None else 
pre_search_callbacks
+
+        self.__init_handle_by_constructor__(
+            _ffi_api.TuningOptions, num_measure_trials, early_stopping, 
num_measures_per_round,
+            verbose, builder, runner, measure_callbacks, pre_search_callbacks)
+
+
+def auto_schedule(task, target, target_host=None, search_policy='default',
+                  hardware_params=None, tuning_options=None):
+    """ Do auto scheduling for a computation declaration.
+
+    The task parameter can be a `string` as workload_key, or directly
+    passing a `SearchTask` as input.
+
+    Parameters
+    ----------
+    task : Union[SearchTask, str]
+        The target search task or workload key.

Review comment:
       ```suggestion
           The search task or workload key.
   ```
   Do not overuse `target` because `target` already has its meaning (i.e. the 
hardware target).

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+            A `te.schedule` and the target `te.Tensor`s to be used in 
`tvm.lower` or `tvm.build`
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGApplyStepsFromState(self, state_obj)
+
+    def print_python_code_from_state(self, state):
+        """
+        Print transform steps in the history of a State as TVM's python 
schedule primitive.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.

Review comment:
       ```suggestion
               The state from which we get transform steps
   ```

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+            A `te.schedule` and the target `te.Tensor`s to be used in 
`tvm.lower` or `tvm.build`
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGApplyStepsFromState(self, state_obj)
+
+    def print_python_code_from_state(self, state):
+        """
+        Print transform steps in the history of a State as TVM's python 
schedule primitive.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+        str : Str
+            The Python schedule code.
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGPrintPythonCodeFromState(self, state_obj)
+
+    def infer_bound_from_state(self, state):
+        """
+        Infer and fill the bound of all iterators of a state using TVM 
schedule.

Review comment:
       ```suggestion
           Infer and fill the bound of all iterators of a state.
   ```

##########
File path: python/tvm/ansor/auto_schedule.py
##########
@@ -0,0 +1,207 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+User interface for Ansor auto-scheduler.
+
+The basic schedule search process for Ansor is designed to be:
+`Program sampling` -> `Performance Tuning`.
+
+In `Program sampling`, we use some predefined precise or heuristic rules to 
generate several
+initial schedules. Based on these initial starting points, we perform 
`Performance Tuning` which
+uses cost model based evolutionary search to select schedules with the best 
performance.
+
+Candidate schedules are measured against the specific hardware target.
+"""
+
+import tvm._ffi
+from tvm.runtime import Object
+from .compute_dag import ComputeDAG
+from .measure import LocalBuilder, LocalRunner
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.HardwareParams")
+class HardwareParams(Object):
+    """ The parameters of target hardware used to guide the search process of 
SearchPolicy.
+
+    TODO(jcf94): This is considering to merge with the new Target:
+    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844
+
+    Parameters
+    ----------
+    num_cores : int
+        The number of device cores.
+    vector_unit_bytes : int
+        The width of vector units in bytes.
+    cache_line_bytes : int
+        The size of cache line in bytes.
+    max_unroll_vec : int
+        The max length of an axis to be unrolled or vectorized.
+    max_innermost_split_factor : int
+        The max split factor for the innermost tile.
+    """
+    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
+                 max_unroll_vec, max_innermost_split_factor):
+        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
+                                            vector_unit_bytes, 
cache_line_bytes,
+                                            max_unroll_vec, 
max_innermost_split_factor)
+
+
+@tvm._ffi.register_object("ansor.SearchTask")
+class SearchTask(Object):
+    """ The computation information and hardware parameters for a specific 
schedule search task.
+
+    Parameters
+    ----------
+    dag : ComputeDAG
+        The ComputeDAG for the target compute declaration.
+    workload_key : str
+        The workload key for the target compute declaration.
+    target : tvm.target.Target
+        The target device of this search task.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this search task.
+    hardware_params : Optional[HardwareParams]
+        Hardware parameters used in this search task.
+    """
+    def __init__(self, dag, workload_key, target, target_host=None,
+                 hardware_params=None):
+        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
+                                            workload_key, target, target_host,
+                                            hardware_params)
+
+
+@tvm._ffi.register_object("ansor.SearchPolicy")
+class SearchPolicy(Object):
+    """ The base class of search policies. """
+
+
+@tvm._ffi.register_object("ansor.EmptyPolicy")
+class EmptyPolicy(SearchPolicy):
+    """ This is an example empty search policy which will always generate
+    the init state of target ComputeDAG.
+    """
+    def __init__(self):
+        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
+
+
+@tvm._ffi.register_object("ansor.TuningOptions")
+class TuningOptions(Object):
+    """ This controls the options of performance tuning.
+
+    Parameters
+    ----------
+    num_measure_trials: int = 0
+      The number of total schedule measure trials.
+      Ansor takes `num_measure_trials` state for measuring in total, and 
finally gets the best
+      schedule among them.
+      With `num_measure_trials` == 0, Ansor will do the schedule search but 
don't involve
+      measurement, this can be used if we want to quickly get a runnable 
schedule without
+      performance tuning.
+    early_stopping: int = -1
+      Stops early the tuning if no improvement get after n measurements.
+    num_measures_per_round: int = 64
+      The number of programs to be measured at each search round.
+      The whole schedule search process is designed to have several rounds to 
try a total
+      `num_measure_trials` schedules.
+      We have: `num_search_rounds` = `num_measure_trials` // 
`num_measures_per_round`
+    verbose: int = 1
+      Verbosity level. 0 for silent, 1 to output information during schedule 
search.
+    builder: Union[ProgramBuilder, str] = 'local'
+      ProgramBuilder which builds the program.
+    runner: Union[ProgramRunner, str] = 'local'
+      ProgramRunner which runs the program and measures time costs.
+    measure_callbacks: Optional[List[MeasureCallback]]
+      Callback functions called after each measure.
+      Candidates:
+        - ansor.LogToFile
+    pre_search_callbacks: Optional[List[SearchCallback]]
+      Callback functions called before the search process.
+      Candidates:
+        - ansor.PreloadMeasuredStates
+        - ansor.PreloadCustomSketchRule
+        TODO(jcf94): Add these implementation in later PRs.
+    """
+    def __init__(self, num_measure_trials=0, early_stopping=-1, 
num_measures_per_round=64,
+                 verbose=1, builder='local', runner='local', 
measure_callbacks=None,
+                 pre_search_callbacks=None):
+        if isinstance(builder, str):
+            if builder == 'local':
+                builder = LocalBuilder()
+            else:
+                raise ValueError("Invalid builder: " + builder)
+
+        if isinstance(runner, str):
+            if runner == 'local':
+                runner = LocalRunner()
+            else:
+                raise ValueError("Invalid runner: " + runner)
+
+        measure_callbacks = [] if measure_callbacks is None else 
measure_callbacks
+        pre_search_callbacks = [] if pre_search_callbacks is None else 
pre_search_callbacks
+
+        self.__init_handle_by_constructor__(
+            _ffi_api.TuningOptions, num_measure_trials, early_stopping, 
num_measures_per_round,
+            verbose, builder, runner, measure_callbacks, pre_search_callbacks)
+
+
+def auto_schedule(task, target, target_host=None, search_policy='default',
+                  hardware_params=None, tuning_options=None):
+    """ Do auto scheduling for a computation declaration.
+
+    The task parameter can be a `string` as workload_key, or directly
+    passing a `SearchTask` as input.
+
+    Parameters
+    ----------
+    task : Union[SearchTask, str]
+        The target search task or workload key.
+    target : tvm.target.Target
+        The target device of this schedule search.
+    target_host : Optional[tvm.target.Target]
+        The target host device of this schedule search.
+    search_policy : Union[SearchPolicy, str] = 'default'
+        The search policy to be used for schedule search.
+    hardware_params : Optional[HardwareParams]
+        The hardware parameters of this schedule search.
+    tuning_options : Optional[TuningOptions]
+        Tuning and measurement options.
+
+    Returns
+    -------
+        A `te.schedule` and the target `te.Tensor`s to be used in `tvm.lower` 
or `tvm.build`

Review comment:
       ```suggestion
           A `te.schedule` and a list of `te.Tensor` to be used in 
`tvm.lower` or `tvm.build`
   ```
   Do not overuse target because target already has its meaning (i.e. the 
hardware target)

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+            A `te.schedule` and the target `te.Tensor`s to be used in 
`tvm.lower` or `tvm.build`
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGApplyStepsFromState(self, state_obj)
+
+    def print_python_code_from_state(self, state):
+        """
+        Print transform steps in the history of a State as TVM's python 
schedule primitive.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+        str : Str
+            The Python schedule code.
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGPrintPythonCodeFromState(self, state_obj)
+
+    def infer_bound_from_state(self, state):
+        """
+        Infer and fill the bound of all iterators of a state using TVM 
schedule.
+
+        State api supports to define a split step with its split factor to be 
a blank placeholder,
+        so sometimes we may get a State will incomplete iterator extent 
information.
+        And another situation is after some steps (for exp. compute_at), it 
may be hard to track
+        the extent change of all iterators.
+
+        We perform infer bound using TVM schedule and fill the State with 
those information. After
+        applying this methods, the State is guaranteed to have complete 
interator extent
+        information.

Review comment:
       ```suggestion
           The states can lose complete bound information after some transform 
steps (e.g., compute_at).
           We can call this function to infer and fill all the bound 
information.
           This function calls TVM InferBound pass internally to get the bound.
   
           The returned state of this function is guaranteed to have complete 
iterator extent
           information.
   ```
   
   Is `for exp` correct? I saw you use it frequently but I don't think it is a 
correct idiom.  Please use `e.g.,`.

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+            A `te.schedule` and the target `te.Tensor`s to be used in 
`tvm.lower` or `tvm.build`
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGApplyStepsFromState(self, state_obj)
+
+    def print_python_code_from_state(self, state):
+        """
+        Print transform steps in the history of a State as TVM's python 
schedule primitive.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+        str : Str
+            The Python schedule code.
+        """
+        state_obj = state if isinstance(state, StateObject) else 
state.state_object
+        return _ffi_api.ComputeDAGPrintPythonCodeFromState(self, state_obj)
+
+    def infer_bound_from_state(self, state):
+        """
+        Infer and fill the bound of all iterators of a state using TVM 
schedule.
+
+        State api supports to define a split step with its split factor to be 
a blank placeholder,
+        so sometimes we may get a State will incomplete iterator extent 
information.
+        And another situation is after some steps (for exp. compute_at), it 
may be hard to track
+        the extent change of all iterators.
+
+        We perform infer bound using TVM schedule and fill the State with 
those information. After
+        applying this methods, the State is guaranteed to have complete 
interator extent
+        information.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.

Review comment:
       ```suggestion
               The state from which we get transform steps
   ```

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether a operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to do some specific decisions during schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information get from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.
+
+        Returns
+        -------
+            A `te.schedule` and the target `te.Tensor`s to be used in 
`tvm.lower` or `tvm.build`

Review comment:
       ```suggestion
               A `te.schedule` and a list of `te.Tensor` to be used in 
`tvm.lower` or `tvm.build`
   ```

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but to extend a new structure on it is 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.

Review comment:
       ```suggestion
       and a list of transformation steps used to construct it.
   ```

##########
File path: python/tvm/ansor/compute_dag.py
##########
@@ -0,0 +1,153 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Computational graph and its analysis tools """
+
+import hashlib
+
+import tvm._ffi
+from tvm.runtime import Object
+from tvm.te import PlaceholderOp, ComputeOp
+
+from .loop_state import State, StateObject
+from .utils import get_const_tuple
+from .workload_registry import workload_key_to_tensors
+
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.ComputeDAG")
+class ComputeDAG(Object):
+    """
+    The Ansor computational graph and related program analyses.
+
+    We convert a compute declaration described by `tvm.compute` (could be a 
single operator or a
+    subgraph) to a ComputeDAG. It keeps the input/output tensors of the target 
compute declaration,
+    a list of all related operations in topo order as well as a set of 
analyses over each operation
+    stage (e.g. the total float operation count, consumer/producer relations 
of each operation
+    stage, whether an operation stage should be tiled/compute inlined ...). 
These analyses can
+    help the search policy to make specific decisions during the schedule 
search process.
+
+    ComputeDAG is also responsible for the interaction between Ansor LoopState 
and TVM schedule
+    (e.g. applying the LoopState transform steps to TVM schedule, providing 
LoopState with extra
+    information obtained from TVM schedule ...).
+
+    Parameters
+    ----------
+    compute : Union[List[Tensor], str]
+        `Tensor`s or workload key for a compute declaration.
+    """
+    def __init__(self, compute):
+        if isinstance(compute, str):
+            compute = workload_key_to_tensors(compute)
+        elif isinstance(compute, list):
+            for item in compute:
+                if not isinstance(item, tvm.te.Tensor):
+                    raise ValueError("The input of ComputeDAG should be a list 
of Tensor")
+        else:
+            raise ValueError("Invalid compute: " + compute +
+                             " . `ComputeDAG` expects a string or list of 
Tensor")
+        self.__init_handle_by_constructor__(_ffi_api.ComputeDAG, compute)
+
+    def get_init_state(self):
+        """ Get the init state of this ComputeDAG.
+
+        Returns
+        -------
+        state : State
+            The initial State without any transform steps.
+        """
+        return State(self.init_state, self)
+
+    def apply_steps_from_state(self, state):
+        """
+        Apply the history transform steps of a State to TVM schedule.
+
+        Parameters
+        ----------
+        state : Union[State, StateObject]
+            The target state to be applied to TVM schedule.

Review comment:
       ```suggestion
               The state from which we get transform steps
   ```
   Do not overuse `target`, it is redundant.

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but extend a new structure on it 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.

Review comment:
       ```suggestion
       Each State corresponds to a schedule for its ComputeDAG.
   ```

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but extend a new structure on it 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.
+
+    Parameters
+    ----------
+    state_object : StateObject
+        The target StateObject, corresponding to C++ internal State object.
+    dag : ComputeDAG
+        The original target ComputeDAG of this State.
+
+    Notes
+    -----
+    This is a wrapper class of StateObject to deal with copy-on-write property
+    """
+    def __init__(self, state_object, dag):
+        self.state_object = state_object
+        self.compute_dag = dag
+
+        self.stages_cache = None  # A list to cache all stages
+        self.stage_id_map = {}    # A dict maps operation to stage id
+        self._update_stage_id_map()
+
+    @property
+    def stages(self):
+        """
+        Returns
+        -------
+        stages : List[Stage]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return self.stages_cache
+
+    @property
+    def stage_ops(self):
+        """
+        Returns
+        -------
+        ops: List[Operation]
+        """
+        if not self.stages_cache:
+            self.stages_cache = self.state_object.stages
+        return [stage.op for stage in self.stages_cache]
+
+    def reorder(self, stage, order):
+        """ Schedule primitive corresponds to te.reorder.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be reordered, can be a Stage order index, 
Stage operation or stage
+            output tensor.
+        order : List[Iterator]
+            Iterators in the expected order
+        """
+        stage_id = self._resolve_stage_id(stage)
+
+        self.state_object = _ffi_api.StateReorder(self.state_object, stage_id, 
order)
+        self._clear_cache()
+
+    def split(self, stage, iterator, lengths, inner_to_outer=True):
+        """ Schedule primitive corresponds to te.split.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be split, can be a Stage order index, Stage 
operation or stage
+            output tensor.
+        iterator : Iterator
+            The iterator to split
+        lengths: List[int]
+            The split factors
+        inner_to_outer: bool = True
+            True to use `factor` to split from inner to outer,
+            False to use `nparts` to split from outer to inner
+
+        Returns
+        -------
+        res_its : List[Iterator]
+            The new Iterators after split
+        """
+        stage_id = self._resolve_stage_id(stage)
+
+        self.state_object, res = _ffi_api.StateSplit(self.state_object, 
stage_id, iterator, lengths,
+                                                     inner_to_outer)
+        self._clear_cache()
+        return res
+
+    def fuse(self, stage, iters):
+        """ Schedule primitive corresponds to te.fuse.
+
+        Parameters
+        ----------
+        stage : Union[int, Operation, Tensor]
+            The target Stage to be fused, can be a Stage order index, Stage 
operation or stage

Review comment:
       ```suggestion
               The Stage to be fused, can be a Stage order index, Stage 
operation or stage
   ```

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but extend a new structure on it 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.
+
+    Parameters
+    ----------
+    state_object : StateObject
+        The target StateObject, corresponding to C++ internal State object.
+    dag : ComputeDAG
+        The original target ComputeDAG of this State.

Review comment:
       ```suggestion
           The original ComputeDAG of this State.
   ```

##########
File path: python/tvm/ansor/loop_state.py
##########
@@ -0,0 +1,221 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=unused-import
+
+"""
+The definition of the "state" in search.
+
+Each LoopState corresponds to a specific schedule for its target ComputeDAG.
+A LoopState consists of: 1. a current loop structure; 2. a history of 
transformations used to
+construct the loop structure.
+The loop structure keeps a preview of how the schedule will finally look like 
after lowering the
+current state (e.g. number of iterators, the extent of each iterator, the 
compute_at locations ...).
+During the schedule search process, the loop structure can provide search 
policy with necessary
+information on how to perform further operations with the current state.
+The transform history is a sequence of TransformStep which will finally be 
mapped to schedule
+primitives. The steps can also be used for serialization of a state.
+
+The LoopState can be seen as a lightweight loop structure IR specifically for 
schedule search.
+We don't use the existing TVM IR but to extend a new structure on it is 
because:
+1. We want fast incremental change to the loop structures, search policy needs 
to get the immediate
+loop structures update rather than after TVM lowering;
+2. We want serializable transform history for replay, backtracking, and 
mutation;
+3. We may create some macro schedule primitives that represent the combination 
of several
+TVM schedule primitives.
+
+When the search is complete, we will lower the state to TVM IR with TVM's 
schedule primitives.
+Since we share a lot of common objects during search, the transformation is 
implemented in
+copy on write style. All objects are immutable, which is similar to TVM IR.
+"""
+
+import tvm._ffi
+from tvm.te.tensor import Operation, Tensor
+from tvm.runtime import Object
+from . import _ffi_api
+
+
+@tvm._ffi.register_object("ansor.Iterator")
+class Iterator(Object):
+    """ A loop iterator structure. """
+
+
+@tvm._ffi.register_object("ansor.Stage")
+class Stage(Object):
+    """A stage in the compute declaration. Similar to tvm.te.schedule.Stage"""
+
+
+@tvm._ffi.register_object("ansor.State")
+class StateObject(Object):
+    """ The internal State object """
+    def __eq__(self, other):
+        return _ffi_api.StateEqual(self, other)
+
+
+class State:
+    """
+    A state in the search process. It consists of the current loop structure
+    and a history of transformations used to construct it.
+
+    Each State corresponds to a specific schedule for its target ComputeDAG.
+
+    Parameters
+    ----------
+    state_object : StateObject
+        The target StateObject, corresponding to C++ internal State object.

Review comment:
       ```suggestion
           The StateObject corresponding to C++ internal State object.
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to