MasterJH5574 commented on PR #15835:
URL: https://github.com/apache/tvm/pull/15835#issuecomment-1789148670

   Hi Eric, I apologize for missing this for so long. Yes, I think it now works 
perfectly for Metal. However, we just noticed that the changes in this PR 
break the “iPhone” target 
(https://github.com/mlc-ai/mlc-llm/blob/200653a82d025be7d58d0d7f04442f85aee52c98/mlc_llm/utils.py#L542-L561)
 in MLC LLM.
   
   I suppose the issue only happens when building end-to-end models: it does 
not seem reproducible when building a single TIR function on my side, so I do 
not yet have a minimal reproducible example.
   
   Meanwhile, the end-to-end build command in MLC LLM can reproduce the 
issue:
   ```
   python3 -m mlc_llm.build --model Llama-2-7b-chat-hf --target iphone --max-seq-len 768 --quantization q4f16_1 --build-model-only
   ```
   
   Here is the error message:
   
   <details>
   <summary>Error message</summary>
   
   ```
   > python3 -m mlc_llm.build --model Llama-2-7b-chat-hf --target iphone --max-seq-len 768 --quantization q4f16_1 --build-model-only
   
   Using path "dist/models/Llama-2-7b-chat-hf" for model "Llama-2-7b-chat-hf"
   Target configured: metal -keys=metal,gpu -libs=iphoneos -max_function_args=31 -max_num_threads=256 -max_shared_memory_per_block=32768 -max_threads_per_block=256 -thread_warp_size=1
   Save a cached module to dist/Llama-2-7b-chat-hf-q4f16_1/mod_cache_before_build.pkl.
   Compilation error:
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:142:8: error: redefinition of 'Llama_2_7b_chat_hf_q4f16_1_transpose6_kernel_args_t'
   struct Llama_2_7b_chat_hf_q4f16_1_transpose6_kernel_args_t {
          ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:8:8: note: previous definition is here
   struct Llama_2_7b_chat_hf_q4f16_1_transpose6_kernel_args_t {
          ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:146:78: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
   kernel void Llama_2_7b_chat_hf_q4f16_1_transpose6_kernel(  device half* A [[ buffer(0) ]],
                                                                                ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:147:31: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
     device half* T_transpose [[ buffer(1) ]],
                                 ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:148:72: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
     constant Llama_2_7b_chat_hf_q4f16_1_transpose6_kernel_args_t& arg [[ buffer(2) ]],
                                                                          ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:149:19: warning: 'threadgroup_position_in_grid' attribute ignored on function declaration [-Wignored-attributes]
     uint blockIdx [[threadgroup_position_in_grid]],
                     ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:150:20: warning: 'thread_position_in_threadgroup' attribute ignored on function declaration [-Wignored-attributes]
     uint threadIdx [[thread_position_in_threadgroup]]
                      ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:152:77: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
   kernel void Llama_2_7b_chat_hf_q4f16_1_extend_te_kernel(  device half* A [[ buffer(0) ]],
                                                                               ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:153:29: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
     device half* concat_te [[ buffer(1) ]],
                               ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:154:71: warning: 'buffer' attribute ignored on function declaration [-Wignored-attributes]
     constant Llama_2_7b_chat_hf_q4f16_1_extend_te_kernel_args_t& arg [[ buffer(2) ]],
                                                                         ^
   /var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.metal:155:19: warning: 'threadgroup_position_in_grid' attribute ignored on function declaration [-Wignored-attributes]
     uint blockIdx [[threadgroup_position_in_grid]],
                     ^
   
   (... many similar warnings are omitted)
   
   245 warnings and 1 error generated.
   LLVM ERROR: Error opening '/var/folders/0k/ckmv_5b93fl26rqn3fx8v07w0000gp/T/tmpesw5bpqd/my_lib.air': No such file or directory!
   Traceback (most recent call last):
     File "/Users/ruihang-macstudio/Workspace/miniforge3/envs/python310/lib/python3.10/runpy.py", line 196, in _run_module_as_main
       return _run_code(code, main_globals, None,
     File "/Users/ruihang-macstudio/Workspace/miniforge3/envs/python310/lib/python3.10/runpy.py", line 86, in _run_code
       exec(code, run_globals)
     File "/Users/ruihang-macstudio/Workspace/mlc-llm/mlc_llm/build.py", line 47, in <module>
       main()
     File "/Users/ruihang-macstudio/Workspace/mlc-llm/mlc_llm/build.py", line 43, in main
       core.build_model_from_args(parsed_args)
     File "/Users/ruihang-macstudio/Workspace/mlc-llm/mlc_llm/core.py", line 742, in build_model_from_args
       build(mod, args)
     File "/Users/ruihang-macstudio/Workspace/mlc-llm/mlc_llm/core.py", line 629, in build
       ex = relax.build(mod_deploy, args.target, system_lib=args.system_lib)
     File "/Users/ruihang-macstudio/Workspace/tvm/python/tvm/relax/vm_build.py", line 343, in build
       return _vmlink(builder, target, tir_mod, ext_libs, params, system_lib=system_lib)
     File "/Users/ruihang-macstudio/Workspace/tvm/python/tvm/relax/vm_build.py", line 242, in _vmlink
       lib = tvm.build(
     File "/Users/ruihang-macstudio/Workspace/tvm/python/tvm/driver/build_module.py", line 281, in build
       rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
     File "/Users/ruihang-macstudio/Workspace/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 239, in __call__
       raise_last_ffi_error()
     File "/Users/ruihang-macstudio/Workspace/tvm/python/tvm/_ffi/base.py", line 476, in raise_last_ffi_error
       raise py_err
     File "/Users/ruihang-macstudio/Workspace/tvm/src/driver/driver_api.cc", line 527, in tvm::$_5::operator()(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target) const
       return TIRToRuntime(inputs_arg, host_target);
     File "/Users/ruihang-macstudio/Workspace/tvm/src/driver/driver_api.cc", line 510, in tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
       device_modules.push_back(codegen::Build(device_mod, it.first));
     File "/Users/ruihang-macstudio/Workspace/tvm/src/target/codegen.cc", line 73, in tvm::codegen::Build(tvm::IRModule, tvm::Target)
       return (*bf)(mod, target);
     File "/Users/ruihang-macstudio/Workspace/tvm/src/target/source/codegen_metal.cc", line 360, in tvm::codegen::BuildMetal(tvm::IRModule, tvm::Target)
       fsource = (*fmetal_compile)(fsource, target).operator std::string();
   tvm.error.InternalError: Traceback (most recent call last):
     3: tvm::$_5::operator()(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target) const
           at /Users/ruihang-macstudio/Workspace/tvm/src/driver/driver_api.cc:527
     2: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
           at /Users/ruihang-macstudio/Workspace/tvm/src/driver/driver_api.cc:510
     1: tvm::codegen::Build(tvm::IRModule, tvm::Target)
           at /Users/ruihang-macstudio/Workspace/tvm/src/target/codegen.cc:73
     0: tvm::codegen::BuildMetal(tvm::IRModule, tvm::Target)
           at /Users/ruihang-macstudio/Workspace/tvm/src/target/source/codegen_metal.cc:360
     File "/Users/ruihang-macstudio/Workspace/tvm/include/tvm/runtime/packed_func.h", line 836
   InternalError: Check failed: type_code_ == kTVMStr (4 vs. 11) : expected str but got NULL
   ```
   </details>
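
The first hard error in the log above is a struct defined twice in the generated `my_lib.metal`. Without a minimal reproducer, one way to narrow it down might be to scan the generated Metal source for struct names that are defined more than once. The following is only a sketch under stated assumptions: the helper name `duplicated_structs` is hypothetical (it is not a TVM or MLC LLM API), the sample source is made up for illustration, and the regex assumes each `struct NAME {` starts on its own line, as it does in the log.

```python
import re
from collections import Counter

# Assumption: struct definitions look like "struct NAME {" at the start of a line.
STRUCT_DEF = re.compile(r"^\s*struct\s+(\w+)\s*\{", re.MULTILINE)


def duplicated_structs(metal_source: str) -> dict[str, int]:
    """Return {struct_name: count} for structs defined more than once."""
    counts = Counter(STRUCT_DEF.findall(metal_source))
    return {name: n for name, n in counts.items() if n > 1}


# Made-up stand-in for the generated source; not the real my_lib.metal.
sample = """
struct transpose6_kernel_args_t {
  int n;
};
kernel void transpose6_kernel() {}
struct transpose6_kernel_args_t {
  int n;
};
struct extend_te_kernel_args_t {
  int m;
};
"""

print(duplicated_structs(sample))  # {'transpose6_kernel_args_t': 2}
```

Running this over the actual generated source, if it can be dumped before compilation, would show whether `transpose6` is the only kernel emitted twice or one of several.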
   
   I locally reverted the PR (https://github.com/mlc-ai/relax/commits) and am 
not going to revert it here, so hopefully the revert will not bother the unity 
branch too much. I would appreciate it if you could take a look at this iPhone 
build issue to see whether it can be fixed quickly. Thank you so much.

