I have a network that I optimized with AutoTVM for the iPhone GPU.
When I deployed it to the phone in the app I developed, the phone crashed during
the Metal API validation step. It logged that the maximum number of threads per
threadgroup (1024) had been exceeded:

_-[MTLDebugComputeCommandEncoder
_validateThreadsPerThreadgroup:]:906: failed assertion
`(threadsPerThreadgroup.width(1) * threadsPerThreadgroup.height(34) *
threadsPerThreadgroup.depth(34))(1156) must be <= 1024. (device threadgroup
size limit)'_

I turned off the validation step, and the network ran to completion. Unfortunately,
the result was not correct. (With an unoptimized network, I verified that the
result is correct.)
Examining the Metal codegen, there appears to be no check for the maximum number
of threads per threadgroup, and no code to work around the limit.
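
To make the arithmetic concrete, here is a minimal sketch of the kind of check I have in mind. It is plain Python and not part of TVM or Metal: the function names and the candidate list are hypothetical, and the 1024 limit is simply the value reported in the assertion above for my device.

```python
from math import prod

# Assumption: 1024 is the per-threadgroup limit reported by the
# failed assertion for this device; other devices may differ.
MAX_THREADS_PER_THREADGROUP = 1024


def threadgroup_size(width, height, depth):
    """Total number of threads in one threadgroup."""
    return prod((width, height, depth))


def fits_device_limit(width, height, depth,
                      limit=MAX_THREADS_PER_THREADGROUP):
    """Return True if the threadgroup does not exceed the device limit."""
    return threadgroup_size(width, height, depth) <= limit


# The configuration from the failed assertion: 1 * 34 * 34 = 1156.
print(threadgroup_size(1, 34, 34))   # 1156
print(fits_device_limit(1, 34, 34))  # False

# Hypothetical candidate configurations, only to illustrate filtering.
candidates = [(1, 34, 34), (1, 32, 32), (1, 17, 34)]
valid = [c for c in candidates if fits_device_limit(*c)]
print(valid)  # [(1, 32, 32), (1, 17, 34)]
```

A check like this would reject the 1 x 34 x 34 configuration before it ever reaches the device, rather than leaving it to the Metal validation layer.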

My question is: what could be causing the incorrect network result, and is
there something in this thread that may be contributing to it?

Thanks, --C




