Re: [PR] [BugFix][Ansor] Fixing BroadcastShape function [tvm]

via GitHub Wed, 12 Feb 2025 07:45:33 -0800


thaisacs commented on PR #17627:
URL: https://github.com/apache/tvm/pull/17627#issuecomment-2654104662


   > > > Thanks for the great discussion!
   > > > > This issue occurred more frequently with GoogLeNet, MobileNetV2, 
MobileNetV3, ResNet-152, and InceptionV3 models.
   > > > 
   > > > 
   > > > What do you mean by "more frequently"? IIUC, if it's a TOPI issue, it 
will either always or never happen; it is not a probabilistic event.
   > > > > InternalError: Check failed: (false) is false: Incompatible 
broadcast dims: 144 and 128 in: [1, 7, 7, 144] and [1, 1, 1, 128]
   > > > 
   > > > 
   > > > The error seems reasonable.
   > > > So I wonder where the root of the issue is. To be clearer: is it in 
TOPI or in Ansor?
   > > 
   > > 
   > > @Hzfengsy
   > > I think it's in the interaction of the two: auto-scheduler and TOPI. The 
auto-scheduler can find schedules for some layers, but not all. When attempting 
to compile the entire model, TOPI needs to deal with the case that the internal 
tensors have different dimensions. Currently, TOPI handles this by stopping the 
compilation process.
   > > Note that, when a tensor's dimension is dynamic and cannot be determined 
at compile time, TOPI prematurely considers the shape of the output tensor.
   > 
   > @thaisacs ,
   > 
   > As a practical note, if the sample limit per layer is too low, 
metaschedule will fail to propose any valid sketches for the layers. This can be 
seen in the provided sample here using the relax flow (with the attached logs as 
proof), if samples are lowered from 8000 to 1000.
   > 
   > If we "shunt" things like this `broadcast`, there will be more apparently 
"valid" proposals, so the search converges faster, but that is not what we want; 
we also want a "legit/valid" final form of the tuned model.
   
   @cbalint13 
   
   I think that the broadcast shape function has no impact on the search. It is 
only used for the final compilation of the model.
   
   Aren't invalid schedules removed by the evolutionary search? The searches I 
performed considered 1000 points per layer of the model. For example, in 
resnet_152 with 27 layers, the evolutionary search explored 26968 schedules.
   In my tests, the broadcast shape function did not change the accuracy of the 
model.
   The only model with noticeably worse accuracy was inception_v3, but this 
also happens when the broadcast function does not raise an error.
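
For reference, the rule behind the "Incompatible broadcast dims" error quoted above can be sketched in a few lines of Python. This is only an illustration of NumPy-style broadcasting on static shapes, not the TVM/TOPI implementation; the function name `broadcast_shape` and the use of `ValueError` are my own assumptions for the example.

```python
def broadcast_shape(a, b):
    """NumPy-style broadcast of two static shapes (illustrative, not TVM's code)."""
    out = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            # Neither dimension is 1 and they differ: no valid broadcast exists.
            raise ValueError(
                f"Incompatible broadcast dims: {da} and {db} "
                f"in: {list(a)} and {list(b)}"
            )
    return out[::-1]


# Matching channel dimension: the shapes broadcast.
print(broadcast_shape([1, 7, 7, 144], [1, 1, 1, 144]))

# The shapes from the error message: 144 vs 128 cannot be broadcast.
try:
    broadcast_shape([1, 7, 7, 144], [1, 1, 1, 128])
except ValueError as err:
    print(err)
```

Under this rule, two dimensions are compatible only when they are equal or one of them is 1, which is why 144 and 128 are rejected.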
   


