dependabot[bot] opened a new pull request, #752: URL: https://github.com/apache/opennlp/pull/752
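For context, a minimal sketch of how such a bump typically lands in a Maven build, assuming the project references both artifacts through the `onnxruntime.version` property named in this PR (the surrounding `pom.xml` structure below is illustrative, not OpenNLP's actual file):

```xml
<!-- Illustrative pom.xml excerpt: the shared property Dependabot updates -->
<properties>
  <!-- was: 1.20.0 -->
  <onnxruntime.version>1.21.0</onnxruntime.version>
</properties>

<!-- Both artifacts pick up the bump through the one property -->
<dependencies>
  <dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime</artifactId>
    <version>${onnxruntime.version}</version>
  </dependency>
  <dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime_gpu</artifactId>
    <version>${onnxruntime.version}</version>
  </dependency>
</dependencies>
```

Keeping one property for both artifacts is what lets a single Dependabot PR move the CPU and GPU runtimes in lockstep.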
Bumps `onnxruntime.version` from 1.20.0 to 1.21.0. Updates `com.microsoft.onnxruntime:onnxruntime` from 1.20.0 to 1.21.0 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/microsoft/onnxruntime/releases">com.microsoft.onnxruntime:onnxruntime's releases</a>.</em></p> <blockquote> <h2>ONNX Runtime v1.21</h2> <h2>Announcements</h2> <ul> <li>No large announcements of note this release! We've made a lot of small refinements to streamline your ONNX Runtime experience.</li> </ul> <h2>GenAI & Advanced Model Features</h2> <h3>Enhanced Decoding & Pipeline Support</h3> <ul> <li>Added "chat mode" support for CPU, GPU, and WebGPU.</li> <li>Provided support for decoder model pipelines.</li> <li>Added Java API support for MultiLoRA.</li> </ul> <h3>API & Compatibility Updates</h3> <ul> <li>Chat mode introduced breaking changes in the API (see the <a href="https://onnxruntime.ai/docs/genai/howto/migrate.html">migration guide</a>).</li> </ul> <h3>Bug Fixes for Model Output</h3> <ul> <li>Fixed Phi series garbage output issues with long prompts.</li> <li>Resolved gibberish issues with <code>top_k</code> on CPU.</li> </ul> <h2>Execution & Core Optimizations</h2> <h3>Core Refinements</h3> <ul> <li>Reduced default logger usage for improved efficiency (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23030">#23030</a>).</li> <li>Fixed a visibility issue in the threadpool (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23098">#23098</a>).</li> </ul> <h3>Execution Provider (EP) Updates</h3> <h4>General</h4> <ul> <li>Removed the TVM EP from the source tree (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/22827">#22827</a>).</li> <li>Marked the NNAPI EP for deprecation (following Google's deprecation of NNAPI).</li> <li>Fixed a DLL delay-loading issue that impacts the usability of the WebGPU EP and DirectML EP on Windows (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23111">#23111</a>, <a href="https://redirect.github.com/microsoft/onnxruntime/issues/23227">#23227</a>).</li> </ul> <h4>TensorRT EP Improvements</h4> <ul> <li>Added support for TensorRT 10.8.</li> <li>Assigned DDS ops (<code>NMS</code>, <code>RoiAlign</code>, <code>NonZero</code>) to TensorRT by default.</li> <li>Introduced the option <code>trt_op_types_to_exclude</code> to exclude specific ops from TensorRT assignment.</li> </ul> <h4>QNN EP Improvements</h4> <ul> <li>Introduced QNN shared memory support.</li> <li>Improved performance for AI Hub models.</li> <li>Added support for QAIRT/QNN SDK 2.31.</li> <li>Added a Python 3.13 package.</li> <li>Miscellaneous bug fixes and enhancements.</li> <li>The QNN EP is now built as a shared library/DLL by default. To retain the previous build behavior, use the build option <code>--use_qnn static_lib</code>.</li> </ul> <h4>DirectML EP Support & Upgrades</h4> <ul> <li>Updated DirectML from 1.15.2 to 1.15.4 (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/22635">#22635</a>).</li> </ul> <h4>OpenVINO EP Improvements</h4> <ul> <li>Introduced the OpenVINO EP weights-sharing feature.</li> <li>Added support for various contrib ops in OVEP: <ul> <li><code>SkipLayerNormalization</code>, <code>MatMulNBits</code>, <code>FusedGemm</code>, <code>FusedConv</code>, <code>EmbedLayerNormalization</code>, <code>BiasGelu</code>, <code>Attention</code>, <code>DynamicQuantizeMatMul</code>, <code>FusedMatMul</code>, <code>QuickGelu</code>, <code>SkipSimplifiedLayerNormalization</code></li> </ul> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>...
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/microsoft/onnxruntime/commit/e0b66cad282043d4377cea5269083f17771b6dfc"><code>e0b66ca</code></a> Round 2 of cherry-picks into rel-1.21.0 (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23899">#23899</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/beb1a9242eaf46ef885f86c75202a94b13dab428"><code>beb1a92</code></a> Cherry-picks into rel-1.21.0 (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23846">#23846</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/98511b0fe805f197b34945ab648bda370b07cbd8"><code>98511b0</code></a> Set build user's uid when creating Migraphx/ROCM docker images (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23657">#23657</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/23f787ea7568e48e0e741315f34dfef028c8b430"><code>23f787e</code></a> [TensorRT EP] Add new provider option to exclude ops from running on TRT (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23">#23</a>...</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/1b0a2ba431e5ac732680f6a1d0c7303c71ae536a"><code>1b0a2ba</code></a> Update cmake_cuda_architecture to control package size (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23671">#23671</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/8eb5513be6dade1a91408313c5dd18d2dbeaef90"><code>8eb5513</code></a> [webgpu] Implement SubGroupMatrix based MatMulNBits for Metal (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23729">#23729</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/d82604e802a91af0798db8fca404b10e56e46f20"><code>d82604e</code></a> [Optimizer] Fix exception for Q -> DQ sequence with different scale types (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/2">#2</a>...</li> <li><a 
href="https://github.com/microsoft/onnxruntime/commit/754ee21f83518bf127ba481cf1bedf58ee3b5374"><code>754ee21</code></a> OVEP: Bug Fixes, Refactoring, and Contrib Ops Update (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23742">#23742</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/6715d4ca35efbbd25bc565aac4628b00ff8d3e07"><code>6715d4c</code></a> Shape inference: GatherBlockQuantized dispatcher (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23748">#23748</a>)</li> <li><a href="https://github.com/microsoft/onnxruntime/commit/75cf166b25b2a69f229ce8863b7973b5405bf82f"><code>75cf166</code></a> [QNN EP] Passthrough EP Parameters in Node (<a href="https://redirect.github.com/microsoft/onnxruntime/issues/23468">#23468</a>)</li> <li>Additional commits viewable in <a href="https://github.com/microsoft/onnxruntime/compare/v1.20.0...v1.21.0">compare view</a></li> </ul> </details> <br /> Updates `com.microsoft.onnxruntime:onnxruntime_gpu` from 1.20.0 to 1.21.0 (same release notes and commits as above). <br /> Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it.
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@opennlp.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org