This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push: new 3ccd3407d8 Publish built docs triggered by 5a861424c89414b40303a646e949b464b8ca5648 3ccd3407d8 is described below commit 3ccd3407d83648c35828a18b6bafb6cf72880e8f Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com> AuthorDate: Sun Jun 1 16:49:38 2025 +0000 Publish built docs triggered by 5a861424c89414b40303a646e949b464b8ca5648 --- _sources/library-user-guide/upgrading.md.txt | 49 +++++++++++++++++++++++ library-user-guide/upgrading.html | 60 ++++++++++++++++++++++++++++ searchindex.js | 2 +- 3 files changed, 110 insertions(+), 1 deletion(-) diff --git a/_sources/library-user-guide/upgrading.md.txt b/_sources/library-user-guide/upgrading.md.txt index 0c9c13a0ce..dd02045cf2 100644 --- a/_sources/library-user-guide/upgrading.md.txt +++ b/_sources/library-user-guide/upgrading.md.txt @@ -21,6 +21,55 @@ ## DataFusion `48.0.0` +### The `VARCHAR` SQL type is now represented as `Utf8View` in Arrow. + +The mapping of the SQL `VARCHAR` type has been changed from `Utf8` to `Utf8View` +which improves performance for many string operations. You can read more about +`Utf8View` in the [DataFusion blog post on German-style strings] + +[datafusion blog post on german-style strings]: https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-1/ + +This means that when you create a table with a `VARCHAR` column, it will now use +`Utf8View` as the underlying data type. For example: + +```sql +> CREATE TABLE my_table (my_column VARCHAR); +0 row(s) fetched. +Elapsed 0.001 seconds. + +> DESCRIBE my_table; ++-------------+-----------+-------------+ +| column_name | data_type | is_nullable | ++-------------+-----------+-------------+ +| my_column | Utf8View | YES | ++-------------+-----------+-------------+ +1 row(s) fetched. +Elapsed 0.000 seconds. 
+``` + +You can restore the old behavior of using `Utf8` by changing the +`datafusion.sql_parser.map_varchar_to_utf8view` configuration setting. For +example + +```sql +> set datafusion.sql_parser.map_varchar_to_utf8view = false; +0 row(s) fetched. +Elapsed 0.001 seconds. + +> CREATE TABLE my_table (my_column VARCHAR); +0 row(s) fetched. +Elapsed 0.014 seconds. + +> DESCRIBE my_table; ++-------------+-----------+-------------+ +| column_name | data_type | is_nullable | ++-------------+-----------+-------------+ +| my_column | Utf8 | YES | ++-------------+-----------+-------------+ +1 row(s) fetched. +Elapsed 0.004 seconds. +``` + ### `ListingOptions` default for `collect_stat` changed from `true` to `false` This makes it agree with the default for `SessionConfig`. diff --git a/library-user-guide/upgrading.html b/library-user-guide/upgrading.html index 8262a4d8a7..605f8d3227 100644 --- a/library-user-guide/upgrading.html +++ b/library-user-guide/upgrading.html @@ -537,6 +537,23 @@ </code> </a> <ul class="nav section-nav flex-column"> + <li class="toc-h3 nav-item toc-entry"> + <a class="reference internal nav-link" href="#the-varchar-sql-type-is-now-represented-as-utf8view-in-arrow"> + The + <code class="docutils literal notranslate"> + <span class="pre"> + VARCHAR + </span> + </code> + SQL type is now represented as + <code class="docutils literal notranslate"> + <span class="pre"> + Utf8View + </span> + </code> + in Arrow. 
+ </a> + </li> <li class="toc-h3 nav-item toc-entry"> <a class="reference internal nav-link" href="#listingoptions-default-for-collect-stat-changed-from-true-to-false"> <code class="docutils literal notranslate"> @@ -847,6 +864,49 @@ <h1>Upgrade Guides<a class="headerlink" href="#upgrade-guides" title="Link to this heading">¶</a></h1> <section id="datafusion-48-0-0"> <h2>DataFusion <code class="docutils literal notranslate"><span class="pre">48.0.0</span></code><a class="headerlink" href="#datafusion-48-0-0" title="Link to this heading">¶</a></h2> +<section id="the-varchar-sql-type-is-now-represented-as-utf8view-in-arrow"> +<h3>The <code class="docutils literal notranslate"><span class="pre">VARCHAR</span></code> SQL type is now represented as <code class="docutils literal notranslate"><span class="pre">Utf8View</span></code> in Arrow.<a class="headerlink" href="#the-varchar-sql-type-is-now-represented-as-utf8view-in-arrow" title="Link to this heading">¶</a></h3> +<p>The mapping of the SQL <code class="docutils literal notranslate"><span class="pre">VARCHAR</span></code> type has been changed from <code class="docutils literal notranslate"><span class="pre">Utf8</span></code> to <code class="docutils literal notranslate"><span class="pre">Utf8View</span></code> +which improves performance for many string operations. You can read more about +<code class="docutils literal notranslate"><span class="pre">Utf8View</span></code> in the <a class="reference external" href="https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-1/">DataFusion blog post on German-style strings</a></p> +<p>This means that when you create a table with a <code class="docutils literal notranslate"><span class="pre">VARCHAR</span></code> column, it will now use +<code class="docutils literal notranslate"><span class="pre">Utf8View</span></code> as the underlying data type. 
For example:</p> +<div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">></span><span class="w"> </span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">my_table</span><span class="w"> </span><span class="p">(</span><span class="n">my_column</span><span class="w"> </span><span class="nb">VARCHAR</span><span class="p">);</span> +<span class="mi">0</span><span class="w"> </span><span class="k">row</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="n">fetched</span><span class="p">.</span> +<span class="n">Elapsed</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">001</span><span class="w"> </span><span class="n">seconds</span><span class="p">.</span> + +<span class="o">></span><span class="w"> </span><span class="k">DESCRIBE</span><span class="w"> </span><span class="n">my_table</span><span class="p">;</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span class="o">|</span><span class="w"> </span><span class="k">column_name</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">data_type</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">is_nullable</span><span class="w"> </span><span class="o">|</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span class="o">|</span><span class="w"> </span><span class="n">my_column</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Utf8View</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">YES</span><span class="w"> </span><span class="o">|</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span 
class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="n">fetched</span><span class="p">.</span> +<span class="n">Elapsed</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">000</span><span class="w"> </span><span class="n">seconds</span><span class="p">.</span> +</pre></div> +</div> +<p>You can restore the old behavior of using <code class="docutils literal notranslate"><span class="pre">Utf8</span></code> by changing the +<code class="docutils literal notranslate"><span class="pre">datafusion.sql_parser.map_varchar_to_utf8view</span></code> configuration setting. For +example</p> +<div class="highlight-sql notranslate"><div class="highlight"><pre><span></span><span class="o">></span><span class="w"> </span><span class="k">set</span><span class="w"> </span><span class="n">datafusion</span><span class="p">.</span><span class="n">sql_parser</span><span class="p">.</span><span class="n">map_varchar_to_utf8view</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="k">false</span><span class="p">;</span> +<span class="mi">0</span><span class="w"> </span><span class="k">row</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="n">fetched</span><span class="p">.</span> +<span class="n">Elapsed</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">001</span><span class="w"> </span><span class="n">seconds</span><span class="p">.</span> + +<span class="o">></span><span class="w"> </span><span class="k">CREATE</span><span class="w"> </span><span class="k">TABLE</span><span class="w"> </span><span class="n">my_table</span><span class="w"> </span><span class="p">(</span><span class="n">my_column</span><span class="w"> </span><span class="nb">VARCHAR</span><span 
class="p">);</span> +<span class="mi">0</span><span class="w"> </span><span class="k">row</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="n">fetched</span><span class="p">.</span> +<span class="n">Elapsed</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">014</span><span class="w"> </span><span class="n">seconds</span><span class="p">.</span> + +<span class="o">></span><span class="w"> </span><span class="k">DESCRIBE</span><span class="w"> </span><span class="n">my_table</span><span class="p">;</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span class="o">|</span><span class="w"> </span><span class="k">column_name</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">data_type</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">is_nullable</span><span class="w"> </span><span class="o">|</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span class="o">|</span><span class="w"> </span><span class="n">my_column</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Utf8</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">YES</span><span class="w"> </span><span class="o">|</span> +<span class="o">+</span><span class="c1">-------------+-----------+-------------+</span> +<span class="mi">1</span><span class="w"> </span><span class="k">row</span><span class="p">(</span><span class="n">s</span><span class="p">)</span><span class="w"> </span><span class="n">fetched</span><span class="p">.</span> +<span class="n">Elapsed</span><span class="w"> </span><span class="mi">0</span><span class="p">.</span><span class="mi">004</span><span class="w"> </span><span class="n">seconds</span><span class="p">.</span> 
+</pre></div> +</div> +</section> <section id="listingoptions-default-for-collect-stat-changed-from-true-to-false"> <h3><code class="docutils literal notranslate"><span class="pre">ListingOptions</span></code> default for <code class="docutils literal notranslate"><span class="pre">collect_stat</span></code> changed from <code class="docutils literal notranslate"><span class="pre">true</span></code> to <code class="docutils literal notranslate"><span class="pre">false</span></code><a class="headerlink" href="#listingoptions-default-for-collect-stat-changed-from-true-to-false" title="Link to this headi [...] <p>This makes it agree with the default for <code class="docutils literal notranslate"><span class="pre">SessionConfig</span></code>. diff --git a/searchindex.js b/searchindex.js index 507ba1ed18..f42717fecb 100644 --- a/searchindex.js +++ b/searchindex.js @@ -1 +1 @@ -Search.setIndex({"alltitles":{"!=":[[54,"op-neq"]],"!~":[[54,"op-re-not-match"]],"!~*":[[54,"op-re-not-match-i"]],"!~~":[[54,"id19"]],"!~~*":[[54,"id20"]],"#":[[54,"op-bit-xor"]],"%":[[54,"op-modulo"]],"&":[[54,"op-bit-and"]],"(relation, name) tuples in logical fields and logical columns are unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[54,"op-multiply"]],"+":[[54,"op-plus"]],"-":[[54,"op-minus"]],"/":[[54,"op-divide"]],"2022 Q2":[[10,"q2"]] [...] \ No newline at end of file +Search.setIndex({"alltitles":{"!=":[[54,"op-neq"]],"!~":[[54,"op-re-not-match"]],"!~*":[[54,"op-re-not-match-i"]],"!~~":[[54,"id19"]],"!~~*":[[54,"id20"]],"#":[[54,"op-bit-xor"]],"%":[[54,"op-modulo"]],"&":[[54,"op-bit-and"]],"(relation, name) tuples in logical fields and logical columns are unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[54,"op-multiply"]],"+":[[54,"op-plus"]],"-":[[54,"op-minus"]],"/":[[54,"op-divide"]],"2022 Q2":[[10,"q2"]] [...] 
\ No newline at end of file